Comparing Quantitative Distributions: Learn It 3

  • Compare data sets by describing their shapes, centers, spreads, and outliers

Which one should I use?

We used two types of graphs to analyze the distribution of a quantitative variable: histograms and dotplots.

  • Some observations about histograms:
    • Individual variable values are not visible.
    • Grouping individuals into equal-width bins is particularly useful when analyzing large data sets.
    • We can easily use percentages, also called relative frequencies, to describe the distribution.
    • Descriptions of shape, center, and spread are affected by how the bins are defined.
    • Histograms can make it easier to identify skewness and modality (i.e., whether the distribution is symmetric, skewed, unimodal, bimodal, etc.), particularly for large data sets.
    • Outliers may be less apparent in histograms, especially if the bins are too wide, because wide bins can hide gaps and variability in the data.
  • Some observations about dotplots:
    • Individual variable values are visible, particularly when the data set is small.
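    • Descriptions of shape, center, and spread are not affected by how the dotplot is constructed, because each dot marks an actual value rather than a bin.
    • We can calculate the exact overall range (largest value – smallest value), because the smallest and largest values are visible.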
    • Dotplots can make it easier to identify outliers and gaps within the data due to the visibility of individual data points.
    • Dotplots are most effective for smaller data sets, as they can become cluttered and less informative with larger data sets.
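
The observations above about bin definitions, relative frequencies, and the range can be illustrated with a minimal Python sketch. The scores below are made up for illustration, and the helper names bin_counts and relative_frequencies are hypothetical; this is one possible way to tabulate the two views, not a prescribed method.

```python
from collections import Counter

# Hypothetical data set: 12 made-up quiz scores, used only for illustration.
scores = [4, 5, 5, 6, 6, 6, 7, 7, 8, 8, 11, 19]


def bin_counts(values, start, width, n_bins):
    """Count values in equal-width bins [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts


def relative_frequencies(counts):
    """Convert bin counts to proportions of the whole data set."""
    total = sum(counts)
    return [c / total for c in counts]


# Histogram view: two different bin definitions over the same interval, 0 to 20.
narrow = bin_counts(scores, start=0, width=2, n_bins=10)
wide = bin_counts(scores, start=0, width=10, n_bins=2)

# Dotplot view: one count per distinct value, no binning involved.
dot_counts = Counter(scores)
data_range = max(scores) - min(scores)

print("width 2 :", narrow)
print("width 10:", wide, [round(p, 2) for p in relative_frequencies(wide)])
print("range   :", data_range)

# A rough text dotplot: one "o" per individual at each value.
for value in range(min(scores), max(scores) + 1):
    print(f"{value:>2} | " + "o" * dot_counts[value])
```

With bins of width 2, the gap between 11 and 19 appears as a run of empty bins. With bins of width 10, both values fall into the same bin, so the gap and the unusually large value are no longer visible. The text dotplot shows every individual value, so the gap and the exact range can be read off directly.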

How do we decide when to use a dotplot and when to use a histogram?
There are no hard-and-fast rules here. Each type of graph highlights different aspects of the data, so the choice depends on what we want to see.

  • Size of the Data Set: The choice between a dotplot and a histogram often depends on the size of the data set. Dotplots work well for small to moderately sized data sets, while histograms are better suited for larger data sets.
  • Detail Required: Dotplots provide detail at the individual data point level, which can be useful for in-depth analysis, whereas histograms provide a summary view, which can be useful for identifying general patterns.