- Compare data sets by describing their shapes, centers, spreads, and outliers
Let’s analyze the salary data set in the statistical tool and create side-by-side dotplots or histograms.
Displaying the data in side-by-side dotplots or histograms lets us describe and compare the distributions of a quantitative variable. We describe a distribution by its shape, center, spread, and the presence of outliers.
- Here are some observations about dotplots:
  - Individual variable values are visible, particularly when the data set is small.
  - Descriptions of shape, center, and spread are not affected by how the dotplot is constructed.
  - We can accurately calculate the overall range (largest value – smallest value).
- Here are some observations about histograms:
  - Individual variable values are not visible.
  - Grouping individuals into bins of equal-sized intervals is particularly useful when analyzing large data sets.
  - We can easily use percentages, also called relative frequencies, to describe the distribution.
  - Descriptions of shape, center, and spread are affected by how the bins are defined.
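
To make the observations above concrete, here is a minimal Python sketch. The salary values are made up for illustration (they are not the course data set), and the function names `overall_range` and `relative_frequencies` are hypothetical helpers, not part of the statistical tool. The sketch computes the overall range from the individual values and then bins the same values two different ways to show how the bin definition changes the picture.

```python
from collections import Counter

# Hypothetical salaries in thousands of dollars (invented for illustration).
salaries = [32, 35, 36, 38, 41, 41, 44, 47, 52, 58, 63, 95]


def overall_range(values):
    """Overall range: largest value minus smallest value."""
    return max(values) - min(values)


def relative_frequencies(values, start, width):
    """Percent of values in each equal-width bin [left, left + width).

    Returns a dict keyed by the left edge of each non-empty bin.
    Empty bins are omitted from the result.
    """
    counts = Counter(start + ((v - start) // width) * width for v in values)
    n = len(values)
    return {left: 100 * count / n for left, count in sorted(counts.items())}


data_range = overall_range(salaries)

# Same data, two bin definitions.
wide_bins = relative_frequencies(salaries, start=30, width=20)
narrow_bins = relative_frequencies(salaries, start=30, width=10)

print("Overall range:", data_range)
for label, bins in (("width 20", wide_bins), ("width 10", narrow_bins)):
    print(label)
    for left, percent in bins.items():
        print(f"  bin starting at {left}: {percent:.1f}%")
```

With the wider bins, about two-thirds of the values fall into the first bin. With the narrower bins, the same values split evenly across the first two bins, so the apparent shape depends on how the bins are defined. The range, by contrast, comes straight from the individual values and does not change.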