Describing Data Numerically: Background You’ll Need 2

  • Read and interpret a histogram

Illustrating Frequency with Histograms

For large data sets, a dotplot can be cumbersome to construct, and it may not be the clearest way to present the data. A histogram summarizes the same data more simply by replacing the many individual dots with a small number of bars.

histogram

Like a dotplot, a histogram displays the frequency and distribution of quantitative data.

Each bar represents the group of data points that fall into one interval of values, and the height of the bar shows how many data points are in that interval. All bars have the same width, called the binwidth, which can be set to any interval size desired.
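To make the idea of grouping concrete, here is a minimal sketch of how data points might be sorted into equal-width intervals and counted. The quiz scores, the starting value of 50, the binwidth of 10, and the function name bin_counts are all assumptions invented for illustration; they do not come from the lesson.

```python
from collections import Counter

def bin_counts(values, start, bin_width):
    """Count how many values fall in each interval [left, left + bin_width)."""
    counts = Counter()
    for v in values:
        # Which interval does v land in, counting from the starting value?
        index = int((v - start) // bin_width)
        left = start + index * bin_width
        counts[left] += 1
    # Intervals that contain no values are simply absent from the result.
    return dict(sorted(counts.items()))

# Hypothetical quiz scores, made up for this example only
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 79, 82, 85, 91]

counts = bin_counts(scores, start=50, bin_width=10)

# Print a sideways text histogram: one '#' per data point in the interval
for left, n in counts.items():
    print(f"{left}-{left + 10}: " + "#" * n)
```

Each printed row plays the role of one bar: the label gives the interval and the number of # symbols gives the frequency. A value that sits exactly on a boundary, such as 70, is counted in the interval that starts at that value; this is one common convention, not the only one.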

Important notes about histograms:

  • Histograms visualize quantitative data.
  • Histograms display large data sets more concisely than dotplots.
  • Unlike dotplots, histograms do not display individual data values.
  • The horizontal axis of a histogram is partitioned into intervals.

Comparing Groups with Histograms

When we conduct statistical experiments, we often work with multiple data sets to make inferences regarding the variable of interest. Let’s look at an example where we compare different groups using histograms as the graphical displays.

Dotplots vs Histograms

We used two types of graphs to analyze the distribution of a quantitative variable: dotplots and histograms. In these graphs, we can see:

  • The possible values of the variable.
  • The number of individuals with each variable value or interval of values.

How do we decide when to use a dotplot and when to use a histogram? There are no rules here. Each type of graph can be used to highlight different aspects of the data.

What we know about dotplots:

  • Individual variable values are visible, particularly when the data set is small.
  • Descriptions of shape, center, and spread are not affected by how the dotplot is constructed.
  • We can accurately calculate the overall range (largest value – smallest value).
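Because a dotplot keeps every individual value, the range can be computed exactly from the plotted data. The short sketch below prints a text dotplot and the range for a small data set; the values are hypothetical and were made up only for this illustration.

```python
from collections import Counter

# Hypothetical data set (for example, number of siblings), invented for illustration
values = [0, 1, 1, 2, 2, 2, 3, 3, 4, 6]

# Count how many individuals share each value
dots = Counter(values)

# Text dotplot: one '*' per individual, one row per possible value
for v in range(min(values), max(values) + 1):
    print(f"{v:>2} " + "*" * dots[v])

# Range = largest value - smallest value
data_range = max(values) - min(values)
print("range:", data_range)  # range: 6
```

A value with no individuals, such as 5 here, shows up as an empty row, which makes gaps in the data easy to see.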

What we know about histograms:

  • Individual variable values are not visible.
  • Grouping individuals into bins of equal-sized intervals is particularly useful when analyzing large data sets.
  • We can easily use percentages, also called relative frequencies, to describe the distribution.
  • Descriptions of shape, center, and spread are affected by how the bins are defined.
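The last two points can be sketched in a few lines of code: the example below converts bin counts to percentages and then re-bins the same data with a different starting value to show that the bar heights change. The scores, the starting values, and the function names histogram and relative_frequencies are assumptions chosen for illustration only.

```python
def histogram(values, start, bin_width, num_bins):
    """Return a list of counts, one per interval [left, left + bin_width)."""
    counts = [0] * num_bins
    for v in values:
        index = int((v - start) // bin_width)
        counts[index] += 1
    return counts

def relative_frequencies(counts):
    """Convert counts to percentages of the total, rounded to one decimal place."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]

# Hypothetical quiz scores, made up for this example only
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 79, 82, 85, 91]

# Same data and same binwidth, but the bins start at different values
wide = histogram(scores, start=50, bin_width=10, num_bins=5)     # 50-60, 60-70, ...
shifted = histogram(scores, start=45, bin_width=10, num_bins=5)  # 45-55, 55-65, ...

print(wide)                           # [2, 4, 6, 2, 1]
print(relative_frequencies(wide))     # [13.3, 26.7, 40.0, 13.3, 6.7]
print(shifted)                        # [1, 3, 5, 4, 2]
```

With bins starting at 50, the tallest bar holds 40% of the scores. Shifting the bins to start at 45 spreads the same scores differently across the bars, so a description of the distribution depends partly on how the bins were chosen.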