Interpreting the Mean and Median: Fresh Take

  • Name the features of the distribution of a data set using statistical language
  • Describe the connection between the distribution of a data set and its mean and median
Recall that we think of the mean as the “average” data value and the median as the [latex]50[/latex]th percentile, the value that splits the data in half. 
Let’s say the mean of a data set is given as [latex]10.5[/latex] and the median as [latex]11[/latex]. Which of the following statements are true? Explain.

  1. The median tells us a typical value for this data set. That is, if we took all the values and spread them evenly about, each value would be about [latex]11[/latex].
  2. About half the data values fall below [latex]11[/latex] and half fall above.
  3. The most common data value appearing is [latex]10.5[/latex].
  4. A typical data value for this set is [latex]10.5[/latex]. That is, if we distributed the sum of all the values evenly, each value would be about [latex]10.5[/latex].

Recall the data set about employee salaries.

Suppose that the first data set lists the monthly salaries (in thousands of dollars) for all six employees at a company during the month of January. For example, Employee [latex]1[/latex] made [latex]\$4,000[/latex] in January, Employee [latex]2[/latex] made [latex]\$6,000[/latex], and so on. We’ll consider this amount the regular salary per month for each of these employees.
The second data set lists the monthly salaries (in thousands of dollars) for the same six employees during the month of February.

| Employee | Monthly Salary in January (in thousands of dollars) | Monthly Salary in February (in thousands of dollars) |
|---|---|---|
| Employee 1 | [latex]4[/latex] | [latex]4[/latex] |
| Employee 2 | [latex]6[/latex] | [latex]8[/latex] |
| Employee 3 | [latex]3[/latex] | [latex]3[/latex] |
| Employee 4 | [latex]5[/latex] | [latex]5[/latex] |
| Employee 5 | [latex]6[/latex] | [latex]6[/latex] |
| Employee 6 | [latex]3[/latex] | [latex]3[/latex] |

We saw that the median and the mean employee salaries for January were the same. What can we understand from that information?

  1. The median of the data set implies that ____________ made more than [latex]\$4,500[/latex] in January and _________ made less.
  2. The mean of the data set implies that if the January salaries had been added up and evenly distributed across all six employees, each person would have received ________________.

It was interesting that the mean and the median were identical values. This suggests that the salaries were evenly balanced between high and low values and that the distribution was symmetric, without skew.
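
If you would like to check the January values with software, here is a minimal sketch using Python's standard `statistics` module. The variable names are illustrative and not part of the original activity; the salaries are the January column of the table.

```python
from statistics import mean, median

# January monthly salaries in thousands of dollars, Employees 1 through 6.
january = [4, 6, 3, 5, 6, 3]

# Mean: the sum of the salaries (27) shared evenly among 6 employees.
january_mean = mean(january)

# Median: the middle of the sorted list 3, 3, 4, 5, 6, 6,
# which is the average of the two middle values, 4 and 5.
january_median = median(january)

# Count how many salaries fall on each side of the median.
above = sum(1 for salary in january if salary > january_median)
below = sum(1 for salary in january if salary < january_median)

print("mean:", january_mean)
print("median:", january_median)
print("above the median:", above, "| below the median:", below)
```

Both measures come out to [latex]4.5[/latex] (that is, [latex]\$4,500[/latex]), with three salaries above that value and three below it.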

But what happens if we change one of the values in the data set?

Comparing Mean and Median

Let’s look at the data set of employee salaries from February, which includes a big raise for one employee. How will the mean of the February salaries compare to the mean of the January salaries?

Was the mean you calculated for February salaries higher, lower, or similar? What do you think caused that to be true?
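
To compare your hand calculation with software, the following sketch computes both measures for each month using Python's standard `statistics` module. The variable names are assumptions made for illustration; the data come from the two salary columns in the table.

```python
from statistics import mean, median

# Monthly salaries in thousands of dollars, Employees 1 through 6.
january = [4, 6, 3, 5, 6, 3]
february = [4, 8, 3, 5, 6, 3]  # only Employee 2 changed, from 6 to 8

jan_mean, jan_median = mean(january), median(january)
feb_mean, feb_median = mean(february), median(february)

# The February total is 29 instead of 27, so the mean rises.
print(f"January:  mean = {jan_mean:.2f}, median = {jan_median}")
print(f"February: mean = {feb_mean:.2f}, median = {feb_median}")
```

The raise adds [latex]2[/latex] (thousand dollars) to the total, so the mean rises by [latex]2/6[/latex], or about [latex]0.33[/latex], while the two middle values of the sorted data are still [latex]4[/latex] and [latex]5[/latex].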

Now let’s consider a slightly different question.

It may take some time before you really feel comfortable interpreting means and medians and understanding what they imply about a data set. A key idea to take from this activity is that when one value changes by a large amount, the median stays relatively fixed but the mean does not. This tells us that the mean is sensitive to the presence of extreme values in the data set.
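
To see this sensitivity more dramatically, the sketch below replaces Employee 2's salary with increasingly large values and recomputes both measures. The helper `with_salary` and the values [latex]20[/latex] and [latex]100[/latex] are hypothetical and chosen only to exaggerate the effect; [latex]6[/latex] and [latex]8[/latex] are the January and February salaries from the table.

```python
from statistics import mean, median

# January salaries in thousands of dollars, Employees 1 through 6.
january = [4, 6, 3, 5, 6, 3]

def with_salary(salaries, index, new_value):
    """Return a copy of salaries with one entry replaced (hypothetical helper)."""
    changed = list(salaries)
    changed[index] = new_value
    return changed

# Employee 2 sits at index 1. The values 20 and 100 are made up for illustration.
results = []
for new_salary in [6, 8, 20, 100]:
    data = with_salary(january, 1, new_salary)
    results.append((new_salary, mean(data), median(data)))

for new_salary, m, med in results:
    print(f"Employee 2 earns {new_salary}: mean = {m:.2f}, median = {med}")
```

In every case the median stays at [latex]4.5[/latex], because the two middle values of the sorted data do not change, while the mean climbs with each larger salary.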