Measures of Variability: Fresh Take

  • Calculate and describe standard deviation.

Here is a breakdown of the formula for the standard deviation of a sample, [latex]s[/latex]:

[latex]s=\sqrt{\dfrac{\sum \left(x-\bar{x}\right)^2}{n-1}}[/latex]

  • The distance from each observation to the mean is known as a deviation from the mean and is expressed as [latex]\left(x-\bar{x}\right)[/latex].
  • The deviations from the mean are squared in the formula because some observations are above the mean, so [latex]\left(x-\bar{x}\right)>0[/latex] (the difference is positive), and some observations are below the mean, so [latex]\left(x-\bar{x}\right)<0[/latex] (the difference is negative). Squaring makes every deviation nonnegative, so the positive and negative differences cannot cancel each other out when summed.
  • The [latex]\sum[/latex] symbol sums up the squared deviations for all [latex]n[/latex] observations.
  • The denominator in the formula for a sample standard deviation is [latex]\left(n-1\right)[/latex] rather than [latex]n[/latex] as in the formula for the population standard deviation.
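    • Why do we divide by 1 fewer than the sample size, [latex]\left(n-1\right)[/latex]? Because the deviations are measured from the sample mean, they always sum to zero, so only [latex]n-1[/latex] of them are free to vary. Dividing by [latex]n[/latex] would tend to underestimate the spread of the population, and dividing by [latex]\left(n-1\right)[/latex] compensates for this.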

  • The square root is taken in order to express the spread in the units of the observations. Recall that squaring the deviations, which keeps them from canceling, also squared their units. Taking the square root can be thought of as “undoing” the earlier squaring. For example, suppose the data are in dollars. Without the square root, the result (known as the variance) would be in dollars squared, which is not a commonly used unit.
  • The standard deviation, [latex]s[/latex], represents the “typical” distance of an observation from the mean of the data set.
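
The points in this breakdown can be checked numerically. The following is a minimal Python sketch, assuming a small made-up data set in dollars (the values and variable names are illustrative and do not come from the text). It shows that the raw deviations cancel when summed, while the squared deviations do not, and that the square root returns the result to the original units.

```python
import math

# Hypothetical sample, in dollars (illustrative values only)
data = [10, 12, 15, 19]
n = len(data)
mean = sum(data) / n

# Deviations from the mean: some are negative, some are positive
deviations = [x - mean for x in data]
sum_of_deviations = sum(deviations)  # positives and negatives cancel

# Squaring makes every term nonnegative (units are now dollars squared)
squared_deviations = [d ** 2 for d in deviations]
sum_of_squares = sum(squared_deviations)

# Divide by n - 1 because this is a sample, then take the square root
# to return to the original units (dollars)
sample_variance = sum_of_squares / (n - 1)
s = math.sqrt(sample_variance)

print(deviations)         # [-4.0, -2.0, 1.0, 5.0]
print(sum_of_deviations)  # 0.0
print(round(s, 2))        # 3.92
```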

The following steps can be applied to calculate a standard deviation by hand:

  1. Calculate the mean of the population or sample.
  2. Find the deviation between each data value and the mean.
  3. Square each of the deviations.
  4. Add up all the squared differences and divide by:
    • the total number of observations ([latex]n[/latex]) in the case of a population.
    • 1 fewer than the total ([latex]n-1[/latex]) in the case of a sample.
  5. Take the square root of the result of step 4.
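
The five steps above can be sketched as a short Python function. This is one possible translation, not a standard implementation: the function name `standard_deviation`, the `sample` flag, and the example values are assumptions made for illustration.

```python
import math

def standard_deviation(values, sample=True):
    """Follow the five hand-calculation steps for a list of numbers."""
    n = len(values)
    if n == 0 or (sample and n < 2):
        raise ValueError("not enough observations")

    mean = sum(values) / n                    # Step 1: mean
    deviations = [x - mean for x in values]   # Step 2: deviations from the mean
    squared = [d ** 2 for d in deviations]    # Step 3: square each deviation
    divisor = n - 1 if sample else n          # Step 4: n - 1 (sample) or n (population)
    variance = sum(squared) / divisor
    return math.sqrt(variance)                # Step 5: square root

# Illustrative values only
values = [1, 2, 3, 4, 5]
print(round(standard_deviation(values, sample=True), 4))   # 1.5811
print(round(standard_deviation(values, sample=False), 4))  # 1.4142
```

For the same data, the sample result is slightly larger than the population result because the sum of squared deviations is divided by a smaller number.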

Calculating the Standard Deviation

Let’s consider this small sample data set: [latex]2, 2, 4, 5, 6, 7, 9[/latex].

a) Find the mean and the standard deviation of the data set.

Center and spread: the mean is [latex]5[/latex] and the sample standard deviation is approximately [latex]2.58[/latex].
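
One way to reach these values by hand is to apply the sample formula with [latex]n=7[/latex]. The work below is a sketch of that calculation, with the final value rounded to two decimal places.

[latex]\bar{x}=\dfrac{2+2+4+5+6+7+9}{7}=\dfrac{35}{7}=5[/latex]

[latex]\sum \left(x-\bar{x}\right)^2=(-3)^2+(-3)^2+(-1)^2+0^2+1^2+2^2+4^2=9+9+1+0+1+4+16=40[/latex]

[latex]s=\sqrt{\dfrac{40}{7-1}}=\sqrt{\dfrac{40}{6}}\approx\sqrt{6.67}\approx 2.58[/latex]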

b) Find the “typical” range of values for this data set. 

The typical range of values lies within one standard deviation of the mean: from [latex]5-2.58=2.42[/latex] to [latex]5+2.58=7.58[/latex].
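
These results can be cross-checked in software. The sketch below assumes Python's built-in `statistics` module, whose `stdev` function uses the sample formula (dividing by [latex]n-1[/latex]). The variable names are illustrative.

```python
import statistics

data = [2, 2, 4, 5, 6, 7, 9]

mean = statistics.mean(data)   # center
s = statistics.stdev(data)     # sample standard deviation (divides by n - 1)

# "Typical" range: one standard deviation on either side of the mean
lower = mean - s
upper = mean + s

print(mean)             # 5
print(round(s, 2))      # 2.58
print(round(lower, 2))  # 2.42
print(round(upper, 2))  # 7.58
```

If the seven values were an entire population rather than a sample, `statistics.pstdev` (which divides by [latex]n[/latex]) would be the matching function.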