The distance from each observation to the mean is known as a deviation from the mean and is expressed as [latex]\left(x-\bar{x}\right)[/latex].
The deviations from the mean are squared in the formula because some observations are above the mean, thus [latex]\left(x-\bar{x}\right)>0[/latex] (the difference is positive), and some observations are below the mean, thus [latex]\left(x-\bar{x}\right)<0[/latex] (the difference is negative). Squaring ensures the differences will each be expressed as positive distances and won’t cancel each other out when summed up.
The [latex]\sum[/latex] symbol sums up the squared deviations for all [latex]n[/latex] observations.
The denominator in the formula for a sample standard deviation is [latex]\left(n-1\right)[/latex] rather than [latex]n[/latex] as in the formula for the population standard deviation.
Why do we divide by 1 fewer than the sample size, [latex]\left(n-1\right)[/latex]?
Because dividing by [latex]n[/latex] would underestimate the spread. Recall that a sample is representative of a population if the characteristics of the sample tend to be similar to the characteristics of the population from which it was obtained. A standard deviation computed from a sample by dividing by [latex]n[/latex] tends to underestimate the population standard deviation. (This can be shown mathematically, but it’s beyond the scope of what we need here.) Dividing by the smaller number [latex]\left(n-1\right)[/latex] rather than by [latex]n[/latex] makes the sample standard deviation slightly larger, which compensates for the underestimation.
Because we are using degrees of freedom in the denominator. You may have heard that the denominator in the standard deviation formula is called the degrees of freedom, abbreviated df. That’s true, and it helps us to compensate for the underestimation that crops up when we divide strictly by sample size. There’s a lot going on here mathematically, but we can think of it this way: Dividing by [latex]\left(n-1\right)[/latex] instead of [latex]n[/latex] helps our sample standard deviation more closely resemble the true (usually unknowable) population standard deviation. This will help make our statistical analysis more reasonable.
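The underestimation can be checked on a tiny example. The sketch below is only an illustration under stated assumptions: it uses a made-up population of three values ([latex]4, 5, 6[/latex]), lists every possible sample of size [latex]3[/latex] drawn with replacement, and averages the squared spread (the value under the square root) when dividing by [latex]n[/latex] versus [latex]n-1[/latex]. All names in the code are hypothetical.

```python
from itertools import product

# Assumed toy population; any small list of numbers would work.
population = [4, 5, 6]
n = 3  # sample size


def mean(values):
    return sum(values) / len(values)


def sum_sq_dev(values):
    """Sum of squared deviations from the mean of `values`."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values)


# Squared spread of the whole population (divide by the population size).
pop_variance = sum_sq_dev(population) / len(population)

# Every possible sample of size n, drawn with replacement (27 samples).
samples = list(product(population, repeat=n))

# Average squared spread over all samples, using each denominator.
avg_div_n = mean([sum_sq_dev(s) / n for s in samples])
avg_div_n_minus_1 = mean([sum_sq_dev(s) / (n - 1) for s in samples])

print("population:        ", round(pop_variance, 4))
print("divide by n:       ", round(avg_div_n, 4))
print("divide by (n - 1): ", round(avg_div_n_minus_1, 4))
```

Under these assumptions, dividing by [latex]n[/latex] gives an average that falls below the population value, while dividing by [latex]n-1[/latex] matches it on average. The comparison is made before taking the square root, where the correction is exact.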
What are degrees of freedom, anyway? A nice way to think of degrees of freedom [latex]\left(n-1\right)[/latex] is to imagine a set of three numbers whose mean is, for example, [latex]5[/latex]: [latex]4[/latex], [latex]5[/latex], and [latex]6[/latex]. If those three numbers were written on pieces of paper in a hat, and you chose two of them first, say [latex]4[/latex] and [latex]5[/latex], the only way to get a mean of [latex]5[/latex] from the numbers on three scraps of paper would be for the last scrap to have a [latex]6[/latex] on it. We could say that the first two scraps were free to vary; they could have been [latex]4[/latex] or [latex]5[/latex] or [latex]6[/latex] as they pleased. But the third pick couldn’t vary. After choosing the [latex]4[/latex] and the [latex]5[/latex] freely, there was no freedom left for the third choice if we are to obtain the desired mean. Only two of our choices were free to vary, so we say that a sample of size [latex]3[/latex] has [latex]\left(3-1\right)=2[/latex] degrees of freedom.
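The hat example can be written as a short calculation. This is a minimal sketch using the numbers from the paragraph above; the variable names are hypothetical. Once the mean and the first two values are fixed, the third value is forced.

```python
target_mean = 5       # the mean the three numbers must have
n = 3                 # how many numbers are in the hat
free_choices = [4, 5]  # the two values chosen freely

# All n values must add up to n * mean, so the last value has no freedom.
last_value = n * target_mean - sum(free_choices)

# Only the freely chosen values count as degrees of freedom.
degrees_of_freedom = n - 1

print("forced last value:", last_value)
print("degrees of freedom:", degrees_of_freedom)
```

Changing the two free choices changes the forced value, but never the count of free choices, which stays at [latex]n-1[/latex].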
The square root is taken in order to express the spread in terms of the units of the observations. Recall that we squared the differences to express them as positive distances, which resulted in squared observation units. Taking the square root can be thought of as “undoing” the earlier squaring. For example, assume that within the context in which you are working, the data are in terms of dollars. If we did not take the square root, our measure of spread would be in terms of dollars squared, which is not a commonly used unit.
The standard deviation, [latex]s[/latex], represents the “typical” distance of an observation from the mean of the data set.
The following steps can be applied to calculate a standard deviation by hand:
Calculate the mean of the population or sample.
Find the deviation between each data value and the mean.
Square each of the deviations.
Add up all the squared differences and divide by:
the total number of observations ([latex]n[/latex]) in the case of a population.
1 fewer than the total ([latex]n-1[/latex]) in the case of a sample.
Take the square root of the result of step 4.
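The five steps above can be sketched as a small function. This is one possible way to write it, not a standard library routine: the function name and the is_sample flag are assumptions made for illustration. Python's built-in statistics module is used only to cross-check the result.

```python
import math
import statistics


def standard_deviation(data, is_sample=True):
    """Follow the five hand-calculation steps for a list of numbers."""
    n = len(data)
    if n < 2 and is_sample:
        raise ValueError("a sample needs at least two observations")

    mean = sum(data) / n                      # step 1: the mean
    deviations = [x - mean for x in data]     # step 2: deviations from the mean
    squared = [d ** 2 for d in deviations]    # step 3: square each deviation
    divisor = n - 1 if is_sample else n       # step 4: choose the denominator
    average_squared = sum(squared) / divisor  # step 4: sum, then divide
    return math.sqrt(average_squared)         # step 5: square root


values = [1, 2, 3, 4, 5]  # made-up data for the usage example
print("treated as a sample:    ", standard_deviation(values))
print("treated as a population:", standard_deviation(values, is_sample=False))

# Cross-check against the standard library.
print("statistics.stdev: ", statistics.stdev(values))
print("statistics.pstdev:", statistics.pstdev(values))
```

The only difference between the two cases is the divisor chosen in step 4, so a single flag is enough to switch between them.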
Calculating the Standard Deviation
Let’s consider this small sample data set: [latex]2, 2, 4, 5, 6, 7, 9[/latex].
a) Find the mean and the standard deviation of the data set.
Center and spread: the mean is [latex]5[/latex] and the standard deviation is [latex]2.58[/latex].
Here are the steps:
Calculate the mean of the sample: [latex]\bar{x}=\frac{2+2+4+5+6+7+9}{7}=\frac{35}{7}=5[/latex]
Find the deviation between each data value and the mean ([latex]x-\bar{x}[/latex]): [latex]\begin{array}{l}2-5=-3\\ 2-5=-3\\ 4-5=-1\\ 5-5=0\\ 6-5=1\\ 7-5=2\\ 9-5=4\end{array}[/latex]
Square each of the deviations ([latex]\left(x-\bar{x}\right)^2[/latex]): [latex]\begin{array}{l}{(2-5)}^{2}={(-3)}^{2}=9\\ {(2-5)}^{2}={(-3)}^{2}=9\\ {(4-5)}^{2}={(-1)}^{2}=1\\ {(5-5)}^{2}={0}^{2}=0\\ {(6-5)}^{2}={1}^{2}=1\\ {(7-5)}^{2}={2}^{2}=4\\ {(9-5)}^{2}={4}^{2}=16\end{array}[/latex]
Add up all the squared deviations and divide by [latex]n-1[/latex] (the count minus 1). Note that we divide by [latex]n-1[/latex] instead of [latex]n[/latex] because our data set is from a sample. [latex]\frac{9+9+1+0+1+4+16}{7-1}=\frac{40}{6}\approx 6.67[/latex]
To scale back the value to account for the squaring we did in step 3, we take the square root of the value we found in step 4: [latex]\sqrt{6.67}\approx 2.58[/latex]
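The arithmetic in these steps can be double-checked by computing each intermediate value in turn. The sketch below uses the data set from this example; the variable names are hypothetical.

```python
import math

data = [2, 2, 4, 5, 6, 7, 9]
n = len(data)

mean = sum(data) / n                               # mean of the sample
deviations = [x - mean for x in data]              # each value minus the mean
squared_deviations = [d ** 2 for d in deviations]  # each deviation squared
total = sum(squared_deviations)                    # sum of squared deviations
average_squared = total / (n - 1)                  # divide by n - 1 (sample)
s = math.sqrt(average_squared)                     # square root gives s

print("mean:", mean)
print("deviations:", deviations)
print("squared deviations:", squared_deviations)
print("sum of squared deviations:", total)
print("divided by n - 1:", round(average_squared, 2))
print("standard deviation:", round(s, 2))
```

Printing each stage makes it easy to compare the output line by line with the hand calculation.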
b) Find the “typical” range of values for this data set.
Typical range of values is between [latex]2.42[/latex] and [latex]7.58[/latex].
The shaded box on the following dotplot indicates [latex]1[/latex] standard deviation to the right and left of the mean.
Figure 1. Dotplot showing 1 standard deviation to the right and left of the mean.
The standard deviation represents the “typical” distance of an observation from the mean of the data set.
A standard deviation on either side of the mean gives a range of typical values: [latex]5-2.58=2.42[/latex] and [latex]5+2.58=7.58[/latex].
So, the typical data values are between [latex]2.42[/latex] and [latex]7.58[/latex].
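The typical range can also be computed directly, along with a list of the observations that fall inside it. This is a minimal sketch using Python's built-in statistics module on the same data set; the variable names are hypothetical.

```python
import statistics

data = [2, 2, 4, 5, 6, 7, 9]

mean = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)

# One standard deviation to the left and right of the mean.
lower = mean - s
upper = mean + s

# Observations that fall inside the typical range.
typical_values = [x for x in data if lower <= x <= upper]

print("typical range:", round(lower, 2), "to", round(upper, 2))
print("observations in range:", typical_values)
```

For this data set, the observations [latex]4, 5, 6,[/latex] and [latex]7[/latex] fall inside the range, while the two [latex]2[/latex]s and the [latex]9[/latex] lie outside it.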