Chi-Square Test of Homogeneity – Learn It 6

  • Understand the standardized residuals from a chi-square test of homogeneity

Even though we have already drawn a conclusion from our hypothesis test, we can learn more by looking at the difference between the observed count and the expected count in each cell. The data analysis tool calls this difference the residual for that cell. The idea is similar to the residuals you saw in linear regression, where a residual is the difference between an observed value and a predicted value.

Residuals are calculated using the formula: [latex]\text{Residual} = \text{Observed} - \text{Expected}[/latex]
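To make the formula concrete, here is a minimal Python sketch. The table of counts is hypothetical and is not the airline data from this lesson. The expected counts use the usual chi-square rule, row total times column total divided by the grand total, which is assumed to match what the data analysis tool reports.

```python
# Hypothetical 2x3 table of observed counts (NOT the airline data):
# rows are the two populations, columns are the response categories.
observed = [
    [30, 50, 20],
    [20, 40, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under the null hypothesis:
# (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Residual = Observed - Expected, computed cell by cell.
residuals = [
    [o - e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

print(expected)   # [[25.0, 45.0, 30.0], [25.0, 45.0, 30.0]]
print(residuals)  # [[5.0, 5.0, -10.0], [-5.0, -5.0, 10.0]]
```

Notice that the residuals in each row and each column add up to zero, because the expected counts are built from the same row and column totals as the observed counts.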

Since the values in our cells may vary quite a bit, it’s a good idea to look at what the data analysis tool calls standardized residuals instead.

standardized residuals

Standardized residuals rescale the residuals so that, if the null hypothesis is assumed to be true, they can be interpreted approximately like [latex]z[/latex]-scores from a standard normal distribution.

These are sometimes referred to as standardized Pearson residuals.

In particular, if the null hypothesis is true, most standardized residuals for a given test will fall between [latex]-2[/latex] and [latex]2[/latex]. We can use these standardized residuals to judge how far each observed count is from what was expected if the null hypothesis is true (i.e., if the distributions are really the same). A standardized residual well outside this range points to a cell where the observed count differs from the expected count by more than chance alone would usually explain. The sign of the standardized residual tells us whether we observed more cases in that cell than we expected (a positive residual) or fewer cases than we expected (a negative residual).
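The lesson does not state the formula the data analysis tool uses. As an assumption, the sketch below uses one common definition of the standardized (adjusted) Pearson residual: [latex]\dfrac{\text{Observed} - \text{Expected}}{\sqrt{\text{Expected}\,(1 - \text{row proportion})(1 - \text{column proportion})}}[/latex], where the row and column proportions are the row total and column total divided by the grand total. The counts are hypothetical, and the function name `standardized_residuals` is made up for this illustration.

```python
import math

def standardized_residuals(observed):
    """Standardized (adjusted) Pearson residuals for a two-way table.

    Assumed formula (not confirmed by the lesson):
    (O - E) / sqrt(E * (1 - row proportion) * (1 - column proportion))
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(observed):
        out_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            spread = math.sqrt(e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            out_row.append((o - e) / spread)
        result.append(out_row)
    return result

# Hypothetical 2x3 table of observed counts (NOT the airline data).
observed = [
    [30, 50, 20],
    [20, 40, 40],
]

z = standardized_residuals(observed)

# Flag cells whose standardized residual falls outside the range -2 to 2.
flagged = [
    (i, j)
    for i, row in enumerate(z)
    for j, value in enumerate(row)
    if abs(value) > 2
]
print(flagged)  # [(0, 2), (1, 2)]
```

In this made-up table, only the third column is flagged: the first population has fewer cases there than expected (a negative standardized residual) and the second population has more (a positive one).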



A word of caution: As we saw in the preview assignment, the degrees of freedom for a chi-square test of homogeneity are not related to the sample size at all, so they do not increase as the sample size increases. The degrees of freedom depend only on the number of rows and columns in the associated two-way table. As a consequence, when the sample size is very large, a chi-square test may reject the null hypothesis even when the actual differences between the distributions are small. In our airline example, we had a very large sample size for each population, and we got a very small P-value that led us to reject the null hypothesis.
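The effect of sample size can be illustrated with a short Python sketch. Both tables below are hypothetical: the second has exactly the same percentages as the first (52% versus 48%) with every count multiplied by 100. The helper name `chi_square` is made up for this illustration, and the critical value 3.841 is the standard cutoff for 1 degree of freedom at the 5% significance level.

```python
def chi_square(observed):
    """Return the chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            statistic += (o - e) ** 2 / e
    # Degrees of freedom depend only on the table's shape, not on n.
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Hypothetical tables: same percentages, very different sample sizes.
small = [[52, 48], [48, 52]]
large = [[5200, 4800], [4800, 5200]]

chi2_small, df_small = chi_square(small)
chi2_large, df_large = chi_square(large)

CRITICAL_VALUE = 3.841  # chi-square cutoff for df = 1 at the 5% level

print(df_small, df_large)           # 1 1
print(chi2_small > CRITICAL_VALUE)  # False: fail to reject the null hypothesis
print(chi2_large > CRITICAL_VALUE)  # True: reject the null hypothesis
```

The difference between the two populations is the same four percentage points in both tables, yet only the larger sample leads to rejecting the null hypothesis. The test statistic grew by a factor of 100 while the degrees of freedom stayed at 1.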