- Write the null and alternative hypotheses for a chi-square test
- Calculate and interpret the value of a chi-square statistic in the context of a real-world problem
We use the chi-square hypothesis test to determine whether the data “fit” a particular distribution or not. The chi-square statistic measures the size of the differences between the expected counts and the actual observed counts.
Chi-Square Test Statistic ([latex]\chi^2[/latex])
You just calculated the value of the chi-square (pronounced “kai-square”) test statistic for this problem.
[latex]\chi^2[/latex] test statistic
The [latex]\chi^2[/latex] test statistic measures the overall distance between observed and expected counts.
The greater the chi-square test statistic, the further the observed counts are from what we expected.
Here is the formula for the chi-square test statistic:
[latex]\chi^2=\sum\dfrac{(\text{Observed}-\text{Expected})^2}{\text{Expected}}[/latex]
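As a rough illustration, the formula can be written as a short Python function. This is only a sketch: the function name `chi_square_statistic` and the counts below are made up for illustration and are not the data from the question above.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (Observed - Expected)^2 / Expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of categories")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for four categories (for example, four quarters of a year).
observed = [30, 20, 25, 25]
expected = [25, 25, 25, 25]

chi_sq = chi_square_statistic(observed, expected)
print(chi_sq)  # 2.0
```

Only the first two categories differ from their expected counts, and each contributes 25/25 = 1 to the total. If every observed count matched its expected count, the statistic would be 0.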
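This formula shows what we did in the question above: for each quarter of the year (each category), we computed [latex]\dfrac{(O-E)^2}{E}[/latex], where [latex]O[/latex] is the observed count and [latex]E[/latex] is the expected count, and then we added the results together. The large sigma, [latex]\sum[/latex], represents summation.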
It’s important to remember the intuition behind this formula: we take the differences between the observed and expected counts, square them so that positive and negative differences do not cancel each other out, and then scale each squared difference by dividing it by the expected count. In this way, we get a single measure of the overall difference between the observed and expected counts for a categorical variable.
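The three steps (difference, square, scale) can be traced one at a time in a small Python sketch. The counts here are hypothetical, chosen only to show why each step is needed. They are not taken from the question above.

```python
# Hypothetical observed and expected counts for four categories.
observed = [18, 32, 40, 110]
expected = [25, 25, 50, 100]

# Step 1: differences between observed and expected counts.
differences = [o - e for o, e in zip(observed, expected)]

# Step 2: square each difference so negatives and positives do not cancel.
squared = [d ** 2 for d in differences]

# Step 3: scale each squared difference by its expected count.
scaled = [s / e for s, e in zip(squared, expected)]

# Add the scaled values to get the chi-square statistic.
chi_sq_total = sum(scaled)

print("differences:", differences)
print("sum of raw differences:", sum(differences))
print("squared differences:", squared)
print("scaled contributions:", scaled)
print("chi-square statistic:", round(chi_sq_total, 2))
```

With these counts the raw differences add up to zero, so they say nothing about the overall distance until they are squared. The last two categories have the same squared difference (100), but the category with the smaller expected count (50) contributes twice as much as the one with the larger expected count (100). That is the effect of the scaling step.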