Chi-Square Test of Independence – Learn It 3

  • Complete a chi-square test of independence
  • Write the conclusion of a chi-square test of independence in the context of the problem

The mechanics of performing a chi-square test of independence are the same as those for the chi-square test of homogeneity.

Since we are dealing with two variables here instead of just one, we can find the expected counts for each cell by focusing on the marginal distribution of either variable.

marginal distribution

The marginal distribution of a variable gives the distribution of one of the variables with no regard to the other variable whatsoever.

In the table, this will be either the total row or the total column. One way to remember this is that the “margins” are on the outsides of a piece of paper (sides, top, and bottom), and the total row and column are the outside row and column of the table (on the side and bottom).

| Education level | Income level: [latex]<$30,000[/latex] | Income level: [latex]$30,000–$74,999[/latex] | Income level: [latex]$75,000[/latex] and up | Total |
|---|---|---|---|---|
| Post-Grad Degree | [latex]2[/latex] | [latex]8[/latex] | [latex]46[/latex] | [latex]56[/latex] |
| College Degree | [latex]39[/latex] | [latex]113[/latex] | [latex]202[/latex] | [latex]354[/latex] |
| Some College | [latex]131[/latex] | [latex]138[/latex] | [latex]120[/latex] | [latex]389[/latex] |
| HS Grad | [latex]175[/latex] | [latex]129[/latex] | [latex]65[/latex] | [latex]369[/latex] |
| No HS Degree | [latex]78[/latex] | [latex]32[/latex] | [latex]8[/latex] | [latex]118[/latex] |
| Total | [latex]425[/latex] | [latex]420[/latex] | [latex]441[/latex] | [latex]1,286[/latex] |
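The marginal totals and proportions can also be computed directly from the cell counts. The following is a minimal Python sketch, not part of the original lesson; the variable names (`observed`, `income_marginal`, and so on) are illustrative choices, and the counts are copied from the table.

```python
# Observed counts from the table: one entry per education level,
# with counts ordered by income level (low, middle, high).
observed = {
    "Post-Grad Degree": [2, 8, 46],
    "College Degree": [39, 113, 202],
    "Some College": [131, 138, 120],
    "HS Grad": [175, 129, 65],
    "No HS Degree": [78, 32, 8],
}
income_levels = ["<$30,000", "$30,000-$74,999", "$75,000 and up"]

# The "Total" column of the table: one total per education level.
row_totals = {edu: sum(counts) for edu, counts in observed.items()}

# The "Total" row of the table: one total per income level.
col_totals = [sum(col) for col in zip(*observed.values())]

# The grand total in the bottom-right corner.
grand_total = sum(row_totals.values())

# Marginal distributions: each total divided by the grand total.
education_marginal = {edu: t / grand_total for edu, t in row_totals.items()}
income_marginal = {
    level: t / grand_total for level, t in zip(income_levels, col_totals)
}

for level, proportion in income_marginal.items():
    print(f"{level}: {proportion:.4f}")
```

Each marginal distribution ignores the other variable entirely: `income_marginal` uses only the total row, and `education_marginal` uses only the total column.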

If Income level and Education level are independent, the proportion of people with incomes under [latex]$30,000[/latex] should be the same regardless of education level, so it should match the overall proportion of individuals with incomes under [latex]$30,000[/latex]:

[latex]\dfrac{\text{Total individuals with incomes under }$30,000}{\text{Total individuals in the sample}}[/latex][latex]=\dfrac{425}{1286} = 0.33048212 \text{ or } 33.048212\%[/latex]

If the two variables are independent, the overall (marginal) proportions for Income level should hold within every Education level, and likewise the marginal proportions for Education level should hold within every Income level.

For example, about [latex]33.05\%[/latex] of the [latex]56[/latex] people with post-grad degrees should have an income level under [latex]$30,000[/latex]:

[latex]33.048212\% \text{ of } 56 = 0.33048212 \times 56 = 18.507[/latex]

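Equivalently, we can start from the marginal distribution of Education level. About [latex]\dfrac{56}{1286} \approx 4.35\%[/latex] of all individuals sampled have post-graduate degrees, so if the variables are independent, about [latex]4.35\%[/latex] of the [latex]425[/latex] individuals with an income level under [latex]$30,000[/latex] should have post-graduate degrees. This gives the same expected count: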

[latex]4.354588\% \text{ of } 425 = 0.04354588 \times 425 = 18.507[/latex]
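Both routes give the same number because each one reduces to (row total × column total) ÷ grand total. As a rough illustration, the Python sketch below checks this for the post-grad, under-[latex]$30,000[/latex] cell and then fills in the expected count for every cell. The variable names are hypothetical, and the counts are copied from the table.

```python
# Observed counts: rows are education levels (Post-Grad, College,
# Some College, HS Grad, No HS Degree); columns are income levels
# (<$30,000, $30,000-$74,999, $75,000 and up).
observed = [
    [2, 8, 46],
    [39, 113, 202],
    [131, 138, 120],
    [175, 129, 65],
    [78, 32, 8],
]

row_totals = [sum(row) for row in observed]        # 56, 354, 389, 369, 118
col_totals = [sum(col) for col in zip(*observed)]  # 425, 420, 441
n = sum(row_totals)                                # 1286

# Route 1: apply the marginal income proportion to the 56 post-grads.
via_income = (col_totals[0] / n) * row_totals[0]

# Route 2: apply the marginal education proportion to the 425 low-income individuals.
via_education = (row_totals[0] / n) * col_totals[0]

print(round(via_income, 3), round(via_education, 3))

# Expected count for every cell under independence:
# (row total * column total) / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 3) for e in row])
```

The expected counts in each row and column add up to the same totals as the observed counts, which is a quick check that the table was filled in correctly.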

Calculating each expected count for a table is tedious, but it is needed to calculate the [latex]\chi^2[/latex] value to make an inference about the population.

Fortunately, we can use technology to carry out all of these calculations.
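As one possible illustration of what such technology does behind the scenes, the Python sketch below computes the [latex]\chi^2[/latex] statistic as the sum of (observed − expected)² ÷ expected over all cells, with degrees of freedom (rows − 1) × (columns − 1). It is a sketch under stated assumptions, not the tool used in this course: the helper `chi2_sf_even_df` is a hypothetical name, and it relies on a closed-form formula for the chi-square upper-tail probability that applies only when the degrees of freedom are even, which happens to be the case for this table.

```python
import math

# Observed counts: rows are education levels, columns are income levels.
observed = [
    [2, 8, 46],
    [39, 113, 202],
    [131, 138, 120],
    [175, 129, 65],
    [78, 32, 8],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under independence.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Assumption: df is a positive even integer, so the closed form
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i! applies.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut only handles positive even df")
    half = x / 2
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


p_value = chi2_sf_even_df(chi_square, df)

print(f"chi-square = {chi_square:.2f}, df = {df}, p-value = {p_value:.3g}")
```

In practice, a statistics package would normally be used instead of hand-written code; for example, SciPy's `scipy.stats.chi2_contingency` returns the statistic, P-value, degrees of freedom, and expected counts for a table of observed counts.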