Comparing Two Population Means (Independent Samples): Learn It 3

Lumen Learning

Complete a two-sample [latex]t[/latex]-test for independent population means from hypotheses to conclusions

Let’s visualize the difference in means.

The analysis above uses descriptive statistics only. How can we make an inference about the increase in hate crimes in the United States?

We can make an inference using a hypothesis test, which lets us assess whether the data provide evidence of an increase in the number of hate crimes between 2019 and 2020. A hypothesis test goes beyond the visualizations by checking whether the observed difference is larger than sampling variability alone would plausibly produce.

Standard Error of the Difference of Means

We can use a hypothesis test to determine if the observed difference in sample means is consistent with a hypothesized difference in population means.

To do this, we use what we know about the sampling distribution of [latex]\bar{x}_1-\bar{x}_2[/latex] and, in particular, its estimated standard deviation (the standard error). Recall that the difference in the sample means, [latex]\bar{x}_1-\bar{x}_2[/latex], has an approximately normal distribution centered at the difference of the population means, [latex]\mu_1-\mu_2[/latex].

The standard deviation of this sampling distribution is given by the following formula: [latex]\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}[/latex]

In practice, we will have to estimate the standard deviation because it depends on the unknown population standard deviations. Replacing [latex]\sigma_1[/latex] and [latex]\sigma_2[/latex] by the sample standard deviations [latex]s_1[/latex] and [latex]s_2[/latex], we will get the standard error of the difference.

standard error of the difference of means

standard error of [latex]\bar{x}_1-\bar{x}_2[/latex]: [latex]\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}[/latex]
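
To make the formula concrete, here is a minimal Python sketch that computes the standard error from summary statistics. The function name `standard_error` and the numbers passed to it are hypothetical and chosen only for illustration; they do not come from the hate crime data.

```python
import math

def standard_error(s1, n1, s2, n2):
    """Standard error of x̄1 - x̄2 for two independent samples.

    s1, s2 are the sample standard deviations; n1, n2 are the sample sizes.
    """
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

# Hypothetical summary statistics (illustration only, not from the data set)
se = standard_error(s1=3.0, n1=9, s2=4.0, n2=16)
print(se)
```

Notice that each sample contributes its own variance divided by its own sample size, so a larger sample in either group makes the standard error smaller.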

Now that we have the standard error, we can calculate the test statistic and the P-value and then make an inference about the population. We can leverage technology to generate our statistical output and use it to interpret our results.
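
As a rough sketch of what the technology computes, the test statistic is the observed difference in sample means, minus the hypothesized difference, divided by the standard error. The Python function below is hypothetical (the name `two_sample_t` and the example numbers are ours). It also returns degrees of freedom from the Welch-Satterthwaite approximation, which is an assumption on our part: this section does not say which degrees-of-freedom rule your software uses.

```python
import math

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2, hypothesized_diff=0.0):
    """Return (t, df) for a two-sample t-test with unpooled variances."""
    v1 = s1**2 / n1          # variance contribution of sample 1
    v2 = s2**2 / n2          # variance contribution of sample 2
    se = math.sqrt(v1 + v2)  # standard error of x̄1 - x̄2
    t = (xbar1 - xbar2 - hypothesized_diff) / se
    # Welch-Satterthwaite approximation (an assumption, see the note above)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Hypothetical summary statistics (illustration only)
t_stat, df = two_sample_t(xbar1=10.0, s1=3.0, n1=9, xbar2=8.0, s2=4.0, n2=16)
print(t_stat, df)
```

The P-value is then the tail area of a [latex]t[/latex]-distribution with these degrees of freedom beyond the test statistic, which is the part we usually leave to statistical software.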

County in California	Year 2019	Year 2020
Alameda	2.6	3.1
Amador	6.3	7.5
Butte	2.3	1.4
Colusa	4.7	4.6
Contra Costa	3	2.3
Del Norte	3.6	3.6
El Dorado	0.5	2.6
Fresno	1.6	1.7
Glenn	3.5	3.5
Humboldt	7.4	2.2
Imperial	0.6	1.1
Kern	0.8	1.2
Kings	0.7	0.7
Lake	4.7	4.7
Lassen	3.3	3.3
Los Angeles	4.1	5.3
Marin	5	8.6
Mendocino	2.3	1.2
Merced	0.4	1.1
Monterey	1.6	2.1
Napa	1.4	1.4
Nevada	1	1
Orange	2.3	2.6
Placer	1.5	2.5
Riverside	1.5	1.6
Sacramento	1	1.4
San Benito	1.6	1.6
San Bernardino	1.1	2
San Diego	3	3.7
San Francisco	7.4	6.2
San Joaquin	0.7	1.3
San Luis Obispo	4.6	4.6
San Mateo	2.2	2.5
Santa Barbara	0.2	0.9
Santa Clara	2.6	6.3
Santa Cruz	3.7	4.8
Shasta	0.6	3.9
Solano	1.6	3.6
Sonoma	2.2	5.3
Stanislaus	1.3	3.1
Sutter	1	4.2
Tehama	3.1	6.2
Tulare	0.2	0.4
Tuolumne	1.8	1.8
Ventura	0.8	1.8
Yolo	2.3	8.2
Yuba	1.3	2.5
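
The summary statistics behind the test can be computed directly from the table. The sketch below is one possible way to do this in Python, following this section's framing of the two columns as independent samples. The variable names are hypothetical, and two choices are assumptions on our part: we treat 2020 as sample 1 and 2019 as sample 2, and we use a hypothesized difference of 0.

```python
import math
import statistics

# Values copied from the table, in county order (Alameda through Yuba)
values_2019 = [
    2.6, 6.3, 2.3, 4.7, 3.0, 3.6, 0.5, 1.6, 3.5, 7.4, 0.6, 0.8,
    0.7, 4.7, 3.3, 4.1, 5.0, 2.3, 0.4, 1.6, 1.4, 1.0, 2.3, 1.5,
    1.5, 1.0, 1.6, 1.1, 3.0, 7.4, 0.7, 4.6, 2.2, 0.2, 2.6, 3.7,
    0.6, 1.6, 2.2, 1.3, 1.0, 3.1, 0.2, 1.8, 0.8, 2.3, 1.3,
]
values_2020 = [
    3.1, 7.5, 1.4, 4.6, 2.3, 3.6, 2.6, 1.7, 3.5, 2.2, 1.1, 1.2,
    0.7, 4.7, 3.3, 5.3, 8.6, 1.2, 1.1, 2.1, 1.4, 1.0, 2.6, 2.5,
    1.6, 1.4, 1.6, 2.0, 3.7, 6.2, 1.3, 4.6, 2.5, 0.9, 6.3, 4.8,
    3.9, 3.6, 5.3, 3.1, 4.2, 6.2, 0.4, 1.8, 1.8, 8.2, 2.5,
]

n_2019, n_2020 = len(values_2019), len(values_2020)
mean_2019 = statistics.mean(values_2019)
mean_2020 = statistics.mean(values_2020)

# statistics.variance is the sample variance (divides by n - 1)
se = math.sqrt(
    statistics.variance(values_2020) / n_2020
    + statistics.variance(values_2019) / n_2019
)

# Test statistic for a hypothesized difference of 0 (an assumption)
t_stat = (mean_2020 - mean_2019) / se

print("sample sizes:", n_2020, n_2019)
print("sample means:", mean_2020, mean_2019)
print("standard error:", se)
print("test statistic:", t_stat)
```

You can compare these values with the output from your statistical technology to check that both are working from the same data.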