Module 12: Cheat Sheet

Essential Concepts

One-Sample Hypothesis Test for Means

The null hypothesis ([latex]H_0[/latex]) is a statement about the population that either is believed to be true or is used to put forth an argument unless it can be shown to be incorrect beyond a reasonable doubt.
- [latex]H_0: \mu=\mu_0[/latex], where [latex]\mu_0[/latex] is the null value.
The alternative hypothesis ([latex]H_A[/latex]) is a claim about the population that contradicts [latex]H_0[/latex]; it is what we conclude when we reject [latex]H_0[/latex].
- [latex]H_A: \mu \lt \mu_0[/latex], where [latex]\mu_0[/latex] is the null value.
- [latex]H_A: \mu>\mu_0[/latex], where [latex]\mu_0[/latex] is the null value.
- [latex]H_A: \mu\ne \mu_0[/latex], where [latex]\mu_0[/latex] is the null value.

Conditions for a t-test

1. The sample is a random sample from the population of interest, or it is reasonable to regard the sample as if it were random. It is reasonable to regard the sample as random if it was selected in a way that should result in a sample that is representative of the population.
2. The distribution of the variable that was measured is approximately normal in the population, or the sample size is large. Usually, a sample of size [latex]30[/latex] or more is considered to be “large.” If the sample size is less than [latex]30[/latex], you should look at a plot of the data (a dotplot, a boxplot, or, if the sample size isn’t really small, a histogram) to make sure that the distribution looks approximately symmetric and that there are no outliers.

Test statistic ([latex]t[/latex]-statistic): [latex]t=\dfrac{\bar{x}-\mu_0}{\frac{s}{\sqrt{n}}}[/latex]
The distribution of the [latex]t[/latex]-statistic depends on the degrees of freedom, [latex]df = n - 1[/latex], where [latex]n[/latex] is the sample size.
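
As a rough illustration of the formula above, the following Python sketch computes the one-sample [latex]t[/latex]-statistic and its degrees of freedom using only the standard library. The function name `one_sample_t` and the data values are made up for this example.

```python
import math
import statistics


def one_sample_t(sample, mu_0):
    """Return (t, df) for a one-sample t-test of H0: mu = mu_0.

    Hypothetical helper written for this cheat sheet.
    """
    n = len(sample)
    x_bar = statistics.mean(sample)      # sample mean
    s = statistics.stdev(sample)         # sample standard deviation (divides by n - 1)
    standard_error = s / math.sqrt(n)    # s / sqrt(n)
    t = (x_bar - mu_0) / standard_error  # (estimate - null value) / standard error
    return t, n - 1                      # df = n - 1


# Made-up data: five measurements tested against the null value mu_0 = 4
t, df = one_sample_t([2, 4, 4, 6, 9], mu_0=4)
print(f"t = {t:.4f}, df = {df}")
```

If SciPy is installed, `scipy.stats.ttest_1samp` should report the same statistic along with a P-value.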

Two-Sample Hypothesis Test for Means

Independent samples are two samples where the individuals selected for the first sample do not influence the individuals selected for the second sample.
A hypothesis test for comparing two population means is often referred to as a two-sample t-test.
The null hypothesis ([latex]H_0[/latex]) is a statement about the population that is either believed to be true or is used to put forth an argument unless it can be shown to be incorrect beyond a reasonable doubt.
- Null hypothesis: [latex]H_0: \mu_1=\mu_2[/latex] or [latex]H_0: \mu_1-\mu_2=0[/latex]
The alternative hypothesis ([latex]H_A[/latex]) is a claim about the population that contradicts [latex]H_0[/latex]; it is what we conclude when we reject [latex]H_0[/latex].
- Alternative hypothesis:
  - [latex]H_A: \mu_1\lt \mu_2[/latex] or [latex]H_A: \mu_1-\mu_2\lt 0[/latex]
  - [latex]H_A: \mu_1>\mu_2[/latex] or [latex]H_A: \mu_1-\mu_2>0[/latex]
  - [latex]H_A: \mu_1\ne \mu_2[/latex] or [latex]H_A: \mu_1-\mu_2\ne0[/latex]
Conditions for a t-test

1. Each sample should be randomly selected from, or reasonably representative of, its population.
2. For each population, the distribution of the variable that was measured is approximately normal, or the sample size for the sample from that population is large. Usually, a sample of size [latex]30[/latex] or more is considered to be “large.” If a sample size is less than [latex]30[/latex], you should look at a plot of the data from that sample (a dotplot, a boxplot, or, if the sample size isn’t really small, a histogram) to make sure that the distribution looks approximately symmetric and that there are no outliers.

Standard error of [latex]\bar{x}_1-\bar{x}_2[/latex]: [latex]\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}[/latex]
The test statistic to compare two population means is calculated using the following formula, where [latex]\mu_1-\mu_2[/latex] in the numerator is the difference stated in the null hypothesis (often [latex]0[/latex]): [latex]t = \dfrac{\text{estimate of parameter} - \text{null hypothesis value}}{\text{standard error}} = \dfrac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}[/latex]
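
The two-sample formulas above can be sketched in Python as follows. This is a minimal illustration, not a full test: it returns only the standard error and the [latex]t[/latex]-statistic, and the function name `two_sample_t`, the parameter `null_difference`, and the data are assumptions made for the example.

```python
import math
import statistics


def two_sample_t(sample_1, sample_2, null_difference=0.0):
    """Return (t, standard_error) for comparing two independent sample means.

    Hypothetical helper; null_difference is the value of mu_1 - mu_2 under H0.
    """
    n_1, n_2 = len(sample_1), len(sample_2)
    # statistics.variance is the sample variance s^2 (divides by n - 1)
    standard_error = math.sqrt(
        statistics.variance(sample_1) / n_1 + statistics.variance(sample_2) / n_2
    )
    estimate = statistics.mean(sample_1) - statistics.mean(sample_2)
    t = (estimate - null_difference) / standard_error
    return t, standard_error


# Made-up data for two independent groups
group_1 = [10, 12, 14, 16]
group_2 = [8, 9, 10, 11, 12]
t, se = two_sample_t(group_1, group_2)
print(f"standard error = {se:.4f}, t = {t:.4f}")
```

The sketch stops at the test statistic because this sheet does not give a degrees-of-freedom formula for the two-sample case. If SciPy is installed, `scipy.stats.ttest_ind` with `equal_var=False` uses the same unpooled standard error.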

Paired samples (also called dependent samples) are samples that are chosen in a way that results in the observations in one sample being paired with the observations in the other sample.
The mean of the differences is equal to the difference in means: [latex]\mu_d = \mu_{\text{after}}-\mu_{\text{before}}[/latex]
In summary, where [latex]k[/latex] is the value stated in the null hypothesis, we have:

Alternative Hypothesis for Independent Samples	Alternative Hypothesis for Dependent Samples
[latex]H_A: \mu_1-\mu_2>k[/latex]	[latex]H_A: \mu_d>k[/latex]
[latex]H_A: \mu_1-\mu_2 \lt k[/latex]	[latex]H_A: \mu_d \lt k[/latex]
[latex]H_A: \mu_1-\mu_2 \ne k[/latex]	[latex]H_A: \mu_d \ne k[/latex]
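
The identity [latex]\mu_d = \mu_{\text{after}}-\mu_{\text{before}}[/latex] is stated for population means, but the same arithmetic holds for sample means, which makes it easy to check numerically. The short Python sketch below does so with made-up before and after measurements on four individuals.

```python
import statistics

# Made-up paired measurements: the same four individuals, before and after
before = [70, 68, 75, 80]
after = [72, 71, 74, 85]

# Difference variable d = after - before, computed pair by pair
differences = [a - b for a, b in zip(after, before)]

mean_of_differences = statistics.mean(differences)
difference_of_means = statistics.mean(after) - statistics.mean(before)

print(mean_of_differences, difference_of_means)
```

Both quantities come out the same, which is why a hypothesis about [latex]\mu_1-\mu_2[/latex] can be restated as a hypothesis about [latex]\mu_d[/latex] when the samples are paired.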

The notation for the summary statistics used to compare paired populations or samples is shown in the following table. We will use [latex]d[/latex] to represent the difference variable.

Summary Statistics	Notation
Population mean of difference	[latex]\mu_d[/latex]
Sample mean of difference	[latex]\bar{d}[/latex]
Population standard deviation of difference	[latex]\sigma_d[/latex]
Sample standard deviation of difference	[latex]s_d[/latex]

The test statistic for the dependent (paired) t-test is calculated using the following formulas: [latex]\text{standard error of the difference}=\dfrac{s_d}{\sqrt{n}}[/latex]
[latex]\text{test statistic }(t)=\dfrac{\text{estimator - null value}}{\text{standard error of estimator}}=\dfrac{\bar{d}-\text{null value}}{\text{standard error of difference}}[/latex]
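
A minimal Python sketch of the paired calculation follows, assuming the differences are taken as after minus before. The function name `paired_t`, the parameter `null_value`, and the data are hypothetical and chosen only for illustration.

```python
import math
import statistics


def paired_t(before, after, null_value=0.0):
    """Return (t, standard_error) for a paired t-test on d = after - before.

    Hypothetical helper written for this cheat sheet.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same number of observations")
    d = [a - b for a, b in zip(after, before)]  # difference variable
    n = len(d)
    d_bar = statistics.mean(d)                  # sample mean of the differences
    s_d = statistics.stdev(d)                   # sample standard deviation of the differences
    standard_error = s_d / math.sqrt(n)         # s_d / sqrt(n)
    t = (d_bar - null_value) / standard_error
    return t, standard_error


# Made-up paired measurements on four individuals
t, se = paired_t(before=[70, 68, 75, 80], after=[72, 71, 74, 85])
print(f"standard error = {se:.2f}, t = {t:.2f}")
```

Once the differences are formed, the calculation is the one-sample [latex]t[/latex]-statistic applied to [latex]d[/latex]. If SciPy is installed, `scipy.stats.ttest_rel` performs the same paired comparison.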

[latex]t[/latex]-statistic

[latex]t=\dfrac{\bar{x}-\mu_0}{\frac{s}{\sqrt{n}}}[/latex]

standard error of difference of means

[latex]\bar{x}_1-\bar{x}_2[/latex]: [latex]\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}[/latex]

standard deviation of difference of means

[latex]\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}[/latex]

independent sample

random samples from each population

paired samples, dependent samples

samples that are chosen in a way that results in the observations in one sample being paired with the observations in the other sample