- Complete a randomization test involving a difference in proportions
Using Confidence Intervals for Hypothesis Testing
The primary purpose of a confidence interval is to estimate some unknown parameter; for this section, it is the difference in population proportions.
Recall that we can also use confidence intervals to support decisions in hypothesis testing, especially when the test is two-tailed.
- Bootstrap resampling is typically used to construct confidence intervals.
- Randomization resampling is typically used to test a hypothesis.
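Stated as a rule, and writing [latex]p_1[/latex] and [latex]p_2[/latex] for the two population proportions (notation assumed here for illustration), the link between a two-tailed test and a confidence interval is:

[latex]H_0: p_1 - p_2 = 0 \quad \text{versus} \quad H_a: p_1 - p_2 \neq 0[/latex]

[latex]\text{Reject } H_0 \text{ at significance level } \alpha \text{ if } 0 \text{ lies outside the } (1-\alpha)\times 100\% \text{ confidence interval for } p_1 - p_2.[/latex]

For example, a 95% confidence interval corresponds to a two-tailed test at [latex]\alpha = 0.05[/latex].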
Let’s revisit our Peanut Allergy example!
Let’s generate a large number of resamples from the observed data using the statistical tool below. Because we are using a confidence interval to support our decision in the hypothesis test, we will use bootstrap resampling, which resamples the observed data with replacement rather than simulating under the null hypothesis.
Step 1: Enter the data.
- Select “Contingency Table” under “Enter Data.”
- Type “Peanut” for the row variable, with “Avoiders” and “Eaters” for the category labels.
- Type “Conditions” for the column variable, with “Allergic” and “Not allergic” as the category labels.
- Enter the table below:
| | Allergic | Not allergic |
| Peanut avoiders | 35 | 220 |
| Peanut eaters | 5 | 240 |
Step 2: Now select “Bootstrap Distribution” at the top. You should see the contingency table you entered as the “Observed Contingency Table.” Generate [latex]1000[/latex] bootstrap samples of the data.
Step 3: Click “Find bootstrap percentile confidence interval” and use the sliders to set the confidence level. The confidence interval will be displayed next to the sliders.
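For readers who want to see what the tool is doing, the steps above can be approximated in a few lines of Python. This is only a sketch under stated assumptions: it resamples each group separately with replacement, which may differ from the tool's internal procedure, and the function name `bootstrap_percentile_ci`, the seed, and the 95% level are illustrative choices rather than anything specified in this section.

```python
import numpy as np

# Counts from the contingency table above (1 = allergic, 0 = not allergic).
avoiders = np.repeat([1, 0], [35, 220])
eaters = np.repeat([1, 0], [5, 240])

def bootstrap_percentile_ci(group1, group2, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap CI for the difference in proportions (group1 - group2).

    Assumption: each group is resampled independently, with replacement.
    """
    rng = np.random.default_rng(seed)
    # Each row is one bootstrap sample of the same size as the original group.
    boot1 = rng.choice(group1, size=(n_boot, group1.size), replace=True)
    boot2 = rng.choice(group2, size=(n_boot, group2.size), replace=True)
    # One difference in sample proportions per bootstrap sample.
    diffs = boot1.mean(axis=1) - boot2.mean(axis=1)
    # Cut off equal tails on both sides of the bootstrap distribution.
    tail = (1 - level) / 2
    lower, upper = np.quantile(diffs, [tail, 1 - tail])
    return lower, upper

lower, upper = bootstrap_percentile_ci(avoiders, eaters)
print(f"95% bootstrap percentile CI: ({lower:.3f}, {upper:.3f})")
print("Interval excludes 0:", not (lower <= 0 <= upper))
```

If the interval excludes 0, a difference of zero is not among the plausible values, which corresponds to rejecting the null hypothesis in a two-tailed test at the matching significance level.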
Notice that we reached the same conclusion using the randomization and bootstrap resampling methods.
The confidence interval (through bootstrapping) gives us a range of plausible values for the parameter, here the difference in population proportions, based on resampling the data. Therefore, it provides information about statistical significance, as well as the direction and strength of the effect. This allows us to make a decision regarding a hypothesis test.
Hypothesis testing using a P-value (through a randomization test), on the other hand, leads to a yes-or-no decision. In that sense, P-values are more direct than confidence intervals, but the conclusion they support is also simpler: either there is strong evidence against the null hypothesis or there is not.
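For comparison, a randomization test on the same Peanut Allergy counts can be sketched as follows. The assumptions here are illustrative: the outcomes are pooled and reshuffled between the two groups to mimic the null hypothesis of no difference, the P-value is two-sided, and the function name `randomization_test`, the seed, and the number of shuffles are hypothetical choices, not details taken from this section.

```python
import numpy as np

def randomization_test(x1, n1, x2, n2, n_shuffles=1000, seed=0):
    """Two-sided randomization test for a difference in proportions.

    x1, x2: number of "successes" (allergic) in each group.
    n1, n2: group sizes.
    """
    rng = np.random.default_rng(seed)
    # Under the null hypothesis the group labels do not matter, so pool all outcomes.
    pooled = np.repeat([1, 0], [x1 + x2, (n1 - x1) + (n2 - x2)])
    observed = x1 / n1 - x2 / n2
    diffs = np.empty(n_shuffles)
    for i in range(n_shuffles):
        shuffled = rng.permutation(pooled)
        # The first n1 shuffled outcomes play the role of group 1, the rest group 2.
        diffs[i] = shuffled[:n1].mean() - shuffled[n1:].mean()
    # Proportion of shuffled differences at least as extreme as the observed one.
    p_value = np.mean(np.abs(diffs) >= abs(observed))
    return observed, p_value

observed, p_value = randomization_test(35, 255, 5, 245)
print(f"Observed difference: {observed:.3f}")
print(f"Two-sided randomization P-value: {p_value:.3f}")
```

A small P-value here and a bootstrap interval that excludes 0 point to the same decision, which illustrates why the two approaches are complementary.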
It should be clear that P-values and confidence intervals are not contradictory statistical concepts. This means that bootstrapping and randomization tests are not contradictory statistical resampling concepts either. They are complementary to one another.