The Statistical Investigative Process
We typically begin our statistical thinking with an investigative question in mind.
For example, we might want to know the answer to questions such as:
- What percentage of U.S. adults support the ban of TikTok on U.S. government devices? (Population: U.S. adults)
- Do cell phone signals affect honey bee behavior? (Population: bees)
- Do cars get better gas mileage with a new gasoline additive? (Population: cars)
population
A population is an entire group, usually a large group, that is of interest in the statistical study.
A population can be people or other things, such as animals or objects.
When we refer to a population, we are including every member of that group.
Often, the population is so large that we cannot collect information from every individual, so we select a sample from the population. We then collect (or produce) data from this sample.
Of course, we need a sample that represents the population well. This involves careful planning, but it also involves chance. For example, if our goal is to determine the percentage of U.S. adults who support the ban of TikTok on U.S. government devices, we do not want our sample to contain only members of one political party, one race, or one age range. We want to give every member of the population the same chance of being in the sample, so we let chance select the sample.
sample
A sample is a subset or subgroup of a population.
A sample is a fraction of a population, preferably randomly selected, that is more manageable for data collection.
We want the sample to reflect the overall diversity within a population.
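To make "letting chance select the sample" concrete, here is a minimal sketch in Python. The population of 10,000 ID numbers, the sample size of 500, and the seed are all assumptions made up for illustration; a real study would draw from an actual list of population members.

```python
import random

# Hypothetical population: ID numbers standing in for 10,000 adults.
population = list(range(1, 10_001))

# A fixed seed (an arbitrary choice) makes the draw repeatable.
rng = random.Random(2024)

# Simple random sample without replacement: every member of the
# population has the same chance of being selected.
sample = rng.sample(population, k=500)

print(len(sample))       # number of individuals selected
print(len(set(sample)))  # same number, since no one is selected twice
```

Because the selection is left to chance rather than to the investigator, no group in the population is systematically favored or excluded.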
Next in the statistical investigation, we carefully define what kind of information we plan to gather. Then we collect the data through surveys, interviews, behavior monitoring, studies, combining existing data sets, and similar methods.
Data are often presented in a long list of information. To make sense of it all, we summarize the data using graphs and different numerical measures, such as percentages or averages. We call this step exploratory data analysis.
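As a small illustration of summarizing data with a percentage and an average, consider the sketch below. The ten survey responses and ages are invented for this example and do not come from any real survey.

```python
from statistics import mean, median

# Hypothetical survey data (invented for illustration only).
responses = ["support", "oppose", "support", "unsure", "support",
             "oppose", "support", "support", "oppose", "support"]
ages = [23, 35, 41, 29, 52, 38, 61, 27, 45, 33]

n = len(responses)

# Percentage of respondents who answered "support".
percent_support = 100 * responses.count("support") / n

print(f"Support: {percent_support:.0f}%")   # Support: 60%
print(f"Mean age: {mean(ages)}")            # Mean age: 38.4
print(f"Median age: {median(ages)}")        # Median age: 36.5
```

Three numbers now stand in for a list of twenty raw values, which is the point of this step: a summary is easier to interpret than the full list.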
Remember that our goal is to answer a question about a population based on a sample. Samples will vary due to chance, but we can still answer our question despite that variability. We can explain how sample results vary in relation to the population as a whole based on an understanding of probability.
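The idea that samples vary due to chance can be seen in a short simulation. This is only a sketch under stated assumptions: the population of 100,000 adults, the 60% level of support, the sample size of 1,000, and the seed are all hypothetical values chosen for illustration.

```python
import random

rng = random.Random(7)  # arbitrary seed so the simulation is repeatable

# Hypothetical population of 100,000 adults in which exactly 60%
# support the ban (1 = support, 0 = does not support).
population = [1] * 60_000 + [0] * 40_000

def sample_percentage(size):
    """Percentage of supporters in one simple random sample."""
    draw = rng.sample(population, k=size)
    return 100 * sum(draw) / size

# Draw 200 separate samples of 1,000 adults each.
results = [sample_percentage(1_000) for _ in range(200)]

print(f"Lowest sample percentage:  {min(results):.1f}")
print(f"Highest sample percentage: {max(results):.1f}")
print(f"Average across samples:    {sum(results) / len(results):.1f}")
```

The individual sample percentages differ from one another, yet they cluster around the population value of 60%. Probability describes how tight that clustering is, which is what allows a single sample to say something about the whole population.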
The final step in the process is to interpret the results of the sample and use that to make an inference about the population. This inference is the conclusion we reach from our sample data that answers our original question about the population.

Often, however, our conclusions lead us to new questions or to the next statistical investigation. For this reason, many view the statistical investigative process as a cycle rather than a linear path.
Statistics is a fundamental discipline that informs scientific discoveries, business choices, and personal decisions, which is a key reason to understand the statistical investigative process.