Essential Concepts
- Population refers to the group of people or objects that researchers want to learn about.
- A parameter is a number that summarizes a characteristic of the entire population. It can be an average, percentage, or any other value that is calculated using data from the whole population.
- A census is a type of survey where data is collected from every single member of the population. It’s like counting or surveying every person or object in a group.
- When we want to study a big group of people but can’t survey everyone, we choose a smaller group called a sample. The sample should represent the whole group.
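- A statistic is a number that summarizes a characteristic of a sample. It is calculated using data from the sample only and is often used to estimate the corresponding population parameter.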
- Observational units are the group of individuals, animals, or objects that are being studied or surveyed in a research study. They are the ones we want to learn about and collect information from.
- Variables are the different characteristics or qualities of the observational units that we measure or record in a study.
- Data can be divided into two types: qualitative and quantitative.
- Qualitative data describes qualities or attributes of a group, like hair color or blood type. It uses words or categories to describe these attributes, such as black hair or blood type AB+.
- Quantitative data involves counting or measuring things. It uses numbers to represent information, like the amount of money you have or the weight of an object. It can be further divided into discrete data, which results from counting (such as the number of phone calls you receive in a day), and continuous data, which results from measuring and can include fractions or decimals (such as the length of those calls).
- When collecting a sample, it’s important to choose it in a way that represents the whole group. Sampling bias happens when some members of the population are more likely to be chosen than others, which can lead to wrong conclusions about the entire group. Random samples are preferred because the selection process does not systematically favor any members of the population, but even random samples can vary and may not perfectly represent the population.
- Simple random sampling means that every individual or entity in the population has an equal chance of being chosen. It also ensures that every possible group of a given size has an equal probability of being selected as the sample.
- In systematic sampling, each person or object in the population is assigned a number. Then, we choose individuals at regular intervals, like every [latex]5[/latex]th or [latex]10[/latex]th person, starting from a randomly selected point. This way, every [latex]n[/latex]th member of the population is included in the sample.
- Stratified sampling is when a population is split into different groups based on certain criteria, like location or age. Then, a sample is chosen from each group using methods like random selection, with the size of each sample proportional to the size of the group in the population. This helps ensure that each subgroup is represented properly in the sample.
- Quota sampling is a modified version of stratified sampling where samples are collected from each subgroup until a specific target or quota is reached.
- Cluster sampling is a method where instead of selecting individual people or objects, the population is divided into smaller groups called clusters. Then, a few of these clusters are randomly chosen to be part of the sample for the study.
- Convenience samples and voluntary response samples are considered among the least reliable sampling methods.
- Convenience sampling is when samples are chosen based on who is readily available or convenient to include.
- Voluntary response sampling is a method where individuals choose to participate in the sample on their own accord.
- There are several ways a study can be biased. One way is through sampling bias, where the sample is not representative of the whole group. Another type is voluntary response bias, which happens when data is collected only from volunteers, leading to an unbalanced representation. Other biases can come from researchers having an interest in the outcome (self-interest study), participants giving inaccurate responses (response bias), fear of not being anonymous (perceived lack of anonymity), question wording influencing answers (loaded questions), people refusing to participate (non-response bias), or leaving out certain groups from the study (undercoverage).
- Observational studies involve observing and measuring without applying a treatment, while experiments involve applying a treatment and measuring its effects.
- Confounding happens when there are two possible factors that could have caused a result, but we can’t tell which one is actually responsible.
- The placebo effect occurs when a person’s belief in a treatment affects its effectiveness, even if the treatment itself doesn’t have any real impact. To account for this, a placebo, which is a fake treatment, is often used as a comparison in studies.
- In blind studies, participants are unaware of whether they are receiving the actual treatment or a placebo. In double-blind studies, even the people interacting with the participants don’t know who is in which group (treatment or control).
- Experimental design consists of two key components: the factor of interest, which is the variable we think has an impact, and the response variable, which is the variable we believe is influenced by the factor of interest.
- Randomized block design is a method used in experiments where similar subjects are grouped into blocks, with the blocks differing from one another in ways that might affect the outcome. Such nuisance factors can be controlled by building them into the experimental design. Blocking refers to grouping similar subjects together and then randomly assigning the subjects within each group to the different treatments.
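The sampling methods summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the original material: the population is a made-up list of 100 ID numbers, the two age strata and the ten "rooms" used as clusters are invented, and all function names are hypothetical.

```python
import random

def simple_random_sample(population, n, rng):
    """Every possible group of n members is equally likely to be chosen."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Pick a random start among the first k members, then take every k-th."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, total_n, rng):
    """Randomly sample from each stratum in proportion to its size.

    Rounding can make the total differ slightly from total_n when the
    proportions do not divide evenly (not an issue in this example).
    """
    population_size = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        n = round(total_n * len(members) / population_size)
        sample.extend(rng.sample(members, n))
    return sample

def cluster_sample(clusters, num_clusters, rng):
    """Randomly choose whole clusters; every member of a chosen cluster is sampled."""
    chosen = rng.sample(list(clusters), num_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical population: ID numbers 1 through 100.
rng = random.Random(0)  # fixed seed so the example is repeatable
population = list(range(1, 101))
strata = {"under_30": population[:60], "30_and_over": population[60:]}
clusters = {f"room_{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(strata, 10, rng)
clustered = cluster_sample(clusters, 2, rng)

for name, sample in [("simple random", srs), ("systematic", systematic),
                     ("stratified", stratified), ("cluster", clustered)]:
    print(f"{name}: {sorted(sample)}")
```

In this sketch the stratified sample always contains 6 members from the larger stratum and 4 from the smaller one, matching the 60/40 split of the invented population, while the cluster sample contains all 20 members of the two rooms that happen to be chosen.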
Glossary
blind study
one in which the participant does not know whether they are receiving the treatment or a placebo
block
a group of subjects that are similar
blocking
the grouping together of homogeneous (similar) experimental units followed by the random assignment of the experimental units within each group to a treatment
census
a survey of an entire population
cluster sampling
where the population is divided into subgroups (clusters) and a set of subgroups is randomly selected to be in the sample
confounding
when there are two potential variables that could have caused the outcome and it is not possible to determine which actually caused the result
control group
group that does not receive the treatment of interest; it may receive a placebo instead
convenience sampling
the practice of choosing a sample by selecting whoever is convenient
double-blind study
one in which neither the participants nor those interacting with them know who is in the treatment group and who is in the control group
experiment
a study in which the effects of a treatment are measured
experimental group
group that receives the treatment of interest
experimental unit
single object or individual to be measured in the experiment
factor of interest
the explanatory variable (independent variable), which is what we suspect has an effect on the response variable
loaded questions
when the question wording influences the responses
non-response bias
when people’s refusal to participate in the study influences the validity of the outcome
observational study
a study based on observations or measurements
observational units
the group of individuals, animals, or objects who are being measured or surveyed in a study
parameter
a value (average, percentage, etc.) calculated using all the data from a population
perceived lack of anonymity
when the responder fears giving an honest answer might negatively affect them
placebo
a dummy treatment given to control for the placebo effect
placebo effect
when the effectiveness of a treatment is influenced by the patient’s perception of how effective they think the treatment will be, so a result might be seen even if the treatment is ineffectual
population
the group the collected data is intended to describe
qualitative data
the result of categorizing or describing attributes of a population
quantitative data
the result of counting or measuring attributes of a population
quota sampling
where samples are collected in each subgroup until the desired quota is met
random sample
where each member of the population has an equal probability of being chosen
response bias
when the responder gives inaccurate responses for any reason
response factor
the dependent variable (response variable), which we suspect is affected by the factor of interest
sample
a smaller subset of the entire population, ideally one that is fairly representative of the whole population
sampling bias
when a sample is collected from a population and some members of the population are not as likely to be chosen as others
self-interest study
a study in which bias can occur because the researchers have an interest in the outcome
simple random sample
where every member of the population, and every group of members of a given size, has an equal probability of being chosen
statistic
a value (average, percentage, etc.) calculated using the data from a sample
stratified sampling
where random samples are taken from each subgroup (or stratum) with sample sizes proportional to the size of the subgroup in the population
systematic sampling
every [latex]n[/latex]th member of the population is selected to be in the sample
undercoverage
occurs when some groups of the population are left out of the sampling process
variables
the characteristics of the observational units in a study
voluntary response bias
the sampling bias that often occurs when the sample consists of volunteers
voluntary response sampling
allowing members of the population to volunteer to be in the sample
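The glossary entries for block and blocking can be illustrated with a short Python sketch. It is a minimal example under stated assumptions rather than a prescribed procedure: the eight subjects, the age group used as the blocking (nuisance) factor, and the function name `randomized_block_assignment` are all hypothetical.

```python
import random
from collections import defaultdict

def randomized_block_assignment(subjects, block_of, treatments, rng):
    """Group similar subjects into blocks, then randomly assign
    treatments to the subjects within each block."""
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of(subject)].append(subject)

    assignment = {}
    for members in blocks.values():
        shuffled = members[:]      # copy so the input list is not modified
        rng.shuffle(shuffled)      # random order within the block
        for i, subject in enumerate(shuffled):
            # Cycle through the treatments so each block is split evenly.
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

# Hypothetical subjects, each recorded as (name, age group).
subjects = [
    ("Ana", "younger"), ("Ben", "younger"), ("Cy", "younger"), ("Di", "younger"),
    ("Eve", "older"), ("Flo", "older"), ("Gus", "older"), ("Hal", "older"),
]

rng = random.Random(1)  # fixed seed so the example is repeatable
assignment = randomized_block_assignment(
    subjects,
    block_of=lambda subject: subject[1],  # block on age group
    treatments=["treatment", "placebo"],
    rng=rng,
)

for (name, group), arm in sorted(assignment.items(), key=lambda kv: kv[0][1]):
    print(f"{group:8s} {name:4s} -> {arm}")
```

Because the random assignment happens separately inside each block, both age groups end up with two subjects in the treatment arm and two in the placebo arm, so age cannot be confused with the effect of the treatment.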