{"id":1821,"date":"2023-04-14T14:43:45","date_gmt":"2023-04-14T14:43:45","guid":{"rendered":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/?post_type=chapter&#038;p=1821"},"modified":"2025-08-27T00:24:51","modified_gmt":"2025-08-27T00:24:51","slug":"sampling-and-experimentation-learn-it-1","status":"web-only","type":"chapter","link":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/chapter\/sampling-and-experimentation-learn-it-1\/","title":{"raw":"Sampling and Experimentation: Learn It 1","rendered":"Sampling and Experimentation: Learn It 1"},"content":{"raw":"<section class=\"textbox learningGoals\">\r\n<ul>\r\n\t<li>Identify methods for obtaining a random sample of the intended population of a study<\/li>\r\n\t<li>Identify types of sample bias<\/li>\r\n\t<li>Identify the differences between observational studies and experiments, and the treatment in an experiment<\/li>\r\n\t<li>Determine whether an experiment may have been influenced by confounding<\/li>\r\n<\/ul>\r\n<\/section>\r\n<h2>Statistical Inference<\/h2>\r\n<p>The process of taking a statistic from a sample and determining a parameter for a population is called statistical inference.<\/p>\r\n<center>\r\n[caption id=\"attachment_2841\" align=\"aligncenter\" width=\"920\"]<img class=\"wp-image-2841\" src=\"https:\/\/content-cdn.one.lumenlearning.com\/wp-content\/uploads\/sites\/10\/2022\/10\/28214703\/2.1.L.Diagram1-1024x457.png\" alt=\"A visual representation of the process of statistical inference. Step 1, take sample. Step 2, sample shows a relationship. Step 3, does that mean there is a real relationship in the population? Or was the relationship in the sample just due to chance? The visual component shows a large purple circle labeled population with a smaller yellow circle in it. There is an arrow labeled random sampling from the large purple circle to a smaller yellow circle labeled Sample. 
There is an arrow from the sample circle to the word Statistic, which is described as A summary measure of a sample (calculated from an observed sample). There is an arrow from the word statistic labeled Inference and another arrow from the large purple circle to the word Parameter. The arrow labeled inference also says sampling must be unbiased. Underneath the word parameter, it says a summary measure associated with the population (usually unknown)\" width=\"920\" height=\"410\" \/> Figure 1. The process of statistical inference[\/caption]\r\n<\/center>\r\n<section class=\"textbox recall\">A <strong>population <\/strong>is the group of individuals or entities (such as animals or objects) that our research question pertains to (e.g., all Americans). A <strong>sample <\/strong>is a group of individuals or entities on which we collect data. One primary use of statistics is to make inferences about a population based on data collected from a sample from that population. A <strong>parameter <\/strong>is a numerical measure that summarizes a population. A <strong>statistic <\/strong>is a numerical summary measure of a sample.<\/section>\r\n<section class=\"textbox seeExample\">Imagine a small college with only [latex]200[\/latex] students, and suppose that [latex]60%[\/latex] of these students are eligible for financial aid. In this simplified situation, we can identify the population, the variable, and the parameter.\r\n\r\n<ul>\r\n\t<li><strong>Population:<\/strong> [latex]200[\/latex] students at the college.<\/li>\r\n\t<li><strong>Variable:<\/strong> <em>Eligibility for financial aid<\/em> is a categorical variable, so we use a proportion as a summary.<\/li>\r\n\t<li><strong>Parameter = Population Proportion:<\/strong> [latex]60%[\/latex] of students, or [latex]0.6[\/latex] of the population, are eligible for financial aid.<\/li>\r\n<\/ul>\r\n\r\nNote: Populations are usually much larger than [latex]200[\/latex] people. 
Also, in real situations, we do not know the population proportion. We are using a simplified situation to investigate how random samples relate to the population. This is the first step in creating a probability model that will be useful in inference. <em>How accurate are random samples at predicting this population proportion of [latex]0.60[\/latex]?<\/em> To answer this question, we randomly select [latex]8[\/latex] students and determine the proportion who are eligible for financial aid. We repeat this process several times. Here are the results for [latex]3[\/latex] random samples: <img class=\"wp-image-3593 size-full aligncenter\" src=\"https:\/\/content-cdn.one.lumenlearning.com\/wp-content\/uploads\/sites\/10\/2022\/10\/05174016\/2.1.L.Diagram2-1.png\" alt=\"Financial aid eligibility: 3 random samples of students consisting of 8 students each (out of a total population of 200 students). The proportion eligible for financial aid in the population is .60. In the random samples, each student is assigned a number and then categorized as eligible for financial aid or not. The sample proportions are as follows: Sample 1 has 6 students eligible for aid and six divided by 8 is 0.75. Sample 2 has 5 students eligible for aid and five divided by 8 is 0.625. Sample 3 has 3 students eligible for aid and three divided by 8 is 0.375. When you average the sample proportions and round to the tenths place you get a proportion of .60. \" width=\"781\" height=\"566\" \/> [reveal-answer q=\"257212\"]More about these random samples.[\/reveal-answer] [hidden-answer a=\"257212\"] Notice the following about these random samples:\r\n\r\n<ul>\r\n\t<li>Each random sample came from a population in which the proportion eligible for financial aid is [latex]0.60[\/latex], but sample proportions vary. 
Each random sample has a different proportion who are eligible for financial aid.<\/li>\r\n\t<li>Some sample proportions are larger than the population proportion of [latex]0.60[\/latex]; some sample proportions are smaller than the population proportion.<\/li>\r\n\t<li>Some samples give good estimates of the population proportion. Some do not. In this case, [latex]0.625[\/latex] is a much better estimate than [latex]0.375[\/latex].<\/li>\r\n\t<li>A lot of variability occurs in these sample proportions. It is not surprising, therefore, that a sample of [latex]8[\/latex] students may give an inaccurate estimate of the proportion of those eligible for financial aid in the population. It makes sense that small samples of only [latex]8[\/latex] students may not represent the population accurately. Later we investigate the effect of increasing the size of the sample.<\/li>\r\n\t<li>The variability we see in proportions from random samples is due to chance.[\/hidden-answer]<\/li>\r\n<\/ul>\r\n<\/section>\r\n<h2>Sampling Bias<\/h2>\r\n<p>Remember that the ideal sample should be representative of the entire population.<\/p>\r\n<p>In statistics, a\u00a0<strong>sampling bias<\/strong>\u00a0is created when a sample is collected from a population and some members of the population are not as likely to be chosen as others (remember, each member of the population should have an equal chance of being chosen). 
When a sampling bias happens, there can be incorrect conclusions drawn about the population that is being studied.<\/p>\r\n<section class=\"textbox keyTakeaway\">\r\n<div>\r\n<h3>sampling bias<\/h3>\r\n<strong>Sampling bias<\/strong> occurs when some members of the intended population are less likely to be included in the sample than others, resulting in a sample that is not representative of the population as a whole.<\/div>\r\n<\/section>\r\n<p>When we say a random sample represents the population well, we mean that there is <em>no inherent bias<\/em> in this sampling technique. It is important to acknowledge, though, that this does not mean all random samples are necessarily \u201cperfect.\u201d<\/p>\r\n<p>Random samples are still random, and therefore no random sample will be exactly the same as another. One random sample may give a fairly accurate representation of the population, while another random sample might be \u201coff\u201d purely because of chance. Unfortunately, when looking at a particular sample (which is what happens in practice), we never know how much it differs from the population.<\/p>\r\n<section class=\"textbox tryIt\">[ohm2_question hide_question_numbers=1]668[\/ohm2_question]<\/section>\r\n<section class=\"textbox tryIt\">[ohm2_question hide_question_numbers=1]669[\/ohm2_question]<\/section>","rendered":"<section class=\"textbox learningGoals\">\n<ul>\n<li>Identify methods for obtaining a random sample of the intended population of a study<\/li>\n<li>Identify types of sample bias<\/li>\n<li>Identify the differences between observational studies and experiments, and the treatment in an experiment<\/li>\n<li>Determine whether an experiment may have been influenced by confounding<\/li>\n<\/ul>\n<\/section>\n<h2>Statistical Inference<\/h2>\n<p>The process of taking a statistic from a sample and determining a parameter for a population is called statistical inference.<\/p>\n<div style=\"text-align: center;\">\n<figure id=\"attachment_2841\" 
aria-describedby=\"caption-attachment-2841\" style=\"width: 920px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-2841\" src=\"https:\/\/content-cdn.one.lumenlearning.com\/wp-content\/uploads\/sites\/10\/2022\/10\/28214703\/2.1.L.Diagram1-1024x457.png\" alt=\"A visual representation of the process of statistical inference. Step 1, take sample. Step 2, sample shows a relationship. Step 3, does that mean there is a real relationship in the population? Or was the relationship in the sample just due to chance? The visual component shows a large purple circle labeled population with a smaller yellow circle in it. There is an arrow labeled random sampling from the large purple circle to a smaller yellow circle labeled Sample. There is an arrow from the sample circle to the word Statistic, which is described as A summary measure of a sample (calculated from an observed sample). There is an arrow from the word statistic labeled Inference and another arrow from the large purple circle to the word Parameter. The arrow labeled inference also says sampling must be unbiased. Underneath the word parameter, it says a summary measure associated with the population (usually unknown)\" width=\"920\" height=\"410\" \/><figcaption id=\"caption-attachment-2841\" class=\"wp-caption-text\">Figure 1. The process of statistical inference<\/figcaption><\/figure>\n<\/div>\n<section class=\"textbox recall\">A <strong>population <\/strong>is the group of individuals or entities (such as animals or objects) that our research question pertains to (e.g., all Americans). A <strong>sample <\/strong>is a group of individuals or entities on which we collect data. One primary use of statistics is to make inferences about a population based on data collected from a sample from that population. A <strong>parameter <\/strong>is a numerical measure that summarizes a population. 
A <strong>statistic <\/strong>is a numerical summary measure of a sample.<\/section>\n<section class=\"textbox seeExample\">Imagine a small college with only [latex]200[\/latex] students, and suppose that [latex]60%[\/latex] of these students are eligible for financial aid. In this simplified situation, we can identify the population, the variable, and the parameter.<\/p>\n<ul>\n<li><strong>Population:<\/strong> [latex]200[\/latex] students at the college.<\/li>\n<li><strong>Variable:<\/strong> <em>Eligibility for financial aid<\/em> is a categorical variable, so we use a proportion as a summary.<\/li>\n<li><strong>Parameter = Population Proportion:<\/strong> [latex]60%[\/latex] of students, or [latex]0.6[\/latex] of the population, are eligible for financial aid.<\/li>\n<\/ul>\n<p>Note: Populations are usually much larger than [latex]200[\/latex] people. Also, in real situations, we do not know the population proportion. We are using a simplified situation to investigate how random samples relate to the population. This is the first step in creating a probability model that will be useful in inference. <em>How accurate are random samples at predicting this population proportion of [latex]0.60[\/latex]?<\/em> To answer this question, we randomly select [latex]8[\/latex] students and determine the proportion who are eligible for financial aid. We repeat this process several times. Here are the results for [latex]3[\/latex] random samples: <img loading=\"lazy\" decoding=\"async\" class=\"wp-image-3593 size-full aligncenter\" src=\"https:\/\/content-cdn.one.lumenlearning.com\/wp-content\/uploads\/sites\/10\/2022\/10\/05174016\/2.1.L.Diagram2-1.png\" alt=\"Financial aid eligibility: 3 random samples of students consisting of 8 students each (out of a total population of 200 students). The proportion eligible for financial aid in the population is .60. In the random samples, each student is assigned a number and then categorized as eligible for financial aid or not. 
The sample proportions are as follows: Sample 1 has 6 students eligible for aid and six divided by 8 is 0.75. Sample 2 has 5 students eligible for aid and five divided by 8 is 0.625. Sample 3 has 3 students eligible for aid and three divided by 8 is 0.375. When you average the sample proportions and round to the tenths place you get a proportion of .60.\" width=\"781\" height=\"566\" \/> <\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q257212\">More about these random samples.<\/button> <\/p>\n<div id=\"q257212\" class=\"hidden-answer\" style=\"display: none\"> Notice the following about these random samples:<\/p>\n<ul>\n<li>Each random sample came from a population in which the proportion eligible for financial aid is [latex]0.60[\/latex], but sample proportions vary. Each random sample has a different proportion who are eligible for financial aid.<\/li>\n<li>Some sample proportions are larger than the population proportion of [latex]0.60[\/latex]; some sample proportions are smaller than the population proportion.<\/li>\n<li>Some samples give good estimates of the population proportion. Some do not. In this case, [latex]0.625[\/latex] is a much better estimate than [latex]0.375[\/latex].<\/li>\n<li>A lot of variability occurs in these sample proportions. It is not surprising, therefore, that a sample of [latex]8[\/latex] students may give an inaccurate estimate of the proportion of those eligible for financial aid in the population. It makes sense that small samples of only [latex]8[\/latex] students may not represent the population accurately. 
Later we investigate the effect of increasing the size of the sample.<\/li>\n<li>The variability we see in proportions from random samples is due to chance.<\/div>\n<\/div>\n<\/li>\n<\/ul>\n<\/section>\n<h2>Sampling Bias<\/h2>\n<p>Remember that the ideal sample should be representative of the entire population.<\/p>\n<p>In statistics, a\u00a0<strong>sampling bias<\/strong>\u00a0is created when a sample is collected from a population and some members of the population are not as likely to be chosen as others (remember, each member of the population should have an equal chance of being chosen). When a sampling bias happens, there can be incorrect conclusions drawn about the population that is being studied.<\/p>\n<section class=\"textbox keyTakeaway\">\n<div>\n<h3>sampling bias<\/h3>\n<p><strong>Sampling bias<\/strong> occurs when some members of the intended population are less likely to be included in the sample than others, resulting in a sample that is not representative of the population as a whole.<\/div>\n<\/section>\n<p>When we say a random sample represents the population well, we mean that there is <em>no inherent bias<\/em> in this sampling technique. It is important to acknowledge, though, that this does not mean all random samples are necessarily \u201cperfect.\u201d<\/p>\n<p>Random samples are still random, and therefore no random sample will be exactly the same as another. One random sample may give a fairly accurate representation of the population, while another random sample might be \u201coff\u201d purely because of chance. 
Unfortunately, when looking at a particular sample (which is what happens in practice), we never know how much it differs from the population.<\/p>\n<section class=\"textbox tryIt\"><iframe loading=\"lazy\" id=\"ohm668\" class=\"resizable\" src=\"https:\/\/ohm.one.lumenlearning.com\/multiembedq.php?id=668&theme=lumen&iframe_resize_id=ohm668&source=tnh\" width=\"100%\" height=\"150\"><\/iframe><\/section>\n<section class=\"textbox tryIt\"><iframe loading=\"lazy\" id=\"ohm669\" class=\"resizable\" src=\"https:\/\/ohm.one.lumenlearning.com\/multiembedq.php?id=669&theme=lumen&iframe_resize_id=ohm669&source=tnh\" width=\"100%\" height=\"150\"><\/iframe><\/section>\n","protected":false},"author":15,"menu_order":10,"template":"","meta":{"_candela_citation":"[]","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"part":86,"module-header":"learn_it","content_attributions":[],"internal_book_links":[],"video_content":null,"cc_video_embed_content":{"cc_scripts":"","media_targets":[]},"try_it_collection":null,"_links":{"self":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/1821"}],"collection":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/users\/15"}],"version-history":[{"count":16,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/1821\/revisions"}],"predecessor-version":[{"id":15729,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/1821\/revisions\/15729"}],"part":[{"href":"https:\/\/content.one.lumenlearning.com\/qua
ntitativereasoning\/wp-json\/pressbooks\/v2\/parts\/86"}],"metadata":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/1821\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/media?parent=1821"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapter-type?post=1821"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/contributor?post=1821"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/license?post=1821"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}