{"id":885,"date":"2023-03-20T19:17:59","date_gmt":"2023-03-20T19:17:59","guid":{"rendered":"https:\/\/content.one.lumenlearning.com\/introstatstest\/chapter\/measures-of-variability-learn-it-7\/"},"modified":"2025-05-08T03:16:00","modified_gmt":"2025-05-08T03:16:00","slug":"measures-of-variability-learn-it-7","status":"publish","type":"chapter","link":"https:\/\/content.one.lumenlearning.com\/introstatstest\/chapter\/measures-of-variability-learn-it-7\/","title":{"raw":"Measures of Variability: Learn It 6","rendered":"Measures of Variability: Learn It 6"},"content":{"raw":"<section class=\"textbox learningGoals\">\r\n<ul>\r\n\t<li>Describe the differences in variability in histograms and dotplots.<\/li>\r\n\t<li><span data-sheets-value=\"{&quot;1&quot;:2,&quot;2&quot;:&quot;Calculate the standard deviation and then describe what that value means&quot;}\" data-sheets-userformat=\"{&quot;2&quot;:4865,&quot;3&quot;:{&quot;1&quot;:0},&quot;11&quot;:0,&quot;12&quot;:0,&quot;15&quot;:&quot;Calibri&quot;}\">Calculate and describe standard deviation.<\/span><\/li>\r\n<\/ul>\r\n<\/section>\r\n<h2>Deciding Which Measurements to Use<\/h2>\r\n<p>We now have a choice between two measurements of center and spread: We can use the median with the interquartile range, or we can use the mean with the standard deviation. How do we decide which measurements to use?<\/p>\r\n<p>Our next examples show that the shape of the distribution and the presence of outliers help us answer this question.<\/p>\r\n<section class=\"textbox tryIt\">This boxplot is a summary of homework scores earned by a student. Notice that the distribution of scores has an outlier. 
This student has mostly high homework scores with one score of [latex]0[\/latex].\r\n[caption id=\"attachment_6482\" align=\"aligncenter\" width=\"681\"]<img class=\"wp-image-6482\" src=\"https:\/\/content-cdn.one.lumenlearning.com\/wp-content\/uploads\/sites\/27\/2023\/03\/20191758\/4.4.L.Boxplot6.png\" alt=\"Appropriate alternative text can be found in the description below.\" width=\"681\" height=\"143\" \/> Figure 1. Boxplot showing a student's homework scores, with most scores clustered high and one outlier at 0, indicating a much lower value than the rest.[\/caption]\r\nHere are some observations about the homework data:\r\n\r\n<ul>\r\n\t<li>Five-number summary: Minimum: [latex]0[\/latex], [latex]Q1: 82[\/latex], median: [latex]84.5[\/latex], [latex]Q3: 89[\/latex], maximum: [latex]100[\/latex]<\/li>\r\n\t<li>Median = [latex]84.5[\/latex]<\/li>\r\n\t<li>Mean = [latex]81.8[\/latex]<\/li>\r\n\t<li>IQR = [latex]7[\/latex]<\/li>\r\n\t<li>Range = [latex]100[\/latex]<\/li>\r\n\t<li>Standard deviation = [latex]17.6[\/latex]<\/li>\r\n<\/ul>\r\n<p>[reveal-answer q=\"251755\"]Which measure of center and spread is the better numerical summary of the student\u2019s performance on homework?[\/reveal-answer]<br \/>\r\n[hidden-answer a=\"251755\"]<\/p>\r\n<p>The \"typical\" range of scores based on the first and third quartiles is [latex]82[\/latex] to [latex]89[\/latex].<\/p>\r\n<p>The typical range of scores based on mean \u00b1 SD is [latex]64.2[\/latex] to [latex]99.4[\/latex] (Here\u2019s how we calculated this: [latex]81.8 - 17.6 = 64.2, 81.8 + 17.6 = 99.4[\/latex].)<\/p>\r\n<p>Which is the better summary of the student\u2019s performance on homework?<\/p>\r\n<p>The typical range based on the mean and standard deviation is not a good summary of this student\u2019s homework scores. Here we see that the outlier decreases the mean so that the mean is too low to be representative of this student\u2019s typical performance. 
We also see that the outlier increases the standard deviation, which gives the impression of wide variability in scores. This makes sense because the standard deviation measures, roughly, the average distance of the data values from the mean. So a point that lies far from the mean increases that average. In this example, a single score is responsible for giving the impression that the student\u2019s typical homework scores are lower and more variable than they really are.<\/p>\r\n<p>The typical range based on the first and third quartiles gives a better summary of this student\u2019s performance on homework because the outlier does not affect the quartile marks.<\/p>\r\n<p>The better numerical summaries of this student\u2019s performance on this homework data set are the <strong>five-number summary (which includes the median)<\/strong>, <strong>IQR, and range<\/strong>. [\/hidden-answer]<\/p>\r\n<\/section>\r\n<p>These examples illustrate some general guidelines for choosing numerical summaries:<\/p>\r\n<ul>\r\n\t<li>Like the mean, the standard deviation is strongly affected by outliers and skew in the data. Therefore, use the <strong>mean and the standard deviation<\/strong> as measures of center and spread only for distributions that are reasonably <strong>symmetric<\/strong> with a central peak. When outliers are present, the mean and standard deviation are not a good choice.<\/li>\r\n\t<li>Use the <strong>five-number summary (which includes the median, IQR, and range)<\/strong> for all other cases.<\/li>\r\n<\/ul>\r\n<p>Both of these examples also highlight another important principle: Always plot the data.<\/p>\r\n<p>We need to use a graph to determine the shape of the distribution. 
By looking at the shape, we can determine which measures of center and spread best describe the data.<\/p>\r\n<section class=\"textbox example\">[reveal-answer q=\"010878\"]See the Example Question[\/reveal-answer]<br \/>\r\n[hidden-answer a=\"010878\"][ohm2_question hide_question_numbers=1]1479[\/ohm2_question][\/hidden-answer]<br \/>\r\n[videopicker divId=\"tnh-video-picker\" title=\"Evaluate Distributions of Quantitative Variables\" label=\"Select Instructor\"]<br \/>\r\n[videooption displayName=\"Dr. Pamela E. Harris\" value=\"https:\/\/www.youtube.com\/watch?v=-3CPv4sxddQ\"][videooption displayName=\"Dr. Aris Winger\" value=\"https:\/\/www.youtube.com\/watch?v=uO-RADSx45g\"] [videooption displayName=\"Dr. Lane Fisher\" value=\"https:\/\/www.youtube.com\/watch?v=5xezQ1fJgn0\"]<br \/>\r\n[\/videopicker]<\/section>","rendered":"<section class=\"textbox learningGoals\">\n<ul>\n<li>Describe the differences in variability in histograms and dotplots.<\/li>\n<li><span data-sheets-value=\"{&quot;1&quot;:2,&quot;2&quot;:&quot;Calculate the standard deviation and then describe what that value means&quot;}\" data-sheets-userformat=\"{&quot;2&quot;:4865,&quot;3&quot;:{&quot;1&quot;:0},&quot;11&quot;:0,&quot;12&quot;:0,&quot;15&quot;:&quot;Calibri&quot;}\">Calculate and describe standard deviation.<\/span><\/li>\n<\/ul>\n<\/section>\n<h2>Deciding Which Measurements to Use<\/h2>\n<p>We now have a choice between two measurements of center and spread: We can use the median with the interquartile range, or we can use the mean with the standard deviation. How do we decide which measurements to use?<\/p>\n<p>Our next examples show that the shape of the distribution and the presence of outliers help us answer this question.<\/p>\n<section class=\"textbox tryIt\">This boxplot is a summary of homework scores earned by a student. Notice that the distribution of scores has an outlier. 
This student has mostly high homework scores with one score of [latex]0[\/latex].<\/p>\n<figure id=\"attachment_6482\" aria-describedby=\"caption-attachment-6482\" style=\"width: 681px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-6482\" src=\"https:\/\/content-cdn.one.lumenlearning.com\/wp-content\/uploads\/sites\/27\/2023\/03\/20191758\/4.4.L.Boxplot6.png\" alt=\"Appropriate alternative text can be found in the description below.\" width=\"681\" height=\"143\" \/><figcaption id=\"caption-attachment-6482\" class=\"wp-caption-text\">Figure 1. Boxplot showing a student&#8217;s homework scores, with most scores clustered high and one outlier at 0, indicating a much lower value than the rest.<\/figcaption><\/figure>\n<p>Here are some observations about the homework data:<\/p>\n<ul>\n<li>Five-number summary: Minimum: [latex]0[\/latex], [latex]Q1: 82[\/latex], median: [latex]84.5[\/latex], [latex]Q3: 89[\/latex], maximum: [latex]100[\/latex]<\/li>\n<li>Median = [latex]84.5[\/latex]<\/li>\n<li>Mean = [latex]81.8[\/latex]<\/li>\n<li>IQR = [latex]7[\/latex]<\/li>\n<li>Range = [latex]100[\/latex]<\/li>\n<li>Standard deviation = [latex]17.6[\/latex]<\/li>\n<\/ul>\n<p><div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q251755\">Which measure of center and spread is the better numerical summary of the student\u2019s performance on homework?<\/button><\/p>\n<div id=\"q251755\" class=\"hidden-answer\" style=\"display: none\">\n<p>The &#8220;typical&#8221; range of scores based on the first and third quartiles is [latex]82[\/latex] to [latex]89[\/latex].<\/p>\n<p>The typical range of scores based on mean \u00b1 SD is [latex]64.2[\/latex] to [latex]99.4[\/latex] (Here\u2019s how we calculated this: [latex]81.8 - 17.6 = 64.2, 81.8 + 17.6 = 99.4[\/latex].)<\/p>\n<p>Which is the better summary of the student\u2019s performance on homework?<\/p>\n<p>The typical range 
based on the mean and standard deviation is not a good summary of this student\u2019s homework scores. Here we see that the outlier decreases the mean so that the mean is too low to be representative of this student\u2019s typical performance. We also see that the outlier increases the standard deviation, which gives the impression of wide variability in scores. This makes sense because the standard deviation measures, roughly, the average distance of the data values from the mean. So a point that lies far from the mean increases that average. In this example, a single score is responsible for giving the impression that the student\u2019s typical homework scores are lower and more variable than they really are.<\/p>\n<p>The typical range based on the first and third quartiles gives a better summary of this student\u2019s performance on homework because the outlier does not affect the quartile marks.<\/p>\n<p>The better numerical summaries of this student&#8217;s performance on this homework data set are the <strong>five-number summary (which includes the median)<\/strong>, <strong>IQR, and range<\/strong>. <\/div>\n<\/div>\n<\/section>\n<p>These examples illustrate some general guidelines for choosing numerical summaries:<\/p>\n<ul>\n<li>Like the mean, the standard deviation is strongly affected by outliers and skew in the data. Therefore, use the <strong>mean and the standard deviation<\/strong> as measures of center and spread only for distributions that are reasonably <strong>symmetric<\/strong> with a central peak. When outliers are present, the mean and standard deviation are not a good choice.<\/li>\n<li>Use the <strong>five-number summary (which includes the median, IQR, and range)<\/strong> for all other cases.<\/li>\n<\/ul>\n<p>Both of these examples also highlight another important principle: Always plot the data.<\/p>\n<p>We need to use a graph to determine the shape of the distribution. 
By looking at the shape, we can determine which measures of center and spread best describe the data.<\/p>\n<section class=\"textbox example\">\n<div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q010878\">See the Example Question<\/button><\/p>\n<div id=\"q010878\" class=\"hidden-answer\" style=\"display: none\"><iframe loading=\"lazy\" id=\"ohm1479\" class=\"resizable\" src=\"https:\/\/ohm.one.lumenlearning.com\/multiembedq.php?id=1479&theme=lumen&iframe_resize_id=ohm1479&source=tnh\" width=\"100%\" height=\"150\"><\/iframe><\/div>\n<\/div>\n<div class=\"wp-nocaption \"><\/div>\n<div id=\"tnh-video-picker\" class=\"videoPicker\">\n<h3>Evaluate Distributions of Quantitative Variables<\/h3>\n<form><label>Select Instructor:<\/label><select name=\"video\"><option value=\"https:\/\/www.youtube.com\/embed\/-3CPv4sxddQ\">Dr. Pamela E. Harris<\/option><option value=\"https:\/\/www.youtube.com\/embed\/uO-RADSx45g\">Dr. Aris Winger<\/option><option value=\"https:\/\/www.youtube.com\/embed\/5xezQ1fJgn0\">Dr. Lane Fisher<\/option><\/select><\/form>\n<div class=\"videoContainer\"><iframe src=\"https:\/\/www.youtube.com\/embed\/-3CPv4sxddQ\" allowfullscreen><\/iframe><\/div>\n<\/section>\n","protected":false},"author":13,"menu_order":34,"template":"","meta":{"_candela_citation":"[]","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"part":834,"module-header":"learn_it","content_attributions":[],"internal_book_links":[],"video_content":[{"divId":"tnh-video-picker","title":"Evaluate Distributions of Quantitative Variables","label":"Select Instructor","video_collection":[{"displayName":"Dr. Pamela E. Harris","value":"https:\/\/www.youtube.com\/embed\/-3CPv4sxddQ"},{"displayName":"Dr. Aris Winger","value":"https:\/\/www.youtube.com\/embed\/uO-RADSx45g"},{"displayName":"Dr. 
Lane Fisher","value":"https:\/\/www.youtube.com\/embed\/5xezQ1fJgn0"}]}],"cc_video_embed_content":{"cc_scripts":"","media_targets":[]},"try_it_collection":null,"_links":{"self":[{"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/pressbooks\/v2\/chapters\/885"}],"collection":[{"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/wp\/v2\/users\/13"}],"version-history":[{"count":8,"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/pressbooks\/v2\/chapters\/885\/revisions"}],"predecessor-version":[{"id":6475,"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/pressbooks\/v2\/chapters\/885\/revisions\/6475"}],"part":[{"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/pressbooks\/v2\/parts\/834"}],"metadata":[{"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/pressbooks\/v2\/chapters\/885\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/wp\/v2\/media?parent=885"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/pressbooks\/v2\/chapter-type?post=885"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/wp\/v2\/contributor?post=885"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/wp\/v2\/license?post=885"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}