{"id":1576,"date":"2023-04-11T16:27:22","date_gmt":"2023-04-11T16:27:22","guid":{"rendered":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/?post_type=chapter&#038;p=1576"},"modified":"2024-10-18T20:54:13","modified_gmt":"2024-10-18T20:54:13","slug":"representing-data-graphically-fresh-take","status":"web-only","type":"chapter","link":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/chapter\/representing-data-graphically-fresh-take\/","title":{"raw":"Representing Data Graphically: Fresh Take","rendered":"Representing Data Graphically: Fresh Take"},"content":{"raw":"<section class=\"textbox learningGoals\">\r\n<ul>\r\n\t<li style=\"list-style-type: none;\">\r\n<ul>\r\n\t<li>Organize data using tables, charts, and graphs<\/li>\r\n\t<li>Create and analyze side-by-side and stacked bar graphs<\/li>\r\n\t<li>Create and analyze graphs of quantitative data<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ul>\r\n<\/section>\r\n<h2>Visualizing Data<\/h2>\r\n<p>Recall that categorical data is data that is separated into distinct categories. Categorical variables are described using words. The count of each category is collected, which can be displayed using a table or a graph.<\/p>\r\n<div class=\"textbox shaded\">\r\n<p><strong>The Main Idea<\/strong><\/p>\r\n<p><b>Frequency tables\u00a0<\/b>list all the types of a categorical variable along with how many there are of each. Each category total is divided by the total of all the data to obtain the proportion of the total data that is contained in the category. The proportion may then be converted to a percentage, which is often called the \"relative frequency.\"<\/p>\r\n<p style=\"padding-left: 30px;\">\u00a0Ex. In a particular statistics class, [latex]10[\/latex] students major in business, [latex]5[\/latex] major in biology, and [latex]12[\/latex] major in health sciences. We can find the proportion of each major in the class by dividing the number appearing in that major by the total students. 
There are [latex]27[\/latex] total students given in the [latex]3[\/latex] majors. [latex]\\dfrac{10}{27}\\approx0.37[\/latex], which tells us about [latex]37%[\/latex] of the class majors in business.<\/p>\r\n<p><strong>Bar graphs\u00a0<\/strong>can also display either the count of each category or the proportion or percentage, depending on how the vertical axis is labeled.<\/p>\r\n<p style=\"padding-left: 30px;\">The horizontal axis lists each category of the variable.<\/p>\r\n<p style=\"padding-left: 30px;\">If the vertical axis lists counts of each, then the height of each bar above its category indicates the number of individual observations in that category.<\/p>\r\n<p style=\"padding-left: 30px;\">If the vertical axis lists percentages, then the height of the bar will indicate the proportion or percentage of each category out of the total observations.<\/p>\r\n<p style=\"padding-left: 30px;\">A <strong>Pareto chart<\/strong> is a bar graph ordered from highest to lowest frequency.<\/p>\r\n<p><strong>Pie charts\u00a0<\/strong>display either percentages or counts of each category arranged as slices of pie. The size of the slice corresponds to the proportion of observations in the category.<\/p>\r\n<p style=\"padding-left: 30px;\">In our example of the majors present in a statistics class, we calculated that [latex]37%[\/latex] of the students major in business.\u00a0[latex]\\dfrac{5}{27}\\approx0.185[\/latex], which tells us [latex]18.5%[\/latex] major in biology.\u00a0[latex]\\dfrac{12}{27}\\approx0.444[\/latex] indicates that about [latex]44.4%[\/latex] major in health sciences. 
Because of rounding, the percentages total [latex]99.9%[\/latex] rather than exactly [latex]100%[\/latex].<\/p>\r\n<\/div>\r\n<p>See the video below for a visual demonstration of how these charts are constructed from data collected on a categorical variable.<\/p>\r\n<section class=\"textbox watchIt\"><iframe src=\"\/\/plugin.3playmedia.com\/show?mf=10350385&amp;p3sdk_version=1.10.1&amp;p=20361&amp;pt=375&amp;video_id=Rx8wSEDq5Hs&amp;video_target=tpm-plugin-ua8zayzf-Rx8wSEDq5Hs\" width=\"800px\" height=\"450px\" frameborder=\"0\" marginwidth=\"0px\" marginheight=\"0px\"><\/iframe><br \/>\r\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/Bar+Chart+Pie+Chart+Frequency+Tables+%7C+Statistics+Tutorial+%7C+MarinStatsLectures.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cBar Chart, Pie Chart, Frequency Tables | Statistics Tutorial | MarinStatsLectures\u201d here (opens in new window).<\/a><\/p>\r\n<\/section>\r\n<h2>Interpreting Side-by-Side and Stacked Bar Graphs<\/h2>\r\n<div class=\"textbox shaded\">\r\n<h4><strong>The Main Idea<\/strong><\/h4>\r\n<p><strong>Side-by-side bar graphs<\/strong> are bar graphs that represent data for two categorical variables: for each group of one variable, a separate bar is drawn for each category of the other variable, and the bars are placed next to each other. Side-by-side bar graphs are most efficient when presenting data counts (not percentages).<\/p>\r\n<p><strong>Stacked bar graphs<\/strong> also represent data for two categorical variables, but the bars for each group are stacked into a single bar rather than placed side-by-side. Stacked bar graphs are most efficient when presenting percentages of the data in each group (not counts).<\/p>\r\n<p>The example problem below presents the same data displayed four different ways: as a contingency table (counts), as a conditional distribution (percentages), as side-by-side bar graphs (counts), and as stacked bar graphs (percentages). 
These pairings won't always hold; sometimes a side-by-side bar graph will display percentages, but these examples represent efficient uses of these displays. Also note that these bars are vertical, but some side-by-side and stacked graphs are displayed horizontally.<\/p>\r\n<\/div>\r\n<section class=\"textbox example\">The following displays present alcohol consumption by students at a college. Counts are given for four categories (abstaining, light consumption, moderate consumption, and heavy consumption) for first-year, second-year, third-year, and fourth-year students. The contingency table shows the level of alcohol consumption self-reported by [latex]253[\/latex] students. We can see, for example, that of the [latex]95[\/latex] second-year students who responded, [latex]27[\/latex] of them identified themselves as light drinkers.<center><img class=\"aligncenter wp-image-1889\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5826\/2022\/11\/09035253\/Contingency-Table.jpg\" alt=\"A contingency table showing Alcohol consumption counts by Class Year. The columns are Class Year, Abstain, Light, Moderate, Heavy, and Total. For Class Year 1, abstain is 8, light is 17, moderate is 21, heavy is 1, and the total is 47. For class year 2, abstain is 16, light is 27, moderate is 46, heavy is 6, and the total is 95. For Class Year 3, Abstain is 7, Light is 18, Moderate is 24, Heavy is 5, and the total is 54. For Class Year 4, Abstain is 3, Light is 21, Moderate is 29, Heavy is 4, and the total is 57. 
For all the class years combined, Abstain is 34, Light is 83, Moderate is 120, Heavy is 16, and the Total is 253.\" width=\"549\" height=\"193\" \/><\/center>\r\n<p>&nbsp;<\/p>\r\n<ol>\r\n\t<li>How many [latex]3[\/latex]rd year students identified themselves as heavy drinkers?<br \/>\r\n<p class=\"p1\">[reveal-answer q=\"190834\"]Show Answer[\/reveal-answer]<\/p>\r\n<p class=\"p1\">[hidden-answer a=\"190834\"]<\/p>\r\n<p class=\"p1\">[latex]5[\/latex] [\/hidden-answer]<br \/>\r\n<br \/>\r\n<\/p>\r\n<p class=\"p1\"><span style=\"font-family: 'Public Sans', -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen-Sans, Ubuntu, Cantarell, 'Helvetica Neue', sans-serif;\">Now let's look at the data as percentages rather than counts.<\/span><\/p>\r\n<center><img class=\"aligncenter wp-image-1890\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5826\/2022\/11\/09040027\/Conditional-Distribution.jpg\" alt=\"Conditional distribution of Alcohol consumption counts by Class Year. The columns are Class Year, Abstain, Light, Moderate, Heavy, and Total. For Class Year 1, abstain is 0.170, light is 0.362, moderate is 0.447, heavy is 0.021, and the total is 1. For class year 2, abstain is 0.168, light is 0.284, moderate is 0.484, heavy is 0.063, and the total is 1. For class year 3, abstain is 0.130, light is 0.333, moderate is 0.444, heavy is 0.093, and the total is 1. For class year 4, abstain is 0.053, light is 0.368, moderate is 0.509, heavy is 0.070, and the total is 1.\" width=\"550\" height=\"181\" \/><\/center>\r\n<p>Note that the percentages are given in decimal form. Multiply by [latex]100[\/latex] to convert to percentages. Each row adds up to [latex]1[\/latex] ([latex]100%[\/latex]) of all responses for each class year. 
We can see that the [latex]27[\/latex] of [latex]95[\/latex] [latex]2[\/latex]nd year students we located in the table above is represented in this table as [latex]\\dfrac{27}{95}\\approx0.284[\/latex], which is about [latex]28.4%[\/latex].<br \/>\r\n<br \/>\r\n<\/p>\r\n<\/li>\r\n\t<li>What percentage of [latex]3[\/latex]rd year students identified themselves as heavy drinkers?\r\n\r\n\r\n<p class=\"p1\" style=\"padding-left: 40px;\">[reveal-answer q=\"8877562\"]Show Answer[\/reveal-answer]<\/p>\r\n<p class=\"p1\" style=\"padding-left: 40px;\">[hidden-answer a=\"8877562\"] [latex]5\/54 \\approx 0.093[\/latex], which is about [latex]9.3%[\/latex]<\/p>\r\n<p class=\"p1\" style=\"padding-left: 40px;\">[\/hidden-answer]<\/p>\r\n<\/li>\r\n<\/ol>\r\n<\/section>\r\n<section class=\"textbox example\"><center><img class=\"aligncenter wp-image-1891 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5826\/2022\/11\/09040633\/newplot-4.png\" alt=\"A bar graph of Alcohol Consumption by Year, with the count on the vertical axis and the class year on the horizontal axis. The class years are 1, 2, 3, and 4. There is also a key saying that yellow represents abstain, green represents light, blue represents moderate, and red represents heavy. The data is the data from the contingency table showing Alcohol consumption counts by Class Year.\" width=\"711\" height=\"360\" \/><\/center>Here, we see the data from the contingency table displayed as side-by-side bar graphs. The horizontal axis contains the four class years of students while the vertical axis indicates the height of each bar in numbers of students. The differently colored bars each represent an alcohol consumption category.<br \/>\r\n<br \/>\r\n1. Try to locate the [latex]27[\/latex] second-year students who identified as light drinkers. 
What color is the bar and where is it located along the horizontal axis?\r\n\r\n\r\n<p class=\"p1\">[reveal-answer q=\"908234\"]Show Answer[\/reveal-answer]<\/p>\r\n<p class=\"p1\">[hidden-answer a=\"908234\"]The green bar above [latex]2[\/latex] on the horizontal axis.<\/p>\r\n<p class=\"p1\">[\/hidden-answer]<\/p>\r\n<p>&nbsp;<\/p>\r\n<p>2. Which class year reports the most moderate drinking?<\/p>\r\n<p class=\"p1\">[reveal-answer q=\"266144\"]Show Answer[\/reveal-answer]<\/p>\r\n<p class=\"p1\">[hidden-answer a=\"266144\"] Year [latex]2[\/latex]<\/p>\r\n<p class=\"p1\">[\/hidden-answer]<\/p>\r\n<\/section>\r\n<section class=\"textbox example\"><center><img class=\"aligncenter wp-image-1892 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5826\/2022\/11\/09041316\/newplot-3.png\" alt=\"A stacked bar graph of Alcohol Consumption by Year, with the count on the vertical axis and the class year on the horizontal axis. The class years are 1, 2, 3, and 4. There is also a key saying that yellow represents abstain, green represents light, blue represents moderate, and red represents heavy. The data is the data from the contingency table showing Alcohol consumption counts by Class Year. Each spot on the horizontal axis has one bar, divided into four sections, one of each color.\" width=\"711\" height=\"360\" \/><\/center>Finally, we see the information from the conditional distribution displayed as stacked bar graphs.\r\n\r\n\r\n<p>1. Which class year reported the lowest proportion of heavy drinkers?<\/p>\r\n<p class=\"p1\">[reveal-answer q=\"835577\"]Show Answer[\/reveal-answer]<\/p>\r\n<p class=\"p1\">[hidden-answer a=\"835577\"][latex]1[\/latex]st year: the red section at the top of the stacked bar above [latex]1[\/latex] on the horizontal axis represents [latex]2.1%[\/latex] of all [latex]1[\/latex]st year students who reported drinking heavily.<\/p>\r\n<p class=\"p1\">[\/hidden-answer]<\/p>\r\n<p>&nbsp;<\/p>\r\n<p>2. 
How did alcohol consumption change from class year to class year?<\/p>\r\n<p class=\"p1\">[reveal-answer q=\"800399\"]Show Answer[\/reveal-answer]<\/p>\r\n<p class=\"p1\">[hidden-answer a=\"800399\"](answers vary)<\/p>\r\n<p class=\"p1\">[\/hidden-answer]<\/p>\r\n<\/section>\r\n<h2>Dot Plots and Histograms<\/h2>\r\n<div class=\"textbox shaded\">\r\n<h4>The Main Idea<\/h4>\r\n<p><span class=\"TextRun SCXW26391111 BCX0\" lang=\"EN-US\" xml:lang=\"EN-US\" data-contrast=\"auto\"><span class=\"NormalTextRun SCXW26391111 BCX0\">A <strong>dot plot<\/strong> takes a collection of quantitative data points and distributes them across a horizontal axis (a number line). Each value is represented by a single dot on the dot plot.\u00a0 Identical values get stacked up so we can tell at a glance <\/span><span class=\"NormalTextRun SCXW26391111 BCX0\">which values showed up in <\/span><span class=\"NormalTextRun SCXW26391111 BCX0\">large quantities<\/span><span class=\"NormalTextRun SCXW26391111 BCX0\"> in the dataset and which are <\/span><span class=\"NormalTextRun SCXW26391111 BCX0\">rarer<\/span><span class=\"NormalTextRun SCXW26391111 BCX0\">. 
From a <\/span><span class=\"NormalTextRun SpellingErrorV2Themed SCXW26391111 BCX0\">dot plot<\/span><span class=\"NormalTextRun SCXW26391111 BCX0\">, if there <\/span><span class=\"NormalTextRun SCXW26391111 BCX0\">aren\u2019t<\/span><span class=\"NormalTextRun SCXW26391111 BCX0\"> too many data points, we can count the number of\u00a0observations and <\/span><span class=\"NormalTextRun SCXW26391111 BCX0\">locate<\/span><span class=\"NormalTextRun SCXW26391111 BCX0\"> the exact median of the data.<\/span><\/span><span class=\"EOP SCXW26391111 BCX0\" data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:259}\">\u00a0We can also discern the shape of the data distribution (is it symmetric or bunched up to one side or the other?).<\/span><\/p>\r\n<p><span class=\"TextRun SCXW125914864 BCX0\" lang=\"EN-US\" xml:lang=\"EN-US\" data-contrast=\"auto\"><span class=\"NormalTextRun SCXW125914864 BCX0\">A <strong>histogram<\/strong> is like a bar chart for quantitative variables. It takes all the data measurements collected and groups them into bins of equal width. The person creating the histogram, whether by technology or by hand, chooses the bin width. The smaller the bin width, the finer the detail; conversely, a large bin width may hide detail by flattening out\u00a0variation in the data. From a histogram, we can see summary information about the data set and discern the shape and center of the data. 
<\/span><\/span><span class=\"EOP SCXW125914864 BCX0\" data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\r\n<\/div>\r\n<p>The two videos below demonstrate how to read and interpret these quantitative graphs.<\/p>\r\n<section class=\"textbox watchIt\"><iframe src=\"\/\/plugin.3playmedia.com\/show?mf=10350386&amp;p3sdk_version=1.10.1&amp;p=20361&amp;pt=375&amp;video_id=le8lFMyg0nk&amp;video_target=tpm-plugin-hu86zhdd-le8lFMyg0nk\" width=\"800px\" height=\"450px\" frameborder=\"0\" marginwidth=\"0px\" marginheight=\"0px\"><\/iframe><br \/>\r\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/Interpreting+Dot+Plots.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cInterpreting Dot Plots\u201d here (opens in new window).<\/a><\/p>\r\n<\/section>\r\n<section class=\"textbox watchIt\"><iframe src=\"\/\/plugin.3playmedia.com\/show?mf=10350387&amp;p3sdk_version=1.10.1&amp;p=20361&amp;pt=375&amp;video_id=RZJ4qqQboHQ&amp;video_target=tpm-plugin-bzkfb2by-RZJ4qqQboHQ\" width=\"800px\" height=\"450px\" frameborder=\"0\" marginwidth=\"0px\" marginheight=\"0px\"><\/iframe><br \/>\r\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/Distributions+and+Their+Shapes.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cDistributions and Their Shapes\u201d here (opens in new window).<\/a><\/p>\r\n<\/section>\r\n<section class=\"textbox seeExample\">Here we have three graphs of the same set of hip girth measurements (circumference\/distance around someone's hips) for [latex]507[\/latex] adults who exercise regularly. From the dot plot, we can see that the distribution of hip measurements has an overall range of [latex]79[\/latex] to [latex]128[\/latex] cm. 
For convenience, we started the axis at [latex]75[\/latex] and ended the axis at [latex]130[\/latex].<center><img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031543\/m2_summarizing_data_topic_2_1_Topic2_1Histograms1of4_image1.png\" alt=\"Dot plot showing distribution of hip measurements of 507 adults. Most of the data points are right-skewed\" width=\"650\" height=\"310\" \/><\/center>\r\n<p>&nbsp;<\/p>\r\n\r\n\r\nTo create a histogram, divide the variable values into equal-sized intervals called <strong>bins<\/strong>. In this graph, we chose bins with a width of [latex]5[\/latex] cm. Each bin contains a different number of individuals. For example, [latex]48[\/latex] adults have hip measurements between [latex]85[\/latex] and [latex]90[\/latex] cm, and [latex]97[\/latex] adults have hip measurements between [latex]100[\/latex] and [latex]105[\/latex] cm.<center><img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031545\/m2_summarizing_data_topic_2_1_Topic2_1Histograms1of4_image2.png\" alt=\"Dot plot showing distribution of hip measurements of 507 adults, with white and gray bars overlaid on the dots every 5 cm.\" width=\"650\" height=\"310\" \/><\/center>\r\n<p>&nbsp;<\/p>\r\n\r\n\r\nHere is a histogram. Each bin is now a bar. The height of the bar indicates the number of individuals with hip measurements in the interval for that bin. 
As before, we can see that [latex]48[\/latex] adults have hip measurements between [latex]85[\/latex] and [latex]90[\/latex] cm, and [latex]97[\/latex] adults have hip measurements between [latex]100[\/latex] and [latex]105[\/latex] cm.<center><img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031547\/m2_summarizing_data_topic_2_1_Topic2_1Histograms1of4_image3.png\" alt=\"Histogram showing distribution of hip measurements of 507 adults, with bars indicating number of adults in each interval. The highest proportion of hip girth is in the ninety to one hundred cm range.\" width=\"650\" height=\"310\" \/><\/center>\r\n<p>&nbsp;<\/p>\r\n<em>Note:<\/em> In the histogram, the count is the number of individuals in each bin. The count is also called the <strong>frequency<\/strong>. From these counts, we can determine a percentage of individuals with a given interval of variable values. This percentage is called a <strong>relative frequency<\/strong>.<\/section>\r\n<section class=\"textbox example\">Using the data and graphs above, approximately what percentage of the sample has hip measurements between [latex]85[\/latex] and [latex]90[\/latex] cm?[reveal-answer q=\"405449\"]Show Solution[\/reveal-answer]<br \/>\r\n[hidden-answer a=\"405449\"] Of the [latex]507[\/latex] adults in the data set, [latex]48[\/latex] have hip measurements between [latex]85[\/latex] and [latex]90[\/latex] cm. 
[latex]48[\/latex] out of [latex]507[\/latex] is [latex]48 \u00f7 507 \u2248 0.095 = 9.5%[\/latex]. So approximately [latex]9.5%[\/latex] of the adults in this sample have hip girths between [latex]85[\/latex] and [latex]90[\/latex] cm.[\/hidden-answer]<\/section>","rendered":"<section class=\"textbox learningGoals\">\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Organize data using tables, charts, and graphs<\/li>\n<li>Create and analyze side-by-side and stacked bar graphs<\/li>\n<li>Create and analyze graphs of quantitative data<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/section>\n<h2>Visualizing Data<\/h2>\n<p>Recall that categorical data is data that is separated into distinct categories. Categorical variables are described using words. The count of each category is collected, which can be displayed using a table or a graph.<\/p>\n<div class=\"textbox shaded\">\n<p><strong>The Main Idea<\/strong><\/p>\n<p><b>Frequency tables\u00a0<\/b>list all the types of a categorical variable along with how many there are of each. Each category total is divided by the total of all the data to obtain the proportion of the total data that is contained in the category. The proportion may then be converted to a percentage, which is often called the &#8220;relative frequency.&#8221;<\/p>\n<p style=\"padding-left: 30px;\">\u00a0Ex. In a particular statistics class, [latex]10[\/latex] students major in business, [latex]5[\/latex] major in biology, and [latex]12[\/latex] major in health sciences. We can find the proportion of each major in the class by dividing the number appearing in that major by the total students. There are [latex]27[\/latex] total students given in the [latex]3[\/latex] majors. 
[latex]\\dfrac{10}{27}\\approx0.37[\/latex], which tells us about [latex]37%[\/latex] of the class majors in business.<\/p>\n<p><strong>Bar graphs\u00a0<\/strong>can also display either the count of each category or the proportion or percentage, depending on how the vertical axis is labeled.<\/p>\n<p style=\"padding-left: 30px;\">The horizontal axis lists each category of the variable.<\/p>\n<p style=\"padding-left: 30px;\">If the vertical axis lists counts of each, then the height of each bar above its category indicates the number of individual observations in that category.<\/p>\n<p style=\"padding-left: 30px;\">If the vertical axis lists percentages, then the height of the bar will indicate the proportion or percentage of each category out of the total observations.<\/p>\n<p style=\"padding-left: 30px;\">A <strong>Pareto chart<\/strong> is a bar graph ordered from highest to lowest frequency.<\/p>\n<p><strong>Pie charts\u00a0<\/strong>display either percentages or counts of each category arranged as slices of pie. The size of the slice corresponds to the proportion of observations in the category.<\/p>\n<p style=\"padding-left: 30px;\">In our example of the majors present in a statistics class, we calculated that [latex]37%[\/latex] of the students major in business.\u00a0[latex]\\dfrac{5}{27}\\approx0.185[\/latex], which tells us [latex]18.5%[\/latex] major in biology.\u00a0[latex]\\dfrac{12}{27}\\approx0.444[\/latex] indicates that about [latex]44.4%[\/latex] major in health sciences. 
Because of rounding, the percentages total [latex]99.9%[\/latex] rather than exactly [latex]100%[\/latex].<\/p>\n<\/div>\n<p>See the video below for a visual demonstration of how these charts are constructed from data collected on a categorical variable.<\/p>\n<section class=\"textbox watchIt\"><iframe loading=\"lazy\" src=\"\/\/plugin.3playmedia.com\/show?mf=10350385&amp;p3sdk_version=1.10.1&amp;p=20361&amp;pt=375&amp;video_id=Rx8wSEDq5Hs&amp;video_target=tpm-plugin-ua8zayzf-Rx8wSEDq5Hs\" width=\"800px\" height=\"450px\" frameborder=\"0\" marginwidth=\"0px\" marginheight=\"0px\"><\/iframe><\/p>\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/Bar+Chart+Pie+Chart+Frequency+Tables+%7C+Statistics+Tutorial+%7C+MarinStatsLectures.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cBar Chart, Pie Chart, Frequency Tables | Statistics Tutorial | MarinStatsLectures\u201d here (opens in new window).<\/a><\/p>\n<\/section>\n<h2>Interpreting Side-by-Side and Stacked Bar Graphs<\/h2>\n<div class=\"textbox shaded\">\n<h4><strong>The Main Idea<\/strong><\/h4>\n<p><strong>Side-by-side bar graphs<\/strong> are bar graphs that represent data for two categorical variables: for each group of one variable, a separate bar is drawn for each category of the other variable, and the bars are placed next to each other. Side-by-side bar graphs are most efficient when presenting data counts (not percentages).<\/p>\n<p><strong>Stacked bar graphs<\/strong> also represent data for two categorical variables, but the bars for each group are stacked into a single bar rather than placed side-by-side. Stacked bar graphs are most efficient when presenting percentages of the data in each group (not counts).<\/p>\n<p>The example problem below presents the same data displayed four different ways: as a contingency table (counts), as a conditional distribution (percentages), as side-by-side bar graphs (counts), and as stacked bar graphs (percentages). 
These pairings won&#8217;t always hold; sometimes a side-by-side bar graph will display percentages, but these examples represent efficient uses of these displays. Also note that these bars are vertical, but some side-by-side and stacked graphs are displayed horizontally.<\/p>\n<\/div>\n<section class=\"textbox example\">The following displays present alcohol consumption by students at a college. Counts are given for four categories (abstaining, light consumption, moderate consumption, and heavy consumption) for first-year, second-year, third-year, and fourth-year students. The contingency table shows the level of alcohol consumption self-reported by [latex]253[\/latex] students. We can see, for example, that of the [latex]95[\/latex] second-year students who responded, [latex]27[\/latex] of them identified themselves as light drinkers.<\/p>\n<div style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-1889\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5826\/2022\/11\/09035253\/Contingency-Table.jpg\" alt=\"A contingency table showing Alcohol consumption counts by Class Year. The columns are Class Year, Abstain, Light, Moderate, Heavy, and Total. For Class Year 1, abstain is 8, light is 17, moderate is 21, heavy is 1, and the total is 47. For class year 2, abstain is 16, light is 27, moderate is 46, heavy is 6, and the total is 95. For Class Year 3, Abstain is 7, Light is 18, Moderate is 24, Heavy is 5, and the total is 54. For Class Year 4, Abstain is 3, Light is 21, Moderate is 29, Heavy is 4, and the total is 57. 
For all the class years combined, Abstain is 34, Light is 83, Moderate is 120, Heavy is 16, and the Total is 253.\" width=\"549\" height=\"193\" \/><\/div>\n<p>&nbsp;<\/p>\n<ol>\n<li>How many [latex]3[\/latex]rd year students identified themselves as heavy drinkers?\n<p class=\"p1\">\n<div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q190834\">Show Answer<\/button><\/p>\n<p class=\"p1\">\n<div id=\"q190834\" class=\"hidden-answer\" style=\"display: none\">\n<p class=\"p1\">[latex]5[\/latex] <\/div>\n<\/div>\n<p class=\"p1\"><span style=\"font-family: 'Public Sans', -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen-Sans, Ubuntu, Cantarell, 'Helvetica Neue', sans-serif;\">Now let&#8217;s look at the data as percentages rather than counts.<\/span><\/p>\n<div style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-1890\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5826\/2022\/11\/09040027\/Conditional-Distribution.jpg\" alt=\"Conditional distribution of Alcohol consumption counts by Class Year. The columns are Class Year, Abstain, Light, Moderate, Heavy, and Total. For Class Year 1, abstain is 0.170, light is 0.362, moderate is 0.447, heavy is 0.021, and the total is 1. For class year 2, abstain is 0.168, light is 0.284, moderate is 0.484, heavy is 0.063, and the total is 1. For class year 3, abstain is 0.130, light is 0.333, moderate is 0.444, heavy is 0.093, and the total is 1. For class year 4, abstain is 0.053, light is 0.368, moderate is 0.509, heavy is 0.070, and the total is 1.\" width=\"550\" height=\"181\" \/><\/div>\n<p>Note that the percentages are given in decimal form. Multiply by [latex]100[\/latex] to convert to percentages. Each row adds up to [latex]1[\/latex] ([latex]100%[\/latex]) of all responses for each class year. 
We can see that the [latex]27[\/latex] of [latex]95[\/latex] [latex]2[\/latex]nd year students we located in the table above is represented in this table as [latex]\\dfrac{27}{95}\\approx0.284[\/latex], which is about [latex]28.4%[\/latex].<\/p>\n<\/li>\n<li>What percentage of [latex]3[\/latex]rd year students identified themselves as heavy drinkers?\n<p class=\"p1\" style=\"padding-left: 40px;\">\n<div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q8877562\">Show Answer<\/button><\/p>\n<p class=\"p1\" style=\"padding-left: 40px;\">\n<div id=\"q8877562\" class=\"hidden-answer\" style=\"display: none\"> [latex]5\/54 \\approx 0.093[\/latex], which is about [latex]9.3%[\/latex]<\/p>\n<p class=\"p1\" style=\"padding-left: 40px;\"><\/div>\n<\/div>\n<\/li>\n<\/ol>\n<\/section>\n<section class=\"textbox example\">\n<div style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-1891 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5826\/2022\/11\/09040633\/newplot-4.png\" alt=\"A bar graph of Alcohol Consumption by Year, with the count on the vertical axis and the class year on the horizontal axis. The class years are 1, 2, 3, and 4. There is also a key saying that yellow represents abstain, green represents light, blue represents moderate, and red represents heavy. The data is the data from the contingency table showing Alcohol consumption counts by Class Year.\" width=\"711\" height=\"360\" \/><\/div>\n<p>Here, we see the data from the contingency table displayed as side-by-side bar graphs. The horizontal axis contains the four class years of students while the vertical axis indicates the height of each bar in numbers of students. The differently colored bars each represent an alcohol consumption category.<\/p>\n<p>1. Try to locate the [latex]27[\/latex] second-year students who identified as light drinkers. 
What color is the bar and where is it located along the horizontal axis?<\/p>\n<p class=\"p1\">\n<div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q908234\">Show Answer<\/button><\/p>\n<p class=\"p1\">\n<div id=\"q908234\" class=\"hidden-answer\" style=\"display: none\">The green bar above [latex]2[\/latex] on the horizontal axis.<\/p>\n<p class=\"p1\"><\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<p>2. Which class year reports the most moderate drinking?<\/p>\n<p class=\"p1\">\n<div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q266144\">Show Answer<\/button><\/p>\n<p class=\"p1\">\n<div id=\"q266144\" class=\"hidden-answer\" style=\"display: none\"> Year [latex]2[\/latex]<\/p>\n<p class=\"p1\"><\/div>\n<\/div>\n<\/section>\n<section class=\"textbox example\">\n<div style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-1892 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5826\/2022\/11\/09041316\/newplot-3.png\" alt=\"A stacked bar graph of Alcohol Consumption by Year, with the count on the vertical axis and the class year on the horizontal axis. The class years are 1, 2, 3, and 4. There is also a key saying that yellow represents abstain, green represents light, blue represents moderate, and red represents heavy. The data is the data from the contingency table showing Alcohol consumption counts by Class Year. Each spot on the horizontal axis has one bar, divided into four sections, one of each color.\" width=\"711\" height=\"360\" \/><\/div>\n<p>Finally, we see the information from the conditional distribution displayed as stacked bar graphs.<\/p>\n<p>1. 
Which class year reported the lowest proportion of heavy drinkers?<\/p>\n<p class=\"p1\">\n<div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q835577\">Show Answer<\/button><\/p>\n<p class=\"p1\">\n<div id=\"q835577\" class=\"hidden-answer\" style=\"display: none\">[latex]1[\/latex]st year: the red section at the top of the stacked bar above [latex]1[\/latex] on the horizontal axis represents [latex]2.1%[\/latex] of all [latex]1[\/latex]st year students who reported drinking heavily.<\/p>\n<p class=\"p1\"><\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<p>2. How did alcohol consumption change from class year to class year?<\/p>\n<p class=\"p1\">\n<div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q800399\">Show Answer<\/button><\/p>\n<p class=\"p1\">\n<div id=\"q800399\" class=\"hidden-answer\" style=\"display: none\">(answers vary)<\/p>\n<p class=\"p1\"><\/div>\n<\/div>\n<\/section>\n<h2>Dot Plots and Histograms<\/h2>\n<div class=\"textbox shaded\">\n<h4>The Main Idea<\/h4>\n<p><span class=\"TextRun SCXW26391111 BCX0\" lang=\"EN-US\" xml:lang=\"EN-US\" data-contrast=\"auto\"><span class=\"NormalTextRun SCXW26391111 BCX0\">A <strong>dot plot<\/strong> takes a collection of quantitative data points and distributes them across a horizontal axis (a number line). Each value is represented by a single dot on the dot plot.\u00a0 Identical values get stacked up so we can tell at a glance <\/span><span class=\"NormalTextRun SCXW26391111 BCX0\">which values showed up in <\/span><span class=\"NormalTextRun SCXW26391111 BCX0\">large quantities<\/span><span class=\"NormalTextRun SCXW26391111 BCX0\"> in the dataset and which are <\/span><span class=\"NormalTextRun SCXW26391111 BCX0\">rarer<\/span><span class=\"NormalTextRun SCXW26391111 BCX0\">. 
From a <\/span><span class=\"NormalTextRun SpellingErrorV2Themed SCXW26391111 BCX0\">dot plot<\/span><span class=\"NormalTextRun SCXW26391111 BCX0\">, if there <\/span><span class=\"NormalTextRun SCXW26391111 BCX0\">aren\u2019t<\/span><span class=\"NormalTextRun SCXW26391111 BCX0\"> too many data points, we can count the number of\u00a0observations and <\/span><span class=\"NormalTextRun SCXW26391111 BCX0\">locate<\/span><span class=\"NormalTextRun SCXW26391111 BCX0\"> the exact median of the data.<\/span><\/span><span class=\"EOP SCXW26391111 BCX0\" data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:259}\">\u00a0We can also discern the shape of the data distribution (is it symmetric or bunched up to one side or the other?).<\/span><\/p>\n<p><span class=\"TextRun SCXW125914864 BCX0\" lang=\"EN-US\" xml:lang=\"EN-US\" data-contrast=\"auto\"><span class=\"NormalTextRun SCXW125914864 BCX0\">A <strong>histogram<\/strong> is like a bar chart for quantitative variables. It takes all the data measurements collected and groups them into bins of equal width. The person creating the histogram, whether by technology or by hand, chooses the bin width. The smaller the bin width, the finer the detail; conversely, a large bin width may hide detail by flattening out\u00a0variation in the data. From a histogram, we can see summary information about the data set and discern the shape and center of the data. 
<\/span><\/span><span class=\"EOP SCXW125914864 BCX0\" data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\n<\/div>\n<p>The two videos below demonstrate how to read and interpret these quantitative graphs.<\/p>\n<section class=\"textbox watchIt\"><iframe loading=\"lazy\" src=\"\/\/plugin.3playmedia.com\/show?mf=10350386&amp;p3sdk_version=1.10.1&amp;p=20361&amp;pt=375&amp;video_id=le8lFMyg0nk&amp;video_target=tpm-plugin-hu86zhdd-le8lFMyg0nk\" width=\"800px\" height=\"450px\" frameborder=\"0\" marginwidth=\"0px\" marginheight=\"0px\"><\/iframe><\/p>\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/Interpreting+Dot+Plots.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cInterpreting Dot Plots\u201d here (opens in new window).<\/a><\/p>\n<\/section>\n<section class=\"textbox watchIt\"><iframe loading=\"lazy\" src=\"\/\/plugin.3playmedia.com\/show?mf=10350387&amp;p3sdk_version=1.10.1&amp;p=20361&amp;pt=375&amp;video_id=RZJ4qqQboHQ&amp;video_target=tpm-plugin-bzkfb2by-RZJ4qqQboHQ\" width=\"800px\" height=\"450px\" frameborder=\"0\" marginwidth=\"0px\" marginheight=\"0px\"><\/iframe><\/p>\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/Distributions+and+Their+Shapes.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cDistributions and Their Shapes\u201d here (opens in new window).<\/a><\/p>\n<\/section>\n<section class=\"textbox seeExample\">Here we have three graphs of the same set of hip girth measurements (circumference\/distance around someone&#8217;s hips) for [latex]507[\/latex] adults who exercise regularly. From the dot plot, we can see that the distribution of hip measurements has an overall range of [latex]79[\/latex] to [latex]128[\/latex] cm. 
For convenience, we started the axis at [latex]75[\/latex] and ended the axis at [latex]130[\/latex].<\/p>\n<div style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031543\/m2_summarizing_data_topic_2_1_Topic2_1Histograms1of4_image1.png\" alt=\"Dot plot showing distribution of hip measurements of 507 adults. Most of the data points are right-skewed\" width=\"650\" height=\"310\" \/><\/div>\n<p>&nbsp;<\/p>\n<p>To create a histogram, divide the variable values into equal-sized intervals called <strong>bins<\/strong>. In this graph, we chose bins with a width of [latex]5[\/latex] cm. Each bin contains a different number of individuals. For example, [latex]48[\/latex] adults have hip measurements between [latex]85[\/latex] and [latex]90[\/latex] cm, and [latex]97[\/latex] adults have hip measurements between [latex]100[\/latex] and [latex]105[\/latex] cm.<\/p>\n<div style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031545\/m2_summarizing_data_topic_2_1_Topic2_1Histograms1of4_image2.png\" alt=\"Dot plot showing distribution of hip measurements of 507 adults, with white and gray bars overlaid on the dots every 5 cm.\" width=\"650\" height=\"310\" \/><\/div>\n<p>&nbsp;<\/p>\n<p>Here is a histogram. Each bin is now a bar. The height of the bar indicates the number of individuals with hip measurements in the interval for that bin. 
As before, we can see that [latex]48[\/latex] adults have hip measurements between [latex]85[\/latex] and [latex]90[\/latex] cm, and [latex]97[\/latex] adults have hip measurements between [latex]100[\/latex] and [latex]105[\/latex] cm.<\/p>\n<div style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031547\/m2_summarizing_data_topic_2_1_Topic2_1Histograms1of4_image3.png\" alt=\"Histogram showing distribution of hip measurements of 507 adults, with bars indicating number of adults in each interval. The highest proportion of hip girth is in the ninety to one hundred cm range.\" width=\"650\" height=\"310\" \/><\/div>\n<p>&nbsp;<\/p>\n<p><em>Note:<\/em> In the histogram, the count is the number of individuals in each bin. The count is also called the <strong>frequency<\/strong>. From these counts, we can determine the percentage of individuals whose values fall within a given interval. This percentage is called a <strong>relative frequency<\/strong>.<\/section>\n<section class=\"textbox example\">Using the data and graphs above, approximately what percentage of the sample has hip measurements between [latex]85[\/latex] and [latex]90[\/latex] cm?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q405449\">Show Solution<\/button><\/p>\n<div id=\"q405449\" class=\"hidden-answer\" style=\"display: none\"> Of the [latex]507[\/latex] adults in the data set, [latex]48[\/latex] have hip measurements between [latex]85[\/latex] and [latex]90[\/latex] cm. 
[latex]48[\/latex] out of [latex]507[\/latex] is [latex]48 \u00f7 507 \u2248 0.095 = 9.5%[\/latex]. So approximately [latex]9.5%[\/latex] of the adults in this sample have hip girths between [latex]85[\/latex] and [latex]90[\/latex] cm.<\/div>\n<\/div>\n<\/section>\n","protected":false},"author":15,"menu_order":10,"template":"","meta":{"_candela_citation":"[]","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"part":1572,"module-header":"fresh_take","content_attributions":[],"internal_book_links":[],"video_content":null,"cc_video_embed_content":{"cc_scripts":"","media_targets":[]},"try_it_collection":null,"_links":{"self":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/1576"}],"collection":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/users\/15"}],"version-history":[{"count":39,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/1576\/revisions"}],"predecessor-version":[{"id":15382,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/1576\/revisions\/15382"}],"part":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/parts\/1572"}],"metadata":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/1576\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/media?parent=1576"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\
/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapter-type?post=1576"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/contributor?post=1576"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/license?post=1576"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}