{"id":1722,"date":"2023-04-13T16:20:08","date_gmt":"2023-04-13T16:20:08","guid":{"rendered":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/?post_type=chapter&#038;p=1722"},"modified":"2025-08-27T00:24:21","modified_gmt":"2025-08-27T00:24:21","slug":"data-collection-basics-fresh-take","status":"web-only","type":"chapter","link":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/chapter\/data-collection-basics-fresh-take\/","title":{"raw":"Data Collection Basics: Fresh Take","rendered":"Data Collection Basics: Fresh Take"},"content":{"raw":"<section class=\"textbox learningGoals\">\r\n<ul>\r\n\t<li>Understand the difference between a census and a sample, and be able to identify the population being studied<\/li>\r\n\t<li>Distinguish between a value calculated from a sample and one calculated from a population<\/li>\r\n\t<li>Categorize a measurement as either numeric or qualitative<\/li>\r\n<\/ul>\r\n<\/section>\r\n<h2>Population vs Sample<\/h2>\r\n<div class=\"textbox shaded\">\r\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\r\n<p>In statistics, the distinction between a population and a sample is pivotal. The <strong>population<\/strong> refers to the entire group that is the focus of a researcher's interest, which can be people, animals, objects, or events. It includes all individuals or items that possess the characteristics that the researcher wants to study.<\/p>\r\n<p>On the other hand, a <strong>sample<\/strong> is a subset of this population, selected for participation in the study. Samples are used because it is often impractical or impossible to study the entire population. Therefore, a well-selected sample should be representative of the population, allowing researchers to make inferences about the larger group based on the sample's data. 
Different sampling methods are employed to ensure the sample is as representative as possible, reducing bias and facilitating the generalization of the study's results.<\/p>\r\n<\/div>\r\n<section class=\"textbox watchIt\"><iframe title=\"YouTube video player\" src=\"https:\/\/www.youtube.com\/embed\/eIZD1BFfw8E\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><br \/>\r\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/Population+vs+Sample.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cPopulation vs Sample\u201d here (opens in new window).<\/a><\/p>\r\n<\/section>\r\n<section class=\"textbox example\">To determine the average length of fish in a lake, researchers catch [latex]20[\/latex] fish and measure them. What are the sample and the population in this study?<br \/>\r\n[reveal-answer q=\"707088\"]Show Solution[\/reveal-answer]<br \/>\r\n[hidden-answer a=\"707088\"]The sample is the [latex]20[\/latex] fish caught. The population is all fish in the lake. The sample may be somewhat unrepresentative of the population, since fish too small to take the bait would not be caught.[\/hidden-answer]<\/section>\r\n<h2>Quantitative or Categorical<\/h2>\r\n<p>Once we have gathered data, we might wish to classify it.\u00a0 Roughly speaking, data can be classified as categorical data or quantitative data.<\/p>\r\n<center>\r\n[caption id=\"attachment_1021\" align=\"aligncenter\" width=\"640\"]<a href=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2016\/12\/29214357\/12210424505_2da556e2df_z.jpg\"><img class=\"wp-image-1021 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2016\/12\/29214357\/12210424505_2da556e2df_z.jpg\" alt=\"vertical lines of colored circles\" width=\"640\" height=\"321\" \/><\/a> Figure 1. 
This is a visualization of quantitative data\u00a0[\/caption]\r\n<\/center>\r\n<div class=\"textbox shaded\">\r\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\r\n<p><strong>Categorical (qualitative) data<\/strong> are pieces of information that allow us to classify the objects under investigation into various categories.<\/p>\r\n<p><strong>Quantitative data<\/strong> are responses that are numerical in nature and with which we can perform meaningful arithmetic calculations.<\/p>\r\n<\/div>\r\n<section class=\"textbox seeExample\">We might conduct a survey to determine the name of the favorite movie that each person in a math class saw in a movie theater. When we conduct such a survey, the responses would look like: <em>Avatar: The Way of Water<\/em>, <em>The Super Mario Bros. Movie<\/em>, or <em>Creed III<\/em>. We might count the number of people who give each answer, but the answers themselves do not have any numerical values: we cannot perform computations with an answer like \"<em>Avatar: The Way of Water<\/em>.\" Is this categorical or quantitative data?<br \/>\r\n[reveal-answer q=\"914414\"]Show Solution[\/reveal-answer]<br \/>\r\n[hidden-answer a=\"914414\"]This would be categorical data.[\/hidden-answer]<\/section>\r\n<section class=\"textbox seeExample\">A survey could ask the number of movies you have seen in a movie theater in the past [latex]12[\/latex] months ([latex]0, 1, 2, 3, 4, . . .[\/latex]).\u00a0Is this categorical or quantitative data?<br \/>\r\n[reveal-answer q=\"798578\"]Show Solution[\/reveal-answer]<br \/>\r\n[hidden-answer a=\"798578\"]This would be quantitative data. Other examples of quantitative data would be the running time of the movie you saw most recently ([latex]104[\/latex] minutes, [latex]137[\/latex] minutes, [latex]104[\/latex] minutes, . . .) or the amount of money you paid for a movie ticket the last time you went to a movie theater ([latex]$10.45, $11.75, $12[\/latex], . . 
.).[\/hidden-answer]<\/section>\r\n<p>Sometimes, determining whether data is categorical or quantitative can be a bit trickier. \u00a0In the next example, the data collected is in numerical form, but it is not quantitative data. Read on to find out why.<\/p>\r\n<section class=\"textbox seeExample\">Suppose we gather respondents' ZIP codes in a survey to track their geographical location. Is this categorical or quantitative?<center><a href=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2017\/02\/09002155\/Screen-Shot-2017-02-08-at-4.21.15-PM.png\"><img class=\" wp-image-1453\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2017\/02\/09002155\/Screen-Shot-2017-02-08-at-4.21.15-PM-300x230.png\" alt=\"Map of Portland, OR with zip codes.\" width=\"244\" height=\"187\" \/><\/a><\/center><center><strong><span style=\"font-size: 10pt;\">Zip Codes for Portland, OR<\/span><\/strong><\/center>\r\n<p>[reveal-answer q=\"103310\"]Show Solution[\/reveal-answer]<br \/>\r\n[hidden-answer a=\"103310\"]ZIP codes are numbers, but we can't do any meaningful mathematical calculations with them (it doesn't make sense to say that [latex]98036[\/latex] is \"twice\" [latex]49018[\/latex]\u00a0\u2014 that's like saying that Lynnwood, WA is \"twice\" Battle Creek, MI, which doesn't make sense at all), so ZIP codes are really categorical data.[\/hidden-answer]<\/p>\r\n<\/section>\r\n<section class=\"textbox seeExample\">A survey about the movie you most recently attended includes the question \"How would you rate the movie you just saw?\" with these possible answers:\r\n\r\n<ol style=\"list-style-type: decimal;\">\r\n\t<li>it was awful<\/li>\r\n\t<li>it was just OK<\/li>\r\n\t<li>I liked it<\/li>\r\n\t<li>it was great<\/li>\r\n\t<li>best movie ever!<\/li>\r\n<\/ol>\r\n<p>Is this categorical or quantitative?<br \/>\r\n[reveal-answer q=\"286755\"]Show Solution[\/reveal-answer]<br \/>\r\n[hidden-answer 
a=\"286755\"]<\/p>\r\n<p>Again, there are numbers associated with the responses, but we can't really do any calculations with them: a movie that rates a [latex]4[\/latex] is not necessarily twice as good as a movie that rates a [latex]2[\/latex], whatever that means. If two people see the movie and one of them thinks it stinks while the other thinks it's the best ever, it doesn't necessarily make sense to say that \"on average they liked it.\" So these ratings are categorical data, even though the categories have a natural order.<\/p>\r\n<p>As we study movie-going habits and preferences, we shouldn't forget to specify the population under consideration. If we survey [latex]7-9[\/latex] year-olds, the runaway favorite might be <em>The Super Mario Bros. Movie<\/em>. [latex]14-17[\/latex] year-olds might prefer <em>Avatar: The Way of Water<\/em>. And [latex]33-37[\/latex] year-olds might prefer . . . well, <em>The Super Mario Bros. Movie<\/em>.<\/p>\r\n<p>[\/hidden-answer]<\/p>\r\n<\/section>\r\n<p>The examples in this page are discussed further in the following video:<\/p>\r\n<section class=\"textbox watchIt\"><iframe title=\"YouTube video player\" src=\"https:\/\/www.youtube.com\/embed\/mxZqyB01qPY\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><br \/>\r\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/Qualitative+and+Quantitative.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cQualitative and Quantitative\u201d here (opens in new window).<\/a><\/p>\r\n<\/section>","rendered":"<section class=\"textbox learningGoals\">\n<ul>\n<li>Understand the difference between a census and a sample, and be able to identify the population being studied<\/li>\n<li>Distinguish between a value calculated from a sample and one calculated from a population<\/li>\n<li>Categorize a measurement as either numeric or qualitative<\/li>\n<\/ul>\n<\/section>\n<h2>Population vs Sample<\/h2>\n<div class=\"textbox 
shaded\">\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\n<p>In statistics, the distinction between a population and a sample is pivotal. The <strong>population<\/strong> refers to the entire group that is the focus of a researcher&#8217;s interest, which can be people, animals, objects, or events. It includes all individuals or items that possess the characteristics that the researcher wants to study.<\/p>\n<p>On the other hand, a <strong>sample<\/strong> is a subset of this population, selected for participation in the study. Samples are used because it is often impractical or impossible to study the entire population. Therefore, a well-selected sample should be representative of the population, allowing researchers to make inferences about the larger group based on the sample&#8217;s data. Different sampling methods are employed to ensure the sample is as representative as possible, reducing bias and facilitating the generalization of the study&#8217;s results.<\/p>\n<\/div>\n<section class=\"textbox watchIt\"><iframe loading=\"lazy\" title=\"YouTube video player\" src=\"https:\/\/www.youtube.com\/embed\/eIZD1BFfw8E\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/Population+vs+Sample.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cPopulation vs Sample\u201d here (opens in new window).<\/a><\/p>\n<\/section>\n<section class=\"textbox example\">To determine the average length of fish in a lake, researchers catch [latex]20[\/latex] fish and measure them. 
What are the sample and the population in this study?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q707088\">Show Solution<\/button><\/p>\n<div id=\"q707088\" class=\"hidden-answer\" style=\"display: none\">The sample is the [latex]20[\/latex] fish caught. The population is all fish in the lake. The sample may be somewhat unrepresentative of the population, since fish too small to take the bait would not be caught.<\/div>\n<\/div>\n<\/section>\n<h2>Quantitative or Categorical<\/h2>\n<p>Once we have gathered data, we might wish to classify it.\u00a0 Roughly speaking, data can be classified as categorical data or quantitative data.<\/p>\n<div style=\"text-align: center;\">\n<figure id=\"attachment_1021\" aria-describedby=\"caption-attachment-1021\" style=\"width: 640px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2016\/12\/29214357\/12210424505_2da556e2df_z.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-1021 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2016\/12\/29214357\/12210424505_2da556e2df_z.jpg\" alt=\"vertical lines of colored circles\" width=\"640\" height=\"321\" \/><\/a><figcaption id=\"caption-attachment-1021\" class=\"wp-caption-text\">Figure 1. 
This is a visualization of quantitative data\u00a0<\/figcaption><\/figure>\n<\/div>\n<div class=\"textbox shaded\">\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\n<p><strong>Categorical (qualitative) data<\/strong> are pieces of information that allow us to classify the objects under investigation into various categories.<\/p>\n<p><strong>Quantitative data<\/strong> are responses that are numerical in nature and with which we can perform meaningful arithmetic calculations.<\/p>\n<\/div>\n<section class=\"textbox seeExample\">We might conduct a survey to determine the name of the favorite movie that each person in a math class saw in a movie theater. When we conduct such a survey, the responses would look like: <em>Avatar: The Way of Water<\/em>, <em>The Super Mario Bros. Movie<\/em>, or <em>Creed III<\/em>. We might count the number of people who give each answer, but the answers themselves do not have any numerical values: we cannot perform computations with an answer like &#8220;<em>Avatar: The Way of Water<\/em>.&#8221; Is this categorical or quantitative data?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q914414\">Show Solution<\/button><\/p>\n<div id=\"q914414\" class=\"hidden-answer\" style=\"display: none\">This would be categorical data.<\/div>\n<\/div>\n<\/section>\n<section class=\"textbox seeExample\">A survey could ask the number of movies you have seen in a movie theater in the past [latex]12[\/latex] months ([latex]0, 1, 2, 3, 4, . . .[\/latex]).\u00a0Is this categorical or quantitative data?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q798578\">Show Solution<\/button><\/p>\n<div id=\"q798578\" class=\"hidden-answer\" style=\"display: none\">This would be quantitative data. 
Other examples of quantitative data would be the running time of the movie you saw most recently ([latex]104[\/latex] minutes, [latex]137[\/latex] minutes, [latex]104[\/latex] minutes, . . .) or the amount of money you paid for a movie ticket the last time you went to a movie theater ([latex]$10.45, $11.75, $12[\/latex], . . .).<\/div>\n<\/div>\n<\/section>\n<p>Sometimes, determining whether data is categorical or quantitative can be a bit trickier. \u00a0In the next example, the data collected is in numerical form, but it is not quantitative data. Read on to find out why.<\/p>\n<section class=\"textbox seeExample\">Suppose we gather respondents&#8217; ZIP codes in a survey to track their geographical location. Is this categorical or quantitative?<\/p>\n<div style=\"text-align: center;\"><a href=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2017\/02\/09002155\/Screen-Shot-2017-02-08-at-4.21.15-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-1453\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2017\/02\/09002155\/Screen-Shot-2017-02-08-at-4.21.15-PM-300x230.png\" alt=\"Map of Portland, OR with zip codes.\" width=\"244\" height=\"187\" \/><\/a><\/div>\n<div style=\"text-align: center;\"><strong><span style=\"font-size: 10pt;\">Zip Codes for Portland, OR<\/span><\/strong><\/div>\n<p><div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q103310\">Show Solution<\/button><\/p>\n<div id=\"q103310\" class=\"hidden-answer\" style=\"display: none\">ZIP codes are numbers, but we can&#8217;t do any meaningful mathematical calculations with them (it doesn&#8217;t make sense to say that [latex]98036[\/latex] is &#8220;twice&#8221; [latex]49018[\/latex]\u00a0\u2014 that&#8217;s like saying that Lynnwood, WA is &#8220;twice&#8221; Battle Creek, MI, which doesn&#8217;t make sense at all), so ZIP codes 
are really categorical data.<\/div>\n<\/div>\n<\/section>\n<section class=\"textbox seeExample\">A survey about the movie you most recently attended includes the question &#8220;How would you rate the movie you just saw?&#8221; with these possible answers:<\/p>\n<ol style=\"list-style-type: decimal;\">\n<li>it was awful<\/li>\n<li>it was just OK<\/li>\n<li>I liked it<\/li>\n<li>it was great<\/li>\n<li>best movie ever!<\/li>\n<\/ol>\n<p>Is this categorical or quantitative?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q286755\">Show Solution<\/button><\/p>\n<div id=\"q286755\" class=\"hidden-answer\" style=\"display: none\">\n<p>Again, there are numbers associated with the responses, but we can&#8217;t really do any calculations with them: a movie that rates a [latex]4[\/latex] is not necessarily twice as good as a movie that rates a [latex]2[\/latex], whatever that means. If two people see the movie and one of them thinks it stinks while the other thinks it&#8217;s the best ever, it doesn&#8217;t necessarily make sense to say that &#8220;on average they liked it.&#8221; So these ratings are categorical data, even though the categories have a natural order.<\/p>\n<p>As we study movie-going habits and preferences, we shouldn&#8217;t forget to specify the population under consideration. If we survey [latex]7-9[\/latex] year-olds, the runaway favorite might be <em>The Super Mario Bros. Movie<\/em>. [latex]14-17[\/latex] year-olds might prefer <em>Avatar: The Way of Water<\/em>. And [latex]33-37[\/latex] year-olds might prefer . . . well, <em>The Super Mario Bros. 
Movie<\/em>.<\/p>\n<\/div>\n<\/div>\n<\/section>\n<p>The examples in this page are discussed further in the following video:<\/p>\n<section class=\"textbox watchIt\"><iframe loading=\"lazy\" title=\"YouTube video player\" src=\"https:\/\/www.youtube.com\/embed\/mxZqyB01qPY\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/Qualitative+and+Quantitative.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cQualitative and Quantitative\u201d here (opens in new window).<\/a><\/p>\n<\/section>\n","protected":false},"author":15,"menu_order":9,"template":"","meta":{"_candela_citation":"[]","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"part":86,"module-header":"fresh_take","content_attributions":[],"internal_book_links":[],"video_content":null,"cc_video_embed_content":{"cc_scripts":"","media_targets":[]},"try_it_collection":null,"_links":{"self":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/1722"}],"collection":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/users\/15"}],"version-history":[{"count":27,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/1722\/revisions"}],"predecessor-version":[{"id":15727,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/1722\/revisions\/15727"}],"part":[{"href":"https:\/\/conten
t.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/parts\/86"}],"metadata":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/1722\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/media?parent=1722"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapter-type?post=1722"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/contributor?post=1722"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/license?post=1722"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}