{"id":2060,"date":"2023-04-24T19:27:43","date_gmt":"2023-04-24T19:27:43","guid":{"rendered":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/?post_type=chapter&#038;p=2060"},"modified":"2025-08-27T00:05:52","modified_gmt":"2025-08-27T00:05:52","slug":"data-exploration-learn-it-3","status":"web-only","type":"chapter","link":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/chapter\/data-exploration-learn-it-3\/","title":{"raw":"Data Exploration: Learn It 3","rendered":"Data Exploration: Learn It 3"},"content":{"raw":"<h2>Data Exploration: Employee Salaries<\/h2>\r\n<p>Salary data from two companies is presented below, Company A and Company B, both in the same field and geographic region. We want to compare the salaries by looking at graphical representations of the data.<\/p>\r\n<p>Salaried Employees: Company A<\/p>\r\n<center>\r\n<table border=\"0\" cellspacing=\"0\" cellpadding=\"0\" align=\"center\">\r\n<tbody>\r\n<tr>\r\n<td align=\"right\">[latex]68340[\/latex]<\/td>\r\n<td align=\"right\">[latex]87282[\/latex]<\/td>\r\n<td align=\"right\">[latex]103802[\/latex]<\/td>\r\n<td align=\"right\">[latex]128863[\/latex]<\/td>\r\n<td align=\"right\">[latex]140085[\/latex]<\/td>\r\n<td align=\"right\">[latex]162300[\/latex]<\/td>\r\n<td align=\"right\">[latex]177109[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td align=\"right\">[latex]70138[\/latex]<\/td>\r\n<td align=\"right\">[latex]90553[\/latex]<\/td>\r\n<td align=\"right\">[latex]106562[\/latex]<\/td>\r\n<td align=\"right\">[latex]128933[\/latex]<\/td>\r\n<td align=\"right\">[latex]147419[\/latex]<\/td>\r\n<td align=\"right\">[latex]168676[\/latex]<\/td>\r\n<td align=\"right\">[latex]180174[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td align=\"right\">[latex]71417[\/latex]<\/td>\r\n<td align=\"right\">[latex]95226[\/latex]<\/td>\r\n<td align=\"right\">[latex]120701[\/latex]<\/td>\r\n<td align=\"right\">[latex]130780[\/latex]<\/td>\r\n<td align=\"right\">[latex]149514[\/latex]<\/td>\r\n<td 
align=\"right\">[latex]169409[\/latex]<\/td>\r\n<td align=\"right\">[latex]180221[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td align=\"right\">[latex]71867[\/latex]<\/td>\r\n<td align=\"right\">[latex]97042[\/latex]<\/td>\r\n<td align=\"right\">[latex]123313[\/latex]<\/td>\r\n<td align=\"right\">[latex]136204[\/latex]<\/td>\r\n<td align=\"right\">[latex]152008[\/latex]<\/td>\r\n<td align=\"right\">[latex]170031[\/latex]<\/td>\r\n<td align=\"right\">[latex]185837[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td align=\"right\">[latex]84675[\/latex]<\/td>\r\n<td align=\"right\">[latex]100531[\/latex]<\/td>\r\n<td align=\"right\">[latex]125614[\/latex]<\/td>\r\n<td align=\"right\">[latex]138920[\/latex]<\/td>\r\n<td align=\"right\">[latex]155032[\/latex]<\/td>\r\n<td align=\"right\">[latex]175118[\/latex]<\/td>\r\n<td align=\"right\">[latex]189320[\/latex]<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/center>\r\n<p>&nbsp;<\/p>\r\n<p>Salaried Employees: Company B<\/p>\r\n<center>\r\n<table border=\"0\" cellspacing=\"0\" cellpadding=\"0\" align=\"center\">\r\n<tbody>\r\n<tr>\r\n<td align=\"right\">[latex]35472[\/latex]<\/td>\r\n<td align=\"right\">[latex]43467[\/latex]<\/td>\r\n<td align=\"right\">[latex]53624[\/latex]<\/td>\r\n<td align=\"right\">[latex]65096[\/latex]<\/td>\r\n<td align=\"right\">[latex]72290[\/latex]<\/td>\r\n<td align=\"right\">[latex]110351[\/latex]<\/td>\r\n<td align=\"right\">[latex]124732[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td align=\"right\">[latex]36983[\/latex]<\/td>\r\n<td align=\"right\">[latex]46652[\/latex]<\/td>\r\n<td align=\"right\">[latex]57946[\/latex]<\/td>\r\n<td align=\"right\">[latex]66235[\/latex]<\/td>\r\n<td align=\"right\">[latex]75279[\/latex]<\/td>\r\n<td align=\"right\">[latex]117574[\/latex]<\/td>\r\n<td align=\"right\">[latex]228920[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td align=\"right\">[latex]38382[\/latex]<\/td>\r\n<td align=\"right\">[latex]49655[\/latex]<\/td>\r\n<td align=\"right\">[latex]59096[\/latex]<\/td>\r\n<td 
align=\"right\">[latex]69721[\/latex]<\/td>\r\n<td align=\"right\">[latex]107368[\/latex]<\/td>\r\n<td align=\"right\">[latex]118810[\/latex]<\/td>\r\n<td align=\"right\">[latex]245427[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td align=\"right\">[latex]41674[\/latex]<\/td>\r\n<td align=\"right\">[latex]53231[\/latex]<\/td>\r\n<td align=\"right\">[latex]59709[\/latex]<\/td>\r\n<td align=\"right\">[latex]71289[\/latex]<\/td>\r\n<td align=\"right\">[latex]108236[\/latex]<\/td>\r\n<td align=\"right\">[latex]119112[\/latex]<\/td>\r\n<td align=\"right\">[latex]275024[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td align=\"right\">[latex]43256[\/latex]<\/td>\r\n<td align=\"right\">[latex]53506[\/latex]<\/td>\r\n<td align=\"right\">[latex]61724[\/latex]<\/td>\r\n<td align=\"right\">[latex]72211[\/latex]<\/td>\r\n<td align=\"right\">[latex]109472[\/latex]<\/td>\r\n<td align=\"right\">[latex]124678[\/latex]<\/td>\r\n<td align=\"right\">[latex]293012[\/latex]<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/center>\r\n<h3>Hands-on Spreadsheet: Explore the Data<\/h3>\r\n<p>The examples below use <a href=\"https:\/\/products.office.com\/en-us\/excel-c\">Microsoft Excel<\/a>, but you can also use a free alternative such as\u00a0<a href=\"https:\/\/www.openoffice.org\/product\/calc.html\">Apache OpenOffice Calc\u00a0<\/a>or\u00a0<a href=\"https:\/\/www.google.com\/sheets\/about\/\">Google Sheets<\/a>.<\/p>\r\n<h3>Step [latex]1[\/latex]: Store the data<\/h3>\r\n<ol>\r\n\t<li>Type or copy the data into a new spreadsheet. Title the tab Employee Salaries. 
Place each company's salaries in its own column, side by side: Company A in column A and Company B in column B.<\/li>\r\n\t<li>Obtain descriptive statistics for each company's data.<\/li>\r\n\t<li>Analyze the descriptive statistics and compare the companies' data.<br \/>\r\n[reveal-answer q=\"839486\"]Solution[\/reveal-answer]<br \/>\r\n[hidden-answer a=\"839486\"]<center><img class=\"aligncenter wp-image-4409 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/4685\/2020\/04\/13121927\/EmployeeSalary_01.jpg\" alt=\"Data and summary statistics in Excel\" width=\"370\" height=\"719\" \/><\/center><br \/>\r\nWe notice a substantial difference in both the center and the spread of the data between Company A and Company B. Company A's salaries appear to be symmetrically distributed, with a higher mean and median than Company B's. Company B has a smaller minimum value and a larger maximum value, but its distribution is highly skewed (with a skewness score of [latex]1.85[\/latex]). The standard deviation of Company B's data is much larger than Company A's. Since B's mean is pulled significantly to the right of its median, we expect a small number of much higher salaries at the top.[\/hidden-answer]<\/li>\r\n<\/ol>\r\n<h3>Step [latex]2[\/latex]: Create a box and whisker plot with both data series on the same graph<\/h3>\r\n<ol>\r\n\t<li>Select both columns of data together with their labels.<\/li>\r\n\t<li>Click Insert, then Box and Whisker. You should see both sets of data appear as parallel box plots on the graph in different colors. Click the chart to select it.<\/li>\r\n\t<li>Click the plus sign next to the chart, then select Legend. The data column labels should appear at the top under the Chart Title. You can now delete the Chart Title.<\/li>\r\n<\/ol>\r\n<h3>Step [latex]3[\/latex]: 
Create a scatter plot with both data series on the same graph<\/h3>\r\n<ol>\r\n\t<li>Follow the same steps as in Step [latex]2[\/latex] above, but this time choose Scatter Plot instead of Box and Whisker.<center>\r\n[caption id=\"attachment_4410\" align=\"aligncenter\" width=\"885\"]<img class=\"wp-image-4410 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/4685\/2020\/04\/13122008\/EmployeeSalary_02.jpg\" alt=\"Data, summary statistics, a boxplot, and a dotplot in Excel\" width=\"885\" height=\"715\" \/> Figure 1. Create a scatter plot[\/caption]\r\n<\/center><\/li>\r\n<\/ol>\r\n<h3>Step [latex]4[\/latex]: Analyze the data<\/h3>\r\n<p>As we saw in the descriptive statistics, Company A has a tighter distribution of salaries about its center. Company B has extremes at both ends of the salary range. Most salaries in B are substantially lower than those in A, with the notable exception of four outliers at the top end. These outliers pull the mean of B far to the right of its median.<\/p>\r\n<section class=\"textbox tryIt\">[ohm2_question hide_question_numbers=1]10945[\/ohm2_question]<\/section>","rendered":"<h2>Data Exploration: Employee Salaries<\/h2>\n<p>Salary data from two companies is presented below, Company A and Company B, both in the same field and geographic region. 
We want to compare the salaries by looking at graphical representations of the data.<\/p>\n<p>Salaried Employees: Company A<\/p>\n<div style=\"text-align: center;\">\n<table cellpadding=\"0\" style=\"border-spacing: 0px; margin: auto;\">\n<tbody>\n<tr>\n<td align=\"right\">[latex]68340[\/latex]<\/td>\n<td align=\"right\">[latex]87282[\/latex]<\/td>\n<td align=\"right\">[latex]103802[\/latex]<\/td>\n<td align=\"right\">[latex]128863[\/latex]<\/td>\n<td align=\"right\">[latex]140085[\/latex]<\/td>\n<td align=\"right\">[latex]162300[\/latex]<\/td>\n<td align=\"right\">[latex]177109[\/latex]<\/td>\n<\/tr>\n<tr>\n<td align=\"right\">[latex]70138[\/latex]<\/td>\n<td align=\"right\">[latex]90553[\/latex]<\/td>\n<td align=\"right\">[latex]106562[\/latex]<\/td>\n<td align=\"right\">[latex]128933[\/latex]<\/td>\n<td align=\"right\">[latex]147419[\/latex]<\/td>\n<td align=\"right\">[latex]168676[\/latex]<\/td>\n<td align=\"right\">[latex]180174[\/latex]<\/td>\n<\/tr>\n<tr>\n<td align=\"right\">[latex]71417[\/latex]<\/td>\n<td align=\"right\">[latex]95226[\/latex]<\/td>\n<td align=\"right\">[latex]120701[\/latex]<\/td>\n<td align=\"right\">[latex]130780[\/latex]<\/td>\n<td align=\"right\">[latex]149514[\/latex]<\/td>\n<td align=\"right\">[latex]169409[\/latex]<\/td>\n<td align=\"right\">[latex]180221[\/latex]<\/td>\n<\/tr>\n<tr>\n<td align=\"right\">[latex]71867[\/latex]<\/td>\n<td align=\"right\">[latex]97042[\/latex]<\/td>\n<td align=\"right\">[latex]123313[\/latex]<\/td>\n<td align=\"right\">[latex]136204[\/latex]<\/td>\n<td align=\"right\">[latex]152008[\/latex]<\/td>\n<td align=\"right\">[latex]170031[\/latex]<\/td>\n<td align=\"right\">[latex]185837[\/latex]<\/td>\n<\/tr>\n<tr>\n<td align=\"right\">[latex]84675[\/latex]<\/td>\n<td align=\"right\">[latex]100531[\/latex]<\/td>\n<td align=\"right\">[latex]125614[\/latex]<\/td>\n<td align=\"right\">[latex]138920[\/latex]<\/td>\n<td align=\"right\">[latex]155032[\/latex]<\/td>\n<td 
align=\"right\">[latex]175118[\/latex]<\/td>\n<td align=\"right\">[latex]189320[\/latex]<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Salaried Employees: Company B<\/p>\n<div style=\"text-align: center;\">\n<table cellpadding=\"0\" style=\"border-spacing: 0px; margin: auto;\">\n<tbody>\n<tr>\n<td align=\"right\">[latex]35472[\/latex]<\/td>\n<td align=\"right\">[latex]43467[\/latex]<\/td>\n<td align=\"right\">[latex]53624[\/latex]<\/td>\n<td align=\"right\">[latex]65096[\/latex]<\/td>\n<td align=\"right\">[latex]72290[\/latex]<\/td>\n<td align=\"right\">[latex]110351[\/latex]<\/td>\n<td align=\"right\">[latex]124732[\/latex]<\/td>\n<\/tr>\n<tr>\n<td align=\"right\">[latex]36983[\/latex]<\/td>\n<td align=\"right\">[latex]46652[\/latex]<\/td>\n<td align=\"right\">[latex]57946[\/latex]<\/td>\n<td align=\"right\">[latex]66235[\/latex]<\/td>\n<td align=\"right\">[latex]75279[\/latex]<\/td>\n<td align=\"right\">[latex]117574[\/latex]<\/td>\n<td align=\"right\">[latex]228920[\/latex]<\/td>\n<\/tr>\n<tr>\n<td align=\"right\">[latex]38382[\/latex]<\/td>\n<td align=\"right\">[latex]49655[\/latex]<\/td>\n<td align=\"right\">[latex]59096[\/latex]<\/td>\n<td align=\"right\">[latex]69721[\/latex]<\/td>\n<td align=\"right\">[latex]107368[\/latex]<\/td>\n<td align=\"right\">[latex]118810[\/latex]<\/td>\n<td align=\"right\">[latex]245427[\/latex]<\/td>\n<\/tr>\n<tr>\n<td align=\"right\">[latex]41674[\/latex]<\/td>\n<td align=\"right\">[latex]53231[\/latex]<\/td>\n<td align=\"right\">[latex]59709[\/latex]<\/td>\n<td align=\"right\">[latex]71289[\/latex]<\/td>\n<td align=\"right\">[latex]108236[\/latex]<\/td>\n<td align=\"right\">[latex]119112[\/latex]<\/td>\n<td align=\"right\">[latex]275024[\/latex]<\/td>\n<\/tr>\n<tr>\n<td align=\"right\">[latex]43256[\/latex]<\/td>\n<td align=\"right\">[latex]53506[\/latex]<\/td>\n<td align=\"right\">[latex]61724[\/latex]<\/td>\n<td align=\"right\">[latex]72211[\/latex]<\/td>\n<td 
align=\"right\">[latex]109472[\/latex]<\/td>\n<td align=\"right\">[latex]124678[\/latex]<\/td>\n<td align=\"right\">[latex]293012[\/latex]<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<h3>Hands-on Spreadsheet: Explore the Data<\/h3>\n<p>The examples below use <a href=\"https:\/\/products.office.com\/en-us\/excel-c\">Microsoft Excel<\/a>, but you can also use a free alternative such as\u00a0<a href=\"https:\/\/www.openoffice.org\/product\/calc.html\">Apache OpenOffice Calc\u00a0<\/a>or\u00a0<a href=\"https:\/\/www.google.com\/sheets\/about\/\">Google Sheets<\/a>.<\/p>\n<h3>Step [latex]1[\/latex]: Store the data<\/h3>\n<ol>\n<li>Type or copy the data into a new spreadsheet. Title the tab Employee Salaries. Place each company&#8217;s salaries in its own column, side by side: Company A in column A and Company B in column B.<\/li>\n<li>Obtain descriptive statistics for each company&#8217;s data.<\/li>\n<li>Analyze the descriptive statistics and compare the companies&#8217; data.\n<div class=\"qa-wrapper\" style=\"display: block\"><button class=\"show-answer show-answer-button collapsed\" data-target=\"q839486\">Solution<\/button><\/p>\n<div id=\"q839486\" class=\"hidden-answer\" style=\"display: none\">\n<div style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-4409 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/4685\/2020\/04\/13121927\/EmployeeSalary_01.jpg\" alt=\"Data and summary statistics in Excel\" width=\"370\" height=\"719\" \/><\/div>\n<p>\nWe notice a substantial difference in both the center and the spread of the data between Company A and Company B. Company A&#8217;s salaries appear to be symmetrically distributed, with a higher mean and median than Company B&#8217;s. Company B has a smaller minimum value and a larger maximum value, but its distribution is highly skewed (with a skewness score of [latex]1.85[\/latex]). 
The standard deviation of Company B&#8217;s data is much larger than Company A&#8217;s. Since B&#8217;s mean is pulled significantly to the right of its median, we expect a small number of much higher salaries at the top.<\/div>\n<\/div>\n<\/li>\n<\/ol>\n<h3>Step [latex]2[\/latex]: Create a box and whisker plot with both data series on the same graph<\/h3>\n<ol>\n<li>Select both columns of data together with their labels.<\/li>\n<li>Click Insert, then Box and Whisker. You should see both sets of data appear as parallel box plots on the graph in different colors. Click the chart to select it.<\/li>\n<li>Click the plus sign next to the chart, then select Legend. The data column labels should appear at the top under the Chart Title. You can now delete the Chart Title.<\/li>\n<\/ol>\n<h3>Step [latex]3[\/latex]: Create a scatter plot with both data series on the same graph<\/h3>\n<ol>\n<li>Follow the same steps as in Step [latex]2[\/latex] above, but this time choose Scatter Plot instead of Box and Whisker.\n<div style=\"text-align: center;\">\n<figure id=\"attachment_4410\" aria-describedby=\"caption-attachment-4410\" style=\"width: 885px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-4410 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/4685\/2020\/04\/13122008\/EmployeeSalary_02.jpg\" alt=\"Data, summary statistics, a boxplot, and a dotplot in Excel\" width=\"885\" height=\"715\" \/><figcaption id=\"caption-attachment-4410\" class=\"wp-caption-text\">Figure 1. Create a scatter plot<\/figcaption><\/figure>\n<\/div>\n<\/li>\n<\/ol>\n<h3>Step [latex]4[\/latex]: Analyze the data<\/h3>\n<p>As we saw in the descriptive statistics, Company A has a tighter distribution of salaries about its center. Company B has extremes at both ends of the salary range. 
Most salaries in B are substantially lower than those in A, with the notable exception of four outliers at the top end. These outliers pull the mean of B far to the right of its median.<\/p>\n<section class=\"textbox tryIt\"><iframe loading=\"lazy\" id=\"ohm10945\" class=\"resizable\" src=\"https:\/\/ohm.one.lumenlearning.com\/multiembedq.php?id=10945&theme=lumen&iframe_resize_id=ohm10945&source=tnh\" width=\"100%\" height=\"150\"><\/iframe><\/section>\n","protected":false},"author":15,"menu_order":19,"template":"","meta":{"_candela_citation":"[]","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"part":1572,"module-header":"learn_it","content_attributions":[],"internal_book_links":[],"video_content":null,"cc_video_embed_content":{"cc_scripts":"","media_targets":[]},"try_it_collection":null,"_links":{"self":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/2060"}],"collection":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/users\/15"}],"version-history":[{"count":19,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/2060\/revisions"}],"predecessor-version":[{"id":15713,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/2060\/revisions\/15713"}],"part":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/parts\/1572"}],"metadata":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/2060\/metadat
a\/"}],"wp:attachment":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/media?parent=2060"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapter-type?post=2060"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/contributor?post=2060"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/license?post=2060"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}