{"id":8284,"date":"2023-09-29T14:30:14","date_gmt":"2023-09-29T14:30:14","guid":{"rendered":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/?post_type=chapter&#038;p=8284"},"modified":"2024-10-18T20:57:52","modified_gmt":"2024-10-18T20:57:52","slug":"modeling-and-analysis-fresh-take","status":"web-only","type":"chapter","link":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/chapter\/modeling-and-analysis-fresh-take\/","title":{"raw":"Modeling and Analysis: Fresh Take","rendered":"Modeling and Analysis: Fresh Take"},"content":{"raw":"<section class=\"textbox learningGoals\">\r\n<ul>\r\n\t<li>Differentiate correlation from causation<\/li>\r\n\t<li>Decide on the suitability of interpolation and extrapolation<\/li>\r\n\t<li>Identify the appropriate way to represent data and mathematical models<\/li>\r\n\t<li>Use multiple representations to choose a model<\/li>\r\n\t<li>Recognize the limits of models<\/li>\r\n<\/ul>\r\n<\/section>\r\n<h2>Distinguishing Between Correlation and Causation<\/h2>\r\n<div class=\"textbox shaded\">\r\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\r\n<p>Understanding the difference between correlation and causation is crucial in data interpretation. Correlation indicates a relationship where changes in one variable are associated with changes in another, but it doesn't imply causation. 
Causation implies a direct cause-and-effect relationship between variables.<\/p>\r\n<p><strong>Key Concepts:<\/strong><\/p>\r\n<ul>\r\n\t<li><strong>Correlation<\/strong>: A statistical relationship where changes in one variable are linked to changes in another.<\/li>\r\n\t<li><strong>Causation<\/strong>: A deeper connection where changes in one variable directly cause changes in another.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<section class=\"textbox watchIt\"><iframe title=\"YouTube video player\" src=\"https:\/\/www.youtube.com\/embed\/U-_f8RQIIiw?si=7IXykNxQWsEAVTKv\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe>\r\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/CRITICAL+THINKING+-+Fundamentals_+Correlation+and+Causation.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cCRITICAL THINKING - Fundamentals: Correlation and Causation\u201d here (opens in new window).<\/a><\/p>\r\n<\/section>\r\n<h2>Interpolation and Extrapolation in Data Analysis<\/h2>\r\n<div class=\"textbox shaded\">\r\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\r\n<p>Interpolation and extrapolation are methods used to make predictions based on data. Interpolation involves predicting values within the domain and range of the data. 
Extrapolation extends predictions beyond the available data, often with higher uncertainty.<\/p>\r\n<p><strong>Key Concepts:<\/strong><\/p>\r\n<ul>\r\n\t<li><strong>Interpolation<\/strong>: Estimating values within the known range of data points.<\/li>\r\n\t<li><strong>Extrapolation<\/strong>: Extending predictions beyond the existing data set, which can lead to model breakdown.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<section class=\"textbox watchIt\"><iframe src=\"\/\/plugin.3playmedia.com\/show?mf=11328612&amp;p3sdk_version=1.10.1&amp;p=20361&amp;pt=375&amp;video_id=bEANDlJkqcU&amp;video_target=tpm-plugin-ul6qln27-bEANDlJkqcU\" width=\"800px\" height=\"450px\" frameborder=\"0\" marginwidth=\"0px\" marginheight=\"0px\"><\/iframe>\r\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/Making+Predictions+on+a+Scatter+Plot+Using+Interpolation+and+Extrapolation.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cMaking Predictions on a Scatter Plot Using Interpolation and Extrapolation\u201d here (opens in new window).<\/a><\/p>\r\n<\/section>\r\n<h2>Effective Data Representation<\/h2>\r\n<div class=\"textbox shaded\">\r\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\r\n<p>The form in which data is presented can greatly influence its impact and interpretability. Different representations like graphs, tables, and equations each have their own advantages and limitations. 
Consider the purpose, audience, and complexity of the data to choose the most appropriate representation.<\/p>\r\n<p><strong>Key Considerations for Data Representation:<\/strong><\/p>\r\n<ul>\r\n\t<li><strong>Graphs<\/strong>: Ideal for visualizing trends, relationships, and making quick comparisons.<\/li>\r\n\t<li><strong>Tables<\/strong>: Best for organizing raw data, facilitating quick look-up of specific values, and providing a detailed view.<\/li>\r\n\t<li><strong>Equations<\/strong>: Offer a mathematical framework to succinctly express complex relationships between variables.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<h2>Using Multiple Representations for Model Selection<\/h2>\r\n<div class=\"textbox shaded\">\r\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\r\n<p>Utilizing multiple forms of data representation can offer a fuller, more nuanced picture of what the data is saying. Each form has its strengths and limitations, and combining them can provide a more comprehensive understanding. Consider multiple metrics and the context of the data to select the most appropriate model. Analyze the interpretability and relevance of each model to the specific questions being addressed.<\/p>\r\n<p><strong>Strategies for Model Comparison:<\/strong><\/p>\r\n<ul>\r\n\t<li><strong>Overlay Graphs<\/strong>: Compare models by overlaying their graphs on the same axes.<\/li>\r\n\t<li><strong>Tabulate Key Metrics<\/strong>: Create a table listing key metrics for each model for a side-by-side comparison.<\/li>\r\n\t<li><strong>Equation Analysis<\/strong>: Compare the terms and coefficients in the equations to understand the differences in the relationships they propose.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<h2>Selecting the Best Model<\/h2>\r\n<div class=\"textbox shaded\">\r\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\r\n<p>Choosing the right model for data analysis is crucial for accurate predictions and informed decisions. 
Consider criteria like Goodness of Fit, Simplicity, Predictive Accuracy, and Interpretability.<\/p>\r\n<p><strong>Key Criteria for Model Selection:<\/strong><\/p>\r\n<ul>\r\n\t<li><strong>Goodness of Fit<\/strong>: Measures how well the model replicates observed data. Use a measure such as the coefficient of determination [latex]R^2[\/latex], or a statistical test such as chi-square, for evaluation.<\/li>\r\n\t<li><strong>Simplicity (Principle of Parsimony)<\/strong>: Prefer simpler models when they explain data as well as more complex ones.<\/li>\r\n\t<li><strong>Predictive Accuracy<\/strong>: Assess how well the model performs on new, unseen data, often using cross-validation.<\/li>\r\n\t<li><strong>Interpretability<\/strong>: The ease of understanding the model's workings, crucial in fields like healthcare and finance.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<h2>Navigating Common Pitfalls in Model Selection<\/h2>\r\n<div class=\"textbox shaded\">\r\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\r\n<p>The process of selecting the most appropriate model for data analysis is fraught with potential pitfalls. Awareness of these issues is key to achieving accurate and meaningful results.<\/p>\r\n<p><strong>Common Pitfalls:<\/strong><\/p>\r\n<ul>\r\n\t<li><strong>Overfitting<\/strong>: This occurs when a model is too tailored to the training data, capturing noise and outliers, leading to poor performance on new data. Regularization techniques like Lasso and Ridge regression can help mitigate this risk.<\/li>\r\n\t<li><strong>Ignoring Data Quality<\/strong>: Quality data is crucial for meaningful analysis. Overlooking data quality can lead to skewed results. Prioritize exploratory data analysis to handle missing values, manage outliers, and understand variable distributions.<\/li>\r\n\t<li><strong>Not Considering Business Context<\/strong>: A model that is statistically sound may not be practical in a real-world setting. 
Involve domain experts in the model selection process to ensure practical applicability.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<section class=\"textbox watchIt\"><iframe src=\"\/\/plugin.3playmedia.com\/show?mf=11328613&amp;p3sdk_version=1.10.1&amp;p=20361&amp;pt=375&amp;video_id=wCEgUfWVLrI&amp;video_target=tpm-plugin-0xigkr42-wCEgUfWVLrI\" width=\"800px\" height=\"450px\" frameborder=\"0\" marginwidth=\"0px\" marginheight=\"0px\"><\/iframe>\r\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/What+is+overfitting%3F.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cWhat is overfitting?\u201d here (opens in new window).<\/a><\/p>\r\n<\/section>\r\n<h2>Recognizing the Limits of Modeling<\/h2>\r\n<div class=\"textbox shaded\">\r\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\r\n<p>Every model, no matter how sophisticated, has limitations due to the assumptions and simplifications made during its creation. Understanding these limitations is crucial for accurate interpretation and application of models.<\/p>\r\n<p><strong>Key Limitations to Consider:<\/strong><\/p>\r\n<ul>\r\n\t<li><strong>Assumptions<\/strong>: Models like linear regression assume a linear relationship between variables, which may not always be accurate.<\/li>\r\n\t<li><strong>Data Quality<\/strong>: The reliability of a model is heavily dependent on the quality of the data used. 
Poorly collected data or biased samples can skew results.<\/li>\r\n\t<li><strong>Contextual Application<\/strong>: Models may perform differently in real-world settings compared to controlled environments.<\/li>\r\n\t<li><strong>Ethical Considerations<\/strong>: It's important to consider potential biases and ethical implications of models.<\/li>\r\n<\/ul>\r\n<\/div>","rendered":"<section class=\"textbox learningGoals\">\n<ul>\n<li>Differentiate correlation from causation<\/li>\n<li>Decide on the suitability of interpolation and extrapolation<\/li>\n<li>Identify the appropriate way to represent data and mathematical models<\/li>\n<li>Use multiple representations to choose a model<\/li>\n<li>Recognize the limits of models<\/li>\n<\/ul>\n<\/section>\n<h2>Distinguishing Between Correlation and Causation<\/h2>\n<div class=\"textbox shaded\">\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\n<p>Understanding the difference between correlation and causation is crucial in data interpretation. Correlation indicates a relationship where changes in one variable are associated with changes in another, but it doesn&#8217;t imply causation. 
Causation implies a direct cause-and-effect relationship between variables.<\/p>\n<p><strong>Key Concepts:<\/strong><\/p>\n<ul>\n<li><strong>Correlation<\/strong>: A statistical relationship where changes in one variable are linked to changes in another.<\/li>\n<li><strong>Causation<\/strong>: A deeper connection where changes in one variable directly cause changes in another.<\/li>\n<\/ul>\n<\/div>\n<section class=\"textbox watchIt\"><iframe loading=\"lazy\" title=\"YouTube video player\" src=\"https:\/\/www.youtube.com\/embed\/U-_f8RQIIiw?si=7IXykNxQWsEAVTKv\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/CRITICAL+THINKING+-+Fundamentals_+Correlation+and+Causation.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cCRITICAL THINKING &#8211; Fundamentals: Correlation and Causation\u201d here (opens in new window).<\/a><\/p>\n<\/section>\n<h2>Interpolation and Extrapolation in Data Analysis<\/h2>\n<div class=\"textbox shaded\">\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\n<p>Interpolation and extrapolation are methods used to make predictions based on data. Interpolation involves predicting values within the domain and range of the data. 
Extrapolation extends predictions beyond the available data, often with higher uncertainty.<\/p>\n<p><strong>Key Concepts:<\/strong><\/p>\n<ul>\n<li><strong>Interpolation<\/strong>: Estimating values within the known range of data points.<\/li>\n<li><strong>Extrapolation<\/strong>: Extending predictions beyond the existing data set, which can lead to model breakdown.<\/li>\n<\/ul>\n<\/div>\n<section class=\"textbox watchIt\"><iframe loading=\"lazy\" src=\"\/\/plugin.3playmedia.com\/show?mf=11328612&amp;p3sdk_version=1.10.1&amp;p=20361&amp;pt=375&amp;video_id=bEANDlJkqcU&amp;video_target=tpm-plugin-ul6qln27-bEANDlJkqcU\" width=\"800px\" height=\"450px\" frameborder=\"0\" marginwidth=\"0px\" marginheight=\"0px\"><\/iframe><\/p>\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/Making+Predictions+on+a+Scatter+Plot+Using+Interpolation+and+Extrapolation.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cMaking Predictions on a Scatter Plot Using Interpolation and Extrapolation\u201d here (opens in new window).<\/a><\/p>\n<\/section>\n<h2>Effective Data Representation<\/h2>\n<div class=\"textbox shaded\">\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\n<p>The form in which data is presented can greatly influence its impact and interpretability. Different representations like graphs, tables, and equations each have their own advantages and limitations. 
Consider the purpose, audience, and complexity of the data to choose the most appropriate representation.<\/p>\n<p><strong>Key Considerations for Data Representation:<\/strong><\/p>\n<ul>\n<li><strong>Graphs<\/strong>: Ideal for visualizing trends, relationships, and making quick comparisons.<\/li>\n<li><strong>Tables<\/strong>: Best for organizing raw data, facilitating quick look-up of specific values, and providing a detailed view.<\/li>\n<li><strong>Equations<\/strong>: Offer a mathematical framework to succinctly express complex relationships between variables.<\/li>\n<\/ul>\n<\/div>\n<h2>Using Multiple Representations for Model Selection<\/h2>\n<div class=\"textbox shaded\">\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\n<p>Utilizing multiple forms of data representation can offer a fuller, more nuanced picture of what the data is saying. Each form has its strengths and limitations, and combining them can provide a more comprehensive understanding. Consider multiple metrics and the context of the data to select the most appropriate model. Analyze the interpretability and relevance of each model to the specific questions being addressed.<\/p>\n<p><strong>Strategies for Model Comparison:<\/strong><\/p>\n<ul>\n<li><strong>Overlay Graphs<\/strong>: Compare models by overlaying their graphs on the same axes.<\/li>\n<li><strong>Tabulate Key Metrics<\/strong>: Create a table listing key metrics for each model for a side-by-side comparison.<\/li>\n<li><strong>Equation Analysis<\/strong>: Compare the terms and coefficients in the equations to understand the differences in the relationships they propose.<\/li>\n<\/ul>\n<\/div>\n<h2>Selecting the Best Model<\/h2>\n<div class=\"textbox shaded\">\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\n<p>Choosing the right model for data analysis is crucial for accurate predictions and informed decisions. 
Consider criteria like Goodness of Fit, Simplicity, Predictive Accuracy, and Interpretability.<\/p>\n<p><strong>Key Criteria for Model Selection:<\/strong><\/p>\n<ul>\n<li><strong>Goodness of Fit<\/strong>: Measures how well the model replicates observed data. Use a measure such as the coefficient of determination [latex]R^2[\/latex], or a statistical test such as chi-square, for evaluation.<\/li>\n<li><strong>Simplicity (Principle of Parsimony)<\/strong>: Prefer simpler models when they explain data as well as more complex ones.<\/li>\n<li><strong>Predictive Accuracy<\/strong>: Assess how well the model performs on new, unseen data, often using cross-validation.<\/li>\n<li><strong>Interpretability<\/strong>: The ease of understanding the model&#8217;s workings, crucial in fields like healthcare and finance.<\/li>\n<\/ul>\n<\/div>\n<h2>Navigating Common Pitfalls in Model Selection<\/h2>\n<div class=\"textbox shaded\">\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\n<p>The process of selecting the most appropriate model for data analysis is fraught with potential pitfalls. Awareness of these issues is key to achieving accurate and meaningful results.<\/p>\n<p><strong>Common Pitfalls:<\/strong><\/p>\n<ul>\n<li><strong>Overfitting<\/strong>: This occurs when a model is too tailored to the training data, capturing noise and outliers, leading to poor performance on new data. Regularization techniques like Lasso and Ridge regression can help mitigate this risk.<\/li>\n<li><strong>Ignoring Data Quality<\/strong>: Quality data is crucial for meaningful analysis. Overlooking data quality can lead to skewed results. Prioritize exploratory data analysis to handle missing values, manage outliers, and understand variable distributions.<\/li>\n<li><strong>Not Considering Business Context<\/strong>: A model that is statistically sound may not be practical in a real-world setting. 
Involve domain experts in the model selection process to ensure practical applicability.<\/li>\n<\/ul>\n<\/div>\n<section class=\"textbox watchIt\"><iframe loading=\"lazy\" src=\"\/\/plugin.3playmedia.com\/show?mf=11328613&amp;p3sdk_version=1.10.1&amp;p=20361&amp;pt=375&amp;video_id=wCEgUfWVLrI&amp;video_target=tpm-plugin-0xigkr42-wCEgUfWVLrI\" width=\"800px\" height=\"450px\" frameborder=\"0\" marginwidth=\"0px\" marginheight=\"0px\"><\/iframe><\/p>\n<p>You can view the\u00a0<a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Quantitative+Reasoning+-+2023+Build\/Transcriptions\/What+is+overfitting%3F.txt\" target=\"_blank\" rel=\"noopener\">transcript for \u201cWhat is overfitting?\u201d here (opens in new window).<\/a><\/p>\n<\/section>\n<h2>Recognizing the Limits of Modeling<\/h2>\n<div class=\"textbox shaded\">\n<p><strong>The Main Idea\u00a0<\/strong><\/p>\n<p>Every model, no matter how sophisticated, has limitations due to the assumptions and simplifications made during its creation. Understanding these limitations is crucial for accurate interpretation and application of models.<\/p>\n<p><strong>Key Limitations to Consider:<\/strong><\/p>\n<ul>\n<li><strong>Assumptions<\/strong>: Models like linear regression assume a linear relationship between variables, which may not always be accurate.<\/li>\n<li><strong>Data Quality<\/strong>: The reliability of a model is heavily dependent on the quality of the data used. 
Poorly collected data or biased samples can skew results.<\/li>\n<li><strong>Contextual Application<\/strong>: Models may perform differently in real-world settings compared to controlled environments.<\/li>\n<li><strong>Ethical Considerations<\/strong>: It&#8217;s important to consider potential biases and ethical implications of models.<\/li>\n<\/ul>\n<\/div>\n","protected":false},"author":15,"menu_order":28,"template":"","meta":{"_candela_citation":"[]","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"part":88,"module-header":"fresh_take","content_attributions":[],"internal_book_links":[],"video_content":null,"cc_video_embed_content":{"cc_scripts":"","media_targets":[]},"try_it_collection":null,"_links":{"self":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/8284"}],"collection":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/users\/15"}],"version-history":[{"count":8,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/8284\/revisions"}],"predecessor-version":[{"id":12860,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/8284\/revisions\/12860"}],"part":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/parts\/88"}],"metadata":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapters\/8284\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2
\/media?parent=8284"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/pressbooks\/v2\/chapter-type?post=8284"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/contributor?post=8284"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/quantitativereasoning\/wp-json\/wp\/v2\/license?post=8284"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}