{"id":1439,"date":"2023-06-22T02:28:34","date_gmt":"2023-06-22T02:28:34","guid":{"rendered":"https:\/\/content.one.lumenlearning.com\/introstatstest\/chapter\/module-16-cheat-sheet\/"},"modified":"2025-02-11T04:23:32","modified_gmt":"2025-02-11T04:23:32","slug":"module-16-cheat-sheet","status":"publish","type":"chapter","link":"https:\/\/content.one.lumenlearning.com\/introstatstest\/chapter\/module-16-cheat-sheet\/","title":{"raw":"Module 15: Cheat Sheet","rendered":"Module 15: Cheat Sheet"},"content":{"raw":"<h4 style=\"text-align: right;\"><a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Statistics+Exemplar\/Cheat+Sheets\/Module+15_+Cheat+Sheet.pdf\" target=\"_blank\" rel=\"noopener\">Download a pdf of this page here.<\/a><\/h4>\r\n<h2>Essential Concepts<\/h2>\r\n<ul>\r\n\t<li>Steps for Hypothesis Testing for Significance of Slope:<\/li>\r\n<\/ul>\r\n<ol>\r\n\t<li>Write out the null and alternative hypotheses.\r\n\r\n<ul>\r\n\t<li>Null Hypothesis: [latex]\\beta_1 = 0[\/latex]<\/li>\r\n\t<li>Alternative Hypothesis: [latex]\\beta_1 \\ne 0[\/latex]<\/li>\r\n<\/ul>\r\n<\/li>\r\n\t<li>Check the conditions for the hypothesis test. For testing the significance of the regression slope, we require:\r\n\r\n<ul>\r\n\t<li>A random sample of data<\/li>\r\n\t<li>A linear trend<\/li>\r\n\t<li>No obvious trends in residual plot<\/li>\r\n<\/ul>\r\n<\/li>\r\n\t<li>Calculate the test statistic: [latex]t=\\dfrac{b-0}{[\\text{std. 
error of }b]} = \\dfrac{b}{SE_b}[\/latex]<\/li>\r\n\t<li>Calculate a P-value.<\/li>\r\n\t<li>Compare the P-value to the significance level, [latex]\\alpha[\/latex], to make a decision.<br \/>\r\n<div style=\"display: grid; grid-template-columns: repeat(1, minmax(0, 1fr)); overflow: auto; white-space: normal;\" tabindex=\"0\">\r\n<table>\r\n<thead>\r\n<tr>\r\n<th><strong>Decision<\/strong><\/th>\r\n<th><strong>Conclusion<\/strong><\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>If P-value [latex]\\le\\alpha[\/latex], there is enough evidence to reject the null hypothesis.<\/td>\r\n<td>At the [latex]\\alpha\\times[\/latex]100% significance level, the data provide convincing evidence in support of the alternative hypothesis.<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>If P-value [latex]\\gt\\alpha[\/latex], there is not enough evidence to reject the null hypothesis.<\/td>\r\n<td>At the [latex]\\alpha\\times[\/latex]100% significance level, the data do not provide convincing evidence in support of the alternative hypothesis.<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<\/li>\r\n\t<li>Write a conclusion in context (e.g., we do\/do not have convincing evidence\u2026).<\/li>\r\n<\/ol>\r\n<ul>\r\n\t<li>An ANOVA is a way to \u201cpartition\u201d the variation in the data. 
In other words, it divides the total variation into two parts: the part that is explained by the regression model (SSRegression) and the part that remains unexplained (SSResiduals).<\/li>\r\n<\/ul>\r\n<p style=\"text-align: center;\">[latex]\\text{SSTotal} = \\text{SSRegression} + \\text{SSResiduals}[\/latex]<\/p>\r\n<ul>\r\n\t<li>\r\n<div style=\"display: grid; grid-template-columns: repeat(1, minmax(0, 1fr)); overflow: auto; white-space: normal;\" tabindex=\"0\">\r\n<table>\r\n<thead>\r\n<tr>\r\n<th style=\"width: 84.9306px;\">Source<\/th>\r\n<th style=\"width: 103.594px;\">[latex]df[\/latex]<\/th>\r\n<th style=\"width: 195.052px;\">Sum sq ([latex]\\text{SS}[\/latex])<\/th>\r\n<th style=\"width: 203.472px;\">Mean sq ([latex]\\text{MS}[\/latex])<\/th>\r\n<th style=\"width: 251.042px;\">F value<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>Regression<\/td>\r\n<td>[latex]p[\/latex]<\/td>\r\n<td>[latex]\\text{SSRegression}[\/latex]<\/td>\r\n<td>[latex]\\text{MSRegression} = \\dfrac{\\text{SSRegression}}{p}[\/latex]<\/td>\r\n<td>[latex]F = \\dfrac{\\text{MSRegression}}{\\text{MSResiduals}}[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Residuals<\/td>\r\n<td>[latex]n-1-p[\/latex]<\/td>\r\n<td>[latex]\\text{SSResiduals}[\/latex]<\/td>\r\n<td>[latex]\\text{MSResiduals} = \\dfrac{\\text{SSResiduals}}{n-1-p}[\/latex]<\/td>\r\n<td>&nbsp;<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>Total<\/strong><\/td>\r\n<td>[latex]n-1[\/latex]<\/td>\r\n<td>[latex]\\text{SSTotal}[\/latex]<\/td>\r\n<td>&nbsp;<\/td>\r\n<td>&nbsp;<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<\/li>\r\n\t<li>When the objective is to estimate the mean value of the response variable for a particular value of the explanatory variable, [latex]x_0[\/latex], we will calculate a [latex]C[\/latex]<strong>% confidence interval for the mean response<\/strong>, where [latex]C[\/latex]\u00a0is the confidence level associated with the interval<strong>. 
<\/strong>This interval gives us a range of plausible values of the mean response for the subset of the population with a value of the explanatory variable equal to [latex]x_0[\/latex].<\/li>\r\n\t<li>When the objective is to predict the value of the response variable for an individual observation with the explanatory variable equal to [latex]x_0[\/latex], we will calculate a [latex]C[\/latex]<strong>% prediction interval for an individual response<\/strong>, where [latex]C[\/latex]\u00a0is the confidence level associated with the interval. This interval gives us a range of plausible values of the response for an individual observation that has a value of the explanatory variable equal to [latex]x_0[\/latex].<\/li>\r\n\t<li>Data transformation is the process of applying mathematical functions to raw data to make it more useful for analysis. The goal is to adjust for different scales, distributions, or nonlinear relationships. Common transformations include adding a constant, squaring, cubing, taking square roots, or applying logarithms to each data value. The choice of transformation depends on the nature of the data and the desired analysis.<\/li>\r\n<\/ul>\r\n<h2>Key Equations<\/h2>\r\n<p><strong>ANOVA for Regression<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\">[latex]\\text{SSTotal} = \\text{SSRegression} + \\text{SSResiduals}[\/latex]<\/p>\r\n<p><strong>[latex]R^2[\/latex]<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\">[latex]R^2 = \\dfrac{\\text{variation explained}}{\\text{total variation}} = \\dfrac{\\text{SSRegression}}{\\text{SSTotal}} = 1-\\dfrac{\\text{SSResiduals}}{\\text{SSTotal}}[\/latex]<\/p>\r\n<p><strong>Test Statistic for the Hypothesis Test for Significance of Slope<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\">[latex]t=\\dfrac{b-0}{[\\text{std. 
error of }b]} = \\dfrac{b}{SE_b}[\/latex]<\/p>\r\n<h2>Glossary<\/h2>\r\n<p data-start=\"2065\" data-end=\"2219\"><strong data-start=\"2065\" data-end=\"2106\">confidence interval for mean response<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" data-start=\"2065\" data-end=\"2219\">A range of plausible values for the mean of the response variable at a given value of the explanatory variable.<\/p>\r\n<p><strong>data transformation<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\">The application of a deterministic mathematical function to each point in a data set.<\/p>\r\n<p data-start=\"1895\" data-end=\"2061\"><strong data-start=\"1895\" data-end=\"1910\">F-statistic<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" data-start=\"1895\" data-end=\"2061\">A ratio used in ANOVA for regression to compare the explained variance to the unexplained variance, testing the overall significance of the model.<\/p>\r\n<p data-start=\"2467\" data-end=\"2590\"><strong data-start=\"2467\" data-end=\"2489\">log transformation<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" data-start=\"2467\" data-end=\"2590\">Applying the natural logarithm to data values to stabilize variance and linearize relationships.<\/p>\r\n<p data-start=\"1479\" data-end=\"1684\"><strong data-start=\"1479\" data-end=\"1538\">mean square for regression (<span class=\"katex\"><span class=\"katex-mathml\">\r\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><mi>M<\/mi><msub><mi>S<\/mi><mtext>Regression<\/mtext><\/msub><\/mrow><annotation encoding=\"application\/x-tex\">MS_{\\text{Regression}}<\/annotation><\/semantics><\/math>\r\n<\/span><\/span>)<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" data-start=\"1479\" data-end=\"1684\">The average variation explained by the regression model, calculated as the sum of squares for regression divided by the number of predictors.<\/p>\r\n<p data-start=\"1688\" data-end=\"1891\"><strong data-start=\"1688\" data-end=\"1745\">mean square for 
residuals (<span class=\"katex\"><span class=\"katex-mathml\">\r\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><mi>M<\/mi><msub><mi>S<\/mi><mtext>Residuals<\/mtext><\/msub><\/mrow><annotation encoding=\"application\/x-tex\">MS_{\\text{Residuals}}<\/annotation><\/semantics><\/math>\r\n<\/span><span class=\"katex-html\" aria-hidden=\"true\"><span class=\"base\"><span class=\"mord\"><span class=\"msupsub\"><span class=\"vlist-t vlist-t2\"><span class=\"vlist-r\"><span class=\"vlist-s\">\u200b<\/span><\/span><\/span><\/span><\/span><\/span><\/span><\/span>)<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" data-start=\"1688\" data-end=\"1891\">The average unexplained variation in the response variable, calculated as the sum of squares for residuals divided by the degrees of freedom.<\/p>\r\n<p data-start=\"830\" data-end=\"961\"><strong data-start=\"830\" data-end=\"841\">P-value<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" data-start=\"830\" data-end=\"961\">The probability of obtaining a test statistic as extreme as the one observed, assuming the null hypothesis is true.<\/p>\r\n<p data-start=\"2223\" data-end=\"2463\"><strong data-start=\"2223\" data-end=\"2246\">prediction interval<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" data-start=\"2223\" data-end=\"2463\">A range of plausible values for an individual response variable at a given value of the explanatory variable. 
The interval is wider than the confidence interval because it accounts for individual variability.<\/p>\r\n<p data-start=\"2727\" data-end=\"2834\"><strong data-start=\"2727\" data-end=\"2756\">reciprocal transformation<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" data-start=\"2727\" data-end=\"2834\">Using the inverse of data values to reduce the impact of large variances.<\/p>\r\n<p data-start=\"375\" data-end=\"538\"><strong data-start=\"375\" data-end=\"409\">regression slope (<span class=\"katex\"><span class=\"katex-mathml\">\r\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><msub><mi>\u03b2<\/mi><mn>1<\/mn><\/msub><\/mrow><annotation encoding=\"application\/x-tex\">\\beta_1<\/annotation><\/semantics><\/math>\r\n<\/span><span class=\"katex-html\" aria-hidden=\"true\"><span class=\"base\"><span class=\"mord\"><span class=\"mord mathnormal\">\u03b2<\/span><span class=\"msupsub\"><span class=\"vlist-t vlist-t2\"><span class=\"vlist-r\"><span class=\"vlist\"><span class=\"sizing reset-size6 size3 mtight\"><span class=\"mord mtight\">1<\/span><\/span><\/span><span class=\"vlist-s\">\u200b<\/span><\/span><\/span><\/span><\/span><\/span><\/span><\/span>)<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" data-start=\"375\" data-end=\"538\">The coefficient representing the rate of change of the response variable for each unit increase in the explanatory variable.<\/p>\r\n<p data-start=\"2594\" data-end=\"2723\"><strong data-start=\"2594\" data-end=\"2624\">square root transformation<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" data-start=\"2594\" data-end=\"2723\">Taking the square root of data values to reduce right-skewness and normalize the distribution.<\/p>\r\n<p data-start=\"542\" data-end=\"666\"><strong data-start=\"542\" data-end=\"584\">standard error of the slope (<span class=\"katex\"><span class=\"katex-mathml\">\r\n<math 
xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><mi>S<\/mi><msub><mi>E<\/mi><mi>b<\/mi><\/msub><\/mrow><annotation encoding=\"application\/x-tex\">SE_b<\/annotation><\/semantics><\/math>\r\n<\/span><span class=\"katex-html\" aria-hidden=\"true\"><span class=\"base\"><span class=\"mord mathnormal\">S<\/span><span class=\"mord\"><span class=\"mord mathnormal\">E<\/span><span class=\"msupsub\"><span class=\"vlist-t vlist-t2\"><span class=\"vlist-r\"><span class=\"vlist\"><span class=\"sizing reset-size6 size3 mtight\"><span class=\"mord mathnormal mtight\">b<\/span><\/span><\/span><span class=\"vlist-s\">\u200b<\/span><\/span><\/span><\/span><\/span><\/span><\/span><\/span>)<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" data-start=\"542\" data-end=\"666\">A measure of the variability in the estimated slope across different samples.<\/p>\r\n<p data-start=\"965\" data-end=\"1134\"><strong data-start=\"965\" data-end=\"1027\">sum of squares for regression (<span class=\"katex\"><span class=\"katex-mathml\">\r\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><mi>S<\/mi><msub><mi>S<\/mi><mtext>Regression<\/mtext><\/msub><\/mrow><annotation encoding=\"application\/x-tex\">SS_{\\text{Regression}}<\/annotation><\/semantics><\/math>\r\n<\/span><\/span>)<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" data-start=\"965\" data-end=\"1134\">The portion of the total variation in the response variable that is explained by the regression model.<\/p>\r\n<p data-start=\"1138\" data-end=\"1312\"><strong data-start=\"1138\" data-end=\"1198\">sum of squares for residuals (<span class=\"katex\"><span class=\"katex-mathml\">\r\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><mi>S<\/mi><msub><mi>S<\/mi><mtext>Residuals<\/mtext><\/msub><\/mrow><annotation encoding=\"application\/x-tex\">SS_{\\text{Residuals}}<\/annotation><\/semantics><\/math>\r\n<\/span><\/span>)<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" 
data-start=\"1138\" data-end=\"1312\">The portion of the total variation in the response variable that remains unexplained by the regression model.<\/p>\r\n<p data-start=\"670\" data-end=\"826\"><strong data-start=\"670\" data-end=\"702\">test statistic (<span class=\"katex\"><span class=\"katex-mathml\">\r\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><mi>t<\/mi><\/mrow><annotation encoding=\"application\/x-tex\">t<\/annotation><\/semantics><\/math>\r\n<\/span><span class=\"katex-html\" aria-hidden=\"true\"><span class=\"base\"><span class=\"mord mathnormal\">t<\/span><\/span><\/span><\/span>-value)<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" data-start=\"670\" data-end=\"826\">A measure of how many standard errors the estimated slope is away from zero, used in hypothesis testing for regression.<\/p>\r\n<p data-start=\"1316\" data-end=\"1475\"><strong data-start=\"1316\" data-end=\"1364\">total sum of squares (<span class=\"katex\"><span class=\"katex-mathml\">\r\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><mi>S<\/mi><msub><mi>S<\/mi><mtext>Total<\/mtext><\/msub><\/mrow><annotation encoding=\"application\/x-tex\">SS_{\\text{Total}}<\/annotation><\/semantics><\/math>\r\n<\/span><\/span>)<\/strong><\/p>\r\n<p style=\"padding-left: 40px;\" data-start=\"1316\" data-end=\"1475\">The total variation in the response variable, equal to the sum of the explained and unexplained variation.<\/p>\r\n<p>&nbsp;<\/p>","rendered":"<h4 style=\"text-align: right;\"><a href=\"https:\/\/course-building.s3.us-west-2.amazonaws.com\/Statistics+Exemplar\/Cheat+Sheets\/Module+15_+Cheat+Sheet.pdf\" target=\"_blank\" rel=\"noopener\">Download a pdf of this page here.<\/a><\/h4>\n<h2>Essential Concepts<\/h2>\n<ul>\n<li>Steps for Hypothesis Testing for Significance of Slope:<\/li>\n<\/ul>\n<ol>\n<li>Write out the null and alternative hypotheses.\n<ul>\n<li>Null Hypothesis: [latex]\\beta_1 = 0[\/latex]<\/li>\n<li>Alternative 
Hypothesis: [latex]\\beta_1 \\ne 0[\/latex]<\/li>\n<\/ul>\n<\/li>\n<li>Check the conditions for the hypothesis test. For testing the significance of the regression slope, we require:\n<ul>\n<li>A random sample of data<\/li>\n<li>A linear trend<\/li>\n<li>No obvious trends in residual plot<\/li>\n<\/ul>\n<\/li>\n<li>Calculate the test statistic: [latex]t=\\dfrac{b-0}{[\\text{std. error of }b]} = \\dfrac{b}{SE_b}[\/latex]<\/li>\n<li>Calculate a P-value.<\/li>\n<li>Compare the P-value to the significance level, [latex]\\alpha[\/latex], to make a decision.\n<div style=\"display: grid; grid-template-columns: repeat(1, minmax(0, 1fr)); overflow: auto; white-space: normal;\" tabindex=\"0\">\n<table>\n<thead>\n<tr>\n<th><strong>Decision<\/strong><\/th>\n<th><strong>Conclusion<\/strong><\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>If P-value [latex]\\le\\alpha[\/latex], there is enough evidence to reject the null hypothesis.<\/td>\n<td>At the [latex]\\alpha\\times[\/latex]100% significance level, the data provide convincing evidence in support of the alternative hypothesis.<\/td>\n<\/tr>\n<tr>\n<td>If P-value [latex]\\gt\\alpha[\/latex], there is not enough evidence to reject the null hypothesis.<\/td>\n<td>At the [latex]\\alpha\\times[\/latex]100% significance level, the data do not provide convincing evidence in support of the alternative hypothesis.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/li>\n<li>Write a conclusion in context (e.g., we do\/do not have convincing evidence\u2026).<\/li>\n<\/ol>\n<ul>\n<li>An ANOVA is a way to \u201cpartition\u201d the variation in the data. 
In other words, it divides the total variation into two parts: the part that is explained by the regression model (SSRegression) and the part that remains unexplained (SSResiduals).<\/li>\n<\/ul>\n<p style=\"text-align: center;\">[latex]\\text{SSTotal} = \\text{SSRegression} + \\text{SSResiduals}[\/latex]<\/p>\n<ul>\n<li>\n<div style=\"display: grid; grid-template-columns: repeat(1, minmax(0, 1fr)); overflow: auto; white-space: normal;\" tabindex=\"0\">\n<table>\n<thead>\n<tr>\n<th style=\"width: 84.9306px;\">Source<\/th>\n<th style=\"width: 103.594px;\">[latex]df[\/latex]<\/th>\n<th style=\"width: 195.052px;\">Sum sq ([latex]\\text{SS}[\/latex])<\/th>\n<th style=\"width: 203.472px;\">Mean sq ([latex]\\text{MS}[\/latex])<\/th>\n<th style=\"width: 251.042px;\">F value<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Regression<\/td>\n<td>[latex]p[\/latex]<\/td>\n<td>[latex]\\text{SSRegression}[\/latex]<\/td>\n<td>[latex]\\text{MSRegression} = \\dfrac{\\text{SSRegression}}{p}[\/latex]<\/td>\n<td>[latex]F = \\dfrac{\\text{MSRegression}}{\\text{MSResiduals}}[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>Residuals<\/td>\n<td>[latex]n-1-p[\/latex]<\/td>\n<td>[latex]\\text{SSResiduals}[\/latex]<\/td>\n<td>[latex]\\text{MSResiduals} = \\dfrac{\\text{SSResiduals}}{n-1-p}[\/latex]<\/td>\n<td>&nbsp;<\/td>\n<\/tr>\n<tr>\n<td><strong>Total<\/strong><\/td>\n<td>[latex]n-1[\/latex]<\/td>\n<td>[latex]\\text{SSTotal}[\/latex]<\/td>\n<td>&nbsp;<\/td>\n<td>&nbsp;<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/li>\n<li>When the objective is to estimate the mean value of the response variable for a particular value of the explanatory variable, [latex]x_0[\/latex], we will calculate a [latex]C[\/latex]<strong>% confidence interval for the mean response<\/strong>, where [latex]C[\/latex]\u00a0is the confidence level associated with the interval<strong>. 
<\/strong>This interval gives us a range of plausible values of the mean response for the subset of the population with a value of the explanatory variable equal to [latex]x_0[\/latex].<\/li>\n<li>When the objective is to predict the value of the response variable for an individual observation with the explanatory variable equal to [latex]x_0[\/latex], we will calculate a [latex]C[\/latex]<strong>% prediction interval for an individual response<\/strong>, where [latex]C[\/latex]\u00a0is the confidence level associated with the interval. This interval gives us a range of plausible values of the response for an individual observation that has a value of the explanatory variable equal to [latex]x_0[\/latex].<\/li>\n<li>Data transformation is the process of applying mathematical functions to raw data to make it more useful for analysis. The goal is to adjust for different scales, distributions, or nonlinear relationships. Common transformations include adding a constant, squaring, cubing, taking square roots, or applying logarithms to each data value. The choice of transformation depends on the nature of the data and the desired analysis.<\/li>\n<\/ul>\n<h2>Key Equations<\/h2>\n<p><strong>ANOVA for Regression<\/strong><\/p>\n<p style=\"padding-left: 40px;\">[latex]\\text{SSTotal} = \\text{SSRegression} + \\text{SSResiduals}[\/latex]<\/p>\n<p><strong>[latex]R^2[\/latex]<\/strong><\/p>\n<p style=\"padding-left: 40px;\">[latex]R^2 = \\dfrac{\\text{variation explained}}{\\text{total variation}} = \\dfrac{\\text{SSRegression}}{\\text{SSTotal}} = 1-\\dfrac{\\text{SSResiduals}}{\\text{SSTotal}}[\/latex]<\/p>\n<p><strong>Test Statistic for the Hypothesis Test for Significance of Slope<\/strong><\/p>\n<p style=\"padding-left: 40px;\">[latex]t=\\dfrac{b-0}{[\\text{std. 
error of }b]} = \\dfrac{b}{SE_b}[\/latex]<\/p>\n<h2>Glossary<\/h2>\n<p data-start=\"2065\" data-end=\"2219\"><strong data-start=\"2065\" data-end=\"2106\">confidence interval for mean response<\/strong><\/p>\n<p style=\"padding-left: 40px;\" data-start=\"2065\" data-end=\"2219\">A range of plausible values for the mean of the response variable at a given value of the explanatory variable.<\/p>\n<p><strong>data transformation<\/strong><\/p>\n<p style=\"padding-left: 40px;\">The application of a deterministic mathematical function to each point in a data set.<\/p>\n<p data-start=\"1895\" data-end=\"2061\"><strong data-start=\"1895\" data-end=\"1910\">F-statistic<\/strong><\/p>\n<p style=\"padding-left: 40px;\" data-start=\"1895\" data-end=\"2061\">A ratio used in ANOVA for regression to compare the explained variance to the unexplained variance, testing the overall significance of the model.<\/p>\n<p data-start=\"2467\" data-end=\"2590\"><strong data-start=\"2467\" data-end=\"2489\">log transformation<\/strong><\/p>\n<p style=\"padding-left: 40px;\" data-start=\"2467\" data-end=\"2590\">Applying the natural logarithm to data values to stabilize variance and linearize relationships.<\/p>\n<p data-start=\"1479\" data-end=\"1684\"><strong data-start=\"1479\" data-end=\"1538\">mean square for regression (<span class=\"katex\"><span class=\"katex-mathml\"><br \/>\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><mi>M<\/mi><msub><mi>S<\/mi><mtext>Regression<\/mtext><\/msub><\/mrow><annotation encoding=\"application\/x-tex\">MS_{\\text{Regression}}<\/annotation><\/semantics><\/math><br \/>\n<\/span><\/span>)<\/strong><\/p>\n<p style=\"padding-left: 40px;\" data-start=\"1479\" data-end=\"1684\">The average variation explained by the regression model, calculated as the sum of squares for regression divided by the number of predictors.<\/p>\n<p data-start=\"1688\" data-end=\"1891\"><strong data-start=\"1688\" data-end=\"1745\">mean square for residuals (<span 
class=\"katex\"><span class=\"katex-mathml\"><br \/>\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><mi>M<\/mi><msub><mi>S<\/mi><mtext>Residuals<\/mtext><\/msub><\/mrow><annotation encoding=\"application\/x-tex\">MS_{\\text{Residuals}}<\/annotation><\/semantics><\/math><br \/>\n<\/span><span class=\"katex-html\" aria-hidden=\"true\"><span class=\"base\"><span class=\"mord\"><span class=\"msupsub\"><span class=\"vlist-t vlist-t2\"><span class=\"vlist-r\"><span class=\"vlist-s\">\u200b<\/span><\/span><\/span><\/span><\/span><\/span><\/span><\/span>)<\/strong><\/p>\n<p style=\"padding-left: 40px;\" data-start=\"1688\" data-end=\"1891\">The average unexplained variation in the response variable, calculated as the sum of squares for residuals divided by the degrees of freedom.<\/p>\n<p data-start=\"830\" data-end=\"961\"><strong data-start=\"830\" data-end=\"841\">P-value<\/strong><\/p>\n<p style=\"padding-left: 40px;\" data-start=\"830\" data-end=\"961\">The probability of obtaining a test statistic as extreme as the one observed, assuming the null hypothesis is true.<\/p>\n<p data-start=\"2223\" data-end=\"2463\"><strong data-start=\"2223\" data-end=\"2246\">prediction interval<\/strong><\/p>\n<p style=\"padding-left: 40px;\" data-start=\"2223\" data-end=\"2463\">A range of plausible values for an individual response variable at a given value of the explanatory variable. 
The interval is wider than the confidence interval because it accounts for individual variability.<\/p>\n<p data-start=\"2727\" data-end=\"2834\"><strong data-start=\"2727\" data-end=\"2756\">reciprocal transformation<\/strong><\/p>\n<p style=\"padding-left: 40px;\" data-start=\"2727\" data-end=\"2834\">Using the inverse of data values to reduce the impact of large variances.<\/p>\n<p data-start=\"375\" data-end=\"538\"><strong data-start=\"375\" data-end=\"409\">regression slope (<span class=\"katex\"><span class=\"katex-mathml\"><br \/>\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><msub><mi>\u03b2<\/mi><mn>1<\/mn><\/msub><\/mrow><annotation encoding=\"application\/x-tex\">\\beta_1<\/annotation><\/semantics><\/math><br \/>\n<\/span><span class=\"katex-html\" aria-hidden=\"true\"><span class=\"base\"><span class=\"mord\"><span class=\"mord mathnormal\">\u03b2<\/span><span class=\"msupsub\"><span class=\"vlist-t vlist-t2\"><span class=\"vlist-r\"><span class=\"vlist\"><span class=\"sizing reset-size6 size3 mtight\"><span class=\"mord mtight\">1<\/span><\/span><\/span><span class=\"vlist-s\">\u200b<\/span><\/span><\/span><\/span><\/span><\/span><\/span><\/span>)<\/strong><\/p>\n<p style=\"padding-left: 40px;\" data-start=\"375\" data-end=\"538\">The coefficient representing the rate of change of the response variable for each unit increase in the explanatory variable.<\/p>\n<p data-start=\"2594\" data-end=\"2723\"><strong data-start=\"2594\" data-end=\"2624\">square root transformation<\/strong><\/p>\n<p style=\"padding-left: 40px;\" data-start=\"2594\" data-end=\"2723\">Taking the square root of data values to reduce right-skewness and normalize the distribution.<\/p>\n<p data-start=\"542\" data-end=\"666\"><strong data-start=\"542\" data-end=\"584\">standard error of the slope (<span class=\"katex\"><span class=\"katex-mathml\"><br \/>\n<math 
xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><mi>S<\/mi><msub><mi>E<\/mi><mi>b<\/mi><\/msub><\/mrow><annotation encoding=\"application\/x-tex\">SE_b<\/annotation><\/semantics><\/math><br \/>\n<\/span><span class=\"katex-html\" aria-hidden=\"true\"><span class=\"base\"><span class=\"mord mathnormal\">S<\/span><span class=\"mord\"><span class=\"mord mathnormal\">E<\/span><span class=\"msupsub\"><span class=\"vlist-t vlist-t2\"><span class=\"vlist-r\"><span class=\"vlist\"><span class=\"sizing reset-size6 size3 mtight\"><span class=\"mord mathnormal mtight\">b<\/span><\/span><\/span><span class=\"vlist-s\">\u200b<\/span><\/span><\/span><\/span><\/span><\/span><\/span><\/span>)<\/strong><\/p>\n<p style=\"padding-left: 40px;\" data-start=\"542\" data-end=\"666\">A measure of the variability in the estimated slope across different samples.<\/p>\n<p data-start=\"965\" data-end=\"1134\"><strong data-start=\"965\" data-end=\"1027\">sum of squares for regression (<span class=\"katex\"><span class=\"katex-mathml\"><br \/>\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><mi>S<\/mi><msub><mi>S<\/mi><mtext>Regression<\/mtext><\/msub><\/mrow><annotation encoding=\"application\/x-tex\">SS_{\\text{Regression}}<\/annotation><\/semantics><\/math><br \/>\n<\/span><\/span>)<\/strong><\/p>\n<p style=\"padding-left: 40px;\" data-start=\"965\" data-end=\"1134\">The portion of the total variation in the response variable that is explained by the regression model.<\/p>\n<p data-start=\"1138\" data-end=\"1312\"><strong data-start=\"1138\" data-end=\"1198\">sum of squares for residuals (<span class=\"katex\"><span class=\"katex-mathml\"><br \/>\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><mi>S<\/mi><msub><mi>S<\/mi><mtext>Residuals<\/mtext><\/msub><\/mrow><annotation encoding=\"application\/x-tex\">SS_{\\text{Residuals}}<\/annotation><\/semantics><\/math><br \/>\n<\/span><\/span>)<\/strong><\/p>\n<p 
style=\"padding-left: 40px;\" data-start=\"1138\" data-end=\"1312\">The portion of the total variation in the response variable that remains unexplained by the regression model.<\/p>\n<p data-start=\"670\" data-end=\"826\"><strong data-start=\"670\" data-end=\"702\">test statistic (<span class=\"katex\"><span class=\"katex-mathml\"><br \/>\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><mi>t<\/mi><\/mrow><annotation encoding=\"application\/x-tex\">t<\/annotation><\/semantics><\/math><br \/>\n<\/span><span class=\"katex-html\" aria-hidden=\"true\"><span class=\"base\"><span class=\"mord mathnormal\">t<\/span><\/span><\/span><\/span>-value)<\/strong><\/p>\n<p style=\"padding-left: 40px;\" data-start=\"670\" data-end=\"826\">A measure of how many standard errors the estimated slope is away from zero, used in hypothesis testing for regression.<\/p>\n<p data-start=\"1316\" data-end=\"1475\"><strong data-start=\"1316\" data-end=\"1364\">total sum of squares (<span class=\"katex\"><span class=\"katex-mathml\"><br \/>\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><semantics><mrow><mi>S<\/mi><msub><mi>S<\/mi><mtext>Total<\/mtext><\/msub><\/mrow><annotation encoding=\"application\/x-tex\">SS_{\\text{Total}}<\/annotation><\/semantics><\/math><br \/>\n<\/span><\/span>)<\/strong><\/p>\n<p style=\"padding-left: 40px;\" data-start=\"1316\" data-end=\"1475\">The total variation in the response variable, equal to the sum of the explained and unexplained 
variation.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"author":8,"menu_order":1,"template":"","meta":{"_candela_citation":"[]","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"part":1438,"module-header":"cheat_sheet","content_attributions":[],"internal_book_links":[],"video_content":null,"cc_video_embed_content":{"cc_scripts":"","media_targets":[]},"try_it_collection":null,"_links":{"self":[{"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/pressbooks\/v2\/chapters\/1439"}],"collection":[{"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/wp\/v2\/users\/8"}],"version-history":[{"count":8,"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/pressbooks\/v2\/chapters\/1439\/revisions"}],"predecessor-version":[{"id":6268,"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/pressbooks\/v2\/chapters\/1439\/revisions\/6268"}],"part":[{"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/pressbooks\/v2\/parts\/1438"}],"metadata":[{"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/pressbooks\/v2\/chapters\/1439\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/wp\/v2\/media?parent=1439"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/pressbooks\/v2\/chapter-type?post=1439"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/wp\/v2\/contributor?post=1439"},{"taxonomy":"license","embeddable":true,"href":"htt
ps:\/\/content.one.lumenlearning.com\/introstatstest\/wp-json\/wp\/v2\/license?post=1439"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}