Essential Concepts
- A linear regression model with two or more explanatory variables is called a multiple linear regression model. Because there is more than one explanatory variable, the model no longer describes a line. In general, we can include [latex]p[/latex] explanatory variables in the model. The equation for the estimated model that uses [latex]p[/latex] variables is [latex]\hat{y} = a + b_1 \cdot x_1 + b_2 \cdot x_2 + ... + b_p \cdot x_p[/latex], where [latex]a[/latex] is the intercept and [latex]b_1, b_2, ... ,b_p[/latex] are the regression coefficients for explanatory variables [latex]x_1, x_2, ... ,x_p[/latex], respectively. In multiple linear regression, [latex]b_1, b_2, ... , b_p[/latex] are called partial slopes: each one describes the estimated change in [latex]\hat{y}[/latex] for a one-unit increase in its variable while the other explanatory variables are held constant.
- The coefficient of determination, [latex]R^2[/latex], measures the proportion of variability in the response variable that is accounted for by the explanatory variables in the model. It is usually reported as a percentage.
- In a residual plot for multiple linear regression, the [latex]y[/latex]-axis shows the residuals and the [latex]x[/latex]-axis shows either an explanatory variable or the fitted values. For a multiple linear regression model, you create one residual plot for each continuous explanatory variable, as well as one for the fitted values.
- An indicator variable is a binary variable with only two values: [latex]0[/latex] and [latex]1[/latex]. When creating an indicator variable, we assign the value [latex]1[/latex] to observations in a chosen category and the value [latex]0[/latex] to observations in all other categories.
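- A reference group is the category of a categorical variable that is not represented explicitly by an indicator variable; observations in the reference group have the value [latex]0[/latex] for every indicator variable. This is why a categorical variable with [latex]k[/latex] categories requires only [latex]k-1[/latex] indicator variables in the regression model.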
- An interaction occurs when an explanatory variable has a different effect on the response variable, depending on the values of another explanatory variable. An interaction term is a variable that represents an interaction between two variables.
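The concepts above can be tied together in a short sketch. The Python code below is one possible illustration, not a prescribed method: the data set (`size`, `neighborhood`, `price`), the choice of "east" as the reference group, and the helper names `design_row`, `solve`, and `fit` are all hypothetical. It codes a three-category variable with two indicator variables, adds one interaction term, fits the model by ordinary least squares using only the standard library, and then computes the residuals and [latex]R^2[/latex]. In practice a statistics package would normally do the fitting.

```python
# All data and names below are hypothetical, chosen only for illustration.
# Each row: (size, neighborhood, price). "east" serves as the reference group.
rows = [
    (10.0, "east", 20.0), (14.0, "east", 27.5), (18.0, "east", 36.0),
    (9.0, "north", 24.0), (13.0, "north", 35.0), (17.0, "north", 47.5),
    (11.0, "south", 26.0), (16.0, "south", 35.5),
]


def design_row(size, hood):
    """Return [1, size, is_north, is_south, size * is_north] for one observation."""
    is_north = 1.0 if hood == "north" else 0.0  # indicator variable
    is_south = 1.0 if hood == "south" else 0.0  # indicator variable
    # k = 3 neighborhoods need only k - 1 = 2 indicators; "east" is all zeros.
    # The last entry is an interaction term between size and is_north.
    return [1.0, size, is_north, is_south, size * is_north]


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit(X, y):
    """Least-squares coefficients from the normal equations (X^T X) b = X^T y."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * y_k for row, y_k in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


X = [design_row(size, hood) for size, hood, _ in rows]
y = [price for _, _, price in rows]

coeffs = fit(X, y)  # intercept a followed by the partial slopes
fitted = [sum(c * v for c, v in zip(coeffs, row)) for row in X]
residuals = [y_i - f_i for y_i, f_i in zip(y, fitted)]

# Coefficient of determination: 1 - (residual variation / total variation).
y_mean = sum(y) / len(y)
sse = sum(r ** 2 for r in residuals)
sst = sum((y_i - y_mean) ** 2 for y_i in y)
r_squared = 1.0 - sse / sst

print("coefficients:", [round(c, 3) for c in coeffs])
print("R^2:", round(r_squared, 4))
# (fitted value, residual) pairs: the points of a residual plot against fitted values.
for f_i, r_i in zip(fitted, residuals):
    print(round(f_i, 2), round(r_i, 2))
```

Pairing the residuals with the `size` column instead of the fitted values would give the points for the residual plot of that continuous explanatory variable.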
Key Equations
multiple linear regression model
[latex]\hat{y} = a + b_1 \cdot x_1 + b_2 \cdot x_2 + ... + b_p \cdot x_p[/latex]
partial slopes
[latex]b_1, b_2, ... , b_p[/latex]
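As a small illustration of how the model equation is evaluated, the sketch below computes [latex]\hat{y}[/latex] for one observation and shows what a partial slope means. The function name `predict` and all numeric values are made up for the example, and the model is assumed to contain no interaction terms.

```python
def predict(a, b, x):
    """Return y-hat = a + b_1*x_1 + ... + b_p*x_p for one observation."""
    if len(b) != len(x):
        raise ValueError("need exactly one coefficient per explanatory variable")
    return a + sum(b_j * x_j for b_j, x_j in zip(b, x))


# Hypothetical intercept, partial slopes, and explanatory values (p = 3).
a = 2.0
b = [0.5, -1.0, 3.0]
x = [4.0, 2.0, 1.0]

y_hat = predict(a, b, x)

# Increase x_1 by one unit while holding x_2 and x_3 constant.
x_plus = [x[0] + 1.0] + x[1:]
change = predict(a, b, x_plus) - y_hat  # equals the partial slope b_1

print(y_hat, change)
```

With no interaction terms in the model, the change in the prediction is the same whatever values the other variables are held at, which is the sense in which [latex]b_1[/latex] is a partial slope.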
Glossary
indicator variable
a binary variable with only two values: [latex]0[/latex] and [latex]1[/latex]
interaction
a situation in which an explanatory variable has a different effect on the response variable, depending on the values of another explanatory variable
interaction term
a variable that represents an interaction between two variables
multiple linear regression model
a linear regression model with two or more explanatory variables
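To make the interaction term defined above concrete, here is a brief worked sketch. It assumes a hypothetical model with one continuous explanatory variable [latex]x_1[/latex], one indicator variable [latex]x_2[/latex], and their product as the interaction term; the coefficient labels are chosen only for this illustration.

[latex]\hat{y} = a + b_1 \cdot x_1 + b_2 \cdot x_2 + b_3 \cdot (x_1 \cdot x_2)[/latex]

For the reference group, [latex]x_2 = 0[/latex], so the model reduces to [latex]\hat{y} = a + b_1 \cdot x_1[/latex].

For the other group, [latex]x_2 = 1[/latex], so the model becomes [latex]\hat{y} = (a + b_2) + (b_1 + b_3) \cdot x_1[/latex].

Under these assumptions, [latex]b_2[/latex] is the difference in intercepts between the two groups and [latex]b_3[/latex] is the difference in the slopes for [latex]x_1[/latex]. If [latex]b_3 = 0[/latex], the effect of [latex]x_1[/latex] is the same in both groups and there is no interaction.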