Line of Best Fit: Learn It 1

  • Recognize when a linear regression model will fit a given data set.
  • Use technology to create scatterplots, find the line of best fit and find the correlation coefficient.
  • Find the estimated slope and [latex]y[/latex]-intercept for a linear regression model.
  • Use the line of best fit to predict values.

Explanatory and Response Variables

Bivariate data are defined as pairs of data values, where each pair consists of two different measurements that come from the same individual or unit. There are two variables to consider when working with bivariate data sets:

  • The explanatory variable ([latex]x[/latex]) is the variable thought to explain or predict the response variable of a study.
  • The response variable ([latex]y[/latex]) measures the outcome of interest in the study. This variable is thought to depend in some way on the explanatory variable. It is often referred to as the “variable of interest” for the researcher. (And in previous math classes, this variable may have been referred to as the “dependent variable.”)

Identifying explanatory and response variables can sometimes be difficult. When trying to identify explanatory and response variables, make sure to carefully read the scenario and keep the following phrases in mind:

The explanatory variable is used to predict the response variable

The explanatory variable is used to calculate the response variable

The explanatory variable is used to determine the response variable

It is good practice to identify both variables and then ask yourself, “Which one is the main outcome or focus of the study?” This variable will be the response variable and the other variable will be your explanatory variable. It is not up to the researcher(s) to decide the main focus or outcome of a pre-existing study. Instead, researchers need to carefully read the context of the study to identify which variable is being used to explain (the explanatory variable) an outcome or response (the response variable).

A teacher wonders if students’ number of absences per semester is related to academic performance in her classes. She might look back on her class records from previous semesters and generate a data set by observing both the final overall average grade and total number of missed classes for each student in a random sample of students. This is an example of a bivariate data set.
Determine the explanatory and response variable of this scenario.