Module 5: Background You’ll Need 1

  • Identify the two quantitative variables as an explanatory variable and a response variable.

Scatterplots

Scatterplots are used to illustrate the relationship between two quantitative variables.

When investigating the relationship between two quantitative variables, a scatterplot is a simple way to show the spread of the data, the direction and strength of the relationship, and any potential outliers. With larger data sets, a scatterplot displays the overall pattern more succinctly than a table of the same data. The plot can also hint at the general form of the relationship (for example, increasing linear, decreasing linear, or nonlinear) and helps us identify deviations from that pattern.

Note: The explanatory variable is typically placed on the horizontal x-axis, and the response variable on the vertical y-axis. Sometimes the variables do not have a clear explanatory–response relationship. In that case, there is no rule to follow, and either variable can go on either axis.

Highway Signs

A research firm conducts a study to explore the relationship between a driver’s age and the driver’s ability to read highway signs. The subjects are a random sample of 30 drivers between the ages of 18 and 82. (Source: Jessica M. Utts and Robert F. Heckard, Mind on Statistics [Brooks/Cole, 2002]. Original source: Data collected by The Last Resource, Inc., Bellfonte, PA.)

Because the purpose of this study is to explore the effect of age on the driver’s ability to read highway signs,

  • the explanatory variable is age, and
  • the response variable is the maximum distance at which the driver can read a highway sign, or maximum reading distance.

Both variables are quantitative.

Here is what the raw data look like:

Raw data: Drivers’ ages (explanatory variable) and maximum distances (response variable) at which they can read a highway sign
Figure 1. A data table showing driver age as the explanatory variable and maximum reading distance as the response variable.

In this data set, the individuals are the 30 drivers. For each driver, we have two values: age and maximum reading distance.

To explore the relationship between age and distance, we create a graph called a scatterplot. To create a scatterplot, we use an ordered pair (x, y) to represent the data for each driver. The x-coordinate is the explanatory variable: driver’s age. The y-coordinate is the response variable: maximum reading distance.

Here is the completed scatterplot:

Completed scatterplot, where each dot represents a driver's age and maximum distance at which they can read a road sign
Figure 2. A scatterplot showing a negative association between driver age and the maximum distance at which a sign can be read: older drivers tend to read signs from shorter distances.
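
The ordered-pair construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce the figures. The function names `to_ordered_pairs` and `draw_scatterplot` are hypothetical, the three example values are made up and are not the study's data, and the plotting step assumes the matplotlib library is installed.

```python
from typing import List, Sequence, Tuple


def to_ordered_pairs(
    ages: Sequence[float], distances: Sequence[float]
) -> List[Tuple[float, float]]:
    """Pair each driver's age (x) with that driver's maximum reading distance (y)."""
    if len(ages) != len(distances):
        raise ValueError("each driver needs exactly one age and one distance")
    return list(zip(ages, distances))


def draw_scatterplot(pairs: Sequence[Tuple[float, float]]):
    """Plot one dot per driver; assumes matplotlib is available."""
    import matplotlib.pyplot as plt  # imported here so the pairing step works without it

    xs = [x for x, _ in pairs]  # explanatory variable on the horizontal axis
    ys = [y for _, y in pairs]  # response variable on the vertical axis
    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ax.set_xlabel("Driver age (years)")
    ax.set_ylabel("Maximum reading distance")
    return fig


# Hypothetical values for illustration only; NOT the study's data.
example_ages = [20, 45, 70]
example_distances = [550, 450, 400]
example_pairs = to_ordered_pairs(example_ages, example_distances)
```

Calling `draw_scatterplot(example_pairs)` would return a figure with one dot per driver. Keeping the pairing step separate from the plotting step mirrors the text: each individual contributes exactly one (x, y) point, and the graph is simply those points drawn on a pair of axes.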