- Create scatterplots for bivariate data and answer questions from the graph.
- Describe the trend of bivariate data.
- Calculate the correlation coefficient and explain what it means.
Strength of the Linear Relationship
The strength of the relationship is a description of how closely the data follow the linear form of the relationship.

In the leftmost scatterplot, the data points follow the linear pattern exactly. This is an example of a perfect linear relationship. In the second scatterplot from the left, the data points almost follow a linear pattern. This is an example of a strong linear relationship. In the third scatterplot, the data points also follow a linear pattern, but the points are not as close to forming a line; the data are more scattered. This is an example of a weak linear relationship. Labeling a relationship as strong or weak is not very precise.
The Pearson Correlation Coefficient, [latex]r[/latex], is a more precise measure of the strength and direction of the linear relationship between two quantitative variables.
correlation coefficient
The correlation coefficient, [latex]r[/latex], is a numeric measure of the strength and direction of a linear relationship between two quantitative variables.
Calculation: [latex]r[/latex] is calculated using the following formula: [latex]r=\dfrac{\sum\left(\frac{x-\bar{x}}{{s}_{x}}\right)\left(\frac{y-\bar{y}}{{s}_{y}}\right)}{n-1}[/latex] where [latex]n[/latex] is the sample size; [latex]x[/latex] is a data value for the explanatory variable; [latex]\bar{x}[/latex] is the mean of the [latex]x[/latex]-values; [latex]{s}_{x}[/latex] is the standard deviation of the [latex]x[/latex]-values; similarly, for the terms involving [latex]y[/latex]. To calculate [latex]r[/latex], the term [latex]\left(\frac{x-\bar{x}}{{s}_{x}}\right)\left(\frac{y-\bar{y}}{{s}_{y}}\right)[/latex] is calculated for each individual or observational unit. These terms are added together, then the sum is divided by [latex](n-1)[/latex].
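The calculation above can be sketched in a few lines of Python. This is only an illustration using the standard library: the function name `pearson_r` and the data values are made up for this example, and `stdev` is assumed to be the sample standard deviation (the one with an [latex]n-1[/latex] divisor), which is what the formula requires.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation computed from the standardized-products formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    # One product of standardized values per individual or observational unit
    products = [
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    ]
    # Add the products, then divide the sum by n - 1
    return sum(products) / (n - 1)


# Hypothetical data, invented for illustration only
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)
print(round(r, 4))
```

Each entry of `products` corresponds to one term of the sum in the formula, so the list can be printed to see how much each observation contributes to [latex]r[/latex].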
However, the calculation of [latex]r[/latex] is not the focus of this course. We use a statistics package to calculate the correlation coefficient for us, and the emphasis of this course is on interpreting the value of [latex]r[/latex].
The following table contains general guidelines for describing the strength of a linear relationship based on the value of the associated correlation coefficient.
| Correlation Coefficient, [latex]r[/latex] | General Interpretation |
|---|---|
| [latex]-1[/latex] to [latex]-0.7[/latex] | Strong negative linear relationship |
| [latex]-0.7[/latex] to [latex]-0.3[/latex] | Moderate negative linear relationship |
| [latex]-0.3[/latex] to [latex]-0.1[/latex] | Weak negative linear relationship |
| [latex]-0.1[/latex] to [latex]0.1[/latex] | Negligible or no linear relationship |
| [latex]0.1[/latex] to [latex]0.3[/latex] | Weak positive linear relationship |
| [latex]0.3[/latex] to [latex]0.7[/latex] | Moderate positive linear relationship |
| [latex]0.7[/latex] to [latex]1[/latex] | Strong positive linear relationship |
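As a rough sketch, the guidelines in the table can be written as a small lookup function. The function name `describe_r` is hypothetical. The table does not say which category a boundary value such as [latex]0.7[/latex] belongs to, so this sketch assumes a boundary value goes to the weaker category; that choice is an assumption, not part of the guidelines.

```python
def describe_r(r):
    """Map a correlation coefficient to the table's general interpretation.

    Assumption: a value exactly on a boundary (0.1, 0.3, 0.7) is assigned
    to the weaker of the two neighboring categories.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must be between -1 and 1")
    size = abs(r)
    if size <= 0.1:
        return "Negligible or no linear relationship"
    if size <= 0.3:
        strength = "Weak"
    elif size <= 0.7:
        strength = "Moderate"
    else:
        strength = "Strong"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} linear relationship"


print(describe_r(0.85))   # Strong positive linear relationship
print(describe_r(-0.5))   # Moderate negative linear relationship
print(describe_r(0.05))   # Negligible or no linear relationship
```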
Properties of [latex]r[/latex]
- The correlation does not change when the units of measurement of either variable change. For example, converting the explanatory variable from inches to centimeters, the response variable from pounds to kilograms, or both, has no effect on the correlation ([latex]r[/latex]).
- The correlation measures only the strength of a linear relationship between two variables. It ignores any other type of relationship, no matter how strong it is, which illustrates an important rule: Always make a scatterplot of the data before calculating and interpreting the meaning of [latex]r[/latex].
- Association does not imply causation. Do not interpret a high correlation between explanatory and response variables as a cause-and-effect relationship.
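The first two properties can be checked numerically. The sketch below uses made-up data and a hypothetical helper, `pearson_r`, that implements the formula for [latex]r[/latex] with Python's standard library. It first compares [latex]r[/latex] before and after a change of units, then computes [latex]r[/latex] for data that follow a perfect curve rather than a line.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation from the standardized-products formula."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    total = sum(
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    )
    return total / (n - 1)


# Property 1: changing units does not change r.
# Hypothetical heights (inches) and weights (pounds), invented for illustration.
heights_in = [60, 62, 65, 68, 71]
weights_lb = [110, 125, 140, 155, 180]
heights_cm = [h * 2.54 for h in heights_in]        # inches -> centimeters
weights_kg = [w * 0.45359237 for w in weights_lb]  # pounds -> kilograms

r_imperial = pearson_r(heights_in, weights_lb)
r_metric = pearson_r(heights_cm, weights_kg)
print(abs(r_imperial - r_metric) < 1e-9)  # True: same r up to rounding error

# Property 2: r measures only linear relationships.
# Here y is completely determined by x (y = x squared), yet r is about 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
r_curve = pearson_r(xs, ys)
print(abs(r_curve) < 1e-9)  # True: no linear relationship detected
```

The second result shows why a scatterplot should come first: a value of [latex]r[/latex] near zero rules out a linear relationship but says nothing about a curved one.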