The correlation between two variables can be calculated in more than one way. Here we look at one commonly used measure, Pearson's product-moment correlation coefficient.
Pearson's measure is denoted by the letter r. To understand what the calculation for r is doing, we need to think for a moment about what correlation really means. We saw in level 1 that the correlation coefficient tells you two things about the linear relationship between two variables:
- The direction of the relationship (positive or negative)
- The strength of the relationship, that is, how consistently linear it is
It is easiest to think about both of these with reference to a scatter plot of the data. If there is a linear relationship between the two variables, the points will tend to form a straight line on the scatter plot. If the points all line up perfectly, the correlation is 1 (or -1 if the line slopes downward). If the points cluster about a line, forming a thin cloud, the correlation is closer to 0 in magnitude. The thicker the cloud, the weaker the correlation.
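The scatter-plot picture can be checked numerically. The sketch below is only an illustration, not part of the original lesson: it assumes NumPy is available, uses made-up data (a line with slope 2 plus random noise), and relies on NumPy's built-in `np.corrcoef` rather than working through the formula for r. The helper name `pearson_r` and the noise levels are arbitrary choices.

```python
import numpy as np


def pearson_r(x, y):
    """Return Pearson's r for two equal-length sequences (hypothetical helper)."""
    # np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entry is r.
    return float(np.corrcoef(x, y)[0, 1])


rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
x = np.linspace(0.0, 10.0, 200)  # illustrative X values, not real data

perfect_up = 2.0 * x + 1.0       # points exactly on a rising line
perfect_down = -2.0 * x + 1.0    # points exactly on a falling line
thin_cloud = perfect_up + rng.normal(0.0, 1.0, x.size)    # small scatter about the line
thick_cloud = perfect_up + rng.normal(0.0, 10.0, x.size)  # large scatter about the line

r_up = pearson_r(x, perfect_up)
r_down = pearson_r(x, perfect_down)
r_thin = pearson_r(x, thin_cloud)
r_thick = pearson_r(x, thick_cloud)

print(f"perfect rising line:  r = {r_up:.3f}")
print(f"perfect falling line: r = {r_down:.3f}")
print(f"thin cloud:           r = {r_thin:.3f}")
print(f"thick cloud:          r = {r_thick:.3f}")
```

With these settings the two perfect lines should give r of 1 and -1 (up to rounding), and the thick cloud should give a noticeably smaller r than the thin one, in line with the description of the scatter plot.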
The section below steps you through a visual explanation of how the formula for Pearson's correlation coefficient is derived.
Step 1 of 8
Consider the diagram to the right. It shows a scatter plot of two variables, X and Y. You can see that they are related, but not perfectly.