## Coefficient of Correlation

The coefficient of correlation, also known as the correlation coefficient or Pearson’s correlation coefficient (denoted by ‘r’), is a statistical measure that quantifies the strength and direction of the linear relationship between two variables. The correlation coefficient ranges from -1 to 1, where:

- A value of 1 indicates a perfect positive linear relationship: all data points lie on a straight line with positive slope, so when one variable increases, the other increases at a constant rate.
- A value of -1 indicates a perfect negative linear relationship: all data points lie on a straight line with negative slope, so when one variable increases, the other decreases at a constant rate.
- A value of 0 suggests no linear relationship between the variables, meaning that the variables do not appear to be related in a linear manner.

The correlation coefficient is a widely used tool in finance, economics, and various other fields to analyze the relationship between different variables, such as stock prices, interest rates, or economic indicators.

It’s important to note that correlation does not imply causation. A high correlation between two variables indicates that they have a strong linear relationship, but it does not necessarily mean that changes in one variable cause changes in the other.

To calculate the correlation coefficient, you can use the following formula:

\(\large r = \frac{n(\Sigma xy) - (\Sigma x)(\Sigma y)}{\sqrt{[n\Sigma x^2 - (\Sigma x)^2][n\Sigma y^2 - (\Sigma y)^2]}} \)

where:

- n is the number of data points.
- Σxy is the sum of the products of each pair of x and y values.
- Σx and Σy are the sums of the x and y values, respectively.
- Σx² and Σy² are the sums of the squares of the x and y values, respectively.

When r is close to 1 or -1, it indicates a strong linear relationship between x and y. If r is close to 0, it indicates little to no linear relationship between x and y.
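As a rough illustration, the formula above can be written as a small Python function. This is a minimal sketch rather than a reference implementation: the function name `pearson_r` and the sample data are assumptions made for the example, not part of the original text.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the raw sums n, Σx, Σy, Σxy, Σx², Σy²."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))
    if denominator == 0:
        # One of the variables is constant, so r is undefined.
        raise ValueError("r is undefined when x or y has zero variance")
    return numerator / denominator


print(pearson_r([1, 2, 3, 4, 5], [3, 5, 7, 9, 11]))   # y = 2x + 1  -> 1.0
print(pearson_r([1, 2, 3, 4, 5], [7, 4, 1, -2, -5]))  # y = 10 - 3x -> -1.0
print(pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))  # y = x^2     -> 0.0
```

The last case is worth noting: y is completely determined by x, yet r is 0, because r only measures the linear part of a relationship.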

## Example of the Coefficient of Correlation

Let’s go through a small example to calculate the coefficient of correlation r between two sets of data points.

Suppose we have the following pairs of x and y values:

x: 1, 2, 3, 4, 5

y: 2, 3, 5, 4, 6

Given these values, we can calculate the necessary sums:

- Σx
- Σy
- Σxy
- Σx²
- Σy²

Σx = 1 + 2 + 3 + 4 + 5 = 15

Σy = 2 + 3 + 5 + 4 + 6 = 20

Σxy = 1(2) + 2(3) + 3(5) + 4(4) + 5(6) = 2 + 6 + 15 + 16 + 30 = 69

Σx² = 1² + 2² + 3² + 4² + 5² = 55

Σy² = 2² + 3² + 5² + 4² + 6² = 90

Using the formula for r:

\(\large r = \frac{n(\Sigma xy) - (\Sigma x)(\Sigma y)}{\sqrt{[n\Sigma x^2 - (\Sigma x)^2][n\Sigma y^2 - (\Sigma y)^2]}} \)

And plugging in our values (n = 5):

\(\large r = \frac{5(69) - (15)(20)}{\sqrt{[5(55) - (15)^2][5(90) - (20)^2]}} \)

Let’s break it down step-by-step:

Numerator:

5(69) − 15(20) = 345 − 300 = 45

Denominator:

5(55) − (15)² = 275 − 225 = 50

5(90) − (20)² = 450 − 400 = 50

\(\sqrt{50 \times 50} = \sqrt{2500} = 50 \)

So,

\(r = \frac{45}{50} = 0.9 \)

The solution for this example is r = 0.9. This indicates a strong positive linear correlation between the two sets of data points: as x increases, y tends to increase as well.
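A hand calculation like this is easy to cross-check by recomputing each sum directly from the x and y values listed in the example. The snippet below is only a sketch of that check; the variable names are chosen for illustration.

```python
import math

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)

sum_x = sum(x)                              # 15
sum_y = sum(y)                              # 20
sum_xy = sum(a * b for a, b in zip(x, y))   # 69
sum_x2 = sum(a * a for a in x)              # 55
sum_y2 = sum(b * b for b in y)              # 90

# Same formula as in the text, built from the raw sums.
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2)
)
print(r)  # 0.9
```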