ChangingMinds.org

How we change what others think, feel, believe and do

 


Pearson correlation

 


 

Description

Pearson devised a very common way of measuring correlation, often called the Pearson Product-Moment Correlation. It is used when both variables are at least at interval level and the data is parametric.

It is calculated by dividing the covariance of the two variables by the product of their standard deviations.

r = SUM((xi - xbar)(yi - ybar)) / ((n - 1) * sx * sy)

Where x and y are the variables, xi and yi are a single pair of values of x and y, xbar and ybar are the means of all x's and all y's, n is the number of pairs of values, and sx and sy are the standard deviations of all x's and all y's.
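The formula above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name pearson_r is an assumption made for this example, and it uses the sample (n - 1) standard deviation as in the formula.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation: sum of products / ((n - 1) * sx * sy)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    # Sum of the products of the paired deviations from the means.
    sum_products = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    # Sample standard deviations (divide by n - 1).
    sx = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    sy = math.sqrt(sum((y - ybar) ** 2 for y in ys) / (n - 1))
    # Note: a constant variable has sx or sy of zero, which raises ZeroDivisionError.
    return sum_products / ((n - 1) * sx * sy)


# A perfectly linear pair of variables should give r very close to 1.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```

Reversing one of the lists gives a value very close to -1, which is perfect negative correlation.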

The square of r may also be considered as being:

r^2 = explained variation / total variation

where variation is calculated as the Sum of the Squares, SS.

In other words, r^2 is the proportion of variation that can be explained. A high explained proportion indicates a strong relationship, and a value of one is perfect correlation. For example, an r of 0.8 explains 64% of the variance.
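The ratio of explained to total variation can be checked numerically. The sketch below assumes that "explained" means explained by a least-squares straight line of y on x (an assumption, as the text does not name the model), and it borrows the ten data pairs from the Example section. All variable names are illustrative.

```python
import math

xs = [1, 3, 5, 6, 8, 9, 6, 4, 3, 2]
ys = [2, 5, 6, 6, 7, 7, 5, 3, 1, 1]

n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n

# Sums of squares and products of deviations from the means.
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
sxx = sum((x - xbar) ** 2 for x in xs)
syy = sum((y - ybar) ** 2 for y in ys)

# Least-squares line y = intercept + slope * x.
slope = sxy / sxx
intercept = ybar - slope * xbar
predicted = [intercept + slope * x for x in xs]

ss_total = syy                                           # total variation in y
ss_explained = sum((p - ybar) ** 2 for p in predicted)   # variation captured by the line

r = sxy / math.sqrt(sxx * syy)
r_squared_from_ss = ss_explained / ss_total

print(f"r^2 from r:               {r ** 2:.4f}")
print(f"explained SS / total SS:  {r_squared_from_ss:.4f}")
```

The two printed values agree, which is the sense in which the square of r is the proportion of variation explained.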

When calculated from a population, Pearson's coefficient is denoted with the Greek letter 'rho' (ρ). When calculated from a sample, it is denoted with 'r'.

The Coefficient of Determination is calculated as r^2.

Example

 

  x     y     x-xbar    y-ybar    (x-xbar)*(y-ybar)
  1     2      -3.7      -2.3            8.51
  3     5      -1.7       0.7           -1.19
  5     6       0.3       1.7            0.51
  6     6       1.3       1.7            2.21
  8     7       3.3       2.7            8.91
  9     7       4.3       2.7           11.61
  6     5       1.3       0.7            0.91
  4     3      -0.7      -1.3            0.91
  3     1      -1.7      -3.3            5.61
  2     1      -2.7      -3.3            8.91

n:        10
Totals:   sum of x = 47, sum of y = 43, sum of (x-xbar)*(y-ybar) = 46.90
Means:    xbar = 4.70, ybar = 4.30 (xbar is mean of x, ybar is mean of y)
Std dev:  sx = 2.58, sy = 2.36

 

Hence:

Pearson r = sum((xi - xbar)(yi - ybar)) / ((n - 1) * sx * sy)

             = 0.855

 

This is quite high, showing a strong positive correlation between the two sets of numbers.
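The worked example can be reproduced with Python's standard statistics module. This is a sketch for checking the arithmetic; the variable names are illustrative, and statistics.stdev is used because it divides by n - 1, matching the formula.

```python
import statistics

x = [1, 3, 5, 6, 8, 9, 6, 4, 3, 2]
y = [2, 5, 6, 6, 7, 7, 5, 3, 1, 1]

n = len(x)
xbar = statistics.mean(x)
ybar = statistics.mean(y)

# Right-hand column of the table: product of the paired deviations.
products = [(xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)]
sum_products = sum(products)

# Sample standard deviations (n - 1 in the denominator).
sx = statistics.stdev(x)
sy = statistics.stdev(y)

r = sum_products / ((n - 1) * sx * sy)

print(f"n = {n}, sum of x = {sum(x)}, sum of y = {sum(y)}")
print(f"xbar = {xbar:.2f}, ybar = {ybar:.2f}")
print(f"sum of products = {sum_products:.2f}")
print(f"sx = {sx:.2f}, sy = {sy:.2f}")
print(f"r = {r:.3f}")
```

Running it gives an r of roughly 0.85 for these ten pairs.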

 

Discussion

Pearson is a parametric statistic and assumes:

  1. A normal distribution.
  2. Interval or ratio data.
  3. A linear relationship between X and Y.

The coefficient of determination, r^2, represents the percent of the variance in the dependent variable explained by the independent variable.

Correlation explains a certain amount of variance, but not all. This works on a square law, so a correlation of 0.5 indicates that the independent variable explains 25% of the variance of the dependent variable, and a correlation of 0.9 accounts for 81% of the variance.

This means that the unexplained variance is indicated by (1 - r^2). This is typically due to random factors.
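The square law is easy to tabulate. The short sketch below uses a helper name, variance_split, that is hypothetical and chosen only for this illustration; it returns the explained and unexplained proportions for a given r.

```python
def variance_split(r):
    """Return (explained, unexplained) proportions of variance for a correlation r."""
    explained = r ** 2
    return explained, 1 - explained


for r in (0.5, 0.8, 0.9):
    explained, unexplained = variance_split(r)
    print(f"r = {r:.1f}: explained {explained:.0%}, unexplained {unexplained:.0%}")
```

Note how quickly the explained share falls away: halving r from 0.8 to 0.4 cuts the explained variance from 64% to 16%.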

Pearson's Correlation is also known as the Pearson Product-Moment Correlation or Sample Correlation Coefficient. 'r' is also known as 'Pearson's r'.

See also

Spearman correlation, Kendall correlation, Types of reliability

 


 

 

  © Syque 2002-2008
