
Correlation Test Online Calculator

This free online correlation test calculator shows the strength of the correlation between two variables and displays the Pearson, Spearman, and Kendall correlation coefficients with their p-values and a scatterplot. It tells you what kind of relationship exists between the two variables and how certain that result is.

Powered by AnswerMiner
X     Y
10    12
20    22
30    19
40    42
50    52
60    62
70    172
80    82
90    92
Pearson correlation coefficient is
r = 0.7569
p-value = 0.0182
Spearman correlation coefficient is
rs = ρ(rho) = 0.9333
p-value = 0.0007
Kendall correlation coefficient is
τ(tau) = 0.8333
p-value = 0.0009

What do correlation and its strength mean?

If you have two numerical measurements for each of several items of the same kind, you can examine whether the two measurements are related. If you plot the datapairs on a diagram (one value on the X axis, the other on the Y axis), then the stronger the correlation, the more clearly the diagram shows the two values moving together, i.e. when the first value of a datapair is larger, the second value of the same datapair is usually larger too.
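As a rough illustration of how "moving together" turns into a number, here is a minimal sketch of the Pearson coefficient in plain Python. It assumes the example table above reads as X = 10 to 90 with the Y values listed next to them; the function name `pearson_r` is only illustrative and is not taken from the calculator itself.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of the deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Example datapairs (assumed reading of the table above)
xs = [10, 20, 30, 40, 50, 60, 70, 80, 90]
ys = [12, 22, 19, 42, 52, 62, 172, 82, 92]

r = pearson_r(xs, ys)
print(round(r, 4))  # 0.7569
```

A value near +1 means the points lie close to a rising straight line, a value near -1 a falling one, and a value near 0 means no linear trend.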

How to use this tool

First, decide which two quantities you want to test for correlation. It can be the weight and height of people, the temperature and the number of ice creams sold, the age and performance of an employee, and so on: any kind of numerical X-Y value pairs. Enter at least five value pairs into the cells of the table above. Once the datapairs are entered, a scatterplot appears immediately, so you can check visually how the values move together. The strength of the correlation (the coefficient) and the sureness of the result (the p-value) are shown as well.

Which correlation method? Pearson or Spearman or Kendall?

The Pearson correlation coefficient is the most commonly used method, although it is very sensitive to outliers. The Spearman and Kendall correlation coefficients work on the ranks of the values rather than the values themselves, so they are far less sensitive to outliers, but their explanatory power is lower. Read our correlation coefficient demystified blogpost.
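The two rank-based coefficients can be sketched in a few lines of plain Python. This is an illustrative version that assumes no tied values (true for the example data, read as X = 10 to 90 with the Y values listed in the table above); the helper names are hypothetical, and the calculator's own implementation may differ, particularly in how it handles ties.

```python
def ranks(values):
    """Rank each value from 1..n; assumes there are no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Spearman's rho via the no-ties shortcut 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1))

def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant pairs) / all pairs; no ties assumed."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            # +1 if the pair is ordered the same way in X and Y, -1 if reversed
            score += 1 if s > 0 else -1 if s < 0 else 0
    return score / (n * (n - 1) / 2)

# Example datapairs (assumed reading of the calculator table)
xs = [10, 20, 30, 40, 50, 60, 70, 80, 90]
ys = [12, 22, 19, 42, 52, 62, 172, 82, 92]

print(round(spearman_rho(xs, ys), 4))  # 0.9333
print(round(kendall_tau(xs, ys), 4))   # 0.8333
```

In this data the pair (70, 172) lies far from the trend of the other points. Because only its rank matters to Spearman and Kendall, it affects them much less than it affects Pearson's r.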

Why can't you be absolutely sure?

Because the correlation observed between the X and Y columns may be the work of coincidence. Your data come from an experiment or observation that is not exactly repeatable; the values are not perfectly accurate and contain some random variation. If you measured again, you would get different values. Because of this variation, you can be confident about the relationship only if you have enough data and the correlation is strong. The more data you have and the stronger the relationship between the values, the greater the certainty. Think about it: if you have a small 12-row table listing seasons with their average temperatures and the number of computers sold in each season, and there is a weak correlation between the values, this may be the work of coincidence, so you cannot say with complete certainty that computer sales are related to the temperature. However, if you have 365 rows of datapairs with each day's temperature and the number of ice creams sold, then, because of the large amount of data and the strong correlation, you can be practically sure that there is a relationship.
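The effect of sample size can be seen in a small simulation. The sketch below generates pairs of completely unrelated random numbers and measures how widely the Pearson coefficient scatters around zero for 12 rows versus 365 rows. The random data, the seed, and the function names are assumptions made purely for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spread_of_r(n, trials=1000, seed=0):
    """Standard deviation of r over many samples of n unrelated random pairs."""
    rng = random.Random(seed)
    rs = []
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [rng.gauss(0, 1) for _ in range(n)]  # drawn independently of xs
        rs.append(pearson_r(xs, ys))
    mean = sum(rs) / trials
    return math.sqrt(sum((r - mean) ** 2 for r in rs) / trials)

small = spread_of_r(12)    # a 12-row table
large = spread_of_r(365)   # a 365-row table
print(f"typical chance r with 12 rows:  about +/-{small:.2f}")
print(f"typical chance r with 365 rows: about +/-{large:.2f}")
```

With only 12 rows, unrelated data regularly produce coefficients that look like a weak or moderate correlation; with 365 rows, chance alone rarely produces anything far from zero.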

What does the p-value tell you?

This certainty value shows how likely it would be to get a correlation coefficient at least as strong as the observed one purely by coincidence, if there were in fact no relationship between the two values. A low p-value (below 0.05) means that you can be reasonably sure that there is a correlation between the two kinds of values, i.e. they move together on the diagram. A high p-value (above 0.05) means that you cannot be sure whether there is a correlation between your numbers or not.
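One way to make the idea of "came out only by coincidence" concrete is a permutation test: shuffle the Y column many times, which destroys any real relationship, and count how often the shuffled data give a coefficient at least as strong as the observed one. The sketch below is an illustration of that idea only. It is not necessarily the method this calculator uses, so its result need not equal the p-value displayed above. The data are the assumed reading of the example table, and the function names, shuffle count, and seed are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys, shuffles=5000, seed=0):
    """Two-sided Monte Carlo permutation p-value for Pearson's r."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(shuffles):
        rng.shuffle(shuffled)  # break any real link between X and Y
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (shuffles + 1)

# Example datapairs (assumed reading of the calculator table)
xs = [10, 20, 30, 40, 50, 60, 70, 80, 90]
ys = [12, 22, 19, 42, 52, 62, 172, 82, 92]

p = permutation_p_value(xs, ys)
print(f"estimated p-value: {p:.4f}")
```

A small result means that shuffled, relationship-free data almost never look as correlated as the real data do.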

What does low sureness mean?

It means that these numbers cannot tell you whether there is a correlation between the two values or not. It does not mean that there is no correlation and that the observed relationship is only the work of coincidence; it means you cannot be sure whether it is coincidence or a real connection. The certainty is said to be low when it is under 95% (p-value above 0.05), so you can't be sure about the result. Generally, a sureness of 95% or higher is required. If you reach 99% or better (p-value of 0.01 or lower), you can be quite sure. (But there is still about a 1% chance that the correlation appeared because of a very rare coincidence.)

What does high sureness mean?

It means that you can be confident that there is a correlation between the values. Usually a certainty level above 95% or 99% (p-value below 0.05 or 0.01) is considered high. It is important to note that even when the certainty is high, it only tells you that a relationship exists between the two values; the strength of that relationship may still be minimal or negligible. This is why you must also check the observed strength of the correlation.

How to increase the certainty?

You need more data. If you continue your experiment or observation with a larger number of events, you will get better certainty, even if the strength of the correlation doesn't change.
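A common way to see why more data helps is the t statistic often used to test a Pearson coefficient, t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The sketch below assumes that standard textbook test; the source does not state which test the calculator applies, and the function name is illustrative.

```python
import math

def t_statistic(r, n):
    """t statistic for testing Pearson's r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Same correlation strength, increasing number of datapairs
for n in (10, 50, 200):
    print(n, round(t_statistic(0.3, n), 2))

# The example above: r = 0.7569 from 9 datapairs
t_example = t_statistic(0.7569, 9)
print(round(t_example, 2))
```

For a fixed r, the statistic grows roughly with the square root of the number of datapairs, and a larger t corresponds to a smaller p-value. This is also why a weak correlation can reach high certainty when the dataset is large.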

How does it work?

A complex algorithm calculates the correlation coefficients and their statistical significance. We don't want to bore you with the math behind it, so just use it. You don't need university-level math to use a tool that is built on it.
