In our world, everything is interconnected: in some cases the connection is visible to the naked eye, in others people do not even suspect it exists. In statistics, this mutual dependence is usually described by the term "correlation", which comes up often in the economic literature. Let's try to figure out together what this concept really means, what the coefficients are, and how to interpret the values obtained.
The concept
So what is correlation? As a rule, the term refers to a statistical relationship between two or more parameters: when the value of one of them changes, the values of the others inevitably change as well. Various coefficients are used to measure the strength of such interdependence mathematically. Note that when a change in one parameter does not produce a regular change in another, but only affects some statistical characteristic of it, the relationship is not a correlation but merely a statistical one.
Term history
To better understand what correlation is, let's take a short dive into history. The term appeared in the XVIII century thanks to the French paleontologist Georges Cuvier. He developed the so-called "law of correlation" of the organs and parts of living beings, which made it possible to reconstruct the appearance of an ancient fossil animal from just a few of its remains. In statistics, the word has been in use since 1886 thanks to the English statistician and biologist Francis Galton. The name of the term already carries its meaning: not just a relation, but a joint relation, a "co-relation". However, it was Galton's student, the biologist and mathematician Karl Pearson (1857-1936), who first gave a clear mathematical explanation of what correlation is and derived the exact formula for calculating the corresponding coefficients.
Pair correlation
This is the name for the relationship between two specific quantities. For example, it has been shown that annual advertising expenditure in the United States is very closely related to gross domestic product: between 1956 and 1977 the correlation coefficient between these values was estimated at 0.9699. Another example is the number of visits to an online store and its sales volume. Close relationships have also been found between beer sales and air temperature, between the average monthly temperature at a given location in the current and previous year, and so on. How should the pair correlation coefficient be interpreted? First, note that it takes values from -1 to 1; a negative number indicates an inverse relationship, a positive one a direct relationship. The larger the absolute value of the result, the more strongly the quantities affect each other. A value of zero indicates the absence of a linear relationship, an absolute value below 0.5 indicates a weak relationship, and a larger value a pronounced one.
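To make this interpretation concrete, here is a minimal sketch in Python. The temperature and sales figures are made-up toy numbers for illustration only, not the real data mentioned above.

```python
# A minimal sketch of computing and interpreting a pair correlation coefficient.
# The arrays below are made-up toy data, not the actual beer-sales figures.
import numpy as np

temperature = np.array([12.0, 15.0, 18.0, 21.0, 25.0, 28.0, 30.0])  # air temperature, degrees C
beer_sales = np.array([110, 130, 150, 170, 200, 230, 240])          # units sold

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal
# element is the pair correlation coefficient of the two series.
r = np.corrcoef(temperature, beer_sales)[0, 1]

# Interpretation follows the rules described above.
strength = "weak" if abs(r) < 0.5 else "pronounced"
direction = "direct" if r > 0 else "inverse"
print(f"r = {r:.4f}: {strength} {direction} relationship")
```

For these toy numbers the coefficient comes out close to 1, i.e. a pronounced direct relationship, just as one would expect for hot weather and beer sales.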
Pearson correlation
Depending on the scale on which the variables are measured, different indicators are used for the calculation (the Fechner, Spearman, or Kendall coefficient, etc.). For interval-scale values, the most commonly used indicator is the one invented by Karl Pearson.
This coefficient measures the degree of linear relationship between two parameters, and it is usually what is meant when people speak of correlation. The indicator has become so popular that its formula is built into Excel, so you can try out in practice what correlation is without going into the intricacies of complex formulas. The syntax of the function is PEARSON(array1, array2), where the two arguments are usually the corresponding ranges of numbers.
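For readers who want to see what the calculation does without relying on Excel, here is a small sketch of the textbook Pearson formula in Python. The function name and the two sample series (advertising spend and GDP, echoing the earlier example) are illustrative assumptions; on the same two ranges the result matches Excel's PEARSON.

```python
# A sketch of the classic Pearson formula:
#   r = sum((x - mean_x) * (y - mean_y))
#       / sqrt(sum((x - mean_x)**2) * sum((y - mean_y)**2))
# The input lists are illustrative toy numbers only.
from math import sqrt

def pearson(array1, array2):
    """Pair correlation coefficient, analogous to Excel's PEARSON(array1, array2)."""
    if len(array1) != len(array2):
        raise ValueError("Both arrays must have the same length")
    n = len(array1)
    mean_x = sum(array1) / n
    mean_y = sum(array2) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(array1, array2))
    var_x = sum((x - mean_x) ** 2 for x in array1)
    var_y = sum((y - mean_y) ** 2 for y in array2)
    return cov / sqrt(var_x * var_y)

# Example: two made-up series with a strong direct relationship.
ad_spend = [1.0, 1.4, 1.9, 2.3, 3.0]
gdp = [10.2, 10.9, 12.1, 12.8, 14.0]
print(round(pearson(ad_spend, gdp), 4))  # close to 1: a strong direct relationship
```

The denominator is never negative, and the whole expression simply scales the covariance of the two series so that the result always falls between -1 and 1, which is why the interpretation rules from the previous section apply.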