Covariance Dictionary Definitions

The term covariance is not part of the dictionary prepared by the Royal Spanish Academy (RAE). The concept, however, is used in statistics and probability to name the value that reflects the degree to which two random variables vary jointly, with the variation of each one measured from its mean.

The covariance therefore makes it possible to discover whether the variables maintain a dependency link. It also helps in determining other parameters.

A random variable is a function that assigns a value, usually numerical, to the result of a random experiment. A random experiment, in turn, is one that can yield different results even when it is repeated under the same conditions, so that the outcome of each trial is impossible to predict and therefore impossible to reproduce at will.

A very common example of a random experiment, which we can try in our daily lives, is rolling a die: even if it is thrown on the same surface, with the same hand or cup, and with more or less the same force and direction, it is not possible to predict which of its faces will end up pointing up.

If the low values of one variable correspond to the low values of another variable, and the high values of one to the high values of the other, the covariance is positive and is described as direct. If, on the other hand, the low values of one variable correspond to the high values of the other and vice versa, the covariance is negative and is described as inverse. The sign of the covariance thus expresses the trend of the linear relationship between the variables.

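There are different formulas for calculating the covariance. It can be defined as the arithmetic mean of the products of the deviations of the variables from their respective means. An equivalent formula, which is often quicker to apply by hand, is the mean of the products of the paired values minus the product of the two means.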
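
In symbols, the two forms can be written as follows. The notation is an assumption, since the text gives no formula: n is the number of paired observations (x_i, y_i), and x̄ and ȳ are the means of the two variables.

```latex
\operatorname{Cov}(X, Y)
  = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
  = \frac{1}{n} \sum_{i=1}^{n} x_i y_i \; - \; \bar{x}\,\bar{y}
```

The first expression is the mean of the products of the deviations; the second is the mean of the products minus the product of the means.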

Suppose the variables are the results of the History and Geography assessments of five students:

History (P) grades for all five students: 6, 5, 7, 7, 4 (total = 29)
Geography (S) grades for all five students: 7, 3, 4, 3, 5 (total = 22)

Next, multiply the two grades of each student and add up the products:

P x S: 42 (since 6 x 7 = 42), 15 (5 x 3), 28 (7 x 4), 21 (7 x 3), 20 (4 x 5) (total = 126)

The mean of P: 29 / 5 = 5.8
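The mean of S: 22 / 5 = 4.4

PS Covariance: (126 / 5) - (5.8 x 4.4)
PS Covariance: 25.2 - (5.8 x 4.4)
PS Covariance: 25.2 - 25.52
PS Covariance: -0.32

Because the result is negative, the covariance between the History and Geography grades is inverse.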
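
The arithmetic of this example can be checked with a few lines of Python. This is a minimal sketch: the names `history`, `geography`, `mean`, `covariance` and `covariance_from_deviations` are illustrative choices, not taken from the text, and the divisor is the number of students, as in the calculation above.

```python
# Grades of the five students, copied from the example.
history = [6, 5, 7, 7, 4]      # P
geography = [7, 3, 4, 3, 5]    # S


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def covariance(xs, ys):
    """Mean of the products minus the product of the means (divisor n)."""
    if len(xs) != len(ys):
        raise ValueError("both variables need the same number of observations")
    products = [x * y for x, y in zip(xs, ys)]
    return mean(products) - mean(xs) * mean(ys)


def covariance_from_deviations(xs, ys):
    """Same quantity, computed as the mean product of the deviations."""
    mean_x, mean_y = mean(xs), mean(ys)
    return mean([(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)])


cov_ps = covariance(history, geography)
print(round(cov_ps, 2))                                          # -0.32
print(round(covariance_from_deviations(history, geography), 2))  # -0.32
```

Both functions return the same value up to floating-point rounding, which is why the results are rounded before printing.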

In addition to indicating whether two given random variables depend on each other, the covariance is used to estimate parameters such as the regression line and the linear correlation coefficient.

The regression line, also known as a linear fit or linear regression, is a statistical concept: a mathematical model used to approximate the relationship of dependence between a dependent variable, one or more independent variables, and a random term.

The linear correlation coefficient, on the other hand, is an indicator of the direction and strength of the linear relationship and proportionality between two statistical variables. Here a linear relationship is one in which the value of one quantity depends on the value of another, proportionality is a constant ratio between magnitudes that can be measured, and statistical variables are characteristics that can fluctuate and whose values can be observed and measured.
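
How the covariance feeds into both quantities can be sketched in Python with the grades from the History and Geography example. The sketch assumes the standard textbook relations, which the text does not spell out: the correlation coefficient is the covariance divided by the product of the two standard deviations, and the slope of the regression line of S on P is the covariance divided by the variance of P. Treating History as the independent variable is also an assumption, and all function and variable names are illustrative.

```python
import math

history = [6, 5, 7, 7, 4]      # P, treated here as the independent variable
geography = [7, 3, 4, 3, 5]    # S, treated here as the dependent variable


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def covariance(xs, ys):
    """Population covariance (divisor n), as in the worked example."""
    mean_x, mean_y = mean(xs), mean(ys)
    return mean([(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)])


def correlation(xs, ys):
    """Linear correlation coefficient: cov(X, Y) / (sd(X) * sd(Y)).

    Not defined when either variable is constant (zero variance).
    """
    sd_x = math.sqrt(covariance(xs, xs))  # cov(X, X) is the variance of X
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)


def regression_line(xs, ys):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    slope = covariance(xs, ys) / covariance(xs, xs)
    intercept = mean(ys) - slope * mean(xs)
    return slope, intercept


r = correlation(history, geography)
slope, intercept = regression_line(history, geography)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
```

For these grades the coefficient comes out at roughly -0.18, which points to a weak inverse linear relationship, in line with the small negative covariance.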

It is important to differentiate two types of covariance: the covariance between two random variables, which is considered a property of their joint distribution, that is, of the events of both that occur simultaneously; and the sample covariance, which is used as a statistical estimate of that parameter.
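
In practice the distinction often shows up as a choice of divisor. The sketch below assumes the common convention of dividing by n - 1 when the data are treated as a sample; the text does not say which estimator it has in mind. The `sample` flag and the function name are hypothetical.

```python
def covariance(xs, ys, sample=False):
    """Covariance of paired data.

    sample=False divides by n (data treated as the whole population);
    sample=True divides by n - 1 (data treated as a sample; assumed convention).
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("both variables need the same number of observations")
    if n < 2:
        raise ValueError("at least two observations are required")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of the deviations from each mean.
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - 1 if sample else n)


# Grades from the History and Geography example.
history = [6, 5, 7, 7, 4]
geography = [7, 3, 4, 3, 5]

population_cov = covariance(history, geography)
sample_cov = covariance(history, geography, sample=True)
print(round(population_cov, 2))  # -0.32
print(round(sample_cov, 2))      # -0.4
```

With only five observations the two values differ noticeably; the gap shrinks as the number of observations grows.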