Probability theory works with random variables. Every random variable has a so-called distribution law, which describes it completely. In practice, however, it is often very difficult to establish the distribution law of real data right away, so one settles for a certain set of numerical characteristics instead. For example, calculating the mean and the variance of a random variable is often very useful.
Why it is needed
While the mathematical expectation is close in meaning to the average value of a quantity, the variance tells how the values of the quantity are scattered around that expectation. For example, if we measured IQ in a group of people and want to study the measurement results (the sample), the mathematical expectation shows the approximate average IQ of the group, and the sample variance shows how the results are grouped around it: bunched close to it (small spread of IQ) or spread more evenly over the whole range from the minimum to the maximum result (large spread, with the mathematical expectation somewhere in the middle).
To calculate the variance, we need one more characteristic of a random variable: the deviation of its value from the mathematical expectation.
Deviation
To understand how to calculate the variance, we first need to deal with the deviation. It is defined as the difference between the value that a random variable takes and its mathematical expectation. Roughly speaking, to understand how a quantity is "scattered", we look at how its deviation is distributed. That is, we replace each value of the quantity with its deviation from the mathematical expectation and study the distribution law of the deviations.
The distribution law of a discrete random variable (one that takes separate, individual values) is written as a table in which each value is paired with its probability. In the distribution law of the deviations, each value is replaced by the expression for its deviation, that is, the value minus the mathematical expectation, while its probability stays the same.
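In general form, the two tables described above can be sketched as follows. The notation is an assumption made for illustration: x_i are the values, p_i their probabilities, and E[X] the mathematical expectation, computed as the sum of the products x_i p_i.

```latex
\[
E[X] = \sum_{i=1}^{n} x_i \, p_i
\]
\[
\begin{array}{c|cccc}
X & x_1 & x_2 & \dots & x_n \\ \hline
p & p_1 & p_2 & \dots & p_n
\end{array}
\qquad\longrightarrow\qquad
\begin{array}{c|cccc}
X - E[X] & x_1 - E[X] & x_2 - E[X] & \dots & x_n - E[X] \\ \hline
p & p_1 & p_2 & \dots & p_n
\end{array}
\]
```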
Properties of the law of distribution of deviations of a random variable
We have written the distribution law of the deviations of a random variable. So far, the only characteristic we can extract from it is the mathematical expectation. A numerical example makes this easier to follow.
Let there be a distribution law of some random variable: X is the value, p is the probability.
We calculate the mathematical expectation by the formula and then, right away, the deviations.
We draw up a new table: the distribution law of the deviations.
We calculate the mathematical expectation of this new distribution.
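These steps can be retraced with a minimal sketch. The distribution below is hypothetical (its values and probabilities are made up for illustration and are not the article's own example), and the helper name `expectation` is likewise an assumption. Exact fractions are used so that no rounding error obscures the result.

```python
from fractions import Fraction

# Hypothetical distribution law (illustrative values, not from the article):
# value x -> probability p. The probabilities must sum to 1.
law = {1: Fraction(1, 2), 2: Fraction(1, 4), 4: Fraction(1, 4)}
assert sum(law.values()) == 1


def expectation(table):
    """Mathematical expectation: sum of value * probability."""
    return sum(x * p for x, p in table.items())


mean = expectation(law)

# Distribution law of the deviations: each value x is replaced by x - mean,
# while the probabilities stay the same.
deviation_law = {x - mean: p for x, p in law.items()}

mean_of_deviations = expectation(deviation_law)
print(mean, mean_of_deviations)  # 2 0
```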
The result is zero. This is only one example, but it always holds, and the general case is not difficult to prove. The mathematical expectation of the deviation splits into the difference between the mathematical expectation of the random variable and, awkward as it may sound, the mathematical expectation of the mathematical expectation. The expectation is a constant, so its own expectation is the same number; the two terms are equal, and their difference is zero.
This is to be expected: deviations can be both positive and negative, so on average they cancel out to zero.
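Written out, the argument is one line. The notation E[X] for the mathematical expectation is an assumption made here; the steps use the linearity of expectation and the fact that the expectation of a constant is that constant.

```latex
\[
E\bigl[X - E[X]\bigr] = E[X] - E\bigl[E[X]\bigr] = E[X] - E[X] = 0
\]
```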
How to calculate the variance of a discrete random variable
Since calculating the mathematical expectation of the deviation is pointless, we need to look for something else. One could simply take the absolute values of the deviations, but absolute values are not so simple to work with, so instead the deviations are squared and the mathematical expectation of the squares is taken. This is exactly what is meant by calculating the variance.
That is, we take the deviations, square them, and compile a table of the squared deviations and the probabilities of the corresponding values of the random variable. This is a new distribution law. To calculate its mathematical expectation, add up the products of each squared deviation and its probability.
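The procedure can be sketched in a few lines. The distribution is hypothetical (made-up values for illustration), and `expectation` is an illustrative helper, not a library function.

```python
from fractions import Fraction

# Hypothetical distribution law (illustrative values, not from the article).
law = {1: Fraction(1, 2), 2: Fraction(1, 4), 4: Fraction(1, 4)}


def expectation(pairs):
    """Sum of value * probability over (value, probability) pairs."""
    return sum(x * p for x, p in pairs)


mean = expectation(law.items())

# Table of squared deviations paired with the original probabilities.
# A list of pairs is used rather than a dict, because two different values
# can share the same squared deviation.
squared_deviation_table = [((x - mean) ** 2, p) for x, p in law.items()]

variance = expectation(squared_deviation_table)
print(variance)  # 3/2
```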
Simpler formula
However, the article began by noting that the distribution law of the original random variable is often unknown, so something simpler is needed. Indeed, there is another formula that allows one to calculate the variance of a sample using only mathematical expectations:
The variance is the difference between the mathematical expectation of the square of a random variable and the square of its mathematical expectation.
This formula can be proved, but the proof is omitted here, since it has no practical value (we only need to calculate the variance).
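In symbols, with E[X] for the mathematical expectation and Var(X) for the variance (notation assumed here for illustration), the statement reads:

```latex
\[
\mathrm{Var}(X) = E\bigl[X^2\bigr] - \bigl(E[X]\bigr)^2
\]
```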
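Even without a proof, the formula can be checked numerically. The sketch below uses a hypothetical distribution (made-up values for illustration) and compares the variance computed through squared deviations with the variance computed from the two expectations.

```python
from fractions import Fraction

# Hypothetical distribution law (illustrative values, not from the article).
law = {1: Fraction(1, 2), 2: Fraction(1, 4), 4: Fraction(1, 4)}

mean = sum(x * p for x, p in law.items())                  # E[X]
mean_of_squares = sum(x ** 2 * p for x, p in law.items())  # E[X^2]

# Variance through squared deviations from the expectation.
by_definition = sum((x - mean) ** 2 * p for x, p in law.items())

# Variance as the expectation of the square minus the square of the expectation.
by_shortcut = mean_of_squares - mean ** 2

print(by_definition, by_shortcut)  # 3/2 3/2
```

This is a check on a single example, not a proof.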
How to calculate the variance of a random variable from a variational series
In real statistics, it is impossible to record every value of a random variable (because, roughly speaking, there are usually infinitely many of them). What is studied instead is a so-called representative sample from some general population. Since the numerical characteristics of a random variable from such a population are calculated from the sample, they are called sample characteristics: the sample mean and, correspondingly, the sample variance. The sample variance can be calculated in the same way as the ordinary one (through the squared deviations).
However, a variance calculated this way is called biased. The formula for the unbiased variance looks a little different: the sum of squared deviations is divided by n - 1 rather than by the sample size n. It is usually the unbiased variance that has to be calculated.
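The difference between the two estimates can be seen in a short sketch. The sample is hypothetical (made-up numbers for illustration). The biased estimate divides the sum of squared deviations by n, the unbiased one by n - 1; Python's standard `statistics` module offers both as `pvariance` and `variance`.

```python
import statistics

# Hypothetical sample (illustrative numbers, not from the article).
sample = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(sample)
sample_mean = sum(sample) / n

# Sum of squared deviations from the sample mean.
ss = sum((x - sample_mean) ** 2 for x in sample)

biased = ss / n          # divides by n
unbiased = ss / (n - 1)  # divides by n - 1

# Cross-check against the standard library's estimators.
assert abs(biased - statistics.pvariance(sample)) < 1e-12
assert abs(unbiased - statistics.variance(sample)) < 1e-12

print(biased, unbiased)
```

The unbiased value is always slightly larger, and the gap shrinks as the sample grows.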
Small addition
One more numerical characteristic is associated with the variance: the standard deviation. It also serves to evaluate how a random variable is scattered around its mathematical expectation. There is no big difference between calculating the variance and calculating the standard deviation: the latter is the square root of the former.
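As a minimal sketch on a hypothetical sample (made-up numbers for illustration), the standard deviation is obtained by taking the square root of the variance. The biased variant is used here; `statistics.pstdev` is the standard library's counterpart.

```python
import math
import statistics

# Hypothetical sample (illustrative numbers, not from the article).
sample = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(sample)  # biased variance, divides by n
std_dev = math.sqrt(variance)            # standard deviation

# The standard library computes the same quantity directly.
assert math.isclose(std_dev, statistics.pstdev(sample))

print(std_dev)  # 2.0
```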