Distribution functions of a random variable. How to find the distribution function of a random variable

To find the distribution function of a random variable, or of a function of a random variable, one needs to know the basic tools of this area. There are several methods for obtaining the quantities in question, including the change-of-variable technique and moment generating functions. A distribution is a richer concept than summaries such as the variance or standard deviation, which characterize only the degree of spread of the values.

Functions of random variables

Among the most important are functions of independent, identically distributed random variables. For example, suppose X1 is the weight of a randomly selected individual from the male population, X2 is the weight of another, ..., and Xn is the weight of an n-th person from the same population, and we want to know how their sum or average is distributed. The classical theorem called the central limit theorem applies here: it shows that for large n the standardized sum approximately follows the standard normal distribution.
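
As a quick illustration of the theorem, the following R sketch draws repeated samples of n = 100 weights and plots their means against a normal curve. The exponential model and the mean of 80 are invented here purely for illustration.

    # Sketch: sample means of n = 100 simulated weights are close to normal.
    # The exponential model with mean 80 is an assumption for illustration only.
    set.seed(1)
    means <- replicate(10000, mean(rexp(100, rate = 1/80)))
    hist(means, breaks = 50, freq = FALSE, main = "Sample means")
    curve(dnorm(x, mean = 80, sd = 80/sqrt(100)), add = TRUE, col = "red")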

Functions of One Random Variable

The central limit theorem is also used to approximate discrete distributions such as the binomial and the Poisson. Distribution functions of random variables are first studied for simple functions of one variable. For example, let X be a continuous random variable with a known probability density, and let Y = u(X). We study how to find the density of Y using two different approaches: the distribution function method and the change-of-variable method. At first only one-to-one transformations are considered; the change-of-variable technique is then modified to handle transformations that are not one-to-one. Finally, we see how the inverse cumulative distribution function can be used to generate random numbers that follow a given distribution.

The distribution function method

The distribution function method is used to find the density of a transformed random variable. With this method, the cumulative distribution function of Y is computed first; differentiating it then yields the probability density. A few examples illustrate the approach. Let X be a continuous random variable with a given probability density.

What is the probability density function of Y = X²? Plotting y = x² shows that on the relevant range the transformation is increasing in X, with 0 < y < 1 when X takes values in (0, 1). Applying the method, first find the cumulative distribution function: F_Y(y) = P(X² ≤ y) = P(X ≤ √y) = F_X(√y); then differentiate to get the probability density, f_Y(y) = f_X(√y)/(2√y) for 0 < y < 1. The distribution function method thus works whenever Y is an increasing function of X, and the resulting f_Y(y) integrates to 1 over y.
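
Since the original figure is lost, here is a minimal R check, assuming X uniform on (0, 1) so that F_Y(y) = √y and f_Y(y) = 1/(2√y); the result can be confirmed by simulation:

    # Distribution function method for Y = X^2, assuming X ~ Uniform(0, 1).
    # Then F_Y(y) = P(X <= sqrt(y)) = sqrt(y) and f_Y(y) = 1/(2*sqrt(y)).
    set.seed(2)
    y <- runif(100000)^2
    mean(y <= 0.25)                     # empirical F_Y(0.25); theory: sqrt(0.25) = 0.5
    hist(y, breaks = 100, freq = FALSE)
    curve(1/(2*sqrt(x)), add = TRUE, col = "red")   # theoretical density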

In the last example, care was taken to subscript the cumulative distribution functions and probability densities with X or Y to indicate which random variable they belong to. When finding the cumulative distribution function of Y, we expressed it through that of X; differentiating then yields the density of Y.

Variable Change Technique

Let X be a continuous random variable with density f_X(x), and let Y = u(X) be an increasing function of X with inverse X = v(Y). We now obtain the distribution function of the continuous random variable Y: F_Y(y) = P(Y ≤ y) = P(u(X) ≤ y) = P(X ≤ v(y)) = F_X(v(y)). The first and second equalities come from the definition of the cumulative distribution function of Y; the third holds because, for an increasing u, the outcomes for which u(X) ≤ y are exactly those for which X ≤ v(y); and the last follows from the definition of probability for the continuous random variable X. Taking the derivative of F_Y(y), the cumulative distribution function of Y, then yields the probability density of Y: f_Y(y) = f_X(v(y)) · v′(y).
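
A small numerical check of the formula, with the illustrative choice Y = exp(X) for standard normal X (so v(y) = log y and v′(y) = 1/y), compares it with R's built-in log-normal density:

    # Change of variable: f_Y(y) = f_X(v(y)) * v'(y) for Y = exp(X), X ~ N(0, 1).
    # Here v(y) = log(y), v'(y) = 1/y, so f_Y(y) = dnorm(log(y)) / y = dlnorm(y).
    y <- seq(0.1, 5, by = 0.1)
    max(abs(dnorm(log(y)) / y - dlnorm(y)))   # numerically zero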

Distribution function of a continuous random variable

Generalization to decreasing functions

Let X be a continuous random variable with density f(x) defined over c1 < x < c2, and let Y = u(X) be a decreasing function of X with inverse X = v(Y). Since the function is continuous and decreasing, the inverse function X = v(Y) exists, and now u(X) ≤ y exactly when X ≥ v(y), so F_Y(y) = 1 − F_X(v(y)) and f_Y(y) = −f_X(v(y)) · v′(y).

To work with real observations, one can collect quantitative data and use the empirical cumulative distribution function, together with sample summaries such as means, standard deviations, medians, and so on.
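
In R the empirical cumulative distribution function is available directly through ecdf; the data below are invented for illustration:

    # Empirical CDF and simple summaries of a small (made-up) sample.
    data <- c(4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4)
    Fn <- ecdf(data)          # step function: share of observations <= x
    Fn(5.0)                   # empirical F(5.0)
    plot(Fn)
    mean(data); sd(data); median(data)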

Similarly, even a fairly simple probabilistic model can have an enormous number of outcomes. For example, flipping a coin 332 times yields 2^332 possible outcome sequences, a number on the order of a googol (10^100) and close to 100 quintillion times the commonly cited count of elementary particles in the known universe. An analysis that assigns an answer to every individual outcome is not interesting; simpler summaries are needed, such as the number of heads or the longest run of tails. To focus on the quantities of interest, the following definition is adopted: a random variable is a real-valued function on a probability space.
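
The count itself is easy to verify:

    # Number of outcome sequences for 332 coin flips.
    2^332            # about 8.75e99, on the order of a googol (1e100)
    2^332 / 1e80     # vs. a commonly cited ~1e80 particles: roughly 9e19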

The range S of a random variable is sometimes called the state space. If X is a random variable, then so are X², exp(αX), X² + 1, tan² X, ⌊X⌋, and so on. The last of these, rounding X down to the nearest integer, is called the floor function.

Distribution functions

Once a random variable X of interest is defined, the question usually becomes: what are the chances that X falls into some subset of values B? For example, B = {odd numbers}, B = {greater than 1}, or B = {between 2 and 7}. We write {X ∈ B} for the set of outcomes whose value of X lies in the subset B. Thus, in the above example, the events can be described as follows.

{X is an odd number}, {X is greater than 1} = {X > 1}, and {X is between 2 and 7} = {2 < X < 7} correspond to the three choices of B above. Many properties of random variables are not tied to a particular X; they depend only on how X distributes its values. This leads to the following definition: the distribution function of a random variable X is the cumulative function F_X(x) = P(X ≤ x).

Discrete Random Variable Distribution Function

Random variables and distribution functions

Thus, the probability that a random variable X takes values in an interval can be computed by subtraction: P(a < X ≤ b) = F_X(b) − F_X(a). Care is needed about whether the endpoints of the interval are included or excluded.

A random variable is called discrete if it has a finite or countably infinite state space. For example, let X be the number of heads in three independent flips of a biased coin that comes up heads with probability p; we can then find the cumulative distribution function F_X of the discrete random variable X. Similarly, if X is the number of spades in a hand of three cards, the distribution of Y = X³ can be expressed through F_X. In each case F_X starts at 0, ends at 1, and does not decrease as x increases. The cumulative distribution function F_X of a discrete random variable is constant except for jumps, and at each jump F_X is right-continuous. The right-continuity of the distribution function can be proved from the continuity property of probability using the definition. A continuous random variable, by contrast, has a cumulative distribution function F_X that is differentiable.
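
For the coin example, the mass function and the cumulative distribution function can be tabulated in R; p = 0.6 is an illustrative value:

    # X = number of heads in 3 flips of a biased coin (heads probability p).
    p <- 0.6
    x <- 0:3
    mass <- dbinom(x, size = 3, prob = p)   # P(X = x)
    cbind(x, mass, FX = cumsum(mass))       # F_X jumps at 0, 1, 2, 3; flat between
    pbinom(2, size = 3, prob = p)           # same as F_X(2)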

To show how this can happen, consider an example: a dartboard of unit radius, with the dart assumed to land uniformly over its area. The distance X from the center then has distribution function F_X(x) = x² for 0 ≤ x ≤ 1. Another example is the exponential distribution, F_X(x) = 1 − exp(−λx) for some λ > 0. In both cases the distribution functions of these continuous random variables increase smoothly, and F_X has all the properties of a distribution function.

A man waits for a bus at a bus stop until it arrives, having decided that he will give up once the wait reaches 20 minutes. Here we must find the cumulative distribution function of T, the time the person spends at the bus stop whether or not he eventually leaves. Although the cumulative distribution function is defined for every random variable, other characterizations are often used instead: the mass function for a discrete variable and the probability density function for a continuous one. Typically, a random variable is described through one of these two.
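
A hedged sketch of the bus-stop variable, assuming (purely as a model choice) an exponential waiting time W with mean 10 minutes, so that T = min(W, 20):

    # T = min(W, 20): F_T(t) = 1 - exp(-lambda*t) for t < 20, jumping to 1 at t = 20.
    # The exponential model and the 10-minute mean are assumptions for illustration.
    lambda <- 1/10
    set.seed(3)
    t_wait <- pmin(rexp(10000, lambda), 20)
    mean(t_wait <= 15); pexp(15, lambda)      # empirical vs. theoretical F_T(15)
    mean(t_wait == 20); 1 - pexp(20, lambda)  # size of the jump at t = 20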

Find the distribution function of a random variable

Mass Functions

Mass functions have the following general properties. The first is that the probabilities are non-negative. The second follows from the observation that the events {X = x} for all x ∈ S, the state space of X, form a partition of the probability space, so the values of the mass function sum to 1. Example: tosses of a biased coin whose outcomes are independent, continued until the first head appears. Let X denote the random variable giving the number of tails before the first head, and let p denote the probability of heads on any given toss.

The probability mass function then has the form f_X(x) = p(1 − p)^x for x = 0, 1, 2, .... Since the terms form a geometric sequence, X is called a geometric random variable. A geometric progression c, cr, cr², ..., crⁿ has partial sums s_n = c(1 − r^(n+1))/(1 − r), and s_n has a limit as n → ∞ whenever |r| < 1. In that case the infinite sum is this limit, c/(1 − r).

The mass function above forms a geometric sequence with ratio 1 − p. Consequently, for natural numbers a and b, the difference of the distribution function values equals the mass function summed over the intervening values: F_X(b) − F_X(a) = f_X(a + 1) + ... + f_X(b).
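
R's dgeom and pgeom use exactly this convention (failures before the first success), so the identities can be checked directly; p = 0.3 is illustrative:

    # Geometric random variable: f_X(x) = p * (1 - p)^x, x = 0, 1, 2, ...
    p <- 0.3
    x <- 0:10
    max(abs(dgeom(x, p) - (1 - p)^x * p))   # direct formula matches dgeom
    sum(dgeom(0:200, p))                    # partial sums approach 1
    pgeom(7, p) - pgeom(3, p)               # F_X(7) - F_X(3) ...
    sum(dgeom(4:7, p))                      # ... equals the mass summed over 4..7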

Density functions are defined as follows: X is a random variable whose distribution function F_X has a derivative f_X satisfying F_X(x) = ∫ f_X(t) dt taken from −∞ to x; the function f_X is called the probability density function, and X is called a continuous random variable. By the fundamental theorem of calculus, the density function is the derivative of the distribution function, and probabilities can be computed by evaluating definite integrals.
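
For instance, with the exponential density built into R (rate 1 by default), the integral of the density reproduces the difference of distribution function values:

    # P(1 < X <= 3) as an integral of the density and as F_X(3) - F_X(1).
    integrate(dexp, lower = 1, upper = 3)   # numerical integral of f_X
    pexp(3) - pexp(1)                       # same value, about 0.318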

Since data are collected as several observations at once, more than one random variable at a time must be considered to model an experimental procedure. The joint distribution of two variables X1 and X2 assigns probabilities to events involving both of them. For discrete random variables, joint probability mass functions are defined; for continuous ones, a joint probability density function f_{X1,X2} plays the same role, with probabilities of joint events obtained by integrating it.

Independent Random Variables

Two random variables X1 and X2 are independent if any two events associated with them are independent. In words, the probability that the two events {X1 ∈ B1} and {X2 ∈ B2} occur simultaneously equals the product of the probabilities that each of them occurs individually. For independent discrete random variables, the joint probability mass function is the product of the marginal mass functions; for independent continuous random variables, the joint probability density function is the product of the marginal densities. In conclusion, consider n independent observations x1, x2, ..., xn arising from an unknown density or mass function f, for example the unknown parameter of an exponential random variable describing the waiting time for a bus.
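
A minimal sketch of this estimation problem: for exponential data, the maximum likelihood estimate of the unknown rate is the reciprocal of the sample mean. The true rate of 1/10 is invented here only to generate the simulated data:

    # Estimating the unknown rate of an exponential waiting time from n observations.
    set.seed(4)
    true_rate <- 1/10                # assumed only to generate the fake data
    x <- rexp(500, true_rate)        # simulated bus waiting times
    1 / mean(x)                      # MLE of the rate; close to 0.1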

The random variable is given by the distribution function

Simulating random variables

The main goal of this theoretical field is to provide the tools needed to develop inferential procedures based on sound principles of statistical science. One very important application of software here is the ability to generate pseudo-random data that simulate real information. This makes it possible to test and refine analysis methods before they must be used on real databases, and to investigate the properties of data through modeling. For many commonly used families of random variables, R provides commands that generate them; for other situations, methods for simulating a sequence of independent random variables with a common distribution are needed.

Discrete random variables and the sample command. The sample command is used to create simple and stratified random samples. Given a sequence x, sample(x, 40) selects 40 records from x in such a way that all subsets of size 40 have the same probability; by default, R samples without replacement. The command can also be used to simulate discrete random variables: supply the state space in a vector x and the mass function in a vector f. The argument replace = TRUE indicates that sampling occurs with replacement. Then sample(x, n, replace = TRUE, prob = f) gives a sample of n independent random variables having the common mass function f.

Suppose 1 is the smallest value represented and 4 the largest. If prob = f is omitted, sample chooses uniformly from the values in the vector x. To check the simulation against the mass function that generated the data, one can use the double equality sign, ==, and count the observations taking each possible value of x, arranging the counts in a table. Repeating this for 1000 draws, compare the simulation with the corresponding mass function, as in the sketch below.
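
A sketch with an invented mass function on the values 1 through 4:

    # Simulating a discrete random variable with sample(); f is illustrative.
    x <- 1:4
    f <- c(0.1, 0.2, 0.3, 0.4)
    set.seed(5)
    sim <- sample(x, 1000, replace = TRUE, prob = f)
    table(sim) / 1000      # relative frequencies, compare with f
    mean(sim == 4)         # counting with the double equal sign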

Probability Transformation Illustration

First, simulate uniform random variables u1, u2, ..., un on the interval [0, 1]. About 10% of the numbers should fall in the range [0.3, 0.4]; this corresponds to 10% of the simulations landing in the interval [0.28, 0.38] for a random variable with the distribution function F_X shown in the figure. Similarly, about 10% of the random numbers should fall in [0.7, 0.8], corresponding to 10% of the simulations landing in the interval [0.96, 1.51] for a random variable with distribution function F_X. These values on the x axis are obtained by taking the inverse of F_X. If X is a continuous random variable with density f_X positive everywhere on its domain, then the distribution function is strictly increasing, and F_X has an inverse F_X⁻¹ known as the quantile function: F_X(x) ≥ u exactly when x ≥ F_X⁻¹(u). The probability transformation follows from analyzing the random variable U = F_X(X).

Probability distribution function of a random variable

F_X has a range from 0 to 1; it cannot take values below 0 or above 1, and for values of u between 0 and 1 we have P(U ≤ u) = u. So if U can be simulated, a random variable with distribution function F_X can be simulated through the quantile function, X = F_X⁻¹(U). Taking the derivative shows that the density of U is constant, equal to 1, over the interval of its possible values; since the random variable U has a constant density there, it is called uniform on the interval [0, 1]. It is simulated in R using the runif command. This identity is called the probability transformation.

You can see how it works in the dartboard example. For x between 0 and 1 the distribution function is u = F_X(x) = x², and therefore the quantile function is x = F_X⁻¹(u) = √u. Independent observations of the distance from the center of the dartboard can thus be simulated by generating uniform random variables U1, U2, ..., Un and taking square roots; the distribution function and the empirical distribution function can then be compared, for instance on 100 simulations of dartboard distances. For an exponential random variable, u = F_X(x) = 1 − exp(−λx), and therefore x = −(1/λ) ln(1 − u).

Sometimes an argument consists of chains of equivalent statements, and the two directions of the argument must be combined. A similar identity holds for intersections over all i. When the union of events C_i equals the state space S and the events are pairwise mutually exclusive, the three probability axioms can be verified for the corresponding probability P on any subset; such identities are also used to make sure that an answer does not depend on whether the endpoints of an interval are included.
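
Both transformations are easy to carry out in R with runif; λ = 2 below is an illustrative rate:

    # Inverse (probability) transformation: X = F_X^{-1}(U), U ~ Uniform[0, 1].
    set.seed(6)
    u <- runif(100)
    darts <- sqrt(u)                        # dartboard distances: F_X(x) = x^2
    plot(ecdf(darts)); curve(x^2, 0, 1, add = TRUE, col = "red")
    lambda <- 2                             # illustrative rate
    expo <- -log(1 - runif(100)) / lambda   # exponential via x = -(1/lambda)*log(1-u)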

The law of distribution of a function of a random variable

Exponential function and its variables

For each outcome, the events considered ultimately rely on the continuity property of probability, which follows from the axioms. The law of distribution of a function of a random variable thus shows that each such problem has its own solution and answer.

Source: https://habr.com/ru/post/E18342/

