Multivariate analysis: types, examples, methods of analysis, purpose and results

Analysis of variance multivariate analysis is a combination of various statistical methods that are designed to test hypotheses and the relationship between the studied factors and certain signs that do not have a quantitative description. Also, a similar technique allows you to determine the degree of interaction of factors and their influence on certain processes. All of these definitions sound rather confusing, so let's look into them in more detail in our article.

Criteria and types of analysis of variance

The method of analysis of variance multivariate analysis is most often used to find the relationship between a continuous quantitative variable and nominal qualitative characteristics. In fact, this technique is a test of various hypotheses about the equality of different arithmetic samples. Thus, it can also be considered as a criterion for comparing several samples. However, the results will be identical if only two elements are used for comparison. The study of the t-criterion shows that a similar technique allows us to study the problem of hypotheses in more detail than any other known method.

Also, one cannot fail to note the fact that some types of analysis of variance are based on a certain law: the sum of squares of intergroup deviations and the sum of squares of intra-group deviations are absolutely equal. As a study, the Fisher criterion is used, which is used for a detailed analysis of intragroup variances. Although this requires prerequisites for the normality of the distribution, as well as the homoskedasticity of the samples, the equality of variances. As for the type of analysis of variance, the following are distinguished:

  • multivariate or multivariate analysis;
  • one-factor or one-dimensional analysis.

It is not difficult to guess that the second one considers the dependence of one attribute and the quantity being studied, and the first one is based on the analysis of several attributes at once. In addition, multifactorial dispersion does not allow one to reveal a stronger relationship between several elements, since the dependence of several quantities is studied at once (although it is much simpler to carry out the method).

Factors

Thinking about the methods of multivariate correlation analysis? Then you should know that for a detailed study, you should study those factors that control the circumstances of the experiment and affect the final result. Also, factors can mean methods and levels of processing values ​​that characterize a particular manifestation of a particular condition. In this case, the numbers are given in an ordinal or nominal measurement system. If a problem arises with the data grouping, you have to resort to the use of the same numerical values, which slightly changes the final result.

Analysis of the dependence of factors and consequences.

It should also be understood that the number of observations and groups cannot be excessively large, because this leads to an excess of data and the inability to complete the calculation. At the same time, the method of grouping depends not only on the volume, but also on the nature of the variation of certain values. The size and number of intervals in the analysis can be determined by the principle of equal frequencies, as well as equal gaps between them. As a result, all the studies obtained will be indicated in the statistics of multivariate analysis, which should be based on various examples. We will return to this in the following sections.

Purpose of analysis of variance

So, sometimes situations may arise when it is necessary to compare two or more different samples. In this case, it would be most logical to apply multivariate correlation and regression analysis based on the study of the hypothesis and the relationship of various factors in the degree of regression. The name of the methodology also indicates the fact that in the research process various components of the variance are used.

Analysis of ideas and variance.

What is the essence of the study? To begin with, two or more indicators are divided into separate parts, each of which corresponds to the action of a certain factor. After this, a series of research procedures are carried out to find the relationship between different samples and the relationships between them. To understand in more detail such a complex, but interesting technique, we recommend that you study some examples of multivariate correlation analysis presented in the following sections of our article.

First example

There are several automatic machines in the production workshop, each of which is designed to manufacture a specific part. The size of the manufactured item is a random value, which depends not only on the settings of the machine itself, but also on random deviations that will inevitably arise as a result of the production of parts. But how can a worker determine the correct operation of a machine if he initially produces defective parts? That's right, you need to purchase the same part on the market and compare its dimensions with what is obtained during production. After that, you can adjust the equipment so that it produces parts of the right size. And it doesn’t matter at all that there is a manufacturing defect, because it is also taken into account in the calculations.

Production machines.

At the same time, if there are certain indicators on the machines that allow you to determine the intensity of adjustment (X and Y axes, depths, and so on), then the indicators on all machines will be completely different. If the measurements turned out to be exactly the same, then production defects can be ignored altogether. However, this is extremely rare, especially if the errors are measured in millimeters. But if the released part has the same dimensions as the standard purchased on the market, then there can be no question of marriage, since in the production of the "ideal" a machine was also used that gave certain errors, which were probably also taken into account by the workers.

Second example

For the manufacture of a certain device that runs on electricity, it is necessary to use several types of different insulating paper: electrical, capacitor, and so on. In addition, the device can be impregnated with resin, varnish, epoxy compounds and other chemical elements that extend the life of the device. Well, various leaks under a vacuum cylinder at elevated pressure are easily eliminated using the method of heating or pumping out air. However, if the master had previously used only one element from each list, various difficulties may arise in the production process using the new technology. Moreover, almost certainly, a similar situation will be caused due to one element. However, it will be almost impossible to calculate which particular factor affects the poor performance of the device. That is why it is recommended to use not a multivariate analysis method, but a univariate one in order to quickly deal with the cause of the malfunction.

Analysis of production charts.

Of course, when using various tools and instruments that track the influence of a factor on the final result, the study is simplified by several times, however, a novice engineer can’t afford to acquire such units. That is why it is recommended to use a one-way analysis of variance, which allows to identify the cause of the malfunction in a matter of minutes. To do this, it will be enough to set one of the most likely hypotheses, and then begin to prove it through experiments and analysis of the performance indicators of the device. Pretty soon, the wizard will be able to find the cause of the problem and fix it, replacing one of the samples with an alternative option.

Example three

Another example of multivariate analysis. Suppose that a trolleybus depot can serve several routes during the day. Trolleybuses of completely different brands operate on these same routes, and 50 different controllers collect fare. However, the depot management is interested in how it is possible to compare several different indicators that affect the total revenue: the trolleybus brand, route efficiency and employee skill. To see economic feasibility, it is necessary to analyze in detail the effect of each of these factors on the final result. For example, some controllers may not do well in their duties, so more responsible employees will have to be hired. Most passengers do not like to ride old trolleybuses, so it is most advisable to use a new brand. However, if both of these factors go along with the fact that most of the routes are highly demanded, is it worth changing anything at all?

Trolleybuses in Europe.

The researcher’s task is to obtain as much useful information as possible with regard to the influence of each of the factors on the final result using one analytical method. For this, it is necessary to put forward at least 3 different hypotheses, which will have to be proved in various ways. Analysis of variance allows us to solve such problems in the shortest possible time and get the maximum of useful information, especially if the multiphase method is used. However, do not forget that one-way analysis gives much more confidence about the influence of a factor, since it examines the sample in more detail. For example, if the depot directs all efforts to the analysis of the conductors, then it will be possible to identify many unscrupulous workers on all routes.

Univariate analysis

One-way analysis is a set of research methods aimed at analyzing a specific factor for the final result in a particular case. Also quite often, a similar technique is used to compare the greatest impact between two factors. If we draw an analogy with the same depot, you should first analyze separately the impact of different routes and brands of trolleybuses on profitability, then compare the results with each other and determine in which direction it will be best to develop the station.

Enterprise risk analysis.

In addition, do not forget about such a concept as the null hypothesis - that is, a hypothesis that cannot be discarded and in any case, all the factors listed to one degree or another influence it. Even if we compare only the routes and brands of trolleybuses among ourselves, the influence of the professionalism of the conductors will not go anywhere. Therefore, even if this factor cannot be analyzed, the effect of the null hypothesis should not be forgotten. For example, if you decide to investigate the dependence of profit on the route, let the same conductor enter the flight so that the readings are as accurate as possible.

Two-factor analysis

A man analyzes the data.

Most often, this technique is also called the comparison method and is used to identify the dependence of two factors on each other. In practice, you will have to use various tables with accurate indicators so as not to get confused in your own calculations and the influence of factors on them. For example, you can run two completely different trolleybuses along two identical routes at the same time, neglecting the factor of the null hypothesis (choose two responsible conductors). In this case, the comparison of the two situations will be of the highest quality, since the experiment takes place at the same time.

Multivariate analysis with repeated experiments

This method is used in practice much more often than others, especially when it comes to a group of novice researchers. Repeated experience allows not only to verify the influence of one factor or another on the final result, but also to find errors that were made during the study. For example, most inexperienced analysts forget about the presence of one or several null hypotheses at once, which leads to inaccurate results during the study. Continuing with the depot example, we can analyze the influence of various factors in different seasons of the year, since the number of passengers in winter is very different from the summer. In addition, repeated experience may lead the researcher to new ideas and new hypotheses.

Video and conclusion

We hope our article helped you understand what the multivariate correlation analysis method is based on. If you still have any questions on this topic, we recommend that you watch a short video. It describes in detail the methods of analysis of variance using a specific example.

As you can see, multivariate analysis is a rather complicated, but very interesting process that allows you to identify the dependence of certain factors on the final result. This technique can be applied in absolutely all areas of life and can be effectively used for doing business. Also, the multivariate analysis model can be used to achieve breakthrough tasks using simple methods.

Source: https://habr.com/ru/post/C8435/


All Articles