What is a data warehouse? As a rule, it is the store that holds the full body of information about a company's activities. Often, though, you need to pull out of this large collection only the data on one line of the organization’s work, one unit, or one business question. This is where another type of storage comes in: the data mart. In this article we look at what a data mart is, what varieties exist, and what their advantages and disadvantages are.
What is it?
What is a data mart? The concept goes by several other names:
- Specialized data repository.
- Data kiosk.
- Data market, and so on.
The term "data mart" is defined in several ways:
- A section of a database or data warehouse that holds narrowly specialized, thematic information, oriented to the queries of employees in a particular department or line of the organization’s work.
- A specialized information repository that contains information on one of the company's lines of activity.
- A set of thematically related databases that cover specific areas of the organization.
Despite the name, a data mart is not a marketplace: you cannot post an advertisement in one. It is a way of storing an organization's internal information, not a means of providing information to a wide range of users.
The idea of data marts was proposed in 1991 by Forrester Research. The authors described this kind of repository as a set of specific databases that contain information related to particular lines of a corporation's business.
Forrester Research highlighted the following strengths of data marts:
- Analysts see only the information they really need for a specific task or type of work.
- The target part of the data warehouse sits as close as possible to a specific user.
- Data marts contain thematic subsets of data pre-aggregated by specialists, which are easier to design and configure.
- Implementing a data mart (a specialized type of data warehouse) does not require high-powered computing equipment.
Forrester Research also pointed out the weaknesses of its proposal:
- The result is a geographically distributed information system whose redundancy is poorly controlled.
- No methods are provided to ensure the integrity and consistency of the information stored in the data mart (a narrowly specialized database).
We now turn to how data marts are designed.
At its core, a data mart is a thematic subset of pre-aggregated information. Such databases are therefore much easier to design and configure. They are created to answer specific user queries, and their creator tailors the data to particular groups of employees. This optimization simplifies loading the mart and improves its performance.
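To make "a thematic subset of pre-aggregated information" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns, and sample rows are all hypothetical and exist only for illustration; a real warehouse and mart would normally live in separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse tables: detailed rows on several unrelated subjects.
conn.execute("CREATE TABLE warehouse_sales (region TEXT, product TEXT, amount REAL)")
conn.execute("CREATE TABLE warehouse_payroll (employee TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("North", "Laptop", 1200.0),
        ("North", "Laptop", 800.0),
        ("South", "Phone", 500.0),
    ],
)
conn.execute("INSERT INTO warehouse_payroll VALUES ('Ann', 4000.0)")

# The data mart: only the sales subject, already aggregated per region and product.
conn.execute(
    """
    CREATE TABLE mart_sales AS
    SELECT region, product, SUM(amount) AS total_amount, COUNT(*) AS sale_count
    FROM warehouse_sales
    GROUP BY region, product
    """
)

# A sales analyst queries the small mart instead of the whole warehouse.
mart_rows = conn.execute(
    "SELECT region, product, total_amount, sale_count FROM mart_sales ORDER BY region"
).fetchall()
print(mart_rows)
```

The mart holds two summary rows instead of every individual sale, and the payroll data never reaches it, which is the sense in which the mart is both thematic and pre-aggregated.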
Designing an integrated data warehouse is known to be a rather complicated process that can stretch over several years. Data marts built for individual units of an enterprise are easier and faster to create. It is worth noting that several data marts can successfully coexist with the main repository, each giving a partial picture of it.
As mentioned, designing a data mart is a technologically lightweight process. But designers need to remember that problems with integrating the information may arise later if the mart was designed without taking the overall business model into account.
Independent data marts: examples
An SQL data mart is an analytical structure that supports a single application, department, or business area. The department's employees summarize their information requirements and adapt the mart to their own working needs. The staff who work with this data are then served through interactive reporting tools.
Independent data marts have historically appeared in large organizations with many independent units, each with its own IT department. Examples include:
- A marketing data mart. It includes information about the company's products, its customers, sales plans, and so on.
- A sales data mart.
- A finance department data mart.
- A risk assessment data mart, and others.
Advantages of independent data marts
Designers and users of data marts point to the following key benefits:
- They are focused as much as possible on the employee and give them only the information needed to do the job.
- They are significantly smaller than the full database.
- Creating a data mart is technologically easier than designing a complex data warehouse. Loading the mart and working with it as an end user are also easier.
- They contain aggregated information on specific topics.
- They can be implemented fairly quickly.
- They are built to answer a specific set of questions.
- The data is optimized for a particular group of users. This simplifies loading the mart and improves system performance.
Disadvantages of independent data marts
Users and designers also point to the following disadvantages:
- Data integrity, redundancy, and consistency are hard to control. The same information is frequently stored in several data marts at once, which overloads the system and raises the cost of storage.
- A data mart fed from several data sources is not easy to work with. Loading it is also difficult and requires a whole team of professionals.
- There is no provision for combining the information accumulated in different data marts, so the data is not consolidated at the company level.
- As a result, it is impossible to get a complete picture of the state of the organization.
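The redundancy and consistency problem can be illustrated with a small sketch. The two marts, the customer ids, and the field values below are invented for the example; the helper `compare_marts` is hypothetical and simply reports which records are stored twice and which copies disagree.

```python
# Two hypothetical independent marts, each keyed by customer id.
marketing_mart = {
    101: {"name": "Acme", "city": "Berlin"},
    102: {"name": "Globex", "city": "Paris"},
}
sales_mart = {
    101: {"name": "Acme", "city": "Munich"},  # same customer, different city
    103: {"name": "Initech", "city": "Rome"},
}


def compare_marts(a, b):
    """Return (ids stored in both marts, ids whose two copies disagree)."""
    duplicated = sorted(a.keys() & b.keys())
    conflicting = [key for key in duplicated if a[key] != b[key]]
    return duplicated, conflicting


duplicated, conflicting = compare_marts(marketing_mart, sales_mart)
print("stored twice:", duplicated)
print("copies disagree:", conflicting)
```

With independent marts nothing forces the two copies of customer 101 to match, so a check like this has to be written and run separately for every pair of marts.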
What happens if we combine the concepts of the data mart and the data warehouse? M. Demarest asked this question in 1994. He proposed combining the two so that the data warehouse serves as the single integrated source from which data marts are built.
This solution combines three levels:
- A corporate database built on a relational DBMS (database management system). It holds detailed data in a normalized or slightly denormalized schema.
- A database for a specific department, organizational unit, or end user. It is implemented on a multidimensional DBMS and holds aggregated data.
- End-user workstations, on which the analytical tools themselves are installed.
This multi-level structure will eventually become standard in many companies. The main reason is that it combines the advantages of two approaches:
- Compact storage of detailed information and support for very large databases, provided by relational database management systems.
- Simple setup and fast response to user queries on aggregated information, provided by multidimensional database management systems.
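The three levels can be sketched in a few lines of Python with the built-in sqlite3 module. This is only an illustration under loud assumptions: the table and function names are hypothetical, and a plain aggregated SQLite table stands in for the multidimensional DBMS that the second level would use in practice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Level 1: corporate warehouse with detailed rows (hypothetical table and data).
conn.execute("CREATE TABLE warehouse_orders (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_orders VALUES (?, ?, ?)",
    [("North", "Laptop", 1000.0), ("North", "Phone", 400.0), ("South", "Laptop", 600.0)],
)


def build_mart(conn, mart_name, group_column):
    """Level 2: derive an aggregated departmental mart from the warehouse."""
    # Identifiers cannot be bound as SQL parameters, so validate them first.
    if group_column not in {"region", "product"} or not mart_name.isidentifier():
        raise ValueError("unexpected mart or column name")
    conn.execute(
        f"CREATE TABLE {mart_name} AS "
        f"SELECT {group_column} AS label, SUM(amount) AS total "
        f"FROM warehouse_orders GROUP BY {group_column}"
    )


def mart_report(conn, mart_name):
    """Level 3: what an analyst's tool would read from a mart."""
    if not mart_name.isidentifier():
        raise ValueError("unexpected mart name")
    rows = conn.execute(f"SELECT label, total FROM {mart_name} ORDER BY label").fetchall()
    return dict(rows)


build_mart(conn, "mart_by_region", "region")
build_mart(conn, "mart_by_product", "product")
region_report = mart_report(conn, "mart_by_region")
product_report = mart_report(conn, "mart_by_product")
print(region_report, product_report)
```

Because both marts are derived from the same warehouse table, their grand totals agree with each other and with the warehouse, and adding another mart is one more call to `build_mart`.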
Advantages of three-level data marts
The advantages of this type of data mart are as follows:
- Data marts are simpler to create because they are populated from a single, standardized, reliable source.
- The data marts are synchronized and compatible with the corporate database.
- The storage is relatively easy to expand, and new data marts can be added.
- Good system performance is assured.
Disadvantages of three-level data marts
There are also a number of drawbacks:
- Information is stored redundantly, which increases data storage requirements.
- The architecture has to be agreed across a number of areas with potentially different requirements.
We have looked at what a data mart is, how independent and three-level data marts differ, and what the key advantages and disadvantages of each are as a way of storing a large company's information.