The same data can be grouped into tables in various ways. Attributes should be grouped into relations in a rational way: duplication of data should be minimized, and the processing and subsequent updating of the data should be simplified. Eliminating redundancy is one of the primary tasks of database design, and it is achieved through normalization.
Database normalization is a formal set of restrictions on how tables are built. It eliminates duplication, keeps the stored information consistent, and reduces the effort needed to maintain the database. The normalization operation consists in decomposing the source tables into simpler ones. At each stage of this process the tables are brought into the next normal form. Each level of normalization is defined by a set of restrictions that all tables must satisfy. In this way, redundant non-key information is removed from the tables.
Database normalization is based on the concept of a functional dependency between attributes. One attribute is said to depend functionally on another if, at every moment in time, each value of the second attribute corresponds to no more than one value of the first.
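The definition of a functional dependency can be illustrated with a small check over sample rows. This is only a sketch: the function name holds_fd and the employees data are invented for illustration, and a check over sample rows can only show that a dependency is violated, not prove that it holds as a design rule.

```python
def holds_fd(rows, determinant, dependent):
    """Return True if no pair of rows violates `determinant -> dependent`.

    rows: list of dicts; determinant and dependent: tuples of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        value = tuple(row[c] for c in dependent)
        # setdefault stores the first value seen for this key and returns it;
        # a different value for the same key violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data, invented for illustration.
employees = [
    {"emp_id": 1, "name": "Ann", "dept": "Sales"},
    {"emp_id": 2, "name": "Bob", "dept": "Sales"},
    {"emp_id": 3, "name": "Ann", "dept": "IT"},
]

print(holds_fd(employees, ("emp_id",), ("name",)))  # does emp_id determine name?
print(holds_fd(employees, ("name",), ("dept",)))    # does name determine dept?
```

In this sample, each emp_id maps to a single name, while the name "Ann" maps to two departments, so the second dependency is violated.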
Database normalization is a general concept; in practice it is divided into several normal forms, which are discussed below.
An information object (a table) is in first normal form when every attribute holds a single, atomic value. If some attribute holds a repeating group of values, the object is not in first normal form. In that case the repeating attribute can be moved into a separate entity, that is, a new information object.
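As a minimal sketch of bringing data into first normal form, the following example moves a repeating group into its own table. The table layout, the column names and the function to_first_normal_form are assumptions made up for illustration.

```python
# Hypothetical un-normalized rows: "phones" packs several values into one field.
customers_raw = [
    {"customer_id": 1, "name": "Ann", "phones": "555-0100, 555-0101"},
    {"customer_id": 2, "name": "Bob", "phones": "555-0200"},
]


def to_first_normal_form(rows):
    """Split the repeating 'phones' group into a separate table."""
    customers, phones = [], []
    for row in rows:
        customers.append({"customer_id": row["customer_id"], "name": row["name"]})
        # One row per phone number, linked back to the customer by its key.
        for phone in row["phones"].split(","):
            phones.append({"customer_id": row["customer_id"], "phone": phone.strip()})
    return customers, phones


customers, customer_phones = to_first_normal_form(customers_raw)
for row in customer_phones:
    print(row)
```

After the split, every field holds one value, and a customer with several phone numbers is represented by several rows in the second table.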
An information object is in second normal form when it is already in first normal form and every attribute that is not part of a candidate key is fully functionally dependent on each candidate key, that is, on the whole key and not only on part of it.
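A partial dependency can be sketched as follows. The example assumes a hypothetical order_items table whose key is the pair (order_id, product_id), while product_name depends on product_id alone; all names are invented for illustration.

```python
# Hypothetical table with composite key (order_id, product_id).
# product_name depends only on product_id, i.e. on part of the key.
order_items_raw = [
    {"order_id": 1, "product_id": 10, "product_name": "Pen", "quantity": 3},
    {"order_id": 1, "product_id": 20, "product_name": "Ink", "quantity": 1},
    {"order_id": 2, "product_id": 10, "product_name": "Pen", "quantity": 5},
]


def to_second_normal_form(rows):
    """Move the partially dependent attribute into its own table."""
    products = {}      # product_id -> product_name, stored once per product
    order_items = []   # keeps only attributes that depend on the whole key
    for row in rows:
        products[row["product_id"]] = row["product_name"]
        order_items.append(
            {k: row[k] for k in ("order_id", "product_id", "quantity")}
        )
    return products, order_items


products, order_items = to_second_normal_form(order_items_raw)
print(products)
print(order_items)
```

The product name is now stored once per product instead of once per order line, so renaming a product no longer requires updating several rows.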
An information object is in third normal form if it is already in second normal form and no non-key attribute is transitively dependent on a key. A transitive dependency is one in which a non-key field depends on the key only through another non-key field.
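The effect of removing a transitive dependency can be sketched with Python's standard sqlite3 module. The schema is an assumption invented for illustration: a department name depends on dept_id, which in turn depends on emp_id, so the name is stored in a separate department table.

```python
import sqlite3

# In-memory database; the schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (1, 'Sales'), (2, 'IT');
    INSERT INTO employee VALUES (1, 'Ann', 1), (2, 'Bob', 1), (3, 'Eve', 2);
""")

# Renaming a department touches one row, however many employees it has.
cur = conn.execute("UPDATE department SET dept_name = 'Retail' WHERE dept_id = 1")
changed_rows = cur.rowcount

# The original wide view is recovered with a join.
rows = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e JOIN department AS d USING (dept_id)
    ORDER BY e.emp_id
""").fetchall()
print(changed_rows, rows)
```

Had dept_name been kept in the employee table, the same rename would have required updating every employee row of that department, with the risk of leaving some rows inconsistent.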
The main goal that normalization sets for the developer is to bring all relations to third normal form. Relations in this form provide a sound basis for building an effective information system.
Database Normalization: Basic Rules
The normalization work can be summarized as a set of rules. First, eliminate repeating groups: create a separate table for each set of related attributes and give each table its own key. Second, eliminate redundant data: if an attribute depends on only part of the key, move it to a separate table. Third, eliminate columns that do not depend on the key: attributes that do not describe the key should be placed in a separate table. Fourth, isolate independent multiple relationships: no table should hold two or more multi-valued relationships that have no direct connection with one another. Fifth, isolate multiple relationships that are semantically related. This completes the normalization of the database, after which the development process begins.
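The fourth rule is the least intuitive, so here is a minimal sketch of it. It assumes a hypothetical table that records an employee's skills and spoken languages together, although the two facts are independent of each other; all names and data are invented for illustration.

```python
# Hypothetical table mixing two independent multi-valued facts.
# Every skill must be paired with every language, which multiplies the rows.
employee_facts = [
    {"emp": "Ann", "skill": "SQL", "language": "English"},
    {"emp": "Ann", "skill": "SQL", "language": "French"},
    {"emp": "Ann", "skill": "Python", "language": "English"},
    {"emp": "Ann", "skill": "Python", "language": "French"},
]


def isolate_independent_facts(rows):
    """Split the two independent relationships into separate tables."""
    skills = sorted({(r["emp"], r["skill"]) for r in rows})
    languages = sorted({(r["emp"], r["language"]) for r in rows})
    return skills, languages


skills, languages = isolate_independent_facts(employee_facts)

# Joining the two tables on the employee restores the original rows,
# so the decomposition loses no information.
rejoined = {
    (emp, skill, lang)
    for emp, skill in skills
    for emp2, lang in languages
    if emp == emp2
}
original = {(r["emp"], r["skill"], r["language"]) for r in employee_facts}
print(skills, languages, rejoined == original)
```

Four combined rows become two rows in each of the new tables, and adding a new language for an employee now means inserting a single row.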