Relational Algebra in Databases: Operations, Examples

Typically, database systems are equipped with a query language that can help its users request instances. There are two of these types - relational algebra and relational calculus. The first is a procedural query language that accepts relationship instances as input and displays example relationships as output. Uses unary or binary calculi for this. Relational algebra is performed recursively, and intermediate results are treated as relationships.

Relational algebra

Cartesian product (Ξ§)

Combines the information of two different relationships into one.

Designations - r Ξ§ s,

where r and s are relations, and their output will be determined as

r Χ s = {qt | q ∈ r and t ∈ s}.

Conclusion. Establishes a relationship that shows all books and articles written using a textbook.

Rename the operation (ρ).

Relational algebra relationships are results, but without any name. The renaming operation allows you to change the output value, indicated by the small Greek letter ρ .

Designation - ρ x (E),

where the result of the expression E is stored with the name x.

Additional operations:

  • set intersection;
  • assignment;
  • natural mix.

Relational calculus

The query language is non-procedural, that is, it says what to do, but does not explain how to implement it. Relational calculus exists in two forms:

  • tuple correlation calculus;
  • filtering variable ranges.

Legend - T / State: returns all tuples of T satisfying the condition. Result. Returns tuples with a name. TRC can be quantified. Existential (βˆƒ) and universal quantifiers (βˆ€) can be used. Conclusion. The above query will give the same result as the previous one.

Domain Relational Calculus DRC

The filter variable uses the attribute domain instead of the integer tuple values ​​(as done in the TRC mentioned above).

Notation - {a 1 , a 2 , a 3 , ..., a n | P (a 1 , a 2 , a 3 , ..., a n )},

where a1, a2 are attributes, and P denotes formulas constructed by internal values.

Conclusion. Sets the article, page, and topic from the TutorialsPoint relationship, where subject is the database.

Like TRC, DRC can also be written using existential and universal quantifiers. DRC also includes relational algebra operators. The strength of the expression of calculation, calculus and correlation of relations between points is equivalent.

Relational data model relational algebra

Variations and schemes of relational calculus and algebra

The ER model, when conceptualized in diagrams, provides a good overview of essential relationships that are easier to understand. Schematic images can be compared with a relational scheme, that is, they can be created together with each other. It is not possible to import all ER constraints into a relational model, but an approximate structure can be generated. There are several processes and algorithms available to convert diagrams into this system. Some of them are automated, while others are created manually. ER charts mainly consist of the following criteria:

  • entity and its attributes;
  • communication, which is the association between the above values.

The juxtaposition of objects and relations occurs in different ways and patterns. For example, an entity is an object of the real world with some attributes. The matching process, the algorithm is as follows:

  • create a table for each object;
  • attributes should become table fields with the corresponding data types;
  • declare a primary key.

A relationship is an association between entities. The compilation process is as follows:

  • create a table for relationships;
  • add primary keys of all participating entities as table fields with the corresponding data types;
  • if the relation has any attribute, set each attribute as a table field;
  • combine the primary key that makes up all the rest for the participating objects;
  • specify all foreign key restrictions.

The display of weak sets and hierarchical objects occurs according to a specific system. First of all, it is necessary to understand the essential foundations and definitions of these values. A weak set of objects is one that does not have any primary key associated with it. The display process is as follows:

  • create a table for a weak set of objects;
  • add all attributes to the scheme as a field;
  • specify the primary key for identification;
  • set all foreign key restrictions.

The mapping of hierarchical objects based on specialization or generalization of the language of relational algebra occurs in the form of sequential entities. The algorithm is as follows:

  • create tables for all objects of a higher lower level;
  • add primary keys;
  • at a low level, implement all other attributes of lower-level objects;
  • declare the primary keys of the table;
  • set foreign key restrictions.

Relational Algebra Operations

Existing options for describing, storing, changing information

SQL is a programming language for relational databases. It is designed over algebra and correlation calculus of tuples. SQL comes as a package with all major DBMS distributions. Contains both data and languages ​​for manipulating them. Using the properties of SQL data definition of relational algebra, you can design and change the database schema, while the control and adjustment properties, as well as data changes, allow you to store and retrieve information installed in the system. Uses the following set of commands to determine structure and system:

  • creates new databases, tables and views from the DBMS.
  • throws out teams.
  • changes the database schema.
  • this command adds an attribute to an object of type string.

SQL is equipped with a data manipulation language (DML). It modifies the database instance by inserting, updating, and deleting information. DML is responsible for changing all data. SQL contains the following set of commands in the DML section:

  1. SELECT is one of the main query commands. It is similar to the projection operation of relational algebra. It selects attributes based on the condition described in the WHERE application.
  2. FROM - this section takes a name as an argument from which attributes should be selected / projected. If more than one name is given, this item corresponds to the Cartesian product.
  3. WHERE - This section defines the predicate or conditions that must be met in order to qualify the projected attribute.

There are also commands:

  • insert;
  • change in values;
  • removal.

Relational Algebra in Databases

Creating Relational Algebra Queries

When constructing a search, the task is to find the structure of operations that will lead to the correct conclusion. The basic operations of relational algebra are simple operations with one or two relations as operands. The combined effects of the sequence determine the end result. Since the system of relational algebra in databases is quite simple, many intermediate results can be obtained before reaching the final conclusion, they are also used as operands that produce new data obtained.

For most operators, the order of the queries and their execution does not matter, which means that the same conclusion can be achieved by generating and combining intermediate data in different ways. In practice, database searches are fairly easy. The system for performing operations and intermediate results is determined by the query optimizer. When forming questions, requirements
First select which relationships are necessary to achieve an answer, and then indicate operations and intermediate results. The query structure of relational algebra in the database with the results can be represented in the form of a diagram. Requirement optimizers try to organize the most efficient implementation. In practice, this usually means that they try to minimize intermediate results as quickly as possible. Common examples of relational algebra will help.

Example 1

Information need: information about the cars of the 1996 model, where during the inspection in 1999, shortcomings were discovered.

First, information about the machines is displayed to understand the values ​​of all the attributes of the relationship. Inspection information is stored in the β€œCheck” table, and if malfunctions are found, they are recorded in the β€œProblem” table. So these three tables are needed to get the information you need.

Only cars of 1996 are interesting. The model range of the car is presented as the value of the set attribute in the row of the table of information about the car. The first interim result consists of tuples representing the 1996 variants.

Thus, only rows that span this period are needed. You must use selection to extract them. Now there are cars and inspections that were required. Then the rows are joined using the join operation. A common register number must be connected to them, since it is the only common column, a natural connection is used.

To find out if any malfunctions were detected during the checks, you need to link the problem lines to the check. After connecting the control series to the cars, you can connect this result to the fault table. Accession should be based on a common registration number and a verified date. These are the only common columns in the tables, so a natural join is used.

Relational algebra is a language

Calculation options without intermediate results

Example 2

Required information: Name of driver for model year 1995 or older cars that have not been tested for year 2000. The name is in the "Driver" table. Law enforcement agencies are described in the table "Inspection and cars in the dining room machine." So these three tables are needed. First, you need to find out cars that have not been inspected in 2000. It is impossible to solve this problem using only the inspection indicated in the table, since it contains data about these checks that were done, and not about those that were not implemented. This problem is solved by searching for complementary vehicles that are tested before 2000. In fact, only their registration numbers are needed.

There are other examples besides the above that show how you can change or find any information. Query options can be optimized using special operations. In fact, to make finding and finding data the easiest and easiest, there is a relational model of calculus.

Where information is secured and protected

The relational model of relational algebra data is stored in file formats containing records. At the physical level, the actual information is fixed in the electromagnetic format on any device. These storage devices can be divided into three categories:

  1. Primary This category includes memory that is directly accessible to the CPU. Registers, fast memory (cache) and main memory (RAM) are directly accessible to the central one, since all of them are located on the motherboard or chipset. This storage is usually very small, ultrafast and unstable. To maintain the condition requires a constant power source. In the event of a failure, all of its data is lost.
  2. Secondary Used to store information for future use or backup. Includes memory devices that are not part of the chipset or processor motherboard, such as magnetic disks, optical disks (DVDs, CDs, etc.), hard drives, flash drives, and magnetic tapes.
  3. Tertiary. Used to store huge amounts of data. Since such storage devices are external to the computer system, they are the slowest in speed. These storage gadgets are mainly used to back up the entire system. Optical disks and magnetic tapes are widely used as tertiary storage.

For query efficiency, special operations of relational algebra are important.

Storage structure

A computer system has a clearly defined memory hierarchy. The CPU has direct access to the main system, as well as to the built-in registers. The access time to the main memory is obviously less than the processor speed. To minimize this discrepancy, a cache is introduced. The cache provides the fastest access time and contains the data that most often accesses the CPU.

The memory with the fastest access is the most expensive. Larger storage devices offer less speed and are cheaper, but they can store huge amounts of data compared to processor registers or cache memory.

Magnetic and hard drives are the most common secondary storage devices in modern computer systems. They are called magnetic, consist of a metal base. These discs are placed vertically on the spindle. The read / write head moves between them and is used to magnetize or remove such a spot under it. It can be recognized as 0 (zero) or 1 (one).

Hard drives are formatted in a clearly defined order for efficient data storage. It has many concentric circles called tracks. Each track is further divided into sectors, which typically store 512 bytes of data.

SQL relational algebra

File operations

Operations on the system of the language of relational algebra and its database can be generally classified into two categories:

  • update;
  • Search.

The first category modifies data values ​​by inserting, deleting, or updating. On the other hand, search operations do not edit information, but retrieve it after optional conditional filtering. In both types of operations, selection plays a significant role. In addition to creating and deleting a file, there can be several operations that can be performed in them:

  1. Open - exists in one of two reading or writing modes. In the first case, the operating system does not allow anyone to modify the data. In other words, the data is only read. Files opened in read mode can be shared by several objects. Recording mode allows you to change data. Files can be read, but cannot be shared.
  2. Close is the most important operation from the point of view of the operating system, as it removes all locks (if in the shared mode), saves the data (if changed) on the secondary medium and frees all buffers and handlers associated with the file.
  3. Indexing is an information structure method for efficiently retrieving records from system files based on some attributes where the system was run. Defined based on attributes.

Indexing can be of the following type:

  1. Primary is defined in an ordered data file. The information file is arranged in a key field.
  2. A secondary index is generated from a field that is a candidate key and has a unique value in each record or not a key with duplicate values.
  3. Clustering is defined in an ordered data file, in a non-key field.

Relational algebra relational calculus

A database management system or DBMS refers to the technology of storing and retrieving user information with maximum efficiency along with appropriate security measures. A detailed examination of this issue leads to the conclusion that relational algebra is the language of operators that use relations as arguments and return them as a result.

Source: https://habr.com/ru/post/E10093/


All Articles