⚕️ 🍉 🤶🏾 RLE Algorithm: Description, Features, Rules and Examples 🛣️ 🈷️ 🤲🏿

The RLE algorithm is a data compression algorithm that is supported by most raster image file formats: TIFF, BMP, and PCX. RLE is suitable for compressing any type of data regardless of its information content, but the content of the data affects the compression ratio. Despite the fact that most RLE algorithms cannot provide high compression ratios for more complex methods, this tool is easy to implement and quickly execute, which makes it a good alternative.

What is the RLE compression algorithm based on?

RLE works by reducing the physical size of a repeating string of characters. This string, called run, is usually encoded in two bytes. The first byte represents the number of characters in the run and is called the run counter. In practice, an encoded run can include between 1 and 128 and 256 characters. A counter usually contains the number of characters minus one (a value in the range of values from 0 to 127 and 255). The second byte is the value of the character in the run, which is contained in the range of values from 0 to 255 and is called the start value.

Without compression, a character run of 15 characters usually requires 15 bytes for storage:

AAAAAAAAAAAAAAAA.

After RLE encoding, only two bytes are required on the same line: 15A.

The 15A encoding generated to denote a character string is called an RLE packet. In this code, the first byte, 15, is the run counter and contains the required number of repetitions. The second byte, A, is the run value and contains the actual repeated value in the run.

A new packet is generated each time the start character changes, or each time the number of characters in the run exceeds the maximum number. Suppose that a 15-character string according to conditions contains 4 different characters:

AAAAAAbbbXXXXXt

Using encoding with a string length, it can be compressed into four two-byte packets:

6A3b5X1t

After encoding along the length of the string for a 15-byte string, only eight bytes of data will be needed to represent the string, unlike the original 15 bytes. In this case, runtime encoding yielded a compression ratio of almost 2 to 1.

Features

Long runs are rare in some data types. For example, plain ASCII text rarely contains long runs. In the previous example, the last run (containing the t character) was only one character in length. The 1 character run still works. Both the start count and the start value must be recorded for each run in 2 characters. To encode a mileage using the RLE algorithm, information of at least two characters is required. In this connection, the launch of single characters actually takes up more space. For the same reasons, data consisting entirely of 2-character runs remains unchanged after RLE encoding.

The RLE compression algorithm schemes are simple and fast, but their effectiveness depends on the type of encoded image data. A black and white image that is mostly white, such as a page in a book, will be very well encoded due to the large amount of related data having the same color. However, an image with many colors, such as a photograph, will not be encoded as well. This is due to the fact that the complexity of the image is expressed in the form of a large number of different colors. And because of this complexity, there will be relatively few runs of the same color.

Length Encoding Options

There are several encoding options at runtime. Image data is typically encoded in a sequential process that processes visual content as a 1D stream, rather than as a 2D data map. In sequential processing, the bitmap image is encoded starting from the upper left corner and is sent from left to right along each scan line to the lower right corner of the bitmap image. But alternative RLE schemes can also be written to encode data along the length of the bitmap along the columns for compression into 2D tiles, or even to encode pixels diagonally in a zigzag fashion. Odd RLE variants can be used in highly specialized applications, but are usually quite rare.

Run-length coding algorithm

Another rare option is the RLE coding algorithm with a path length error. These tools usually perform lossless compression, but discarding data during the encoding process, usually by zeroing out one or two least significant bits in each pixel, can increase compression ratios without adversely affecting the quality of complex images. This RLE algorithm program only works well with real-world images that contain many subtle variations in pixel values.

Cross coding

Cross-coding is the combination of scan lines that occurs when the compression process loses the distinction between the original lines. If the data of individual lines is combined by the RLE retry coding algorithm, the point where one scan line is stopped and the other is lost is vulnerable and difficult to detect.

Sometimes cross-coding occurs, which complicates the decoding process, adding the cost of time. For raster image file formats, this method aims to organize the raster image along the scan lines. Although many file format specifications explicitly indicate that these lines must be individually encoded, many applications encode these images as a continuous stream, ignoring line boundaries.

How to encode an image using the RLE algorithm?

Individual coding of scan lines is advantageous in cases where the application should use only some part of the image. Suppose a photo contains 512 scan lines, and you only need to display lines from 100 to 110. If we did not know where the scan lines started and ended with encoded image data, our application would have to decode lines 1 to 100 before finding ten lines that are required. If the transitions between the scanning lines were marked with some easily recognizable demarcation marker, the application could simply read the encoded data, counting the markers, until it reaches the lines it needs. But this approach would be quite ineffective.

Alternative option

Another option for determining the starting point of any particular scan line in the encoded data block is to build a table of scan lines. This tabular structure usually contains one element for each scan line in the image, and each element contains the offset value of the corresponding scan line. To find the first RLE packet of the scan line 10, all the decoder needs to do is find the values of the offset position stored in the tenth element of the scan line search table. The scan line table may also contain the number of bytes used to encode each line. Using this method in order to find the first RLE packet of scan line 10, your decoder will combine the values of the first 9 elements of the scan line table. The first packet for scan line 10 will begin with this byte offset from the start of the RLE encoded image data.

Units

The parts of the length coding algorithms that vary are decisions that are made based on the type of data that is being decoded (for example, the length of the data run). RLE schemes used to compress raster graphics are usually divided into classes according to the type of atomic (that is, the most fundamental) elements that they encode. The three classes used by most image file formats are bit, byte, and pixel RLE.

Compression quality

The bit levels of RLE schemes encode runs of several bits in a scan line and ignore byte and word boundaries. Only monochrome (black and white), 1-bit images contain enough bits to make this class of RLE coding effective. A typical bit level RLE scheme encodes from one to 128 bits in a single byte packet. The seven least significant bits contain the start counter minus one, and the most significant bit contains the value of a bit run equal to 0 or 1. A run longer than 128 pixels is divided into several RLE-encoded packets.

Exceptions

Byte-level RLEs encode runs of the same byte values, ignoring some bits and word boundaries in the scan line. The most common RLE scheme at the byte level encodes byte runs into 2-byte packets. The first byte contains a counter from 0 to 255, and the second contains the value of the byte trigger. The addition of a two-byte encoding scheme with the ability to store literal, unwritten byte runs in a stream of encoded data is also common.

In such a scheme, the seven least significant bits of the first byte contain a run counter minus one, and the most significant bit of the first byte is an indicator of the type of start that follows the byte of the run count. If the most significant bit is set to 1, it indicates an encoded run. The encoded runs are decoded by reading the mileage value and repeating its number of times indicated by the number of cycles. If the most significant bit is set to 0, a literal run is displayed, which means that the count bytes of the next run are read literally from the encoded image data. Then the byte of the execution counter contains a value in the range from 0 to 127 (start counter minus one). Byte-level RLE schemes are good for image data that is stored as one byte per pixel.

Pixel-level RLE schemes are used when two or more consecutive bytes of image data are used to store values of one pixel. At the pixel level, bits are ignored, and bytes are only counted to identify each pixel value. The size of the encoded packets depends on the size of the encoded pixel values. The number of bits or bytes per pixel is stored in the header of the image file. The start of image data stored as 3-byte pixel values is encoded in a 4-byte packet, with one byte of counting the number of operations, followed by three bytes with bytes. The encoding method remains the same as with byte RLE.

RLE Algorithm: Description, Features, Rules and Examples