Run length code

Run-length encoding is a common lossless compression algorithm that uses two bytes to represent adjacent pixels with the same color value in a scan line: the first byte is a count that specifies how many times the pixel repeats, and the second byte is the value of the pixel itself. Image quality is preserved, but the compression ratio is relatively low compared with lossy compression.
Chinese name
Run length code
Foreign name
run-length encoding
Features
data compression

Run Length Encoding (RLE) Definition

Run-length encoding is a lossless data compression technique that does not depend on the nature of the data.
Run-length encoding is a compression technique that "replaces continuously repeated raw data with fixed-length codes". [1]
For example, the data string "AAAABBBCCDEEEE" consists of 4 A, 3 B, 2 C, 1 D, and 4 E. Run-length encoding compresses it to 4A3B2C1D4E (from 14 units to 10 units). [1]
In short, its advantage is that highly repetitive data is compressed into a small number of units. Its disadvantage is that if the data contains few repeats, the compressed output may be larger than the original: for example, the original data "ABCDE" is encoded as "1A1B1C1D1E" (from 5 units to 10 units). [1]
Applied to images, run-length encoding records one pixel value and the number of times it repeats whenever the same row contains repeated pixels, instead of recording every pixel. The compression ratio depends on the specific image content. [1]
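The two examples above can be reproduced with a few lines of Python. This is a minimal sketch, not a reference implementation: the function name rle_encode_text is hypothetical, and it assumes the symbols are never digits, since otherwise the counts and the data could not be told apart.

```python
from itertools import groupby


def rle_encode_text(data: str) -> str:
    """Encode each run as <decimal count><symbol> (assumes symbols are not digits)."""
    # groupby yields one (symbol, group) pair per run of equal adjacent characters.
    return "".join(f"{len(list(group))}{symbol}" for symbol, group in groupby(data))


print(rle_encode_text("AAAABBBCCDEEEE"))  # 4A3B2C1D4E  (14 units -> 10 units)
print(rle_encode_text("ABCDE"))           # 1A1B1C1D1E  (5 units -> 10 units)
```

The second call shows the expansion case: with no repeats, every symbol gains a count and the output doubles in size.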

Overview of RLE run length coding

Compression technology is now widely used in software, audio, video formats and other fields. In general, image formats use two different types of compression: lossy compression and lossless compression [1]. Lossy compression exploits the characteristics of visual perception to reduce the file data greatly, but it affects the image quality. The basic principle of lossless compression is that identical color information only needs to be saved once, which removes duplicate data and reduces the disk space the image needs. The advantage of lossless compression is that it preserves the quality of the image, but its compression ratio is relatively low compared with lossy compression. Common lossless compression algorithms include RLE and LZW. [1]

Basic principle of RLE compression algorithm

The RLE (Run Length Encoding) compression algorithm is an image file compression method used in the Windows system. Its basic idea is to use two bytes to represent adjacent pixels with the same color value in a scan line: the first byte is a count that specifies how many times the pixel repeats, and the second byte is the value of the pixel itself [2]. File size is reduced by removing redundant bytes, or redundant bits within bytes. For example, take the string RRRRRGGBBBBBB (five R, two G, six B), which represents the values of color pixels. After RLE compression it can be replaced by 5R2G6B, which is much shorter than the original string. Decoding follows the same rules as encoding, and the restored data is identical to the data before compression. Therefore, RLE is a lossless compression technique. [1]
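The two-byte scheme described above can be sketched as follows for a row of five R, two G and six B pixels. This is an illustrative sketch only: the function names are hypothetical, one byte per pixel is assumed, and the choice to split runs longer than 255 into several pairs is an assumption, since a single count byte cannot hold more.

```python
def rle_encode_pairs(pixels: bytes) -> bytes:
    """Encode each run as two bytes: (count, value). Counts are capped at 255."""
    out = bytearray()
    i = 0
    while i < len(pixels):
        run = 1
        # Extend the run while the next pixel matches and the count still fits in a byte.
        while run < 255 and i + run < len(pixels) and pixels[i + run] == pixels[i]:
            run += 1
        out += bytes((run, pixels[i]))
        i += run
    return bytes(out)


def rle_decode_pairs(encoded: bytes) -> bytes:
    """Expand (count, value) byte pairs back into the original pixel row."""
    if len(encoded) % 2:
        raise ValueError("encoded data must consist of (count, value) pairs")
    out = bytearray()
    for k in range(0, len(encoded), 2):
        count, value = encoded[k], encoded[k + 1]
        out += bytes([value]) * count
    return bytes(out)


row = b"R" * 5 + b"G" * 2 + b"B" * 6
packed = rle_encode_pairs(row)
print(packed)                           # b'\x05R\x02G\x06B'
print(rle_decode_pairs(packed) == row)  # True
```

The round trip returning the original row is what makes the method lossless.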

Improvement of RLE compression algorithm

The RLE compression algorithm is very efficient for data with a large amount of repetition. However, when the color value of each pixel differs from that of its neighbors, as in the color string GBR, the output becomes 1G1B1R, which doubles the length of the data. This is a pathological case. To avoid it as far as possible, the basic RLE method needs to be improved.

The improvement is to distinguish count bytes from pixel bytes in the implementation by using the upper two bits of the count byte as the compression flag. The upper two bits of every count byte are 1, that is, the count byte is C0H + n, where n is the number of consecutive identical pixel bytes. For a single pixel whose value differs from its neighbors, a count byte is added only when the upper two bits of the pixel value are both 1 (that is, the value is greater than or equal to C0H); otherwise the pixel value is output directly, which avoids doubling the length after compression. In other words, when a single data value is greater than or equal to C0, C1 is output first and then the data value; otherwise the data value is output directly.

Suppose we have the following data: D2, 20, 30, 30, 30, C0, C1, C1, E2, E2, E2, ..., E2 (132 times), E0, E0, D4. The compressed data is: [1] C1, D2, 20, C3, 30, C1, C0, C2, C1, FF, E2, FF, E2, C6, E2, C2, E0, C1, D4. The single data values D2, C0 and D4 are preceded by the count byte C1, but 20 is not. This effectively avoids abnormal expansion after compression.

Since a byte can be at most FFH, the maximum n is FFH - C0H = 3FH = 63 (decimal), so a run with n > 63 has to be split into several count/value pairs. For example, the 132 E2 values are represented by 6 bytes (FF, E2, FF, E2, C6, E2).
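The flag-bit scheme above can be checked against the worked example with a short sketch. The function and constant names are hypothetical, and the decoder is an assumption derived from the encoding rules rather than something the text spells out.

```python
from itertools import groupby

FLAG = 0xC0     # a byte >= C0H (upper two bits set) is read as a count byte
MAX_RUN = 0x3F  # FFH - C0H = 63, the longest run one count byte can hold


def encode_flagged(data: bytes) -> bytes:
    out = bytearray()
    for value, group in groupby(data):
        remaining = len(list(group))
        while remaining > 0:
            run = min(remaining, MAX_RUN)
            if run == 1 and value < FLAG:
                out.append(value)                 # lone small value: emit as-is
            else:
                out += bytes((FLAG + run, value)) # count byte, then the value
            remaining -= run
    return bytes(out)


def decode_flagged(encoded: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(encoded):
        byte = encoded[i]
        if byte >= FLAG:                          # count byte followed by a value
            out += bytes([encoded[i + 1]]) * (byte - FLAG)
            i += 2
        else:                                     # literal value
            out.append(byte)
            i += 1
    return bytes(out)


sample = bytes.fromhex("D2 20 30 30 30 C0 C1 C1") + b"\xE2" * 132 + bytes.fromhex("E0 E0 D4")
encoded = encode_flagged(sample)
print(encoded.hex(" ").upper())
# C1 D2 20 C3 30 C1 C0 C2 C1 FF E2 FF E2 C6 E2 C2 E0 C1 D4
print(decode_flagged(encoded) == sample)  # True
```

The 143 input bytes shrink to 19, and the lone value 20 costs only one byte while D2, C0 and D4 each need a C1 prefix.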
To reduce the number of bytes needed for long runs, the scheme is improved further: the byte immediately following FF is also a count byte. For the same data, [1] D2, 20, 30, 30, 30, C0, C1, C1, E2, E2, E2, ..., E2 (132 times), E0, E0, D4, the compressed data is: C1, D2, 20, C3, 30, C1, C0, C2, C1, FF, 45, E2, C2, E0, C1, D4. Comparing the two results, the 132 E2 values are now represented by 3 bytes (FF, 45, E2): FF stands for 63 repetitions and 45H adds the remaining 69, since 63 + 69 = 132. An extreme case is a run whose count byte is exactly FF; for this special case a 00 byte is added after the FF byte to distinguish it. With this improvement, compression and decompression do not become much more complex, but the compression efficiency improves. [1]
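A possible reading of this extended scheme is sketched below. It assumes that the byte after FF holds the run length minus 63, which matches the FF, 45, E2 example and the FF, 00 special case; the names are hypothetical, and splitting runs longer than 63 + 255 = 318 into several groups is an assumption the text does not cover.

```python
from itertools import groupby

FLAG = 0xC0            # count bytes are C0H + n
ESCAPE = 0xFF          # FF is followed by a second count byte
BASE = ESCAPE - FLAG   # 63 repetitions are implied by FF itself
MAX_RUN = BASE + 0xFF  # 318, the longest run one FF group can hold (assumed limit)


def encode_extended(data: bytes) -> bytes:
    out = bytearray()
    for value, group in groupby(data):
        remaining = len(list(group))
        while remaining > 0:
            run = min(remaining, MAX_RUN)
            if run >= BASE:
                out += bytes((ESCAPE, run - BASE, value))  # FF, extra count, value
            elif run == 1 and value < FLAG:
                out.append(value)                          # lone small value
            else:
                out += bytes((FLAG + run, value))          # ordinary count byte
            remaining -= run
    return bytes(out)


def decode_extended(encoded: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(encoded):
        byte = encoded[i]
        if byte == ESCAPE:
            out += bytes([encoded[i + 2]]) * (BASE + encoded[i + 1])
            i += 3
        elif byte >= FLAG:
            out += bytes([encoded[i + 1]]) * (byte - FLAG)
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)


sample = bytes.fromhex("D2 20 30 30 30 C0 C1 C1") + b"\xE2" * 132 + bytes.fromhex("E0 E0 D4")
encoded = encode_extended(sample)
print(encoded.hex(" ").upper())
# C1 D2 20 C3 30 C1 C0 C2 C1 FF 45 E2 C2 E0 C1 D4
print(decode_extended(encoded) == sample)  # True
```

Under these assumptions a run of exactly 63 is written as FF, 00, value, which is the special case mentioned above.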
------------------

Compression strategy


compress

First, a temporary variable Q holds the first data item and a counter is set to 1. Each following item is compared with Q. If it is the same, the counter is increased by 1; if it is different, the counter value and Q are output, the counter is reset to 1, and Q is set to the new item. When the input ends, the last counter value and Q are output. Continuing in this way over the whole input completes the compression.
The following is a simple algorithm for the input AAABCCBCCCCAA: [1]
Q = input(1); counter = 1
for i = 2 : size(input)
    if input(i) == Q
        counter = counter + 1
    else
        output counter, then Q
        Q = input(i); counter = 1
    end
end
output counter, then Q
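The Q-and-counter procedure can be written out in Python as a minimal sketch. The function name compress and the choice to return a list of (count, symbol) pairs are assumptions made for illustration. The counter restarts at 1 for each new run, because the item just stored in Q already counts as one occurrence, and the last run is emitted after the loop ends.

```python
def compress(data):
    """Return a list of (count, symbol) pairs, one per run in data."""
    if not data:
        return []
    pairs = []
    q = data[0]   # current run symbol (the Q value)
    counter = 1   # the first item has already been seen once
    for item in data[1:]:
        if item == q:
            counter += 1
        else:
            pairs.append((counter, q))  # run ended: emit count, then symbol
            q = item
            counter = 1
    pairs.append((counter, q))          # emit the final run
    return pairs


print(compress("AAABCCBCCCCAA"))
# [(3, 'A'), (1, 'B'), (2, 'C'), (1, 'B'), (4, 'C'), (2, 'A')]
```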

decompression

Decompression reads a count (denoted C) and a data item (denoted B) pair by pair, converts the binary codes of C and B into a decimal integer and the original data symbol respectively, and outputs B a total of C times; this completes the decompression of one run. The steps are repeated until all data has been output. [1]
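As a rough illustration of this read-C-then-B loop, the sketch below decodes the textual form used in the earlier examples (a decimal count followed by one symbol) rather than raw binary codes. The function name is hypothetical, and it assumes that symbols are single non-digit characters.

```python
import re


def rle_decode_text(encoded: str) -> str:
    """Expand <decimal count><symbol> pairs, e.g. '4A3B' -> 'AAAABBB'."""
    # Each match is (C, B): the count as decimal digits and one non-digit symbol.
    pairs = re.findall(r"(\d+)(\D)", encoded)
    return "".join(symbol * int(count) for count, symbol in pairs)


print(rle_decode_text("4A3B2C1D4E"))  # AAAABBBCCDEEEE
print(rle_decode_text("1A1B1C1D1E"))  # ABCDE
```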

RLE Features

The examples above show that the compression ratio RLE can achieve depends mainly on the characteristics of the image itself. The larger the blocks of identical color and the fewer such blocks there are, the higher the compression ratio. Conversely, RLE is of little use for natural images with rich colors: there are usually few consecutive pixels of the same color on one line, and even fewer consecutive lines with the same color values. Applying RLE to such images may fail to compress the data and can even make it larger than the original. Therefore, a practical implementation needs to combine RLE with other compression coding techniques. [1]
--------------------
Run Length
A feasible solution is to divide the values in the data stream into two categories: values whose run length is less than or equal to 128 are output as their original value, and values whose run length is greater than 128 are output after adding 128. To sum up, the improved run-length coding algorithm is as follows:
① Output a value whose run length is less than or equal to 2 as its original value;
② For a value whose run length is greater than 2, add 128 to the value and output it, then output the run length at the next adjacent position.
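Steps ① and ② can be sketched as follows, under assumptions the text does not state: every data value is taken to fit in 7 bits (0 to 127), so that adding 128 sets the top bit and marks a compressed run, and the run length is stored in one byte, so runs longer than 255 are split. The function names are hypothetical.

```python
from itertools import groupby


def encode_high_bit(data: bytes) -> bytes:
    out = bytearray()
    for value, group in groupby(data):
        if value >= 128:
            raise ValueError("this sketch assumes 7-bit values (0..127)")
        remaining = len(list(group))
        while remaining > 0:
            run = min(remaining, 255)             # assumed one-byte run length
            if run <= 2:
                out += bytes([value]) * run       # step 1: output as-is
            else:
                out += bytes((value + 128, run))  # step 2: flagged value, then run
            remaining -= run
    return bytes(out)


def decode_high_bit(encoded: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(encoded):
        byte = encoded[i]
        if byte >= 128:                           # flagged: next byte is the run
            out += bytes([byte - 128]) * encoded[i + 1]
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)


data = bytes([5, 5, 5, 5, 7, 7, 9])
packed = encode_high_bit(data)
print(list(packed))                     # [133, 4, 7, 7, 9]
print(decode_high_bit(packed) == data)  # True
```

Leaving runs of one or two values unflagged means data without repeats is never expanded, at the cost of reserving the top bit of every value.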
Limitations of RLE algorithm
In RLE data compression, data is actually compressed only when the number of repeated bytes is greater than 3, and a special character is used as the flag. Therefore, when the RLE compression method is adopted, the following limitations on the compression ratio must be addressed [8]. [1]
(1) In the original image data, apart from some background pixels that share the same value, there are few consecutive identical pixels. How to increase the amount of identical data values in the image is therefore the key to improving the compression ratio; [1]
(2) How to find a special character that is not used, or is seldom used, in the image being processed; [1]
(3) How to raise the limit on the number of repeats (at most 255) when bytes are repeated. [1]