
LZW algorithm

String table compression algorithm
The LZW algorithm, also known as the "string table compression algorithm", compresses data by building a string table and using shorter codes to represent longer strings. The LZW compression algorithm was patented by Unisys; the patent expired in 2003, so the algorithm can now be used without restriction.
Chinese name: LZW algorithm
Foreign name: Lempel-Ziv-Welch Encoding
Alias: String table compression algorithm
Patent expiry: 2003
Patent holder: Unisys
Abbreviation: LZW compression algorithm

Brief introduction

The correspondence between strings and codes is generated dynamically during compression and is implicit in the compressed data. During decompression, the table is rebuilt from the code stream and the original data is recovered from it, so LZW is a lossless compression method.
The algorithm, Lempel-Ziv-Welch Encoding (LZW for short), can be implemented in any programming language.
Basic concepts of LZW compression [1]: there are three important objects: the data stream (CharStream), the code stream (CodeStream), and the string table (String Table). When encoding, the data stream is the input (for example, the data sequence of a text file) and the code stream is the output (the encoded data produced by compression). When decoding, the code stream is the input and the data stream is the output. The string table is needed for both encoding and decoding.
Character: the most basic data element. In a text file it is a byte; in raster data it is the index of a pixel's color in the specified color table;
String: several consecutive characters;
Prefix: also a string, but normally used in front of another character; its length may be 0;
Root: a string of length 1;
Code: a number, read from the code stream at a fixed length (the code length), that is the value a string table entry maps to;
Pattern: a string, taken from the data stream and mapped to a string table entry.
The basic principle of the LZW compression algorithm: extract the distinct strings that occur in the original data, build a string table from them, and then replace each occurrence in the original data with the index of the corresponding table entry, reducing the size of the data. This resembles the way palette (indexed-color) images work, but note that the string table is not created in advance; it is built dynamically from the original data. When decoding, the same string table must be reconstructed from the encoded data.

Algorithm

LZW compression algorithm
[Figure: LZW algorithm flowchart]
The LZW algorithm maps input strings to fixed-length codewords (usually 12 bits) using a string translation table (dictionary) T. Of the 4096 possible 12-bit codes, 256 represent single characters and the remaining 3840 are available for longer strings.
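To make the code-space arithmetic concrete, here is a minimal Python sketch. The constant names and the `pack_codes` helper are hypothetical, and the MSB-first bit order is an assumption: the text does not say how 12-bit codes are laid out in bytes, and real file formats differ on this point.

```python
CODE_WIDTH = 12                    # fixed codeword length in bits
MAX_CODES = 1 << CODE_WIDTH        # number of distinct 12-bit codes
ROOT_CODES = 256                   # one code per single-byte character
STRING_CODES = MAX_CODES - ROOT_CODES  # codes left for longer strings


def pack_codes(codes, width=CODE_WIDTH) -> bytes:
    """Pack fixed-width codes into bytes, most significant bit first.

    The final byte is padded with zero bits if the total bit count
    is not a multiple of 8.
    """
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    out = bytearray()
    for code in codes:
        if not 0 <= code < (1 << width):
            raise ValueError(f"code {code} does not fit in {width} bits")
        acc = (acc << width) | code
        nbits += width
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1    # keep only the bits not yet written
    if nbits:
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)
```

With this layout two 12-bit codes occupy exactly three bytes, so a codeword only saves space when it stands for a string of two or more characters.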
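The strings in the LZW dictionary have the prefix property, that is, ωK ∈ T ⇒ ω ∈ T: if a string ω followed by a character K is in the dictionary, then ω itself is also in the dictionary.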
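The prefix property can be checked mechanically. The following Python sketch is an illustration only; `is_prefix_closed` and the sample table are hypothetical names and data, not part of any library. It tests whether every multi-character string in a table still belongs to the table after dropping its last character, which by induction covers all shorter prefixes as well.

```python
def is_prefix_closed(table) -> bool:
    """Return True if, for every string s of length > 1 in table,
    s without its last character is also in table."""
    return all(s[:-1] in table for s in table if len(s) > 1)


# All 256 single-byte roots, plus a few strings an encoder might add.
roots = {bytes([i]) for i in range(256)}
sample_table = roots | {b"AB", b"BA", b"ABA"}
```

Here `is_prefix_closed(sample_table)` holds, whereas adding `b"XYZ"` without `b"XY"` would break the property.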
LZW algorithm flow
Step 1: Initially the dictionary contains all possible roots, and the current prefix P is empty;
Step 2: Current character C := the next character in the character stream;
Step 3: Determine whether the string P+C is in the dictionary
(1) If "Yes": P := P+C (extend P with C);
(2) If "No"
① Output the codeword representing the current prefix P to the codeword stream;
② Add the string P+C to the dictionary;
③ Let P := C (P now contains only the single character C);
Step 4: Determine whether there are more characters to be encoded in the character stream
(1) If "Yes", return to step 2;
(2) If "No"
① Output the codeword representing the current prefix P to the codeword stream;
② End.
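The encoding flow can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `lzw_compress` and the `max_codes` parameter are hypothetical, the output is a list of integer codes rather than a packed bit stream, and the choice to stop adding entries once the dictionary is full is an assumption, since the steps do not say what happens when all 4096 codes are used. The loop ends when the character stream is exhausted.

```python
from typing import List


def lzw_compress(data: bytes, max_codes: int = 4096) -> List[int]:
    """Encode data as a list of LZW codes, following steps 1 to 4."""
    # Step 1: the dictionary holds all 256 roots; the prefix P is empty.
    table = {bytes([i]): i for i in range(256)}
    prefix = b""
    codes: List[int] = []
    for byte in data:
        char = bytes([byte])            # Step 2: read the next character C
        candidate = prefix + char       # the string P+C
        if candidate in table:          # Step 3 (1): extend P with C
            prefix = candidate
        else:                           # Step 3 (2)
            codes.append(table[prefix])         # output the code for P
            if len(table) < max_codes:          # assumption: freeze when full
                table[candidate] = len(table)   # add P+C to the dictionary
            prefix = char                       # P := C
    if prefix:                          # Step 4 (2): flush the last prefix
        codes.append(table[prefix])
    return codes
```

For example, `lzw_compress(b"ABABABA")` yields `[65, 66, 256, 258]`: the two roots `A` and `B`, then the newly created entries for `AB` and `ABA`.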
LZW decompression algorithm
The specific decompression steps are as follows, where String.cW denotes the dictionary string for codeword cW:
(1) At the start of decoding, the dictionary contains all roots.
(2) Read the first codeword cW (which represents a root) from the encoded data stream.
(3) Output String.cW to the character data stream (CharStream).
(4) Let pW = cW.
(5) Read the next codeword cW from the encoded data stream.
(6) Does String.cW currently exist in the dictionary?
YES: (a) Output String.cW to the character data stream;
     (b) Let P = String.pW;
     (c) Let C = the first character of String.cW;
     (d) Add the string P+C to the dictionary.
NO:  (a) Let P = String.pW;
     (b) Let C = the first character of String.pW;
     (c) Output the string P+C to the character data stream and add it to the dictionary (it now corresponds to cW).
(7) Are there more codewords in the encoded data stream?
YES: Return to (4) and continue decoding.
NO: End decoding.
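The decompression steps can be sketched in Python in the same spirit. The function name `lzw_decompress` and the `max_codes` parameter are hypothetical, the input is assumed to be a sequence of integer codes rather than a packed bit stream, and the dictionary is assumed to stop growing at 4096 entries, mirroring an encoder that freezes its dictionary when full. The `ValueError` for codes that are neither known nor the next code to be assigned is an added safety check, not part of the steps.

```python
from typing import Iterable


def lzw_decompress(codes: Iterable[int], max_codes: int = 4096) -> bytes:
    """Decode a sequence of LZW codes, following steps (1) to (7)."""
    # (1) The dictionary starts with all 256 roots.
    table = {i: bytes([i]) for i in range(256)}
    it = iter(codes)
    try:
        prev = next(it)                 # (2) first codeword, a root
    except StopIteration:
        return b""                      # empty input decodes to empty output
    out = [table[prev]]                 # (3) output String.cW
    for code in it:                     # (4)/(5): pW is prev, cW is code
        if code in table:               # (6) YES branch
            entry = table[code]
            new_string = table[prev] + entry[:1]        # P + first char of String.cW
        elif code == len(table):        # (6) NO branch: code not yet defined
            entry = table[prev] + table[prev][:1]       # P + first char of String.pW
            new_string = entry
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out.append(entry)
        if len(table) < max_codes:      # assumption: freeze when full
            table[len(table)] = new_string
        prev = code
    return b"".join(out)
```

Decoding `[65, 66, 256, 258]` gives back `b"ABABABA"`; the last code, 258, exercises the NO branch because it is read before the decoder has added that entry.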

Characteristics

LZW coding effectively exploits redundancy in how frequently characters and strings occur, and its dictionary is generated adaptively, but it cannot effectively exploit positional redundancy.
The specific characteristics are as follows:
(1) LZW compression works well on data with little predictability and is commonly used for image compression in the TIF format. The average compression ratio is above 2:1, and the maximum can reach 3:1.
(2) For consecutive repeated bytes and strings in the data stream, LZW compression achieves a high compression ratio.
(3) Besides image data, LZW compression is also used in other data compression fields, such as text and programs.
(4) LZW compression has many variants, used for example in the common ARC, RKARC, and PKZIP compression programs.
(5) The compression process is stable for images of arbitrary width and pixel bit depth, and both compression and decompression are fast.
(6) Hardware requirements are modest: compression and decompression can run on Intel 80386 computers.