The LZW patents have expired in all countries where they were granted, which makes the algorithm patent-free and thus free for anyone to use anywhere. The LZW algorithm itself is quite ingenious: a relatively simple algorithm achieves decent compression ratios. One big advantage of LZW is that it can be made really fast and does not need huge amounts of memory to compress or decompress.
Here I present an implementation of LZW that is very efficient both speed-wise and memory-wise.
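As background, the core algorithm can be sketched with a textbook dictionary (a `std::map` from strings to codes). This is only an illustration of LZW itself, not the efficient dictionary representation discussed in this article:

```cpp
#include <map>
#include <string>
#include <vector>

// Textbook LZW encoding: the dictionary starts with all 256 single-byte
// strings, and every emitted code adds one new string to it.
// (A std::map is used here only for clarity; it is far slower and more
// memory-hungry than the dictionary representation described below.)
std::vector<unsigned> lzwEncode(const std::string& input)
{
    std::map<std::string, unsigned> dict;
    for(unsigned i = 0; i < 256; ++i)
        dict[std::string(1, char(i))] = i;

    std::vector<unsigned> output;
    std::string current;
    for(char c: input)
    {
        std::string extended = current + c;
        if(dict.count(extended))
            current = extended; // keep growing the current string
        else
        {
            output.push_back(dict[current]);      // emit code for known prefix
            const unsigned newCode = unsigned(dict.size());
            dict[extended] = newCode;             // remember the new string
            current = std::string(1, c);          // restart from the mismatching byte
        }
    }
    if(!current.empty()) output.push_back(dict[current]);
    return output;
}
```

For example, encoding `"ababab"` emits the codes for `'a'` and `'b'` and then uses the newly-added code 256 (for `"ab"`) twice.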
The following table presents compression and decompression times for several example files. The tests were run on a 3.4 GHz Pentium 4 running SuSE 9.3. My implementation of LZW supports a freely-specifiable maximum bitsize for the dictionary indices; 16 bits was used in these tests. In these tests I first read the entire file into memory, then run the LZW algorithm and then write the result. The tests thus measure the time taken to encode/decode from memory to memory; the time taken for I/O is not included.
| Original size | Compression time | Compressed size | % of original | Decompression time |
|---|---|---|---|---|
| 586 kB | 0.03 s | 192 kB | 32.7% | 0.02 s |
| 4.0 MB | 0.45 s | 2.9 MB | 73.0% | 0.22 s |
| 8.7 MB | 0.90 s | 6.0 MB | 69.7% | 0.46 s |
| 110.4 MB | 4.74 s | 20.6 MB | 18.7% | 3.62 s |
The memory requirement of this LZW implementation depends on the maximum bitsize of the dictionary. For example, in practical C++ implementations the dictionary can be allocated as two arrays (one for the indices and another for the byte values), thus avoiding memory management overhead. The following table presents the minimum amount of memory required by the dictionary (in a C++ implementation) for some maximum bitsizes when LZW-encoding. The second column assumes that 2-byte index values are used if the maximum bitsize is less than or equal to 16, and 4-byte index values otherwise. The third column gives the required memory assuming that 4-byte index values are always used.
| Max bitsize | Memory (2-byte indices) | Memory (4-byte indices) |
|---|---|---|
| 12 | 36 kB | 68 kB |
| 14 | 144 kB | 272 kB |
| 16 | 576 kB | 1088 kB |
Basically no additional memory (besides the input and output data) is required.
It is also possible to allocate the dictionary as one single array, with the byte value of each item embedded alongside its index values. In this case, however, the byte value will usually take 4 bytes instead of one (due to alignment), increasing the memory usage slightly (by about 18%).
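For concreteness, here is what such a two-array layout can look like. Note that the exact number of index fields per entry is not stated above; the figures in the table match four index fields plus one byte value per entry, so that is what this sketch assumes:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical two-array dictionary layout (my sketch). The table's numbers
// correspond to 4 index fields plus 1 byte value per dictionary entry.
struct EncoderDictionary
{
    std::vector<std::uint16_t> indices;    // 4 index fields per entry
    std::vector<unsigned char> byteValues; // 1 byte value per entry

    EncoderDictionary(unsigned maxBits):
        indices(4 * (std::size_t(1) << maxBits)),
        byteValues(std::size_t(1) << maxBits)
    {}
};

// Memory taken by such a dictionary, in bytes, for a given index size:
std::size_t dictionaryBytes(unsigned maxBits, std::size_t indexSize)
{
    const std::size_t entries = std::size_t(1) << maxBits;
    return entries * (4 * indexSize + 1);
}
```

With 2-byte indices and a maximum bitsize of 12 this gives 4096 × 9 bytes = 36 kB, matching the table above.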
In practice, the larger the maximum bitsize, the better the data will usually compress (although maximum bitsizes above about 20 usually have only a minimal effect).
The LZW decoder requires even less memory (because it doesn't need the ancillary data used by the encoder for fast dictionary searching).
The most notable feature of this implementation is that the heaviest step in the whole encoding process - the dictionary search - can be done with a fixed number of byte comparisons (at most 256 of them) regardless of the dictionary size and the length of the strings. Thus, technically speaking, the dictionary search is an O(1) operation, i.e. a constant-time operation. This is done without the need for any external data container (all the ancillary data required for this is stored inside the dictionary itself).
Some LZW encoding implementations use an external hash table to perform the dictionary searches. These implementations often suffer from large memory requirements and/or slowness (because the search time may depend on the size of the dictionary and/or the lengths of the strings inside it). The solution presented here never requires more memory than shown in the table above, and always performs a dictionary search in (at most) a fixed number of steps regardless of the dictionary size.
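The stated bound can be achieved, for example, by storing in each entry the index of its first "child" (a dictionary string extending it by one byte) and of its "next sibling" (another string with the same prefix). Since all children of a given prefix necessarily have distinct byte values, walking a sibling chain never takes more than 256 comparisons, no matter how large the dictionary grows. This is only my sketch of one such scheme, not necessarily the exact layout used in the implementation:

```cpp
#include <cstdint>
#include <vector>

// Sketch of a dictionary search bounded by 256 byte comparisons.
// Index 0 is reserved as a "none" marker in this sketch; a real
// implementation would likely number its entries differently.
struct Dictionary
{
    struct Entry
    {
        std::uint32_t firstChild;  // first string extending this one (0 = none)
        std::uint32_t nextSibling; // next string with the same prefix (0 = none)
        unsigned char byteValue;   // last byte of this string
    };

    std::vector<Entry> entries;

    Dictionary(): entries(257) // entries 1..256: the 256 strings of length one
    {
        for(unsigned i = 0; i < 256; ++i)
            entries[i + 1].byteValue = (unsigned char)i;
    }

    // Returns the index of the string <prefix> + <b>, or 0 if absent.
    // At most 256 comparisons: siblings have pairwise distinct byte values.
    std::uint32_t find(std::uint32_t prefix, unsigned char b) const
    {
        for(std::uint32_t i = entries[prefix].firstChild; i != 0;
            i = entries[i].nextSibling)
            if(entries[i].byteValue == b) return i;
        return 0;
    }

    // Inserts the string <prefix> + <b>, linking it into the sibling chain.
    std::uint32_t insert(std::uint32_t prefix, unsigned char b)
    {
        const std::uint32_t newIndex = (std::uint32_t)entries.size();
        entries.push_back({0, entries[prefix].firstChild, b});
        entries[prefix].firstChild = newIndex;
        return newIndex;
    }
};
```

Both the search links and the (prefix, byte) data live inside the dictionary entries themselves, so no external container is needed and the per-entry size is fixed.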
Another notable feature is that the memory usage of the dictionary is not dependent on the input. In other words, even if the input produces very long strings in the dictionary, that doesn't make the dictionary require more memory. This is so for both encoding and decoding.
Example implementation in C++
The example implementation is released under the GPL license.
Copyright 2007: Juha Nieminen