The basic algorithm description only mentions *"output the index to the
result"* without specifying how exactly those index values should be
written to the result.

Naturally the most space-efficient way of writing them is to write only as many bits for each index as is necessary. If, for example, our maximum bitsize of the dictionary is 16, then the index values should be written as 16-bit integers to the output (thus taking 2 bytes per index).

However, it should be noted that every index value written to the output has a maximum value of the size of the dictionary at that point. In other words, if our dictionary contains eg. only 350 entries so far, the next index written to the output will be between 0 and 350. Thus if we write that value using 16 bits, the highest 7 bits of that value will always be 0. This is, naturally, a waste of space.

Dynamic LZW (which is used eg. in the GIF format) only writes as many bits as the current dictionary size requires. In other words, if the current dictionary size is 350 elements, then the index value will be written as a 9-bit integer (instead of eg. a 16-bit one). When the element with index 511 is added to the dictionary then the output bitsize is incremented by one, ie. all indices will now by written as 10-bit integers.

In theory doing this is quite simple:

- At the beginning initialize a variable indicating the current bitsize with the number of bits necessary to store the index to the next unused dictionary element (in our simple case this would be 9 bits, as it's the amount of bits required to store the value 256).
- For efficiency you can also initialize another variable indicating the
maximum index value which can be represented with that bitsize (in this
case it would be 511). This can be calculated with a simple
"
`(1 << bitsize) - 1`

" - Each time the size of the dictionary reaches that maximum index value, increment the bitsize variable and calculate the new maximum index value which can be stored with that many bits.
- When writing an index to the result, write only
*bitsize*least significant bits of it (the higher bits will all be zeros). - When the dictionary gets full because it reaches the maximum bitsize, besides resetting the dictionary also reset the bitsize variable to the initial value.

The most laborious part is to actually create an output stream writer which writes values in chunks of the specified bits. Doing this correctly and efficiently requires minute attention to detail.

Of course an easy solution would be to create a function which just writes one bit at a time to the result. However, doing it this way would add overhead to the encoding because it's slower than writing entire bytes at a time.

Note that it's a good idea to reserve one index value to have the special meaning "end of input" (this is, for example, what the GIF format does). This can simply be the first free index value after the dictionary has been initialized. New dictionary entries will be added after it, and thus this index value will never be used for anything. When the encoding is done, output this index value as the last one. This way the decoder will know when to stop reading and will not be confused by possible remainder values in the encoded data which appear after the last encoded value.

One great feature of LZW is that there's no requirement for the initial size of the dictionary. The initial dictionary doesn't need to have exactly 256 entries. It can have any amount of entries and the algorithm will still work equally well.

This can be taken advantage of if the input uses less than 256 different byte values. For example text files usually use much less than that amount of different characters (typically less than 100). It would thus be a waste of space and resources to allocate a dictionary for all possible 256 values when not all of them are even used.

If the maximum byte value used in the input is, for example 125 (as is rather usual with English text files), we could just as well initialize the dictionary with the first 125 values and start adding new entries from index 126 forwards. This not only gives us a few more dictionary entries, but it also allows us to start writing the indices using 8 bits instead of 9.

We can push this even further and take advantage of the number of different values used in the input instead of simply looking at the maximum value used. In other words, if the input uses, for example, only 87 different values, we could initialize the dictionary with 87 entries and start the new items from index 88 forwards.

In order for this to work we have to map the input byte values to values starting from 0 forwards. In other words, the smallest byte value in the input is mapped to a 0, the next smallest to a 1 and so on.

This map must be written to the result so that the decoder will know how to make the inverse conversion. The map can be written using 32 bytes.

Of course this assumes that we can read the entire input before starting the encoding. If we can't, then the mapping can simply be skipped.

Part 4: Fast LZW dictionary search

Copyright 2007: Juha Nieminen