Archiver comparison

1 Introduction

This page compares the compression ratios and times of several modern archiver programs as well as some older ones.

1.1 Test system

The programs were tested on a Sun Blade 150 computer (UltraSparc 650MHz, 1GB RAM) using the available command-line version of each archiver program.

Each archiver was run with its maximum compression settings on the four types of data detailed below, and its compression ratio and running time were measured.
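The measurement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the compression modules of the Python standard library (zlib, bz2, lzma) as stand-ins, not the command-line archivers tested on this page; the sample data is made up; and the ratio is defined here as compressed size divided by original size, which is an assumption since the page does not spell out its definition.

```python
import bz2
import lzma
import time
import zlib


def measure(name, compress, data):
    """Return (name, ratio, seconds) for one compressor on one input."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    # Assumed definition: compressed size / original size (smaller is better).
    return name, len(packed) / len(data), elapsed


# Hypothetical sample input; the tests on this page used real files.
sample = b"The quick brown fox jumps over the lazy dog.\n" * 2000

# Each entry runs a stdlib compressor at its highest compression level.
compressors = {
    "zlib (deflate, level 9)": lambda d: zlib.compress(d, 9),
    "bz2 (level 9)": lambda d: bz2.compress(d, 9),
    "lzma (preset 9)": lambda d: lzma.compress(d, preset=9),
}

results = [measure(n, f, sample) for n, f in compressors.items()]
for name, ratio, seconds in results:
    print(f"{name:26s} ratio={ratio:.4f} time={seconds:.4f}s")
```

Timings from a sketch like this depend heavily on the machine and the input, so they should not be compared with the figures on this page.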

1.2 Archiver programs

This is the complete list of tested archiver programs:

2 Test results

2.1 Text compression

This test was performed by compressing 11 large ASCII text files ranging from 118kB to 543kB in size (the text files are from the IETF RFC repository).

The total amount of data to be compressed was 3393 kilobytes.

Text compression chart

As the chart shows, 7-zip takes the lead, with bzip2 a good second. However, 7-zip takes considerably more time than bzip2.

2.2 Binary compression

This test was performed by compressing 6 big executable binaries (Sparc-Solaris) ranging from 603kB to 4628kB. The binaries include, among others, emacs, netscape, pov-ray and Xsun.

The total amount of data to be compressed was 13156 kilobytes.

Binary compression chart

7-zip takes a clear lead in this case (Sparc binaries seemingly suit it very well). The rest of the archivers divide quite clearly into modern and classic ones (although the margin between the two groups is rather small). For some reason, binaries are pathological time-wise for the old rar. This time rar takes second place, but only marginally ahead of bzip2.

2.3 Image compression

This test was performed by compressing 7 full-color images in raw TGA format (i.e. no compression) ranging from 729kB to 2304kB.

As a comparison, the images were also converted to PNG format and recompressed with pngcrush, and the total size of these PNG files was then added to the chart. Since PNG is a completely lossless format, converting the images to PNG has the same effect as compressing them with a regular archiver; thus the comparison is a fair one.

These are small versions of the images used for the test:

Test photo 1 Test photo 2 Test photo 3 Test photo 4 Test photo 5 Test photo 6 Test photo 7

The total amount of data to be compressed was 8776 kilobytes.

Image compression chart

Perhaps a bit unexpectedly (or for some people not), the PNG files recompressed with pngcrush are clearly smaller than the raw images compressed with the other archivers. Of the other archivers, 7-zip once again takes the lead, but bzip2 compares very well (and is also faster).

2.4 C++ sources compression

This test was performed by compressing the POV-Ray 3.6 source code.

This test is different from the previous ones in that instead of consisting of a few huge files, it consists of many small ones (107 files ranging from 2kB to 219kB).

This makes a big difference between archivers which are able to compress all the files as if they were one big file and those which are not. If the archiver handles each file individually, the total compression ratio will suffer noticeably. Thus in this test these archivers were also tested in conjunction with tar (which packs all the files into one big file).

The archivers which benefited from using tar were zip, lha, advzip and rar (rar should support solid archiving, i.e. handling the files as if they were one big file, but for some reason the old version used here did not support that, even though it was mentioned in its command-line options; perhaps a newer version would not have this problem). 7-zip was the only one with no need for tar (in fact, tarring the files only made a bigger 7-zip file).
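The effect of handling files individually versus as one big file can be illustrated with a small Python sketch. It rests on several assumptions: zlib from the standard library stands in for the archivers, plain concatenation stands in for tar (a real tar file also adds headers and padding), and the generated "source files" are hypothetical.

```python
import zlib


def per_file_size(files):
    """Total size when each file is compressed on its own."""
    return sum(len(zlib.compress(data, 9)) for data in files)


def solid_size(files):
    """Size when all files are concatenated first, then compressed once."""
    return len(zlib.compress(b"".join(files), 9))


# Hypothetical stand-ins for many small, similar source files.
files = [
    b"#include <iostream>\nint main() { std::cout << "
    + str(i).encode()
    + b"; return 0; }\n"
    for i in range(100)
]

separate = per_file_size(files)
solid = solid_size(files)
print(f"per-file: {separate} bytes, solid: {solid} bytes")
```

When the files are compressed one by one, the compressor cannot reuse what it learned from earlier files, and each output carries its own fixed overhead. Compressing the concatenation lets the repeated content across files be encoded only once.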

The total amount of data to be compressed was 3506 kilobytes.

Source compression chart

The same trend continues: 7-zip takes a clear lead with bzip2 being a good second (but much faster). All the non-tarred archivers perform very poorly in this test (although lha performs quite poorly even with tar). Tarring the files before archiving helps considerably, but none of the other programs gets close to 7-zip and bzip2. For some reason C++ source code seems to be good for 7-zip and rather pathological for the others.

3 Conclusion

7-zip is a clear winner compression-wise, but at a cost of being somewhat slow. If compression ratio is all that matters, 7-zip seems the way to go. (Note that one reason for its slowness might be that it's a beta version which performs checks during the compression.)

bzip2 compares quite well with 7-zip and is clearly faster in all tests. It clearly gives the best compression/speed ratio. If compressing very well very fast is imperative, bzip2 seems like the correct choice. On the other hand, a slight disadvantage is that it compresses only a single file and thus requires an external archiver, usually tar.
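As an illustration of combining tar with bzip2, here is a minimal Python sketch using the standard library's tarfile module, which can write a bzip2-compressed tar archive in one step. The helper names make_tar_bz2 and list_tar_bz2 and the member files are hypothetical; the tests on this page used the command-line tools, not this approach.

```python
import io
import tarfile


def make_tar_bz2(members):
    """Pack a dict of {name: bytes} into an in-memory .tar.bz2 archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:bz2") as archive:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # tar needs the size before the data
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def list_tar_bz2(blob):
    """Return the member names stored in a .tar.bz2 archive."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:bz2") as archive:
        return archive.getnames()


# Hypothetical files to archive.
members = {"a.txt": b"hello\n" * 100, "b.txt": b"world\n" * 100}
blob = make_tar_bz2(members)
names = list_tar_bz2(blob)
print(names, len(blob), "bytes")
```

Here tar does the archiving (names, sizes, several files in one stream) and bzip2 only ever sees that single stream, which is the division of labour described above.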

Saying something about rar is not very relevant because only a quite ancient version was tested (as explained at the beginning of this page). This old version gives a rather good compression but is extremely slow and thus is not a good alternative.

advzip is excellent if you have zip files which you don't want to convert to anything else, but you would want them to be slightly smaller. advzip can be used to recompress them. While the sizes might not be reduced astonishingly much, it still doesn't hurt.

gzip gives a rather good compression ratio very fast. If speed matters more to you than compression ratio, gzip is a good choice (but it has the same limitation as bzip2: it compresses only a single file).

zip is the classic and is roughly on par with gzip. Since it has archiving capabilities in itself (unlike gzip), it might even be a better choice than gzip (unless you are distributing Unix sources, where tar+gzip is the de facto standard). Compatibility and portability of the archives between different platforms is also probably better. However, the best possible compression ratios are not to be expected.

lha: The nostalgic archiver (which was extremely popular on the Amiga) compares surprisingly well with its later competitor zip. However, since it clearly ends up at the bottom of all tests, it is recommended only for those old hackers who want to get some retro vibes. (In fact, lha is still quite popular in Japan.)

pngcrush is the right tool to recompress those PNG files (however, advpng should also be tried because it sometimes gives even smaller results than pngcrush). As seen in the image compression test, PNG files compress surprisingly well compared to archivers when lossless compression of images is needed.


© Copyright 2004 Juha Nieminen

