diff options
Diffstat (limited to 'contrib/compressor/algorithm.doc')
-rw-r--r-- | contrib/compressor/algorithm.doc | 58 |
1 files changed, 58 insertions, 0 deletions
diff --git a/contrib/compressor/algorithm.doc b/contrib/compressor/algorithm.doc new file mode 100644 index 00000000..74a7646c --- /dev/null +++ b/contrib/compressor/algorithm.doc @@ -0,0 +1,58 @@ +The compressor achieves an average compression rate of 60% of the +original size which is on par with "gzip". It seems that you cannot do +much better for compressing compiled binaries. This means that the +break even point for using compressed images is reached, once the +uncompressed size approaches 1.5kB. We can stuff more than 12kB into +an 8kB EPROM and more than 25kB into an 16kB EPROM. As there is only +32kB of RAM for both the uncompressed image and its BSS area, this +means that 32kB EPROMs will hardly ever be required. + +The compression algorithm uses a 4kB ring buffer for buffering the +uncompressed data. Before compression starts, the ring buffer is +filled with spaces (ASCII character 0x20). The algorithm tries to +find repeated input sequences of a maximum length of 60 bytes. All +256 different input bytes plus the 58 (60 minus a threshold of 2) +possible repeat lengths form a set of 314 symbols. These symbols are +adaptively Huffman encoded. The algorithm starts out with a Huffmann +tree that assigns equal code lengths to each of the 314 symbols +(slightly favoring the repeat symbols over symbols for regular input +characters), but it will be changed whenever the frequency of any of +the symbols changes. Frequency counts are kept in 16bit words until +the total number of compressed codes totals 2^15. Then, all frequency +counts will be halfed (rounding to the bigger number). For unrepeated +characters (symbols 0..255) the Huffman code is written to the output +stream. For repeated characters the Huffmann code, which denotes the +length of the repeated character sequence, is written out and then the +index in the ring buffer is computed. From this index, the algorithm +computes the offset relative to the current index into the ring +buffer. Thus, for typical input data, one would expect that short to +medium range offsets are more frequent than extremely short or medium +range to long range offsets. Thus the 12bit (for a 4kB buffer) offset +value is statically Huffman encoded using a precomputed Huffman tree +that favors those offset values that are deemed to be more +frequent. The Huffman encoded offset is written to the output data +stream, directly following the code that determines the length of +repeated characters. + +This algorithm, as implemented in the C example code, looks very good +and its operating parameters are already well optimized. This also +explains why it achieves compression ratios comparable with +"gzip". Depending on the input data, it sometimes excells considerably +beyond what "gzip -9" does, but this phenomenon does not appear to be +typical. There are some flaws with the algorithm, such as the limited +buffer sizes, the adaptive Huffman tree which takes very long to +change, if the input characters experience a sudden change in +distribution, and the static Huffman tree for encoding offsets into +the buffer. The slow changes of the adaptive Huffman tree are +partially counteracted by artifically keeping a 16bit precision for +the frequency counts, but this does not come into play until 32kB of +compressed data is output, so it does not have any impact on our use +for "etherboot", because the BOOT Prom does not support uncompressed +data of more then 32kB (c.f. doc/spec.doc). + +Nonetheless, these problems do not seem to affect compression of +compiled programs very much. Mixing object code with English text, +would not work too well though, and the algorithm should be reset in +between. Actually, we might gain a little improvement, if text and +data segments were compressed individually, but I have not +experimented with this option, yet. |