1 files changed, 58 insertions, 0 deletions
diff --git a/contrib/compressor/algorithm.doc b/contrib/compressor/algorithm.doc
new file mode 100644
index 00000000..74a7646c
--- /dev/null
+++ b/contrib/compressor/algorithm.doc
@@ -0,0 +1,58 @@
+The  compressor achieves  an  average compression  rate of 60%  of the
+original size which is on par with "gzip". It seems that you cannot do
+much better for compressing  compiled  binaries.  This means that  the
+break even  point  for using compressed  images is   reached, once the
+uncompressed size approaches 1.5kB. We  can stuff more than 12kB  into
+an 8kB EPROM and more than 25kB into an 16kB EPROM.   As there is only
+32kB of RAM  for both the uncompressed  image  and its BSS  area, this
+means that 32kB EPROMs will hardly ever be required.
+
+The compression  algorithm uses a  4kB  ring buffer  for buffering the
+uncompressed data. Before   compression starts,  the  ring buffer   is
+filled  with spaces (ASCII  character  0x20).  The algorithm tries  to
+find repeated  input sequences of a  maximum length of  60 bytes.  All
+256 different input  bytes  plus the 58 (60   minus a threshold of  2)
+possible  repeat lengths form a set  of 314 symbols. These symbols are
+adaptively Huffman encoded.  The  algorithm starts out with a Huffmann
+tree  that  assigns equal code lengths    to each of  the  314 symbols
+(slightly favoring the repeat  symbols over symbols for regular  input
+characters), but  it will be changed whenever  the frequency of any of
+the symbols  changes. Frequency counts are  kept in 16bit  words until
+the total number of compressed codes totals 2^15.  Then, all frequency
+counts will be halfed (rounding to the bigger number).  For unrepeated
+characters (symbols 0..255) the Huffman code  is written to the output
+stream.  For repeated characters the  Huffmann code, which denotes the
+length of the repeated character sequence, is written out and then the
+index in the ring buffer is computed.   From this index, the algorithm
+computes  the offset   relative to  the current  index  into  the ring
+buffer. Thus,  for typical input data,  one would expect that short to
+medium range offsets are more frequent  than extremely short or medium
+range to long range offsets. Thus the  12bit (for a 4kB buffer) offset
+value  is statically Huffman encoded  using a precomputed Huffman tree
+that favors  those  offset  values    that  are deemed to   be    more
+frequent. The  Huffman encoded offset  is  written to the output  data
+stream,  directly  following the code  that   determines the length of
+repeated characters.
+
+This algorithm, as implemented in the  C example code, looks very good
+and  its operating parameters are   already well optimized. This  also
+explains   why  it achieves     compression ratios    comparable  with
+"gzip". Depending on the input data, it sometimes excells considerably
+beyond what "gzip -9" does, but this  phenomenon does not appear to be
+typical. There are some flaws with  the algorithm, such as the limited
+buffer  sizes, the  adaptive  Huffman tree  which takes  very  long to
+change, if    the input  characters  experience   a sudden   change in
+distribution, and the static Huffman   tree for encoding offsets  into
+the  buffer.   The slow  changes of   the  adaptive  Huffman  tree are
+partially counteracted by  artifically keeping  a 16bit precision  for
+the frequency counts, but  this does not  come into play until 32kB of
+compressed data is output, so  it does not  have any impact on our use
+for "etherboot", because  the BOOT Prom  does not support uncompressed
+data of more then 32kB (c.f. doc/spec.doc).
+
+Nonetheless,  these problems  do  not  seem  to affect  compression of
+compiled  programs very much.  Mixing  object code with English  text,
+would not work too  well though, and  the algorithm should be reset in
+between. Actually, we  might  gain a little  improvement, if  text and
+data   segments    were compressed  individually,    but   I have  not
+experimented with this option, yet.