Diffstat (limited to 'doc/clzip.info')
-rw-r--r-- doc/clzip.info 177
1 files changed, 89 insertions, 88 deletions
diff --git a/doc/clzip.info b/doc/clzip.info
index b66195e..786d8c1 100644
--- a/doc/clzip.info
+++ b/doc/clzip.info
@@ -11,14 +11,14 @@ File: clzip.info, Node: Top, Next: Introduction, Up: (dir)
Clzip Manual
************
-This manual is for Clzip (version 1.7-rc1, 23 May 2015).
+This manual is for Clzip (version 1.7, 7 July 2015).
* Menu:
* Introduction:: Purpose and features of clzip
-* Algorithm:: How clzip compresses the data
* Invoking clzip:: Command line interface
* File format:: Detailed format of the compressed file
+* Algorithm:: How clzip compresses the data
* Examples:: A small tutorial with examples
* Problems:: Reporting bugs
* Concept index:: Index of concepts
@@ -30,7 +30,7 @@ This manual is for Clzip (version 1.7-rc1, 23 May 2015).
copy, distribute and modify it.

-File: clzip.info, Node: Introduction, Next: Algorithm, Prev: Top, Up: Top
+File: clzip.info, Node: Introduction, Next: Invoking clzip, Prev: Top, Up: Top
1 Introduction
**************
@@ -53,7 +53,8 @@ availability:
recovery means. The lziprecover program can repair bit-flip errors
(one of the most common forms of data corruption) in lzip files,
and provides data recovery capabilities, including error-checked
- merging of damaged copies of a file.
+ merging of damaged copies of a file. *note Data safety:
+ (lziprecover)Data safety.
* The lzip format is as simple as possible (but not simpler). The
lzip manual provides the code of a simple decompressor along with
@@ -87,6 +88,11 @@ bzip2, which makes it safer than compressors returning ambiguous warning
values (like gzip) when it is used as a back end for other programs like
tar or zutils.
+ Clzip will automatically use the smallest possible dictionary size
+for each file without exceeding the given limit. Keep in mind that the
+decompression memory requirement is affected at compression time by the
+choice of dictionary size limit.
+
The amount of memory required for compression is about 1 or 2 times
the dictionary size limit (1 if input file size is less than dictionary
size limit, else 2) plus 9 times the dictionary size really used. The
@@ -94,11 +100,6 @@ option '-0' is special and only requires about 1.5 MiB at most. The
amount of memory required for decompression is about 46 kB larger than
the dictionary size really used.
- Clzip will automatically use the smallest possible dictionary size
-for each file without exceeding the given limit. Keep in mind that the
-decompression memory requirement is affected at compression time by the
-choice of dictionary size limit.
-
When compressing, clzip replaces every file given in the command line
with a compressed version of itself, with the name "original_name.lz".
When decompressing, clzip attempts to guess the name for the
@@ -138,75 +139,9 @@ automatically creating multi-member output. The members so created are
large, about 2 PiB each.

-File: clzip.info, Node: Algorithm, Next: Invoking clzip, Prev: Introduction, Up: Top
-
-2 Algorithm
-***********
-
-In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a
-concrete algorithm; it is more like "any algorithm using the LZMA coding
-scheme". For example, the option '-0' of lzip uses the scheme in almost
-the simplest way possible; issuing the longest match it can find, or a
-literal byte if it can't find a match. Inversely, a much more elaborated
-way of finding coding sequences of minimum size than the one currently
-used by lzip could be developed, and the resulting sequence could also
-be coded using the LZMA coding scheme.
-
- Clzip currently implements two variants of the LZMA algorithm; fast
-(used by option -0) and normal (used by all other compression levels).
-
- The high compression of LZMA comes from combining two basic,
-well-proven compression ideas: sliding dictionaries (LZ77/78) and
-markov models (the thing used by every compression algorithm that uses
-a range encoder or similar order-0 entropy coder as its last stage)
-with segregation of contexts according to what the bits are used for.
-
- Clzip is a two stage compressor. The first stage is a Lempel-Ziv
-coder, which reduces redundancy by translating chunks of data to their
-corresponding distance-length pairs. The second stage is a range encoder
-that uses a different probability model for each type of data;
-distances, lengths, literal bytes, etc.
-
- Here is how it works, step by step:
-
- 1) The member header is written to the output stream.
-
- 2) The first byte is coded literally, because there are no previous
-bytes to which the match finder can refer to.
-
- 3) The main encoder advances to the next byte in the input data and
-calls the match finder.
-
- 4) The match finder fills an array with the minimum distances before
-the current byte where a match of a given length can be found.
-
- 5) Go back to step 3 until a sequence (formed of pairs, repeated
-distances and literal bytes) of minimum price has been formed. Where the
-price represents the number of output bits produced.
-
- 6) The range encoder encodes the sequence produced by the main
-encoder and sends the produced bytes to the output stream.
-
- 7) Go back to step 3 until the input data are finished or until the
-member or volume size limits are reached.
-
- 8) The range encoder is flushed.
-
- 9) The member trailer is written to the output stream.
-
- 10) If there are more data to compress, go back to step 1.
-
-
-The ideas embodied in clzip are due to (at least) the following people:
-Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for
-the definition of Markov chains), G.N.N. Martin (for the definition of
-range encoding), Igor Pavlov (for putting all the above together in
-LZMA), and Julian Seward (for bzip2's CLI).
-
-
-File: clzip.info, Node: Invoking clzip, Next: File format, Prev: Algorithm, Up: Top
+File: clzip.info, Node: Invoking clzip, Next: File format, Prev: Introduction, Up: Top
-3 Invoking clzip
+2 Invoking clzip
****************
The format for running clzip is:
@@ -246,7 +181,7 @@ The format for running clzip is:
'-F'
'--recompress'
- Force recompression of files whose name already has the '.lz' or
+ Force re-compression of files whose name already has the '.lz' or
'.tlz' suffix.
'-k'
@@ -363,9 +298,9 @@ invalid input file, 3 for an internal consistency error (eg, bug) which
caused clzip to panic.

-File: clzip.info, Node: File format, Next: Examples, Prev: Invoking clzip, Up: Top
+File: clzip.info, Node: File format, Next: Algorithm, Prev: Invoking clzip, Up: Top
-4 File format
+3 File format
*************
Perfection is reached, not when there is no longer anything to add, but
@@ -434,7 +369,73 @@ additional information before, between, or after them.

-File: clzip.info, Node: Examples, Next: Problems, Prev: File format, Up: Top
+File: clzip.info, Node: Algorithm, Next: Examples, Prev: File format, Up: Top
+
+4 Algorithm
+***********
+
+In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a
+concrete algorithm; it is more like "any algorithm using the LZMA coding
+scheme". For example, the option '-0' of lzip uses the scheme in almost
+the simplest way possible; issuing the longest match it can find, or a
+literal byte if it can't find a match. Inversely, a much more elaborated
+way of finding coding sequences of minimum size than the one currently
+used by lzip could be developed, and the resulting sequence could also
+be coded using the LZMA coding scheme.
+
+ Clzip currently implements two variants of the LZMA algorithm; fast
+(used by option '-0') and normal (used by all other compression levels).
+
+ The high compression of LZMA comes from combining two basic,
+well-proven compression ideas: sliding dictionaries (LZ77/78) and
+markov models (the thing used by every compression algorithm that uses
+a range encoder or similar order-0 entropy coder as its last stage)
+with segregation of contexts according to what the bits are used for.
+
+ Clzip is a two stage compressor. The first stage is a Lempel-Ziv
+coder, which reduces redundancy by translating chunks of data to their
+corresponding distance-length pairs. The second stage is a range encoder
+that uses a different probability model for each type of data;
+distances, lengths, literal bytes, etc.
+
+ Here is how it works, step by step:
+
+ 1) The member header is written to the output stream.
+
+ 2) The first byte is coded literally, because there are no previous
+bytes to which the match finder can refer to.
+
+ 3) The main encoder advances to the next byte in the input data and
+calls the match finder.
+
+ 4) The match finder fills an array with the minimum distances before
+the current byte where a match of a given length can be found.
+
+ 5) Go back to step 3 until a sequence (formed of pairs, repeated
+distances and literal bytes) of minimum price has been formed. Where the
+price represents the number of output bits produced.
+
+ 6) The range encoder encodes the sequence produced by the main
+encoder and sends the produced bytes to the output stream.
+
+ 7) Go back to step 3 until the input data are finished or until the
+member or volume size limits are reached.
+
+ 8) The range encoder is flushed.
+
+ 9) The member trailer is written to the output stream.
+
+ 10) If there are more data to compress, go back to step 1.
+
+
+The ideas embodied in clzip are due to (at least) the following people:
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for
+the definition of Markov chains), G.N.N. Martin (for the definition of
+range encoding), Igor Pavlov (for putting all the above together in
+LZMA), and Julian Seward (for bzip2's CLI).
+
+
+File: clzip.info, Node: Examples, Next: Problems, Prev: Algorithm, Up: Top
5 A small tutorial with examples
********************************
@@ -545,13 +546,13 @@ Concept index

Tag Table:
Node: Top210
-Node: Introduction897
-Node: Algorithm6100
-Node: Invoking clzip8930
-Node: File format14479
-Node: Examples16881
-Node: Problems18850
-Node: Concept index19376
+Node: Introduction893
+Node: Invoking clzip6152
+Node: File format11705
+Node: Algorithm14108
+Node: Examples16933
+Node: Problems18900
+Node: Concept index19426

End Tag Table
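
Note on the Introduction hunk: the memory figures quoted there (1 or 2 times the dictionary size limit plus 9 times the dictionary size really used for compression, and about 46 kB more than the dictionary size really used for decompression) can be turned into a rough calculation. The C sketch below only illustrates that arithmetic; it is not code from clzip. The function names are hypothetical, "kB" is assumed to mean 1000 bytes, and the dictionary size really used is taken as an input because the manual does not give a formula for it here. The special case of option '-0' (about 1.5 MiB at most) is not modeled.

```c
/* Rough memory estimates derived from the figures in the clzip manual.
   Hypothetical helper names; not part of clzip itself. */

/* Compression: 1x the dictionary size limit if the input is smaller than
   the limit, else 2x, plus 9x the dictionary size really used. */
unsigned long long compress_mem_estimate( unsigned long long input_size,
                                          unsigned long long dict_limit,
                                          unsigned long long dict_used )
  {
  const unsigned long long factor = ( input_size < dict_limit ) ? 1 : 2;
  return factor * dict_limit + 9 * dict_used;
  }

/* Decompression: about 46 kB more than the dictionary size really used.
   Assumption: kB is read as 1000 bytes. */
unsigned long long decompress_mem_estimate( unsigned long long dict_used )
  {
  return dict_used + 46 * 1000ULL;
  }
```

For instance, under these assumptions a 1 MiB input compressed with an 8 MiB limit and a 1 MiB dictionary really used comes to about 8 + 9 = 17 MiB, while a 100 MiB input with an 8 MiB limit fully used comes to about 16 + 72 = 88 MiB.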