Diffstat (limited to '')
-rw-r--r-- doc/clzip.texi (renamed from doc/clzip.texinfo) 93
1 files changed, 52 insertions, 41 deletions
diff --git a/doc/clzip.texinfo b/doc/clzip.texi
index 95bfe68..25869a0 100644
--- a/doc/clzip.texinfo
+++ b/doc/clzip.texi
@@ -6,8 +6,8 @@
@finalout
@c %**end of header
-@set UPDATED 17 September 2013
-@set VERSION 1.5
+@set UPDATED 30 January 2014
+@set VERSION 1.6-pre1
@dircategory Data Compression
@direntry
@@ -45,7 +45,7 @@ This manual is for Clzip (version @value{VERSION}, @value{UPDATED}).
@end menu
@sp 1
-Copyright @copyright{} 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
+Copyright @copyright{} 2010, 2011, 2012, 2013, 2014 Antonio Diaz Diaz.
This manual is free documentation: you have unlimited permission
to copy, distribute and modify it.
@@ -56,10 +56,10 @@ to copy, distribute and modify it.
@cindex introduction
Clzip is a lossless data compressor with a user interface similar to the
-one of gzip or bzip2. Clzip decompresses almost as fast as gzip and
-compresses more than bzip2, which makes it well suited for software
-distribution and data archiving. Clzip is a clean implementation of the
-LZMA algorithm.
+one of gzip or bzip2. Clzip decompresses almost as fast as gzip,
+compresses most files more than bzip2, and is better than both from a
+data recovery perspective. Clzip is a clean implementation of the LZMA
+algorithm.
Clzip uses the lzip file format; the files produced by clzip are fully
compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
@@ -67,17 +67,23 @@ Clzip is in fact a C language version of lzip, intended for embedded
devices or systems lacking a C++ compiler.
The lzip file format is designed for long-term data archiving and
-provides very safe integrity checking. The member trailer stores the
-32-bit CRC of the original data, the size of the original data and the
-size of the member. These values, together with the value remaining in
-the range decoder and the end-of-stream marker, provide a 4 factor
-integrity checking which guarantees that the decompressed version of the
-data is identical to the original. This guards against corruption of the
-compressed data, and against undetected bugs in clzip (hopefully very
-unlikely). The chances of data corruption going undetected are
-microscopic. Be aware, though, that the check occurs upon decompression,
-so it can only tell you that something is wrong. It can't help you
-recover the original uncompressed data.
+provides very safe integrity checking. It is as simple as possible (but
+not simpler), so that with the only help of the lzip manual it would be
+possible for a digital archaeologist to extract the data from a lzip
+file long after quantum computers eventually render LZMA obsolete.
+Additionally lzip is copylefted, which guarantees that it will remain
+free forever.
+
+The member trailer stores the 32-bit CRC of the original data, the size
+of the original data and the size of the member. These values, together
+with the value remaining in the range decoder and the end-of-stream
+marker, provide a 4 factor integrity checking which guarantees that the
+decompressed version of the data is identical to the original. This
+guards against corruption of the compressed data, and against undetected
+bugs in clzip (hopefully very unlikely). The chances of data corruption
+going undetected are microscopic. Be aware, though, that the check
+occurs upon decompression, so it can only tell you that something is
+wrong. It can't help you recover the original uncompressed data.
If you ever need to recover data from a damaged lzip file, try the
lziprecover program. Lziprecover makes lzip files resistant to bit-flip
@@ -86,15 +92,28 @@ recovery capabilities, including error-checked merging of damaged copies
of a file.
Clzip uses the same well-defined exit status values used by lzip and
-bzip2, which makes it safer when used in pipes or scripts than
-compressors returning ambiguous warning values, like gzip.
+bzip2, which makes it safer than compressors returning ambiguous warning
+values (like gzip) when it is used as a back end for tar or zutils.
-Clzip replaces every file given in the command line with a compressed
-version of itself, with the name "original_name.lz". Each compressed
-file has the same modification date, permissions, and, when possible,
-ownership as the corresponding original, so that these properties can be
-correctly restored at decompression time. Clzip is able to read from some
-types of non regular files if the @samp{--stdout} option is specified.
+When compressing, clzip replaces every file given in the command line
+with a compressed version of itself, with the name "original_name.lz".
+When decompressing, clzip attempts to guess the name for the decompressed
+file from that of the compressed file as follows:
+
+@multitable {anyothername} {becomes} {anyothername.out}
+@item filename.lz @tab becomes @tab filename
+@item filename.tlz @tab becomes @tab filename.tar
+@item anyothername @tab becomes @tab anyothername.out
+@end multitable
+
+(De)compressing a file is much like copying or moving it; therefore clzip
+preserves the access and modification dates, permissions, and, when
+possible, ownership of the file just as "cp -p" does. (If the user ID or
+the group ID can't be duplicated, the file permission bits S_ISUID and
+S_ISGID are cleared).
+
+Clzip is able to read from some types of non regular files if the
+@samp{--stdout} option is specified.
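
The name-guessing rules in the table added above can be sketched in C as follows. This is only an illustration of the three rules, not clzip's actual code: the function name `guess_output_name`, its buffer-based interface, and the requirement of a non-empty stem before the extension are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the guessed name of the decompressed file
   for 'name' into 'out' (capacity 'out_size').
   Returns 0 on success, -1 if the result does not fit.
   Rules, as listed in the table:
     filename.lz   becomes  filename
     filename.tlz  becomes  filename.tar
     anyothername  becomes  anyothername.out */
static int guess_output_name(const char *name, char *out, size_t out_size)
{
    static const struct { const char *from; const char *to; } known[] = {
        { ".lz", "" }, { ".tlz", ".tar" }
    };
    size_t len, i;
    int n;

    assert(name != NULL && out != NULL);
    len = strlen(name);
    for (i = 0; i < sizeof known / sizeof known[0]; ++i) {
        size_t flen = strlen(known[i].from);
        /* Assumption: the extension must follow a non-empty stem. */
        if (len > flen && strcmp(name + len - flen, known[i].from) == 0) {
            n = snprintf(out, out_size, "%.*s%s",
                         (int)(len - flen), name, known[i].to);
            return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
        }
    }
    /* No known extension: append ".out". */
    n = snprintf(out, out_size, "%s.out", name);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

A caller would pass a buffer large enough for the input name plus four characters and the terminating null, which covers the longest case (appending ".out").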
If no file names are specified, clzip compresses (or decompresses) from
standard input to standard output. In this case, clzip will decline to
@@ -119,23 +138,14 @@ large, about 64 PiB each.
The amount of memory required for compression is about 1 or 2 times the
dictionary size limit (1 if input file size is less than dictionary size
limit, else 2) plus 9 times the dictionary size really used. The amount
-of memory required for decompression is only a few tens of KiB larger
-than the dictionary size really used.
+of memory required for decompression is about 46 kB larger than the
+dictionary size really used.
Clzip will automatically use the smallest possible dictionary size
without exceeding the given limit. Keep in mind that the decompression
memory requirement is affected at compression time by the choice of
dictionary size limit.
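
The memory figures in the revised paragraph above can be turned into a rough estimate. The sketch below is a reading of that paragraph, not code from clzip: the function names are hypothetical, and "46 kB" is assumed to mean 46 * 1000 bytes.

```c
#include <assert.h>

/* Approximate memory needed for compression, in bytes:
   1 or 2 times the dictionary size limit (1 if the input is smaller
   than the limit, else 2) plus 9 times the dictionary size really used. */
static unsigned long long
estimate_compress_mem(unsigned long long input_size,
                      unsigned long long dict_limit,
                      unsigned long long dict_used)
{
    unsigned long long factor;

    /* The size really used never exceeds the given limit. */
    assert(dict_used <= dict_limit);
    factor = (input_size < dict_limit) ? 1 : 2;
    return factor * dict_limit + 9 * dict_used;
}

/* Approximate memory needed for decompression, in bytes:
   about 46 kB more than the dictionary size really used.
   Assumption: kB is taken as 1000 bytes. */
static unsigned long long
estimate_decompress_mem(unsigned long long dict_used)
{
    return dict_used + 46ULL * 1000ULL;
}
```

For a 1 MiB input compressed with an 8 MiB limit and a 1 MiB dictionary actually used, the first function gives 8 MiB + 9 MiB = 17 MiB. This also shows why the limit chosen at compression time matters later: the dictionary size really used is what the decompressor must allocate.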
-When decompressing, clzip attempts to guess the name for the decompressed
-file from that of the compressed file as follows:
-
-@multitable {anyothername} {becomes} {anyothername.out}
-@item filename.lz @tab becomes @tab filename
-@item filename.tlz @tab becomes @tab filename.tar
-@item anyothername @tab becomes @tab anyothername.out
-@end multitable
-
@node Algorithm
@chapter Algorithm
@@ -180,7 +190,7 @@ price represents the number of output bits produced.
6) The range encoder encodes the sequence produced by the main encoder
and sends the produced bytes to the output stream.
-7) Go back to step 3 until the input data is finished or until the
+7) Go back to step 3 until the input data are finished or until the
member or volume size limits are reached.
8) The range encoder is flushed.
@@ -420,8 +430,9 @@ Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB@*
Valid values for dictionary size range from 4 KiB to 512 MiB.
@item Lzma stream
-The lzma stream, finished by an end of stream marker. Uses default values
-for encoder properties. See the lzip manual for a full description.
+The lzma stream, finished by an end of stream marker. Uses default
+values for encoder properties. See the lzip manual for a full
+description.
@item CRC32 (4 bytes)
CRC of the uncompressed original data.
@@ -443,7 +454,7 @@ facilitates safe recovery of undamaged members from multi-member files.
WARNING! Even if clzip is bug-free, other causes may result in a corrupt
compressed file (bugs in the system libraries, memory errors, etc).
-Therefore, if the data you are going to compress is important, give the
+Therefore, if the data you are going to compress are important, give the
@samp{--keep} option to clzip and do not remove the original file until
you verify the compressed file with a command like
@w{@samp{clzip -cd file.lz | cmp file -}}.
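
The dictionary size example quoted in the file-format hunk above (0xD3 = 2^19 - 6 * 2^15 = 320 KiB) can be reproduced with a short C sketch. The bit layout used here is an assumption drawn from the lzip format description rather than from this diff: bits 4-0 hold the base-2 logarithm of the base size, and bits 7-5 hold the number of sixteenths of the base size to subtract. The function name is hypothetical, and the sketch simply returns 0 for anything outside the 4 KiB to 512 MiB range stated in the manual, which may differ from how clzip itself treats edge cases.

```c
#include <assert.h>

/* Hypothetical decoder for the coded dictionary size byte.
   Returns the dictionary size in bytes, or 0 if the result falls
   outside the valid range of 4 KiB to 512 MiB. */
static unsigned long decode_dict_size(unsigned char coded)
{
    const unsigned long min_size = 1UL << 12;   /* 4 KiB */
    unsigned log2_base = coded & 0x1F;          /* bits 4-0 */
    unsigned wedges = (coded >> 5) & 0x07;      /* bits 7-5 */
    unsigned long size;

    assert(wedges <= 7);
    if (log2_base < 12 || log2_base > 29)       /* 2^29 = 512 MiB */
        return 0;
    size = 1UL << log2_base;
    size -= wedges * (size / 16);               /* subtract the wedges */
    return (size >= min_size) ? size : 0;
}
```

With the byte from the manual's example, `decode_dict_size(0xD3)` splits into a base logarithm of 19 and 6 wedges, giving 524288 - 6 * 32768 = 327680 bytes, that is, 320 KiB.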