author: Daniel Baumann <daniel.baumann@progress-linux.org> 2022-02-21 16:29:07 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2022-02-21 16:29:07 +0000
commit: 3b6a991863be64d009e1b700561526e2ecfcd98d
tree: 4d7038cdde9ffce40eaa9280d3f79eaa319044b4 /doc
parent: Adding upstream version 1.9.
Adding upstream version 1.10. (upstream/1.10)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc')
-rw-r--r-- doc/plzip.1      23
-rw-r--r-- doc/plzip.info  283
-rw-r--r-- doc/plzip.texi  287
3 files changed, 308 insertions, 285 deletions
diff --git a/doc/plzip.1 b/doc/plzip.1
index deb0ea5..4be148d 100644
--- a/doc/plzip.1
+++ b/doc/plzip.1
@@ -1,5 +1,5 @@
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.16.
-.TH PLZIP "1" "January 2021" "plzip 1.9" "User Commands"
+.TH PLZIP "1" "January 2022" "plzip 1.10" "User Commands"
.SH NAME
plzip \- reduces the size of files
.SH SYNOPSIS
@@ -11,13 +11,14 @@ compatible with lzip 1.4 or newer. Plzip uses the compression library lzlib.
.PP
Lzip is a lossless data compressor with a user interface similar to the one
of gzip or bzip2. Lzip uses a simplified form of the 'Lempel\-Ziv\-Markov
-chain\-Algorithm' (LZMA) stream format, chosen to maximize safety and
-interoperability. Lzip can compress about as fast as gzip (lzip \fB\-0\fR) or
-compress most files more than bzip2 (lzip \fB\-9\fR). Decompression speed is
-intermediate between gzip and bzip2. Lzip is better than gzip and bzip2 from
-a data recovery perspective. Lzip has been designed, written, and tested
-with great care to replace gzip and bzip2 as the standard general\-purpose
-compressed format for unix\-like systems.
+chain\-Algorithm' (LZMA) stream format and provides a 3 factor integrity
+checking to maximize interoperability and optimize safety. Lzip can compress
+about as fast as gzip (lzip \fB\-0\fR) or compress most files more than bzip2
+(lzip \fB\-9\fR). Decompression speed is intermediate between gzip and bzip2.
+Lzip is better than gzip and bzip2 from a data recovery perspective. Lzip
+has been designed, written, and tested with great care to replace gzip and
+bzip2 as the standard general\-purpose compressed format for unix\-like
+systems.
.PP
Plzip can compress/decompress large files on multiprocessor machines much
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
@@ -116,7 +117,7 @@ To extract all the files from archive 'foo.tar.lz', use the commands
.PP
Exit status: 0 for a normal exit, 1 for environmental problems (file
not found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or
-invalid input file, 3 for an internal consistency error (eg, bug) which
+invalid input file, 3 for an internal consistency error (e.g., bug) which
caused plzip to panic.
.SH "REPORTING BUGS"
Report bugs to lzip\-bug@nongnu.org
@@ -125,8 +126,8 @@ Plzip home page: http://www.nongnu.org/lzip/plzip.html
.SH COPYRIGHT
Copyright \(co 2009 Laszlo Ersek.
.br
-Copyright \(co 2021 Antonio Diaz Diaz.
-Using lzlib 1.12
+Copyright \(co 2022 Antonio Diaz Diaz.
+Using lzlib 1.13
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
.br
This is free software: you are free to change and redistribute it.
diff --git a/doc/plzip.info b/doc/plzip.info
index d70163e..c38ea5c 100644
--- a/doc/plzip.info
+++ b/doc/plzip.info
@@ -1,6 +1,6 @@
This is plzip.info, produced by makeinfo version 4.13+ from plzip.texi.
-INFO-DIR-SECTION Data Compression
+INFO-DIR-SECTION Compression
START-INFO-DIR-ENTRY
* Plzip: (plzip). Massively parallel implementation of lzip
END-INFO-DIR-ENTRY
@@ -11,7 +11,7 @@ File: plzip.info, Node: Top, Next: Introduction, Up: (dir)
Plzip Manual
************
-This manual is for Plzip (version 1.9, 3 January 2021).
+This manual is for Plzip (version 1.10, 24 January 2022).
* Menu:
@@ -19,16 +19,16 @@ This manual is for Plzip (version 1.9, 3 January 2021).
* Output:: Meaning of plzip's output
* Invoking plzip:: Command line interface
* Program design:: Internal structure of plzip
-* File format:: Detailed format of the compressed file
* Memory requirements:: Memory required to compress and decompress
* Minimum file sizes:: Minimum file sizes required for full speed
+* File format:: Detailed format of the compressed file
* Trailing data:: Extra data appended to the file
* Examples:: A small tutorial with examples
* Problems:: Reporting bugs
* Concept index:: Index of concepts
- Copyright (C) 2009-2021 Antonio Diaz Diaz.
+ Copyright (C) 2009-2022 Antonio Diaz Diaz.
This manual is free documentation: you have unlimited permission to copy,
distribute, and modify it.
@@ -44,13 +44,14 @@ compatible with lzip 1.4 or newer. Plzip uses the compression library lzlib.
Lzip is a lossless data compressor with a user interface similar to the
one of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
-chain-Algorithm' (LZMA) stream format, chosen to maximize safety and
-interoperability. Lzip can compress about as fast as gzip (lzip -0) or
-compress most files more than bzip2 (lzip -9). Decompression speed is
-intermediate between gzip and bzip2. Lzip is better than gzip and bzip2 from
-a data recovery perspective. Lzip has been designed, written, and tested
-with great care to replace gzip and bzip2 as the standard general-purpose
-compressed format for unix-like systems.
+chain-Algorithm' (LZMA) stream format and provides a 3 factor integrity
+checking to maximize interoperability and optimize safety. Lzip can compress
+about as fast as gzip (lzip -0) or compress most files more than bzip2
+(lzip -9). Decompression speed is intermediate between gzip and bzip2. Lzip
+is better than gzip and bzip2 from a data recovery perspective. Lzip has
+been designed, written, and tested with great care to replace gzip and
+bzip2 as the standard general-purpose compressed format for unix-like
+systems.
Plzip can compress/decompress large files on multiprocessor machines much
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
@@ -107,7 +108,7 @@ filename.lz becomes filename
filename.tlz becomes filename.tar
anyothername becomes anyothername.out
- (De)compressing a file is much like copying or moving it; therefore plzip
+ (De)compressing a file is much like copying or moving it. Therefore plzip
preserves the access and modification dates, permissions, and, when
possible, ownership of the file just as 'cp -p' does. (If the user ID or
the group ID can't be duplicated, the file permission bits S_ISUID and
@@ -206,7 +207,7 @@ once, the first time it appears in the command line.
'-B BYTES'
'--data-size=BYTES'
- When compressing, set the size of the input data blocks in bytes. The
+ When compressing, set the size in bytes of the input data blocks. The
input file will be divided in chunks of this size before compression is
performed. Valid values range from 8 KiB to 1 GiB. Default value is
two times the dictionary size, except for option '-0' where it
@@ -224,10 +225,13 @@ once, the first time it appears in the command line.
'-d'
'--decompress'
- Decompress the files specified. If a file does not exist or can't be
- opened, plzip continues decompressing the rest of the files. If a file
- fails to decompress, or is a terminal, plzip exits immediately without
- decompressing the rest of the files.
+ Decompress the files specified. If a file does not exist, can't be
+ opened, or the destination file already exists and '--force' has not
+ been specified, plzip continues decompressing the rest of the files
+ and exits with error status 1. If a file fails to decompress, or is a
+ terminal, plzip exits immediately with error status 2 without
+ decompressing the rest of the files. A terminal is considered an
+ uncompressed file, and therefore invalid.
'-f'
'--force'
@@ -253,10 +257,12 @@ once, the first time it appears in the command line.
positions and sizes of each member in multimember files are also
printed.
- '-lq' can be used to verify quickly (without decompressing) the
- structural integrity of the files specified. (Use '--test' to verify
- the data integrity). '-alq' additionally verifies that none of the
- files specified contain trailing data.
+ If any file is damaged, does not exist, can't be opened, or is not
+ regular, the final exit status will be > 0. '-lq' can be used to verify
+ quickly (without decompressing) the structural integrity of the files
+ specified. (Use '--test' to verify the data integrity). '-alq'
+ additionally verifies that none of the files specified contain
+ trailing data.
'-m BYTES'
'--match-length=BYTES'
@@ -395,9 +401,10 @@ once, the first time it appears in the command line.
actually being used at run time and exit. Report any differences
found. Exit with error status 1 if differences are found. A mismatch
may indicate that lzlib is not correctly installed or that a different
- version of lzlib has been installed after compiling plzip.
+ version of lzlib has been installed after compiling plzip. Exit with
+ error status 2 if LZ_API_VERSION and LZ_version_string don't match.
'plzip -v --check-lib' shows the version of lzlib being used and the
- value of 'LZ_API_VERSION' (if defined). *Note Library version:
+ value of LZ_API_VERSION (if defined). *Note Library version:
(lzlib)Library version.
@@ -419,18 +426,19 @@ Y yottabyte (10^24) | Yi yobibyte (2^80)
Exit status: 0 for a normal exit, 1 for environmental problems (file not
found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or invalid
-input file, 3 for an internal consistency error (eg, bug) which caused
+input file, 3 for an internal consistency error (e.g., bug) which caused
plzip to panic.

-File: plzip.info, Node: Program design, Next: File format, Prev: Invoking plzip, Up: Top
+File: plzip.info, Node: Program design, Next: Memory requirements, Prev: Invoking plzip, Up: Top
4 Internal structure of plzip
*****************************
When compressing, plzip divides the input file into chunks and compresses as
many chunks simultaneously as worker threads are chosen, creating a
-multimember compressed file.
+multimember compressed file. Each chunk is compressed in-place (using the
+same buffer for input and output), reducing the amount of RAM required.
When decompressing, plzip decompresses as many members simultaneously as
worker threads are chosen. Files that were compressed with lzip will not be
@@ -448,14 +456,14 @@ to the workers. The workers (de)compress the blocks received from the
splitter. The muxer collects processed packets from the workers, and writes
them to the output file.
- ,------------,
+ .------------.
,-->| worker 0 |--,
| `------------' |
-,-------, ,----------, | ,------------, | ,-------, ,--------,
+.-------. .----------. | .------------. | .-------. .--------.
| input |-->| splitter |-+-->| worker 1 |--+-->| muxer |-->| output |
| file | `----------' | `------------' | `-------' | file |
`-------' | ... | `--------'
- | ,------------, |
+ | .------------. |
`-->| worker N-1 |--'
`------------'
@@ -467,82 +475,9 @@ reduced and the decompression speed of large files with many members is
only limited by the number of processors available and by I/O speed.

-File: plzip.info, Node: File format, Next: Memory requirements, Prev: Program design, Up: Top
-
-5 File format
-*************
-
-Perfection is reached, not when there is no longer anything to add, but
-when there is no longer anything to take away.
--- Antoine de Saint-Exupery
-
-
- In the diagram below, a box like this:
-
-+---+
-| | <-- the vertical bars might be missing
-+---+
-
- represents one byte; a box like this:
-
-+==============+
-| |
-+==============+
-
- represents a variable number of bytes.
-
-
- A lzip file consists of a series of "members" (compressed data sets).
-The members simply appear one after another in the file, with no additional
-information before, between, or after them.
-
- Each member has the following structure:
-
-+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
-+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-
- All multibyte values are stored in little endian order.
-
-'ID string (the "magic" bytes)'
- A four byte string, identifying the lzip format, with the value "LZIP"
- (0x4C, 0x5A, 0x49, 0x50).
-
-'VN (version number, 1 byte)'
- Just in case something needs to be modified in the future. 1 for now.
-
-'DS (coded dictionary size, 1 byte)'
- The dictionary size is calculated by taking a power of 2 (the base
- size) and subtracting from it a fraction between 0/16 and 7/16 of the
- base size.
- Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).
- Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
- from the base size to obtain the dictionary size.
- Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
- Valid values for dictionary size range from 4 KiB to 512 MiB.
-
-'LZMA stream'
- The LZMA stream, finished by an end of stream marker. Uses default
- values for encoder properties. *Note Stream format: (lzip)Stream
- format, for a complete description.
-
-'CRC32 (4 bytes)'
- Cyclic Redundancy Check (CRC) of the uncompressed original data.
-
-'Data size (8 bytes)'
- Size of the uncompressed original data.
-
-'Member size (8 bytes)'
- Total size of the member, including header and trailer. This field acts
- as a distributed index, allows the verification of stream integrity,
- and facilitates safe recovery of undamaged members from multimember
- files.
-
-
-
-File: plzip.info, Node: Memory requirements, Next: Minimum file sizes, Prev: File format, Up: Top
+File: plzip.info, Node: Memory requirements, Next: Minimum file sizes, Prev: Program design, Up: Top
-6 Memory required to compress and decompress
+5 Memory required to compress and decompress
********************************************
The amount of memory required *per worker thread* for decompression or
@@ -588,9 +523,9 @@ Level Memory required
-9 568 MiB

-File: plzip.info, Node: Minimum file sizes, Next: Trailing data, Prev: Memory requirements, Up: Top
+File: plzip.info, Node: Minimum file sizes, Next: File format, Prev: Memory requirements, Up: Top
-7 Minimum file sizes required for full compression speed
+6 Minimum file sizes required for full compression speed
********************************************************
When compressing, plzip divides the input file into chunks and compresses
@@ -625,7 +560,83 @@ Level
-9 128 MiB 256 MiB 512 MiB 1 GiB 4 GiB 16 GiB

-File: plzip.info, Node: Trailing data, Next: Examples, Prev: Minimum file sizes, Up: Top
+File: plzip.info, Node: File format, Next: Trailing data, Prev: Minimum file sizes, Up: Top
+
+7 File format
+*************
+
+Perfection is reached, not when there is no longer anything to add, but
+when there is no longer anything to take away.
+-- Antoine de Saint-Exupery
+
+
+ In the diagram below, a box like this:
+
++---+
+| | <-- the vertical bars might be missing
++---+
+
+ represents one byte; a box like this:
+
++==============+
+| |
++==============+
+
+ represents a variable number of bytes.
+
+
+ A lzip file consists of a series of independent "members" (compressed
+data sets). The members simply appear one after another in the file, with no
+additional information before, between, or after them. Each member can
+encode in compressed form up to 16 EiB - 1 byte of uncompressed data. The
+size of a multimember file is unlimited.
+
+ Each member has the following structure:
+
++--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
++--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ All multibyte values are stored in little endian order.
+
+'ID string (the "magic" bytes)'
+ A four byte string, identifying the lzip format, with the value "LZIP"
+ (0x4C, 0x5A, 0x49, 0x50).
+
+'VN (version number, 1 byte)'
+ Just in case something needs to be modified in the future. 1 for now.
+
+'DS (coded dictionary size, 1 byte)'
+ The dictionary size is calculated by taking a power of 2 (the base
+ size) and subtracting from it a fraction between 0/16 and 7/16 of the
+ base size.
+ Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).
+ Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
+ from the base size to obtain the dictionary size.
+ Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
+ Valid values for dictionary size range from 4 KiB to 512 MiB.
+
+'LZMA stream'
+ The LZMA stream, finished by an "End Of Stream" marker. Uses default
+ values for encoder properties. *Note Stream format: (lzip)Stream
+ format, for a complete description.
+
+'CRC32 (4 bytes)'
+ Cyclic Redundancy Check (CRC) of the original uncompressed data.
+
+'Data size (8 bytes)'
+ Size of the original uncompressed data.
+
+'Member size (8 bytes)'
+ Total size of the member, including header and trailer. This field acts
+ as a distributed index, allows the verification of stream integrity,
+ and facilitates the safe recovery of undamaged members from
+ multimember files. Member size should be limited to 2 PiB to prevent
+ the data size field from overflowing.
+
+
+
+File: plzip.info, Node: Trailing data, Next: Examples, Prev: File format, Up: Top
8 Extra data appended to the file
*********************************
@@ -699,7 +710,7 @@ show the compression ratio.
plzip -v file
-Example 3: Like example 1 but the created 'file.lz' has a block size of
+Example 3: Like example 2 but the created 'file.lz' has a block size of
1 MiB. The compression ratio is not shown.
plzip -B 1MiB file
@@ -717,15 +728,7 @@ status.
plzip -tv file.lz
-Example 6: Compress a whole device in /dev/sdc and send the output to
-'file.lz'.
-
- plzip -c /dev/sdc > file.lz
- or
- plzip /dev/sdc -o file.lz
-
-
-Example 7: The right way of concatenating the decompressed output of two or
+Example 6: The right way of concatenating the decompressed output of two or
more compressed files. *Note Trailing data::.
Don't do this
@@ -734,17 +737,25 @@ more compressed files. *Note Trailing data::.
plzip -cd file1.lz file2.lz file3.lz
-Example 8: Decompress 'file.lz' partially until 10 KiB of decompressed data
+Example 7: Decompress 'file.lz' partially until 10 KiB of decompressed data
are produced.
plzip -cd file.lz | dd bs=1024 count=10
-Example 9: Decompress 'file.lz' partially from decompressed byte at offset
+Example 8: Decompress 'file.lz' partially from decompressed byte at offset
10000 to decompressed byte at offset 14999 (5000 bytes are produced).
plzip -cd file.lz | dd bs=1000 skip=10 count=5
+
+Example 9: Compress a whole device in /dev/sdc and send the output to
+'file.lz'.
+
+ plzip -c /dev/sdc > file.lz
+ or
+ plzip /dev/sdc -o file.lz
+

File: plzip.info, Node: Problems, Next: Concept index, Prev: Examples, Up: Top
@@ -758,7 +769,7 @@ eternity, if not longer.
If you find a bug in plzip, please send electronic mail to
<lzip-bug@nongnu.org>. Include the version number, which you can find by
-running 'plzip --version'.
+running 'plzip --version' and 'plzip -v --check-lib'.

File: plzip.info, Node: Concept index, Prev: Problems, Up: Top
@@ -787,22 +798,22 @@ Concept index

Tag Table:
-Node: Top222
-Node: Introduction1159
-Node: Output5788
-Node: Invoking plzip7351
-Ref: --trailing-error8146
-Ref: --data-size8384
-Node: Program design18364
-Node: File format20542
-Ref: coded-dict-size21840
-Node: Memory requirements22995
-Node: Minimum file sizes24677
-Node: Trailing data26693
-Node: Examples28961
-Ref: concat-example30556
-Node: Problems31153
-Node: Concept index31681
+Node: Top217
+Node: Introduction1156
+Node: Output5829
+Node: Invoking plzip7392
+Ref: --trailing-error8187
+Ref: --data-size8425
+Node: Program design18819
+Node: Memory requirements21122
+Node: Minimum file sizes22807
+Node: File format24821
+Ref: coded-dict-size26260
+Node: Trailing data27514
+Node: Examples29775
+Ref: concat-example31210
+Node: Problems31967
+Node: Concept index32522

End Tag Table
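
The member layout described in the "File format" node above (a 6-byte header, the LZMA stream, and a 20-byte trailer, all multibyte values little endian) can be illustrated with a short decoder for the fixed-size fields. This is a sketch under stated assumptions, not code from plzip or lzlib: the function names are hypothetical, and the LZMA stream itself is not decoded or verified.

```python
import struct

HEADER_SIZE = 6    # ID string (4) + VN (1) + DS (1)
TRAILER_SIZE = 20  # CRC32 (4) + Data size (8) + Member size (8)


def decode_dict_size(ds):
    """Decode the coded dictionary size byte (DS) into a size in bytes."""
    base_log2 = ds & 0x1F        # bits 4-0: base 2 logarithm of the base size
    numerator = (ds >> 5) & 0x07  # bits 7-5: numerator of the fraction
    if not 12 <= base_log2 <= 29:
        raise ValueError("base size logarithm out of range (12 to 29)")
    base = 1 << base_log2
    size = base - numerator * (base // 16)
    if size < 4 * 1024:
        raise ValueError("dictionary size below 4 KiB")
    return size


def parse_header(member):
    """Return (version, dictionary_size) from the first 6 bytes of a member."""
    magic, version, ds = struct.unpack_from("<4sBB", member, 0)
    if magic != b"LZIP":
        raise ValueError("bad ID string")
    return version, decode_dict_size(ds)


def parse_trailer(member):
    """Return (crc32, data_size, member_size) from the last 20 bytes."""
    return struct.unpack_from("<IQQ", member, len(member) - TRAILER_SIZE)
```

For the manual's own example, `decode_dict_size(0xD3)` yields 327680 bytes (2^19 - 6 * 2^15 = 320 KiB). A reader of multimember files could use the member size from `parse_trailer` to step backwards from the end of the file to the start of each member, which is what the manual means by the field acting as a distributed index.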
diff --git a/doc/plzip.texi b/doc/plzip.texi
index 26c0820..818ecf5 100644
--- a/doc/plzip.texi
+++ b/doc/plzip.texi
@@ -6,10 +6,10 @@
@finalout
@c %**end of header
-@set UPDATED 3 January 2021
-@set VERSION 1.9
+@set UPDATED 24 January 2022
+@set VERSION 1.10
-@dircategory Data Compression
+@dircategory Compression
@direntry
* Plzip: (plzip). Massively parallel implementation of lzip
@end direntry
@@ -40,9 +40,9 @@ This manual is for Plzip (version @value{VERSION}, @value{UPDATED}).
* Output:: Meaning of plzip's output
* Invoking plzip:: Command line interface
* Program design:: Internal structure of plzip
-* File format:: Detailed format of the compressed file
* Memory requirements:: Memory required to compress and decompress
* Minimum file sizes:: Minimum file sizes required for full speed
+* File format:: Detailed format of the compressed file
* Trailing data:: Extra data appended to the file
* Examples:: A small tutorial with examples
* Problems:: Reporting bugs
@@ -50,7 +50,7 @@ This manual is for Plzip (version @value{VERSION}, @value{UPDATED}).
@end menu
@sp 1
-Copyright @copyright{} 2009-2021 Antonio Diaz Diaz.
+Copyright @copyright{} 2009-2022 Antonio Diaz Diaz.
This manual is free documentation: you have unlimited permission to copy,
distribute, and modify it.
@@ -69,13 +69,14 @@ compatible with lzip 1.4 or newer. Plzip uses the compression library
@uref{http://www.nongnu.org/lzip/lzip.html,,Lzip}
is a lossless data compressor with a user interface similar to the one
of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
-chain-Algorithm' (LZMA) stream format, chosen to maximize safety and
-interoperability. Lzip can compress about as fast as gzip @w{(lzip -0)} or
-compress most files more than bzip2 @w{(lzip -9)}. Decompression speed is
-intermediate between gzip and bzip2. Lzip is better than gzip and bzip2 from
-a data recovery perspective. Lzip has been designed, written, and tested
-with great care to replace gzip and bzip2 as the standard general-purpose
-compressed format for unix-like systems.
+chain-Algorithm' (LZMA) stream format and provides a 3 factor integrity
+checking to maximize interoperability and optimize safety. Lzip can compress
+about as fast as gzip @w{(lzip -0)} or compress most files more than bzip2
+@w{(lzip -9)}. Decompression speed is intermediate between gzip and bzip2.
+Lzip is better than gzip and bzip2 from a data recovery perspective. Lzip
+has been designed, written, and tested with great care to replace gzip and
+bzip2 as the standard general-purpose compressed format for unix-like
+systems.
Plzip can compress/decompress large files on multiprocessor machines much
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
@@ -85,8 +86,8 @@ hundreds of processors, but on files of only a few MB plzip is no faster
than lzip. @xref{Minimum file sizes}.
For creation and manipulation of compressed tar archives
-@uref{http://www.nongnu.org/lzip/manual/tarlz_manual.html,,tarlz} can be
-more efficient than using tar and plzip because tarlz is able to keep the
+@uref{http://www.nongnu.org/lzip/manual/tarlz_manual.html,,tarlz} can be more
+efficient than using tar and plzip because tarlz is able to keep the
alignment between tar members and lzip members.
@ifnothtml
@xref{Top,tarlz manual,,tarlz}.
@@ -112,8 +113,8 @@ The lzip format is as simple as possible (but not simpler). The lzip
manual provides the source code of a simple decompressor along with a
detailed explanation of how it works, so that with the only help of the
lzip manual it would be possible for a digital archaeologist to extract
-the data from a lzip file long after quantum computers eventually render
-LZMA obsolete.
+the data from a lzip file long after quantum computers eventually
+render LZMA obsolete.
@item
Additionally the lzip reference implementation is copylefted, which
@@ -145,9 +146,9 @@ file from that of the compressed file as follows:
@item anyothername @tab becomes @tab anyothername.out
@end multitable
-(De)compressing a file is much like copying or moving it; therefore plzip
+(De)compressing a file is much like copying or moving it. Therefore plzip
preserves the access and modification dates, permissions, and, when
-possible, ownership of the file just as @samp{cp -p} does. (If the user ID or
+possible, ownership of the file just as @w{@samp{cp -p}} does. (If the user ID or
the group ID can't be duplicated, the file permission bits S_ISUID and
S_ISGID are cleared).
@@ -258,7 +259,7 @@ garbage that can be safely ignored. @xref{concat-example}.
@anchor{--data-size}
@item -B @var{bytes}
@itemx --data-size=@var{bytes}
-When compressing, set the size of the input data blocks in bytes. The
+When compressing, set the size in bytes of the input data blocks. The
input file will be divided in chunks of this size before compression is
performed. Valid values range from @w{8 KiB} to @w{1 GiB}. Default value
is two times the dictionary size, except for option @samp{-0} where it
@@ -276,10 +277,12 @@ overrides @samp{-o}. @samp{-c} has no effect when testing or listing.
@item -d
@itemx --decompress
-Decompress the files specified. If a file does not exist or can't be
-opened, plzip continues decompressing the rest of the files. If a file
-fails to decompress, or is a terminal, plzip exits immediately without
-decompressing the rest of the files.
+Decompress the files specified. If a file does not exist, can't be opened,
+or the destination file already exists and @samp{--force} has not been
+specified, plzip continues decompressing the rest of the files and exits with
+error status 1. If a file fails to decompress, or is a terminal, plzip exits
+immediately with error status 2 without decompressing the rest of the files.
+A terminal is considered an uncompressed file, and therefore invalid.
@item -f
@itemx --force
@@ -304,10 +307,11 @@ size, the number of members in the file, and the amount of trailing data (if
any) are also printed. With @samp{-vv}, the positions and sizes of each
member in multimember files are also printed.
-@samp{-lq} can be used to verify quickly (without decompressing) the
-structural integrity of the files specified. (Use @samp{--test} to verify
-the data integrity). @samp{-alq} additionally verifies that none of the
-files specified contain trailing data.
+If any file is damaged, does not exist, can't be opened, or is not regular,
+the final exit status will be @w{> 0}. @samp{-lq} can be used to verify
+quickly (without decompressing) the structural integrity of the files
+specified. (Use @samp{--test} to verify the data integrity). @samp{-alq}
+additionally verifies that none of the files specified contain trailing data.
@item -m @var{bytes}
@itemx --match-length=@var{bytes}
@@ -448,8 +452,9 @@ used to compile plzip with the version actually being used at run time and
exit. Report any differences found. Exit with error status 1 if differences
are found. A mismatch may indicate that lzlib is not correctly installed or
that a different version of lzlib has been installed after compiling plzip.
-@w{@samp{plzip -v --check-lib}} shows the version of lzlib being used and
-the value of @samp{LZ_API_VERSION} (if defined).
+Exit with error status 2 if LZ_API_VERSION and LZ_version_string don't
+match. @w{@samp{plzip -v --check-lib}} shows the version of lzlib being used
+and the value of LZ_API_VERSION (if defined).
@ifnothtml
@xref{Library version,,,lzlib}.
@end ifnothtml
@@ -475,9 +480,9 @@ Table of SI and binary prefixes (unit multipliers):
@sp 1
Exit status: 0 for a normal exit, 1 for environmental problems (file not
-found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or
-invalid input file, 3 for an internal consistency error (eg, bug) which
-caused plzip to panic.
+found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or invalid
+input file, 3 for an internal consistency error (e.g., bug) which caused
+plzip to panic.
@node Program design
@@ -486,7 +491,8 @@ caused plzip to panic.
When compressing, plzip divides the input file into chunks and compresses as
many chunks simultaneously as worker threads are chosen, creating a
-multimember compressed file.
+multimember compressed file. Each chunk is compressed in-place (using the
+same buffer for input and output), reducing the amount of RAM required.
When decompressing, plzip decompresses as many members simultaneously as
worker threads are chosen. Files that were compressed with lzip will not
@@ -505,14 +511,14 @@ splitter. The muxer collects processed packets from the workers, and
writes them to the output file.
@verbatim
- ,------------,
+ .------------.
,-->| worker 0 |--,
| `------------' |
-,-------, ,----------, | ,------------, | ,-------, ,--------,
+.-------. .----------. | .------------. | .-------. .--------.
| input |-->| splitter |-+-->| worker 1 |--+-->| muxer |-->| output |
| file | `----------' | `------------' | `-------' | file |
`-------' | ... | `--------'
- | ,------------, |
+ | .------------. |
`-->| worker N-1 |--'
`------------'
@end verbatim
@@ -525,92 +531,6 @@ reduced and the decompression speed of large files with many members is
only limited by the number of processors available and by I/O speed.
-@node File format
-@chapter File format
-@cindex file format
-
-Perfection is reached, not when there is no longer anything to add, but
-when there is no longer anything to take away.@*
---- Antoine de Saint-Exupery
-
-@sp 1
-In the diagram below, a box like this:
-
-@verbatim
-+---+
-| | <-- the vertical bars might be missing
-+---+
-@end verbatim
-
-represents one byte; a box like this:
-
-@verbatim
-+==============+
-| |
-+==============+
-@end verbatim
-
-represents a variable number of bytes.
-
-@sp 1
-A lzip file consists of a series of "members" (compressed data sets).
-The members simply appear one after another in the file, with no
-additional information before, between, or after them.
-
-Each member has the following structure:
-
-@verbatim
-+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
-+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-@end verbatim
-
-All multibyte values are stored in little endian order.
-
-@table @samp
-@item ID string (the "magic" bytes)
-A four byte string, identifying the lzip format, with the value "LZIP"
-(0x4C, 0x5A, 0x49, 0x50).
-
-@item VN (version number, 1 byte)
-Just in case something needs to be modified in the future. 1 for now.
-
-@anchor{coded-dict-size}
-@item DS (coded dictionary size, 1 byte)
-The dictionary size is calculated by taking a power of 2 (the base size)
-and subtracting from it a fraction between 0/16 and 7/16 of the base size.@*
-Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).@*
-Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
-from the base size to obtain the dictionary size.@*
-Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB@*
-Valid values for dictionary size range from 4 KiB to 512 MiB.
-
-@item LZMA stream
-The LZMA stream, finished by an end of stream marker. Uses default values
-for encoder properties.
-@ifnothtml
-@xref{Stream format,,,lzip},
-@end ifnothtml
-@ifhtml
-See
-@uref{http://www.nongnu.org/lzip/manual/lzip_manual.html#Stream-format,,Stream format}
-@end ifhtml
-for a complete description.
-
-@item CRC32 (4 bytes)
-Cyclic Redundancy Check (CRC) of the uncompressed original data.
-
-@item Data size (8 bytes)
-Size of the uncompressed original data.
-
-@item Member size (8 bytes)
-Total size of the member, including header and trailer. This field acts
-as a distributed index, allows the verification of stream integrity, and
-facilitates safe recovery of undamaged members from multimember files.
-
-@end table
-
-
@node Memory requirements
@chapter Memory required to compress and decompress
@cindex memory requirements
@@ -709,6 +629,96 @@ data size for each level:
@end multitable
+@node File format
+@chapter File format
+@cindex file format
+
+Perfection is reached, not when there is no longer anything to add, but
+when there is no longer anything to take away.@*
+--- Antoine de Saint-Exupery
+
+@sp 1
+In the diagram below, a box like this:
+
+@verbatim
++---+
+| | <-- the vertical bars might be missing
++---+
+@end verbatim
+
+represents one byte; a box like this:
+
+@verbatim
++==============+
+| |
++==============+
+@end verbatim
+
+represents a variable number of bytes.
+
+@sp 1
+A lzip file consists of a series of independent "members" (compressed data
+sets). The members simply appear one after another in the file, with no
+additional information before, between, or after them. Each member can
+encode in compressed form up to @w{16 EiB - 1 byte} of uncompressed data.
+The size of a multimember file is unlimited.
+
+Each member has the following structure:
+
+@verbatim
++--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
++--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+@end verbatim
+
+All multibyte values are stored in little endian order.
+
+@table @samp
+@item ID string (the "magic" bytes)
+A four byte string, identifying the lzip format, with the value "LZIP"
+(0x4C, 0x5A, 0x49, 0x50).
+
+@item VN (version number, 1 byte)
+Just in case something needs to be modified in the future. 1 for now.
+
+@anchor{coded-dict-size}
+@item DS (coded dictionary size, 1 byte)
+The dictionary size is calculated by taking a power of 2 (the base size)
+and subtracting from it a fraction between 0/16 and 7/16 of the base size.@*
+Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).@*
+Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
+from the base size to obtain the dictionary size.@*
+Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB@*
+Valid values for dictionary size range from 4 KiB to 512 MiB.
+
+@item LZMA stream
+The LZMA stream, finished by an "End Of Stream" marker. Uses default values
+for encoder properties.
+@ifnothtml
+@xref{Stream format,,,lzip},
+@end ifnothtml
+@ifhtml
+See
+@uref{http://www.nongnu.org/lzip/manual/lzip_manual.html#Stream-format,,Stream format}
+@end ifhtml
+for a complete description.
+
+@item CRC32 (4 bytes)
+Cyclic Redundancy Check (CRC) of the original uncompressed data.
+
+@item Data size (8 bytes)
+Size of the original uncompressed data.
+
+@item Member size (8 bytes)
+Total size of the member, including header and trailer. This field acts
+as a distributed index, allows the verification of stream integrity, and
+facilitates the safe recovery of undamaged members from multimember files.
+Member size should be limited to @w{2 PiB} to prevent the data size field
+from overflowing.
+
+@end table
+
+
@node Trailing data
@chapter Extra data appended to the file
@cindex trailing data
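Note on the "File format" chapter added in the hunk above: the member layout and the coded dictionary size rule can be sketched in a few lines of Python. This is a minimal illustration, not code from plzip or lzlib. The names `decode_dict_size` and `parse_member_fields` are hypothetical, and the input is assumed to be the bytes of exactly one complete member. The sketch only reads the fixed-size header and trailer fields. It does not decode the LZMA stream or verify the CRC.

```python
import struct

HEADER_SIZE = 6    # ID string (4 bytes) + VN (1 byte) + DS (1 byte)
TRAILER_SIZE = 20  # CRC32 (4 bytes) + Data size (8) + Member size (8)

MIN_DICT_SIZE = 4 * 1024           # 4 KiB, lower bound stated in the text
MAX_DICT_SIZE = 512 * 1024 * 1024  # 512 MiB, upper bound stated in the text


def decode_dict_size(ds):
    """Decode the coded dictionary size byte (DS) as described in the text."""
    log2_base = ds & 0x1F         # bits 4-0: base 2 logarithm of the base size
    numerator = (ds >> 5) & 0x07  # bits 7-5: sixteenths of the base to subtract
    if not 12 <= log2_base <= 29:
        raise ValueError("base size exponent out of range (12 to 29)")
    base = 1 << log2_base
    size = base - numerator * (base // 16)
    # Assumption of this sketch: sizes outside the documented range are rejected.
    if not MIN_DICT_SIZE <= size <= MAX_DICT_SIZE:
        raise ValueError("dictionary size out of range")
    return size


def parse_member_fields(member):
    """Return the header and trailer fields of one complete member."""
    if len(member) < HEADER_SIZE + TRAILER_SIZE:
        raise ValueError("member too short")
    # "<" selects little endian order with no padding between fields.
    magic, version, ds = struct.unpack_from("<4sBB", member, 0)
    if magic != b"LZIP":
        raise ValueError("bad ID string")
    crc32, data_size, member_size = struct.unpack_from(
        "<IQQ", member, len(member) - TRAILER_SIZE)
    return {
        "version": version,
        "dictionary_size": decode_dict_size(ds),
        "crc32": crc32,
        "data_size": data_size,
        "member_size": member_size,
    }
```

For the example byte given in the text, `decode_dict_size(0xD3)` yields 327680 bytes (320 KiB). Because the trailer sits at a fixed offset from the end of each member, a reader can take the last 20 bytes of a file, read the member size, and step backwards to the start of that member. This is one way to read the "distributed index" remark in the text.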
@@ -795,7 +805,7 @@ plzip -v file
@sp 1
@noindent
-Example 3: Like example 1 but the created @samp{file.lz} has a block size of
+Example 3: Like example 2 but the created @samp{file.lz} has a block size of
@w{1 MiB}. The compression ratio is not shown.
@example
@@ -821,20 +831,9 @@ plzip -tv file.lz
@end example
@sp 1
-@noindent
-Example 6: Compress a whole device in /dev/sdc and send the output to
-@samp{file.lz}.
-
-@example
- plzip -c /dev/sdc > file.lz
-or
- plzip /dev/sdc -o file.lz
-@end example
-
-@sp 1
@anchor{concat-example}
@noindent
-Example 7: The right way of concatenating the decompressed output of two or
+Example 6: The right way of concatenating the decompressed output of two or
more compressed files. @xref{Trailing data}.
@example
@@ -846,7 +845,7 @@ Do this instead
@sp 1
@noindent
-Example 8: Decompress @samp{file.lz} partially until @w{10 KiB} of
+Example 7: Decompress @samp{file.lz} partially until @w{10 KiB} of
decompressed data are produced.
@example
@@ -855,13 +854,24 @@ plzip -cd file.lz | dd bs=1024 count=10
@sp 1
@noindent
-Example 9: Decompress @samp{file.lz} partially from decompressed byte at
+Example 8: Decompress @samp{file.lz} partially from decompressed byte at
offset 10000 to decompressed byte at offset 14999 (5000 bytes are produced).
@example
plzip -cd file.lz | dd bs=1000 skip=10 count=5
@end example
+@sp 1
+@noindent
+Example 9: Compress a whole device in /dev/sdc and send the output to
+@samp{file.lz}.
+
+@example
+ plzip -c /dev/sdc > file.lz
+or
+ plzip /dev/sdc -o file.lz
+@end example
+
@node Problems
@chapter Reporting bugs
@@ -875,7 +885,8 @@ for all eternity, if not longer.
If you find a bug in plzip, please send electronic mail to
@email{lzip-bug@@nongnu.org}. Include the version number, which you can
-find by running @w{@samp{plzip --version}}.
+find by running @w{@samp{plzip --version}} and
+@w{@samp{plzip -v --check-lib}}.
@node Concept index