Diffstat (limited to 'doc/plzip.info')
-rw-r--r-- doc/plzip.info 206
1 files changed, 161 insertions, 45 deletions
diff --git a/doc/plzip.info b/doc/plzip.info
index 474db91..a814b3f 100644
--- a/doc/plzip.info
+++ b/doc/plzip.info
@@ -11,7 +11,7 @@ File: plzip.info, Node: Top, Next: Introduction, Up: (dir)
Plzip Manual
************
-This manual is for Plzip (version 1.4, 9 July 2015).
+This manual is for Plzip (version 1.5, 14 May 2016).
* Menu:
@@ -21,11 +21,13 @@ This manual is for Plzip (version 1.4, 9 July 2015).
* File format:: Detailed format of the compressed file
* Memory requirements:: Memory required to compress and decompress
* Minimum file sizes:: Minimum file sizes required for full speed
+* Trailing data:: Extra data appended to the file
+* Examples:: A small tutorial with examples
* Problems:: Reporting bugs
* Concept index:: Index of concepts
- Copyright (C) 2009-2015 Antonio Diaz Diaz.
+ Copyright (C) 2009-2016 Antonio Diaz Diaz.
This manual is free documentation: you have unlimited permission to
copy, distribute and modify it.
@@ -59,7 +61,7 @@ availability:
recovery means. The lziprecover program can repair bit-flip errors
(one of the most common forms of data corruption) in lzip files,
and provides data recovery capabilities, including error-checked
- merging of damaged copies of a file. *note Data safety:
+ merging of damaged copies of a file. *Note Data safety:
(lziprecover)Data safety.
* The lzip format is as simple as possible (but not simpler). The
@@ -115,13 +117,6 @@ two or more compressed files. The result is the concatenation of the
corresponding uncompressed files. Integrity testing of concatenated
compressed files is also supported.
- WARNING! Even if plzip is bug-free, other causes may result in a
-corrupt compressed file (bugs in the system libraries, memory errors,
-etc). Therefore, if the data you are going to compress are important,
-give the '--keep' option to plzip and do not remove the original file
-until you verify the compressed file with a command like
-'plzip -cd file.lz | cmp file -'.
-

File: plzip.info, Node: Invoking plzip, Next: Program design, Prev: Introduction, Up: Top
@@ -132,6 +127,10 @@ The format for running plzip is:
plzip [OPTIONS] [FILES]
+'-' used as a FILE argument means standard input. It can be mixed with
+other FILES and is read just once, the first time it appears in the
+command line.
+
Plzip supports the following options:
'-h'
@@ -142,6 +141,13 @@ The format for running plzip is:
'--version'
Print the version number of plzip on the standard output and exit.
+'-a'
+'--trailing-error'
+ Exit with error status 2 if any remaining input is detected after
+ decompressing the last member. Such remaining input is usually
+ trailing garbage that can be safely ignored. *Note
+ concat-example::.
+
'-B BYTES'
'--data-size=BYTES'
Set the size of the input data blocks, in bytes. The input file
@@ -153,12 +159,17 @@ The format for running plzip is:
'-c'
'--stdout'
- Compress or decompress to standard output. Needed when reading
- from a named pipe (fifo) or from a device.
+ Compress or decompress to standard output; keep input files
+ unchanged. If compressing several files, each file is compressed
+ independently. This option is needed when reading from a named
+ pipe (fifo) or from a device.
'-d'
'--decompress'
- Decompress.
+ Decompress the specified file(s). If a file does not exist or
+ can't be opened, plzip continues decompressing the rest of the
+ files. If a file fails to decompress, plzip exits immediately
+ without decompressing the rest of the files.
'-f'
'--force'
@@ -207,12 +218,13 @@ The format for running plzip is:
'-s BYTES'
'--dictionary-size=BYTES'
- Set the dictionary size limit in bytes. Valid values range from 4
- KiB to 512 MiB. Plzip will use the smallest possible dictionary
- size for each file without exceeding this limit. Note that
- dictionary sizes are quantized. If the specified size does not
- match one of the valid sizes, it will be rounded upwards by adding
- up to (BYTES / 16) to it.
+ Set the dictionary size limit in bytes. Plzip will use the smallest
+ possible dictionary size for each file without exceeding this
+ limit. Valid values range from 4 KiB to 512 MiB. Values 12 to 29
+ are interpreted as powers of two, meaning 2^12 to 2^29 bytes. Note
+ that dictionary sizes are quantized. If the specified size does
+ not match one of the valid sizes, it will be rounded upwards by
+ adding up to (BYTES / 8) to it.
For maximum compression you should use a dictionary size limit as
large as possible, but keep in mind that the decompression memory
@@ -224,7 +236,8 @@ The format for running plzip is:
Check integrity of the specified file(s), but don't decompress
them. This really performs a trial decompression and throws away
the result. Use it together with '-v' to see information about
- the file.
+ the file(s). If a file fails the test, plzip may be unable to
+ check the rest of the files.
'-v'
'--verbose'
@@ -237,14 +250,14 @@ The format for running plzip is:
'-0 .. -9'
Set the compression parameters (dictionary size and match length
- limit) as shown in the table below. Note that '-9' can be much
- slower than '-0'. These options have no effect when decompressing.
+ limit) as shown in the table below. The default compression level
+ is '-6'. Note that '-9' can be much slower than '-0'. These
+ options have no effect when decompressing.
The bidimensional parameter space of LZMA can't be mapped to a
linear scale optimal for all files. If your files are large, very
- repetitive, etc, you may need to use the '--match-length' and
- '--dictionary-size' options directly to achieve optimal
- performance.
+ repetitive, etc, you may need to use the '--dictionary-size' and
+ '--match-length' options directly to achieve optimal performance.
Level Dictionary size Match length limit
-0 64 KiB 16 bytes
@@ -292,7 +305,7 @@ File: plzip.info, Node: Program design, Next: File format, Prev: Invoking plz
When compressing, plzip divides the input file into chunks and
compresses as many chunks simultaneously as worker threads are chosen,
-creating a multi-member compressed file.
+creating a multimember compressed file.
When decompressing, plzip decompresses as many members
simultaneously as worker threads are chosen. Files that were compressed
@@ -348,12 +361,12 @@ additional information before, between, or after them.
Each member has the following structure:
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-| ID string | VN | DS | Lzma stream | CRC32 | Data size | Member size |
+| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
All multibyte values are stored in little endian order.
-'ID string'
+'ID string (the "magic" bytes)'
A four byte string, identifying the lzip format, with the value
"LZIP" (0x4C, 0x5A, 0x49, 0x50).
@@ -371,8 +384,8 @@ additional information before, between, or after them.
Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
Valid values for dictionary size range from 4 KiB to 512 MiB.
-'Lzma stream'
- The lzma stream, finished by an end of stream marker. Uses default
+'LZMA stream'
+ The LZMA stream, finished by an end of stream marker. Uses default
values for encoder properties. *Note Stream format: (lzip)Stream
format, for a complete description.
@@ -386,7 +399,7 @@ additional information before, between, or after them.
Total size of the member, including header and trailer. This field
acts as a distributed index, allows the verification of stream
integrity, and facilitates safe recovery of undamaged members from
- multi-member files.
+ multimember files.
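As an illustration of the member layout described in this hunk, the fixed-size fields at both ends of a member can be read with a few lines of Python. This is a minimal sketch, not code from plzip: the function names are hypothetical, the field widths (4+1+1 bytes of header, 4+8+8 bytes of trailer) are taken from the diagram, and the DS decoding assumes the bit layout implied by the 0xD3 example (bits 4-0 hold the base-2 logarithm of the base size, bits 7-5 the number of sixteenths to subtract). The LZMA stream itself is not decoded.

```python
import struct

HEADER_SIZE = 6    # ID string (4) + VN (1) + DS (1)
TRAILER_SIZE = 20  # CRC32 (4) + Data size (8) + Member size (8)


def decode_dictionary_size(ds):
    """Decode the DS byte into a dictionary size in bytes.

    Assumed layout: bits 4-0 are log2 of the base size, bits 7-5 are
    the number of 1/16 fractions of the base size to subtract.
    """
    base = 1 << (ds & 0x1F)
    return base - (ds >> 5) * (base // 16)


def parse_member_ends(member):
    """Return the header and trailer fields of one complete member.

    'member' must hold exactly one member; all multibyte values are
    little endian, as stated in the manual.
    """
    if len(member) < HEADER_SIZE + TRAILER_SIZE:
        raise ValueError("too short to be a member")
    magic, version, ds = struct.unpack_from("<4sBB", member, 0)
    if magic != b"LZIP":
        raise ValueError("bad ID string")
    crc32, data_size, member_size = struct.unpack_from(
        "<IQQ", member, len(member) - TRAILER_SIZE)
    return {
        "version": version,
        "dictionary_size": decode_dictionary_size(ds),
        "crc32": crc32,
        "data_size": data_size,
        "member_size": member_size,
    }


# The example from the manual: 0xD3 = 2^19 - 6 * 2^15 = 320 KiB.
print(decode_dictionary_size(0xD3) // 1024)  # 320
```

Reading the trailer from the end of the buffer rather than tracking the stream length mirrors how the 'Member size' field lets a reader locate a member's start without decoding it.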

@@ -408,7 +421,7 @@ following:
file, or for testing of a regular file; the dictionary size.
(Note that regular files with more than 1024 bytes of trailing
- garbage are treated as non-seekable).
+ data are treated as non-seekable).
* For testing of a non-seekable file or of standard input; the
dictionary size plus up to 5 MiB.
@@ -420,14 +433,14 @@ following:
dictionary size plus up to 35 MiB.

-File: plzip.info, Node: Minimum file sizes, Next: Problems, Prev: Memory requirements, Up: Top
+File: plzip.info, Node: Minimum file sizes, Next: Trailing data, Prev: Memory requirements, Up: Top
6 Minimum file sizes required for full compression speed
********************************************************
When compressing, plzip divides the input file into chunks and
compresses as many chunks simultaneously as worker threads are chosen,
-creating a multi-member compressed file.
+creating a multimember compressed file.
For this to work as expected (and roughly multiply the compression
speed by the number of available processors), the uncompressed file
@@ -456,9 +469,106 @@ Level
-9 128 MiB 256 MiB 512 MiB 1 GiB 4 GiB 16 GiB

-File: plzip.info, Node: Problems, Next: Concept index, Prev: Minimum file sizes, Up: Top
+File: plzip.info, Node: Trailing data, Next: Examples, Prev: Minimum file sizes, Up: Top
+
+7 Extra data appended to the file
+*********************************
+
+Sometimes extra data is found appended to a lzip file after the last
+member. Such trailing data may be:
+
+ * Padding added to make the file size a multiple of some block size,
+ for example when writing to a tape.
+
+ * Garbage added by some not totally successful copy operation.
+
+ * Useful data added by the user; a cryptographically secure hash, a
+ description of file contents, etc.
+
+ * Malicious data added to the file in order to make its total size
+ and hash value (for a chosen hash) coincide with those of another
+ file.
+
+ * In very rare cases, trailing data could be the corrupt header of
+ another member. In multimember or concatenated files the
+ probability of corruption happening in the magic bytes is 5 times
+ smaller than the probability of getting a false positive caused by
+ the corruption of the integrity information itself. Therefore it
+ can be considered to be below the noise level.
+
+ Trailing data can be safely ignored in most cases. In some cases,
+like that of user-added data, it is expected to be ignored. In those
+cases where a file containing trailing data must be rejected, the option
+'--trailing-error' can be used. *Note --trailing-error::.
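One way to see how trailing data can be told apart from members is to search backwards for the end of the last member, using the 'Member size' field as an index. The following Python sketch is a heuristic written for illustration only, not the method plzip uses: the function name and the minimum-size bound are assumptions, no CRC or stream check is performed, and a false match inside arbitrary trailing bytes is possible in principle.

```python
import struct

HEADER_SIZE = 6    # ID string + VN + DS
TRAILER_SIZE = 20  # CRC32 + Data size + Member size
# Assumed lower bound; a real member is larger because its stream is not empty.
MIN_MEMBER_SIZE = HEADER_SIZE + TRAILER_SIZE


def trailing_data_size(buf):
    """Return the number of bytes after the last plausible member.

    Tries every candidate end position, starting at the end of the
    buffer. A position is accepted if the 8 bytes before it, read as a
    little endian member size, point back to an "LZIP" ID string.
    Returns None if no plausible member is found.
    """
    for end in range(len(buf), MIN_MEMBER_SIZE - 1, -1):
        (member_size,) = struct.unpack_from("<Q", buf, end - 8)
        if MIN_MEMBER_SIZE <= member_size <= end:
            start = end - member_size
            if buf[start:start + 4] == b"LZIP":
                return len(buf) - end
    return None
```

A result of 0 means the file ends exactly at a member boundary. Any positive result corresponds to the situation in which '--trailing-error' would make plzip exit with status 2.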
+
+
+File: plzip.info, Node: Examples, Next: Problems, Prev: Trailing data, Up: Top
+
+8 A small tutorial with examples
+********************************
+
+WARNING! Even if plzip is bug-free, other causes may result in a corrupt
+compressed file (bugs in the system libraries, memory errors, etc).
+Therefore, if the data you are going to compress are important, give the
+'--keep' option to plzip and don't remove the original file until you
+verify the compressed file with a command like
+'plzip -cd file.lz | cmp file -'.
+
+
+Example 1: Replace a regular file with its compressed version 'file.lz'
+and show the compression ratio.
+
+ plzip -v file
+
+
+Example 2: Like example 1 but the created 'file.lz' has a block size of
+1 MiB. The compression ratio is not shown.
+
+ plzip -B 1MiB file
+
+
+Example 3: Restore a regular file from its compressed version
+'file.lz'. If the operation is successful, 'file.lz' is removed.
+
+ plzip -d file.lz
+
+
+Example 4: Verify the integrity of the compressed file 'file.lz' and
+show status.
+
+ plzip -tv file.lz
+
+
+Example 5: Compress a whole device in /dev/sdc and send the output to
+'file.lz'.
+
+ plzip -c /dev/sdc > file.lz
+
+
+Example 6: The right way of concatenating compressed files. *Note
+Trailing data::.
+
+ Don't do this
+ cat file1.lz file2.lz file3.lz | plzip -d
+ Do this instead
+ plzip -cd file1.lz file2.lz file3.lz
+
+
+Example 7: Decompress 'file.lz' partially until 10 KiB of decompressed
+data are produced.
+
+ plzip -cd file.lz | dd bs=1024 count=10
+
+
+Example 8: Decompress 'file.lz' partially from decompressed byte 10000
+to decompressed byte 15000 (5000 bytes are produced).
+
+ plzip -cd file.lz | dd bs=1000 skip=10 count=5
+
+
+File: plzip.info, Node: Problems, Next: Concept index, Prev: Examples, Up: Top
-7 Reporting bugs
+9 Reporting bugs
****************
There are probably bugs in plzip. There are certainly errors and
@@ -480,6 +590,7 @@ Concept index
* Menu:
* bugs: Problems. (line 6)
+* examples: Examples. (line 6)
* file format: File format. (line 6)
* getting help: Problems. (line 6)
* introduction: Introduction. (line 6)
@@ -488,6 +599,7 @@ Concept index
* minimum file sizes: Minimum file sizes. (line 6)
* options: Invoking plzip. (line 6)
* program design: Program design. (line 6)
+* trailing data: Trailing data. (line 6)
* usage: Invoking plzip. (line 6)
* version: Invoking plzip. (line 6)
@@ -495,15 +607,19 @@ Concept index

Tag Table:
Node: Top221
-Node: Introduction984
-Node: Invoking plzip5332
-Ref: --data-size5747
-Node: Program design10972
-Node: File format12560
-Node: Memory requirements14973
-Node: Minimum file sizes16085
-Node: Problems18007
-Node: Concept index18543
+Node: Introduction1101
+Node: Invoking plzip5078
+Ref: --trailing-error5647
+Ref: --data-size5890
+Node: Program design11683
+Node: File format13270
+Node: Memory requirements15702
+Node: Minimum file sizes16811
+Node: Trailing data18737
+Node: Examples20121
+Ref: concat-example21286
+Node: Problems21823
+Node: Concept index22349

End Tag Table