Diffstat (limited to 'doc/plzip.info')
-rw-r--r-- | doc/plzip.info | 206 |
1 files changed, 161 insertions, 45 deletions
diff --git a/doc/plzip.info b/doc/plzip.info
index 474db91..a814b3f 100644
--- a/doc/plzip.info
+++ b/doc/plzip.info
@@ -11,7 +11,7 @@ File: plzip.info, Node: Top, Next: Introduction, Up: (dir)
 Plzip Manual
 ************

-This manual is for Plzip (version 1.4, 9 July 2015).
+This manual is for Plzip (version 1.5, 14 May 2016).

 * Menu:

@@ -21,11 +21,13 @@ This manual is for Plzip (version 1.4, 9 July 2015).
 * File format::           Detailed format of the compressed file
 * Memory requirements::   Memory required to compress and decompress
 * Minimum file sizes::    Minimum file sizes required for full speed
+* Trailing data::         Extra data appended to the file
+* Examples::              A small tutorial with examples
 * Problems::              Reporting bugs
 * Concept index::         Index of concepts


-   Copyright (C) 2009-2015 Antonio Diaz Diaz.
+   Copyright (C) 2009-2016 Antonio Diaz Diaz.

    This manual is free documentation: you have unlimited permission to
 copy, distribute and modify it.
@@ -59,7 +61,7 @@ availability:
      recovery means. The lziprecover program can repair bit-flip errors
      (one of the most common forms of data corruption) in lzip files,
      and provides data recovery capabilities, including error-checked
-     merging of damaged copies of a file. *note Data safety:
+     merging of damaged copies of a file. *Note Data safety:
      (lziprecover)Data safety.

    * The lzip format is as simple as possible (but not simpler). The
@@ -115,13 +117,6 @@ two or more compressed files. The result is the concatenation of the
 corresponding uncompressed files. Integrity testing of concatenated
 compressed files is also supported.

-   WARNING! Even if plzip is bug-free, other causes may result in a
-corrupt compressed file (bugs in the system libraries, memory errors,
-etc). Therefore, if the data you are going to compress are important,
-give the '--keep' option to plzip and do not remove the original file
-until you verify the compressed file with a command like
-'plzip -cd file.lz | cmp file -'.
-


 File: plzip.info, Node: Invoking plzip, Next: Program design, Prev: Introduction, Up: Top
@@ -132,6 +127,10 @@ The format for running plzip is:

      plzip [OPTIONS] [FILES]

+'-' used as a FILE argument means standard input. It can be mixed with
+other FILES and is read just once, the first time it appears in the
+command line.
+
    Plzip supports the following options:

 '-h'
@@ -142,6 +141,13 @@ The format for running plzip is:
 '--version'
      Print the version number of plzip on the standard output and exit.

+'-a'
+'--trailing-error'
+     Exit with error status 2 if any remaining input is detected after
+     decompressing the last member. Such remaining input is usually
+     trailing garbage that can be safely ignored. *Note
+     concat-example::.
+
 '-B BYTES'
 '--data-size=BYTES'
      Set the size of the input data blocks, in bytes. The input file
@@ -153,12 +159,17 @@ The format for running plzip is:

 '-c'
 '--stdout'
-     Compress or decompress to standard output. Needed when reading
-     from a named pipe (fifo) or from a device.
+     Compress or decompress to standard output; keep input files
+     unchanged. If compressing several files, each file is compressed
+     independently. This option is needed when reading from a named
+     pipe (fifo) or from a device.

 '-d'
 '--decompress'
-     Decompress.
+     Decompress the specified file(s). If a file does not exist or
+     can't be opened, plzip continues decompressing the rest of the
+     files. If a file fails to decompress, plzip exits immediately
+     without decompressing the rest of the files.

 '-f'
 '--force'
@@ -207,12 +218,13 @@ The format for running plzip is:

 '-s BYTES'
 '--dictionary-size=BYTES'
-     Set the dictionary size limit in bytes. Valid values range from 4
-     KiB to 512 MiB. Plzip will use the smallest possible dictionary
-     size for each file without exceeding this limit. Note that
-     dictionary sizes are quantized. If the specified size does not
-     match one of the valid sizes, it will be rounded upwards by adding
-     up to (BYTES / 16) to it.
+     Set the dictionary size limit in bytes. Plzip will use the smallest
+     possible dictionary size for each file without exceeding this
+     limit. Valid values range from 4 KiB to 512 MiB. Values 12 to 29
+     are interpreted as powers of two, meaning 2^12 to 2^29 bytes. Note
+     that dictionary sizes are quantized. If the specified size does
+     not match one of the valid sizes, it will be rounded upwards by
+     adding up to (BYTES / 8) to it.

      For maximum compression you should use a dictionary size limit as
      large as possible, but keep in mind that the decompression memory
@@ -224,7 +236,8 @@ The format for running plzip is:
      Check integrity of the specified file(s), but don't decompress
      them. This really performs a trial decompression and throws away
      the result. Use it together with '-v' to see information about
-     the file.
+     the file(s). If a file fails the test, plzip may be unable to
+     check the rest of the files.

 '-v'
 '--verbose'
@@ -237,14 +250,14 @@ The format for running plzip is:

 '-0 .. -9'
      Set the compression parameters (dictionary size and match length
-     limit) as shown in the table below. Note that '-9' can be much
-     slower than '-0'. These options have no effect when decompressing.
+     limit) as shown in the table below. The default compression level
+     is '-6'. Note that '-9' can be much slower than '-0'. These
+     options have no effect when decompressing.

      The bidimensional parameter space of LZMA can't be mapped to a
      linear scale optimal for all files. If your files are large, very
-     repetitive, etc, you may need to use the '--match-length' and
-     '--dictionary-size' options directly to achieve optimal
-     performance.
+     repetitive, etc, you may need to use the '--dictionary-size' and
+     '--match-length' options directly to achieve optimal performance.
      Level   Dictionary size   Match length limit
      -0      64 KiB            16 bytes
@@ -292,7 +305,7 @@ File: plzip.info, Node: Program design, Next: File format, Prev: Invoking plz

 When compressing, plzip divides the input file into chunks and
 compresses as many chunks simultaneously as worker threads are chosen,
-creating a multi-member compressed file.
+creating a multimember compressed file.

    When decompressing, plzip decompresses as many members
 simultaneously as worker threads are chosen. Files that were compressed
@@ -348,12 +361,12 @@ additional information before, between, or after them.
    Each member has the following structure:

 +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-| ID string | VN | DS | Lzma stream | CRC32 |   Data size   |  Member size  |
+| ID string | VN | DS | LZMA stream | CRC32 |   Data size   |  Member size  |
 +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    All multibyte values are stored in little endian order.

-'ID string'
+'ID string (the "magic" bytes)'
      A four byte string, identifying the lzip format, with the value
      "LZIP" (0x4C, 0x5A, 0x49, 0x50).

@@ -371,8 +384,8 @@ additional information before, between, or after them.
      Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
      Valid values for dictionary size range from 4 KiB to 512 MiB.

-'Lzma stream'
-     The lzma stream, finished by an end of stream marker. Uses default
+'LZMA stream'
+     The LZMA stream, finished by an end of stream marker. Uses default
      values for encoder properties. *Note Stream format: (lzip)Stream
      format, for a complete description.

@@ -386,7 +399,7 @@ additional information before, between, or after them.
      Total size of the member, including header and trailer. This field
      acts as a distributed index, allows the verification of stream
      integrity, and facilitates safe recovery of undamaged members from
-     multi-member files.
+     multimember files.



@@ -408,7 +421,7 @@ following:
      file, or for testing of a regular file; the dictionary size.
      (Note that regular files with more than 1024 bytes of trailing
-     garbage are treated as non-seekable).
+     data are treated as non-seekable).

    * For testing of a non-seekable file or of standard input; the
      dictionary size plus up to 5 MiB.
@@ -420,14 +433,14 @@ following:
      dictionary size plus up to 35 MiB.


-File: plzip.info, Node: Minimum file sizes, Next: Problems, Prev: Memory requirements, Up: Top
+File: plzip.info, Node: Minimum file sizes, Next: Trailing data, Prev: Memory requirements, Up: Top

 6 Minimum file sizes required for full compression speed
 ********************************************************

 When compressing, plzip divides the input file into chunks and
 compresses as many chunks simultaneously as worker threads are chosen,
-creating a multi-member compressed file.
+creating a multimember compressed file.

    For this to work as expected (and roughly multiply the compression
 speed by the number of available processors), the uncompressed file
@@ -456,9 +469,106 @@ Level
 -9   128 MiB   256 MiB   512 MiB   1 GiB   4 GiB   16 GiB


-File: plzip.info, Node: Problems, Next: Concept index, Prev: Minimum file sizes, Up: Top
+File: plzip.info, Node: Trailing data, Next: Examples, Prev: Minimum file sizes, Up: Top
+
+7 Extra data appended to the file
+*********************************
+
+Sometimes extra data is found appended to a lzip file after the last
+member. Such trailing data may be:
+
+   * Padding added to make the file size a multiple of some block size,
+     for example when writing to a tape.
+
+   * Garbage added by some not totally successful copy operation.
+
+   * Useful data added by the user; a cryptographically secure hash, a
+     description of file contents, etc.
+
+   * Malicious data added to the file in order to make its total size
+     and hash value (for a chosen hash) coincide with those of another
+     file.
+
+   * In very rare cases, trailing data could be the corrupt header of
+     another member.
In multimember or concatenated files the
+     probability of corruption happening in the magic bytes is 5 times
+     smaller than the probability of getting a false positive caused by
+     the corruption of the integrity information itself. Therefore it
+     can be considered to be below the noise level.
+
+   Trailing data can be safely ignored in most cases. In some cases,
+like that of user-added data, it is expected to be ignored. In those
+cases where a file containing trailing data must be rejected, the option
+'--trailing-error' can be used. *Note --trailing-error::.
+
+
+File: plzip.info, Node: Examples, Next: Problems, Prev: Trailing data, Up: Top
+
+8 A small tutorial with examples
+********************************
+
+WARNING! Even if plzip is bug-free, other causes may result in a corrupt
+compressed file (bugs in the system libraries, memory errors, etc).
+Therefore, if the data you are going to compress are important, give the
+'--keep' option to plzip and don't remove the original file until you
+verify the compressed file with a command like
+'plzip -cd file.lz | cmp file -'.
+
+
+Example 1: Replace a regular file with its compressed version 'file.lz'
+and show the compression ratio.
+
+     plzip -v file
+
+
+Example 2: Like example 1 but the created 'file.lz' has a block size of
+1 MiB. The compression ratio is not shown.
+
+     plzip -B 1MiB file
+
+
+Example 3: Restore a regular file from its compressed version
+'file.lz'. If the operation is successful, 'file.lz' is removed.
+
+     plzip -d file.lz
+
+
+Example 4: Verify the integrity of the compressed file 'file.lz' and
+show status.
+
+     plzip -tv file.lz
+
+
+Example 5: Compress a whole device in /dev/sdc and send the output to
+'file.lz'.
+
+     plzip -c /dev/sdc > file.lz
+
+
+Example 6: The right way of concatenating compressed files. *Note
+Trailing data::.
+
+     Don't do this
+       cat file1.lz file2.lz file3.lz | plzip -d
+     Do this instead
+       plzip -cd file1.lz file2.lz file3.lz
+
+
+Example 7: Decompress 'file.lz' partially until 10 KiB of decompressed
+data are produced.
+
+     plzip -cd file.lz | dd bs=1024 count=10
+
+
+Example 8: Decompress 'file.lz' partially from decompressed byte 10000
+to decompressed byte 15000 (5000 bytes are produced).
+
+     plzip -cd file.lz | dd bs=1000 skip=10 count=5
+
+
+File: plzip.info, Node: Problems, Next: Concept index, Prev: Examples, Up: Top

-7 Reporting bugs
+9 Reporting bugs
 ****************

 There are probably bugs in plzip. There are certainly errors and
@@ -480,6 +590,7 @@ Concept index
 * Menu:

 * bugs:                 Problems.             (line 6)
+* examples:             Examples.             (line 6)
 * file format:          File format.          (line 6)
 * getting help:         Problems.             (line 6)
 * introduction:         Introduction.         (line 6)
@@ -488,6 +599,7 @@ Concept index
 * minimum file sizes:   Minimum file sizes.   (line 6)
 * options:              Invoking plzip.       (line 6)
 * program design:       Program design.       (line 6)
+* trailing data:        Trailing data.        (line 6)
 * usage:                Invoking plzip.       (line 6)
 * version:              Invoking plzip.       (line 6)

@@ -495,15 +607,19 @@ Concept index

 Tag Table:
 Node: Top221
-Node: Introduction984
-Node: Invoking plzip5332
-Ref: --data-size5747
-Node: Program design10972
-Node: File format12560
-Node: Memory requirements14973
-Node: Minimum file sizes16085
-Node: Problems18007
-Node: Concept index18543
+Node: Introduction1101
+Node: Invoking plzip5078
+Ref: --trailing-error5647
+Ref: --data-size5890
+Node: Program design11683
+Node: File format13270
+Node: Memory requirements15702
+Node: Minimum file sizes16811
+Node: Trailing data18737
+Node: Examples20121
+Ref: concat-example21286
+Node: Problems21823
+Node: Concept index22349


 End Tag Table
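
Reviewer's note: the File format hunks in this diff describe the member layout (ID string, VN, DS, LZMA stream, CRC32, Data size, Member size) and give the worked example 0xD3 = 320 KiB for the coded dictionary size. The sketch below shows one way to read those fields. It is not code from plzip: the function names are hypothetical, and the bit layout of the DS byte (bits 4-0 hold the base-2 logarithm of the base size, bits 7-5 count the sixteenths of the base size to subtract) is taken from the lzip format description rather than from this diff, which shows only the example. The field widths (4 + 1 + 1 header bytes, 4 + 8 + 8 trailer bytes) are read off the box diagram.

```python
import struct

MIN_DICT_SIZE = 1 << 12   # 4 KiB, lower bound stated in the manual
MAX_DICT_SIZE = 1 << 29   # 512 MiB, upper bound stated in the manual


def decode_dictionary_size(ds):
    """Decode the one-byte coded dictionary size (the DS field).

    Assumed layout: bits 4-0 = log2 of the base size,
    bits 7-5 = number of sixteenths of the base size to subtract.
    """
    base = 1 << (ds & 0x1F)
    wedges = (ds >> 5) & 0x07
    size = base - wedges * (base // 16)
    if not MIN_DICT_SIZE <= size <= MAX_DICT_SIZE:
        raise ValueError("dictionary size out of range: %d" % size)
    return size


def parse_header(header):
    """Return (version, dictionary_size) from the 6-byte member header."""
    if len(header) != 6 or header[:4] != b"LZIP":
        raise ValueError("not an lzip member header")
    return header[4], decode_dictionary_size(header[5])


def parse_trailer(trailer):
    """Return (crc32, data_size, member_size) from the 20-byte trailer.

    All multibyte values are little endian, as the manual states.
    """
    if len(trailer) != 20:
        raise ValueError("trailer must be exactly 20 bytes")
    return struct.unpack("<IQQ", trailer)
```

With these definitions, `decode_dictionary_size(0xD3)` gives 327680 bytes, which is the 320 KiB of the manual's example. Note that reading the last 20 bytes of a file yields the trailer only when no trailing data follows the last member, which is the situation the new Trailing data node discusses.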