Diffstat (limited to 'doc')
-rw-r--r-- doc/lziprecover.1 43
-rw-r--r-- doc/lziprecover.info 354
-rw-r--r-- doc/lziprecover.texi 379
3 files changed, 424 insertions, 352 deletions
diff --git a/doc/lziprecover.1 b/doc/lziprecover.1
index e05a366..1f26b81 100644
--- a/doc/lziprecover.1
+++ b/doc/lziprecover.1
@@ -1,5 +1,5 @@
-.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.16.
-.TH LZIPRECOVER "1" "January 2022" "lziprecover 1.23" "User Commands"
+.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.49.2.
+.TH LZIPRECOVER "1" "June 2023" "lziprecover 1.24-pre1" "User Commands"
.SH NAME
lziprecover \- recovers data from damaged lzip files
.SH SYNOPSIS
@@ -45,7 +45,7 @@ convert lzma\-alone files to lzip format
write to standard output, keep input files
.TP
\fB\-d\fR, \fB\-\-decompress\fR
-decompress
+decompress, test compressed file integrity
.TP
\fB\-D\fR, \fB\-\-range\-decompress=\fR<n\-m>
decompress a range of bytes to stdout
@@ -83,8 +83,8 @@ place the output into <file>
\fB\-q\fR, \fB\-\-quiet\fR
suppress all messages
.TP
-\fB\-R\fR, \fB\-\-repair\fR
-try to repair a small error in file
+\fB\-R\fR, \fB\-\-byte\-repair\fR
+try to repair a corrupt byte in file
.TP
\fB\-s\fR, \fB\-\-split\fR
split multimember file in single\-member files
@@ -95,17 +95,26 @@ test compressed file integrity
\fB\-v\fR, \fB\-\-verbose\fR
be verbose (a 2nd \fB\-v\fR gives more)
.TP
-\fB\-\-loose\-trailing\fR
-allow trailing data seeming corrupt header
-.TP
-\fB\-\-dump=\fR<list>:d:t
-dump members listed/damaged, tdata to stdout
+\fB\-\-dump=\fR<list>:d:e:t
+dump members, damaged/empty, tdata to stdout
.TP
-\fB\-\-remove=\fR<list>:d:t
+\fB\-\-remove=\fR<list>:d:e:t
remove members, tdata from files in place
.TP
-\fB\-\-strip=\fR<list>:d:t
+\fB\-\-strip=\fR<list>:d:e:t
copy files to stdout stripping members given
+.TP
+\fB\-\-empty\-error\fR
+exit with error status if empty member in file
+.TP
+\fB\-\-marking\-error\fR
+exit with error status if 1st LZMA byte not 0
+.TP
+\fB\-\-loose\-trailing\fR
+allow trailing data seeming corrupt header
+.TP
+\fB\-\-clear\-marking\fR
+reset the first LZMA byte of each member
.PP
If no file names are given, or if a file is '\-', lziprecover decompresses
from standard input to standard output.
@@ -115,16 +124,16 @@ Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc...
To extract all the files from archive 'foo.tar.lz', use the commands
\&'tar \fB\-xf\fR foo.tar.lz' or 'lziprecover \fB\-cd\fR foo.tar.lz | tar \fB\-xf\fR \-'.
.PP
-Exit status: 0 for a normal exit, 1 for environmental problems (file
-not found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or
-invalid input file, 3 for an internal consistency error (e.g., bug) which
-caused lziprecover to panic.
+Exit status: 0 for a normal exit, 1 for environmental problems
+(file not found, invalid command line options, I/O errors, etc), 2 to
+indicate a corrupt or invalid input file, 3 for an internal consistency
+error (e.g., bug) which caused lziprecover to panic.
.SH "REPORTING BUGS"
Report bugs to lzip\-bug@nongnu.org
.br
Lziprecover home page: http://www.nongnu.org/lzip/lziprecover.html
.SH COPYRIGHT
-Copyright \(co 2022 Antonio Diaz Diaz.
+Copyright \(co 2023 Antonio Diaz Diaz.
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
.br
This is free software: you are free to change and redistribute it.
diff --git a/doc/lziprecover.info b/doc/lziprecover.info
index 112f65b..2ef7641 100644
--- a/doc/lziprecover.info
+++ b/doc/lziprecover.info
@@ -12,7 +12,7 @@ File: lziprecover.info, Node: Top, Next: Introduction, Up: (dir)
Lziprecover Manual
******************
-This manual is for Lziprecover (version 1.23, 21 January 2022).
+This manual is for Lziprecover (version 1.24-pre1, 14 June 2023).
* Menu:
@@ -32,7 +32,7 @@ This manual is for Lziprecover (version 1.23, 21 January 2022).
* Concept index:: Index of concepts
- Copyright (C) 2009-2022 Antonio Diaz Diaz.
+ Copyright (C) 2009-2023 Antonio Diaz Diaz.
This manual is free documentation: you have unlimited permission to copy,
distribute, and modify it.
@@ -89,9 +89,9 @@ byte near the beginning is a thing of the past.
Compression may be good for long-term archiving. For compressible data,
multiple compressed copies may provide redundancy in a more useful form and
may have a better chance of surviving intact than one uncompressed copy
-using the same amount of storage space. This is specially true if the format
-provides recovery capabilities like those of lziprecover, which is able to
-find and combine the good parts of several damaged copies.
+using the same amount of storage space. This is especially true if the
+format provides recovery capabilities like those of lziprecover, which is
+able to find and combine the good parts of several damaged copies.
Lziprecover is able to recover or decompress files produced by any of the
compressors in the lzip family: lzip, plzip, minilzip/lzlib, clzip, and
@@ -107,7 +107,7 @@ recoverable data in all members of the file can be extracted with the
following command (the resulting file may contain errors and some garbage
data may be produced at the end of each damaged member):
- lziprecover -cd -i file.lz > file
+ lziprecover -cd --ignore-errors file.lz > file
When recovering data, lziprecover takes as arguments the names of the
damaged files and writes zero or more recovered files depending on the
@@ -134,7 +134,8 @@ The format for running lziprecover is:
When decompressing or testing, a hyphen '-' used as a FILE argument means
standard input. It can be mixed with other FILES and is read just once, the
first time it appears in the command line. If no file names are specified,
-lziprecover decompresses from standard input to standard output.
+lziprecover decompresses from standard input to standard output. Remember
+to prepend './' to any file name beginning with a hyphen, or use '--'.
lziprecover supports the following options: *Note Argument syntax:
(arg_parser)Argument syntax.
@@ -181,13 +182,14 @@ lziprecover decompresses from standard input to standard output.
'-d'
'--decompress'
- Decompress the files specified. If a file does not exist, can't be
- opened, or the destination file already exists and '--force' has not
- been specified, lziprecover continues decompressing the rest of the
- files and exits with error status 1. If a file fails to decompress, or
- is a terminal, lziprecover exits immediately with error status 2
- without decompressing the rest of the files. A terminal is considered
- an uncompressed file, and therefore invalid.
+ Decompress the files specified. The integrity of the files specified is
+ checked. If a file does not exist, can't be opened, or the destination
+ file already exists and '--force' has not been specified, lziprecover
+ continues decompressing the rest of the files and exits with error
+ status 1. If a file fails to decompress, or is a terminal, lziprecover
+ exits immediately with error status 2 without decompressing the rest
+ of the files. A terminal is considered an uncompressed file, and
+ therefore invalid.
'-D RANGE'
'--range-decompress=RANGE'
@@ -197,7 +199,7 @@ lziprecover decompresses from standard input to standard output.
only decompresses the members containing the desired data. In order to
guarantee the correctness of the data produced, all members containing
any part of the desired data are decompressed and their integrity is
- verified.
+ checked.
Four formats of RANGE are recognized, 'BEGIN', 'BEGIN-END',
'BEGIN,SIZE', and ',SIZE'. If only BEGIN is specified, END is taken as
@@ -250,7 +252,7 @@ lziprecover decompresses from standard input to standard output.
errors, for example).
Make '--list', '--dump', '--remove', and '--strip' ignore format
- errors. The sizes of the members with errors (specially the last) may
+ errors. The sizes of the members with errors (especially the last) may
be wrong.
'-k'
@@ -271,11 +273,11 @@ lziprecover decompresses from standard input to standard output.
file numbers produced by '--split'.
If any file is damaged, does not exist, can't be opened, or is not
- regular, the final exit status will be > 0. '-lq' can be used to verify
+ regular, the final exit status will be > 0. '-lq' can be used to check
quickly (without decompressing) the structural integrity of the files
- specified. (Use '--test' to verify the data integrity). '-alq'
- additionally verifies that none of the files specified contain
- trailing data.
+ specified. (Use '--test' to check the data integrity). '-alq'
+ additionally checks that none of the files specified contain trailing
+ data.
'-m'
'--merge'
@@ -302,7 +304,7 @@ lziprecover decompresses from standard input to standard output.
Quiet operation. Suppress all messages.
'-R'
-'--repair'
+'--byte-repair'
Try to repair a FILE with small errors (up to one single-byte error
per member). If successful, a repaired copy is written to the file
'FILE_fixed.lz'. FILE is not modified at all. The exit status is 0 if
@@ -335,7 +337,7 @@ lziprecover decompresses from standard input to standard output.
really performs a trial decompression and throws away the result. Use
it together with '-v' to see information about the files. If a file
fails the test, does not exist, can't be opened, or is a terminal,
- lziprecover continues checking the rest of the files. A final
+ lziprecover continues testing the rest of the files. A final
diagnostic is shown at verbosity level 1 or higher if any file fails
the test when testing multiple files.
@@ -351,35 +353,31 @@ lziprecover decompresses from standard input to standard output.
In other modes, increasing verbosity levels show final status, progress
of operations, and extra information (for example, the failed areas).
-'--loose-trailing'
- When decompressing, testing, or listing, allow trailing data whose
- first bytes are so similar to the magic bytes of a lzip header that
- they can be confused with a corrupt header. Use this option if a file
- triggers a "corrupt header" error and the cause is not indeed a
- corrupt header.
-
-'--dump=[MEMBER_LIST][:damaged][:tdata]'
- Dump the members listed, the damaged members (if any), or the trailing
- data (if any) of one or more regular multimember files to standard
- output, or to a file if the option '--output' is used. If more than
- one file is given, the elements dumped from all files are concatenated.
- If a file does not exist, can't be opened, or is not regular,
- lziprecover continues processing the rest of the files. If the dump
- fails in one file, lziprecover exits immediately without processing the
- rest of the files. Only '--dump=tdata' can write to a terminal.
+'--dump=[MEMBER_LIST][:damaged][:empty][:tdata]'
+ Dump the members listed, the damaged members (if any), the empty
+ members (if any), or the trailing data (if any) of one or more regular
+ multimember files to standard output, or to a file if the option
+ '--output' is used. If more than one file is given, the elements
+ dumped from all the files are concatenated. If a file does not exist,
+ can't be opened, or is not regular, lziprecover continues processing
+ the rest of the files. If the dump fails in one file, lziprecover
+ exits immediately without processing the rest of the files. Only
+ '--dump=tdata' can write to a terminal. '--dump=damaged' implies
+ '--ignore-errors'.
The argument to '--dump' is a colon-separated list of the following
element specifiers; a member list (1,3-6), a reverse member list
- (r1,3-6), and the strings "damaged" and "tdata" (which may be shortened
- to 'd' and 't' respectively). A member list selects the members (or
- gaps) listed, whose numbers coincide with those shown by '--list'. A
- reverse member list selects the members listed counting from the last
- member in the file (r1). Negated versions of both kinds of lists exist
- (^1,3-6:r^1,3-6) which selects all the members except those in the
- list. The strings "damaged" and "tdata" select the damaged members and
- the trailing data respectively. If the same member is selected more
- than once, for example by '1:r1' in a single-member file, it is dumped
- just once. See the following examples:
+ (r1,3-6), and the strings "damaged", "empty", and "tdata" (which may
+ be shortened to 'd', 'e', and 't' respectively). A member list selects
+ the members (or gaps) listed, whose numbers coincide with those shown
+ by '--list'. A reverse member list selects the members listed counting
+ from the last member in the file (r1). Negated versions of both kinds
+ of lists exist (^1,3-6:r^1,3-6) which select all the members except
+ those in the list. The strings "damaged", "empty", and "tdata" select
+ the damaged members, the empty members (those with a data size = 0),
+ and the trailing data respectively. If the same member is selected
+ more than once, for example by '1:r1' in a single-member file, it is
+ dumped just once. See the following examples:
'--dump' argument Elements dumped
---------------------------------------------------------------------
@@ -388,43 +386,75 @@ lziprecover decompresses from standard input to standard output.
'^13,15' all but 13th and 15th members in file
'r^1' all but last member in file
'damaged' all damaged members in file
+ 'empty' all empty members in file
'tdata' trailing data
'1-5:r1:tdata' members 1 to 5, last member, trailing data
'damaged:tdata' damaged members, trailing data
'3,12:damaged:tdata' members 3, 12, damaged members, trailing data
-'--remove=[MEMBER_LIST][:damaged][:tdata]'
- Remove the members listed, the damaged members (if any), or the
- trailing data (if any) from regular multimember files in place. The
- date of each file is preserved if possible. If all members in a file
- are selected to be removed, the file is left unchanged and the exit
- status is set to 2. If a file does not exist, can't be opened, is not
- regular, or is left unchanged, lziprecover continues processing the
- rest of the files. In case of I/O error, lziprecover exits immediately
- without processing the rest of the files. See '--dump' above for a
- description of the argument.
-
- This option may be dangerous even if only the trailing data is being
+'--remove=[MEMBER_LIST][:damaged][:empty][:tdata]'
+ Remove the members listed, the damaged members (if any), the empty
+ members (if any), or the trailing data (if any) from regular
+ multimember files in place. The date of each file modified is
+ preserved if possible. If all members in a file are selected to be
+ removed, the file is left unchanged and the exit status is set to 2.
+ If a file does not exist, can't be opened, is not regular, or is left
+ unchanged, lziprecover continues processing the rest of the files. In
+ case of I/O error, lziprecover exits immediately without processing
+ the rest of the files. See '--dump' above for a description of the
+ argument.
+
+ This option may be dangerous even if only the trailing data are being
removed because the file may be corrupt or the trailing data may
contain a forbidden combination of characters. *Note Trailing data::.
- It is advisable to make a backup before attempting the removal. At
- least verify that 'lzip -cd file.lz | wc -c' and the uncompressed size
- shown by 'lzip -l file.lz' match before attempting the removal of
- trailing data.
-
-'--strip=[MEMBER_LIST][:damaged][:tdata]'
+ It is safer to send the output of '--strip' to a temporary file, check
+ it, and then copy it over the original file. But if you prefer
+ '--remove' because of its more efficient in-place removal, it is
+ advisable to make a backup before attempting the removal. At least
+ check that 'lzip -cd file.lz | wc -c' and the uncompressed size shown
+ by 'lzip -l file.lz' match before attempting the removal of trailing
+ data.
+
+'--strip=[MEMBER_LIST][:damaged][:empty][:tdata]'
Copy one or more regular multimember files to standard output (or to a
file if the option '--output' is used), stripping the members listed,
- the damaged members (if any), or the trailing data (if any) from each
- file. If all members in a file are selected to be stripped, the
- trailing data (if any) are also stripped even if 'tdata' is not
- specified. If more than one file is given, the files are concatenated.
- In this case the trailing data are also stripped from all but the last
- file even if 'tdata' is not specified. If a file does not exist, can't
- be opened, or is not regular, lziprecover continues processing the
- rest of the files. If a file fails to copy, lziprecover exits
- immediately without processing the rest of the files. See '--dump'
- above for a description of the argument.
+ the damaged members (if any), the empty members (if any), or the
+ trailing data (if any) from each file. If all members in a file are
+ selected to be stripped, the trailing data (if any) are also stripped
+ even if 'tdata' is not specified. If more than one file is given, the
+ files are concatenated. In this case the trailing data are also
+ stripped from all but the last file even if 'tdata' is not specified.
+ If a file does not exist, can't be opened, or is not regular,
+ lziprecover continues processing the rest of the files. If a file
+ fails to copy, lziprecover exits immediately without processing the
+ rest of the files. See '--dump' above for a description of the
+ argument.
+
+'--empty-error'
+ Exit with error status 2 if any empty member is found in the input
+ files.
+
+'--marking-error'
+ Exit with error status 2 if the first LZMA byte is non-zero in any
+ member of the input files. This may be caused by data corruption or by
+ deliberate insertion of tracking information in the file. Use
+ 'lziprecover --clear-marking' to clear any such non-zero bytes.
+
+'--loose-trailing'
+ When decompressing, testing, or listing, allow trailing data whose
+ first bytes are so similar to the magic bytes of a lzip header that
+ they can be confused with a corrupt header. Use this option if a file
+ triggers a "corrupt header" error and the cause is not indeed a
+ corrupt header.
+
+'--clear-marking'
+ Set to zero the first LZMA byte of each member in the files specified.
+ At verbosity level 1 (-v), print the number of members cleared. The
+ date of each file modified is preserved if possible. This option
+ exists because the first byte of the LZMA stream is ignored by the
+ range decoder, and can therefore be (mis)used to store any value which
+ can then be used as a watermark to track the path of the compressed
+ payload.
Lziprecover also supports the following debug options (for experts):
@@ -443,9 +473,9 @@ lziprecover decompresses from standard input to standard output.
'--md5sum'
Print to standard output the MD5 digests of the input FILES one per
line in the same format produced by the 'md5sum' tool. Lziprecover
- uses MD5 digests to verify the result of some operations. This option
- allows the verification of lziprecover's implementation of the MD5
- algorithm.
+ uses MD5 digests to check the result of some operations. This option
+ can be used to test the correctness of lziprecover's implementation of
+ the MD5 algorithm.
'-S[VALUE]'
'--nrep-stats[=VALUE]'
@@ -453,8 +483,8 @@ lziprecover decompresses from standard input to standard output.
VALUE in the compressed LZMA streams of the input FILES with the
frequency expected for random data (1 / 2^(8N)). If VALUE is not
specified, print the frequency of repeated sequences of all possible
- byte values. Print cumulative data for all files followed by the name
- of the first file with the longest sequence.
+ byte values. Print cumulative data for all the files, followed by the
+ name of the first file with the longest sequence.
'-U 1|BSIZE'
'--unzcrash=1|BSIZE'
@@ -509,31 +539,34 @@ lziprecover decompresses from standard input to standard output.
range-format::, for a description of RANGE.
'-Z POSITION,VALUE'
-'--debug-repair=POSITION,VALUE'
+'--debug-byte-repair=POSITION,VALUE'
Load the compressed FILE into memory, set the byte at POSITION to
- VALUE, and then try to repair the error. *Note --repair::.
+ VALUE, and then try to repair the byte error. *Note --byte-repair::.
- Numbers given as arguments to options may be followed by a multiplier
-and an optional 'B' for "byte".
+ Numbers given as arguments to options may be expressed in decimal,
+hexadecimal, or octal (using the same syntax as integer constants in C++),
+and may be followed by a multiplier and an optional 'B' for "byte".
Table of SI and binary prefixes (unit multipliers):
-Prefix Value | Prefix Value
-k kilobyte (10^3 = 1000) | Ki kibibyte (2^10 = 1024)
-M megabyte (10^6) | Mi mebibyte (2^20)
-G gigabyte (10^9) | Gi gibibyte (2^30)
-T terabyte (10^12) | Ti tebibyte (2^40)
-P petabyte (10^15) | Pi pebibyte (2^50)
-E exabyte (10^18) | Ei exbibyte (2^60)
-Z zettabyte (10^21) | Zi zebibyte (2^70)
-Y yottabyte (10^24) | Yi yobibyte (2^80)
+Prefix Value | Prefix Value
+k kilobyte (10^3 = 1000) | Ki kibibyte (2^10 = 1024)
+M megabyte (10^6) | Mi mebibyte (2^20)
+G gigabyte (10^9) | Gi gibibyte (2^30)
+T terabyte (10^12) | Ti tebibyte (2^40)
+P petabyte (10^15) | Pi pebibyte (2^50)
+E exabyte (10^18) | Ei exbibyte (2^60)
+Z zettabyte (10^21) | Zi zebibyte (2^70)
+Y yottabyte (10^24) | Yi yobibyte (2^80)
+R ronnabyte (10^27) | Ri robibyte (2^90)
+Q quettabyte (10^30) | Qi quebibyte (2^100)
Exit status: 0 for a normal exit, 1 for environmental problems (file not
-found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or invalid
-input file, 3 for an internal consistency error (e.g., bug) which caused
-lziprecover to panic.
+found, invalid command line options, I/O errors, etc), 2 to indicate a
+corrupt or invalid input file, 3 for an internal consistency error (e.g.,
+bug) which caused lziprecover to panic.

File: lziprecover.info, Node: Data safety, Next: Repairing one byte, Prev: Invoking lziprecover, Up: Top
@@ -593,7 +626,7 @@ only be recovered by an expert, if at all.
If you used bzip2, and if the file is large enough to contain more than
one compressed data block (usually larger than 900 kB uncompressed), and if
no block is damaged in both files, then the data can be manually recovered
-by splitting the files with bzip2recover, verifying every block, and then
+by splitting the files with bzip2recover, checking every block, and then
copying the right blocks in the right order into another file.
But if you used lzip, the data can be automatically recovered with
@@ -616,12 +649,12 @@ mailbox, therefore the initial part of two consecutive backups is identical
unless some messages have been changed or deleted in the meantime. The new
messages added to each backup are usually a small part of the whole mailbox.
-+========================================================+
-| Older backup containing some messages |
-+========================================================+
-+========================================================+================+
-| Newer backup containing the messages above plus some | new messages |
-+========================================================+================+
++============================================+
+| Older backup containing some messages |
++============================================+
++============================================+========================+
+| Newer backup containing the messages above | plus some new messages |
++============================================+========================+
One day you discover that your mailbox has disappeared because you
deleted it inadvertently or because of a bug in your email reader. Not only
@@ -644,7 +677,7 @@ combining the good blocks from both backups.
But if you used lzip, the whole newer backup can be automatically
recovered with 'lziprecover --reproduce' as long as the missing bytes can be
recovered from the older backup, even if other messages in the common part
-have been changed or deleted. Mailboxes seem to be specially easy to
+have been changed or deleted. Mailboxes seem to be especially easy to
reproduce. The probability of reproducing a mailbox (*note
performance-of-reproduce::) is almost as high as that of merging two
identical backups (*note performance-of-merge::).
@@ -791,7 +824,7 @@ feeding the concatenated data to the same version of lzip that created the
file. For this to work, a reference file is required containing the
uncompressed data corresponding to the missing compressed data of the zeroed
sector, plus some context data before and after them. It is possible to
-recover a large file using just a few KB of reference data.
+recover a large file using just a few kB of reference data.
The difficult part is finding a suitable reference file. It must contain
the exact data required (possibly mixed with other data). Containing similar
@@ -821,9 +854,9 @@ which should produce an output like the following:
foo: Match found at offset 296892
Reproduction succeeded at pos 65536
- 1 sectors tested
- 1 reproductions returned with zero status
- all comparisons passed
+ 1 sectors tested
+ 1 reproductions returned with zero status
+ all comparisons passed
Using 'foo' as reference file guarantees that any zeroed sector in
'foo.lz' can be reproduced because both files contain the same data. In
@@ -859,7 +892,7 @@ when they are required.
6.1 Performance of '--reproduce'
================================
-Reproduce mode is specially useful when recovering a corrupt backup (or a
+Reproduce mode is especially useful when recovering a corrupt backup (or a
corrupt source tarball) that is part of a series. Usually only a small
fraction of the data changes from one backup to the next or from one version
of a source tarball to the next. This makes sometimes possible to reproduce
@@ -890,11 +923,11 @@ gmp-6.1.1.tar gmp-6.1.2.tar.lz 175 / 473 = 37%
gmp-6.1.2.tar gmp-6.1.1.tar.lz 181 / 472 = 38.35%
Note that the "performance of reproduce" is a probability, not a partial
-recovery. The data is either recovered fully (with the probability X shown
+recovery. The data are either recovered fully (with the probability X shown
in the last column of the tables above) or not recovered at all (with
probability 1 - X).
- Example 1: Recover a damaged source tarball with a zeroed sector of 512
+Example 1: Recover a damaged source tarball with a zeroed sector of 512
bytes at file position 1019904, using as reference another source tarball
for a different version of the software.
@@ -1049,10 +1082,10 @@ File: lziprecover.info, Node: File names, Next: File format, Prev: Tarlz, Up
8 Names of the files produced by lziprecover
********************************************
-The name of the fixed file produced by '--merge' and '--repair' is made by
-appending the string '_fixed.lz' to the original file name. If the original
-file name ends with one of the extensions '.tar.lz', '.lz', or '.tlz', the
-string '_fixed' is inserted before the extension.
+The name of the fixed file produced by '--byte-repair' and '--merge' is
+made by appending the string '_fixed.lz' to the original file name. If the
+original file name ends with one of the extensions '.tar.lz', '.lz', or
+'.tlz', the string '_fixed' is inserted before the extension.

File: lziprecover.info, Node: File format, Next: Trailing data, Prev: File names, Up: Top
@@ -1124,10 +1157,10 @@ size of a multimember file is unlimited.
'Member size (8 bytes)'
Total size of the member, including header and trailer. This field acts
- as a distributed index, allows the verification of stream integrity,
- and facilitates the safe recovery of undamaged members from
- multimember files. Member size should be limited to 2 PiB to prevent
- the data size field from overflowing.
+ as a distributed index, improves the checking of stream integrity, and
+ facilitates the safe recovery of undamaged members from multimember
+ files. Lzip limits the member size to 2 PiB to prevent the data size
+ field from overflowing.

@@ -1143,12 +1176,13 @@ member. Such trailing data may be:
example when writing to a tape. It is safe to append any amount of
padding zero bytes to a lzip file.
- * Useful data added by the user; a cryptographically secure hash, a
+ * Useful data added by the user; an "End Of File" string (to check that
+ the file has not been truncated), a cryptographically secure hash, a
description of file contents, etc. It is safe to append any amount of
- text to a lzip file as long as none of the first four bytes of the text
- match the corresponding byte in the string "LZIP", and the text does
- not contain any zero bytes (null characters). Nonzero bytes and zero
- bytes can't be safely mixed in trailing data.
+ text to a lzip file as long as none of the first four bytes of the
+ text match the corresponding byte in the string "LZIP", and the text
+ does not contain any zero bytes (null characters). Nonzero bytes and
+ zero bytes can't be safely mixed in trailing data.
* Garbage added by some not totally successful copy operation.
@@ -1190,7 +1224,7 @@ Example 1: Add a comment or description to a compressed file.
lziprecover --remove=tdata file.lz
-Example 2: Add and verify a cryptographically secure hash. (This may be
+Example 2: Add and check a cryptographically secure hash. (This may be
convenient, but a separate copy of the hash must be kept in a safe place to
guarantee that both file and hash have not been maliciously replaced).
@@ -1217,7 +1251,7 @@ the operation is successful, 'file.lz' is removed.
lziprecover -d file.lz
-Example 3: Verify the integrity of the compressed file 'file.lz' and show
+Example 3: Check the integrity of the compressed file 'file.lz' and show
status.
lziprecover -tv file.lz
@@ -1233,7 +1267,7 @@ more compressed files. *Note Trailing data::.
You may also concatenate the compressed files like this
lziprecover --strip=tdata file1.lz file2.lz file3.lz > file123.lz
Or keeping the trailing data of the last file like this
- lziprecover --strip=damaged file1.lz file2.lz file3.lz > file123.lz
+ lziprecover --strip=empty file1.lz file2.lz file3.lz > file123.lz
Example 5: Decompress 'file.lz' partially until 10 KiB of decompressed data
@@ -1248,7 +1282,7 @@ Example 6: Decompress 'file.lz' partially from decompressed byte at offset
lziprecover -D 10000-15000 file.lz
-Example 7: Repair small errors in the file 'file.lz'. (Indented lines are
+Example 7: Repair a corrupt byte in the file 'file.lz'. (Indented lines are
abridged diagnostic messages from lziprecover).
lziprecover -v -R file.lz
@@ -1375,9 +1409,9 @@ tested must decompress it correctly for the comparisons to work.
for example.
'-n'
-'--no-verify'
- Skip initial verification of FILE and 'zcmp'. May speed up things a
- lot when testing many (or large) known good files.
+'--no-check'
+ Skip initial test of FILE and 'zcmp'. May speed up things a lot when
+ testing many (or large) known good files.
'-p BYTES'
'--position=BYTES'
@@ -1413,9 +1447,9 @@ tested must decompress it correctly for the comparisons to work.
Exit status: 0 for a normal exit, 1 for environmental problems (file not
-found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or invalid
-input file, 3 for an internal consistency error (e.g., bug) which caused
-unzcrash to panic.
+found, invalid command line options, I/O errors, etc), 2 to indicate a
+corrupt or invalid input file, 3 for an internal consistency error (e.g.,
+bug) which caused unzcrash to panic.

File: lziprecover.info, Node: Problems, Next: Concept index, Prev: Unzcrash, Up: Top
@@ -1465,31 +1499,31 @@ Concept index

Tag Table:
Node: Top226
-Node: Introduction1406
-Node: Invoking lziprecover5398
-Ref: --trailing-error6265
-Ref: range-format8644
-Ref: --reproduce8979
-Ref: --repair13278
-Node: Data safety25584
-Node: Merging with a backup27572
-Node: Reproducing a mailbox28836
-Node: Repairing one byte31337
-Node: Merging files33402
-Ref: performance-of-merge34572
-Ref: ddrescue-example36181
-Node: Reproducing one sector37468
-Ref: performance-of-reproduce41351
-Ref: ddrescue-example244026
-Node: Tarlz46446
-Node: File names50110
-Node: File format50567
-Node: Trailing data53258
-Node: Examples56499
-Ref: concat-example57075
-Node: Unzcrash58467
-Node: Problems64739
-Node: Concept index65291
+Node: Introduction1408
+Node: Invoking lziprecover5414
+Ref: --trailing-error6361
+Ref: range-format8793
+Ref: --reproduce9128
+Ref: --byte-repair13424
+Node: Data safety27441
+Node: Merging with a backup29429
+Node: Reproducing a mailbox30692
+Node: Repairing one byte33146
+Node: Merging files35211
+Ref: performance-of-merge36381
+Ref: ddrescue-example37990
+Node: Reproducing one sector39277
+Ref: performance-of-reproduce43163
+Ref: ddrescue-example245837
+Node: Tarlz48257
+Node: File names51921
+Node: File format52383
+Node: Trailing data55070
+Node: Examples58388
+Ref: concat-example58963
+Node: Unzcrash60355
+Node: Problems66633
+Node: Concept index67185

End Tag Table
diff --git a/doc/lziprecover.texi b/doc/lziprecover.texi
index 7b3449e..11a9ed5 100644
--- a/doc/lziprecover.texi
+++ b/doc/lziprecover.texi
@@ -6,8 +6,8 @@
@finalout
@c %**end of header
-@set UPDATED 21 January 2022
-@set VERSION 1.23
+@set UPDATED 14 June 2023
+@set VERSION 1.24-pre1
@dircategory Compression
@direntry
@@ -53,7 +53,7 @@ This manual is for Lziprecover (version @value{VERSION}, @value{UPDATED}).
@end menu
@sp 1
-Copyright @copyright{} 2009-2022 Antonio Diaz Diaz.
+Copyright @copyright{} 2009-2023 Antonio Diaz Diaz.
This manual is free documentation: you have unlimited permission to copy,
distribute, and modify it.
@@ -116,9 +116,9 @@ the beginning is a thing of the past.
Compression may be good for long-term archiving. For compressible data,
multiple compressed copies may provide redundancy in a more useful form and
may have a better chance of surviving intact than one uncompressed copy
-using the same amount of storage space. This is specially true if the format
-provides recovery capabilities like those of lziprecover, which is able to
-find and combine the good parts of several damaged copies.
+using the same amount of storage space. This is especially true if the
+format provides recovery capabilities like those of lziprecover, which is
+able to find and combine the good parts of several damaged copies.
Lziprecover is able to recover or decompress files produced by any of the
compressors in the lzip family: lzip, plzip, minilzip/lzlib, clzip, and
@@ -135,7 +135,7 @@ data in all members of the file can be extracted with the following command
at the end of each damaged member):
@example
-lziprecover -cd -i file.lz > file
+lziprecover -cd --ignore-errors file.lz > file
@end example
When recovering data, lziprecover takes as arguments the names of the
@@ -169,7 +169,8 @@ When decompressing or testing, a hyphen @samp{-} used as a @var{file}
argument means standard input. It can be mixed with other @var{files} and is
read just once, the first time it appears in the command line. If no file
names are specified, lziprecover decompresses from standard input to
-standard output.
+standard output. Remember to prepend @file{./} to any file name beginning
+with a hyphen, or use @samp{--}.
lziprecover supports the following
@uref{http://www.nongnu.org/arg-parser/manual/arg_parser_manual.html#Argument-syntax,,options}:
@@ -215,21 +216,21 @@ lzma-alone file as follows:
@item -c
@itemx --stdout
Write decompressed data to standard output; keep input files unchanged. This
-option (or @samp{-o}) is needed when reading from a named pipe (fifo) or
+option (or @option{-o}) is needed when reading from a named pipe (fifo) or
from a device. Use it also to recover as much of the decompressed data as
-possible when decompressing a corrupt file. @samp{-c} overrides @samp{-o}.
-@samp{-c} has no effect when merging, removing members, repairing,
+possible when decompressing a corrupt file. @option{-c} overrides @option{-o}.
+@option{-c} has no effect when merging, removing members, repairing,
reproducing, splitting, testing or listing.
@item -d
@itemx --decompress
-Decompress the files specified. If a file does not exist, can't be opened,
-or the destination file already exists and @samp{--force} has not been
-specified, lziprecover continues decompressing the rest of the files and
-exits with error status 1. If a file fails to decompress, or is a terminal,
-lziprecover exits immediately with error status 2 without decompressing the
-rest of the files. A terminal is considered an uncompressed file, and
-therefore invalid.
+Decompress the files specified. The integrity of the files specified is
+checked. If a file does not exist, can't be opened, or the destination file
+already exists and @option{--force} has not been specified, lziprecover
+continues decompressing the rest of the files and exits with error status 1.
+If a file fails to decompress, or is a terminal, lziprecover exits
+immediately with error status 2 without decompressing the rest of the files.
+A terminal is considered an uncompressed file, and therefore invalid.
@item -D @var{range}
@itemx --range-decompress=@var{range}
@@ -238,7 +239,7 @@ Decompress only a range of bytes starting at decompressed byte position
at 0. This option provides random access to the data in multimember files;
it only decompresses the members containing the desired data. In order to
guarantee the correctness of the data produced, all members containing any
-part of the desired data are decompressed and their integrity is verified.
+part of the desired data are decompressed and their integrity is checked.
@anchor{range-format}
Four formats of @var{range} are recognized, @samp{@var{begin}},
@@ -246,7 +247,7 @@ Four formats of @var{range} are recognized, @samp{@var{begin}},
@samp{,@var{size}}. If only @var{begin} is specified, @var{end} is taken as
the end of the file. If only @var{size} is specified, @var{begin} is taken
as the beginning of the file. The bytes produced are sent to standard output
-unless the option @samp{--output} is used.
+unless the option @option{--output} is used.
@anchor{--reproduce}
@item -e
@@ -262,16 +263,16 @@ of the reproduce mode.
@item --lzip-level=@var{digit}|a|m[@var{length}]
Try only the given compression level or match length limit when reproducing
-a zeroed sector. @samp{--lzip-level=a} tries all the compression levels
-@w{(0 to 9)}, while @samp{--lzip-level=m} tries all the match length limits
+a zeroed sector. @option{--lzip-level=a} tries all the compression levels
+@w{(0 to 9)}, while @option{--lzip-level=m} tries all the match length limits
@w{(5 to 273)}.
@item --lzip-name=@var{name}
-Set the name of the lzip executable used by @samp{--reproduce}. If
-@samp{--lzip-name} is not specified, @samp{lzip} is used.
+Set the name of the lzip executable used by @option{--reproduce}. If
+@option{--lzip-name} is not specified, @samp{lzip} is used.
@item --reference-file=@var{file}
-Set the reference file used by @samp{--reproduce}. It must contain the
+Set the reference file used by @option{--reproduce}. It must contain the
uncompressed data corresponding to the missing compressed data of the zeroed
sector, plus some context data before and after them.
@@ -281,7 +282,7 @@ Force overwrite of output files.
@item -i
@itemx --ignore-errors
-Make @samp{--decompress}, @samp{--test}, and @samp{--range-decompress}
+Make @option{--decompress}, @option{--test}, and @option{--range-decompress}
ignore format and data errors and continue decompressing the remaining
members in the file; keep input files unchanged. For example, the commands
@w{@samp{lziprecover -cd -i file.lz > file}} or
@@ -293,8 +294,8 @@ range decompressed may be smaller than the range requested, because of the
errors. The exit status is set to 0 unless other errors are found (I/O
errors, for example).
-Make @samp{--list}, @samp{--dump}, @samp{--remove}, and @samp{--strip}
-ignore format errors. The sizes of the members with errors (specially the
+Make @option{--list}, @option{--dump}, @option{--remove}, and @option{--strip}
+ignore format errors. The sizes of the members with errors (especially the
last) may be wrong.
@item -k
@@ -306,18 +307,18 @@ Keep (don't delete) input files during decompression.
Print the uncompressed size, compressed size, and percentage saved of the
files specified. Trailing data are ignored. The values produced are correct
even for multimember files. If more than one file is given, a final line
-containing the cumulative sizes is printed. With @samp{-v}, the dictionary
+containing the cumulative sizes is printed. With @option{-v}, the dictionary
size, the number of members in the file, and the amount of trailing data (if
-any) are also printed. With @samp{-vv}, the positions and sizes of each
-member in multimember files are also printed. With @samp{-i}, format errors
-are ignored, and with @samp{-ivv}, gaps between members are shown. The
-member numbers shown coincide with the file numbers produced by @samp{--split}.
+any) are also printed. With @option{-vv}, the positions and sizes of each
+member in multimember files are also printed. With @option{-i}, format errors
+are ignored, and with @option{-ivv}, gaps between members are shown. The
+member numbers shown coincide with the file numbers produced by @option{--split}.
If any file is damaged, does not exist, can't be opened, or is not regular,
-the final exit status will be @w{> 0}. @samp{-lq} can be used to verify
+the final exit status will be @w{> 0}. @option{-lq} can be used to check
quickly (without decompressing) the structural integrity of the files
-specified. (Use @samp{--test} to verify the data integrity). @samp{-alq}
-additionally verifies that none of the files specified contain trailing data.
+specified. (Use @option{--test} to check the data integrity). @option{-alq}
+additionally checks that none of the files specified contain trailing data.
@item -m
@itemx --merge
@@ -333,19 +334,19 @@ Place the output into @var{file} instead of into @samp{@var{file}_fixed.lz}.
If splitting, the names of the files produced are in the form
@samp{rec01@var{file}}, @samp{rec02@var{file}}, etc.
-If decompressing, or converting lzma-alone files, and @samp{-c} has not been
+If decompressing, or converting lzma-alone files, and @option{-c} has not been
also specified, write the decompressed or converted output to @var{file};
-keep input files unchanged. This option (or @samp{-c}) is needed when
+keep input files unchanged. This option (or @option{-c}) is needed when
reading from a named pipe (fifo) or from a device. @w{@samp{-o -}} is
-equivalent to @samp{-c}. @samp{-o} has no effect when testing or listing.
+equivalent to @option{-c}. @option{-o} has no effect when testing or listing.
@item -q
@itemx --quiet
Quiet operation. Suppress all messages.
-@anchor{--repair}
+@anchor{--byte-repair}
@item -R
-@itemx --repair
+@itemx --byte-repair
Try to repair a @var{file} with small errors (up to one single-byte error
per member). If successful, a repaired copy is written to the file
@samp{@var{file}_fixed.lz}. @var{file} is not modified at all. The exit
@@ -375,11 +376,11 @@ depending on the number of members in @var{file}.
@itemx --test
Check integrity of the files specified, but don't decompress them. This
really performs a trial decompression and throws away the result. Use it
-together with @samp{-v} to see information about the files. If a file
+together with @option{-v} to see information about the files. If a file
fails the test, does not exist, can't be opened, or is a terminal, lziprecover
-continues checking the rest of the files. A final diagnostic is shown at
-verbosity level 1 or higher if any file fails the test when testing
-multiple files.
+continues testing the rest of the files. A final diagnostic is shown at
+verbosity level 1 or higher if any file fails the test when testing multiple
+files.
@item -v
@itemx --verbose
@@ -389,38 +390,33 @@ verbosity level, showing status, compression ratio, dictionary size,
trailer contents (CRC, data size, member size), and up to 6 bytes of
trailing data (if any) both in hexadecimal and as a string of printable
ASCII characters.@*
-Two or more @samp{-v} options show the progress of decompression.@*
+Two or more @option{-v} options show the progress of decompression.@*
In other modes, increasing verbosity levels show final status, progress
of operations, and extra information (for example, the failed areas).
-@item --loose-trailing
-When decompressing, testing, or listing, allow trailing data whose first
-bytes are so similar to the magic bytes of a lzip header that they can
-be confused with a corrupt header. Use this option if a file triggers a
-"corrupt header" error and the cause is not indeed a corrupt header.
-
-@item --dump=[@var{member_list}][:damaged][:tdata]
-Dump the members listed, the damaged members (if any), or the trailing
-data (if any) of one or more regular multimember files to standard
-output, or to a file if the option @samp{--output} is used. If more than
-one file is given, the elements dumped from all files are concatenated.
-If a file does not exist, can't be opened, or is not regular,
-lziprecover continues processing the rest of the files. If the dump
-fails in one file, lziprecover exits immediately without processing the
-rest of the files. Only @samp{--dump=tdata} can write to a terminal.
-
-The argument to @samp{--dump} is a colon-separated list of the following
-element specifiers; a member list (1,3-6), a reverse member list
-(r1,3-6), and the strings "damaged" and "tdata" (which may be shortened
-to 'd' and 't' respectively). A member list selects the members (or
-gaps) listed, whose numbers coincide with those shown by @samp{--list}.
-A reverse member list selects the members listed counting from the last
-member in the file (r1). Negated versions of both kinds of lists exist
-(^1,3-6:r^1,3-6) which selects all the members except those in the list.
-The strings "damaged" and "tdata" select the damaged members and the
-trailing data respectively. If the same member is selected more than
-once, for example by @samp{1:r1} in a single-member file, it is dumped
-just once. See the following examples:
+@item --dump=[@var{member_list}][:damaged][:empty][:tdata]
+Dump the members listed, the damaged members (if any), the empty members (if
+any), or the trailing data (if any) of one or more regular multimember files
+to standard output, or to a file if the option @option{--output} is used. If
+more than one file is given, the elements dumped from all the files are
+concatenated. If a file does not exist, can't be opened, or is not regular,
+lziprecover continues processing the rest of the files. If the dump fails in
+one file, lziprecover exits immediately without processing the rest of the
+files. Only @option{--dump=tdata} can write to a terminal.
+@option{--dump=damaged} implies @option{--ignore-errors}.
+
+The argument to @option{--dump} is a colon-separated list of the following
+element specifiers; a member list (1,3-6), a reverse member list (r1,3-6),
+and the strings "damaged", "empty", and "tdata" (which may be shortened to
+'d', 'e', and 't' respectively). A member list selects the members (or gaps)
+listed, whose numbers coincide with those shown by @option{--list}. A reverse
+member list selects the members listed counting from the last member in the
+file (r1). Negated versions of both kinds of lists exist (^1,3-6:r^1,3-6)
+which select all the members except those in the list. The strings
+"damaged", "empty", and "tdata" select the damaged members, the empty
+members (those with a data size = 0), and the trailing data respectively. If
+the same member is selected more than once, for example by @samp{1:r1} in a
+single-member file, it is dumped just once. See the following examples:
@multitable {@code{3,12:damaged:tdata}} {members 3, 12, damaged members, trailing data}
@headitem @code{--dump} argument @tab Elements dumped
@@ -429,44 +425,71 @@ just once. See the following examples:
@item @code{^13,15} @tab all but 13th and 15th members in file
@item @code{r^1} @tab all but last member in file
@item @code{damaged} @tab all damaged members in file
+@item @code{empty} @tab all empty members in file
@item @code{tdata} @tab trailing data
@item @code{1-5:r1:tdata} @tab members 1 to 5, last member, trailing data
@item @code{damaged:tdata} @tab damaged members, trailing data
@item @code{3,12:damaged:tdata} @tab members 3, 12, damaged members, trailing data
@end multitable
-@item --remove=[@var{member_list}][:damaged][:tdata]
-Remove the members listed, the damaged members (if any), or the trailing
-data (if any) from regular multimember files in place. The date of each
-file is preserved if possible. If all members in a file are selected to
-be removed, the file is left unchanged and the exit status is set to 2.
-If a file does not exist, can't be opened, is not regular, or is left
-unchanged, lziprecover continues processing the rest of the files. In case
-of I/O error, lziprecover exits immediately without processing the rest of
-the files. See @samp{--dump} above for a description of the argument.
-
-This option may be dangerous even if only the trailing data is being
-removed because the file may be corrupt or the trailing data may contain
-a forbidden combination of characters. @xref{Trailing data}. It is
-advisable to make a backup before attempting the removal. At least
-verify that @w{@samp{lzip -cd file.lz | wc -c}} and the uncompressed
-size shown by @w{@samp{lzip -l file.lz}} match before attempting the
-removal of trailing data.
-
-@item --strip=[@var{member_list}][:damaged][:tdata]
-Copy one or more regular multimember files to standard output (or to a
-file if the option @samp{--output} is used), stripping the members
-listed, the damaged members (if any), or the trailing data (if any) from
-each file. If all members in a file are selected to be stripped, the
-trailing data (if any) are also stripped even if @samp{tdata} is not
-specified. If more than one file is given, the files are concatenated.
-In this case the trailing data are also stripped from all but the last
-file even if @samp{tdata} is not specified. If a file does not exist,
-can't be opened, or is not regular, lziprecover continues processing the
-rest of the files. If a file fails to copy, lziprecover exits
-immediately without processing the rest of the files. See @samp{--dump}
+@item --remove=[@var{member_list}][:damaged][:empty][:tdata]
+Remove the members listed, the damaged members (if any), the empty members
+(if any), or the trailing data (if any) from regular multimember files in
+place. The date of each file modified is preserved if possible. If all
+members in a file are selected to be removed, the file is left unchanged and
+the exit status is set to 2. If a file does not exist, can't be opened, is
+not regular, or is left unchanged, lziprecover continues processing the rest
+of the files. In case of I/O error, lziprecover exits immediately without
+processing the rest of the files. See @option{--dump} above for a description
+of the argument.
+
+This option may be dangerous even if only the trailing data are being
+removed because the file may be corrupt or the trailing data may contain a
+forbidden combination of characters. @xref{Trailing data}. It is safer to
+send the output of @option{--strip} to a temporary file, check it, and then
+copy it over the original file. But if you prefer @option{--remove} because of
+its more efficient in-place removal, it is advisable to make a backup before
+attempting the removal. At least check that @w{@samp{lzip -cd file.lz | wc -c}}
+and the uncompressed size shown by @w{@samp{lzip -l file.lz}} match before
+attempting the removal of trailing data.
+
+@item --strip=[@var{member_list}][:damaged][:empty][:tdata]
+Copy one or more regular multimember files to standard output (or to a file
+if the option @option{--output} is used), stripping the members listed, the
+damaged members (if any), the empty members (if any), or the trailing data
+(if any) from each file. If all members in a file are selected to be
+stripped, the trailing data (if any) are also stripped even if @samp{tdata}
+is not specified. If more than one file is given, the files are
+concatenated. In this case the trailing data are also stripped from all but
+the last file even if @samp{tdata} is not specified. If a file does not
+exist, can't be opened, or is not regular, lziprecover continues processing
+the rest of the files. If a file fails to copy, lziprecover exits
+immediately without processing the rest of the files. See @option{--dump}
above for a description of the argument.
+@item --empty-error
+Exit with error status 2 if any empty member is found in the input files.
+
+@item --marking-error
+Exit with error status 2 if the first LZMA byte is non-zero in any member of
+the input files. This may be caused by data corruption or by deliberate
+insertion of tracking information in the file. Use
+@w{@samp{lziprecover --clear-marking}} to clear any such non-zero bytes.
+
+@item --loose-trailing
+When decompressing, testing, or listing, allow trailing data whose first
+bytes are so similar to the magic bytes of a lzip header that they can
+be confused with a corrupt header. Use this option if a file triggers a
+"corrupt header" error and the cause is not indeed a corrupt header.
+
+@item --clear-marking
+Set to zero the first LZMA byte of each member in the files specified. At
+verbosity level 1 (-v), print the number of members cleared. The date of
+each file modified is preserved if possible. This option exists because the
+first byte of the LZMA stream is ignored by the range decoder, and can
+therefore be (mis)used to store any value which can then be used as a
+watermark to track the path of the compressed payload.
+
@end table
Lziprecover also supports the following debug options (for experts):
@@ -486,8 +509,9 @@ nonzero status only in case of fatal error.
@itemx --md5sum
Print to standard output the MD5 digests of the input @var{files} one per
line in the same format produced by the @command{md5sum} tool. Lziprecover
-uses MD5 digests to verify the result of some operations. This option allows
-the verification of lziprecover's implementation of the MD5 algorithm.
+uses MD5 digests to check the result of some operations. This option can be
+used to test the correctness of lziprecover's implementation of the MD5
+algorithm.
@item -S[@var{value}]
@itemx --nrep-stats[=@var{value}]
@@ -495,7 +519,7 @@ Compare the frequency of sequences of N repeated bytes of a given
@var{value} in the compressed LZMA streams of the input @var{files} with the
frequency expected for random data (1 / 2^(8N)). If @var{value} is not
specified, print the frequency of repeated sequences of all possible byte
-values. Print cumulative data for all files followed by the name of the
+values. Print cumulative data for all the files, followed by the name of the
first file with the longest sequence.
@item -U 1|B@var{size}
@@ -516,7 +540,7 @@ stream of the compressed input @var{file} like the command
but in memory, and therefore much faster. Testing and comparisons work just
like with the argument @samp{1} explained above.
-By default @samp{--unzcrash} only prints the interesting cases; CRC
+By default @option{--unzcrash} only prints the interesting cases; CRC
mismatches, size mismatches, unsupported marker codes, unexpected EOFs,
apparently successful decompressions, and decoder errors detected 50_000 or
more bytes beyond the byte (or the start of the block) being tested. At
@@ -551,34 +575,37 @@ decoder realized that the data contains an error. @xref{range-format}, for a
description of @var{range}.
@item -Z @var{position},@var{value}
-@itemx --debug-repair=@var{position},@var{value}
+@itemx --debug-byte-repair=@var{position},@var{value}
Load the compressed @var{file} into memory, set the byte at @var{position}
-to @var{value}, and then try to repair the error. @xref{--repair}.
+to @var{value}, and then try to repair the byte error. @xref{--byte-repair}.
@end table
-Numbers given as arguments to options may be followed by a multiplier
-and an optional @samp{B} for "byte".
+Numbers given as arguments to options may be expressed in decimal,
+hexadecimal, or octal (using the same syntax as integer constants in C++),
+and may be followed by a multiplier and an optional @samp{B} for "byte".
Table of SI and binary prefixes (unit multipliers):
-@multitable {Prefix} {kilobyte (10^3 = 1000)} {|} {Prefix} {kibibyte (2^10 = 1024)}
+@multitable {Prefix} {kilobyte (10^3 = 1000)} {|} {Prefix} {kibibyte (2^10 = 1024)}
@item Prefix @tab Value @tab | @tab Prefix @tab Value
-@item k @tab kilobyte (10^3 = 1000) @tab | @tab Ki @tab kibibyte (2^10 = 1024)
-@item M @tab megabyte (10^6) @tab | @tab Mi @tab mebibyte (2^20)
-@item G @tab gigabyte (10^9) @tab | @tab Gi @tab gibibyte (2^30)
-@item T @tab terabyte (10^12) @tab | @tab Ti @tab tebibyte (2^40)
-@item P @tab petabyte (10^15) @tab | @tab Pi @tab pebibyte (2^50)
-@item E @tab exabyte (10^18) @tab | @tab Ei @tab exbibyte (2^60)
-@item Z @tab zettabyte (10^21) @tab | @tab Zi @tab zebibyte (2^70)
-@item Y @tab yottabyte (10^24) @tab | @tab Yi @tab yobibyte (2^80)
+@item k @tab kilobyte (10^3 = 1000) @tab | @tab Ki @tab kibibyte (2^10 = 1024)
+@item M @tab megabyte (10^6) @tab | @tab Mi @tab mebibyte (2^20)
+@item G @tab gigabyte (10^9) @tab | @tab Gi @tab gibibyte (2^30)
+@item T @tab terabyte (10^12) @tab | @tab Ti @tab tebibyte (2^40)
+@item P @tab petabyte (10^15) @tab | @tab Pi @tab pebibyte (2^50)
+@item E @tab exabyte (10^18) @tab | @tab Ei @tab exbibyte (2^60)
+@item Z @tab zettabyte (10^21) @tab | @tab Zi @tab zebibyte (2^70)
+@item Y @tab yottabyte (10^24) @tab | @tab Yi @tab yobibyte (2^80)
+@item R @tab ronnabyte (10^27) @tab | @tab Ri @tab robibyte (2^90)
+@item Q @tab quettabyte (10^30) @tab | @tab Qi @tab quebibyte (2^100)
@end multitable
@sp 1
-Exit status: 0 for a normal exit, 1 for environmental problems (file not
-found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or invalid
-input file, 3 for an internal consistency error (e.g., bug) which caused
-lziprecover to panic.
+Exit status: 0 for a normal exit, 1 for environmental problems
+(file not found, invalid command line options, I/O errors, etc), 2 to
+indicate a corrupt or invalid input file, 3 for an internal consistency
+error (e.g., bug) which caused lziprecover to panic.
@node Data safety
@@ -636,7 +663,7 @@ only be recovered by an expert, if at all.
If you used bzip2, and if the file is large enough to contain more than one
compressed data block (usually larger than @w{900 kB} uncompressed), and if
no block is damaged in both files, then the data can be manually recovered
-by splitting the files with bzip2recover, verifying every block, and then
+by splitting the files with bzip2recover, checking every block, and then
copying the right blocks in the right order into another file.
But if you used lzip, the data can be automatically recovered with
@@ -659,12 +686,12 @@ unless some messages have been changed or deleted in the meantime. The new
messages added to each backup are usually a small part of the whole mailbox.
@verbatim
-+========================================================+
-| Older backup containing some messages                  |
-+========================================================+
-+========================================================+================+
-| Newer backup containing the messages above plus some   | new messages   |
-+========================================================+================+
++============================================+
+| Older backup containing some messages      |
++============================================+
++============================================+========================+
+| Newer backup containing the messages above | plus some new messages |
++============================================+========================+
@end verbatim
One day you discover that your mailbox has disappeared because you deleted
@@ -687,7 +714,7 @@ combining the good blocks from both backups.
But if you used lzip, the whole newer backup can be automatically recovered
with @w{@samp{lziprecover --reproduce}} as long as the missing bytes can be
recovered from the older backup, even if other messages in the common part
-have been changed or deleted. Mailboxes seem to be specially easy to
+have been changed or deleted. Mailboxes seem to be especially easy to
reproduce. The probability of reproducing a mailbox
(@pxref{performance-of-reproduce}) is almost as high as that of merging two
identical backups (@pxref{performance-of-merge}).
@@ -852,7 +879,7 @@ feeding the concatenated data to the same version of lzip that created the
file. For this to work, a reference file is required containing the
uncompressed data corresponding to the missing compressed data of the zeroed
sector, plus some context data before and after them. It is possible to
-recover a large file using just a few KB of reference data.
+recover a large file using just a few kB of reference data.
The difficult part is finding a suitable reference file. It must contain the
exact data required (possibly mixed with other data). Containing similar
@@ -886,9 +913,9 @@ Testing sectors of size 512 at file positions 65536 to 66047
foo: Match found at offset 296892
Reproduction succeeded at pos 65536
- 1 sectors tested
- 1 reproductions returned with zero status
- all comparisons passed
+ 1 sectors tested
+ 1 reproductions returned with zero status
+ all comparisons passed
@end example
Using @samp{foo} as reference file guarantees that any zeroed sector in
@@ -923,8 +950,8 @@ overhead. It uses basic ustar headers, and only adds extended pax headers
when they are required.
@anchor{performance-of-reproduce}
-@section Performance of @samp{--reproduce}
-Reproduce mode is specially useful when recovering a corrupt backup (or a
+@section Performance of @option{--reproduce}
+Reproduce mode is especially useful when recovering a corrupt backup (or a
corrupt source tarball) that is part of a series. Usually only a small
fraction of the data changes from one backup to the next or from one version
of a source tarball to the next. This makes sometimes possible to reproduce
@@ -957,10 +984,11 @@ real backups of my own working directory:
@end multitable
Note that the "performance of reproduce" is a probability, not a partial
-recovery. The data is either recovered fully (with the probability X shown
+recovery. The data are either recovered fully (with the probability X shown
in the last column of the tables above) or not recovered at all (with
probability @w{1 - X}).
+@noindent
Example 1: Recover a damaged source tarball with a zeroed sector of 512
bytes at file position 1019904, using as reference another source tarball
for a different version of the software.
@@ -1136,11 +1164,11 @@ archive contains the end-of-file blocks.
@chapter Names of the files produced by lziprecover
@cindex file names
-The name of the fixed file produced by @samp{--merge} and @samp{--repair} is
-made by appending the string @samp{_fixed.lz} to the original file name. If
-the original file name ends with one of the extensions @samp{.tar.lz},
-@samp{.lz}, or @samp{.tlz}, the string @samp{_fixed} is inserted before the
-extension.
+The name of the fixed file produced by @option{--byte-repair} and
+@option{--merge} is made by appending the string @samp{_fixed.lz} to the
+original file name. If the original file name ends with one of the
+extensions @samp{.tar.lz}, @samp{.lz}, or @samp{.tlz}, the string
+@samp{_fixed} is inserted before the extension.
@node File format
@@ -1224,10 +1252,10 @@ Size of the original uncompressed data.
@item Member size (8 bytes)
Total size of the member, including header and trailer. This field acts
-as a distributed index, allows the verification of stream integrity, and
+as a distributed index, improves the checking of stream integrity, and
facilitates the safe recovery of undamaged members from multimember files.
-Member size should be limited to @w{2 PiB} to prevent the data size field
-from overflowing.
+Lzip limits the member size to @w{2 PiB} to prevent the data size field from
+overflowing.
@end table
@@ -1246,12 +1274,13 @@ example when writing to a tape. It is safe to append any amount of
padding zero bytes to a lzip file.
@item
-Useful data added by the user; a cryptographically secure hash, a
-description of file contents, etc. It is safe to append any amount of
-text to a lzip file as long as none of the first four bytes of the text
-match the corresponding byte in the string "LZIP", and the text does not
-contain any zero bytes (null characters). Nonzero bytes and zero bytes
-can't be safely mixed in trailing data.
+Useful data added by the user; an "End Of File" string (to check that the
+file has not been truncated), a cryptographically secure hash, a description
+of file contents, etc. It is safe to append any amount of text to a lzip
+file as long as none of the first four bytes of the text match the
+corresponding byte in the string "LZIP", and the text does not contain any
+zero bytes (null characters). Nonzero bytes and zero bytes can't be safely
+mixed in trailing data.
@item
Garbage added by some not totally successful copy operation.
@@ -1269,7 +1298,7 @@ integrity information itself. Therefore it can be considered to be below
the noise level. Additionally, the test used by lziprecover to discriminate
trailing data from a corrupt header has a Hamming distance (HD) of 3,
and the 3 bit flips must happen in different magic bytes for the test to
-fail. In any case, the option @samp{--trailing-error} guarantees that
+fail. In any case, the option @option{--trailing-error} guarantees that
any corrupt header will be detected.
@end itemize
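The condition given earlier in this list for user-added trailing text (none of the first four bytes matches the corresponding byte of "LZIP", and no zero bytes anywhere) can be written as a small check. This is a sketch of the documented rule only, not the test lziprecover itself runs on trailing data. The name `is_safe_trailing_text` is hypothetical, and accepting an empty string is an assumption of the sketch.

```python
MAGIC = b"LZIP"  # the string named in the rule above


def is_safe_trailing_text(text: bytes) -> bool:
    """Check `text` against the documented rule for appended text (sketch)."""
    # The text must not contain any zero bytes (null characters).
    if b"\x00" in text:
        return False
    # None of the first four bytes may equal the byte of "LZIP" at the same
    # position. zip() stops early if the text is shorter than four bytes.
    return all(t != m for t, m in zip(text[:4], MAGIC))
```

For example, `b"End Of File\n"` passes the check, whereas `b"Last line"` fails because its first byte equals the first byte of "LZIP".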
@@ -1280,7 +1309,7 @@ possible in the presence of trailing data.
Trailing data can be safely ignored in most cases. In some cases, like
that of user-added data, they are expected to be ignored. In those cases
where a file containing trailing data must be rejected, the option
-@samp{--trailing-error} can be used. @xref{--trailing-error}.
+@option{--trailing-error} can be used. @xref{--trailing-error}.
Lziprecover facilitates the management of metadata stored as trailing
data in lzip files. See the following examples:
@@ -1301,7 +1330,7 @@ lziprecover --remove=tdata file.lz
@sp 1
@noindent
-Example 2: Add and verify a cryptographically secure hash. (This may be
+Example 2: Add and check a cryptographically secure hash. (This may be
convenient, but a separate copy of the hash must be kept in a safe place
to guarantee that both file and hash have not been maliciously replaced).
@@ -1335,7 +1364,7 @@ lziprecover -d file.lz
@sp 1
@noindent
-Example 3: Verify the integrity of the compressed file @samp{file.lz} and
+Example 3: Check the integrity of the compressed file @samp{file.lz} and
show status.
@example
@@ -1356,7 +1385,7 @@ Do this instead
You may also concatenate the compressed files like this
lziprecover --strip=tdata file1.lz file2.lz file3.lz > file123.lz
Or keeping the trailing data of the last file like this
- lziprecover --strip=damaged file1.lz file2.lz file3.lz > file123.lz
+ lziprecover --strip=empty file1.lz file2.lz file3.lz > file123.lz
@end example
@sp 1
@@ -1379,7 +1408,7 @@ lziprecover -D 10000-15000 file.lz
@sp 1
@noindent
-Example 7: Repair small errors in the file @samp{file.lz}. (Indented lines
+Example 7: Repair a corrupt byte in the file @samp{file.lz}. (Indented lines
are abridged diagnostic messages from lziprecover).
@example
@@ -1416,11 +1445,11 @@ decompresses it, increasing 256 times each byte of the compressed data, so
as to test all possible one-byte errors. Note that it may take years or even
centuries to test all possible one-byte errors in a large file (tens of MB).
-If the option @samp{--block} is given, unzcrash reads the file specified and
+If the option @option{--block} is given, unzcrash reads the file specified and
then repeatedly decompresses it, setting all bytes in each successive block
to the value given, so as to test all possible full sector errors.
-If the option @samp{--truncate} is given, unzcrash reads the file specified
+If the option @option{--truncate} is given, unzcrash reads the file specified
and then repeatedly decompresses it, truncating the file to increasing
lengths, so as to test all possible truncation points.
@@ -1448,7 +1477,7 @@ to understand the format being tested. For example the @samp{zcmp} provided
by @uref{http://www.nongnu.org/zutils/manual/zutils_manual.html#Zcmp,,zutils}.
If the @samp{zcmp} program used does not understand the format being tested,
all the comparisons will fail because the compressed files will be compared
-without being decompressed first. Use @samp{--zcmp=false} to disable
+without being decompressed first. Use @option{--zcmp=false} to disable
comparisons.
@ifnothtml
@xref{Zcmp,,,zutils}.
@@ -1499,12 +1528,12 @@ The number of N-bit errors per byte (N = 1 to 8) is:
Test block errors of given @var{size}, simulating a whole sector I/O error.
@var{size} defaults to 512 bytes. @var{value} defaults to 0. By default,
only contiguous, non-overlapping blocks are tested, but this may be changed
-with the option @samp{--delta}.
+with the option @option{--delta}.
@item -d @var{n}
@itemx --delta=@var{n}
Test one byte, block, or truncation size every @var{n} bytes. If
-@samp{--delta} is not specified, unzcrash tests all the bytes,
+@option{--delta} is not specified, unzcrash tests all the bytes,
non-overlapping blocks, or truncation sizes. Values of @var{n} smaller than
the block size will result in overlapping blocks. (Which is convenient for
testing because there are usually too few non-overlapping blocks in a file).
@@ -1520,9 +1549,9 @@ value of the byte at @var{position}. This option can be used to run tests
with a changed dictionary size, for example.
@item -n
-@itemx --no-verify
-Skip initial verification of @var{file} and @samp{zcmp}. May speed up things
-a lot when testing many (or large) known good files.
+@itemx --no-check
+Skip initial test of @var{file} and @samp{zcmp}. May speed up things a lot
+when testing many (or large) known good files.
@item -p @var{bytes}
@itemx --position=@var{bytes}
@@ -1536,13 +1565,13 @@ Quiet operation. Suppress all messages.
@item -s @var{bytes}
@itemx --size=@var{bytes}
Number of byte positions to test. If not specified, the rest of the file
-is tested (from @samp{--position} to end of file). Negative values are
+is tested (from @option{--position} to end of file). Negative values are
relative to the rest of the file.
@item -t
@itemx --truncate
Test all possible truncation points in the range specified by
-@samp{--position} and @samp{--size}.
+@option{--position} and @option{--size}.
@item -v
@itemx --verbose
@@ -1551,17 +1580,17 @@ Verbose mode.
@item -z
@itemx --zcmp=<command>
Set zcmp command name and options. Defaults to @samp{zcmp}. Use
-@samp{--zcmp=false} to disable comparisons. If testing a decompressor
+@option{--zcmp=false} to disable comparisons. If testing a decompressor
different from the one used by default by zcmp, it is needed to force
unzcrash and zcmp to use the same decompressor with a command like
@w{@samp{unzcrash --zcmp='zcmp --lz=plzip' 'plzip -t' @var{file}}}
@end table
-Exit status: 0 for a normal exit, 1 for environmental problems (file not
-found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or
-invalid input file, 3 for an internal consistency error (e.g., bug) which
-caused unzcrash to panic.
+Exit status: 0 for a normal exit, 1 for environmental problems
+(file not found, invalid command line options, I/O errors, etc), 2 to
+indicate a corrupt or invalid input file, 3 for an internal consistency
+error (e.g., bug) which caused unzcrash to panic.
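A script that drives unzcrash might translate these exit statuses into messages. The following is a minimal sketch based only on the four values listed above. The names `UNZCRASH_EXIT_STATUS` and `describe_exit_status` are hypothetical, and the text returned for any other value is an assumption.

```python
# Meanings of the exit statuses documented above.
UNZCRASH_EXIT_STATUS = {
    0: "normal exit",
    1: "environmental problem (file not found, invalid option, I/O error)",
    2: "corrupt or invalid input file",
    3: "internal consistency error (e.g., bug)",
}


def describe_exit_status(status: int) -> str:
    """Return a short description for an unzcrash exit status (sketch)."""
    # Values outside the documented set are reported as unknown (assumption).
    return UNZCRASH_EXIT_STATUS.get(status, "unknown exit status")
```

The status itself could be taken from the `returncode` attribute of the object returned by `subprocess.run`.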
@node Problems