Diffstat (limited to 'doc/lziprecover.texi')
-rw-r--r-- doc/lziprecover.texi 84
1 files changed, 54 insertions, 30 deletions
diff --git a/doc/lziprecover.texi b/doc/lziprecover.texi
index 80d6eb4..08d4312 100644
--- a/doc/lziprecover.texi
+++ b/doc/lziprecover.texi
@@ -6,8 +6,8 @@
@finalout
@c %**end of header
-@set UPDATED 29 August 2014
-@set VERSION 1.16
+@set UPDATED 16 October 2014
+@set VERSION 1.17-pre1
@dircategory Data Compression
@direntry
@@ -39,6 +39,7 @@ This manual is for Lziprecover (version @value{VERSION}, @value{UPDATED}).
* Invoking lziprecover:: Command line interface
* Repairing files:: Fixing bit-flip and similar errors
* Merging files:: Fixing several damaged copies
+* File names:: Names of the files produced by lziprecover
* File format:: Detailed format of the compressed file
* Examples:: A small tutorial with examples
* Unzcrash:: Testing the robustness of decompressors
@@ -59,11 +60,13 @@ to copy, distribute and modify it.
Lziprecover is a data recovery tool and decompressor for files in the
lzip compressed data format (.lz), able to repair slightly damaged
-files, recover badly damaged files from two or more copies, extract data
-from damaged files, decompress files and test integrity of files.
+files, produce a correct file by merging the good parts of two or more
+damaged copies, extract data from damaged files, decompress files and
+test integrity of files.
-The lzip file format is designed for long-term data archiving, taking
-into account both data integrity and decoder availability:
+The lzip file format is designed for data sharing and long-term
+archiving, taking into account both data integrity and decoder
+availability:
@itemize @bullet
@item
@@ -82,8 +85,8 @@ data from a lzip file long after quantum computers eventually render
LZMA obsolete.
@item
-Additionally lzip is copylefted, which guarantees that it will remain
-free forever.
+Additionally, the lzip reference implementation is copylefted, which
+guarantees that it will remain free forever.
@end itemize
A nice feature of the lzip format is that a corrupt byte is easier to
@@ -196,7 +199,7 @@ information about the members in the file.
@item -m
@itemx --merge
-Try to produce a correct file merging the good parts of two or more
+Try to produce a correct file by merging the good parts of two or more
damaged copies. If successful, a repaired copy is written to the file
@samp{@var{file}_fixed.lz}. The exit status is 0 if a correct file could
be produced, 2 otherwise. See the chapter @samp{Merging files}
@@ -231,12 +234,12 @@ Search for members in @samp{@var{file}} and write each member in its own
integrity of the resulting files, decompress those which are undamaged,
and try to repair or partially decompress those which are damaged.
-The names of the files produced are in the form
-@samp{rec01@var{file}.lz}, @samp{rec02@var{file}.lz}, etc, and are
-designed so that the use of wildcards in subsequent processing, for
-example, @w{@samp{lziprecover -cd rec*@var{file}.lz > recovered_data}},
-processes the files in the correct order. The number of digits used in
-the names varies depending on the number of members in @samp{@var{file}}.
+The names of the files produced are in the form @samp{rec01@var{file}},
+@samp{rec02@var{file}}, etc., and are designed so that the use of
+wildcards in subsequent processing, for example, @w{@samp{lziprecover
+-cd rec*@var{file} > recovered_data}}, processes the files in the
+correct order. The number of digits used in the names varies depending
+on the number of members in @samp{@var{file}}.
@item -t
@itemx --test
@@ -282,17 +285,26 @@ caused lziprecover to panic.
@chapter Repairing files
@cindex repairing files
-Lziprecover is usually able to repair files with small errors (up to one
-byte error per member). The error may be located anywhere in the file
-except in the header (first 6 bytes of each member) or in the
-@samp{Member size} field of the trailer (last 8 bytes of each member).
-This makes lzip files resistant to bit-flip, one of the most common
-forms of data corruption.
+Lziprecover can perfectly repair most files with small errors (up to one
+single-byte error per member), without the need for any extra redundancy
+at all. If the repair is successful, the repaired file will be
+identical bit for bit to the original.
+
+The error may be located anywhere in the file except in the header
+(first 6 bytes of each member) or in the @samp{Member size} field of the
+trailer (last 8 bytes of each member). This makes lzip files resistant
+to bit-flip, one of the most common forms of data corruption.
Bit-flip happens when one bit in the file is changed from 0 to 1 or vice
versa. It may be caused by bad RAM or even by natural radiation. I have
seen a case of bit-flip in a file stored on a USB flash drive.
+One byte may seem small, but most file corruptions not produced by I/O
+errors affect just one byte, or even one bit, of the file. Also, unlike
+magnetic media, where errors usually affect a whole sector, solid-state
+storage devices tend to produce single-byte errors, making lzip the
+perfect format for data stored on such devices.
+
Repairing a file can take some time. Small files or files with the error
located near the beginning can be repaired in a few seconds. But
repairing a large file compressed with a large dictionary size and with
@@ -309,7 +321,7 @@ repairs more efficiently the worst errors.
If you have several copies of a file but all of them are too damaged to
repair them (@pxref{Repairing files}), lziprecover can try to produce a
-correct file merging the good parts of the damaged copies.
+correct file by merging the good parts of the damaged copies.
The merge may succeed even if some copies of the file have all the
headers and trailers damaged, as long as there is at least one copy of
@@ -321,16 +333,16 @@ damaged in all copies), or are adjacent and the boundary can't be
determined, or if the copies have too many damaged areas.
All the copies must have the same size. If some of them have been
-truncated and are therefore smaller than they should, you can extend
-them to the correct size with the following command before merging them
-with the other copies:
+truncated and are therefore smaller than they should be, they can be
+extended to the correct size with the following command before merging
+them with the other copies:
@example
ddrescue --extend-outfile=<correct_size> small_file.lz extended_file.lz
@end example
If some of the copies have got garbage data at the end and are therefore
-larger than they should, you can reduce their sizes to the correct value
+larger than they should be, their sizes can be reduced to the correct value
with the following command before merging them with the other copies:
@example
@@ -342,7 +354,19 @@ of them with one damaged area affecting 1 percent of the copy, the
probability of obtaining a correct file is about 98 percent. With three
such copies the probability rises to 99.97 percent. For large files (a
few MB) with small errors (one sector damaged per copy), the probability
-approaches 100 percent even with only two copies.
+approaches 100 percent even with only two copies (supposing that the
+errors are randomly located inside each copy).
+
+
+@node File names
+@chapter Names of the files produced by lziprecover
+@cindex file names
+
+The name of the fixed file produced by @samp{--merge} and
+@samp{--repair} is made by appending the string @samp{_fixed.lz} to the
+original file name. If the original file name ends with one of the
+extensions @samp{.tar.lz}, @samp{.lz} or @samp{.tlz}, the string
+@samp{_fixed} is inserted before the extension.
@node File format
@@ -541,9 +565,9 @@ accesses. If it does, please, report it as a bug.
Unzcrash really executes as a subprocess the shell command specified in
the first non-option argument, and then writes the file specified in the
second non-option argument to the standard input of the subprocess,
-modifying the corresponding byte each time. Therefore you can use
-unzcrash to test any decompressor (not only lzip), or even other decoder
-programs with a suitable command line syntax.
+modifying the corresponding byte each time. Therefore unzcrash can be
+used to test any decompressor (not only lzip), or even other decoder
+programs having a suitable command line syntax.
The format for running unzcrash is: