summaryrefslogtreecommitdiffstats
path: root/README
blob: 75e8968d60069d4b11d972085f6f4fa72fa96394 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
Description

Plzip is a massively parallel (multi-threaded) implementation of lzip, fully
compatible with lzip 1.4 or newer. Plzip uses the compression library lzlib.

Lzip is a lossless data compressor with a user interface similar to the one
of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
chain-Algorithm' (LZMA) stream format and provides a 3 factor integrity
checking to maximize interoperability and optimize safety. Lzip can compress
about as fast as gzip (lzip -0) or compress most files more than bzip2
(lzip -9). Decompression speed is intermediate between gzip and bzip2.
Lzip is better than gzip and bzip2 from a data recovery perspective. Lzip
has been designed, written, and tested with great care to replace gzip and
bzip2 as the standard general-purpose compressed format for unix-like
systems.

Plzip can compress/decompress large files on multiprocessor machines much
faster than lzip, at the cost of a slightly reduced compression ratio (0.4
to 2 percent larger compressed files). Note that the number of usable
threads is limited by file size; on files larger than a few GB plzip can use
hundreds of processors, but on files of only a few MB plzip is no faster
than lzip.

For creation and manipulation of compressed tar archives tarlz can be more
efficient than using tar and plzip because tarlz is able to keep the
alignment between tar members and lzip members.

When compressing, plzip divides the input file into chunks and compresses as
many chunks simultaneously as worker threads are chosen, creating a
multimember compressed file.

When decompressing, plzip decompresses as many members simultaneously as
worker threads are chosen. Files that were compressed with lzip will not
be decompressed faster than using lzip (unless the option '-b' was used)
because lzip usually produces single-member files, which can't be
decompressed in parallel.

The lzip file format is designed for data sharing and long-term archiving,
taking into account both data integrity and decoder availability:

   * The lzip format provides very safe integrity checking and some data
     recovery means. The program lziprecover can repair bit flip errors
     (one of the most common forms of data corruption) in lzip files, and
     provides data recovery capabilities, including error-checked merging
     of damaged copies of a file.

   * The lzip format is as simple as possible (but not simpler). The lzip
     manual provides the source code of a simple decompressor along with a
     detailed explanation of how it works, so that with the only help of the
     lzip manual it would be possible for a digital archaeologist to extract
     the data from a lzip file long after quantum computers eventually
     render LZMA obsolete.

   * Additionally the lzip reference implementation is copylefted, which
     guarantees that it will remain free forever.

A nice feature of the lzip format is that a corrupt byte is easier to repair
the nearer it is from the beginning of the file. Therefore, with the help of
lziprecover, losing an entire archive just because of a corrupt byte near
the beginning is a thing of the past.

Plzip uses the same well-defined exit status values used by lzip, which
makes it safer than compressors returning ambiguous warning values (like
gzip) when it is used as a back end for other programs like tar or zutils.

Plzip will automatically use for each file the largest dictionary size that
does not exceed neither the file size nor the limit given. Keep in mind that
the decompression memory requirement is affected at compression time by the
choice of dictionary size limit.

When compressing, plzip replaces every file given in the command line
with a compressed version of itself, with the name "original_name.lz".
When decompressing, plzip attempts to guess the name for the decompressed
file from that of the compressed file as follows:

filename.lz    becomes   filename
filename.tlz   becomes   filename.tar
anyothername   becomes   anyothername.out

(De)compressing a file is much like copying or moving it. Therefore plzip
preserves the access and modification dates, permissions, and, when
possible, ownership of the file just as 'cp -p' does. (If the user ID or
the group ID can't be duplicated, the file permission bits S_ISUID and
S_ISGID are cleared).

Plzip is able to read from some types of non-regular files if either the
option '-c' or the option '-o' is specified.

If no file names are specified, plzip compresses (or decompresses) from
standard input to standard output. Plzip will refuse to read compressed data
from a terminal or write compressed data to a terminal, as this would be
entirely incomprehensible and might leave the terminal in an abnormal state.

Plzip will correctly decompress a file which is the concatenation of two or
more compressed files. The result is the concatenation of the corresponding
decompressed files. Integrity testing of concatenated compressed files is
also supported.

LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never have
been compressed. Decompressed is used to refer to data which have undergone
the process of decompression.


Copyright (C) 2009-2022 Antonio Diaz Diaz.

This file is free documentation: you have unlimited permission to copy,
distribute, and modify it.

The file Makefile.in is a data file used by configure to produce the
Makefile. It has the same copyright owner and permissions that configure
itself.