README


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89

See the file INSTALL for compilation and installation instructions.

Description

Lzlib is a data compression library providing in-memory LZMA compression and
decompression functions, including integrity checking of the decompressed
data. The compressed data format used by the library is the lzip format.
Lzlib is written in C and is distributed under a 2-clause BSD license.

The functions and variables forming the interface of the compression library
are declared in the file 'lzlib.h'. Usage examples of the library are given
in the files 'bbexample.c', 'ffexample.c', and 'minilzip.c' from the source
distribution.

As 'lzlib.h' can be used in C and C++ programs, it must not impose a choice
of system headers on the program by including one of them. Therefore it is
the responsibility of the program using lzlib to include before 'lzlib.h'
some header that declares the type 'uint8_t'. There are at least four such
headers in C and C++: 'stdint.h', 'cstdint', 'inttypes.h', and 'cinttypes'.

All the library functions are thread safe. The library does not install any
signal handler. The decoder checks the consistency of the compressed data,
so the library should never crash even in case of corrupted input.

Compression/decompression is done by repeatedly calling a couple of
read/write functions until all the data have been processed by the library.
This interface is safer and less error prone than the traditional zlib
interface.

Compression/decompression is done when the read function is called. This
means the value returned by the position functions is not updated until a
read call, even if a lot of data are written. If you want the data to be
compressed in advance, just call the read function with a size equal to 0.

If all the data to be compressed are written in advance, lzlib automatically
adjusts the header of the compressed data to use the largest dictionary size
that does not exceed neither the data size nor the limit given to
'LZ_compress_open'. This feature reduces the amount of memory needed for
decompression and allows minilzip to produce identical compressed output as
lzip.

Lzlib correctly decompresses a data stream which is the concatenation of
two or more compressed data streams. The result is the concatenation of the
corresponding decompressed data streams. Integrity testing of concatenated
compressed data streams is also supported.

Lzlib is able to compress and decompress streams of unlimited size by
automatically creating multimember output. The members so created are large,
about 2 PiB each.

In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a
concrete algorithm; it is more like "any algorithm using the LZMA coding
scheme". For example, the option '-0' of lzip uses the scheme in almost the
simplest way possible; issuing the longest match it can find, or a literal
byte if it can't find a match. Inversely, a more elaborate way of finding
coding sequences of minimum size than the one currently used by lzip could
be developed, and the resulting sequence could also be coded using the LZMA
coding scheme.

Lzlib currently implements two variants of the LZMA algorithm: fast (used by
option '-0' of minilzip) and normal (used by all other compression levels).

The high compression of LZMA comes from combining two basic, well-proven
compression ideas: sliding dictionaries (LZ77) and Markov models (the thing
used by every compression algorithm that uses a range encoder or similar
order-0 entropy coder as its last stage) with segregation of contexts
according to what the bits are used for.

The ideas embodied in lzlib are due to (at least) the following people:
Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
definition of Markov chains), G.N.N. Martin (for the definition of range
encoding), Igor Pavlov (for putting all the above together in LZMA), and
Julian Seward (for bzip2's CLI).

LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never have
been compressed. Decompressed is used to refer to data which have undergone
the process of decompression.

minilzip uses Arg_parser for command-line argument parsing:
http://www.nongnu.org/arg-parser/arg_parser.html


Copyright (C) 2009-2025 Antonio Diaz Diaz.

This file is free documentation: you have unlimited permission to copy,
distribute, and modify it.

The file Makefile.in is a data file used by configure to produce the Makefile.
It has the same copyright owner and permissions that configure itself.