summaryrefslogtreecommitdiffstats
path: root/README
diff options
context:
space:
mode:
Diffstat (limited to 'README')
-rw-r--r--README149
1 files changed, 76 insertions, 73 deletions
diff --git a/README b/README
index fa1da74..2207fb5 100644
--- a/README
+++ b/README
@@ -1,105 +1,108 @@
Description
-Lzlib is a data compression library providing in-memory LZMA compression
-and decompression functions, including integrity checking of the
-decompressed data. The compressed data format used by the library is the
-lzip format. Lzlib is written in C.
+Lzlib is a data compression library providing in-memory LZMA compression and
+decompression functions, including integrity checking of the decompressed
+data. The compressed data format used by the library is the lzip format.
+Lzlib is written in C.
The lzip file format is designed for data sharing and long-term archiving,
taking into account both data integrity and decoder availability:
* The lzip format provides very safe integrity checking and some data
- recovery means. The lziprecover program can repair bit flip errors
- (one of the most common forms of data corruption) in lzip files,
- and provides data recovery capabilities, including error-checked
- merging of damaged copies of a file.
-
- * The lzip format is as simple as possible (but not simpler). The
- lzip manual provides the source code of a simple decompressor
- along with a detailed explanation of how it works, so that with
- the only help of the lzip manual it would be possible for a
- digital archaeologist to extract the data from a lzip file long
- after quantum computers eventually render LZMA obsolete.
+ recovery means. The program lziprecover can repair bit flip errors
+ (one of the most common forms of data corruption) in lzip files, and
+ provides data recovery capabilities, including error-checked merging
+ of damaged copies of a file.
+
+ * The lzip format is as simple as possible (but not simpler). The lzip
+ manual provides the source code of a simple decompressor along with a
+ detailed explanation of how it works, so that with the only help of the
+ lzip manual it would be possible for a digital archaeologist to extract
+ the data from a lzip file long after quantum computers eventually
+ render LZMA obsolete.
* Additionally the lzip reference implementation is copylefted, which
guarantees that it will remain free forever.
-A nice feature of the lzip format is that a corrupt byte is easier to
-repair the nearer it is from the beginning of the file. Therefore, with
-the help of lziprecover, losing an entire archive just because of a
-corrupt byte near the beginning is a thing of the past.
+A nice feature of the lzip format is that a corrupt byte is easier to repair
+the nearer it is from the beginning of the file. Therefore, with the help of
+lziprecover, losing an entire archive just because of a corrupt byte near
+the beginning is a thing of the past.
-The functions and variables forming the interface of the compression
-library are declared in the file 'lzlib.h'. Usage examples of the
-library are given in the files 'main.c' and 'bbexample.c' from the
-source distribution.
+The functions and variables forming the interface of the compression library
+are declared in the file 'lzlib.h'. Usage examples of the library are given
+in the files 'bbexample.c', 'ffexample.c', and 'minilzip.c' from the source
+distribution.
+
+As 'lzlib.h' can be used by C and C++ programs, it must not impose a choice
+of system headers on the program by including one of them. Therefore it is
+the responsibility of the program using lzlib to include before 'lzlib.h'
+some header that declares the type 'uint8_t'. There are at least four such
+headers in C and C++: 'stdint.h', 'cstdint', 'inttypes.h', and 'cinttypes'.
+
+All the library functions are thread safe. The library does not install any
+signal handler. The decoder checks the consistency of the compressed data,
+so the library should never crash even in case of corrupted input.
Compression/decompression is done by repeatedly calling a couple of
-read/write functions until all the data have been processed by the
-library. This interface is safer and less error prone than the
-traditional zlib interface.
+read/write functions until all the data have been processed by the library.
+This interface is safer and less error prone than the traditional zlib
+interface.
Compression/decompression is done when the read function is called. This
-means the value returned by the position functions will not be updated
-until a read call, even if a lot of data are written. If you want the
-data to be compressed in advance, just call the read function with a
-size equal to 0.
-
-If all the data to be compressed are written in advance, lzlib will
-automatically adjust the header of the compressed data to use the
-largest dictionary size that does not exceed neither the data size nor
-the limit given to LZ_compress_open. This feature reduces the amount of
-memory needed for decompression and allows minilzip to produce identical
-compressed output as lzip.
-
-Lzlib will correctly decompress a data stream which is the concatenation
-of two or more compressed data streams. The result is the concatenation
-of the corresponding decompressed data streams. Integrity testing of
-concatenated compressed data streams is also supported.
+means the value returned by the position functions is not updated until a
+read call, even if a lot of data are written. If you want the data to be
+compressed in advance, just call the read function with a size equal to 0.
+
+If all the data to be compressed are written in advance, lzlib automatically
+adjusts the header of the compressed data to use the largest dictionary size
+that does not exceed neither the data size nor the limit given to
+'LZ_compress_open'. This feature reduces the amount of memory needed for
+decompression and allows minilzip to produce identical compressed output as
+lzip.
+
+Lzlib correctly decompresses a data stream which is the concatenation of
+two or more compressed data streams. The result is the concatenation of the
+corresponding decompressed data streams. Integrity testing of concatenated
+compressed data streams is also supported.
Lzlib is able to compress and decompress streams of unlimited size by
-automatically creating multimember output. The members so created are
-large, about 2 PiB each.
-
-All the library functions are thread safe. The library does not install
-any signal handler. The decoder checks the consistency of the compressed
-data, so the library should never crash even in case of corrupted input.
+automatically creating multimember output. The members so created are large,
+about 2 PiB each.
In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a
concrete algorithm; it is more like "any algorithm using the LZMA coding
-scheme". For example, the option '-0' of lzip uses the scheme in almost
-the simplest way possible; issuing the longest match it can find, or a
-literal byte if it can't find a match. Inversely, a much more elaborated
-way of finding coding sequences of minimum size than the one currently
-used by lzip could be developed, and the resulting sequence could also
-be coded using the LZMA coding scheme.
+scheme". For example, the option '-0' of lzip uses the scheme in almost the
+simplest way possible; issuing the longest match it can find, or a literal
+byte if it can't find a match. Inversely, a much more elaborated way of
+finding coding sequences of minimum size than the one currently used by lzip
+could be developed, and the resulting sequence could also be coded using the
+LZMA coding scheme.
-Lzlib currently implements two variants of the LZMA algorithm; fast
-(used by option '-0' of minilzip) and normal (used by all other
-compression levels).
+Lzlib currently implements two variants of the LZMA algorithm: fast (used by
+option '-0' of minilzip) and normal (used by all other compression levels).
The high compression of LZMA comes from combining two basic, well-proven
-compression ideas: sliding dictionaries (LZ77/78) and markov models (the
-thing used by every compression algorithm that uses a range encoder or
-similar order-0 entropy coder as its last stage) with segregation of
-contexts according to what the bits are used for.
+compression ideas: sliding dictionaries (LZ77) and Markov models (the thing
+used by every compression algorithm that uses a range encoder or similar
+order-0 entropy coder as its last stage) with segregation of contexts
+according to what the bits are used for.
The ideas embodied in lzlib are due to (at least) the following people:
-Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for
-the definition of Markov chains), G.N.N. Martin (for the definition of
-range encoding), Igor Pavlov (for putting all the above together in
-LZMA), and Julian Seward (for bzip2's CLI).
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
+definition of Markov chains), G.N.N. Martin (for the definition of range
+encoding), Igor Pavlov (for putting all the above together in LZMA), and
+Julian Seward (for bzip2's CLI).
-LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never
-have been compressed. Decompressed is used to refer to data which have
-undergone the process of decompression.
+LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never have
+been compressed. Decompressed is used to refer to data which have undergone
+the process of decompression.
-Copyright (C) 2009-2019 Antonio Diaz Diaz.
+Copyright (C) 2009-2024 Antonio Diaz Diaz.
This file is free documentation: you have unlimited permission to copy,
-distribute and modify it.
+distribute, and modify it.
-The file Makefile.in is a data file used by configure to produce the
-Makefile. It has the same copyright owner and permissions that configure
-itself.
+The file Makefile.in is a data file used by configure to produce the Makefile.
+It has the same copyright owner and permissions that configure itself.