Diffstat (limited to 'doc/lzlib.texi')
-rw-r--r-- doc/lzlib.texi | 101
1 files changed, 51 insertions, 50 deletions
diff --git a/doc/lzlib.texi b/doc/lzlib.texi
index 462c840..23bbbfb 100644
--- a/doc/lzlib.texi
+++ b/doc/lzlib.texi
@@ -6,8 +6,8 @@
@finalout
@c %**end of header
-@set UPDATED 19 April 2024
-@set VERSION 1.15-pre1
+@set UPDATED 16 October 2024
+@set VERSION 1.15-pre2
@dircategory Compression
@direntry
@@ -45,7 +45,7 @@ This manual is for Lzlib (version @value{VERSION}, @value{UPDATED}).
* Error codes:: Meaning of codes returned by functions
* Error messages:: Error messages corresponding to error codes
* Invoking minilzip:: Command-line interface of the test program
-* Data format:: Detailed format of the compressed data
+* File format:: Detailed format of the compressed file
* Examples:: A small tutorial with examples
* Problems:: Reporting bugs
* Concept index:: Index of concepts
@@ -67,7 +67,7 @@ distribute, and modify it.
is a data compression library providing in-memory LZMA compression and
decompression functions, including integrity checking of the decompressed
data. The compressed data format used by the library is the lzip format.
-Lzlib is written in C.
+Lzlib is written in C and is distributed under a 2-clause BSD license.
The lzip file format is designed for data sharing and long-term archiving,
taking into account both data integrity and decoder availability:
@@ -103,16 +103,16 @@ lziprecover, losing an entire archive just because of a corrupt byte near
the beginning is a thing of the past.
The functions and variables forming the interface of the compression library
-are declared in the file @samp{lzlib.h}. Usage examples of the library are
-given in the files @samp{bbexample.c}, @samp{ffexample.c}, and
-@samp{minilzip.c} from the source distribution.
+are declared in the file @file{lzlib.h}. Usage examples of the library are
+given in the files @file{bbexample.c}, @file{ffexample.c}, and
+@file{minilzip.c} from the source distribution.
-As @samp{lzlib.h} can be used by C and C++ programs, it must not impose a
+As @file{lzlib.h} can be used by C and C++ programs, it must not impose a
choice of system headers on the program by including one of them. Therefore
it is the responsibility of the program using lzlib to include before
-@samp{lzlib.h} some header that declares the type @samp{uint8_t}. There are
-at least four such headers in C and C++: @samp{stdint.h}, @samp{cstdint},
-@samp{inttypes.h}, and @samp{cinttypes}.
+@file{lzlib.h} some header that declares the type @samp{uint8_t}. There are
+at least four such headers in C and C++: @file{stdint.h}, @file{cstdint},
+@file{inttypes.h}, and @file{cinttypes}.
All the library functions are thread safe. The library does not install any
signal handler. The decoder checks the consistency of the compressed data,
@@ -183,10 +183,10 @@ versions of itself down to 1.0. Any application working with an older lzlib
should work with a newer lzlib. Installing a newer lzlib should not break
anything. This chapter describes the constants and functions that the
application can use to discover the version of the library being used. All
-of them are declared in @samp{lzlib.h}.
+of them are declared in @file{lzlib.h}.
@defvr Constant LZ_API_VERSION
-This constant is defined in @samp{lzlib.h} and works as a version test
+This constant is defined in @file{lzlib.h} and works as a version test
macro. The application should check at compile time that LZ_API_VERSION is
greater than or equal to the version required by the application:
@@ -207,7 +207,7 @@ which allow the application to announce to the library its desire to have
certain symbols and prototypes exposed.
@deftypefun int LZ_api_version ( void )
-If LZ_API_VERSION >= 1012, this function is declared in @samp{lzlib.h} (else
+If LZ_API_VERSION >= 1012, this function is declared in @file{lzlib.h} (else
it doesn't exist). It returns the LZ_API_VERSION of the library object code
being used. The application should check at run time that the value
returned by @code{LZ_api_version} is greater than or equal to the version
@@ -225,7 +225,7 @@ the functionality required by the application.
@end deftypefun
@deftypevr Constant {const char *} LZ_version_string
-This string constant is defined in the header file @samp{lzlib.h} and
+This string constant is defined in the header file @file{lzlib.h} and
represents the version of the library being used at compile time.
@end deftypevr
@@ -385,7 +385,7 @@ lzip files; it is a device for interactive communication between
applications using lzlib, but is useless and wasteful in a file, and is
excluded from the media type @samp{application/lzip}. The LZMA marker
@samp{2} ("End Of Stream" marker) is the only marker allowed in lzip files.
-@xref{Data format}.
+@xref{File format}.
Repeated use of @samp{LZ_compress_sync_flush} may degrade compression
ratio, so use it only when needed. If the interval between calls to
@@ -524,7 +524,7 @@ detecting a truncated member.
@deftypefun int LZ_decompress_reset ( struct LZ_Decoder * const @var{decoder} )
Resets the internal state of @var{decoder} as it was just after opening
it with the function @samp{LZ_decompress_open}. Data stored in the
-internal buffers is discarded. Position counters are set to 0.
+internal buffers are discarded. Position counters are set to 0.
@end deftypefun
@@ -670,7 +670,7 @@ necessarily LZ_ok, and you should not use @samp{LZ_(de)compress_errno}
to determine whether a call failed. If the call failed, then you can
examine @samp{LZ_(de)compress_errno}.
-The error codes are defined in the header file @samp{lzlib.h}.
+The error codes are defined in the header file @file{lzlib.h}.
@deftypevr Constant {enum LZ_Errno} LZ_ok
The value of this constant is 0 and is used to indicate that there is no error.
@@ -693,8 +693,8 @@ finished.
@end deftypevr
@deftypevr Constant {enum LZ_Errno} LZ_header_error
-An invalid member header (one with the wrong magic bytes) was read. If
-this happens at the end of the data stream it may indicate trailing data.
+An invalid member header (one with the wrong magic bytes) was read. If this
+happens at the end of the data stream it may indicate trailing data.
@end deftypevr
@deftypevr Constant {enum LZ_Errno} LZ_unexpected_eof
@@ -702,11 +702,12 @@ The end of the data stream was reached in the middle of a member.
@end deftypevr
@deftypevr Constant {enum LZ_Errno} LZ_data_error
-The data stream is corrupt. If @samp{LZ_decompress_member_position} is 6
-or less, it indicates either a format version not supported, an invalid
-dictionary size, a corrupt header in a multimember data stream, or
-trailing data too similar to a valid lzip header. Lziprecover can be
-used to remove conflicting trailing data from a file.
+The data stream is corrupt. If @samp{LZ_decompress_member_position} is 6 or
+less, it indicates either a format version not supported, an invalid
+dictionary size, a nonzero first LZMA byte, a corrupt header in a multimember
+data stream, or trailing data too similar to a valid lzip header.
+Lziprecover can be used to repair some of these errors and to remove
+conflicting trailing data from a file.
@end deftypevr
@deftypevr Constant {enum LZ_Errno} LZ_library_error
@@ -747,8 +748,8 @@ maximum dictionary size is 512 MiB so that any lzip file can be decompressed
on 32-bit machines. Lzip provides accurate and robust 3-factor integrity
checking. Lzip can compress about as fast as gzip @w{(lzip -0)} or compress most
files more than bzip2 @w{(lzip -9)}. Decompression speed is intermediate between
-gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
-perspective. Lzip has been designed, written, and tested with great care to
+gzip and bzip2. Lzip provides better data recovery capabilities than gzip
+and bzip2. Lzip has been designed, written, and tested with great care to
replace gzip and bzip2 as the standard general-purpose compressed format for
Unix-like systems.
@@ -766,6 +767,7 @@ argument means standard input. It can be mixed with other @var{files} and is
read just once, the first time it appears in the command line. Remember to
prepend @file{./} to any file name beginning with a hyphen, or use @samp{--}.
+@noindent
minilzip supports the following
@uref{http://www.nongnu.org/arg-parser/manual/arg_parser_manual.html#Argument-syntax,,options}:
@ifnothtml
@@ -823,7 +825,7 @@ Force overwrite of output files.
@item -F
@itemx --recompress
When compressing, force re-compression of files whose name already has
-the @samp{.lz} or @samp{.tlz} suffix.
+the @file{.lz} or @file{.tlz} suffix.
@item -k
@itemx --keep
@@ -846,8 +848,8 @@ when reading from a named pipe (fifo) or from a device. @w{@option{-o -}} is
equivalent to @option{-c}. @option{-o} has no effect when testing.
When compressing and splitting the output in volumes, @var{file} is used as
-a prefix, and several files named @samp{@var{file}00001.lz},
-@samp{@var{file}00002.lz}, etc, are created. In this case, only one input
+a prefix, and several files named @file{@var{file}00001.lz},
+@file{@var{file}00002.lz}, etc, are created. In this case, only one input
file is allowed.
@item -q
@@ -873,7 +875,7 @@ is affected at compression time by the choice of dictionary size limit.
@itemx --volume-size=@var{bytes}
When compressing, and @option{-c} has not been also specified, split the
compressed output into several volume files with names
-@samp{original_name00001.lz}, @samp{original_name00002.lz}, etc, and set the
+@file{original_name00001.lz}, @file{original_name00002.lz}, etc, and set the
volume size limit to @var{bytes}. Input files are kept unchanged. Each
volume is a complete, maybe multimember, lzip file. A small volume size may
degrade compression ratio, so use it only when needed. Valid values range
@@ -892,11 +894,10 @@ files.
@item -v
@itemx --verbose
Verbose mode.@*
-When compressing, show the compression ratio and size for each file
-processed.@*
-When decompressing or testing, further -v's (up to 4) increase the
-verbosity level, showing status, compression ratio, dictionary size,
-and trailer contents (CRC, data size, member size).
+When compressing, show the compression ratio and size for each file processed.@*
+When decompressing or testing, further -v's (up to 4) increase the verbosity
+level, showing status, compression ratio, dictionary size, and trailer
+contents (CRC, data size, member size).
@item -0 .. -9
Compression level. Set the compression parameters (dictionary size and
@@ -915,7 +916,7 @@ given, the last setting is used. For example @w{@option{-9 -s64MiB}} is
equivalent to @w{@option{-s64MiB -m273}}
@multitable {Level} {Dictionary size (-s)} {Match length limit (-m)}
-@item Level @tab Dictionary size (-s) @tab Match length limit (-m)
+@headitem Level @tab Dictionary size (-s) @tab Match length limit (-m)
@item -0 @tab 64 KiB @tab 16 bytes
@item -1 @tab 1 MiB @tab 5 bytes
@item -2 @tab 1.5 MiB @tab 6 bytes
@@ -960,7 +961,7 @@ and may be followed by a multiplier and an optional @samp{B} for "byte".
Table of SI and binary prefixes (unit multipliers):
@multitable {Prefix} {kilobyte (10^3 = 1000)} {|} {Prefix} {kibibyte (2^10 = 1024)}
-@item Prefix @tab Value @tab | @tab Prefix @tab Value
+@headitem Prefix @tab Value @tab | @tab Prefix @tab Value
@item k @tab kilobyte (10^3 = 1000) @tab | @tab Ki @tab kibibyte (2^10 = 1024)
@item M @tab megabyte (10^6) @tab | @tab Mi @tab mebibyte (2^20)
@item G @tab gigabyte (10^9) @tab | @tab Gi @tab gibibyte (2^30)
@@ -980,15 +981,14 @@ indicate a corrupt or invalid input file, 3 for an internal consistency
error (e.g., bug) which caused minilzip to panic.
-@node Data format
-@chapter Data format
-@cindex data format
+@node File format
+@chapter File format
+@cindex file format
Perfection is reached, not when there is no longer anything to add, but
when there is no longer anything to take away.@*
--- Antoine de Saint-Exupery
-@sp 1
In the diagram below, a box like this:
@verbatim
@@ -1007,12 +1007,13 @@ represents one byte; a box like this:
represents a variable number of bytes.
-@sp 1
-Lzip data consist of one or more independent "members" (compressed data
-sets). The members simply appear one after another in the data stream, with
-no additional information before, between, or after them. Each member can
+@noindent
+A lzip file consists of one or more independent "members" (compressed data
+sets). The members simply appear one after another in the file, with no
+additional information before, between, or after them. Each member can
encode in compressed form up to @w{16 EiB - 1 byte} of uncompressed data.
-The size of a multimember data stream is unlimited.
+The size of a multimember file is unlimited. Empty members (data size = 0)
+are not allowed in multimember files.
Each member has the following structure:
@@ -1043,7 +1044,7 @@ Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB@*
Valid values for dictionary size range from 4 KiB to 512 MiB.
@item LZMA stream
-The LZMA stream, finished by an "End Of Stream" marker. Uses default values
+The LZMA stream, terminated by an "End Of Stream" marker. Uses default values
for encoder properties.
@ifnothtml
@xref{Stream format,,,lzip},
@@ -1077,8 +1078,8 @@ overflowing.
@cindex examples
This chapter provides real code examples for the most common uses of the
-library. See these examples in context in the files @samp{bbexample.c} and
-@samp{ffexample.c} from the source distribution of lzlib.
+library. See these examples in context in the files @file{bbexample.c} and
+@file{ffexample.c} from the source distribution of lzlib.
Note that the interface of lzlib is symmetrical. That is, the code for
normal compression and decompression is identical except because one calls