Diffstat
-rw-r--r--  AUTHORS                     7
-rw-r--r--  COPYING                     17
-rw-r--r--  COPYING.GPL                 338
-rw-r--r--  ChangeLog                   266
-rw-r--r--  INSTALL                     85
-rw-r--r--  Makefile.in                 205
-rw-r--r--  NEWS                        19
-rw-r--r--  README                      108
-rw-r--r--  bbexample.c                 367
-rw-r--r--  carg_parser.c               319
-rw-r--r--  carg_parser.h               97
-rw-r--r--  cbuffer.c                   143
-rwxr-xr-x  configure                   244
-rw-r--r--  decoder.c                   145
-rw-r--r--  decoder.h                   463
-rw-r--r--  doc/lzlib.info              1336
-rw-r--r--  doc/lzlib.texi              1407
-rw-r--r--  doc/minilzip.1              136
-rw-r--r--  encoder.c                   587
-rw-r--r--  encoder.h                   326
-rw-r--r--  encoder_base.c              194
-rw-r--r--  encoder_base.h              609
-rw-r--r--  fast_encoder.c              175
-rw-r--r--  fast_encoder.h              70
-rw-r--r--  ffexample.c                 300
-rw-r--r--  lzcheck.c                   400
-rw-r--r--  lzip.h                      294
-rw-r--r--  lzlib.c                     601
-rw-r--r--  lzlib.h                     110
-rw-r--r--  minilzip.c                  1292
-rwxr-xr-x  testsuite/check.sh          449
-rw-r--r--  testsuite/fox.lz            bin  0 -> 80 bytes
-rw-r--r--  testsuite/fox_bcrc.lz       bin  0 -> 80 bytes
-rw-r--r--  testsuite/fox_crc0.lz       bin  0 -> 80 bytes
-rw-r--r--  testsuite/fox_das46.lz      bin  0 -> 80 bytes
-rw-r--r--  testsuite/fox_de20.lz       bin  0 -> 80 bytes
-rw-r--r--  testsuite/fox_lf            9
-rw-r--r--  testsuite/fox_mes81.lz      bin  0 -> 80 bytes
-rw-r--r--  testsuite/fox_s11.lz        bin  0 -> 80 bytes
-rw-r--r--  testsuite/fox_v2.lz         bin  0 -> 80 bytes
-rw-r--r--  testsuite/test.txt          676
-rw-r--r--  testsuite/test.txt.lz       bin  0 -> 7376 bytes
-rw-r--r--  testsuite/test_em.txt.lz    bin  0 -> 14024 bytes
-rw-r--r--  testsuite/test_sync.lz      bin  0 -> 7568 bytes
44 files changed, 11794 insertions, 0 deletions
diff --git a/AUTHORS b/AUTHORS
new file mode 100644
index 0000000..dfd16e1
--- /dev/null
+++ b/AUTHORS
@@ -0,0 +1,7 @@
+Lzlib was written by Antonio Diaz Diaz.
+
+The ideas embodied in lzlib are due to (at least) the following people:
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
+definition of Markov chains), G.N.N. Martin (for the definition of range
+encoding), Igor Pavlov (for putting all the above together in LZMA), and
+Julian Seward (for bzip2's CLI).
diff --git a/COPYING b/COPYING
new file mode 100644
index 0000000..a6511c8
--- /dev/null
+++ b/COPYING
@@ -0,0 +1,17 @@
+ Lzlib - Compression library for the lzip format
+ Copyright (C) Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
diff --git a/COPYING.GPL b/COPYING.GPL
new file mode 100644
index 0000000..4ad17ae
--- /dev/null
+++ b/COPYING.GPL
@@ -0,0 +1,338 @@
+ GNU GENERAL PUBLIC LICENSE
+ Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ Preamble
+
+ The licenses for most software are designed to take away your
+freedom to share and change it. By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users. This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it. (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.) You can apply it to
+your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+ To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have. You must make sure that they, too, receive or can get the
+source code. And you must show them these terms so they know their
+rights.
+
+ We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+ Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software. If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+ Finally, any free program is threatened constantly by software
+patents. We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary. To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+ GNU GENERAL PUBLIC LICENSE
+ TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+ 0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License. The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language. (Hereinafter, translation is included without limitation in
+the term "modification".) Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope. The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+ 1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+ 2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+ a) You must cause the modified files to carry prominent notices
+ stating that you changed the files and the date of any change.
+
+ b) You must cause any work that you distribute or publish, that in
+ whole or in part contains or is derived from the Program or any
+ part thereof, to be licensed as a whole at no charge to all third
+ parties under the terms of this License.
+
+ c) If the modified program normally reads commands interactively
+ when run, you must cause it, when started running for such
+ interactive use in the most ordinary way, to print or display an
+ announcement including an appropriate copyright notice and a
+ notice that there is no warranty (or else, saying that you provide
+ a warranty) and that users may redistribute the program under
+ these conditions, and telling the user how to view a copy of this
+ License. (Exception: if the Program itself is interactive but
+ does not normally print such an announcement, your work based on
+ the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole. If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works. But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+ 3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+ a) Accompany it with the complete corresponding machine-readable
+ source code, which must be distributed under the terms of Sections
+ 1 and 2 above on a medium customarily used for software interchange; or,
+
+ b) Accompany it with a written offer, valid for at least three
+ years, to give any third party, for a charge no more than your
+ cost of physically performing source distribution, a complete
+ machine-readable copy of the corresponding source code, to be
+ distributed under the terms of Sections 1 and 2 above on a medium
+ customarily used for software interchange; or,
+
+ c) Accompany it with the information you received as to the offer
+ to distribute corresponding source code. (This alternative is
+ allowed only for noncommercial distribution and only if you
+ received the program in object code or executable form with such
+ an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it. For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable. However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+ 4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License. Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+ 5. You are not required to accept this License, since you have not
+signed it. However, nothing else grants you permission to modify or
+distribute the Program or its derivative works. These actions are
+prohibited by law if you do not accept this License. Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+ 6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions. You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+ 7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all. For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices. Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+ 8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded. In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+ 9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time. Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number. If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation. If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+ 10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission. For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this. Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+ NO WARRANTY
+
+ 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+ 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+ END OF TERMS AND CONDITIONS
+
+ How to Apply These Terms to Your New Programs
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+ <one line to give the program's name and a brief idea of what it does.>
+ Copyright (C) <year> <name of author>
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+ Gnomovision version 69, Copyright (C) <year> <name of author>
+ Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License. Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary. Here is a sample; alter the names:
+
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+ `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+ <signature of Ty Coon>, 1 April 1989
+ Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs. If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library. If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
diff --git a/ChangeLog b/ChangeLog
new file mode 100644
index 0000000..f178fe7
--- /dev/null
+++ b/ChangeLog
@@ -0,0 +1,266 @@
+2024-01-20 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.14 released.
+ * minilzip.c: Reformat file diagnostics as 'PROGRAM: FILE: MESSAGE'.
+ (show_option_error): New function showing argument and option name.
+ (main): Make -o preserve date/mode/owner if 1 input file.
+ * lzip.h: Rename verify_* to check_*.
+ * lzlib.texi: Document the need to declare uint8_t before lzlib.h.
+ (Reported by Michal Górny).
+ * configure, Makefile.in: New variable 'MAKEINFO'.
+ * INSTALL: Document use of CFLAGS+='--std=c99 -D_XOPEN_SOURCE=500'.
+
+2022-01-23 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.13 released.
+ * configure: Set variables AR and ARFLAGS. (Reported by Hoël Bézier).
+ * main.c: Rename to minilzip.c.
+ * minilzip.c (getnum): Show option name and valid range if error.
+ (check_lib): Check that LZ_API_VERSION and LZ_version_string match.
+ * Improve several descriptions in manual, '--help', and man page.
+ * lzlib.texi: Change GNU Texinfo category to 'Compression'.
+ (Reported by Alfred M. Szmidt).
+
+2021-01-02 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.12 released.
+ * lzlib.h: Define LZ_API_VERSION as 1000 * major + minor. 1.12 = 1012.
+ This change does not affect the soversion.
+ * lzlib.h, lzlib.c: New function LZ_api_version.
+ * LZd_try_verify_trailer: Return 2 if EOF at trailer or EOS marker.
+ * Decompression speed has been slightly increased.
+ * decoder.h: Increase 'rd_min_available_bytes' from 8 to 10.
+ * encoder_base.c (LZeb_try_sync_flush):
+ Compensate for the increase in 'rd_min_available_bytes'.
+ * main.c (do_decompress): Fix false report about library stall.
+ * main.c: New option '--check-lib'.
+ * main.c (main): Report an error if a file name is empty.
+ Make '-o' behave like '-c', but writing to file instead of stdout.
+ Make '-c' and '-o' check whether the output is a terminal only once.
+ Do not open output if input is a terminal.
+ Replace 'decompressed', 'compressed' with 'out', 'in' in output.
+ Set a valid invocation_name even if argc == 0.
+ * lzlib.texi: Document the new way of checking the library version.
+ Document that 'LZ_(de)compress_close' and 'LZ_(de)compress_errno'
+ can be called with a null argument.
+ Document that sync flush marker is not allowed in lzip files.
+ Document the consequences of not calling 'LZ_decompress_finish'.
+ Document that 'LZ_decompress_read' returns at least once per member.
+ Document that 'LZ_(de)compress_read' can be called with a null
+ buffer pointer argument.
+ Real code examples for common uses have been added to the tutorial.
+ * bbexample.c: Don't use 'LZ_(de)compress_write_size'.
+ * lzcheck.c: New options '-s' (sync) and '-m' (member by member).
+ Test member by member without 'LZ_decompress_finish'.
+ * ffexample.c: New file containing example functions for file-to-file
+ compression/decompression.
+ * Document extraction from tar.lz in '--help' output and man page.
+ * Makefile.in: 'install-bin' no longer installs the man page.
+ New targets 'install-bin-compress' and 'install-bin-strip-compress'.
+ * testsuite: Add 9 new test files.
+
+2019-01-02 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.11 released.
+ * Rename File_* to Lzip_*.
+ * LZ_decompress_read: Don't return error until all data is read.
+ * decoder.c (LZd_decode_member): Decode truncated data until EOF.
+ * cbuffer.c (Cb_read_data): Allow a null buffer pointer.
+ * main.c: Don't allow mixing different operations (-d and -t).
+ * main.c: Check return value of close( infd ).
+ * main.c: Compile on DOS with DJGPP.
+ * lzlib.texi: Improve descriptions of '-0..-9', '-m', and '-s'.
+ Document that 'LZ_(de)compress_finish' can be called repeatedly.
+ * configure: Accept appending to CFLAGS; 'CFLAGS+=OPTIONS'.
+ * Makefile.in: Rename targets 'install-bin*' to 'install-lib*'.
+ * Makefile.in: Targets 'install-bin*' now install minilzip.
+ * INSTALL: Document use of CFLAGS+='-D __USE_MINGW_ANSI_STDIO'.
+
+2018-02-07 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.10 released.
+ * LZ_compress_finish now adjusts dictionary size for each member.
+ (Older versions can adjust dictionary size only once).
+ * lzlib.c (LZ_decompress_read): Detect corrupt header with HD=3.
+ * main.c: New option '--loose-trailing'.
+ * main.c (main): Option '-S, --volume-size' now keeps input files.
+ * main.c: Replace 'bits/byte' with inverse compression ratio.
+ * main.c: Show final diagnostic when testing multiple files.
+ * main.c: Do not add a second .lz extension to the arg of -o.
+ * main.c: Show dictionary size at verbosity level 4 (-vvvv).
+ * lzlib.texi: New chapter 'Invoking minilzip'.
+
+2017-04-11 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.9 released.
+ * Compression time of option '-0' has been reduced by 3%.
+ * Compression time of options -1 to -9 has been reduced by 1%.
+ * Decompression time has been reduced by 3%.
+ * main.c: Continue testing if any input file is a terminal.
+ * Change the license of the library to "2-clause BSD".
+
+2016-05-17 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.8 released.
+ * lzlib.h: Define LZ_API_VERSION to 1.
+ * lzlib.c (LZ_decompress_sync_to_member): Add skipped size to in_size.
+ * decoder.c (LZd_verify_trailer): Remove test of final code.
+ * main.c: New option '-a, --trailing-error'.
+ * main.c (main): Delete '--output' file if infd is a terminal.
+ * main.c (main): Don't use stdin more than once.
+ * configure: Avoid warning on some shells when testing for gcc.
+ * Makefile.in: Detect the existence of install-info.
+ * check.sh: A POSIX shell is required to run the tests.
+ * check.sh: Don't check error messages.
+
+2015-07-08 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.7 released.
+ * Port fast encoder and option '-0' from lzip.
+ * If open-->write-->finish, produce same dictionary size as lzip.
+ * Makefile.in: New targets 'install*-compress'.
+
+2014-08-27 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.6 released.
+ * Compression ratio of option '-9' has been slightly increased.
+ * configure: New options '--disable-static' and '--disable-ldconfig'.
+ * Makefile.in: Ignore errors from ldconfig.
+ * Makefile.in: Use 'CFLAGS' in every invocation of 'CC'.
+ * main.c (close_and_set_permissions): Behave like 'cp -p'.
+ * lzlib.texinfo: Rename to lzlib.texi.
+ * Change license to "GPL version 2 or later with link exception".
+
+2013-09-15 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.5 released.
+ * Remove decompression support for version 0 files.
+ * The LZ_compress_sync_flush mechanism has been fixed (again).
+ * Minor fixes.
+
+2013-05-28 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.4 released.
+ * Multi-step trials have been implemented.
+ * Compression ratio has been slightly increased.
+ * Compression time has been reduced by 8%.
+ * Decompression time has been reduced by 7%.
+ * lzlib.h: Change 'long long' values to 'unsigned long long'.
+ * encoder.c (Mf_init): Reduce minimum buffer size to 64KiB.
+ * lzlib.c (LZ_decompress_read): Tell LZ_header_error from
+ LZ_unexpected_eof the same way as lzip does.
+ * Makefile.in: New targets 'install-as-lzip' and 'install-bin'.
+ * main.c: Use 'setmode' instead of '_setmode' on Windows and OS/2.
+ * main.c: Define 'strtoull' to 'strtoul' on Windows.
+
+2012-02-29 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 1.3 released.
+ * Translated to C from the C++ source of lzlib 1.2.
+ * configure: Rename 'datadir' to 'datarootdir'.
+
+2011-10-25 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 1.2 released.
+ * encoder.h (Lee_update_prices): Update high length symbol prices
+ independently of the value of 'pos_state'. This gives better
+ compression for large values of '--match-length' without being
+ slower.
+ * encoder.h, encoder.cc: Optimize pair price calculations, reducing
+ compression time for large values of '--match-length' by up to 6%.
+ * main.cc: New option '-F, --recompress'.
+ * Makefile.in: 'make install' no longer tries to run '/sbin/ldconfig'
+ on systems lacking it.
+
+2011-01-03 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 1.1 released.
+ * Compression time has been reduced by 2%.
+ * All declarations not belonging to the API have been
+ encapsulated in the namespace 'Lzlib'.
+ * testsuite: Rename 'test1' to 'test.txt'. New tests.
+ * Match length limits set by options -1 to -9 of minilzip have
+ been changed to match those of lzip 1.11.
+ * main.cc: Set stdin/stdout in binary mode on OS2.
+ * bbexample.cc: New file containing example functions for
+ buffer-to-buffer compression/decompression.
+
+2010-05-08 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 1.0 released.
+ * New functions LZ_decompress_member_version, LZ_decompress_data_crc,
+ LZ_decompress_member_finished, and LZ_decompress_dictionary_size.
+ * Variables declared 'extern' have been encapsulated in a namespace.
+ * main.cc: Fix warning about fchown's return value being ignored.
+ * decoder.h: Integrate Input_buffer in Range_decoder.
+
+2010-02-10 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 0.9 released.
+ * Compression time has been reduced by 8%.
+ * main.cc: New constant 'o_binary'.
+
+2010-01-17 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 0.8 released.
+ * New functions LZ_decompress_reset, LZ_decompress_sync_to_member,
+ LZ_decompress_write_size, and LZ_strerror.
+ * lzlib.h: API change. Replace 'enum' with functions for values of
+ dictionary size limits to make interface names consistent.
+ * lzlib.h: API change. Rename 'LZ_errno' to 'LZ_Errno'.
+ * lzlib.h: API change. Replace 'void *' with 'struct LZ_Encoder *'
+ and 'struct LZ_Decoder *' to make interface type safe.
+ * decoder.cc: A truncated member trailer is now correctly detected.
+ * encoder.cc: Matchfinder::reset now also clears at_stream_end_,
+ allowing LZ_compress_restart_member to restart a finished stream.
+ * lzlib.cc: Accept only query or close operations after a fatal
+ error has occurred.
+ * The shared version of lzlib is no longer built by default.
+ * check.sh: Use 'test1' instead of 'COPYING' for testing.
+
+2009-10-20 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 0.7 released.
+ * Compression time has been reduced by 4%.
+ * check.sh: Remove -9 to run in less than 256MiB of RAM.
+ * lzcheck.cc: Read files of any size up to 2^63 bytes.
+
+2009-09-02 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 0.6 released.
+ * The LZ_compress_sync_flush mechanism has been fixed.
+
+2009-07-03 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 0.5 released.
+ * Decompression speed has been improved.
+ * main.cc (signal_handler): Declare as 'extern "C"'.
+
+2009-06-03 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 0.4 released.
+ * New functions LZ_compress_sync_flush and LZ_compress_write_size.
+ * Decompression speed has been improved.
+ * lzlib.texinfo: New chapter 'Buffering'.
+
+2009-05-03 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 0.3 released.
+ * Lzlib is now built as a shared library (in addition to static).
+
+2009-04-26 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 0.2 released.
+ * Fix a segfault when decompressing trailing garbage.
+ * Fix a false positive in LZ_(de)compress_finished.
+
+2009-04-21 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 0.1 released.
+
+
+Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+This file is a collection of facts, and thus it is not copyrightable,
+but just in case, you have unlimited permission to copy, distribute, and
+modify it.
diff --git a/INSTALL b/INSTALL
new file mode 100644
index 0000000..275b69b
--- /dev/null
+++ b/INSTALL
@@ -0,0 +1,85 @@
+Requirements
+------------
+You will need a C99 compiler. (gcc 3.3.6 or newer is recommended).
+I use gcc 6.1.0 and 3.3.6, but the code should compile with any
+standards-compliant compiler.
+Gcc is available at http://gcc.gnu.org.
+
+The operating system must allow signal handlers read access to objects with
+static storage duration so that the cleanup handler for Control-C can delete
+the partial output file. (This requirement is for minilzip only).
+
+
+Procedure
+---------
+1. Unpack the archive if you have not done so already:
+
+ tar -xf lzlib[version].tar.lz
+or
+ lzip -cd lzlib[version].tar.lz | tar -xf -
+
+This creates the directory ./lzlib[version] containing the source code
+extracted from the archive.
+
+2. Change to the lzlib directory and run configure.
+ (Try 'configure --help' for usage instructions).
+
+ cd lzlib[version]
+ ./configure
+
+ If you choose a C standard, enable the POSIX features explicitly:
+
+ ./configure CFLAGS+='--std=c99 -D_XOPEN_SOURCE=500'
+
+ If you are compiling on MinGW, use:
+
+ ./configure CFLAGS+='-D __USE_MINGW_ANSI_STDIO'
+
+3. Run make.
+
+ make
+
+4. Optionally, type 'make check' to run the tests that come with lzlib.
+
+5. Type 'make install' to install the library and any data files and
+ documentation. You need root privileges to install into a prefix owned
+ by root. (You may need to run ldconfig also).
+
+ Or type 'make install-compress', which additionally compresses the
+ info manual after installation.
+ (Installing compressed docs may become the default in the future).
+
+ You can install only the library or the info manual by typing
+ 'make install-lib' or 'make install-info' respectively.
+
+ 'make install-bin install-man' installs the program minilzip and its man
+ page. 'install-bin' installs a shared minilzip if the shared library has
+ been configured. Else it installs a static minilzip.
+ 'make install-bin-compress' additionally compresses the man page after
+ installation.
+
+ 'make install-as-lzip' runs 'make install-bin' and then links minilzip to
+ the name 'lzip'.
+
+
+Another way
+-----------
+You can also compile lzlib into a separate directory.
+To do this, you must use a version of 'make' that supports the variable
+'VPATH', such as GNU 'make'. 'cd' to the directory where you want the
+object files and executables to go and run the 'configure' script.
+'configure' automatically checks for the source code in '.', in '..', and
+in the directory that 'configure' is in.
+
+'configure' recognizes the option '--srcdir=DIR' to control where to look
+for the source code. Usually 'configure' can determine that directory
+automatically.
+
+After running 'configure', you can run 'make' and 'make install' as
+explained above.
+
+
+Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+This file is free documentation: you have unlimited permission to copy,
+distribute, and modify it.
diff --git a/Makefile.in b/Makefile.in
new file mode 100644
index 0000000..de54626
--- /dev/null
+++ b/Makefile.in
@@ -0,0 +1,205 @@
+
+DISTNAME = $(pkgname)-$(pkgversion)
+INSTALL = install
+INSTALL_PROGRAM = $(INSTALL) -m 755
+INSTALL_DATA = $(INSTALL) -m 644
+INSTALL_DIR = $(INSTALL) -d -m 755
+LDCONFIG = /sbin/ldconfig
+SHELL = /bin/sh
+CAN_RUN_INSTALLINFO = $(SHELL) -c "install-info --version" > /dev/null 2>&1
+
+objs = carg_parser.o minilzip.o
+
+
+.PHONY : all install install-bin install-info install-man \
+ install-strip install-compress install-strip-compress \
+ install-bin-strip install-info-compress install-man-compress \
+ install-bin-compress install-bin-strip-compress \
+ install-lib install-lib-strip \
+ install-as-lzip \
+ uninstall uninstall-bin uninstall-lib uninstall-info uninstall-man \
+ doc info man check dist clean distclean
+
+all : $(progname_static) $(progname_shared)
+
+lib$(libname).a : lzlib.o
+ $(AR) $(ARFLAGS) $@ $<
+
+lib$(libname).so.$(pkgversion) : lzlib_sh.o
+ $(CC) $(CFLAGS) $(LDFLAGS) -fpic -fPIC -shared -Wl,--soname=lib$(libname).so.$(soversion) -o $@ $<
+
+$(progname) : $(objs) lib$(libname).a
+ $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(objs) lib$(libname).a
+
+$(progname)_shared : $(objs) lib$(libname).so.$(pkgversion)
+ $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(objs) lib$(libname).so.$(pkgversion)
+
+bbexample : bbexample.o lib$(libname).a
+ $(CC) $(CFLAGS) $(LDFLAGS) -o $@ bbexample.o lib$(libname).a
+
+ffexample : ffexample.o lib$(libname).a
+ $(CC) $(CFLAGS) $(LDFLAGS) -o $@ ffexample.o lib$(libname).a
+
+lzcheck : lzcheck.o lib$(libname).a
+ $(CC) $(CFLAGS) $(LDFLAGS) -o $@ lzcheck.o lib$(libname).a
+
+minilzip.o : minilzip.c
+ $(CC) $(CPPFLAGS) $(CFLAGS) -DPROGVERSION=\"$(pkgversion)\" -c -o $@ $<
+
+lzlib_sh.o : lzlib.c
+ $(CC) $(CPPFLAGS) $(CFLAGS) -fpic -fPIC -c -o $@ $<
+
+%.o : %.c
+ $(CC) $(CPPFLAGS) $(CFLAGS) -c -o $@ $<
+
+# prevent 'make' from trying to remake source files
+$(VPATH)/configure $(VPATH)/Makefile.in $(VPATH)/doc/$(pkgname).texi : ;
+%.h %.c : ;
+
+lzdeps = lzlib.h lzip.h cbuffer.c decoder.h decoder.c encoder_base.h \
+ encoder_base.c encoder.h encoder.c fast_encoder.h fast_encoder.c
+
+$(objs) : Makefile
+carg_parser.o : carg_parser.h
+lzlib.o : Makefile $(lzdeps)
+lzlib_sh.o : Makefile $(lzdeps)
+minilzip.o : carg_parser.h lzlib.h
+bbexample.o : Makefile lzlib.h
+ffexample.o : Makefile lzlib.h
+lzcheck.o : Makefile lzlib.h
+
+doc : info man
+
+info : $(VPATH)/doc/$(pkgname).info
+
+$(VPATH)/doc/$(pkgname).info : $(VPATH)/doc/$(pkgname).texi
+ cd $(VPATH)/doc && $(MAKEINFO) $(pkgname).texi
+
+man : $(VPATH)/doc/$(progname).1
+
+$(VPATH)/doc/$(progname).1 : $(progname)
+ help2man -n 'reduces the size of files' -o $@ --info-page=$(pkgname) ./$(progname)
+
+Makefile : $(VPATH)/configure $(VPATH)/Makefile.in
+ ./config.status
+
+check : $(progname) bbexample ffexample lzcheck
+ @$(VPATH)/testsuite/check.sh $(VPATH)/testsuite $(pkgversion)
+
+install : install-lib install-info
+install-strip : install-lib-strip install-info
+install-compress : install-lib install-info-compress
+install-strip-compress : install-lib-strip install-info-compress
+install-bin-compress : install-bin install-man-compress
+install-bin-strip-compress : install-bin-strip install-man-compress
+
+install-bin : all
+ if [ ! -d "$(DESTDIR)$(bindir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(bindir)" ; fi
+ $(INSTALL_PROGRAM) ./$(progname_lzip) "$(DESTDIR)$(bindir)/$(progname)"
+
+install-bin-strip : all
+ $(MAKE) INSTALL_PROGRAM='$(INSTALL_PROGRAM) -s' install-bin
+
+install-lib : all
+ if [ ! -d "$(DESTDIR)$(includedir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(includedir)" ; fi
+ if [ ! -d "$(DESTDIR)$(libdir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(libdir)" ; fi
+ $(INSTALL_DATA) $(VPATH)/$(libname)lib.h "$(DESTDIR)$(includedir)/$(libname)lib.h"
+ if [ -n "$(progname_static)" ] ; then \
+ $(INSTALL_DATA) ./lib$(libname).a "$(DESTDIR)$(libdir)/lib$(libname).a" ; \
+ fi
+ if [ -n "$(progname_shared)" ] ; then \
+ $(INSTALL_PROGRAM) ./lib$(libname).so.$(pkgversion) "$(DESTDIR)$(libdir)/lib$(libname).so.$(pkgversion)" ; \
+ if [ -e "$(DESTDIR)$(libdir)/lib$(libname).so.$(soversion)" ] ; then \
+ run_ldconfig=no ; \
+ else run_ldconfig=yes ; \
+ fi ; \
+ rm -f "$(DESTDIR)$(libdir)/lib$(libname).so" ; \
+ rm -f "$(DESTDIR)$(libdir)/lib$(libname).so.$(soversion)" ; \
+ cd "$(DESTDIR)$(libdir)" && ln -s lib$(libname).so.$(pkgversion) lib$(libname).so ; \
+ cd "$(DESTDIR)$(libdir)" && ln -s lib$(libname).so.$(pkgversion) lib$(libname).so.$(soversion) ; \
+ if [ "${disable_ldconfig}" != yes ] && [ $${run_ldconfig} = yes ] && \
+ [ -x "$(LDCONFIG)" ] ; then "$(LDCONFIG)" -n "$(DESTDIR)$(libdir)" || true ; fi ; \
+ fi
+
+install-lib-strip : all
+ $(MAKE) INSTALL_PROGRAM='$(INSTALL_PROGRAM) -s' install-lib
+
+install-info :
+ if [ ! -d "$(DESTDIR)$(infodir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(infodir)" ; fi
+ -rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"*
+ $(INSTALL_DATA) $(VPATH)/doc/$(pkgname).info "$(DESTDIR)$(infodir)/$(pkgname).info"
+ -if $(CAN_RUN_INSTALLINFO) ; then \
+ install-info --info-dir="$(DESTDIR)$(infodir)" "$(DESTDIR)$(infodir)/$(pkgname).info" ; \
+ fi
+
+install-info-compress : install-info
+ lzip -v -9 "$(DESTDIR)$(infodir)/$(pkgname).info"
+
+install-man :
+ if [ ! -d "$(DESTDIR)$(mandir)/man1" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1" ; fi
+ -rm -f "$(DESTDIR)$(mandir)/man1/$(progname).1"*
+ $(INSTALL_DATA) $(VPATH)/doc/$(progname).1 "$(DESTDIR)$(mandir)/man1/$(progname).1"
+
+install-man-compress : install-man
+ lzip -v -9 "$(DESTDIR)$(mandir)/man1/$(progname).1"
+
+install-as-lzip : install-bin
+ -rm -f "$(DESTDIR)$(bindir)/lzip"
+ cd "$(DESTDIR)$(bindir)" && ln -s $(progname) lzip
+
+uninstall : uninstall-info uninstall-lib
+
+uninstall-bin :
+ -rm -f "$(DESTDIR)$(bindir)/$(progname)"
+
+uninstall-lib :
+ -rm -f "$(DESTDIR)$(includedir)/$(libname)lib.h"
+ -rm -f "$(DESTDIR)$(libdir)/lib$(libname).a"
+ -rm -f "$(DESTDIR)$(libdir)/lib$(libname).so"
+ -rm -f "$(DESTDIR)$(libdir)/lib$(libname).so.$(soversion)"
+ -rm -f "$(DESTDIR)$(libdir)/lib$(libname).so.$(pkgversion)"
+
+uninstall-info :
+ -if $(CAN_RUN_INSTALLINFO) ; then \
+ install-info --info-dir="$(DESTDIR)$(infodir)" --remove "$(DESTDIR)$(infodir)/$(pkgname).info" ; \
+ fi
+ -rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"*
+
+uninstall-man :
+ -rm -f "$(DESTDIR)$(mandir)/man1/$(progname).1"*
+
+dist : doc
+ ln -sf $(VPATH) $(DISTNAME)
+ tar -Hustar --owner=root --group=root -cvf $(DISTNAME).tar \
+ $(DISTNAME)/AUTHORS \
+ $(DISTNAME)/COPYING \
+ $(DISTNAME)/COPYING.GPL \
+ $(DISTNAME)/ChangeLog \
+ $(DISTNAME)/INSTALL \
+ $(DISTNAME)/Makefile.in \
+ $(DISTNAME)/NEWS \
+ $(DISTNAME)/README \
+ $(DISTNAME)/configure \
+ $(DISTNAME)/doc/$(progname).1 \
+ $(DISTNAME)/doc/$(pkgname).info \
+ $(DISTNAME)/doc/$(pkgname).texi \
+ $(DISTNAME)/*.h \
+ $(DISTNAME)/*.c \
+ $(DISTNAME)/testsuite/check.sh \
+ $(DISTNAME)/testsuite/test.txt \
+ $(DISTNAME)/testsuite/fox_lf \
+ $(DISTNAME)/testsuite/fox.lz \
+ $(DISTNAME)/testsuite/fox_*.lz \
+ $(DISTNAME)/testsuite/test_sync.lz \
+ $(DISTNAME)/testsuite/test.txt.lz \
+ $(DISTNAME)/testsuite/test_em.txt.lz
+ rm -f $(DISTNAME)
+ lzip -v -9 $(DISTNAME).tar
+
+clean :
+ -rm -f $(progname) $(objs) lzlib.o lib$(libname).a
+ -rm -f $(progname)_shared lzlib_sh.o lib$(libname).so*
+ -rm -f bbexample bbexample.o ffexample ffexample.o lzcheck lzcheck.o
+
+distclean : clean
+ -rm -f Makefile config.status *.tar *.tar.lz
diff --git a/NEWS b/NEWS
new file mode 100644
index 0000000..101b35a
--- /dev/null
+++ b/NEWS
@@ -0,0 +1,19 @@
+Changes in version 1.14:
+
+In minilzip, file diagnostics have been reformatted as 'PROGRAM: FILE: MESSAGE'.
+
+In minilzip, diagnostics caused by invalid arguments to command-line options
+now show the argument and the name of the option.
+
+The option '-o, --output' of minilzip now preserves dates, permissions, and
+ownership of the file, when (de)compressing exactly one file.
+
+It has been documented in the manual that it is the responsibility of the
+program using lzlib to include before 'lzlib.h' some header that declares
+the type 'uint8_t'. (Reported by Michal Górny).
+
+The variable MAKEINFO has been added to configure and Makefile.in.
+
+It has been documented in INSTALL that when choosing a C standard, the POSIX
+features need to be enabled explicitly:
+ ./configure CFLAGS+='--std=c99 -D_XOPEN_SOURCE=500'
diff --git a/README b/README
new file mode 100644
index 0000000..7dc2950
--- /dev/null
+++ b/README
@@ -0,0 +1,108 @@
+Description
+
+Lzlib is a data compression library providing in-memory LZMA compression and
+decompression functions, including integrity checking of the decompressed
+data. The compressed data format used by the library is the lzip format.
+Lzlib is written in C.
+
+The lzip file format is designed for data sharing and long-term archiving,
+taking into account both data integrity and decoder availability:
+
+ * The lzip format provides very safe integrity checking and some data
+ recovery means. The program lziprecover can repair bit flip errors
+ (one of the most common forms of data corruption) in lzip files, and
+ provides data recovery capabilities, including error-checked merging
+ of damaged copies of a file.
+
+ * The lzip format is as simple as possible (but not simpler). The lzip
+ manual provides the source code of a simple decompressor along with a
+ detailed explanation of how it works, so that with the help of the lzip
+ manual alone it would be possible for a digital archaeologist to extract
+ the data from a lzip file long after quantum computers eventually
+ render LZMA obsolete.
+
+ * Additionally the lzip reference implementation is copylefted, which
+ guarantees that it will remain free forever.
+
+A nice feature of the lzip format is that a corrupt byte is easier to repair
+the nearer it is to the beginning of the file. Therefore, with the help of
+lziprecover, losing an entire archive just because of a corrupt byte near
+the beginning is a thing of the past.
+
+The functions and variables forming the interface of the compression library
+are declared in the file 'lzlib.h'. Usage examples of the library are given
+in the files 'bbexample.c', 'ffexample.c', and 'minilzip.c' from the source
+distribution.
+
+As 'lzlib.h' can be used by C and C++ programs, it must not impose a choice
+of system headers on the program by including one of them. Therefore it is
+the responsibility of the program using lzlib to include before 'lzlib.h'
+some header that declares the type 'uint8_t'. There are at least four such
+headers in C and C++: 'stdint.h', 'cstdint', 'inttypes.h', and 'cinttypes'.
+
+All the library functions are thread safe. The library does not install any
+signal handler. The decoder checks the consistency of the compressed data,
+so the library should never crash even in case of corrupted input.
+
+Compression/decompression is done by repeatedly calling a couple of
+read/write functions until all the data have been processed by the library.
+This interface is safer and less error prone than the traditional zlib
+interface.
+
+Compression/decompression is done when the read function is called. This
+means the value returned by the position functions is not updated until a
+read call, even if a lot of data are written. If you want the data to be
+compressed in advance, just call the read function with a size equal to 0.
+
+If all the data to be compressed are written in advance, lzlib automatically
+adjusts the header of the compressed data to use the largest dictionary size
+that exceeds neither the data size nor the limit given to
+'LZ_compress_open'. This feature reduces the amount of memory needed for
+decompression and allows minilzip to produce the same compressed output as
+lzip.
+
+Lzlib correctly decompresses a data stream which is the concatenation of
+two or more compressed data streams. The result is the concatenation of the
+corresponding decompressed data streams. Integrity testing of concatenated
+compressed data streams is also supported.
+
+Lzlib is able to compress and decompress streams of unlimited size by
+automatically creating multimember output. The members so created are large,
+about 2 PiB each.
+
+In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a
+concrete algorithm; it is more like "any algorithm using the LZMA coding
+scheme". For example, the option '-0' of lzip uses the scheme in almost the
+simplest way possible; issuing the longest match it can find, or a literal
+byte if it can't find a match. Conversely, a much more elaborate way of
+finding coding sequences of minimum size than the one currently used by lzip
+could be developed, and the resulting sequence could also be coded using the
+LZMA coding scheme.
+
+Lzlib currently implements two variants of the LZMA algorithm: fast (used by
+option '-0' of minilzip) and normal (used by all other compression levels).
+
+The high compression of LZMA comes from combining two basic, well-proven
+compression ideas: sliding dictionaries (LZ77) and Markov models (the thing
+used by every compression algorithm that uses a range encoder or similar
+order-0 entropy coder as its last stage) with segregation of contexts
+according to what the bits are used for.
+
+The ideas embodied in lzlib are due to (at least) the following people:
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
+definition of Markov chains), G.N.N. Martin (for the definition of range
+encoding), Igor Pavlov (for putting all the above together in LZMA), and
+Julian Seward (for bzip2's CLI).
+
+LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never have
+been compressed. Decompressed is used to refer to data which have undergone
+the process of decompression.
+
+
+Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+This file is free documentation: you have unlimited permission to copy,
+distribute, and modify it.
+
+The file Makefile.in is a data file used by configure to produce the Makefile.
+It has the same copyright owner and permissions that configure itself.
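
Annotation (not part of the commit): the README paragraph on dictionary size
adjustment can be read as a simple clamp. The sketch below is only an
illustration of that rule under stated assumptions. The function name
'effective_dictionary_size' and the constant 'MIN_DICTIONARY_SIZE' are
hypothetical and do not exist in lzlib; the 4 KiB minimum is taken from the
"4 KiB .. 512 MiB" comment in bbexample.c, and any rounding applied when the
size is encoded in the member header is ignored.

```c
#include <stdint.h>

/* Hypothetical stand-in for the value returned by LZ_min_dictionary_size();
   assumed to be 4 KiB, as the comment in bbexample.c suggests. */
enum { MIN_DICTIONARY_SIZE = 1 << 12 };

/* Return the largest dictionary size that exceeds neither 'data_size' nor
   'limit', but never less than the assumed minimum.
   Illustration only; this is not the code lzlib uses internally. */
int effective_dictionary_size( const int64_t data_size, const int limit )
  {
  int64_t size = limit;
  if( data_size < size ) size = data_size;	/* do not exceed the data size */
  if( size < MIN_DICTIONARY_SIZE ) size = MIN_DICTIONARY_SIZE;
  return (int)size;		/* fits in int because size <= limit or 4 KiB */
  }
```

Under these assumptions, 5 MiB of data written in advance with an 8 MiB limit
would end up with a 5 MiB dictionary, so a decompressor needs less memory than
the limit alone would suggest.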
diff --git a/bbexample.c b/bbexample.c
new file mode 100644
index 0000000..50ccf33
--- /dev/null
+++ b/bbexample.c
@@ -0,0 +1,367 @@
+/* Buffer to buffer example - Test program for the library lzlib
+ Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+ This program is free software: you have unlimited permission
+ to copy, distribute, and modify it.
+
+ Usage: bbexample filename
+
+ This program is an example of how buffer-to-buffer
+ compression/decompression can be implemented using lzlib.
+*/
+
+#define _FILE_OFFSET_BITS 64
+
+#include <errno.h>
+#include <limits.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "lzlib.h"
+
+#ifndef min
+ #define min(x,y) ((x) <= (y) ? (x) : (y))
+#endif
+
+
+/* Return the address of a malloc'd buffer containing the file data and
+ the file size in '*file_sizep'.
+ In case of error, return 0 and do not modify '*file_sizep'.
+*/
+uint8_t * read_file( const char * const name, long * const file_sizep )
+ {
+ long buffer_size = 1 << 20, file_size;
+ uint8_t * buffer, * tmp;
+ FILE * const f = fopen( name, "rb" );
+ if( !f )
+ { fprintf( stderr, "bbexample: %s: Can't open input file: %s\n",
+ name, strerror( errno ) ); return 0; }
+
+ buffer = (uint8_t *)malloc( buffer_size );
+ if( !buffer )
+ { fputs( "bbexample: read_file: Not enough memory.\n", stderr );
+ fclose( f ); return 0; }
+ file_size = fread( buffer, 1, buffer_size, f );
+ while( file_size >= buffer_size )
+ {
+ if( buffer_size >= LONG_MAX )
+ {
+ fprintf( stderr, "bbexample: %s: Input file is too large.\n", name );
+ free( buffer ); fclose( f ); return 0;
+ }
+ buffer_size = ( buffer_size <= LONG_MAX / 2 ) ? 2 * buffer_size : LONG_MAX;
+ tmp = (uint8_t *)realloc( buffer, buffer_size );
+ if( !tmp )
+ { fputs( "bbexample: read_file: Not enough memory.\n", stderr );
+ free( buffer ); fclose( f ); return 0; }
+ buffer = tmp;
+ file_size += fread( buffer + file_size, 1, buffer_size - file_size, f );
+ }
+ if( ferror( f ) || !feof( f ) )
+ {
+ fprintf( stderr, "bbexample: %s: Error reading file: %s\n",
+ name, strerror( errno ) );
+ free( buffer ); fclose( f ); return 0;
+ }
+ fclose( f );
+ *file_sizep = file_size;
+ return buffer;
+ }
+
+
+/* Compress 'insize' bytes from 'inbuf'.
+ Return the address of a malloc'd buffer containing the compressed data,
+ and the size of the data in '*outlenp'.
+ In case of error, return 0 and do not modify '*outlenp'.
+*/
+uint8_t * bbcompressl( const uint8_t * const inbuf, const long insize,
+ const int level, long * const outlenp )
+ {
+ struct Lzma_options
+ {
+ int dictionary_size; /* 4 KiB .. 512 MiB */
+ int match_len_limit; /* 5 .. 273 */
+ };
+ /* Mapping from gzip/bzip2 style 0..9 compression levels to the
+ corresponding LZMA compression parameters. */
+ const struct Lzma_options option_mapping[] =
+ {
+ { 65535, 16 }, /* -0 (65535,16 chooses fast encoder) */
+ { 1 << 20, 5 }, /* -1 */
+ { 3 << 19, 6 }, /* -2 */
+ { 1 << 21, 8 }, /* -3 */
+ { 3 << 20, 12 }, /* -4 */
+ { 1 << 22, 20 }, /* -5 */
+ { 1 << 23, 36 }, /* -6 */
+ { 1 << 24, 68 }, /* -7 */
+ { 3 << 23, 132 }, /* -8 */
+ { 1 << 25, 273 } }; /* -9 */
+ struct Lzma_options encoder_options;
+ struct LZ_Encoder * encoder;
+ uint8_t * outbuf;
+ const long delta_size = ( insize / 4 ) + 64; /* insize may be zero */
+ long outsize = delta_size; /* initial outsize */
+ long inpos = 0;
+ long outpos = 0;
+ bool error = false;
+
+ if( level < 0 || level > 9 ) return 0;
+ encoder_options = option_mapping[level];
+
+ if( encoder_options.dictionary_size > insize && level != 0 )
+ encoder_options.dictionary_size = insize; /* saves memory */
+ if( encoder_options.dictionary_size < LZ_min_dictionary_size() )
+ encoder_options.dictionary_size = LZ_min_dictionary_size();
+ encoder = LZ_compress_open( encoder_options.dictionary_size,
+ encoder_options.match_len_limit, INT64_MAX );
+ outbuf = (uint8_t *)malloc( outsize );
+ if( !encoder || LZ_compress_errno( encoder ) != LZ_ok || !outbuf )
+ { free( outbuf ); LZ_compress_close( encoder ); return 0; }
+
+ while( true )
+ {
+ int ret = LZ_compress_write( encoder, inbuf + inpos,
+ min( INT_MAX, insize - inpos ) );
+ if( ret < 0 ) { error = true; break; }
+ inpos += ret;
+ if( inpos >= insize ) LZ_compress_finish( encoder );
+ ret = LZ_compress_read( encoder, outbuf + outpos,
+ min( INT_MAX, outsize - outpos ) );
+ if( ret < 0 ) { error = true; break; }
+ outpos += ret;
+ if( LZ_compress_finished( encoder ) == 1 ) break;
+ if( outpos >= outsize )
+ {
+ uint8_t * tmp;
+ if( outsize > LONG_MAX - delta_size ) { error = true; break; }
+ outsize += delta_size;
+ tmp = (uint8_t *)realloc( outbuf, outsize );
+ if( !tmp ) { error = true; break; }
+ outbuf = tmp;
+ }
+ }
+
+ if( LZ_compress_close( encoder ) < 0 ) error = true;
+ if( error ) { free( outbuf ); return 0; }
+ *outlenp = outpos;
+ return outbuf;
+ }
+
+
+/* Decompress 'insize' bytes from 'inbuf'.
+ Return the address of a malloc'd buffer containing the decompressed
+ data, and the size of the data in '*outlenp'.
+ In case of error, return 0 and do not modify '*outlenp'.
+*/
+uint8_t * bbdecompressl( const uint8_t * const inbuf, const long insize,
+ long * const outlenp )
+ {
+ struct LZ_Decoder * const decoder = LZ_decompress_open();
+ const long delta_size = insize; /* insize must be > zero */
+ long outsize = delta_size; /* initial outsize */
+ uint8_t * outbuf = (uint8_t *)malloc( outsize );
+ long inpos = 0;
+ long outpos = 0;
+ bool error = false;
+ if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok || !outbuf )
+ { free( outbuf ); LZ_decompress_close( decoder ); return 0; }
+
+ while( true )
+ {
+ int ret = LZ_decompress_write( decoder, inbuf + inpos,
+ min( INT_MAX, insize - inpos ) );
+ if( ret < 0 ) { error = true; break; }
+ inpos += ret;
+ if( inpos >= insize ) LZ_decompress_finish( decoder );
+ ret = LZ_decompress_read( decoder, outbuf + outpos,
+ min( INT_MAX, outsize - outpos ) );
+ if( ret < 0 ) { error = true; break; }
+ outpos += ret;
+ if( LZ_decompress_finished( decoder ) == 1 ) break;
+ if( outpos >= outsize )
+ {
+ uint8_t * tmp;
+ if( outsize > LONG_MAX - delta_size ) { error = true; break; }
+ outsize += delta_size;
+ tmp = (uint8_t *)realloc( outbuf, outsize );
+ if( !tmp ) { error = true; break; }
+ outbuf = tmp;
+ }
+ }
+
+ if( LZ_decompress_close( decoder ) < 0 ) error = true;
+ if( error ) { free( outbuf ); return 0; }
+ *outlenp = outpos;
+ return outbuf;
+ }
+
+
+/* Test the whole file at all levels. */
+int full_test( const uint8_t * const inbuf, const long insize )
+ {
+ int level;
+ for( level = 0; level <= 9; ++level )
+ {
+ long midsize = 0, outsize = 0;
+ uint8_t * outbuf;
+ uint8_t * midbuf = bbcompressl( inbuf, insize, level, &midsize );
+ if( !midbuf )
+ { fputs( "bbexample: full_test: Not enough memory or compress error.\n",
+ stderr ); return 1; }
+
+ outbuf = bbdecompressl( midbuf, midsize, &outsize );
+ free( midbuf );
+ if( !outbuf )
+ { fputs( "bbexample: full_test: Not enough memory or decompress error.\n",
+ stderr ); return 1; }
+
+ if( insize != outsize ||
+ ( insize > 0 && memcmp( inbuf, outbuf, insize ) != 0 ) )
+ { fputs( "bbexample: full_test: Decompressed data differs from original.\n",
+ stderr ); free( outbuf ); return 1; }
+
+ free( outbuf );
+ }
+ return 0;
+ }
+
+
+/* Compress 'insize' bytes from 'inbuf' to 'outbuf'.
+ Return the size of the compressed data in '*outlenp'.
+ In case of error, or if 'outsize' is too small, return false and do not
+ modify '*outlenp'.
+*/
+bool bbcompress( const uint8_t * const inbuf, const int insize,
+ const int dictionary_size, const int match_len_limit,
+ uint8_t * const outbuf, const int outsize,
+ int * const outlenp )
+ {
+ int inpos = 0, outpos = 0;
+ bool error = false;
+ struct LZ_Encoder * const encoder =
+ LZ_compress_open( dictionary_size, match_len_limit, INT64_MAX );
+ if( !encoder || LZ_compress_errno( encoder ) != LZ_ok )
+ { LZ_compress_close( encoder ); return false; }
+
+ while( true )
+ {
+ int ret = LZ_compress_write( encoder, inbuf + inpos, insize - inpos );
+ if( ret < 0 ) { error = true; break; }
+ inpos += ret;
+ if( inpos >= insize ) LZ_compress_finish( encoder );
+ ret = LZ_compress_read( encoder, outbuf + outpos, outsize - outpos );
+ if( ret < 0 ) { error = true; break; }
+ outpos += ret;
+ if( LZ_compress_finished( encoder ) == 1 ) break;
+ if( outpos >= outsize ) { error = true; break; }
+ }
+
+ if( LZ_compress_close( encoder ) < 0 ) error = true;
+ if( error ) return false;
+ *outlenp = outpos;
+ return true;
+ }
+
+
+/* Decompress 'insize' bytes from 'inbuf' to 'outbuf'.
+ Return the size of the decompressed data in '*outlenp'.
+ In case of error, or if 'outsize' is too small, return false and do not
+ modify '*outlenp'.
+*/
+bool bbdecompress( const uint8_t * const inbuf, const int insize,
+ uint8_t * const outbuf, const int outsize,
+ int * const outlenp )
+ {
+ int inpos = 0, outpos = 0;
+ bool error = false;
+ struct LZ_Decoder * const decoder = LZ_decompress_open();
+ if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok )
+ { LZ_decompress_close( decoder ); return false; }
+
+ while( true )
+ {
+ int ret = LZ_decompress_write( decoder, inbuf + inpos, insize - inpos );
+ if( ret < 0 ) { error = true; break; }
+ inpos += ret;
+ if( inpos >= insize ) LZ_decompress_finish( decoder );
+ ret = LZ_decompress_read( decoder, outbuf + outpos, outsize - outpos );
+ if( ret < 0 ) { error = true; break; }
+ outpos += ret;
+ if( LZ_decompress_finished( decoder ) == 1 ) break;
+ if( outpos >= outsize ) { error = true; break; }
+ }
+
+ if( LZ_decompress_close( decoder ) < 0 ) error = true;
+ if( error ) return false;
+ *outlenp = outpos;
+ return true;
+ }
+
+
+/* Test at most INT_MAX bytes from the file with buffers of fixed size. */
+int fixed_test( const uint8_t * const inbuf, const int insize )
+ {
+ int dictionary_size = 65535; /* fast encoder */
+ int midsize = min( INT_MAX, ( insize / 8 ) * 9LL + 44 ), outsize = insize;
+ uint8_t * midbuf = (uint8_t *)malloc( midsize );
+ uint8_t * outbuf = (uint8_t *)malloc( outsize );
+ if( !midbuf || !outbuf )
+ { fputs( "bbexample: fixed_test: Not enough memory.\n", stderr );
+ free( outbuf ); free( midbuf ); return 1; }
+
+ for( ; dictionary_size <= 8 << 20; dictionary_size += 8323073 )
+ {
+ int midlen, outlen;
+ if( !bbcompress( inbuf, insize, dictionary_size, 16, midbuf, midsize, &midlen ) )
+ { fputs( "bbexample: fixed_test: Not enough memory or compress error.\n",
+ stderr ); free( outbuf ); free( midbuf ); return 1; }
+
+ if( !bbdecompress( midbuf, midlen, outbuf, outsize, &outlen ) )
+ { fputs( "bbexample: fixed_test: Not enough memory or decompress error.\n",
+ stderr ); free( outbuf ); free( midbuf ); return 1; }
+
+ if( insize != outlen ||
+ ( insize > 0 && memcmp( inbuf, outbuf, insize ) != 0 ) )
+ { fputs( "bbexample: fixed_test: Decompressed data differs from original.\n",
+ stderr ); free( outbuf ); free( midbuf ); return 1; }
+
+ }
+ free( outbuf );
+ free( midbuf );
+ return 0;
+ }
+
+
+int main( const int argc, const char * const argv[] )
+ {
+ int retval = 0, i;
+ int open_failures = 0;
+ const bool verbose = ( argc > 2 );
+
+ if( argc < 2 )
+ {
+ fputs( "Usage: bbexample filename\n", stderr );
+ return 1;
+ }
+
+ for( i = 1; i < argc && retval == 0; ++i )
+ {
+ long insize;
+ uint8_t * const inbuf = read_file( argv[i], &insize );
+ if( !inbuf ) { ++open_failures; continue; }
+ if( verbose ) fprintf( stderr, " Testing file '%s'\n", argv[i] );
+
+ retval = full_test( inbuf, insize );
+ if( retval == 0 ) retval = fixed_test( inbuf, min( INT_MAX, insize ) );
+ free( inbuf );
+ }
+ if( open_failures > 0 && verbose )
+ fprintf( stderr, "bbexample: warning: %d %s failed to open.\n",
+ open_failures, ( open_failures == 1 ) ? "file" : "files" );
+ if( retval == 0 && open_failures ) retval = 1;
+ return retval;
+ }
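
Annotation (not part of the commit): 'fixed_test' above sizes its intermediate
buffer with the expression min( INT_MAX, ( insize / 8 ) * 9LL + 44 ). The
sketch below isolates that arithmetic so the clamping is easier to follow. The
function name 'fixed_midbuf_size' is hypothetical, and the formula is simply
the one this example program uses; it is not presented here as a bound
guaranteed by lzlib.

```c
#include <limits.h>

/* Size of the fixed buffer that bbexample's fixed_test allocates for the
   compressed data: 9 bytes per 8 input bytes plus 44 bytes of slack,
   computed in 'long long' to avoid overflow and clamped to INT_MAX. */
int fixed_midbuf_size( const int insize )
  {
  const long long bound = ( insize / 8 ) * 9LL + 44;
  return ( bound <= INT_MAX ) ? (int)bound : INT_MAX;
  }
```

The 'LL' suffix matters: without it the multiplication would be done in 'int'
and could overflow for inputs larger than about 1.9 GB.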
diff --git a/carg_parser.c b/carg_parser.c
new file mode 100644
index 0000000..edb4eb9
--- /dev/null
+++ b/carg_parser.c
@@ -0,0 +1,319 @@
+/* Arg_parser - POSIX/GNU command-line argument parser. (C version)
+ Copyright (C) 2006-2024 Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+#include <stdlib.h>
+#include <string.h>
+
+#include "carg_parser.h"
+
+
+/* assure at least a minimum size for buffer 'buf' */
+static void * ap_resize_buffer( void * buf, const int min_size )
+ {
+ if( buf ) buf = realloc( buf, min_size );
+ else buf = malloc( min_size );
+ return buf;
+ }
+
+
+static char push_back_record( struct Arg_parser * const ap, const int code,
+ const char * const long_name,
+ const char * const argument )
+ {
+ struct ap_Record * p;
+ void * tmp = ap_resize_buffer( ap->data,
+ ( ap->data_size + 1 ) * sizeof (struct ap_Record) );
+ if( !tmp ) return 0;
+ ap->data = (struct ap_Record *)tmp;
+ p = &(ap->data[ap->data_size]);
+ p->code = code;
+ if( long_name )
+ {
+ const int len = strlen( long_name );
+ p->parsed_name = (char *)malloc( len + 2 + 1 );
+ if( !p->parsed_name ) return 0;
+ p->parsed_name[0] = p->parsed_name[1] = '-';
+ strncpy( p->parsed_name + 2, long_name, len + 1 );
+ }
+ else if( code > 0 && code < 256 )
+ {
+ p->parsed_name = (char *)malloc( 2 + 1 );
+ if( !p->parsed_name ) return 0;
+ p->parsed_name[0] = '-'; p->parsed_name[1] = code; p->parsed_name[2] = 0;
+ }
+ else p->parsed_name = 0;
+ if( argument )
+ {
+ const int len = strlen( argument );
+ p->argument = (char *)malloc( len + 1 );
+ if( !p->argument ) { free( p->parsed_name ); return 0; }
+ strncpy( p->argument, argument, len + 1 );
+ }
+ else p->argument = 0;
+ ++ap->data_size;
+ return 1;
+ }
+
+
+static char add_error( struct Arg_parser * const ap, const char * const msg )
+ {
+ const int len = strlen( msg );
+ void * tmp = ap_resize_buffer( ap->error, ap->error_size + len + 1 );
+ if( !tmp ) return 0;
+ ap->error = (char *)tmp;
+ strncpy( ap->error + ap->error_size, msg, len + 1 );
+ ap->error_size += len;
+ return 1;
+ }
+
+
+static void free_data( struct Arg_parser * const ap )
+ {
+ int i;
+ for( i = 0; i < ap->data_size; ++i )
+ { free( ap->data[i].argument ); free( ap->data[i].parsed_name ); }
+ if( ap->data ) { free( ap->data ); ap->data = 0; }
+ ap->data_size = 0;
+ }
+
+
+/* Return 0 only if out of memory. */
+static char parse_long_option( struct Arg_parser * const ap,
+ const char * const opt, const char * const arg,
+ const struct ap_Option options[],
+ int * const argindp )
+ {
+ unsigned len;
+ int index = -1, i;
+ char exact = 0, ambig = 0;
+
+ for( len = 0; opt[len+2] && opt[len+2] != '='; ++len ) ;
+
+ /* Test all long options for either exact match or abbreviated matches. */
+ for( i = 0; options[i].code != 0; ++i )
+ if( options[i].long_name &&
+ strncmp( options[i].long_name, &opt[2], len ) == 0 )
+ {
+ if( strlen( options[i].long_name ) == len ) /* Exact match found */
+ { index = i; exact = 1; break; }
+ else if( index < 0 ) index = i; /* First nonexact match found */
+ else if( options[index].code != options[i].code ||
+ options[index].has_arg != options[i].has_arg )
+ ambig = 1; /* Second or later nonexact match found */
+ }
+
+ if( ambig && !exact )
+ {
+ add_error( ap, "option '" ); add_error( ap, opt );
+ add_error( ap, "' is ambiguous" );
+ return 1;
+ }
+
+ if( index < 0 ) /* nothing found */
+ {
+ add_error( ap, "unrecognized option '" ); add_error( ap, opt );
+ add_error( ap, "'" );
+ return 1;
+ }
+
+ ++*argindp;
+
+ if( opt[len+2] ) /* '--<long_option>=<argument>' syntax */
+ {
+ if( options[index].has_arg == ap_no )
+ {
+ add_error( ap, "option '--" ); add_error( ap, options[index].long_name );
+ add_error( ap, "' doesn't allow an argument" );
+ return 1;
+ }
+ if( options[index].has_arg == ap_yes && !opt[len+3] )
+ {
+ add_error( ap, "option '--" ); add_error( ap, options[index].long_name );
+ add_error( ap, "' requires an argument" );
+ return 1;
+ }
+ return push_back_record( ap, options[index].code,
+ options[index].long_name, &opt[len+3] );
+ }
+
+ if( options[index].has_arg == ap_yes )
+ {
+ if( !arg || !arg[0] )
+ {
+ add_error( ap, "option '--" ); add_error( ap, options[index].long_name );
+ add_error( ap, "' requires an argument" );
+ return 1;
+ }
+ ++*argindp;
+ return push_back_record( ap, options[index].code,
+ options[index].long_name, arg );
+ }
+
+ return push_back_record( ap, options[index].code,
+ options[index].long_name, 0 );
+ }
+
+
+/* Return 0 only if out of memory. */
+static char parse_short_option( struct Arg_parser * const ap,
+ const char * const opt, const char * const arg,
+ const struct ap_Option options[],
+ int * const argindp )
+ {
+ int cind = 1; /* character index in opt */
+
+ while( cind > 0 )
+ {
+ int index = -1, i;
+ const unsigned char c = opt[cind];
+ char code_str[2];
+ code_str[0] = c; code_str[1] = 0;
+
+ if( c != 0 )
+ for( i = 0; options[i].code; ++i )
+ if( c == options[i].code )
+ { index = i; break; }
+
+ if( index < 0 )
+ {
+ add_error( ap, "invalid option -- '" ); add_error( ap, code_str );
+ add_error( ap, "'" );
+ return 1;
+ }
+
+ if( opt[++cind] == 0 ) { ++*argindp; cind = 0; } /* opt finished */
+
+ if( options[index].has_arg != ap_no && cind > 0 && opt[cind] )
+ {
+ if( !push_back_record( ap, c, 0, &opt[cind] ) ) return 0;
+ ++*argindp; cind = 0;
+ }
+ else if( options[index].has_arg == ap_yes )
+ {
+ if( !arg || !arg[0] )
+ {
+ add_error( ap, "option requires an argument -- '" );
+ add_error( ap, code_str ); add_error( ap, "'" );
+ return 1;
+ }
+ ++*argindp; cind = 0;
+ if( !push_back_record( ap, c, 0, arg ) ) return 0;
+ }
+ else if( !push_back_record( ap, c, 0, 0 ) ) return 0;
+ }
+ return 1;
+ }
+
+
+char ap_init( struct Arg_parser * const ap,
+ const int argc, const char * const argv[],
+ const struct ap_Option options[], const char in_order )
+ {
+ const char ** non_options = 0; /* skipped non-options */
+ int non_options_size = 0; /* number of skipped non-options */
+ int argind = 1; /* index in argv */
+ char done = 0; /* false until success */
+
+ ap->data = 0;
+ ap->error = 0;
+ ap->data_size = 0;
+ ap->error_size = 0;
+ if( argc < 2 || !argv || !options ) return 1;
+
+ while( argind < argc )
+ {
+ const unsigned char ch1 = argv[argind][0];
+ const unsigned char ch2 = ch1 ? argv[argind][1] : 0;
+
+ if( ch1 == '-' && ch2 ) /* we found an option */
+ {
+ const char * const opt = argv[argind];
+ const char * const arg = ( argind + 1 < argc ) ? argv[argind+1] : 0;
+ if( ch2 == '-' )
+ {
+ if( !argv[argind][2] ) { ++argind; break; } /* we found "--" */
+ else if( !parse_long_option( ap, opt, arg, options, &argind ) ) goto out;
+ }
+ else if( !parse_short_option( ap, opt, arg, options, &argind ) ) goto out;
+ if( ap->error ) break;
+ }
+ else
+ {
+ if( in_order )
+ { if( !push_back_record( ap, 0, 0, argv[argind++] ) ) goto out; }
+ else
+ {
+ void * tmp = ap_resize_buffer( non_options,
+ ( non_options_size + 1 ) * sizeof *non_options );
+ if( !tmp ) goto out;
+ non_options = (const char **)tmp;
+ non_options[non_options_size++] = argv[argind++];
+ }
+ }
+ }
+ if( ap->error ) free_data( ap );
+ else
+ {
+ int i;
+ for( i = 0; i < non_options_size; ++i )
+ if( !push_back_record( ap, 0, 0, non_options[i] ) ) goto out;
+ while( argind < argc )
+ if( !push_back_record( ap, 0, 0, argv[argind++] ) ) goto out;
+ }
+ done = 1;
+out: if( non_options ) free( non_options );
+ return done;
+ }
+
+
+void ap_free( struct Arg_parser * const ap )
+ {
+ free_data( ap );
+ if( ap->error ) { free( ap->error ); ap->error = 0; }
+ ap->error_size = 0;
+ }
+
+
+const char * ap_error( const struct Arg_parser * const ap )
+ { return ap->error; }
+
+
+int ap_arguments( const struct Arg_parser * const ap )
+ { return ap->data_size; }
+
+
+int ap_code( const struct Arg_parser * const ap, const int i )
+ {
+ if( i < 0 || i >= ap_arguments( ap ) ) return 0;
+ return ap->data[i].code;
+ }
+
+
+const char * ap_parsed_name( const struct Arg_parser * const ap, const int i )
+ {
+ if( i < 0 || i >= ap_arguments( ap ) || !ap->data[i].parsed_name ) return "";
+ return ap->data[i].parsed_name;
+ }
+
+
+const char * ap_argument( const struct Arg_parser * const ap, const int i )
+ {
+ if( i < 0 || i >= ap_arguments( ap ) || !ap->data[i].argument ) return "";
+ return ap->data[i].argument;
+ }
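Note on carg_parser.c: the abbreviation rule in parse_long_option can be sketched on its own. The block below is a simplified, self-contained illustration, not the library's code. The name match_long_name and its return codes are hypothetical, and it reports any second prefix match as ambiguous, whereas the original does so only when the candidates differ in code or has_arg.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical return codes for this sketch (not part of carg_parser). */
enum { match_none = -1, match_ambiguous = -2 };

/* Return the index in 'names' (a null-terminated array) of the long name
   selected by 'typed', which is the option text without the leading "--".
   An exact match wins immediately; otherwise a single prefix match is
   accepted, and two or more prefix matches are reported as ambiguous. */
int match_long_name( const char * const names[], const char * const typed )
  {
  size_t len = 0;
  int index = match_none, i;
  char ambig = 0;

  assert( names && typed );
  while( typed[len] && typed[len] != '=' ) ++len;  /* stop at "=argument" */
  for( i = 0; names[i]; ++i )
    if( strncmp( names[i], typed, len ) == 0 )
      {
      if( strlen( names[i] ) == len ) return i;    /* exact match found */
      if( index < 0 ) index = i;                   /* first nonexact match */
      else ambig = 1;                              /* second or later one */
      }
  return ambig ? match_ambiguous : index;
  }
```

With the names "verbose", "version", "help", and "ver", the text "ve" is ambiguous, while "ver" selects the exact name even though it is also a prefix of two others.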
diff --git a/carg_parser.h b/carg_parser.h
new file mode 100644
index 0000000..69ce271
--- /dev/null
+++ b/carg_parser.h
@@ -0,0 +1,97 @@
+/* Arg_parser - POSIX/GNU command-line argument parser. (C version)
+ Copyright (C) 2006-2024 Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+/* Arg_parser reads the arguments in 'argv' and creates a number of
+ option codes, option arguments, and non-option arguments.
+
+ In case of error, 'ap_error' returns a non-null pointer to an error
+ message.
+
+ 'options' is an array of 'struct ap_Option' terminated by an element
+ containing a code which is zero. A null long_name means a short-only
+ option. A code value outside the unsigned char range means a long-only
+ option.
+
+ Arg_parser normally makes it appear as if all the option arguments
+ were specified before all the non-option arguments for the purposes
+ of parsing, even if the user of your program intermixed option and
+ non-option arguments. If you want the arguments in the exact order
+ the user typed them, call 'ap_init' with 'in_order' = true.
+
+ The argument '--' terminates all options; any following arguments are
+ treated as non-option arguments, even if they begin with a hyphen.
+
+ The syntax for optional option arguments is '-<short_option><argument>'
+ (without whitespace), or '--<long_option>=<argument>'.
+*/
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+enum ap_Has_arg { ap_no, ap_yes, ap_maybe };
+
+struct ap_Option
+ {
+ int code; /* Short option letter or code ( code != 0 ) */
+ const char * long_name; /* Long option name (maybe null) */
+ enum ap_Has_arg has_arg;
+ };
+
+
+struct ap_Record
+ {
+ int code;
+ char * parsed_name;
+ char * argument;
+ };
+
+
+struct Arg_parser
+ {
+ struct ap_Record * data;
+ char * error;
+ int data_size;
+ int error_size;
+ };
+
+
+char ap_init( struct Arg_parser * const ap,
+ const int argc, const char * const argv[],
+ const struct ap_Option options[], const char in_order );
+
+void ap_free( struct Arg_parser * const ap );
+
+const char * ap_error( const struct Arg_parser * const ap );
+
+/* The number of arguments parsed. May be different from argc. */
+int ap_arguments( const struct Arg_parser * const ap );
+
+/* If ap_code( i ) is 0, ap_argument( i ) is a non-option.
+ Else ap_argument( i ) is the option's argument (or empty). */
+int ap_code( const struct Arg_parser * const ap, const int i );
+
+/* Full name of the option parsed (short or long). */
+const char * ap_parsed_name( const struct Arg_parser * const ap, const int i );
+
+const char * ap_argument( const struct Arg_parser * const ap, const int i );
+
+#ifdef __cplusplus
+}
+#endif
diff --git a/cbuffer.c b/cbuffer.c
new file mode 100644
index 0000000..4cadc1e
--- /dev/null
+++ b/cbuffer.c
@@ -0,0 +1,143 @@
+/* Lzlib - Compression library for the lzip format
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+struct Circular_buffer
+ {
+ uint8_t * buffer;
+ unsigned buffer_size; /* capacity == buffer_size - 1 */
+ unsigned get; /* buffer is empty when get == put */
+ unsigned put;
+ };
+
+static inline bool Cb_init( struct Circular_buffer * const cb,
+ const unsigned buf_size )
+ {
+ cb->buffer_size = buf_size + 1;
+ cb->get = 0;
+ cb->put = 0;
+ cb->buffer =
+ ( cb->buffer_size > 1 ) ? (uint8_t *)malloc( cb->buffer_size ) : 0;
+ return cb->buffer != 0;
+ }
+
+static inline void Cb_free( struct Circular_buffer * const cb )
+ { free( cb->buffer ); cb->buffer = 0; }
+
+static inline void Cb_reset( struct Circular_buffer * const cb )
+ { cb->get = 0; cb->put = 0; }
+
+static inline unsigned Cb_empty( const struct Circular_buffer * const cb )
+ { return cb->get == cb->put; }
+
+static inline unsigned Cb_used_bytes( const struct Circular_buffer * const cb )
+ { return ( (cb->get <= cb->put) ? 0 : cb->buffer_size ) + cb->put - cb->get; }
+
+static inline unsigned Cb_free_bytes( const struct Circular_buffer * const cb )
+ { return ( (cb->get <= cb->put) ? cb->buffer_size : 0 ) - cb->put + cb->get - 1; }
+
+static inline uint8_t Cb_get_byte( struct Circular_buffer * const cb )
+ {
+ const uint8_t b = cb->buffer[cb->get];
+ if( ++cb->get >= cb->buffer_size ) cb->get = 0;
+ return b;
+ }
+
+static inline void Cb_put_byte( struct Circular_buffer * const cb,
+ const uint8_t b )
+ {
+ cb->buffer[cb->put] = b;
+ if( ++cb->put >= cb->buffer_size ) cb->put = 0;
+ }
+
+
+static bool Cb_unread_data( struct Circular_buffer * const cb,
+ const unsigned size )
+ {
+ if( size > Cb_free_bytes( cb ) ) return false;
+ if( cb->get >= size ) cb->get -= size;
+ else cb->get = cb->buffer_size - size + cb->get;
+ return true;
+ }
+
+
+/* Copy up to 'out_size' bytes to 'out_buffer' and update 'get'.
+ If 'out_buffer' is null, the bytes are discarded.
+ Return the number of bytes copied or discarded.
+*/
+static unsigned Cb_read_data( struct Circular_buffer * const cb,
+ uint8_t * const out_buffer,
+ const unsigned out_size )
+ {
+ unsigned size = 0;
+ if( out_size == 0 ) return 0;
+ if( cb->get > cb->put )
+ {
+ size = min( cb->buffer_size - cb->get, out_size );
+ if( size > 0 )
+ {
+ if( out_buffer ) memcpy( out_buffer, cb->buffer + cb->get, size );
+ cb->get += size;
+ if( cb->get >= cb->buffer_size ) cb->get = 0;
+ }
+ }
+ if( cb->get < cb->put )
+ {
+ const unsigned size2 = min( cb->put - cb->get, out_size - size );
+ if( size2 > 0 )
+ {
+ if( out_buffer ) memcpy( out_buffer + size, cb->buffer + cb->get, size2 );
+ cb->get += size2;
+ size += size2;
+ }
+ }
+ return size;
+ }
+
+
+/* Copy up to 'in_size' bytes from 'in_buffer' and update 'put'.
+ Return the number of bytes copied.
+*/
+static unsigned Cb_write_data( struct Circular_buffer * const cb,
+ const uint8_t * const in_buffer,
+ const unsigned in_size )
+ {
+ unsigned size = 0;
+ if( in_size == 0 ) return 0;
+ if( cb->put >= cb->get )
+ {
+ size = min( cb->buffer_size - cb->put - (cb->get == 0), in_size );
+ if( size > 0 )
+ {
+ memcpy( cb->buffer + cb->put, in_buffer, size );
+ cb->put += size;
+ if( cb->put >= cb->buffer_size ) cb->put = 0;
+ }
+ }
+ if( cb->put < cb->get )
+ {
+ const unsigned size2 = min( cb->get - cb->put - 1, in_size - size );
+ if( size2 > 0 )
+ {
+ memcpy( cb->buffer + cb->put, in_buffer + size, size2 );
+ cb->put += size2;
+ size += size2;
+ }
+ }
+ return size;
+ }
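Note on cbuffer.c: the buffer keeps one slot unused so that get == put always means "empty", which is why capacity is buffer_size - 1. The block below is a minimal sketch of that accounting with a fixed-size array. The struct Ring and the ring_* functions are hypothetical stand-ins, and the copy is written as a short loop instead of the two explicit segments used by Cb_read_data and Cb_write_data.

```c
#include <assert.h>
#include <string.h>

enum { ring_size = 8 };          /* hypothetical; capacity is ring_size - 1 */

struct Ring                      /* hypothetical stand-in for Circular_buffer */
  {
  unsigned char buffer[ring_size];
  unsigned get;                  /* buffer is empty when get == put */
  unsigned put;
  };

unsigned ring_used( const struct Ring * const r )
  { return ( ( r->get <= r->put ) ? 0 : ring_size ) + r->put - r->get; }

unsigned ring_free( const struct Ring * const r )
  { return ring_size - 1 - ring_used( r ); }

/* Copy up to 'size' bytes into the ring; at most two segments are needed:
   one up to the physical end of the array and one starting at index 0. */
unsigned ring_write( struct Ring * const r, const unsigned char * const in,
                     unsigned size )
  {
  unsigned done = 0;
  assert( r->get < ring_size && r->put < ring_size );
  if( size > ring_free( r ) ) size = ring_free( r );
  while( done < size )
    {
    unsigned chunk = ring_size - r->put;        /* room before wrapping */
    if( chunk > size - done ) chunk = size - done;
    memcpy( r->buffer + r->put, in + done, chunk );
    r->put = ( r->put + chunk ) % ring_size;
    done += chunk;
    }
  return done;
  }

/* Copy up to 'size' bytes out of the ring and advance 'get'. */
unsigned ring_read( struct Ring * const r, unsigned char * const out,
                    unsigned size )
  {
  unsigned done = 0;
  assert( r->get < ring_size && r->put < ring_size );
  if( size > ring_used( r ) ) size = ring_used( r );
  while( done < size )
    {
    unsigned chunk = ring_size - r->get;        /* data before wrapping */
    if( chunk > size - done ) chunk = size - done;
    memcpy( out + done, r->buffer + r->get, chunk );
    r->get = ( r->get + chunk ) % ring_size;
    done += chunk;
    }
  return done;
  }
```

In this sketch ring_used and ring_free always add up to ring_size - 1, matching the "capacity == buffer_size - 1" comment in the struct definition above.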
diff --git a/configure b/configure
new file mode 100755
index 0000000..ec5c149
--- /dev/null
+++ b/configure
@@ -0,0 +1,244 @@
+#! /bin/sh
+# configure script for Lzlib - Compression library for the lzip format
+# Copyright (C) 2009-2024 Antonio Diaz Diaz.
+#
+# This configure script is free software: you have unlimited permission
+# to copy, distribute, and modify it.
+
+pkgname=lzlib
+pkgversion=1.14
+soversion=1
+progname=minilzip
+progname_static=${progname}
+progname_shared=
+progname_lzip=${progname}
+disable_ldconfig=
+libname=lz
+srctrigger=doc/${pkgname}.texi
+
+# clear some things potentially inherited from environment.
+LC_ALL=C
+export LC_ALL
+srcdir=
+prefix=/usr/local
+exec_prefix='$(prefix)'
+bindir='$(exec_prefix)/bin'
+datarootdir='$(prefix)/share'
+includedir='$(prefix)/include'
+infodir='$(datarootdir)/info'
+libdir='$(exec_prefix)/lib'
+mandir='$(datarootdir)/man'
+CC=gcc
+AR=ar
+CPPFLAGS=
+CFLAGS='-Wall -W -O2'
+LDFLAGS=
+ARFLAGS=-rcs
+MAKEINFO=makeinfo
+
+# checking whether we are using GNU C.
+/bin/sh -c "${CC} --version" > /dev/null 2>&1 || { CC=cc ; CFLAGS=-O2 ; }
+
+# Loop over all args
+args=
+no_create=
+while [ $# != 0 ] ; do
+
+ # Get the first arg, and shuffle
+ option=$1 ; arg2=no
+ shift
+
+ # Add the argument quoted to args
+ if [ -z "${args}" ] ; then args="\"${option}\""
+ else args="${args} \"${option}\"" ; fi
+
+ # Split out the argument for options that take them
+ case ${option} in
+ *=*) optarg=`echo "${option}" | sed -e 's,^[^=]*=,,;s,/$,,'` ;;
+ esac
+
+ # Process the options
+ case ${option} in
+ --help | -h)
+ echo "Usage: $0 [OPTION]... [VAR=VALUE]..."
+ echo
+ echo "To assign makefile variables (e.g., CC, CFLAGS...), specify them as"
+ echo "arguments to configure in the form VAR=VALUE."
+ echo
+ echo "Options and variables: [defaults in brackets]"
+ echo " -h, --help display this help and exit"
+ echo " -V, --version output version information and exit"
+ echo " --srcdir=DIR find the source code in DIR [. or ..]"
+ echo " --prefix=DIR install into DIR [${prefix}]"
+ echo " --exec-prefix=DIR base directory for arch-dependent files [${exec_prefix}]"
+ echo " --bindir=DIR user executables directory [${bindir}]"
+ echo " --datarootdir=DIR base directory for doc and data [${datarootdir}]"
+ echo " --includedir=DIR C header files [${includedir}]"
+ echo " --infodir=DIR info files directory [${infodir}]"
+ echo " --libdir=DIR object code libraries [${libdir}]"
+ echo " --mandir=DIR man pages directory [${mandir}]"
+ echo " --disable-static don't build a static library [enable]"
+ echo " (implies --enable-shared)"
+ echo " --enable-shared build also a shared library [disable]"
+ echo " --disable-ldconfig don't run ldconfig after install"
+ echo " CC=COMPILER C compiler to use [${CC}]"
+ echo " AR=ARCHIVER library archiver to use [${AR}]"
+ echo " CPPFLAGS=OPTIONS command-line options for the preprocessor [${CPPFLAGS}]"
+ echo " CFLAGS=OPTIONS command-line options for the C compiler [${CFLAGS}]"
+ echo " CFLAGS+=OPTIONS append options to the current value of CFLAGS"
+ echo " LDFLAGS=OPTIONS command-line options for the linker [${LDFLAGS}]"
+ echo " ARFLAGS=OPTIONS command-line options for the library archiver [${ARFLAGS}]"
+ echo " MAKEINFO=NAME makeinfo program to use [${MAKEINFO}]"
+ echo
+ exit 0 ;;
+ --version | -V)
+ echo "Configure script for ${pkgname} version ${pkgversion}"
+ exit 0 ;;
+ --srcdir) srcdir=$1 ; arg2=yes ;;
+ --prefix) prefix=$1 ; arg2=yes ;;
+ --exec-prefix) exec_prefix=$1 ; arg2=yes ;;
+ --bindir) bindir=$1 ; arg2=yes ;;
+ --datarootdir) datarootdir=$1 ; arg2=yes ;;
+ --includedir) includedir=$1 ; arg2=yes ;;
+ --infodir) infodir=$1 ; arg2=yes ;;
+ --libdir) libdir=$1 ; arg2=yes ;;
+ --mandir) mandir=$1 ; arg2=yes ;;
+
+ --srcdir=*) srcdir=${optarg} ;;
+ --prefix=*) prefix=${optarg} ;;
+ --exec-prefix=*) exec_prefix=${optarg} ;;
+ --bindir=*) bindir=${optarg} ;;
+ --datarootdir=*) datarootdir=${optarg} ;;
+ --includedir=*) includedir=${optarg} ;;
+ --infodir=*) infodir=${optarg} ;;
+ --libdir=*) libdir=${optarg} ;;
+ --mandir=*) mandir=${optarg} ;;
+ --no-create) no_create=yes ;;
+ --disable-static)
+ progname_static=
+ progname_shared=${progname}_shared
+ progname_lzip=${progname}_shared ;;
+ --enable-shared)
+ progname_shared=${progname}_shared
+ progname_lzip=${progname}_shared ;;
+ --disable-ldconfig) disable_ldconfig=yes ;;
+
+ CC=*) CC=${optarg} ;;
+ AR=*) AR=${optarg} ;;
+ CPPFLAGS=*) CPPFLAGS=${optarg} ;;
+ CFLAGS=*) CFLAGS=${optarg} ;;
+ CFLAGS+=*) CFLAGS="${CFLAGS} ${optarg}" ;;
+ LDFLAGS=*) LDFLAGS=${optarg} ;;
+ ARFLAGS=*) ARFLAGS=${optarg} ;;
+ MAKEINFO=*) MAKEINFO=${optarg} ;;
+
+ --*)
+ echo "configure: WARNING: unrecognized option: '${option}'" 1>&2 ;;
+ *=* | *-*-*) ;;
+ *)
+ echo "configure: unrecognized option: '${option}'" 1>&2
+ echo "Try 'configure --help' for more information." 1>&2
+ exit 1 ;;
+ esac
+
+ # Check if the option took a separate argument
+ if [ "${arg2}" = yes ] ; then
+ if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift
+ else echo "configure: Missing argument to '${option}'" 1>&2
+ exit 1
+ fi
+ fi
+done
+
+# Find the source code, if location was not specified.
+srcdirtext=
+if [ -z "${srcdir}" ] ; then
+ srcdirtext="or . or .." ; srcdir=.
+ if [ ! -r "${srcdir}/${srctrigger}" ] ; then srcdir=.. ; fi
+ if [ ! -r "${srcdir}/${srctrigger}" ] ; then
+ ## the sed command below emulates the dirname command
+ srcdir=`echo "$0" | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'`
+ fi
+fi
+
+if [ ! -r "${srcdir}/${srctrigger}" ] ; then
+ echo "configure: Can't find source code in ${srcdir} ${srcdirtext}" 1>&2
+ echo "configure: (At least ${srctrigger} is missing)." 1>&2
+ exit 1
+fi
+
+# Set srcdir to . if that's what it is.
+if [ "`pwd`" = "`cd "${srcdir}" ; pwd`" ] ; then srcdir=. ; fi
+
+echo
+if [ -z "${no_create}" ] ; then
+ echo "creating config.status"
+ rm -f config.status
+ cat > config.status << EOF
+#! /bin/sh
+# This file was generated automatically by configure. Don't edit.
+# Run this file to recreate the current configuration.
+#
+# This script is free software: you have unlimited permission
+# to copy, distribute, and modify it.
+
+exec /bin/sh "$0" ${args} --no-create
+EOF
+ chmod +x config.status
+fi
+
+echo "creating Makefile"
+echo "VPATH = ${srcdir}"
+echo "prefix = ${prefix}"
+echo "exec_prefix = ${exec_prefix}"
+echo "bindir = ${bindir}"
+echo "datarootdir = ${datarootdir}"
+echo "includedir = ${includedir}"
+echo "infodir = ${infodir}"
+echo "libdir = ${libdir}"
+echo "mandir = ${mandir}"
+echo "CC = ${CC}"
+echo "AR = ${AR}"
+echo "CPPFLAGS = ${CPPFLAGS}"
+echo "CFLAGS = ${CFLAGS}"
+echo "LDFLAGS = ${LDFLAGS}"
+echo "ARFLAGS = ${ARFLAGS}"
+echo "MAKEINFO = ${MAKEINFO}"
+rm -f Makefile
+cat > Makefile << EOF
+# Makefile for Lzlib - Compression library for the lzip format
+# Copyright (C) 2009-2024 Antonio Diaz Diaz.
+# This file was generated automatically by configure. Don't edit.
+#
+# This Makefile is free software: you have unlimited permission
+# to copy, distribute, and modify it.
+
+pkgname = ${pkgname}
+pkgversion = ${pkgversion}
+soversion = ${soversion}
+progname = ${progname}
+progname_static = ${progname_static}
+progname_shared = ${progname_shared}
+progname_lzip = ${progname_lzip}
+disable_ldconfig = ${disable_ldconfig}
+libname = ${libname}
+VPATH = ${srcdir}
+prefix = ${prefix}
+exec_prefix = ${exec_prefix}
+bindir = ${bindir}
+datarootdir = ${datarootdir}
+includedir = ${includedir}
+infodir = ${infodir}
+libdir = ${libdir}
+mandir = ${mandir}
+CC = ${CC}
+AR = ${AR}
+CPPFLAGS = ${CPPFLAGS}
+CFLAGS = ${CFLAGS}
+LDFLAGS = ${LDFLAGS}
+ARFLAGS = ${ARFLAGS}
+MAKEINFO = ${MAKEINFO}
+EOF
+cat "${srcdir}/Makefile.in" >> Makefile
+
+echo "OK. Now you can run make."
diff --git a/decoder.c b/decoder.c
new file mode 100644
index 0000000..6544ec6
--- /dev/null
+++ b/decoder.c
@@ -0,0 +1,145 @@
+/* Lzlib - Compression library for the lzip format
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+static int LZd_try_check_trailer( struct LZ_decoder * const d )
+ {
+ Lzip_trailer trailer;
+ if( Rd_available_bytes( d->rdec ) < Lt_size )
+ { if( !d->rdec->at_stream_end ) return 0; else return 2; }
+ d->check_trailer_pending = false;
+ d->member_finished = true;
+
+ if( Rd_read_data( d->rdec, trailer, Lt_size ) == Lt_size &&
+ Lt_get_data_crc( trailer ) == LZd_crc( d ) &&
+ Lt_get_data_size( trailer ) == LZd_data_position( d ) &&
+ Lt_get_member_size( trailer ) == d->rdec->member_position ) return 0;
+ return 3;
+ }
+
+
+/* Return value: 0 = OK, 1 = decoder error, 2 = unexpected EOF,
+ 3 = trailer error, 4 = unknown marker found,
+ 5 = library error. */
+static int LZd_decode_member( struct LZ_decoder * const d )
+ {
+ struct Range_decoder * const rdec = d->rdec;
+ State * const state = &d->state;
+ /* unsigned old_mpos = rdec->member_position; */
+
+ if( d->member_finished ) return 0;
+ if( !Rd_try_reload( rdec ) )
+ { if( !rdec->at_stream_end ) return 0; else return 2; }
+ if( d->check_trailer_pending ) return LZd_try_check_trailer( d );
+
+ while( !Rd_finished( rdec ) )
+ {
+ /* const unsigned mpos = rdec->member_position;
+ if( mpos - old_mpos > rd_min_available_bytes ) return 5;
+ old_mpos = mpos; */
+ if( !Rd_enough_available_bytes( rdec ) ) /* check unexpected EOF */
+ { if( !rdec->at_stream_end ) return 0;
+ if( Cb_empty( &rdec->cb ) ) break; } /* decode until EOF */
+ if( !LZd_enough_free_bytes( d ) ) return 0;
+ const int pos_state = LZd_data_position( d ) & pos_state_mask;
+ if( Rd_decode_bit( rdec, &d->bm_match[*state][pos_state] ) == 0 ) /* 1st bit */
+ {
+ /* literal byte */
+ Bit_model * const bm = d->bm_literal[get_lit_state(LZd_peek_prev( d ))];
+ if( ( *state = St_set_char( *state ) ) < 4 )
+ LZd_put_byte( d, Rd_decode_tree8( rdec, bm ) );
+ else
+ LZd_put_byte( d, Rd_decode_matched( rdec, bm, LZd_peek( d, d->rep0 ) ) );
+ continue;
+ }
+ /* match or repeated match */
+ int len;
+ if( Rd_decode_bit( rdec, &d->bm_rep[*state] ) != 0 ) /* 2nd bit */
+ {
+ if( Rd_decode_bit( rdec, &d->bm_rep0[*state] ) == 0 ) /* 3rd bit */
+ {
+ if( Rd_decode_bit( rdec, &d->bm_len[*state][pos_state] ) == 0 ) /* 4th bit */
+ { *state = St_set_short_rep( *state );
+ LZd_put_byte( d, LZd_peek( d, d->rep0 ) ); continue; }
+ }
+ else
+ {
+ unsigned distance;
+ if( Rd_decode_bit( rdec, &d->bm_rep1[*state] ) == 0 ) /* 4th bit */
+ distance = d->rep1;
+ else
+ {
+ if( Rd_decode_bit( rdec, &d->bm_rep2[*state] ) == 0 ) /* 5th bit */
+ distance = d->rep2;
+ else
+ { distance = d->rep3; d->rep3 = d->rep2; }
+ d->rep2 = d->rep1;
+ }
+ d->rep1 = d->rep0;
+ d->rep0 = distance;
+ }
+ *state = St_set_rep( *state );
+ len = Rd_decode_len( rdec, &d->rep_len_model, pos_state );
+ }
+ else /* match */
+ {
+ len = Rd_decode_len( rdec, &d->match_len_model, pos_state );
+ unsigned distance = Rd_decode_tree6( rdec, d->bm_dis_slot[get_len_state(len)] );
+ if( distance >= start_dis_model )
+ {
+ const unsigned dis_slot = distance;
+ const int direct_bits = ( dis_slot >> 1 ) - 1;
+ distance = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
+ if( dis_slot < end_dis_model )
+ distance += Rd_decode_tree_reversed( rdec,
+ d->bm_dis + ( distance - dis_slot ), direct_bits );
+ else
+ {
+ distance +=
+ Rd_decode( rdec, direct_bits - dis_align_bits ) << dis_align_bits;
+ distance += Rd_decode_tree_reversed4( rdec, d->bm_align );
+ if( distance == 0xFFFFFFFFU ) /* marker found */
+ {
+ Rd_normalize( rdec );
+ /* const unsigned mpos = rdec->member_position;
+ if( mpos - old_mpos > rd_min_available_bytes ) return 5;
+ old_mpos = mpos; */
+ if( len == min_match_len ) /* End Of Stream marker */
+ {
+ d->check_trailer_pending = true;
+ return LZd_try_check_trailer( d );
+ }
+ if( len == min_match_len + 1 ) /* Sync Flush marker */
+ {
+ rdec->reload_pending = true;
+ if( Rd_try_reload( rdec ) ) continue;
+ if( !rdec->at_stream_end ) return 0; else break;
+ }
+ return 4;
+ }
+ }
+ }
+ d->rep3 = d->rep2; d->rep2 = d->rep1; d->rep1 = d->rep0; d->rep0 = distance;
+ *state = St_set_match( *state );
+ if( d->rep0 >= d->dictionary_size ||
+ ( d->rep0 >= d->cb.put && !d->pos_wrapped ) ) return 1;
+ }
+ LZd_copy_block( d, d->rep0, len );
+ }
+ return 2;
+ }
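Note on decoder.c: the match branch of LZd_decode_member turns a 6-bit distance slot into a base value plus direct_bits extra bits. The block below is a small sketch of that arithmetic only. The slot_* helper names are hypothetical, and the constant start_dis_model = 4 is assumed from the lzip format, since its definition (in lzip.h) is outside this excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; in lzlib these constants are defined in lzip.h. */
enum { start_dis_model = 4, dis_slots = 64 };

/* Number of extra bits that follow distance slot 'dis_slot'. */
int slot_direct_bits( const unsigned dis_slot )
  {
  assert( dis_slot < dis_slots );
  return ( dis_slot < start_dis_model ) ? 0 : (int)( dis_slot >> 1 ) - 1;
  }

/* Smallest distance value produced by 'dis_slot' (all extra bits zero). */
uint32_t slot_base( const unsigned dis_slot )
  {
  if( dis_slot < start_dis_model ) return dis_slot;  /* slot is the value */
  return (uint32_t)( 2 | ( dis_slot & 1 ) ) << slot_direct_bits( dis_slot );
  }

/* Largest distance value produced by 'dis_slot' (all extra bits one). */
uint32_t slot_max( const unsigned dis_slot )
  {
  const int bits = slot_direct_bits( dis_slot );
  return slot_base( dis_slot ) + ( ( (uint32_t)1 << bits ) - 1 );
  }
```

Under these assumptions the slots tile the 32-bit range without gaps, and slot 63 with all extra bits set yields 0xFFFFFFFF, the value the decoder above tests for as a marker.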
diff --git a/decoder.h b/decoder.h
new file mode 100644
index 0000000..4b91fec
--- /dev/null
+++ b/decoder.h
@@ -0,0 +1,463 @@
+/* Lzlib - Compression library for the lzip format
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+enum { rd_min_available_bytes = 10 };
+
+struct Range_decoder
+ {
+ struct Circular_buffer cb; /* input buffer */
+ unsigned long long member_position;
+ uint32_t code;
+ uint32_t range;
+ bool at_stream_end;
+ bool reload_pending;
+ };
+
+static inline bool Rd_init( struct Range_decoder * const rdec )
+ {
+ if( !Cb_init( &rdec->cb, 65536 + rd_min_available_bytes ) ) return false;
+ rdec->member_position = 0;
+ rdec->code = 0;
+ rdec->range = 0xFFFFFFFFU;
+ rdec->at_stream_end = false;
+ rdec->reload_pending = false;
+ return true;
+ }
+
+static inline void Rd_free( struct Range_decoder * const rdec )
+ { Cb_free( &rdec->cb ); }
+
+static inline bool Rd_finished( const struct Range_decoder * const rdec )
+ { return rdec->at_stream_end && Cb_empty( &rdec->cb ); }
+
+static inline void Rd_finish( struct Range_decoder * const rdec )
+ { rdec->at_stream_end = true; }
+
+static inline bool Rd_enough_available_bytes( const struct Range_decoder * const rdec )
+ { return Cb_used_bytes( &rdec->cb ) >= rd_min_available_bytes; }
+
+static inline unsigned Rd_available_bytes( const struct Range_decoder * const rdec )
+ { return Cb_used_bytes( &rdec->cb ); }
+
+static inline unsigned Rd_free_bytes( const struct Range_decoder * const rdec )
+ { return rdec->at_stream_end ? 0 : Cb_free_bytes( &rdec->cb ); }
+
+static inline unsigned long long Rd_purge( struct Range_decoder * const rdec )
+ {
+ const unsigned long long size =
+ rdec->member_position + Cb_used_bytes( &rdec->cb );
+ Cb_reset( &rdec->cb );
+ rdec->member_position = 0; rdec->at_stream_end = true;
+ return size;
+ }
+
+static inline void Rd_reset( struct Range_decoder * const rdec )
+ { Cb_reset( &rdec->cb );
+ rdec->member_position = 0; rdec->at_stream_end = false; }
+
+
+/* Seek for a member header and update 'get'. Set '*skippedp' to the number
+ of bytes skipped. Return true if a valid header is found.
+*/
+static bool Rd_find_header( struct Range_decoder * const rdec,
+ unsigned * const skippedp )
+ {
+ *skippedp = 0;
+ while( rdec->cb.get != rdec->cb.put )
+ {
+ if( rdec->cb.buffer[rdec->cb.get] == lzip_magic[0] )
+ {
+ unsigned get = rdec->cb.get;
+ int i;
+ Lzip_header header;
+ for( i = 0; i < Lh_size; ++i )
+ {
+ if( get == rdec->cb.put ) return false; /* not enough data */
+ header[i] = rdec->cb.buffer[get];
+ if( ++get >= rdec->cb.buffer_size ) get = 0;
+ }
+ if( Lh_check( header ) ) return true;
+ }
+ if( ++rdec->cb.get >= rdec->cb.buffer_size ) rdec->cb.get = 0;
+ ++*skippedp;
+ }
+ return false;
+ }
+
+
+static inline int Rd_write_data( struct Range_decoder * const rdec,
+ const uint8_t * const inbuf, const int size )
+ {
+ if( rdec->at_stream_end || size <= 0 ) return 0;
+ return Cb_write_data( &rdec->cb, inbuf, size );
+ }
+
+static inline uint8_t Rd_get_byte( struct Range_decoder * const rdec )
+ {
+ /* 0xFF avoids decoder error if member is truncated at EOS marker */
+ if( Rd_finished( rdec ) ) return 0xFF;
+ ++rdec->member_position;
+ return Cb_get_byte( &rdec->cb );
+ }
+
+static inline int Rd_read_data( struct Range_decoder * const rdec,
+ uint8_t * const outbuf, const int size )
+ {
+ const int sz = Cb_read_data( &rdec->cb, outbuf, size );
+ if( sz > 0 ) rdec->member_position += sz;
+ return sz;
+ }
+
+static inline bool Rd_unread_data( struct Range_decoder * const rdec,
+ const unsigned size )
+ {
+ if( size > rdec->member_position || !Cb_unread_data( &rdec->cb, size ) )
+ return false;
+ rdec->member_position -= size;
+ return true;
+ }
+
+static bool Rd_try_reload( struct Range_decoder * const rdec )
+ {
+ if( rdec->reload_pending && Rd_available_bytes( rdec ) >= 5 )
+ {
+ rdec->reload_pending = false;
+ rdec->code = 0;
+ rdec->range = 0xFFFFFFFFU;
+ Rd_get_byte( rdec ); /* discard first byte of the LZMA stream */
+ int i; for( i = 0; i < 4; ++i )
+ rdec->code = (rdec->code << 8) | Rd_get_byte( rdec );
+ }
+ return !rdec->reload_pending;
+ }
+
+static inline void Rd_normalize( struct Range_decoder * const rdec )
+ {
+ if( rdec->range <= 0x00FFFFFFU )
+ { rdec->range <<= 8; rdec->code = (rdec->code << 8) | Rd_get_byte( rdec ); }
+ }
+
+static inline unsigned Rd_decode( struct Range_decoder * const rdec,
+ const int num_bits )
+ {
+ unsigned symbol = 0;
+ int i;
+ for( i = num_bits; i > 0; --i )
+ {
+ Rd_normalize( rdec );
+ rdec->range >>= 1;
+/* symbol <<= 1; */
+/* if( rdec->code >= rdec->range ) { rdec->code -= rdec->range; symbol |= 1; } */
+ const bool bit = ( rdec->code >= rdec->range );
+ symbol <<= 1; symbol += bit;
+ rdec->code -= rdec->range & ( 0U - bit );
+ }
+ return symbol;
+ }
+
+static inline unsigned Rd_decode_bit( struct Range_decoder * const rdec,
+ Bit_model * const probability )
+ {
+ Rd_normalize( rdec );
+ const uint32_t bound = ( rdec->range >> bit_model_total_bits ) * *probability;
+ if( rdec->code < bound )
+ {
+ rdec->range = bound;
+ *probability += ( bit_model_total - *probability ) >> bit_model_move_bits;
+ return 0;
+ }
+ else
+ {
+ rdec->code -= bound;
+ rdec->range -= bound;
+ *probability -= *probability >> bit_model_move_bits;
+ return 1;
+ }
+ }
+
+static inline void Rd_decode_symbol_bit( struct Range_decoder * const rdec,
+ Bit_model * const probability, unsigned * symbol )
+ {
+ Rd_normalize( rdec );
+ *symbol <<= 1;
+ const uint32_t bound = ( rdec->range >> bit_model_total_bits ) * *probability;
+ if( rdec->code < bound )
+ {
+ rdec->range = bound;
+ *probability += ( bit_model_total - *probability ) >> bit_model_move_bits;
+ }
+ else
+ {
+ rdec->code -= bound;
+ rdec->range -= bound;
+ *probability -= *probability >> bit_model_move_bits;
+ *symbol |= 1;
+ }
+ }
+
+static inline void Rd_decode_symbol_bit_reversed( struct Range_decoder * const rdec,
+ Bit_model * const probability, unsigned * model,
+ unsigned * symbol, const int i )
+ {
+ Rd_normalize( rdec );
+ *model <<= 1;
+ const uint32_t bound = ( rdec->range >> bit_model_total_bits ) * *probability;
+ if( rdec->code < bound )
+ {
+ rdec->range = bound;
+ *probability += ( bit_model_total - *probability ) >> bit_model_move_bits;
+ }
+ else
+ {
+ rdec->code -= bound;
+ rdec->range -= bound;
+ *probability -= *probability >> bit_model_move_bits;
+ *model |= 1;
+ *symbol |= 1 << i;
+ }
+ }
+
+static inline unsigned Rd_decode_tree6( struct Range_decoder * const rdec,
+ Bit_model bm[] )
+ {
+ unsigned symbol = 1;
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ return symbol & 0x3F;
+ }
+
+static inline unsigned Rd_decode_tree8( struct Range_decoder * const rdec,
+ Bit_model bm[] )
+ {
+ unsigned symbol = 1;
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ return symbol & 0xFF;
+ }
+
+static inline unsigned
+Rd_decode_tree_reversed( struct Range_decoder * const rdec,
+ Bit_model bm[], const int num_bits )
+ {
+ unsigned model = 1;
+ unsigned symbol = 0;
+ int i;
+ for( i = 0; i < num_bits; ++i )
+ Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, i );
+ return symbol;
+ }
+
+static inline unsigned
+Rd_decode_tree_reversed4( struct Range_decoder * const rdec, Bit_model bm[] )
+ {
+ unsigned model = 1;
+ unsigned symbol = 0;
+ Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 0 );
+ Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 1 );
+ Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 2 );
+ Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 3 );
+ return symbol;
+ }
+
+static inline unsigned Rd_decode_matched( struct Range_decoder * const rdec,
+ Bit_model bm[], unsigned match_byte )
+ {
+ unsigned symbol = 1;
+ unsigned mask = 0x100;
+ while( true )
+ {
+ const unsigned match_bit = ( match_byte <<= 1 ) & mask;
+ const unsigned bit = Rd_decode_bit( rdec, &bm[symbol+match_bit+mask] );
+ symbol <<= 1; symbol += bit;
+ if( symbol > 0xFF ) return symbol & 0xFF;
+ mask &= ~(match_bit ^ (bit << 8)); /* if( match_bit != bit ) mask = 0; */
+ }
+ }
+
+static inline unsigned Rd_decode_len( struct Range_decoder * const rdec,
+ struct Len_model * const lm,
+ const int pos_state )
+ {
+ Bit_model * bm;
+ unsigned mask, offset, symbol = 1;
+
+ if( Rd_decode_bit( rdec, &lm->choice1 ) == 0 )
+ { bm = lm->bm_low[pos_state]; mask = 7; offset = 0; goto len3; }
+ if( Rd_decode_bit( rdec, &lm->choice2 ) == 0 )
+ { bm = lm->bm_mid[pos_state]; mask = 7; offset = len_low_symbols; goto len3; }
+ bm = lm->bm_high; mask = 0xFF; offset = len_low_symbols + len_mid_symbols;
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+len3:
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+ return ( symbol & mask ) + min_match_len + offset;
+ }
+
+
+enum { lzd_min_free_bytes = max_match_len };
+
+struct LZ_decoder
+ {
+ struct Circular_buffer cb;
+ unsigned long long partial_data_pos;
+ struct Range_decoder * rdec;
+ unsigned dictionary_size;
+ uint32_t crc;
+ bool check_trailer_pending;
+ bool member_finished;
+ bool pos_wrapped;
+ unsigned rep0; /* rep[0-3] latest four distances */
+ unsigned rep1; /* used for efficient coding of */
+ unsigned rep2; /* repeated distances */
+ unsigned rep3;
+ State state;
+
+ Bit_model bm_literal[1<<literal_context_bits][0x300];
+ Bit_model bm_match[states][pos_states];
+ Bit_model bm_rep[states];
+ Bit_model bm_rep0[states];
+ Bit_model bm_rep1[states];
+ Bit_model bm_rep2[states];
+ Bit_model bm_len[states][pos_states];
+ Bit_model bm_dis_slot[len_states][1<<dis_slot_bits];
+ Bit_model bm_dis[modeled_distances-end_dis_model+1];
+ Bit_model bm_align[dis_align_size];
+
+ struct Len_model match_len_model;
+ struct Len_model rep_len_model;
+ };
+
+static inline bool LZd_enough_free_bytes( const struct LZ_decoder * const d )
+ { return Cb_free_bytes( &d->cb ) >= lzd_min_free_bytes; }
+
+static inline uint8_t LZd_peek_prev( const struct LZ_decoder * const d )
+ { return d->cb.buffer[((d->cb.put > 0) ? d->cb.put : d->cb.buffer_size)-1]; }
+
+static inline uint8_t LZd_peek( const struct LZ_decoder * const d,
+ const unsigned distance )
+ {
+ const unsigned i = ( ( d->cb.put > distance ) ? 0 : d->cb.buffer_size ) +
+ d->cb.put - distance - 1;
+ return d->cb.buffer[i];
+ }
+
+static inline void LZd_put_byte( struct LZ_decoder * const d, const uint8_t b )
+ {
+ CRC32_update_byte( &d->crc, b );
+ d->cb.buffer[d->cb.put] = b;
+ if( ++d->cb.put >= d->cb.buffer_size )
+ { d->partial_data_pos += d->cb.put; d->cb.put = 0; d->pos_wrapped = true; }
+ }
+
+static inline void LZd_copy_block( struct LZ_decoder * const d,
+ const unsigned distance, unsigned len )
+ {
+ unsigned lpos = d->cb.put, i = lpos - distance - 1;
+ bool fast, fast2;
+ if( lpos > distance )
+ {
+ fast = ( len < d->cb.buffer_size - lpos );
+ fast2 = ( fast && len <= lpos - i );
+ }
+ else
+ {
+ i += d->cb.buffer_size;
+ fast = ( len < d->cb.buffer_size - i ); /* (i == lpos) may happen */
+ fast2 = ( fast && len <= i - lpos );
+ }
+ if( fast ) /* no wrap */
+ {
+ const unsigned tlen = len;
+ if( fast2 ) /* no wrap, no overlap */
+ memcpy( d->cb.buffer + lpos, d->cb.buffer + i, len );
+ else
+ for( ; len > 0; --len ) d->cb.buffer[lpos++] = d->cb.buffer[i++];
+ CRC32_update_buf( &d->crc, d->cb.buffer + d->cb.put, tlen );
+ d->cb.put += tlen;
+ }
+ else for( ; len > 0; --len )
+ {
+ LZd_put_byte( d, d->cb.buffer[i] );
+ if( ++i >= d->cb.buffer_size ) i = 0;
+ }
+ }
+
+static inline bool LZd_init( struct LZ_decoder * const d,
+ struct Range_decoder * const rde,
+ const unsigned dict_size )
+ {
+ if( !Cb_init( &d->cb, max( 65536, dict_size ) + lzd_min_free_bytes ) )
+ return false;
+ d->partial_data_pos = 0;
+ d->rdec = rde;
+ d->dictionary_size = dict_size;
+ d->crc = 0xFFFFFFFFU;
+ d->check_trailer_pending = false;
+ d->member_finished = false;
+ d->pos_wrapped = false;
+ /* prev_byte of first byte; also for LZd_peek( 0 ) on corrupt file */
+ d->cb.buffer[d->cb.buffer_size-1] = 0;
+ d->rep0 = 0;
+ d->rep1 = 0;
+ d->rep2 = 0;
+ d->rep3 = 0;
+ d->state = 0;
+
+ Bm_array_init( d->bm_literal[0], (1 << literal_context_bits) * 0x300 );
+ Bm_array_init( d->bm_match[0], states * pos_states );
+ Bm_array_init( d->bm_rep, states );
+ Bm_array_init( d->bm_rep0, states );
+ Bm_array_init( d->bm_rep1, states );
+ Bm_array_init( d->bm_rep2, states );
+ Bm_array_init( d->bm_len[0], states * pos_states );
+ Bm_array_init( d->bm_dis_slot[0], len_states * (1 << dis_slot_bits) );
+ Bm_array_init( d->bm_dis, modeled_distances - end_dis_model + 1 );
+ Bm_array_init( d->bm_align, dis_align_size );
+ Lm_init( &d->match_len_model );
+ Lm_init( &d->rep_len_model );
+ return true;
+ }
+
+static inline void LZd_free( struct LZ_decoder * const d )
+ { Cb_free( &d->cb ); }
+
+static inline bool LZd_member_finished( const struct LZ_decoder * const d )
+ { return d->member_finished && Cb_empty( &d->cb ); }
+
+static inline unsigned LZd_crc( const struct LZ_decoder * const d )
+ { return d->crc ^ 0xFFFFFFFFU; }
+
+static inline unsigned long long
+LZd_data_position( const struct LZ_decoder * const d )
+ { return d->partial_data_pos + d->cb.put; }
diff --git a/doc/lzlib.info b/doc/lzlib.info
new file mode 100644
index 0000000..979c477
--- /dev/null
+++ b/doc/lzlib.info
@@ -0,0 +1,1336 @@
+This is lzlib.info, produced by makeinfo version 4.13+ from lzlib.texi.
+
+INFO-DIR-SECTION Compression
+START-INFO-DIR-ENTRY
+* Lzlib: (lzlib). Compression library for the lzip format
+END-INFO-DIR-ENTRY
+
+
+File: lzlib.info, Node: Top, Next: Introduction, Up: (dir)
+
+Lzlib Manual
+************
+
+This manual is for Lzlib (version 1.14, 20 January 2024).
+
+* Menu:
+
+* Introduction:: Purpose and features of lzlib
+* Library version:: Checking library version
+* Buffering:: Sizes of lzlib's buffers
+* Parameter limits:: Min / max values for some parameters
+* Compression functions:: Descriptions of the compression functions
+* Decompression functions:: Descriptions of the decompression functions
+* Error codes:: Meaning of codes returned by functions
+* Error messages:: Error messages corresponding to error codes
+* Invoking minilzip:: Command-line interface of the test program
+* Data format:: Detailed format of the compressed data
+* Examples:: A small tutorial with examples
+* Problems:: Reporting bugs
+* Concept index:: Index of concepts
+
+
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This manual is free documentation: you have unlimited permission to copy,
+distribute, and modify it.
+
+
+File: lzlib.info, Node: Introduction, Next: Library version, Prev: Top, Up: Top
+
+1 Introduction
+**************
+
+Lzlib is a data compression library providing in-memory LZMA compression and
+decompression functions, including integrity checking of the decompressed
+data. The compressed data format used by the library is the lzip format.
+Lzlib is written in C.
+
+ The lzip file format is designed for data sharing and long-term
+archiving, taking into account both data integrity and decoder availability:
+
+ * The lzip format provides very safe integrity checking and some data
+ recovery means. The program lziprecover can repair bit flip errors
+ (one of the most common forms of data corruption) in lzip files, and
+ provides data recovery capabilities, including error-checked merging
+ of damaged copies of a file. *Note Data safety: (lziprecover)Data
+ safety.
+
+ * The lzip format is as simple as possible (but not simpler). The lzip
+ manual provides the source code of a simple decompressor along with a
+ detailed explanation of how it works, so that with the help of the lzip
+ manual alone it would be possible for a digital archaeologist to extract
+ the data from a lzip file long after quantum computers eventually
+ render LZMA obsolete.
+
+ * Additionally the lzip reference implementation is copylefted, which
+ guarantees that it will remain free forever.
+
+ A nice feature of the lzip format is that a corrupt byte is easier to
+repair the nearer it is to the beginning of the file. Therefore, with the
+help of lziprecover, losing an entire archive just because of a corrupt
+byte near the beginning is a thing of the past.
+
+ The functions and variables forming the interface of the compression
+library are declared in the file 'lzlib.h'. Usage examples of the library
+are given in the files 'bbexample.c', 'ffexample.c', and 'minilzip.c' from
+the source distribution.
+
+ As 'lzlib.h' can be used by C and C++ programs, it must not impose a
+choice of system headers on the program by including one of them. Therefore
+it is the responsibility of the program using lzlib to include before
+'lzlib.h' some header that declares the type 'uint8_t'. There are at least
+four such headers in C and C++: 'stdint.h', 'cstdint', 'inttypes.h', and
+'cinttypes'.
+
+ All the library functions are thread safe. The library does not install
+any signal handler. The decoder checks the consistency of the compressed
+data, so the library should never crash even in case of corrupted input.
+
+ Compression/decompression is done by repeatedly calling a couple of
+read/write functions until all the data have been processed by the library.
+This interface is safer and less error prone than the traditional zlib
+interface.
+
+ Compression/decompression is done when the read function is called. This
+means the value returned by the position functions is not updated until a
+read call, even if a lot of data are written. If you want the data to be
+compressed in advance, just call the read function with a SIZE equal to 0.
+
+ If all the data to be compressed are written in advance, lzlib
+automatically adjusts the header of the compressed data to use the largest
+dictionary size that exceeds neither the data size nor the limit given to
+'LZ_compress_open'. This feature reduces the amount of memory needed for
+decompression and allows minilzip to produce compressed output identical to
+that of lzip.
+
+ Lzlib correctly decompresses a data stream which is the concatenation of
+two or more compressed data streams. The result is the concatenation of the
+corresponding decompressed data streams. Integrity testing of concatenated
+compressed data streams is also supported.
+
+ Lzlib is able to compress and decompress streams of unlimited size by
+automatically creating multimember output. The members so created are large,
+about 2 PiB each.
+
+ In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a
+concrete algorithm; it is more like "any algorithm using the LZMA coding
+scheme". For example, the option '-0' of lzip uses the scheme in almost the
+simplest way possible; issuing the longest match it can find, or a literal
+byte if it can't find a match. Conversely, a much more elaborate way of
+finding coding sequences of minimum size than the one currently used by
+lzip could be developed, and the resulting sequence could also be coded
+using the LZMA coding scheme.
+
+ Lzlib currently implements two variants of the LZMA algorithm: fast
+(used by option '-0' of minilzip) and normal (used by all other compression
+levels).
+
+ The high compression of LZMA comes from combining two basic, well-proven
+compression ideas: sliding dictionaries (LZ77) and Markov models (the thing
+used by every compression algorithm that uses a range encoder or similar
+order-0 entropy coder as its last stage) with segregation of contexts
+according to what the bits are used for.
+
+ The ideas embodied in lzlib are due to (at least) the following people:
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
+definition of Markov chains), G.N.N. Martin (for the definition of range
+encoding), Igor Pavlov (for putting all the above together in LZMA), and
+Julian Seward (for bzip2's CLI).
+
+ LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never
+have been compressed. Decompressed is used to refer to data which have
+undergone the process of decompression.
+
+
+File: lzlib.info, Node: Library version, Next: Buffering, Prev: Introduction, Up: Top
+
+2 Library version
+*****************
+
+One goal of lzlib is to keep perfect backward compatibility with older
+versions of itself down to 1.0. Any application working with an older lzlib
+should work with a newer lzlib. Installing a newer lzlib should not break
+anything. This chapter describes the constants and functions that the
+application can use to discover the version of the library being used. All
+of them are declared in 'lzlib.h'.
+
+ -- Constant: LZ_API_VERSION
+ This constant is defined in 'lzlib.h' and works as a version test
+ macro. The application should check at compile time that
+ LZ_API_VERSION is greater than or equal to the version required by the
+ application:
+
+ #if !defined LZ_API_VERSION || LZ_API_VERSION < 1012
+ #error "lzlib 1.12 or newer needed."
+ #endif
+
+ Before version 1.8, lzlib didn't define LZ_API_VERSION.
+ LZ_API_VERSION was first defined, with the value 1, in lzlib 1.8.
+ Since lzlib 1.12, LZ_API_VERSION is defined as (major * 1000 + minor).
+
+ NOTE: Version test macros are the library's way of announcing
+functionality to the application. They should not be confused with feature
+test macros, which allow the application to announce to the library its
+desire to have certain symbols and prototypes exposed.
+
+ -- Function: int LZ_api_version ( void )
+ If LZ_API_VERSION >= 1012, this function is declared in 'lzlib.h' (else
+ it doesn't exist). It returns the LZ_API_VERSION of the library object
+ code being used. The application should check at run time that the
+ value returned by 'LZ_api_version' is greater than or equal to the
+ version required by the application. An application may be dynamically
+ linked at run time with a different version of lzlib than the one it
+ was compiled for, and this should not break the application as long as
+ the library used provides the functionality required by the
+ application.
+
+ #if defined LZ_API_VERSION && LZ_API_VERSION >= 1012
+ if( LZ_api_version() < 1012 )
+ show_error( "lzlib 1.12 or newer needed." );
+ #endif
+
+ -- Constant: const char * LZ_version_string
+ This string constant is defined in the header file 'lzlib.h' and
+ represents the version of the library being used at compile time.
+
+ -- Function: const char * LZ_version ( void )
+ This function returns a string representing the version of the library
+ being used at run time.
+
+
+File: lzlib.info, Node: Buffering, Next: Parameter limits, Prev: Library version, Up: Top
+
+3 Buffering
+***********
+
+Lzlib internal functions need access to a memory chunk at least as large as
+the dictionary size (sliding window). For efficiency reasons, the input
+buffer for compression is twice or sixteen times as large as the dictionary
+size.
+
+ Finally, for safety reasons, lzlib uses two more internal buffers.
+
+ These are the four buffers used by lzlib, and their guaranteed minimum
+sizes:
+
+ * Input compression buffer. Written to by the function
+ 'LZ_compress_write'. For the normal variant of LZMA, its size is two
+ times the dictionary size set with the function 'LZ_compress_open' or
+ 64 KiB, whichever is larger. For the fast variant, its size is 1 MiB.
+
+ * Output compression buffer. Read from by the function
+ 'LZ_compress_read'. Its size is 64 KiB.
+
+ * Input decompression buffer. Written to by the function
+ 'LZ_decompress_write'. Its size is 64 KiB.
+
+ * Output decompression buffer. Read from by the function
+ 'LZ_decompress_read'. Its size is the dictionary size set in the header
+ of the member currently being decompressed or 64 KiB, whichever is
+ larger.
+
+
+File: lzlib.info, Node: Parameter limits, Next: Compression functions, Prev: Buffering, Up: Top
+
+4 Parameter limits
+******************
+
+These functions provide minimum and maximum values for some parameters.
+Current values are shown in square brackets.
+
+ -- Function: int LZ_min_dictionary_bits ( void )
+ Returns the base 2 logarithm of the smallest valid dictionary size
+ [12].
+
+ -- Function: int LZ_min_dictionary_size ( void )
+ Returns the smallest valid dictionary size [4 KiB].
+
+ -- Function: int LZ_max_dictionary_bits ( void )
+ Returns the base 2 logarithm of the largest valid dictionary size [29].
+
+ -- Function: int LZ_max_dictionary_size ( void )
+ Returns the largest valid dictionary size [512 MiB].
+
+ -- Function: int LZ_min_match_len_limit ( void )
+ Returns the smallest valid match length limit [5].
+
+ -- Function: int LZ_max_match_len_limit ( void )
+ Returns the largest valid match length limit [273].
+
+
+File: lzlib.info, Node: Compression functions, Next: Decompression functions, Prev: Parameter limits, Up: Top
+
+5 Compression functions
+***********************
+
+These are the functions used to compress data. In case of error, all of
+them return -1 or 0, for signed and unsigned return values respectively,
+except 'LZ_compress_open' whose return value must be checked by calling
+'LZ_compress_errno' before using it.
+
+ -- Function: struct LZ_Encoder * LZ_compress_open ( const int
+ DICTIONARY_SIZE, const int MATCH_LEN_LIMIT, const unsigned long
+ long MEMBER_SIZE )
+ Initializes the internal stream state for compression and returns a
+ pointer that can only be used as the ENCODER argument for the other
+ LZ_compress functions, or a null pointer if the encoder could not be
+ allocated.
+
+ The returned pointer must be checked by calling 'LZ_compress_errno'
+ before using it. If 'LZ_compress_errno' does not return 'LZ_ok', the
+ returned pointer must not be used and should be freed with
+ 'LZ_compress_close' to avoid memory leaks.
+
+ DICTIONARY_SIZE sets the dictionary size to be used, in bytes. Valid
+ values range from 4 KiB to 512 MiB. Note that dictionary sizes are
+ quantized. If the size specified does not match one of the valid
+ sizes, it is rounded upwards by adding up to (DICTIONARY_SIZE / 8) to
+ it.
+
+ MATCH_LEN_LIMIT sets the match length limit in bytes. Valid values
+ range from 5 to 273. Larger values usually give better compression
+ ratios but longer compression times.
+
+ If DICTIONARY_SIZE is 65535 and MATCH_LEN_LIMIT is 16, the fast
+ variant of LZMA is chosen, which produces compressed output identical to
+ that of 'lzip -0'. (The dictionary size used is rounded upwards to 64 KiB).
+
+ MEMBER_SIZE sets the member size limit in bytes. Valid values range
+ from 4 KiB to 2 PiB. A small member size may degrade compression
+ ratio, so use it only when needed. To produce a single-member data
+ stream, give MEMBER_SIZE a value larger than the amount of data to be
+ produced. Values larger than 2 PiB are reduced to 2 PiB to prevent the
+ uncompressed size of the member from overflowing.
+
+ -- Function: int LZ_compress_close ( struct LZ_Encoder * const ENCODER )
+ Frees all dynamically allocated data structures for this stream. This
+ function discards any unprocessed input and does not flush any pending
+ output. After a call to 'LZ_compress_close', ENCODER can no longer be
+ used as an argument to any LZ_compress function. It is safe to call
+ 'LZ_compress_close' with a null argument.
+
+ -- Function: int LZ_compress_finish ( struct LZ_Encoder * const ENCODER )
+ Use this function to tell 'lzlib' that all the data for this member
+ have already been written (with the function 'LZ_compress_write'). It
+ is safe to call 'LZ_compress_finish' as many times as needed. After
+ all the compressed data have been read with 'LZ_compress_read' and
+ 'LZ_compress_member_finished' returns 1, a new member can be started
+ with 'LZ_compress_restart_member'.
+
+ -- Function: int LZ_compress_restart_member ( struct LZ_Encoder * const
+ ENCODER, const unsigned long long MEMBER_SIZE )
+ Use this function to start a new member in a multimember data stream.
+ Call this function only after 'LZ_compress_member_finished' indicates
+ that the current member has been fully read (with the function
+ 'LZ_compress_read'). *Note member_size::, for a description of
+ MEMBER_SIZE.
+
+ -- Function: int LZ_compress_sync_flush ( struct LZ_Encoder * const
+ ENCODER )
+ Use this function to make available to 'LZ_compress_read' all the data
+ already written with the function 'LZ_compress_write'. First call
+ 'LZ_compress_sync_flush'. Then call 'LZ_compress_read' until it
+ returns 0.
+
+ This function writes at least one LZMA marker '3' ("Sync Flush" marker)
+ to the compressed output. Note that the sync flush marker is not
+ allowed in lzip files; it is a device for interactive communication
+ between applications using lzlib, but is useless and wasteful in a
+ file, and is excluded from the media type 'application/lzip'. The LZMA
+ marker '2' ("End Of Stream" marker) is the only marker allowed in lzip
+ files. *Note Data format::.
+
+ Repeated use of 'LZ_compress_sync_flush' may degrade compression
+ ratio, so use it only when needed. If the interval between calls to
+ 'LZ_compress_sync_flush' is large (comparable to dictionary size),
+ creating a multimember data stream with 'LZ_compress_restart_member'
+ may be an alternative.
+
+ Combining multimember stream creation with flushing may be tricky. If
+ there are more bytes available than those needed to complete
+ MEMBER_SIZE, 'LZ_compress_restart_member' needs to be called when
+ 'LZ_compress_member_finished' returns 1, followed by a new call to
+ 'LZ_compress_sync_flush'.
+
+ -- Function: int LZ_compress_read ( struct LZ_Encoder * const ENCODER,
+ uint8_t * const BUFFER, const int SIZE )
+ Reads up to SIZE bytes from the stream pointed to by ENCODER, storing
+ the results in BUFFER. If LZ_API_VERSION >= 1012, BUFFER may be a null
+ pointer, in which case the bytes read are discarded.
+
+ Returns the number of bytes actually read. This might be less than
+ SIZE; for example, if there aren't that many bytes left in the stream
+ or if more bytes have yet to be written with the function
+ 'LZ_compress_write'. Note that reading less than SIZE bytes is not an
+ error.
+
+ -- Function: int LZ_compress_write ( struct LZ_Encoder * const ENCODER,
+ uint8_t * const BUFFER, const int SIZE )
+ Writes up to SIZE bytes from BUFFER to the stream pointed to by
+ ENCODER. Returns the number of bytes actually written. This might be
+ less than SIZE. Note that writing less than SIZE bytes is not an error.
+
+ -- Function: int LZ_compress_write_size ( struct LZ_Encoder * const
+ ENCODER )
+ Returns the maximum number of bytes that can be immediately written
+ through 'LZ_compress_write'. For efficiency reasons, once the input
+ buffer is full and 'LZ_compress_write_size' returns 0, almost all the
+ buffer must be compressed before a size greater than 0 is returned
+ again. (This is done to minimize the amount of data that must be
+ copied to the beginning of the buffer before new data can be accepted).
+
+ It is guaranteed that an immediate call to 'LZ_compress_write' will
+ accept a SIZE up to the returned number of bytes.
+
+ -- Function: enum LZ_Errno LZ_compress_errno ( struct LZ_Encoder * const
+ ENCODER )
+ Returns the current error code for ENCODER. *Note Error codes::. It is
+ safe to call 'LZ_compress_errno' with a null argument, in which case
+ it returns 'LZ_bad_argument'.
+
+ -- Function: int LZ_compress_finished ( struct LZ_Encoder * const ENCODER )
+ Returns 1 if all the data have been read and 'LZ_compress_close' can
+ be safely called. Otherwise it returns 0. 'LZ_compress_finished'
+ implies 'LZ_compress_member_finished'.
+
+ -- Function: int LZ_compress_member_finished ( struct LZ_Encoder * const
+ ENCODER )
+ Returns 1 if the current member, in a multimember data stream, has been
+ fully read and 'LZ_compress_restart_member' can be safely called.
+ Otherwise it returns 0.
+
+ -- Function: unsigned long long LZ_compress_data_position ( struct
+ LZ_Encoder * const ENCODER )
+ Returns the number of input bytes already compressed in the current
+ member.
+
+ -- Function: unsigned long long LZ_compress_member_position ( struct
+ LZ_Encoder * const ENCODER )
+ Returns the number of compressed bytes already produced, but perhaps
+ not yet read, in the current member.
+
+ -- Function: unsigned long long LZ_compress_total_in_size ( struct
+ LZ_Encoder * const ENCODER )
+ Returns the total number of input bytes already compressed.
+
+ -- Function: unsigned long long LZ_compress_total_out_size ( struct
+ LZ_Encoder * const ENCODER )
+ Returns the total number of compressed bytes already produced, but
+ perhaps not yet read.
+
+
+File: lzlib.info, Node: Decompression functions, Next: Error codes, Prev: Compression functions, Up: Top
+
+6 Decompression functions
+*************************
+
+These are the functions used to decompress data. In case of error, all of
+them return -1 or 0, for signed and unsigned return values respectively,
+except 'LZ_decompress_open' whose return value must be checked by calling
+'LZ_decompress_errno' before using it.
+
+ -- Function: struct LZ_Decoder * LZ_decompress_open ( void )
+ Initializes the internal stream state for decompression and returns a
+ pointer that can only be used as the DECODER argument for the other
+ LZ_decompress functions, or a null pointer if the decoder could not be
+ allocated.
+
+ The returned pointer must be checked by calling 'LZ_decompress_errno'
+ before using it. If 'LZ_decompress_errno' does not return 'LZ_ok', the
+ returned pointer must not be used and should be freed with
+ 'LZ_decompress_close' to avoid memory leaks.
+
+ -- Function: int LZ_decompress_close ( struct LZ_Decoder * const DECODER )
+ Frees all dynamically allocated data structures for this stream. This
+ function discards any unprocessed input and does not flush any pending
+ output. After a call to 'LZ_decompress_close', DECODER can no longer
+ be used as an argument to any LZ_decompress function. It is safe to
+ call 'LZ_decompress_close' with a null argument.
+
+ -- Function: int LZ_decompress_finish ( struct LZ_Decoder * const DECODER )
+ Use this function to tell 'lzlib' that all the data for this stream
+ have already been written (with the function 'LZ_decompress_write').
+ It is safe to call 'LZ_decompress_finish' as many times as needed. It
+ is not required to call 'LZ_decompress_finish' if the input stream
+ only contains whole members, but not calling it prevents lzlib from
+ detecting a truncated member.
+
+ -- Function: int LZ_decompress_reset ( struct LZ_Decoder * const DECODER )
+ Resets the internal state of DECODER as it was just after opening it
+ with the function 'LZ_decompress_open'. Data stored in the internal
+ buffers are discarded. Position counters are set to 0.
+
+ -- Function: int LZ_decompress_sync_to_member ( struct LZ_Decoder * const
+ DECODER )
+ Resets the error state of DECODER and enters a search state that lasts
+ until a new member header (or the end of the stream) is found. After a
+ successful call to 'LZ_decompress_sync_to_member', data written with
+ 'LZ_decompress_write' are consumed and 'LZ_decompress_read' returns 0
+ until a header is found.
+
+ This function is useful to discard any data preceding the first
+ member, or to discard the rest of the current member, for example in
+ case of a data error. If the decoder is already at the beginning of a
+ member, this function does nothing.
+
+ -- Function: int LZ_decompress_read ( struct LZ_Decoder * const DECODER,
+ uint8_t * const BUFFER, const int SIZE )
+ Reads up to SIZE bytes from the stream pointed to by DECODER, storing
+ the results in BUFFER. If LZ_API_VERSION >= 1012, BUFFER may be a null
+ pointer, in which case the bytes read are discarded.
+
+ Returns the number of bytes actually read. This might be less than
+ SIZE; for example, if there aren't that many bytes left in the stream
+ or if more bytes have yet to be written with the function
+ 'LZ_decompress_write'. Note that reading less than SIZE bytes is not
+ an error.
+
+ 'LZ_decompress_read' returns at least once per member so that
+ 'LZ_decompress_member_finished' can be called (and trailer data
+ retrieved) for each member, even for empty members. Therefore,
+ 'LZ_decompress_read' returning 0 does not mean that the end of the
+ stream has been reached. The increase in the value returned by
+ 'LZ_decompress_total_in_size' can be used to tell the end of the stream
+ from an empty member.
+
+ In case of decompression error caused by corrupt or truncated data,
+ 'LZ_decompress_read' does not signal the error immediately to the
+ application, but waits until all the bytes decoded have been read. This
+ allows tools like tarlz to recover as much data as possible from each
+ damaged member. *Note tarlz manual: (tarlz)Top.
+
+ -- Function: int LZ_decompress_write ( struct LZ_Decoder * const DECODER,
+ uint8_t * const BUFFER, const int SIZE )
+ Writes up to SIZE bytes from BUFFER to the stream pointed to by
+ DECODER. Returns the number of bytes actually written. This might be
+ less than SIZE. Note that writing less than SIZE bytes is not an error.
+
+ -- Function: int LZ_decompress_write_size ( struct LZ_Decoder * const
+ DECODER )
+ Returns the maximum number of bytes that can be immediately written
+ through 'LZ_decompress_write'. This number varies smoothly; each
+ compressed byte consumed may be overwritten immediately, increasing the
+ value returned by 1.
+
+ It is guaranteed that an immediate call to 'LZ_decompress_write' will
+ accept a SIZE up to the returned number of bytes.
+
+ -- Function: enum LZ_Errno LZ_decompress_errno ( struct LZ_Decoder * const
+ DECODER )
+ Returns the current error code for DECODER. *Note Error codes::. It is
+ safe to call 'LZ_decompress_errno' with a null argument, in which case
+ it returns 'LZ_bad_argument'.
+
+ -- Function: int LZ_decompress_finished ( struct LZ_Decoder * const
+ DECODER )
+ Returns 1 if all the data have been read and 'LZ_decompress_close' can
+ be safely called. Otherwise it returns 0. 'LZ_decompress_finished'
+ does not imply 'LZ_decompress_member_finished'.
+
+ -- Function: int LZ_decompress_member_finished ( struct LZ_Decoder * const
+ DECODER )
+ Returns 1 if the previous call to 'LZ_decompress_read' finished reading
+ the current member, indicating that final values for the member are
+ available through 'LZ_decompress_data_crc',
+ 'LZ_decompress_data_position', and 'LZ_decompress_member_position'.
+ Otherwise it returns 0.
+
+ -- Function: int LZ_decompress_member_version ( struct LZ_Decoder * const
+ DECODER )
+ Returns the version of the current member, read from the member header.
+
+ -- Function: int LZ_decompress_dictionary_size ( struct LZ_Decoder * const
+ DECODER )
+ Returns the dictionary size of the current member, read from the
+ member header.
+
+ -- Function: unsigned LZ_decompress_data_crc ( struct LZ_Decoder * const
+ DECODER )
+ Returns the 32 bit Cyclic Redundancy Check of the data decompressed
+ from the current member. The value returned is valid only when
+ 'LZ_decompress_member_finished' returns 1.
+
+ -- Function: unsigned long long LZ_decompress_data_position ( struct
+ LZ_Decoder * const DECODER )
+ Returns the number of decompressed bytes already produced, but perhaps
+ not yet read, in the current member.
+
+ -- Function: unsigned long long LZ_decompress_member_position ( struct
+ LZ_Decoder * const DECODER )
+ Returns the number of input bytes already decompressed in the current
+ member.
+
+ -- Function: unsigned long long LZ_decompress_total_in_size ( struct
+ LZ_Decoder * const DECODER )
+ Returns the total number of input bytes already decompressed.
+
+ -- Function: unsigned long long LZ_decompress_total_out_size ( struct
+ LZ_Decoder * const DECODER )
+ Returns the total number of decompressed bytes already produced, but
+ perhaps not yet read.
+
+
+File: lzlib.info, Node: Error codes, Next: Error messages, Prev: Decompression functions, Up: Top
+
+7 Error codes
+*************
+
+Most library functions return -1 to indicate that they have failed. But
+this return value only tells you that an error has occurred. To find out
+what kind of error it was, you need to check the error code by calling
+'LZ_(de)compress_errno'.
+
+ Library functions don't change the value returned by
+'LZ_(de)compress_errno' when they succeed; thus, the value returned by
+'LZ_(de)compress_errno' after a successful call is not necessarily LZ_ok,
+and you should not use 'LZ_(de)compress_errno' to determine whether a call
+failed. If the call failed, then you can examine 'LZ_(de)compress_errno'.
+
+ The error codes are defined in the header file 'lzlib.h'.
+
+ -- Constant: enum LZ_Errno LZ_ok
+ The value of this constant is 0 and is used to indicate that there is
+ no error.
+
+ -- Constant: enum LZ_Errno LZ_bad_argument
+ At least one of the arguments passed to the library function was
+ invalid.
+
+ -- Constant: enum LZ_Errno LZ_mem_error
+ No memory available. The system cannot allocate more virtual memory
+ because its capacity is full.
+
+ -- Constant: enum LZ_Errno LZ_sequence_error
+ A library function was called in the wrong order. For example,
+ 'LZ_compress_restart_member' was called before
+ 'LZ_compress_member_finished' indicates that the current member is
+ finished.
+
+ -- Constant: enum LZ_Errno LZ_header_error
+ An invalid member header (one with the wrong magic bytes) was read. If
+ this happens at the end of the data stream it may indicate trailing
+ data.
+
+ -- Constant: enum LZ_Errno LZ_unexpected_eof
+ The end of the data stream was reached in the middle of a member.
+
+ -- Constant: enum LZ_Errno LZ_data_error
+ The data stream is corrupt. If 'LZ_decompress_member_position' is 6 or
+ less, it indicates either a format version not supported, an invalid
+ dictionary size, a corrupt header in a multimember data stream, or
+ trailing data too similar to a valid lzip header. Lziprecover can be
+ used to remove conflicting trailing data from a file.
+
+ -- Constant: enum LZ_Errno LZ_library_error
+ A bug was detected in the library. Please report it. *Note Problems::.
+
+
+File: lzlib.info, Node: Error messages, Next: Invoking minilzip, Prev: Error codes, Up: Top
+
+8 Error messages
+****************
+
+ -- Function: const char * LZ_strerror ( const enum LZ_Errno LZ_ERRNO )
+ Returns the standard error message for a given error code. The messages
+ are fairly short; there are no multi-line messages or embedded
+ newlines. This function makes it easy for your program to report
+ informative error messages about the failure of a library call.
+
+ The value of LZ_ERRNO normally comes from a call to
+ 'LZ_(de)compress_errno'.
+
+
+File: lzlib.info, Node: Invoking minilzip, Next: Data format, Prev: Error messages, Up: Top
+
+9 Invoking minilzip
+*******************
+
+Minilzip is a test program for the compression library lzlib, compatible
+with lzip 1.4 or newer.
+
+ Lzip is a lossless data compressor with a user interface similar to that
+of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
+chain-Algorithm' (LZMA) stream format to maximize interoperability. The
+maximum dictionary size is 512 MiB so that any lzip file can be decompressed
+on 32-bit machines. Lzip provides accurate and robust 3-factor integrity
+checking. Lzip can compress about as fast as gzip (lzip -0) or compress most
+files more than bzip2 (lzip -9). Decompression speed is intermediate between
+gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
+perspective. Lzip has been designed, written, and tested with great care to
+replace gzip and bzip2 as the standard general-purpose compressed format for
+Unix-like systems.
+
+The format for running minilzip is:
+
+ minilzip [OPTIONS] [FILES]
+
+If no file names are specified, minilzip compresses (or decompresses) from
+standard input to standard output. A hyphen '-' used as a FILE argument
+means standard input. It can be mixed with other FILES and is read just
+once, the first time it appears in the command line. Remember to prepend
+'./' to any file name beginning with a hyphen, or use '--'.
+
+ minilzip supports the following options: *Note Argument syntax:
+(arg_parser)Argument syntax.
+
+'-h'
+'--help'
+ Print an informative help message describing the options and exit.
+
+'-V'
+'--version'
+ Print the version number of minilzip on the standard output and exit.
+ This version number should be included in all bug reports.
+
+'-a'
+'--trailing-error'
+ Exit with error status 2 if any remaining input is detected after
+ decompressing the last member. Such remaining input is usually trailing
+ garbage that can be safely ignored.
+
+'-b BYTES'
+'--member-size=BYTES'
+ When compressing, set the member size limit to BYTES. It is advisable
+ to keep members smaller than RAM size so that they can be repaired with
+ lziprecover in case of corruption. A small member size may degrade
+ compression ratio, so use it only when needed. Valid values range from
+ 100 kB to 2 PiB. Defaults to 2 PiB.
+
+'-c'
+'--stdout'
+ Compress or decompress to standard output; keep input files unchanged.
+ If compressing several files, each file is compressed independently.
+ (The output consists of a sequence of independently compressed
+ members). This option (or '-o') is needed when reading from a named
+ pipe (fifo) or from a device. Use it also to recover as much of the
+ decompressed data as possible when decompressing a corrupt file. '-c'
+ overrides '-o' and '-S'. '-c' has no effect when testing.
+
+'-d'
+'--decompress'
+ Decompress the files specified. The integrity of the files specified is
+ checked. If a file does not exist, can't be opened, or the destination
+ file already exists and '--force' has not been specified, minilzip
+ continues decompressing the rest of the files and exits with error
+ status 1. If a file fails to decompress, or is a terminal, minilzip
+ exits immediately with error status 2 without decompressing the rest
+ of the files. A terminal is considered an uncompressed file, and
+ therefore invalid.
+
+'-f'
+'--force'
+ Force overwrite of output files.
+
+'-F'
+'--recompress'
+ When compressing, force re-compression of files whose name already has
+ the '.lz' or '.tlz' suffix.
+
+'-k'
+'--keep'
+ Keep (don't delete) input files during compression or decompression.
+
+'-m BYTES'
+'--match-length=BYTES'
+ When compressing, set the match length limit in bytes. After a match
+ this long is found, the search is finished. Valid values range from 5
+ to 273. Larger values usually give better compression ratios but
+ longer compression times.
+
+'-o FILE'
+'--output=FILE'
+ If '-c' has not also been specified, write the (de)compressed output
+ to FILE; keep input files unchanged. If compressing several files,
+ each file is compressed independently. (The output consists of a
+ sequence of independently compressed members). This option (or '-c')
+ is needed when reading from a named pipe (fifo) or from a device.
+ '-o -' is equivalent to '-c'. '-o' has no effect when testing.
+
+ When compressing and splitting the output in volumes, FILE is used as
+ a prefix, and several files named 'FILE00001.lz', 'FILE00002.lz', etc,
+ are created. In this case, only one input file is allowed.
+
+'-q'
+'--quiet'
+ Quiet operation. Suppress all messages.
+
+'-s BYTES'
+'--dictionary-size=BYTES'
+ When compressing, set the dictionary size limit in bytes. Minilzip
+ uses for each file the largest dictionary size that exceeds neither
+ the file size nor this limit. Valid values range from 4 KiB to
+ 512 MiB. Values 12 to 29 are interpreted as powers of two, meaning
+ 2^12 to 2^29 bytes. Dictionary sizes are quantized so that they can be
+ coded in just one byte (*note coded-dict-size::). If the size
+ specified does not match one of the valid sizes, it is rounded upwards
+ by adding up to (BYTES / 8) to it.
+
+ For maximum compression you should use a dictionary size limit as large
+ as possible, but keep in mind that the decompression memory requirement
+ is affected at compression time by the choice of dictionary size limit.
+
+'-S BYTES'
+'--volume-size=BYTES'
+ When compressing, and '-c' has not also been specified, split the
+ compressed output into several volume files with names
+ 'original_name00001.lz', 'original_name00002.lz', etc, and set the
+ volume size limit to BYTES. Input files are kept unchanged. Each
+ volume is a complete, maybe multimember, lzip file. A small volume
+ size may degrade compression ratio, so use it only when needed. Valid
+ values range from 100 kB to 4 EiB.
+
+'-t'
+'--test'
+ Check integrity of the files specified, but don't decompress them. This
+ really performs a trial decompression and throws away the result. Use
+ it together with '-v' to see information about the files. If a file
+ fails the test, does not exist, can't be opened, or is a terminal,
+ minilzip continues testing the rest of the files. A final diagnostic
+ is shown at verbosity level 1 or higher if any file fails the test
+ when testing multiple files.
+
+'-v'
+'--verbose'
+ Verbose mode.
+ When compressing, show the compression ratio and size for each file
+ processed.
+ When decompressing or testing, further -v's (up to 4) increase the
+ verbosity level, showing status, compression ratio, dictionary size,
+ and trailer contents (CRC, data size, member size).
+
+'-0 .. -9'
+ Compression level. Set the compression parameters (dictionary size and
+ match length limit) as shown in the table below. The default
+ compression level is '-6', equivalent to '-s8MiB -m36'. Note that '-9'
+ can be much slower than '-0'. These options have no effect when
+ decompressing or testing.
+
+ The bidimensional parameter space of LZMA can't be mapped to a linear
+ scale optimal for all files. If your files are large, very repetitive,
+ etc, you may need to use the options '--dictionary-size' and
+ '--match-length' directly to achieve optimal performance.
+
+ If several compression levels or '-s' or '-m' options are given, the
+ last setting is used. For example '-9 -s64MiB' is equivalent to
+ '-s64MiB -m273'.
+
+ Level Dictionary size (-s) Match length limit (-m)
+ -0 64 KiB 16 bytes
+ -1 1 MiB 5 bytes
+ -2 1.5 MiB 6 bytes
+ -3 2 MiB 8 bytes
+ -4 3 MiB 12 bytes
+ -5 4 MiB 20 bytes
+ -6 8 MiB 36 bytes
+ -7 16 MiB 68 bytes
+ -8 24 MiB 132 bytes
+ -9 32 MiB 273 bytes
+
+'--fast'
+'--best'
+ Aliases for GNU gzip compatibility.
+
+'--loose-trailing'
+ When decompressing or testing, allow trailing data whose first bytes
+ are so similar to the magic bytes of a lzip header that they can be
+ confused with a corrupt header. Use this option if a file triggers a
+ "corrupt header" error and the cause is not actually a corrupt header.
+
+'--check-lib'
+ Compare the version of lzlib used to compile minilzip with the version
+ actually being used at run time and exit. Report any differences
+ found. Exit with error status 1 if differences are found. A mismatch
+ may indicate that lzlib is not correctly installed or that a different
+ version of lzlib has been installed after compiling the shared version
+ of minilzip. Exit with error status 2 if LZ_API_VERSION and
+ LZ_version_string don't match. 'minilzip -v --check-lib' shows the
+ version of lzlib being used and the value of LZ_API_VERSION (if
+ defined). *Note Library version::.
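The quantization of dictionary sizes described under '-s' above can be
illustrated with a small sketch. The function name
'round_up_dictionary_size' is hypothetical (it is not part of lzlib or
minilzip). It assumes that the valid sizes are the ones that the DS byte of
the data format can express: a power of two from 2^12 to 2^29 minus 0 to 7
sixteenths of itself. Returning 0 for an out-of-range request is also a
choice of this sketch.

```c
#include <stdint.h>

/* Hypothetical helper, not lzlib API.
   Return the smallest dictionary size >= 'size' that has the form
   2^n - k * 2^(n-4), with n in 12..29 and k in 0..7.
   Return 0 if 'size' is outside the range 4 KiB to 512 MiB.
*/
unsigned round_up_dictionary_size( const unsigned size )
  {
  const unsigned min_size = 1U << 12, max_size = 1U << 29;
  if( size < min_size || size > max_size ) return 0;
  if( size == min_size ) return min_size;
  for( int n = 13; n <= 29; ++n )
    {
    const unsigned base = 1U << n;
    if( base < size ) continue;		/* need a larger power of two */
    /* try the largest fraction first to get the smallest candidate */
    for( int k = 7; k >= 0; --k )
      {
      const unsigned candidate = base - k * ( base / 16 );
      if( candidate >= size ) return candidate;
      }
    }
  return 0;				/* not reached for sizes in range */
  }
```

Consecutive valid sizes under the same power of two differ by one sixteenth
of that power, which is why the increase stays below (BYTES / 8).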
+
+
+ Numbers given as arguments to options may be expressed in decimal,
+hexadecimal, or octal (using the same syntax as integer constants in C++),
+and may be followed by a multiplier and an optional 'B' for "byte".
+
+ Table of SI and binary prefixes (unit multipliers):
+
+Prefix Value | Prefix Value
+k kilobyte (10^3 = 1000) | Ki kibibyte (2^10 = 1024)
+M megabyte (10^6) | Mi mebibyte (2^20)
+G gigabyte (10^9) | Gi gibibyte (2^30)
+T terabyte (10^12) | Ti tebibyte (2^40)
+P petabyte (10^15) | Pi pebibyte (2^50)
+E exabyte (10^18) | Ei exbibyte (2^60)
+Z zettabyte (10^21) | Zi zebibyte (2^70)
+Y yottabyte (10^24) | Yi yobibyte (2^80)
+R ronnabyte (10^27) | Ri robibyte (2^90)
+Q quettabyte (10^30) | Qi quebibyte (2^100)
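As an illustration of the number syntax and the prefix table above, here is
a minimal sketch of a size parser. It is not the parser used by minilzip;
the name 'parse_size' is hypothetical, and the sketch makes some
assumptions of its own: it only handles the prefixes up to 'E'/'Ei' (the
ones whose values fit in 64 bits), it accepts 'B' only after a multiplier,
and it relies on 'strtoull' with base 0 for the C-like decimal,
hexadecimal, and octal forms.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of lzlib or minilzip.
   Parse a number like "4096", "0x1000", "100kB", or "8MiB".
   Return true and store the value in '*valuep' if 'arg' is valid and
   the result fits in 64 bits.
*/
bool parse_size( const char * const arg, uint64_t * const valuep )
  {
  static const char larger[] = "MGTPE";	/* prefixes after kilo/kibi */
  char * tail;
  if( arg[0] < '0' || arg[0] > '9' ) return false;	/* no sign, no space */
  errno = 0;
  uint64_t value = strtoull( arg, &tail, 0 );
  if( tail == arg || errno != 0 ) return false;
  if( *tail != 0 )
    {
    const bool binary = ( tail[1] == 'i' );
    const uint64_t factor = binary ? 1024 : 1000;
    int exponent;
    /* the SI prefix is 'k' but the binary prefix is 'Ki' */
    if( *tail == ( binary ? 'K' : 'k' ) ) exponent = 1;
    else
      {
      const char * const p = strchr( larger, *tail );
      if( !p ) return false;
      exponent = (int)( p - larger ) + 2;
      }
    tail += binary ? 2 : 1;
    for( int i = 0; i < exponent; ++i )
      {
      if( value > UINT64_MAX / factor ) return false;	/* overflow */
      value *= factor;
      }
    if( *tail == 'B' ) ++tail;		/* optional "byte" suffix */
    if( *tail != 0 ) return false;	/* trailing garbage */
    }
  *valuep = value;
  return true;
  }
```

With this sketch, "8MiB" yields 8388608 and "100kB" yields 100000, while
"12x" is rejected.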
+
+
+ Exit status: 0 for a normal exit, 1 for environmental problems (file not
+found, invalid command-line options, I/O errors, etc), 2 to indicate a
+corrupt or invalid input file, 3 for an internal consistency error (e.g.,
+bug) which caused minilzip to panic.
+
+
+File: lzlib.info, Node: Data format, Next: Examples, Prev: Invoking minilzip, Up: Top
+
+10 Data format
+**************
+
+Perfection is reached, not when there is no longer anything to add, but
+when there is no longer anything to take away.
+-- Antoine de Saint-Exupery
+
+
+ In the diagram below, a box like this:
+
++---+
+| | <-- the vertical bars might be missing
++---+
+
+ represents one byte; a box like this:
+
++==============+
+| |
++==============+
+
+ represents a variable number of bytes.
+
+
+ Lzip data consist of one or more independent "members" (compressed data
+sets). The members simply appear one after another in the data stream, with
+no additional information before, between, or after them. Each member can
+encode in compressed form up to 16 EiB - 1 byte of uncompressed data. The
+size of a multimember data stream is unlimited.
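Because members are simply concatenated, and each one ends with its own
size (the 'Member size' field described below), a complete stream held in
memory can be walked backward member by member. The following is a minimal
sketch of that idea, not code from lzlib. The name 'count_members' is
hypothetical, and the sketch assumes that the buffer contains whole members
only, with no trailing data after the last one.

```c
#include <stdint.h>
#include <string.h>

/* Read an unsigned 64-bit little endian value from 8 bytes. */
static uint64_t get_le64( const uint8_t * const p )
  {
  uint64_t v = 0;
  for( int i = 7; i >= 0; --i ) v = ( v << 8 ) | p[i];
  return v;
  }

/* Hypothetical helper, not lzlib API.
   Count the members in 'buf' ('size' bytes) walking backward from the
   end. Return -1 if a member size or an ID string is inconsistent.
*/
long count_members( const uint8_t * const buf, const uint64_t size )
  {
  enum { header_size = 6, trailer_size = 20 };
  uint64_t pos = size;			/* end of the current member */
  long members = 0;
  while( pos > 0 )
    {
    if( pos < header_size + trailer_size ) return -1;
    /* the member size is stored in the last 8 bytes of the member */
    const uint64_t member_size = get_le64( buf + pos - 8 );
    if( member_size < header_size + trailer_size || member_size > pos )
      return -1;
    pos -= member_size;			/* start of the current member */
    if( memcmp( buf + pos, "LZIP", 4 ) != 0 ) return -1;
    ++members;
    }
  return members;
  }
```

The walk only reads headers and trailers, so it does not verify the LZMA
streams or the CRCs. Only decompression can do that.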
+
+ Each member has the following structure:
+
++--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
++--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ All multibyte values are stored in little endian order.
+
+'ID string (the "magic" bytes)'
+ A four byte string, identifying the lzip format, with the value "LZIP"
+ (0x4C, 0x5A, 0x49, 0x50).
+
+'VN (version number, 1 byte)'
+ Just in case something needs to be modified in the future. 1 for now.
+
+'DS (coded dictionary size, 1 byte)'
+ The dictionary size is calculated by taking a power of 2 (the base
+ size) and subtracting from it a fraction between 0/16 and 7/16 of the
+ base size.
+ Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).
+ Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
+ from the base size to obtain the dictionary size.
+ Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
+ Valid values for dictionary size range from 4 KiB to 512 MiB.
+
+'LZMA stream'
+ The LZMA stream, finished by an "End Of Stream" marker. Uses default
+ values for encoder properties. *Note Stream format: (lzip)Stream
+ format, for a complete description.
+ Lzip only uses the LZMA marker '2' ("End Of Stream" marker). Lzlib
+ also uses the LZMA marker '3' ("Sync Flush" marker). *Note
+ sync_flush::.
+
+'CRC32 (4 bytes)'
+ Cyclic Redundancy Check (CRC) of the original uncompressed data.
+
+'Data size (8 bytes)'
+ Size of the original uncompressed data.
+
+'Member size (8 bytes)'
+ Total size of the member, including header and trailer. This field acts
+ as a distributed index, improves the checking of stream integrity, and
+ facilitates the safe recovery of undamaged members from multimember
+ files. Lzip limits the member size to 2 PiB to prevent the data size
+ field from overflowing.
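The fixed-size fields described above can be read from a complete member
held in memory with a few lines of C. This is a minimal sketch, not code
from lzlib. The names 'Member_info', 'decode_dictionary_size', and
'parse_member' are hypothetical. The sketch does not decode the LZMA stream
or verify the CRC. It treats any coded size below 4 KiB as invalid, which is
an assumption of this sketch based on the valid range stated above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct Member_info		/* hypothetical, for illustration only */
  {
  int version;
  unsigned dictionary_size;
  uint32_t data_crc;
  uint64_t data_size;
  uint64_t member_size;
  };

/* Read an unsigned little endian value of 'bytes' bytes (1 to 8). */
static uint64_t get_le( const uint8_t * const p, const int bytes )
  {
  uint64_t v = 0;
  for( int i = bytes - 1; i >= 0; --i ) v = ( v << 8 ) | p[i];
  return v;
  }

/* Decode the DS byte. Return 0 if the result is not in 4 KiB..512 MiB. */
unsigned decode_dictionary_size( const uint8_t ds )
  {
  const int log2_base = ds & 0x1F;	/* bits 4-0 */
  const unsigned numerator = ds >> 5;	/* bits 7-5 */
  if( log2_base < 12 || log2_base > 29 ) return 0;
  const unsigned base = 1U << log2_base;
  const unsigned size = base - numerator * ( base / 16 );
  return ( size >= ( 1U << 12 ) ) ? size : 0;
  }

/* Fill '*info' from the header and trailer of the member in 'buf'.
   Return false if the member is too short, the ID string or DS byte
   is invalid, or the stored member size does not match 'size'.
*/
bool parse_member( const uint8_t * const buf, const uint64_t size,
                   struct Member_info * const info )
  {
  enum { header_size = 6, trailer_size = 20 };
  if( size < header_size + trailer_size ) return false;
  if( memcmp( buf, "LZIP", 4 ) != 0 ) return false;
  info->version = buf[4];
  info->dictionary_size = decode_dictionary_size( buf[5] );
  if( info->dictionary_size == 0 ) return false;
  const uint8_t * const trailer = buf + size - trailer_size;
  info->data_crc = (uint32_t)get_le( trailer, 4 );
  info->data_size = get_le( trailer + 4, 8 );
  info->member_size = get_le( trailer + 12, 8 );
  return info->member_size == size;
  }
```

For the DS example given above, 'decode_dictionary_size( 0xD3 )' yields
327680 (320 KiB).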
+
+
+
+File: lzlib.info, Node: Examples, Next: Problems, Prev: Data format, Up: Top
+
+11 A small tutorial with examples
+*********************************
+
+This chapter provides real code examples for the most common uses of the
+library. See these examples in context in the files 'bbexample.c' and
+'ffexample.c' from the source distribution of lzlib.
+
+ Note that the interface of lzlib is symmetrical. That is, the code for
+normal compression and decompression is identical except that one calls
+LZ_compress* functions while the other calls LZ_decompress* functions.
+
+* Menu:
+
+* Buffer compression:: Buffer-to-buffer single-member compression
+* Buffer decompression:: Buffer-to-buffer decompression
+* File compression:: File-to-file single-member compression
+* File decompression:: File-to-file decompression
+* File compression mm:: File-to-file multimember compression
+* Skipping data errors:: Decompression with automatic resynchronization
+
+
+File: lzlib.info, Node: Buffer compression, Next: Buffer decompression, Up: Examples
+
+11.1 Buffer compression
+=======================
+
+Buffer-to-buffer single-member compression (MEMBER_SIZE > total output).
+
+/* Compress 'insize' bytes from 'inbuf' to 'outbuf'.
+ Return the size of the compressed data in '*outlenp'.
+ In case of error, or if 'outsize' is too small, return false and do not
+ modify '*outlenp'.
+*/
+bool bbcompress( const uint8_t * const inbuf, const int insize,
+ const int dictionary_size, const int match_len_limit,
+ uint8_t * const outbuf, const int outsize,
+ int * const outlenp )
+ {
+ int inpos = 0, outpos = 0;
+ bool error = false;
+ struct LZ_Encoder * const encoder =
+ LZ_compress_open( dictionary_size, match_len_limit, INT64_MAX );
+ if( !encoder || LZ_compress_errno( encoder ) != LZ_ok )
+ { LZ_compress_close( encoder ); return false; }
+
+ while( true )
+ {
+ int ret = LZ_compress_write( encoder, inbuf + inpos, insize - inpos );
+ if( ret < 0 ) { error = true; break; }
+ inpos += ret;
+ if( inpos >= insize ) LZ_compress_finish( encoder );
+ ret = LZ_compress_read( encoder, outbuf + outpos, outsize - outpos );
+ if( ret < 0 ) { error = true; break; }
+ outpos += ret;
+ if( LZ_compress_finished( encoder ) == 1 ) break;
+ if( outpos >= outsize ) { error = true; break; }
+ }
+
+ if( LZ_compress_close( encoder ) < 0 ) error = true;
+ if( error ) return false;
+ *outlenp = outpos;
+ return true;
+ }
+
+
+File: lzlib.info, Node: Buffer decompression, Next: File compression, Prev: Buffer compression, Up: Examples
+
+11.2 Buffer decompression
+=========================
+
+Buffer-to-buffer decompression.
+
+/* Decompress 'insize' bytes from 'inbuf' to 'outbuf'.
+ Return the size of the decompressed data in '*outlenp'.
+ In case of error, or if 'outsize' is too small, return false and do not
+ modify '*outlenp'.
+*/
+bool bbdecompress( const uint8_t * const inbuf, const int insize,
+ uint8_t * const outbuf, const int outsize,
+ int * const outlenp )
+ {
+ int inpos = 0, outpos = 0;
+ bool error = false;
+ struct LZ_Decoder * const decoder = LZ_decompress_open();
+ if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok )
+ { LZ_decompress_close( decoder ); return false; }
+
+ while( true )
+ {
+ int ret = LZ_decompress_write( decoder, inbuf + inpos, insize - inpos );
+ if( ret < 0 ) { error = true; break; }
+ inpos += ret;
+ if( inpos >= insize ) LZ_decompress_finish( decoder );
+ ret = LZ_decompress_read( decoder, outbuf + outpos, outsize - outpos );
+ if( ret < 0 ) { error = true; break; }
+ outpos += ret;
+ if( LZ_decompress_finished( decoder ) == 1 ) break;
+ if( outpos >= outsize ) { error = true; break; }
+ }
+
+ if( LZ_decompress_close( decoder ) < 0 ) error = true;
+ if( error ) return false;
+ *outlenp = outpos;
+ return true;
+ }
+
+
+File: lzlib.info, Node: File compression, Next: File decompression, Prev: Buffer decompression, Up: Examples
+
+11.3 File compression
+=====================
+
+File-to-file compression using LZ_compress_write_size.
+
+int ffcompress( struct LZ_Encoder * const encoder,
+ FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384 };
+ uint8_t buffer[buffer_size];
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_compress_write_size( encoder ) );
+ if( size > 0 )
+ {
+ len = fread( buffer, 1, size, infile );
+ ret = LZ_compress_write( encoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) ) LZ_compress_finish( encoder );
+ }
+ ret = LZ_compress_read( encoder, buffer, buffer_size );
+ if( ret < 0 ) break;
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_compress_finished( encoder ) == 1 ) return 0;
+ }
+ return 1;
+ }
+
+
+File: lzlib.info, Node: File decompression, Next: File compression mm, Prev: File compression, Up: Examples
+
+11.4 File decompression
+=======================
+
+File-to-file decompression using LZ_decompress_write_size.
+
+int ffdecompress( struct LZ_Decoder * const decoder,
+ FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384 };
+ uint8_t buffer[buffer_size];
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_decompress_write_size( decoder ) );
+ if( size > 0 )
+ {
+ len = fread( buffer, 1, size, infile );
+ ret = LZ_decompress_write( decoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) ) LZ_decompress_finish( decoder );
+ }
+ ret = LZ_decompress_read( decoder, buffer, buffer_size );
+ if( ret < 0 ) break;
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_decompress_finished( decoder ) == 1 ) return 0;
+ }
+ return 1;
+ }
+
+
+File: lzlib.info, Node: File compression mm, Next: Skipping data errors, Prev: File decompression, Up: Examples
+
+11.5 File-to-file multimember compression
+=========================================
+
+Example 1: Multimember compression with members of fixed size
+(MEMBER_SIZE < total output).
+
+int ffmmcompress( FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384, member_size = 4096 };
+ uint8_t buffer[buffer_size];
+ bool done = false;
+ struct LZ_Encoder * const encoder =
+ LZ_compress_open( 65535, 16, member_size );
+ if( !encoder || LZ_compress_errno( encoder ) != LZ_ok )
+ { fputs( "ffexample: Not enough memory.\n", stderr );
+ LZ_compress_close( encoder ); return 1; }
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_compress_write_size( encoder ) );
+ if( size > 0 )
+ {
+ len = fread( buffer, 1, size, infile );
+ ret = LZ_compress_write( encoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) ) LZ_compress_finish( encoder );
+ }
+ ret = LZ_compress_read( encoder, buffer, buffer_size );
+ if( ret < 0 ) break;
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_compress_member_finished( encoder ) == 1 )
+ {
+ if( LZ_compress_finished( encoder ) == 1 ) { done = true; break; }
+ if( LZ_compress_restart_member( encoder, member_size ) < 0 ) break;
+ }
+ }
+ if( LZ_compress_close( encoder ) < 0 ) done = false;
+ return !done; /* 0 on success, 1 on error, like the other examples */
+ }
+
+
+Example 2: Multimember compression (user-restarted members). (Call
+LZ_compress_open with MEMBER_SIZE > largest member).
+
+/* Compress 'infile' to 'outfile' as a multimember stream with one member
+ for each line of text terminated by a newline character or by EOF.
+ Return 0 on success, 1 on error.
+*/
+int fflfcompress( struct LZ_Encoder * const encoder,
+ FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384 };
+ uint8_t buffer[buffer_size];
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_compress_write_size( encoder ) );
+ if( size > 0 )
+ {
+ for( len = 0; len < size; )
+ {
+ int ch = getc( infile );
+ if( ch == EOF || ( buffer[len++] = ch ) == '\n' ) break;
+ }
+ /* avoid writing an empty member to outfile */
+ if( len == 0 && LZ_compress_data_position( encoder ) == 0 ) return 0;
+ ret = LZ_compress_write( encoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) || buffer[len-1] == '\n' )
+ LZ_compress_finish( encoder );
+ }
+ ret = LZ_compress_read( encoder, buffer, buffer_size );
+ if( ret < 0 ) break;
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_compress_member_finished( encoder ) == 1 )
+ {
+ if( feof( infile ) && LZ_compress_finished( encoder ) == 1 ) return 0;
+ if( LZ_compress_restart_member( encoder, INT64_MAX ) < 0 ) break;
+ }
+ }
+ return 1;
+ }
+
+
+File: lzlib.info, Node: Skipping data errors, Prev: File compression mm, Up: Examples
+
+11.6 Skipping data errors
+=========================
+
+/* Decompress 'infile' to 'outfile' with automatic resynchronization to
+ next member in case of data error, including the automatic removal of
+ leading garbage.
+*/
+int ffrsdecompress( struct LZ_Decoder * const decoder,
+ FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384 };
+ uint8_t buffer[buffer_size];
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_decompress_write_size( decoder ) );
+ if( size > 0 )
+ {
+ len = fread( buffer, 1, size, infile );
+ ret = LZ_decompress_write( decoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) ) LZ_decompress_finish( decoder );
+ }
+ ret = LZ_decompress_read( decoder, buffer, buffer_size );
+ if( ret < 0 )
+ {
+ if( LZ_decompress_errno( decoder ) == LZ_header_error ||
+ LZ_decompress_errno( decoder ) == LZ_data_error )
+ { LZ_decompress_sync_to_member( decoder ); continue; }
+ break;
+ }
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_decompress_finished( decoder ) == 1 ) return 0;
+ }
+ return 1;
+ }
+
+
+File: lzlib.info, Node: Problems, Next: Concept index, Prev: Examples, Up: Top
+
+12 Reporting bugs
+*****************
+
+There are probably bugs in lzlib. There are certainly errors and omissions
+in this manual. If you report them, they will get fixed. If you don't, no
+one will ever know about them and they will remain unfixed for all
+eternity, if not longer.
+
+ If you find a bug in lzlib, please send electronic mail to
+<lzip-bug@nongnu.org>. Include the version number, which you can find by
+running 'minilzip --version' and 'minilzip -v --check-lib'.
+
+
+File: lzlib.info, Node: Concept index, Prev: Problems, Up: Top
+
+Concept index
+*************
+
+
+* Menu:
+
+* buffer compression: Buffer compression. (line 6)
+* buffer decompression: Buffer decompression. (line 6)
+* buffering: Buffering. (line 6)
+* bugs: Problems. (line 6)
+* compression functions: Compression functions. (line 6)
+* data format: Data format. (line 6)
+* decompression functions: Decompression functions. (line 6)
+* error codes: Error codes. (line 6)
+* error messages: Error messages. (line 6)
+* examples: Examples. (line 6)
+* file compression: File compression. (line 6)
+* file decompression: File decompression. (line 6)
+* getting help: Problems. (line 6)
+* introduction: Introduction. (line 6)
+* invoking: Invoking minilzip. (line 6)
+* library version: Library version. (line 6)
+* multimember compression: File compression mm. (line 6)
+* options: Invoking minilzip. (line 6)
+* parameter limits: Parameter limits. (line 6)
+* skipping data errors: Skipping data errors. (line 6)
+
+
+
+Tag Table:
+Node: Top215
+Node: Introduction1338
+Node: Library version6778
+Node: Buffering9329
+Node: Parameter limits10554
+Node: Compression functions11508
+Ref: member_size13301
+Ref: sync_flush15063
+Node: Decompression functions19751
+Node: Error codes27308
+Node: Error messages29598
+Node: Invoking minilzip30177
+Node: Data format40595
+Ref: coded-dict-size42041
+Node: Examples43446
+Node: Buffer compression44407
+Node: Buffer decompression45927
+Node: File compression47341
+Node: File decompression48324
+Node: File compression mm49328
+Node: Skipping data errors52357
+Node: Problems53662
+Node: Concept index54223
+
+End Tag Table
+
+
+Local Variables:
+coding: iso-8859-15
+End:
diff --git a/doc/lzlib.texi b/doc/lzlib.texi
new file mode 100644
index 0000000..75cb7ba
--- /dev/null
+++ b/doc/lzlib.texi
@@ -0,0 +1,1407 @@
+\input texinfo @c -*-texinfo-*-
+@c %**start of header
+@setfilename lzlib.info
+@documentencoding ISO-8859-15
+@settitle Lzlib Manual
+@finalout
+@c %**end of header
+
+@set UPDATED 20 January 2024
+@set VERSION 1.14
+
+@dircategory Compression
+@direntry
+* Lzlib: (lzlib). Compression library for the lzip format
+@end direntry
+
+
+@ifnothtml
+@titlepage
+@title Lzlib
+@subtitle Compression library for the lzip format
+@subtitle for Lzlib version @value{VERSION}, @value{UPDATED}
+@author by Antonio Diaz Diaz
+
+@page
+@vskip 0pt plus 1filll
+@end titlepage
+
+@contents
+@end ifnothtml
+
+@ifnottex
+@node Top
+@top
+
+This manual is for Lzlib (version @value{VERSION}, @value{UPDATED}).
+
+@menu
+* Introduction:: Purpose and features of lzlib
+* Library version:: Checking library version
+* Buffering:: Sizes of lzlib's buffers
+* Parameter limits:: Min / max values for some parameters
+* Compression functions:: Descriptions of the compression functions
+* Decompression functions:: Descriptions of the decompression functions
+* Error codes:: Meaning of codes returned by functions
+* Error messages:: Error messages corresponding to error codes
+* Invoking minilzip:: Command-line interface of the test program
+* Data format:: Detailed format of the compressed data
+* Examples:: A small tutorial with examples
+* Problems:: Reporting bugs
+* Concept index:: Index of concepts
+@end menu
+
+@sp 1
+Copyright @copyright{} 2009-2024 Antonio Diaz Diaz.
+
+This manual is free documentation: you have unlimited permission to copy,
+distribute, and modify it.
+@end ifnottex
+
+
+@node Introduction
+@chapter Introduction
+@cindex introduction
+
+@uref{http://www.nongnu.org/lzip/lzlib.html,,Lzlib}
+is a data compression library providing in-memory LZMA compression and
+decompression functions, including integrity checking of the decompressed
+data. The compressed data format used by the library is the lzip format.
+Lzlib is written in C.
+
+The lzip file format is designed for data sharing and long-term archiving,
+taking into account both data integrity and decoder availability:
+
+@itemize @bullet
+@item
+The lzip format provides very safe integrity checking and some data
+recovery means. The program
+@uref{http://www.nongnu.org/lzip/manual/lziprecover_manual.html#Data-safety,,lziprecover}
+can repair bit flip errors (one of the most common forms of data corruption)
+in lzip files, and provides data recovery capabilities, including
+error-checked merging of damaged copies of a file.
+@ifnothtml
+@xref{Data safety,,,lziprecover}.
+@end ifnothtml
+
+@item
+The lzip format is as simple as possible (but not simpler). The lzip
+manual provides the source code of a simple decompressor along with a
+detailed explanation of how it works, so that with the help of the lzip
+manual alone it would be possible for a digital archaeologist to extract
+the data from a lzip file long after quantum computers eventually
+render LZMA obsolete.
+
+@item
+Additionally the lzip reference implementation is copylefted, which
+guarantees that it will remain free forever.
+@end itemize
+
+A nice feature of the lzip format is that a corrupt byte is easier to repair
+the nearer it is to the beginning of the file. Therefore, with the help of
+lziprecover, losing an entire archive just because of a corrupt byte near
+the beginning is a thing of the past.
+
+The functions and variables forming the interface of the compression library
+are declared in the file @samp{lzlib.h}. Usage examples of the library are
+given in the files @samp{bbexample.c}, @samp{ffexample.c}, and
+@samp{minilzip.c} from the source distribution.
+
+As @samp{lzlib.h} can be used by C and C++ programs, it must not impose a
+choice of system headers on the program by including one of them. Therefore
+it is the responsibility of the program using lzlib to include before
+@samp{lzlib.h} some header that declares the type @samp{uint8_t}. There are
+at least four such headers in C and C++: @samp{stdint.h}, @samp{cstdint},
+@samp{inttypes.h}, and @samp{cinttypes}.
+
+All the library functions are thread safe. The library does not install any
+signal handler. The decoder checks the consistency of the compressed data,
+so the library should never crash even in case of corrupted input.
+
+Compression/decompression is done by repeatedly calling a pair of
+read/write functions until all the data have been processed by the library.
+This interface is safer and less error prone than the traditional zlib
+interface.
+
+Compression/decompression is done when the read function is called. This
+means the value returned by the position functions is not updated until a
+read call, even if a lot of data are written. If you want the data to be
+compressed in advance, just call the read function with a @var{size} equal
+to 0.
+
+If all the data to be compressed are written in advance, lzlib automatically
+adjusts the header of the compressed data to use the largest dictionary size
+that exceeds neither the data size nor the limit given to
+@samp{LZ_compress_open}. This feature reduces the amount of memory needed for
+decompression and allows minilzip to produce compressed output identical to
+that of lzip.
+
+Lzlib correctly decompresses a data stream which is the concatenation of
+two or more compressed data streams. The result is the concatenation of the
+corresponding decompressed data streams. Integrity testing of concatenated
+compressed data streams is also supported.
+
+Lzlib is able to compress and decompress streams of unlimited size by
+automatically creating multimember output. The members so created are large,
+about @w{2 PiB} each.
+
+In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a
+concrete algorithm; it is more like "any algorithm using the LZMA coding
+scheme". For example, the option @option{-0} of lzip uses the scheme in
+almost the simplest way possible; issuing the longest match it can find, or
+a literal byte if it can't find a match. Conversely, a much more elaborate
+way of finding coding sequences of minimum size than the one currently used
+by lzip could be developed, and the resulting sequence could also be coded
+using the LZMA coding scheme.
+
+Lzlib currently implements two variants of the LZMA algorithm: fast (used by
+option @option{-0} of minilzip) and normal (used by all other compression levels).
+
+The high compression of LZMA comes from combining two basic, well-proven
+compression ideas: sliding dictionaries (LZ77) and Markov models (the thing
+used by every compression algorithm that uses a range encoder or similar
+order-0 entropy coder as its last stage) with segregation of contexts
+according to what the bits are used for.
+
+The ideas embodied in lzlib are due to (at least) the following people:
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
+definition of Markov chains), G.N.N. Martin (for the definition of range
+encoding), Igor Pavlov (for putting all the above together in LZMA), and
+Julian Seward (for bzip2's CLI).
+
+LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never have
+been compressed. Decompressed is used to refer to data which have undergone
+the process of decompression.
+
+
+@node Library version
+@chapter Library version
+@cindex library version
+
+One goal of lzlib is to keep perfect backward compatibility with older
+versions of itself down to 1.0. Any application working with an older lzlib
+should work with a newer lzlib. Installing a newer lzlib should not break
+anything. This chapter describes the constants and functions that the
+application can use to discover the version of the library being used. All
+of them are declared in @samp{lzlib.h}.
+
+@defvr Constant LZ_API_VERSION
+This constant is defined in @samp{lzlib.h} and works as a version test
+macro. The application should check at compile time that LZ_API_VERSION is
+greater than or equal to the version required by the application:
+
+@example
+#if !defined LZ_API_VERSION || LZ_API_VERSION < 1012
+#error "lzlib 1.12 or newer needed."
+#endif
+@end example
+
+Before version 1.8, lzlib didn't define LZ_API_VERSION.@*
+In lzlib 1.8, LZ_API_VERSION was first defined, with the value 1.@*
+Since lzlib 1.12, LZ_API_VERSION is defined as (major * 1000 + minor).
+@end defvr
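+
+As an illustration of the @w{(major * 1000 + minor)} rule, here is a minimal
+sketch in C. The helper @samp{make_api_version} is hypothetical and is not
+part of lzlib; it only shows how a required version such as 1.12 maps to the
+value 1012 used in the compile-time test above.
+
```c
/* Hypothetical helper, not part of lzlib: encodes a version number the way
   lzlib.h defines LZ_API_VERSION since lzlib 1.12 (major * 1000 + minor). */
int make_api_version( const int major, const int minor )
  { return major * 1000 + minor; }
```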
+
+NOTE: Version test macros are the library's way of announcing functionality
+to the application. They should not be confused with feature test macros,
+which allow the application to announce to the library its desire to have
+certain symbols and prototypes exposed.
+
+@deftypefun int LZ_api_version ( void )
+If LZ_API_VERSION >= 1012, this function is declared in @samp{lzlib.h} (else
+it doesn't exist). It returns the LZ_API_VERSION of the library object code
+being used. The application should check at run time that the value
+returned by @code{LZ_api_version} is greater than or equal to the version
+required by the application. An application may be dynamically linked at run
+time with a different version of lzlib than the one it was compiled for, and
+this should not break the application as long as the library used provides
+the functionality required by the application.
+
+@example
+#if defined LZ_API_VERSION && LZ_API_VERSION >= 1012
+ if( LZ_api_version() < 1012 )
+ show_error( "lzlib 1.12 or newer needed." );
+#endif
+@end example
+@end deftypefun
+
+@deftypevr Constant {const char *} LZ_version_string
+This string constant is defined in the header file @samp{lzlib.h} and
+represents the version of the library being used at compile time.
+@end deftypevr
+
+@deftypefun {const char *} LZ_version ( void )
+This function returns a string representing the version of the library being
+used at run time.
+@end deftypefun
+
+
+@node Buffering
+@chapter Buffering
+@cindex buffering
+
+Lzlib internal functions need access to a memory chunk at least as large
+as the dictionary size (sliding window). For efficiency reasons, the
+input buffer for compression is twice or sixteen times as large as the
+dictionary size.
+
+Finally, for safety reasons, lzlib uses two more internal buffers.
+
+These are the four buffers used by lzlib, and their guaranteed minimum sizes:
+
+@itemize @bullet
+@item Input compression buffer. Written to by the function
+@samp{LZ_compress_write}. For the normal variant of LZMA, its size is two
+times the dictionary size set with the function @samp{LZ_compress_open} or
+@w{64 KiB}, whichever is larger. For the fast variant, its size is @w{1 MiB}.
+
+@item Output compression buffer. Read from by the function
+@samp{LZ_compress_read}. Its size is @w{64 KiB}.
+
+@item Input decompression buffer. Written to by the function
+@samp{LZ_decompress_write}. Its size is @w{64 KiB}.
+
+@item Output decompression buffer. Read from by the function
+@samp{LZ_decompress_read}. Its size is the dictionary size set in the header
+of the member currently being decompressed or @w{64 KiB}, whichever is larger.
+@end itemize
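+
+The variable sizes in the list above can be restated as a short sketch in C.
+The two helpers below are hypothetical (lzlib does not export them) and
+merely compute the guaranteed minimum sizes as documented here; the actual
+buffers may be larger.
+
```c
#define KiB 1024ULL
#define MiB ( 1024ULL * 1024ULL )

/* Hypothetical helpers, not part of lzlib; they restate the list above. */

/* Input compression buffer: 1 MiB for the fast variant, else the larger
   of twice the dictionary size and 64 KiB. */
unsigned long long input_compression_buffer_min(
    const unsigned long long dictionary_size, const int fast )
  {
  if( fast ) return 1 * MiB;
  return ( 2 * dictionary_size > 64 * KiB ) ? 2 * dictionary_size : 64 * KiB;
  }

/* Output decompression buffer: the larger of the dictionary size stored in
   the member header and 64 KiB. */
unsigned long long output_decompression_buffer_min(
    const unsigned long long dictionary_size )
  { return ( dictionary_size > 64 * KiB ) ? dictionary_size : 64 * KiB; }
```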
+
+
+@node Parameter limits
+@chapter Parameter limits
+@cindex parameter limits
+
+These functions provide minimum and maximum values for some parameters.
+Current values are shown in square brackets.
+
+@deftypefun int LZ_min_dictionary_bits ( void )
+Returns the base 2 logarithm of the smallest valid dictionary size [12].
+@end deftypefun
+
+@deftypefun int LZ_min_dictionary_size ( void )
+Returns the smallest valid dictionary size [4 KiB].
+@end deftypefun
+
+@deftypefun int LZ_max_dictionary_bits ( void )
+Returns the base 2 logarithm of the largest valid dictionary size [29].
+@end deftypefun
+
+@deftypefun int LZ_max_dictionary_size ( void )
+Returns the largest valid dictionary size [512 MiB].
+@end deftypefun
+
+@deftypefun int LZ_min_match_len_limit ( void )
+Returns the smallest valid match length limit [5].
+@end deftypefun
+
+@deftypefun int LZ_max_match_len_limit ( void )
+Returns the largest valid match length limit [273].
+@end deftypefun
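+
+A range check based on these limits might look like the sketch below. The
+helper names are hypothetical, and the limits are hard-coded from the
+current values shown in square brackets only to keep the example
+self-contained. A real application should obtain them from the functions
+above, because the values may change in a future version of the library.
+
```c
/* Current limits as documented in this chapter; hard-coded here only for
   illustration. Real code should query the library at run time. */
enum { min_dictionary_bits = 12, max_dictionary_bits = 29,
       min_match_len_limit = 5, max_match_len_limit = 273 };

/* Hypothetical helper: returns 1 if 'size' is between 4 KiB and 512 MiB. */
int dictionary_size_in_range( const long long size )
  {
  return size >= ( 1LL << min_dictionary_bits ) &&
         size <= ( 1LL << max_dictionary_bits );
  }

/* Hypothetical helper: returns 1 if 'len' is between 5 and 273. */
int match_len_limit_in_range( const int len )
  { return len >= min_match_len_limit && len <= max_match_len_limit; }
```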
+
+
+@node Compression functions
+@chapter Compression functions
+@cindex compression functions
+
+These are the functions used to compress data. In case of error, all of
+them return -1 or 0, for signed and unsigned return values respectively,
+except @samp{LZ_compress_open} whose return value must be checked by
+calling @samp{LZ_compress_errno} before using it.
+
+
+@deftypefun {struct LZ_Encoder *} LZ_compress_open ( const int @var{dictionary_size}, const int @var{match_len_limit}, const unsigned long long @var{member_size} )
+Initializes the internal stream state for compression and returns a
+pointer that can only be used as the @var{encoder} argument for the
+other LZ_compress functions, or a null pointer if the encoder could not
+be allocated.
+
+The returned pointer must be checked by calling @samp{LZ_compress_errno}
+before using it. If @samp{LZ_compress_errno} does not return @samp{LZ_ok},
+the returned pointer must not be used and should be freed with
+@samp{LZ_compress_close} to avoid memory leaks.
+
+@var{dictionary_size} sets the dictionary size to be used, in bytes.
+Valid values range from @w{4 KiB} to @w{512 MiB}. Note that dictionary
+sizes are quantized. If the size specified does not match one of the
+valid sizes, it is rounded upwards by adding up to
+@w{(@var{dictionary_size} / 8)} to it.
+
+@var{match_len_limit} sets the match length limit in bytes. Valid values
+range from 5 to 273. Larger values usually give better compression ratios
+but longer compression times.
+
+If @var{dictionary_size} is 65535 and @var{match_len_limit} is 16, the fast
+variant of LZMA is chosen, which produces compressed output identical to that
+of @w{@samp{lzip -0}}. (The dictionary size used is rounded upwards to
+@w{64 KiB}).
+
+@anchor{member_size}
+@var{member_size} sets the member size limit in bytes. Valid values range
+from @w{4 KiB} to @w{2 PiB}. A small member size may degrade compression
+ratio, so use it only when needed. To produce a single-member data stream,
+give @var{member_size} a value larger than the amount of data to be
+produced. Values larger than @w{2 PiB} are reduced to @w{2 PiB} to prevent
+the uncompressed size of the member from overflowing.
+@end deftypefun
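+
+The quantization of @var{dictionary_size} can be sketched as follows. This
+is an illustration, not lzlib's own code. It assumes that the valid sizes
+are exactly those representable by the coded dictionary size of the lzip
+format (@pxref{coded-dict-size}): a power of 2 minus 0 to 7 sixteenths of
+that power. The helper @samp{quantize_dictionary_size} is hypothetical and
+expects a size already within the valid range.
+
```c
/* Hypothetical helper, not part of lzlib. Assumes 4 KiB <= size <= 512 MiB.
   Returns the smallest size of the form 2^n - k * 2^(n-4), with k from 0
   to 7, that is not smaller than 'size'. */
unsigned quantize_dictionary_size( const unsigned size )
  {
  int bits = 12;                        /* log2 of the smallest base size */
  unsigned base, wedge;
  int k;
  while( bits < 29 && ( 1U << bits ) < size ) ++bits;
  base = 1U << bits;                    /* smallest power of 2 >= size */
  wedge = base / 16;
  for( k = 7; k >= 1; --k )             /* try the smallest candidates first */
    if( base - k * wedge >= size ) return base - k * wedge;
  return base;
  }
```
+
+Under these assumptions a request of 65535 bytes becomes @w{64 KiB}, as
+stated above for the fast variant, and a request of 65537 bytes becomes
+@w{72 KiB}, an increase of slightly less than one eighth.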
+
+
+@deftypefun int LZ_compress_close ( struct LZ_Encoder * const @var{encoder} )
+Frees all dynamically allocated data structures for this stream. This
+function discards any unprocessed input and does not flush any pending
+output. After a call to @samp{LZ_compress_close}, @var{encoder} can no
+longer be used as an argument to any LZ_compress function.
+It is safe to call @samp{LZ_compress_close} with a null argument.
+@end deftypefun
+
+
+@deftypefun int LZ_compress_finish ( struct LZ_Encoder * const @var{encoder} )
+Use this function to tell @samp{lzlib} that all the data for this member
+have already been written (with the function @samp{LZ_compress_write}).
+It is safe to call @samp{LZ_compress_finish} as many times as needed.
+After all the compressed data have been read with @samp{LZ_compress_read}
+and @samp{LZ_compress_member_finished} returns 1, a new member can be
+started with @samp{LZ_compress_restart_member}.
+@end deftypefun
+
+
+@deftypefun int LZ_compress_restart_member ( struct LZ_Encoder * const @var{encoder}, const unsigned long long @var{member_size} )
+Use this function to start a new member in a multimember data stream. Call
+this function only after @samp{LZ_compress_member_finished} indicates that
+the current member has been fully read (with the function
+@samp{LZ_compress_read}). @xref{member_size}, for a description of
+@var{member_size}.
+@end deftypefun
+
+
+@anchor{sync_flush}
+@deftypefun int LZ_compress_sync_flush ( struct LZ_Encoder * const @var{encoder} )
+Use this function to make available to @samp{LZ_compress_read} all the data
+already written with the function @samp{LZ_compress_write}. First call
+@samp{LZ_compress_sync_flush}. Then call @samp{LZ_compress_read} until it
+returns 0.
+
+This function writes at least one LZMA marker @samp{3} ("Sync Flush" marker)
+to the compressed output. Note that the sync flush marker is not allowed in
+lzip files; it is a device for interactive communication between
+applications using lzlib, but is useless and wasteful in a file, and is
+excluded from the media type @samp{application/lzip}. The LZMA marker
+@samp{2} ("End Of Stream" marker) is the only marker allowed in lzip files.
+@xref{Data format}.
+
+Repeated use of @samp{LZ_compress_sync_flush} may degrade compression
+ratio, so use it only when needed. If the interval between calls to
+@samp{LZ_compress_sync_flush} is large (comparable to dictionary size),
+creating a multimember data stream with @samp{LZ_compress_restart_member}
+may be an alternative.
+
+Combining multimember stream creation with flushing may be tricky. If there
+are more bytes available than those needed to complete @var{member_size},
+@samp{LZ_compress_restart_member} needs to be called when
+@samp{LZ_compress_member_finished} returns 1, followed by a new call to
+@samp{LZ_compress_sync_flush}.
+@end deftypefun
+
+
+@deftypefun int LZ_compress_read ( struct LZ_Encoder * const @var{encoder}, uint8_t * const @var{buffer}, const int @var{size} )
+Reads up to @var{size} bytes from the stream pointed to by @var{encoder},
+storing the results in @var{buffer}. If @w{LZ_API_VERSION >= 1012},
+@var{buffer} may be a null pointer, in which case the bytes read are
+discarded.
+
+Returns the number of bytes actually read. This might be less than
+@var{size}; for example, if there aren't that many bytes left in the stream
+or if more bytes have yet to be written with the function
+@samp{LZ_compress_write}. Note that reading less than @var{size} bytes is
+not an error.
+@end deftypefun
+
+
+@deftypefun int LZ_compress_write ( struct LZ_Encoder * const @var{encoder}, uint8_t * const @var{buffer}, const int @var{size} )
+Writes up to @var{size} bytes from @var{buffer} to the stream pointed to by
+@var{encoder}. Returns the number of bytes actually written. This might be
+less than @var{size}. Note that writing less than @var{size} bytes is not an
+error.
+@end deftypefun
+
+
+@deftypefun int LZ_compress_write_size ( struct LZ_Encoder * const @var{encoder} )
+Returns the maximum number of bytes that can be immediately written through
+@samp{LZ_compress_write}. For efficiency reasons, once the input buffer is
+full and @samp{LZ_compress_write_size} returns 0, almost all the buffer must
+be compressed before a size greater than 0 is returned again. (This is done
+to minimize the amount of data that must be copied to the beginning of the
+buffer before new data can be accepted).
+
+It is guaranteed that an immediate call to @samp{LZ_compress_write} will
+accept a @var{size} up to the returned number of bytes.
+@end deftypefun
+
+
+@deftypefun {enum LZ_Errno} LZ_compress_errno ( struct LZ_Encoder * const @var{encoder} )
+Returns the current error code for @var{encoder}. @xref{Error codes}.
+It is safe to call @samp{LZ_compress_errno} with a null argument, in which
+case it returns @samp{LZ_bad_argument}.
+@end deftypefun
+
+
+@deftypefun int LZ_compress_finished ( struct LZ_Encoder * const @var{encoder} )
+Returns 1 if all the data have been read and @samp{LZ_compress_close}
+can be safely called. Otherwise it returns 0. @samp{LZ_compress_finished}
+implies @samp{LZ_compress_member_finished}.
+@end deftypefun
+
+
+@deftypefun int LZ_compress_member_finished ( struct LZ_Encoder * const @var{encoder} )
+Returns 1 if the current member, in a multimember data stream, has been
+fully read and @samp{LZ_compress_restart_member} can be safely called.
+Otherwise it returns 0.
+@end deftypefun
+
+
+@deftypefun {unsigned long long} LZ_compress_data_position ( struct LZ_Encoder * const @var{encoder} )
+Returns the number of input bytes already compressed in the current member.
+@end deftypefun
+
+
+@deftypefun {unsigned long long} LZ_compress_member_position ( struct LZ_Encoder * const @var{encoder} )
+Returns the number of compressed bytes already produced, but perhaps not
+yet read, in the current member.
+@end deftypefun
+
+
+@deftypefun {unsigned long long} LZ_compress_total_in_size ( struct LZ_Encoder * const @var{encoder} )
+Returns the total number of input bytes already compressed.
+@end deftypefun
+
+
+@deftypefun {unsigned long long} LZ_compress_total_out_size ( struct LZ_Encoder * const @var{encoder} )
+Returns the total number of compressed bytes already produced, but
+perhaps not yet read.
+@end deftypefun
+
+
+@node Decompression functions
+@chapter Decompression functions
+@cindex decompression functions
+
+These are the functions used to decompress data. In case of error, all of
+them return -1 or 0, for signed and unsigned return values respectively,
+except @samp{LZ_decompress_open} whose return value must be checked by
+calling @samp{LZ_decompress_errno} before using it.
+
+
+@deftypefun {struct LZ_Decoder *} LZ_decompress_open ( void )
+Initializes the internal stream state for decompression and returns a
+pointer that can only be used as the @var{decoder} argument for the other
+LZ_decompress functions, or a null pointer if the decoder could not be
+allocated.
+
+The returned pointer must be checked by calling @samp{LZ_decompress_errno}
+before using it. If @samp{LZ_decompress_errno} does not return @samp{LZ_ok},
+the returned pointer must not be used and should be freed with
+@samp{LZ_decompress_close} to avoid memory leaks.
+@end deftypefun
+
+
+@deftypefun int LZ_decompress_close ( struct LZ_Decoder * const @var{decoder} )
+Frees all dynamically allocated data structures for this stream. This
+function discards any unprocessed input and does not flush any pending
+output. After a call to @samp{LZ_decompress_close}, @var{decoder} can no
+longer be used as an argument to any LZ_decompress function.
+It is safe to call @samp{LZ_decompress_close} with a null argument.
+@end deftypefun
+
+
+@deftypefun int LZ_decompress_finish ( struct LZ_Decoder * const @var{decoder} )
+Use this function to tell @samp{lzlib} that all the data for this stream
+have already been written (with the function @samp{LZ_decompress_write}).
+It is safe to call @samp{LZ_decompress_finish} as many times as needed.
+It is not required to call @samp{LZ_decompress_finish} if the input stream
+only contains whole members, but not calling it prevents lzlib from
+detecting a truncated member.
+@end deftypefun
+
+
+@deftypefun int LZ_decompress_reset ( struct LZ_Decoder * const @var{decoder} )
+Resets the internal state of @var{decoder} to what it was just after opening
+it with the function @samp{LZ_decompress_open}. Data stored in the
+internal buffers is discarded. Position counters are set to 0.
+@end deftypefun
+
+
+@deftypefun int LZ_decompress_sync_to_member ( struct LZ_Decoder * const @var{decoder} )
+Resets the error state of @var{decoder} and enters a search state that lasts
+until a new member header (or the end of the stream) is found. After a
+successful call to @samp{LZ_decompress_sync_to_member}, data written with
+@samp{LZ_decompress_write} is consumed and @samp{LZ_decompress_read} returns
+0 until a header is found.
+
+This function is useful to discard any data preceding the first member, or
+to discard the rest of the current member, for example in case of a data
+error. If the decoder is already at the beginning of a member, this function
+does nothing.
+@end deftypefun
+
+
+@deftypefun int LZ_decompress_read ( struct LZ_Decoder * const @var{decoder}, uint8_t * const @var{buffer}, const int @var{size} )
+Reads up to @var{size} bytes from the stream pointed to by @var{decoder},
+storing the results in @var{buffer}. If @w{LZ_API_VERSION >= 1012},
+@var{buffer} may be a null pointer, in which case the bytes read are
+discarded.
+
+Returns the number of bytes actually read. This might be less than
+@var{size}; for example, if there aren't that many bytes left in the stream
+or if more bytes have yet to be written with the function
+@samp{LZ_decompress_write}. Note that reading less than @var{size} bytes is
+not an error.
+
+@samp{LZ_decompress_read} returns at least once per member so that
+@samp{LZ_decompress_member_finished} can be called (and trailer data
+retrieved) for each member, even for empty members. Therefore,
+@samp{LZ_decompress_read} returning 0 does not mean that the end of the
+stream has been reached. The increase in the value returned by
+@samp{LZ_decompress_total_in_size} can be used to tell the end of the stream
+from an empty member.
+
+In case of decompression error caused by corrupt or truncated data,
+@samp{LZ_decompress_read} does not signal the error immediately to the
+application, but waits until all the bytes decoded have been read. This
+allows tools like
+@uref{http://www.nongnu.org/lzip/manual/tarlz_manual.html,,tarlz} to
+recover as much data as possible from each damaged member.
+@ifnothtml
+@xref{Top,tarlz manual,,tarlz}.
+@end ifnothtml
+@end deftypefun
+
+
+@deftypefun int LZ_decompress_write ( struct LZ_Decoder * const @var{decoder}, uint8_t * const @var{buffer}, const int @var{size} )
+Writes up to @var{size} bytes from @var{buffer} to the stream pointed to by
+@var{decoder}. Returns the number of bytes actually written. This might be
+less than @var{size}. Note that writing less than @var{size} bytes is not an
+error.
+@end deftypefun
+
+
+@deftypefun int LZ_decompress_write_size ( struct LZ_Decoder * const @var{decoder} )
+Returns the maximum number of bytes that can be immediately written through
+@samp{LZ_decompress_write}. This number varies smoothly; each compressed
+byte consumed may be overwritten immediately, increasing by 1 the value
+returned.
+
+It is guaranteed that an immediate call to @samp{LZ_decompress_write} will
+accept a @var{size} up to the returned number of bytes.
+@end deftypefun
+
+
+@deftypefun {enum LZ_Errno} LZ_decompress_errno ( struct LZ_Decoder * const @var{decoder} )
+Returns the current error code for @var{decoder}. @xref{Error codes}.
+It is safe to call @samp{LZ_decompress_errno} with a null argument, in which
+case it returns @samp{LZ_bad_argument}.
+@end deftypefun
+
+
+@deftypefun int LZ_decompress_finished ( struct LZ_Decoder * const @var{decoder} )
+Returns 1 if all the data have been read and @samp{LZ_decompress_close}
+can be safely called. Otherwise it returns 0. @samp{LZ_decompress_finished}
+does not imply @samp{LZ_decompress_member_finished}.
+@end deftypefun
+
+
+@deftypefun int LZ_decompress_member_finished ( struct LZ_Decoder * const @var{decoder} )
+Returns 1 if the previous call to @samp{LZ_decompress_read} finished reading
+the current member, indicating that final values for the member are available
+through @samp{LZ_decompress_data_crc}, @samp{LZ_decompress_data_position},
+and @samp{LZ_decompress_member_position}. Otherwise it returns 0.
+@end deftypefun
+
+
+@deftypefun int LZ_decompress_member_version ( struct LZ_Decoder * const @var{decoder} )
+Returns the version of the current member, read from the member header.
+@end deftypefun
+
+
+@deftypefun int LZ_decompress_dictionary_size ( struct LZ_Decoder * const @var{decoder} )
+Returns the dictionary size of the current member, read from the member header.
+@end deftypefun
+
+
+@deftypefun {unsigned} LZ_decompress_data_crc ( struct LZ_Decoder * const @var{decoder} )
+Returns the 32 bit Cyclic Redundancy Check of the data decompressed from
+the current member. The value returned is valid only when
+@samp{LZ_decompress_member_finished} returns 1.
+@end deftypefun
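+
+An application that wants to verify the decompressed data independently can
+compute the CRC itself and compare it with the value returned by
+@samp{LZ_decompress_data_crc}. The sketch below assumes that the check is
+the common CRC32 (reflected polynomial 0xEDB88320, initial value and final
+XOR of 0xFFFFFFFF, as used by Ethernet and zlib); this chapter does not
+state the polynomial, so treat it as an assumption. The function
+@samp{crc32_of} is hypothetical and not part of lzlib.
+
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of lzlib: bitwise CRC32 of a buffer using
   the reflected polynomial 0xEDB88320. Slow but table-free. */
uint32_t crc32_of( const uint8_t * const buffer, const size_t size )
  {
  uint32_t crc = 0xFFFFFFFFU;
  size_t i;
  int bit;
  for( i = 0; i < size; ++i )
    {
    crc ^= buffer[i];
    for( bit = 0; bit < 8; ++bit )
      crc = ( crc & 1 ) ? ( crc >> 1 ) ^ 0xEDB88320U : crc >> 1;
    }
  return crc ^ 0xFFFFFFFFU;
  }
```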
+
+
+@deftypefun {unsigned long long} LZ_decompress_data_position ( struct LZ_Decoder * const @var{decoder} )
+Returns the number of decompressed bytes already produced, but perhaps
+not yet read, in the current member.
+@end deftypefun
+
+
+@deftypefun {unsigned long long} LZ_decompress_member_position ( struct LZ_Decoder * const @var{decoder} )
+Returns the number of input bytes already decompressed in the current member.
+@end deftypefun
+
+
+@deftypefun {unsigned long long} LZ_decompress_total_in_size ( struct LZ_Decoder * const @var{decoder} )
+Returns the total number of input bytes already decompressed.
+@end deftypefun
+
+
+@deftypefun {unsigned long long} LZ_decompress_total_out_size ( struct LZ_Decoder * const @var{decoder} )
+Returns the total number of decompressed bytes already produced, but
+perhaps not yet read.
+@end deftypefun
+
+
+@node Error codes
+@chapter Error codes
+@cindex error codes
+
+Most library functions return -1 to indicate that they have failed. But
+this return value only tells you that an error has occurred. To find out
+what kind of error it was, you need to check the error code by calling
+@samp{LZ_(de)compress_errno}.
+
+Library functions don't change the value returned by
+@samp{LZ_(de)compress_errno} when they succeed; thus, the value returned
+by @samp{LZ_(de)compress_errno} after a successful call is not
+necessarily LZ_ok, and you should not use @samp{LZ_(de)compress_errno}
+to determine whether a call failed. If the call failed, then you can
+examine @samp{LZ_(de)compress_errno}.
+
+The error codes are defined in the header file @samp{lzlib.h}.
+
+@deftypevr Constant {enum LZ_Errno} LZ_ok
+The value of this constant is 0 and is used to indicate that there is no error.
+@end deftypevr
+
+@deftypevr Constant {enum LZ_Errno} LZ_bad_argument
+At least one of the arguments passed to the library function was invalid.
+@end deftypevr
+
+@deftypevr Constant {enum LZ_Errno} LZ_mem_error
+No memory available. The system cannot allocate more virtual memory
+because its capacity is full.
+@end deftypevr
+
+@deftypevr Constant {enum LZ_Errno} LZ_sequence_error
+A library function was called in the wrong order. For example,
+@samp{LZ_compress_restart_member} was called before
+@samp{LZ_compress_member_finished} indicated that the current member was
+finished.
+@end deftypevr
+
+@deftypevr Constant {enum LZ_Errno} LZ_header_error
+An invalid member header (one with the wrong magic bytes) was read. If
+this happens at the end of the data stream it may indicate trailing data.
+@end deftypevr
+
+@deftypevr Constant {enum LZ_Errno} LZ_unexpected_eof
+The end of the data stream was reached in the middle of a member.
+@end deftypevr
+
+@deftypevr Constant {enum LZ_Errno} LZ_data_error
+The data stream is corrupt. If @samp{LZ_decompress_member_position} is 6
+or less, it indicates either a format version not supported, an invalid
+dictionary size, a corrupt header in a multimember data stream, or
+trailing data too similar to a valid lzip header. Lziprecover can be
+used to remove conflicting trailing data from a file.
+@end deftypevr
+
+@deftypevr Constant {enum LZ_Errno} LZ_library_error
+A bug was detected in the library. Please report it. @xref{Problems}.
+@end deftypevr
+
+
+@node Error messages
+@chapter Error messages
+@cindex error messages
+
+@deftypefun {const char *} LZ_strerror ( const enum LZ_Errno @var{lz_errno} )
+Returns the standard error message for a given error code. The messages
+are fairly short; there are no multi-line messages or embedded newlines.
+This function makes it easy for your program to report informative error
+messages about the failure of a library call.
+
+The value of @var{lz_errno} normally comes from a call to
+@samp{LZ_(de)compress_errno}.
+@end deftypefun
+
+
+@node Invoking minilzip
+@chapter Invoking minilzip
+@cindex invoking
+@cindex options
+
+Minilzip is a test program for the compression library lzlib, compatible
+with lzip 1.4 or newer.
+
+@uref{http://www.nongnu.org/lzip/lzip.html,,Lzip}
+is a lossless data compressor with a user interface similar to the one
+of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
+chain-Algorithm' (LZMA) stream format to maximize interoperability. The
+maximum dictionary size is 512 MiB so that any lzip file can be decompressed
+on 32-bit machines. Lzip provides accurate and robust 3-factor integrity
+checking. Lzip can compress about as fast as gzip @w{(lzip -0)} or compress most
+files more than bzip2 @w{(lzip -9)}. Decompression speed is intermediate between
+gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
+perspective. Lzip has been designed, written, and tested with great care to
+replace gzip and bzip2 as the standard general-purpose compressed format for
+Unix-like systems.
+
+@noindent
+The format for running minilzip is:
+
+@example
+minilzip [@var{options}] [@var{files}]
+@end example
+
+@noindent
+If no file names are specified, minilzip compresses (or decompresses) from
+standard input to standard output. A hyphen @samp{-} used as a @var{file}
+argument means standard input. It can be mixed with other @var{files} and is
+read just once, the first time it appears in the command line. Remember to
+prepend @file{./} to any file name beginning with a hyphen, or use @samp{--}.
+
+minilzip supports the following
+@uref{http://www.nongnu.org/arg-parser/manual/arg_parser_manual.html#Argument-syntax,,options}:
+@ifnothtml
+@xref{Argument syntax,,,arg_parser}.
+@end ifnothtml
+
+@table @code
+@item -h
+@itemx --help
+Print an informative help message describing the options and exit.
+
+@item -V
+@itemx --version
+Print the version number of minilzip on the standard output and exit.
+This version number should be included in all bug reports.
+
+@item -a
+@itemx --trailing-error
+Exit with error status 2 if any remaining input is detected after
+decompressing the last member. Such remaining input is usually trailing
+garbage that can be safely ignored.
+
+@item -b @var{bytes}
+@itemx --member-size=@var{bytes}
+When compressing, set the member size limit to @var{bytes}. It is advisable
+to keep members smaller than RAM size so that they can be repaired with
+lziprecover in case of corruption. A small member size may degrade
+compression ratio, so use it only when needed. Valid values range from
+@w{100 kB} to @w{2 PiB}. Defaults to @w{2 PiB}.
+
+@item -c
+@itemx --stdout
+Compress or decompress to standard output; keep input files unchanged. If
+compressing several files, each file is compressed independently. (The
+output consists of a sequence of independently compressed members). This
+option (or @option{-o}) is needed when reading from a named pipe (fifo) or
+from a device. Use it also to recover as much of the decompressed data as
+possible when decompressing a corrupt file. @option{-c} overrides @option{-o}
+and @option{-S}. @option{-c} has no effect when testing.
+
+@item -d
+@itemx --decompress
+Decompress the files specified. The integrity of the files specified is
+checked. If a file does not exist, can't be opened, or the destination file
+already exists and @option{--force} has not been specified, minilzip continues
+decompressing the rest of the files and exits with error status 1. If a file
+fails to decompress, or is a terminal, minilzip exits immediately with error
+status 2 without decompressing the rest of the files. A terminal is
+considered an uncompressed file, and therefore invalid.
+
+@item -f
+@itemx --force
+Force overwrite of output files.
+
+@item -F
+@itemx --recompress
+When compressing, force re-compression of files whose name already has
+the @samp{.lz} or @samp{.tlz} suffix.
+
+@item -k
+@itemx --keep
+Keep (don't delete) input files during compression or decompression.
+
+@item -m @var{bytes}
+@itemx --match-length=@var{bytes}
+When compressing, set the match length limit in bytes. After a match this
+long is found, the search is finished. Valid values range from 5 to 273.
+Larger values usually give better compression ratios but longer compression
+times.
+
+@item -o @var{file}
+@itemx --output=@var{file}
+If @option{-c} has not also been specified, write the (de)compressed output
+to @var{file}; keep input files unchanged. If compressing several files,
+each file is compressed independently. (The output consists of a sequence of
+independently compressed members). This option (or @option{-c}) is needed
+when reading from a named pipe (fifo) or from a device. @w{@option{-o -}} is
+equivalent to @option{-c}. @option{-o} has no effect when testing.
+
+When compressing and splitting the output in volumes, @var{file} is used as
+a prefix, and several files named @samp{@var{file}00001.lz},
+@samp{@var{file}00002.lz}, etc, are created. In this case, only one input
+file is allowed.
+
+@item -q
+@itemx --quiet
+Quiet operation. Suppress all messages.
+
+@item -s @var{bytes}
+@itemx --dictionary-size=@var{bytes}
+When compressing, set the dictionary size limit in bytes. For each file,
+minilzip uses the largest dictionary size that exceeds neither the file
+size nor this limit. Valid values range from @w{4 KiB} to @w{512 MiB}.
+Values 12 to 29 are interpreted as powers of two, meaning 2^12 to 2^29
+bytes. Dictionary sizes are quantized so that they can be coded in just one
+byte (@pxref{coded-dict-size}). If the size specified does not match one of
+the valid sizes, it is rounded upwards by adding up to @w{(@var{bytes} / 8)}
+to it.
+
+For maximum compression you should use a dictionary size limit as large
+as possible, but keep in mind that the decompression memory requirement
+is affected at compression time by the choice of dictionary size limit.
+
+@item -S @var{bytes}
+@itemx --volume-size=@var{bytes}
+When compressing, and @option{-c} has not also been specified, split the
+compressed output into several volume files with names
+@samp{original_name00001.lz}, @samp{original_name00002.lz}, etc, and set the
+volume size limit to @var{bytes}. Input files are kept unchanged. Each
+volume is a complete, maybe multimember, lzip file. A small volume size may
+degrade compression ratio, so use it only when needed. Valid values range
+from @w{100 kB} to @w{4 EiB}.
+
+@item -t
+@itemx --test
+Check integrity of the files specified, but don't decompress them. This
+really performs a trial decompression and throws away the result. Use it
+together with @option{-v} to see information about the files. If a file
+fails the test, does not exist, can't be opened, or is a terminal, minilzip
+continues testing the rest of the files. A final diagnostic is shown at
+verbosity level 1 or higher if any file fails the test when testing multiple
+files.
+
+@item -v
+@itemx --verbose
+Verbose mode.@*
+When compressing, show the compression ratio and size for each file
+processed.@*
+When decompressing or testing, further -v's (up to 4) increase the
+verbosity level, showing status, compression ratio, dictionary size,
+and trailer contents (CRC, data size, member size).
+
+@item -0 .. -9
+Compression level. Set the compression parameters (dictionary size and
+match length limit) as shown in the table below. The default compression
+level is @option{-6}, equivalent to @w{@option{-s8MiB -m36}}. Note that
+@option{-9} can be much slower than @option{-0}. These options have no
+effect when decompressing or testing.
+
+The bidimensional parameter space of LZMA can't be mapped to a linear scale
+optimal for all files. If your files are large, very repetitive, etc, you
+may need to use the options @option{--dictionary-size} and
+@option{--match-length} directly to achieve optimal performance.
+
+If several compression levels or @option{-s} or @option{-m} options are
+given, the last setting is used. For example, @w{@option{-9 -s64MiB}} is
+equivalent to @w{@option{-s64MiB -m273}}.
+
+@multitable {Level} {Dictionary size (-s)} {Match length limit (-m)}
+@item Level @tab Dictionary size (-s) @tab Match length limit (-m)
+@item -0 @tab 64 KiB @tab 16 bytes
+@item -1 @tab 1 MiB @tab 5 bytes
+@item -2 @tab 1.5 MiB @tab 6 bytes
+@item -3 @tab 2 MiB @tab 8 bytes
+@item -4 @tab 3 MiB @tab 12 bytes
+@item -5 @tab 4 MiB @tab 20 bytes
+@item -6 @tab 8 MiB @tab 36 bytes
+@item -7 @tab 16 MiB @tab 68 bytes
+@item -8 @tab 24 MiB @tab 132 bytes
+@item -9 @tab 32 MiB @tab 273 bytes
+@end multitable
+
+@item --fast
+@itemx --best
+Aliases for GNU gzip compatibility.
+
+@item --loose-trailing
+When decompressing or testing, allow trailing data whose first bytes are
+so similar to the magic bytes of a lzip header that they can be confused
+with a corrupt header. Use this option if a file triggers a "corrupt
+header" error and the cause is not actually a corrupt header.
+
+@item --check-lib
+Compare the @uref{#Library-version,,version of lzlib} used to compile
+minilzip with the version actually being used at run time and exit. Report
+any differences found. Exit with error status 1 if differences are found. A
+mismatch may indicate that lzlib is not correctly installed or that a
+different version of lzlib has been installed after compiling the shared
+version of minilzip. Exit with error status 2 if LZ_API_VERSION and
+LZ_version_string don't match. @w{@samp{minilzip -v --check-lib}} shows the
+version of lzlib being used and the value of LZ_API_VERSION (if defined).
+@ifnothtml
+@xref{Library version}.
+@end ifnothtml
+
+@end table
+
+Numbers given as arguments to options may be expressed in decimal,
+hexadecimal, or octal (using the same syntax as integer constants in C++),
+and may be followed by a multiplier and an optional @samp{B} for "byte".
+
+Table of SI and binary prefixes (unit multipliers):
+
+@multitable {Prefix} {kilobyte (10^3 = 1000)} {|} {Prefix} {kibibyte (2^10 = 1024)}
+@item Prefix @tab Value @tab | @tab Prefix @tab Value
+@item k @tab kilobyte (10^3 = 1000) @tab | @tab Ki @tab kibibyte (2^10 = 1024)
+@item M @tab megabyte (10^6) @tab | @tab Mi @tab mebibyte (2^20)
+@item G @tab gigabyte (10^9) @tab | @tab Gi @tab gibibyte (2^30)
+@item T @tab terabyte (10^12) @tab | @tab Ti @tab tebibyte (2^40)
+@item P @tab petabyte (10^15) @tab | @tab Pi @tab pebibyte (2^50)
+@item E @tab exabyte (10^18) @tab | @tab Ei @tab exbibyte (2^60)
+@item Z @tab zettabyte (10^21) @tab | @tab Zi @tab zebibyte (2^70)
+@item Y @tab yottabyte (10^24) @tab | @tab Yi @tab yobibyte (2^80)
+@item R @tab ronnabyte (10^27) @tab | @tab Ri @tab robibyte (2^90)
+@item Q @tab quettabyte (10^30) @tab | @tab Qi @tab quebibyte (2^100)
+@end multitable
+
+@sp 1
+Exit status: 0 for a normal exit, 1 for environmental problems
+(file not found, invalid command-line options, I/O errors, etc), 2 to
+indicate a corrupt or invalid input file, 3 for an internal consistency
+error (e.g., bug) which caused minilzip to panic.
+
+
+@node Data format
+@chapter Data format
+@cindex data format
+
+Perfection is reached, not when there is no longer anything to add, but
+when there is no longer anything to take away.@*
+--- Antoine de Saint-Exupery
+
+@sp 1
+In the diagram below, a box like this:
+
+@verbatim
++---+
+| | <-- the vertical bars might be missing
++---+
+@end verbatim
+
+represents one byte; a box like this:
+
+@verbatim
++==============+
+| |
++==============+
+@end verbatim
+
+represents a variable number of bytes.
+
+@sp 1
+Lzip data consist of one or more independent "members" (compressed data
+sets). The members simply appear one after another in the data stream, with
+no additional information before, between, or after them. Each member can
+encode in compressed form up to @w{16 EiB - 1 byte} of uncompressed data.
+The size of a multimember data stream is unlimited.
+
+Each member has the following structure:
+
+@verbatim
++--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size |
++--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+@end verbatim
+
+All multibyte values are stored in little endian order.
+
+@table @samp
+@item ID string (the "magic" bytes)
+A four byte string, identifying the lzip format, with the value "LZIP"
+(0x4C, 0x5A, 0x49, 0x50).
+
+@item VN (version number, 1 byte)
+Just in case something needs to be modified in the future. 1 for now.
+
+@anchor{coded-dict-size}
+@item DS (coded dictionary size, 1 byte)
+The dictionary size is calculated by taking a power of 2 (the base size)
+and subtracting from it a fraction between 0/16 and 7/16 of the base size.@*
+Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).@*
+Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
+from the base size to obtain the dictionary size.@*
+Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB@*
+Valid values for dictionary size range from 4 KiB to 512 MiB.
+
+@item LZMA stream
+The LZMA stream, finished by an "End Of Stream" marker. Uses default values
+for encoder properties.
+@ifnothtml
+@xref{Stream format,,,lzip},
+@end ifnothtml
+@ifhtml
+See
+@uref{http://www.nongnu.org/lzip/manual/lzip_manual.html#Stream-format,,Stream format}
+@end ifhtml
+for a complete description.@*
+Lzip only uses the LZMA marker @samp{2} ("End Of Stream" marker). Lzlib
+also uses the LZMA marker @samp{3} ("Sync Flush" marker). @xref{sync_flush}.
+
+@item CRC32 (4 bytes)
+Cyclic Redundancy Check (CRC) of the original uncompressed data.
+
+@item Data size (8 bytes)
+Size of the original uncompressed data.
+
+@item Member size (8 bytes)
+Total size of the member, including header and trailer. This field acts
+as a distributed index, improves the checking of stream integrity, and
+facilitates the safe recovery of undamaged members from multimember files.
+Lzip limits the member size to @w{2 PiB} to prevent the data size field from
+overflowing.
+
+@end table
+
+
+@node Examples
+@chapter A small tutorial with examples
+@cindex examples
+
+This chapter provides real code examples for the most common uses of the
+library. See these examples in context in the files @samp{bbexample.c} and
+@samp{ffexample.c} from the source distribution of lzlib.
+
+Note that the interface of lzlib is symmetrical. That is, the code for
+normal compression and decompression is identical except that one calls
+LZ_compress* functions while the other calls LZ_decompress* functions.
+
+@menu
+* Buffer compression:: Buffer-to-buffer single-member compression
+* Buffer decompression:: Buffer-to-buffer decompression
+* File compression:: File-to-file single-member compression
+* File decompression:: File-to-file decompression
+* File compression mm:: File-to-file multimember compression
+* Skipping data errors:: Decompression with automatic resynchronization
+@end menu
+
+
+@node Buffer compression
+@section Buffer compression
+@cindex buffer compression
+
+Buffer-to-buffer single-member compression
+@w{(@var{member_size} > total output)}.
+
+@verbatim
+/* Compress 'insize' bytes from 'inbuf' to 'outbuf'.
+ Return the size of the compressed data in '*outlenp'.
+ In case of error, or if 'outsize' is too small, return false and do not
+ modify '*outlenp'.
+*/
+bool bbcompress( const uint8_t * const inbuf, const int insize,
+ const int dictionary_size, const int match_len_limit,
+ uint8_t * const outbuf, const int outsize,
+ int * const outlenp )
+ {
+ int inpos = 0, outpos = 0;
+ bool error = false;
+ struct LZ_Encoder * const encoder =
+ LZ_compress_open( dictionary_size, match_len_limit, INT64_MAX );
+ if( !encoder || LZ_compress_errno( encoder ) != LZ_ok )
+ { LZ_compress_close( encoder ); return false; }
+
+ while( true )
+ {
+ int ret = LZ_compress_write( encoder, inbuf + inpos, insize - inpos );
+ if( ret < 0 ) { error = true; break; }
+ inpos += ret;
+ if( inpos >= insize ) LZ_compress_finish( encoder );
+ ret = LZ_compress_read( encoder, outbuf + outpos, outsize - outpos );
+ if( ret < 0 ) { error = true; break; }
+ outpos += ret;
+ if( LZ_compress_finished( encoder ) == 1 ) break;
+ if( outpos >= outsize ) { error = true; break; }
+ }
+
+ if( LZ_compress_close( encoder ) < 0 ) error = true;
+ if( error ) return false;
+ *outlenp = outpos;
+ return true;
+ }
+@end verbatim
+
+
+@node Buffer decompression
+@section Buffer decompression
+@cindex buffer decompression
+
+Buffer-to-buffer decompression.
+
+@verbatim
+/* Decompress 'insize' bytes from 'inbuf' to 'outbuf'.
+ Return the size of the decompressed data in '*outlenp'.
+ In case of error, or if 'outsize' is too small, return false and do not
+ modify '*outlenp'.
+*/
+bool bbdecompress( const uint8_t * const inbuf, const int insize,
+ uint8_t * const outbuf, const int outsize,
+ int * const outlenp )
+ {
+ int inpos = 0, outpos = 0;
+ bool error = false;
+ struct LZ_Decoder * const decoder = LZ_decompress_open();
+ if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok )
+ { LZ_decompress_close( decoder ); return false; }
+
+ while( true )
+ {
+ int ret = LZ_decompress_write( decoder, inbuf + inpos, insize - inpos );
+ if( ret < 0 ) { error = true; break; }
+ inpos += ret;
+ if( inpos >= insize ) LZ_decompress_finish( decoder );
+ ret = LZ_decompress_read( decoder, outbuf + outpos, outsize - outpos );
+ if( ret < 0 ) { error = true; break; }
+ outpos += ret;
+ if( LZ_decompress_finished( decoder ) == 1 ) break;
+ if( outpos >= outsize ) { error = true; break; }
+ }
+
+ if( LZ_decompress_close( decoder ) < 0 ) error = true;
+ if( error ) return false;
+ *outlenp = outpos;
+ return true;
+ }
+@end verbatim
+
+
+@node File compression
+@section File compression
+@cindex file compression
+
+File-to-file compression using LZ_compress_write_size.
+
+@verbatim
+int ffcompress( struct LZ_Encoder * const encoder,
+ FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384 };
+ uint8_t buffer[buffer_size];
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_compress_write_size( encoder ) );
+ if( size > 0 )
+ {
+ len = fread( buffer, 1, size, infile );
+ ret = LZ_compress_write( encoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) ) LZ_compress_finish( encoder );
+ }
+ ret = LZ_compress_read( encoder, buffer, buffer_size );
+ if( ret < 0 ) break;
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_compress_finished( encoder ) == 1 ) return 0;
+ }
+ return 1;
+ }
+@end verbatim
+
+
+@node File decompression
+@section File decompression
+@cindex file decompression
+
+File-to-file decompression using LZ_decompress_write_size.
+
+@verbatim
+int ffdecompress( struct LZ_Decoder * const decoder,
+ FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384 };
+ uint8_t buffer[buffer_size];
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_decompress_write_size( decoder ) );
+ if( size > 0 )
+ {
+ len = fread( buffer, 1, size, infile );
+ ret = LZ_decompress_write( decoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) ) LZ_decompress_finish( decoder );
+ }
+ ret = LZ_decompress_read( decoder, buffer, buffer_size );
+ if( ret < 0 ) break;
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_decompress_finished( decoder ) == 1 ) return 0;
+ }
+ return 1;
+ }
+@end verbatim
+
+
+@node File compression mm
+@section File-to-file multimember compression
+@cindex multimember compression
+
+Example 1: Multimember compression with members of fixed size
+@w{(@var{member_size} < total output)}.
+
+@verbatim
+int ffmmcompress( FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384, member_size = 4096 };
+ uint8_t buffer[buffer_size];
+ bool done = false;
+ struct LZ_Encoder * const encoder =
+ LZ_compress_open( 65535, 16, member_size );
+ if( !encoder || LZ_compress_errno( encoder ) != LZ_ok )
+ { fputs( "ffexample: Not enough memory.\n", stderr );
+ LZ_compress_close( encoder ); return 1; }
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_compress_write_size( encoder ) );
+ if( size > 0 )
+ {
+ len = fread( buffer, 1, size, infile );
+ ret = LZ_compress_write( encoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) ) LZ_compress_finish( encoder );
+ }
+ ret = LZ_compress_read( encoder, buffer, buffer_size );
+ if( ret < 0 ) break;
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_compress_member_finished( encoder ) == 1 )
+ {
+ if( LZ_compress_finished( encoder ) == 1 ) { done = true; break; }
+ if( LZ_compress_restart_member( encoder, member_size ) < 0 ) break;
+ }
+ }
+ if( LZ_compress_close( encoder ) < 0 ) done = false;
+ return done;
+ }
+@end verbatim
+
+@sp 1
+@noindent
+Example 2: Multimember compression (user-restarted members).
+(Call LZ_compress_open with @var{member_size} > largest member).
+
+@verbatim
+/* Compress 'infile' to 'outfile' as a multimember stream with one member
+ for each line of text terminated by a newline character or by EOF.
+ Return 0 on success, 1 on error.
+*/
+int fflfcompress( struct LZ_Encoder * const encoder,
+ FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384 };
+ uint8_t buffer[buffer_size];
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_compress_write_size( encoder ) );
+ if( size > 0 )
+ {
+ for( len = 0; len < size; )
+ {
+ int ch = getc( infile );
+ if( ch == EOF || ( buffer[len++] = ch ) == '\n' ) break;
+ }
+ /* avoid writing an empty member to outfile */
+ if( len == 0 && LZ_compress_data_position( encoder ) == 0 ) return 0;
+ ret = LZ_compress_write( encoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) || buffer[len-1] == '\n' )
+ LZ_compress_finish( encoder );
+ }
+ ret = LZ_compress_read( encoder, buffer, buffer_size );
+ if( ret < 0 ) break;
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_compress_member_finished( encoder ) == 1 )
+ {
+ if( feof( infile ) && LZ_compress_finished( encoder ) == 1 ) return 0;
+ if( LZ_compress_restart_member( encoder, INT64_MAX ) < 0 ) break;
+ }
+ }
+ return 1;
+ }
+@end verbatim
+
+
+@node Skipping data errors
+@section Skipping data errors
+@cindex skipping data errors
+
+@verbatim
+/* Decompress 'infile' to 'outfile' with automatic resynchronization to
+ next member in case of data error, including the automatic removal of
+ leading garbage.
+*/
+int ffrsdecompress( struct LZ_Decoder * const decoder,
+ FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384 };
+ uint8_t buffer[buffer_size];
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_decompress_write_size( decoder ) );
+ if( size > 0 )
+ {
+ len = fread( buffer, 1, size, infile );
+ ret = LZ_decompress_write( decoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) ) LZ_decompress_finish( decoder );
+ }
+ ret = LZ_decompress_read( decoder, buffer, buffer_size );
+ if( ret < 0 )
+ {
+ if( LZ_decompress_errno( decoder ) == LZ_header_error ||
+ LZ_decompress_errno( decoder ) == LZ_data_error )
+ { LZ_decompress_sync_to_member( decoder ); continue; }
+ break;
+ }
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_decompress_finished( decoder ) == 1 ) return 0;
+ }
+ return 1;
+ }
+@end verbatim
+
+
+@node Problems
+@chapter Reporting bugs
+@cindex bugs
+@cindex getting help
+
+There are probably bugs in lzlib. There are certainly errors and
+omissions in this manual. If you report them, they will get fixed. If
+you don't, no one will ever know about them and they will remain unfixed
+for all eternity, if not longer.
+
+If you find a bug in lzlib, please send electronic mail to
+@email{lzip-bug@@nongnu.org}. Include the version number, which you can
+find by running @w{@samp{minilzip --version}} and
+@w{@samp{minilzip -v --check-lib}}.
+
+
+@node Concept index
+@unnumbered Concept index
+
+@printindex cp
+
+@bye
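The coded dictionary size (DS) byte described in the "Data format" chapter of doc/lzlib.texi above can be decoded with a few bit operations. The following is a minimal sketch, not code taken from lzlib: the name `decode_dict_size` is hypothetical, and returning 0 for any byte whose result falls outside the documented 4 KiB to 512 MiB range is a choice made for this sketch (a real decoder may treat such bytes differently).

```c
#include <stdint.h>

/* Hypothetical helper, not part of the lzlib API.
   Decode the coded dictionary size (DS) byte of a lzip header.
   Return the dictionary size in bytes, or 0 if the result is not in the
   documented range of 4 KiB to 512 MiB (an assumption of this sketch). */
static unsigned decode_dict_size( const uint8_t ds )
  {
  const int bits = ds & 0x1F;          /* bits 4-0: log2 of the base size */
  const unsigned numerator = ds >> 5;  /* bits 7-5: sixteenths to subtract */
  if( bits < 12 || bits > 29 ) return 0;
  const unsigned base = 1U << bits;
  const unsigned size = base - numerator * ( base / 16 );
  if( size < ( 1U << 12 ) ) return 0;  /* below 4 KiB: rejected here */
  return size;
  }
```

With the manual's own example, `decode_dict_size( 0xD3 )` yields 2^19 - 6 * 2^15 = 327680 bytes (320 KiB).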
diff --git a/doc/minilzip.1 b/doc/minilzip.1
new file mode 100644
index 0000000..3532520
--- /dev/null
+++ b/doc/minilzip.1
@@ -0,0 +1,136 @@
+.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.49.2.
+.TH MINILZIP "1" "January 2024" "minilzip 1.14" "User Commands"
+.SH NAME
+minilzip \- reduces the size of files
+.SH SYNOPSIS
+.B minilzip
+[\fI\,options\/\fR] [\fI\,files\/\fR]
+.SH DESCRIPTION
+Minilzip is a test program for the compression library lzlib, compatible
+with lzip 1.4 or newer.
+.PP
+Lzip is a lossless data compressor with a user interface similar to the one
+of gzip or bzip2. Lzip uses a simplified form of the 'Lempel\-Ziv\-Markov
+chain\-Algorithm' (LZMA) stream format to maximize interoperability. The
+maximum dictionary size is 512 MiB so that any lzip file can be decompressed
+on 32\-bit machines. Lzip provides accurate and robust 3\-factor integrity
+checking. Lzip can compress about as fast as gzip (lzip \fB\-0\fR) or compress most
+files more than bzip2 (lzip \fB\-9\fR). Decompression speed is intermediate between
+gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
+perspective. Lzip has been designed, written, and tested with great care to
+replace gzip and bzip2 as the standard general\-purpose compressed format for
+Unix\-like systems.
+.SH OPTIONS
+.TP
+\fB\-h\fR, \fB\-\-help\fR
+display this help and exit
+.TP
+\fB\-V\fR, \fB\-\-version\fR
+output version information and exit
+.TP
+\fB\-a\fR, \fB\-\-trailing\-error\fR
+exit with error status if trailing data
+.TP
+\fB\-b\fR, \fB\-\-member\-size=\fR<bytes>
+set member size limit in bytes
+.TP
+\fB\-c\fR, \fB\-\-stdout\fR
+write to standard output, keep input files
+.TP
+\fB\-d\fR, \fB\-\-decompress\fR
+decompress, test compressed file integrity
+.TP
+\fB\-f\fR, \fB\-\-force\fR
+overwrite existing output files
+.TP
+\fB\-F\fR, \fB\-\-recompress\fR
+force re\-compression of compressed files
+.TP
+\fB\-k\fR, \fB\-\-keep\fR
+keep (don't delete) input files
+.TP
+\fB\-m\fR, \fB\-\-match\-length=\fR<bytes>
+set match length limit in bytes [36]
+.TP
+\fB\-o\fR, \fB\-\-output=\fR<file>
+write to <file>, keep input files
+.TP
+\fB\-q\fR, \fB\-\-quiet\fR
+suppress all messages
+.TP
+\fB\-s\fR, \fB\-\-dictionary\-size=\fR<bytes>
+set dictionary size limit in bytes [8 MiB]
+.TP
+\fB\-S\fR, \fB\-\-volume\-size=\fR<bytes>
+set volume size limit in bytes
+.TP
+\fB\-t\fR, \fB\-\-test\fR
+test compressed file integrity
+.TP
+\fB\-v\fR, \fB\-\-verbose\fR
+be verbose (a 2nd \fB\-v\fR gives more)
+.TP
+\fB\-0\fR .. \fB\-9\fR
+set compression level [default 6]
+.TP
+\fB\-\-fast\fR
+alias for \fB\-0\fR
+.TP
+\fB\-\-best\fR
+alias for \fB\-9\fR
+.TP
+\fB\-\-loose\-trailing\fR
+allow trailing data seeming corrupt header
+.TP
+\fB\-\-check\-lib\fR
+compare version of lzlib.h with liblz.{a,so}
+.PP
+If no file names are given, or if a file is '\-', minilzip compresses or
+decompresses from standard input to standard output.
+Numbers may be followed by a multiplier: k = kB = 10^3 = 1000,
+Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc...
+Dictionary sizes 12 to 29 are interpreted as powers of two, meaning 2^12 to
+2^29 bytes.
+.PP
+The bidimensional parameter space of LZMA can't be mapped to a linear scale
+optimal for all files. If your files are large, very repetitive, etc, you
+may need to use the options \fB\-\-dictionary\-size\fR and \fB\-\-match\-length\fR directly
+to achieve optimal performance.
+.PP
+To extract all the files from archive 'foo.tar.lz', use the commands
+\&'tar \fB\-xf\fR foo.tar.lz' or 'minilzip \fB\-cd\fR foo.tar.lz | tar \fB\-xf\fR \-'.
+.PP
+Exit status: 0 for a normal exit, 1 for environmental problems
+(file not found, invalid command\-line options, I/O errors, etc), 2 to
+indicate a corrupt or invalid input file, 3 for an internal consistency
+error (e.g., bug) which caused minilzip to panic.
+.PP
+The ideas embodied in lzlib are due to (at least) the following people:
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
+definition of Markov chains), G.N.N. Martin (for the definition of range
+encoding), Igor Pavlov (for putting all the above together in LZMA), and
+Julian Seward (for bzip2's CLI).
+.SH "REPORTING BUGS"
+Report bugs to lzip\-bug@nongnu.org
+.br
+Lzlib home page: http://www.nongnu.org/lzip/lzlib.html
+.SH COPYRIGHT
+Copyright \(co 2024 Antonio Diaz Diaz.
+Using lzlib 1.14
+Using LZ_API_VERSION = 1014
+License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
+.br
+This is free software: you are free to change and redistribute it.
+There is NO WARRANTY, to the extent permitted by law.
+.SH "SEE ALSO"
+The full documentation for
+.B minilzip
+is maintained as a Texinfo manual. If the
+.B info
+and
+.B minilzip
+programs are properly installed at your site, the command
+.IP
+.B info lzlib
+.PP
+should give you access to the complete manual.
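The man page above states that numbers may be followed by a multiplier (k, Ki, M, Mi, G, Gi, ...) and that dictionary sizes 12 to 29 are interpreted as powers of two. A rough sketch of how such an argument could be converted to bytes follows. It is an illustration only: `parse_dictionary_size` is a hypothetical name, it is not minilzip's actual parser, and it handles only the k/Ki, M/Mi, and G/Gi multipliers.

```c
#include <stdlib.h>

/* Hypothetical helper (not minilzip's real parser).
   Convert an argument such as "8MiB", "64Ki", "1M", or "23" to a
   dictionary size in bytes. Bare values 12 to 29 mean 2^12 to 2^29.
   Return -1 if the argument is malformed or outside 4 KiB to 512 MiB. */
static long long parse_dictionary_size( const char * const arg )
  {
  const long long min_size = 1LL << 12, max_size = 1LL << 29;
  char * tail;
  long long n = strtoll( arg, &tail, 0 );  /* base 0: decimal, hex, octal */
  if( tail == arg || n < 0 || n > max_size ) return -1;
  if( *tail == 0 )
    { if( n >= 12 && n <= 29 ) n = 1LL << n; }  /* power of two */
  else
    {
    const char prefix = *tail++;
    int exponent;
    switch( prefix )
      {
      case 'k': case 'K': exponent = 1; break;
      case 'M': exponent = 2; break;
      case 'G': exponent = 3; break;
      default: return -1;
      }
    const int binary = ( *tail == 'i' );
    if( binary ) ++tail;
    /* 'k' is the SI prefix; 'K' is only valid as part of "Ki" */
    if( ( prefix == 'k' && binary ) || ( prefix == 'K' && !binary ) )
      return -1;
    if( *tail == 'B' ) ++tail;               /* optional "B" for byte */
    if( *tail != 0 ) return -1;
    /* n <= 2^29 here, so n * 1024^3 cannot overflow a long long */
    while( exponent-- > 0 ) n *= binary ? 1024 : 1000;
    }
  return ( n >= min_size && n <= max_size ) ? n : -1;
  }
```

Under these assumptions, "23" and "8MiB" both map to 8388608 bytes, matching the documented default of 8 MiB for level -6.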
diff --git a/encoder.c b/encoder.c
new file mode 100644
index 0000000..c6190ea
--- /dev/null
+++ b/encoder.c
@@ -0,0 +1,587 @@
+/* Lzlib - Compression library for the lzip format
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+static int LZe_get_match_pairs( struct LZ_encoder * const e, struct Pair * pairs )
+ {
+ int32_t * ptr0 = e->eb.mb.pos_array + ( e->eb.mb.cyclic_pos << 1 );
+ int32_t * ptr1 = ptr0 + 1;
+ int len_limit = e->match_len_limit;
+ if( len_limit > Mb_available_bytes( &e->eb.mb ) )
+ {
+ e->been_flushed = true;
+ len_limit = Mb_available_bytes( &e->eb.mb );
+ if( len_limit < 4 ) { *ptr0 = *ptr1 = 0; return 0; }
+ }
+
+ int maxlen = 3; /* only used if pairs != 0 */
+ int num_pairs = 0;
+ const int min_pos = ( e->eb.mb.pos > e->eb.mb.dictionary_size ) ?
+ e->eb.mb.pos - e->eb.mb.dictionary_size : 0;
+ const uint8_t * const data = Mb_ptr_to_current_pos( &e->eb.mb );
+
+ unsigned tmp = crc32[data[0]] ^ data[1];
+ const int key2 = tmp & ( num_prev_positions2 - 1 );
+ tmp ^= (unsigned)data[2] << 8;
+ const int key3 = num_prev_positions2 + ( tmp & ( num_prev_positions3 - 1 ) );
+ const int key4 = num_prev_positions2 + num_prev_positions3 +
+ ( ( tmp ^ ( crc32[data[3]] << 5 ) ) & e->eb.mb.key4_mask );
+
+ if( pairs )
+ {
+ const int np2 = e->eb.mb.prev_positions[key2];
+ const int np3 = e->eb.mb.prev_positions[key3];
+ if( np2 > min_pos && e->eb.mb.buffer[np2-1] == data[0] )
+ {
+ pairs[0].dis = e->eb.mb.pos - np2;
+ pairs[0].len = maxlen = 2 + ( np2 == np3 );
+ num_pairs = 1;
+ }
+ if( np2 != np3 && np3 > min_pos && e->eb.mb.buffer[np3-1] == data[0] )
+ {
+ maxlen = 3;
+ pairs[num_pairs++].dis = e->eb.mb.pos - np3;
+ }
+ if( num_pairs > 0 )
+ {
+ const int delta = pairs[num_pairs-1].dis + 1;
+ while( maxlen < len_limit && data[maxlen-delta] == data[maxlen] )
+ ++maxlen;
+ pairs[num_pairs-1].len = maxlen;
+ if( maxlen < 3 ) maxlen = 3;
+ if( maxlen >= len_limit ) pairs = 0; /* done. now just skip */
+ }
+ }
+
+ const int pos1 = e->eb.mb.pos + 1;
+ e->eb.mb.prev_positions[key2] = pos1;
+ e->eb.mb.prev_positions[key3] = pos1;
+ int newpos1 = e->eb.mb.prev_positions[key4];
+ e->eb.mb.prev_positions[key4] = pos1;
+
+ int len = 0, len0 = 0, len1 = 0;
+
+ int count;
+ for( count = e->cycles; ; )
+ {
+ if( newpos1 <= min_pos || --count < 0 ) { *ptr0 = *ptr1 = 0; break; }
+
+ if( e->been_flushed ) len = 0;
+ const int delta = pos1 - newpos1;
+ int32_t * const newptr = e->eb.mb.pos_array +
+ ( ( e->eb.mb.cyclic_pos - delta +
+ ( (e->eb.mb.cyclic_pos >= delta) ? 0 : e->eb.mb.dictionary_size + 1 ) ) << 1 );
+ if( data[len-delta] == data[len] )
+ {
+ while( ++len < len_limit && data[len-delta] == data[len] ) {}
+ if( pairs && maxlen < len )
+ {
+ pairs[num_pairs].dis = delta - 1;
+ pairs[num_pairs].len = maxlen = len;
+ ++num_pairs;
+ }
+ if( len >= len_limit )
+ {
+ *ptr0 = newptr[0];
+ *ptr1 = newptr[1];
+ break;
+ }
+ }
+ if( data[len-delta] < data[len] )
+ {
+ *ptr0 = newpos1;
+ ptr0 = newptr + 1;
+ newpos1 = *ptr0;
+ len0 = len; if( len1 < len ) len = len1;
+ }
+ else
+ {
+ *ptr1 = newpos1;
+ ptr1 = newptr;
+ newpos1 = *ptr1;
+ len1 = len; if( len0 < len ) len = len0;
+ }
+ }
+ return num_pairs;
+ }
+
+
+static void LZe_update_distance_prices( struct LZ_encoder * const e )
+ {
+ int dis, len_state;
+ for( dis = start_dis_model; dis < modeled_distances; ++dis )
+ {
+ const int dis_slot = dis_slots[dis];
+ const int direct_bits = ( dis_slot >> 1 ) - 1;
+ const int base = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
+ const int price = price_symbol_reversed( e->eb.bm_dis + ( base - dis_slot ),
+ dis - base, direct_bits );
+ for( len_state = 0; len_state < len_states; ++len_state )
+ e->dis_prices[len_state][dis] = price;
+ }
+
+ for( len_state = 0; len_state < len_states; ++len_state )
+ {
+ int * const dsp = e->dis_slot_prices[len_state];
+ const Bit_model * const bmds = e->eb.bm_dis_slot[len_state];
+ int slot = 0;
+ for( ; slot < end_dis_model; ++slot )
+ dsp[slot] = price_symbol6( bmds, slot );
+ for( ; slot < e->num_dis_slots; ++slot )
+ dsp[slot] = price_symbol6( bmds, slot ) +
+ (((( slot >> 1 ) - 1 ) - dis_align_bits ) << price_shift_bits );
+
+ int * const dp = e->dis_prices[len_state];
+ for( dis = 0; dis < start_dis_model; ++dis )
+ dp[dis] = dsp[dis];
+ for( ; dis < modeled_distances; ++dis )
+ dp[dis] += dsp[dis_slots[dis]];
+ }
+ }
+
+
+/* Return the number of bytes advanced (ahead).
+ trials[0]..trials[ahead-1] contain the steps to encode.
+ ( trials[0].dis4 == -1 ) means literal.
+ A match/rep of length >= match_len_limit finishes the sequence.
+*/
+static int LZe_sequence_optimizer( struct LZ_encoder * const e,
+ const int reps[num_rep_distances],
+ const State state )
+ {
+ int num_pairs, num_trials;
+ int i, rep, len;
+
+ if( e->pending_num_pairs > 0 ) /* from previous call */
+ {
+ num_pairs = e->pending_num_pairs;
+ e->pending_num_pairs = 0;
+ }
+ else
+ num_pairs = LZe_read_match_distances( e );
+ const int main_len = ( num_pairs > 0 ) ? e->pairs[num_pairs-1].len : 0;
+
+ int replens[num_rep_distances];
+ int rep_index = 0;
+ for( i = 0; i < num_rep_distances; ++i )
+ {
+ replens[i] = Mb_true_match_len( &e->eb.mb, 0, reps[i] + 1 );
+ if( replens[i] > replens[rep_index] ) rep_index = i;
+ }
+ if( replens[rep_index] >= e->match_len_limit )
+ {
+ e->trials[0].price = replens[rep_index];
+ e->trials[0].dis4 = rep_index;
+ if( !LZe_move_and_update( e, replens[rep_index] ) ) return 0;
+ return replens[rep_index];
+ }
+
+ if( main_len >= e->match_len_limit )
+ {
+ e->trials[0].price = main_len;
+ e->trials[0].dis4 = e->pairs[num_pairs-1].dis + num_rep_distances;
+ if( !LZe_move_and_update( e, main_len ) ) return 0;
+ return main_len;
+ }
+
+ const int pos_state = Mb_data_position( &e->eb.mb ) & pos_state_mask;
+ const uint8_t prev_byte = Mb_peek( &e->eb.mb, 1 );
+ const uint8_t cur_byte = Mb_peek( &e->eb.mb, 0 );
+ const uint8_t match_byte = Mb_peek( &e->eb.mb, reps[0] + 1 );
+
+ e->trials[1].price = price0( e->eb.bm_match[state][pos_state] );
+ if( St_is_char( state ) )
+ e->trials[1].price += LZeb_price_literal( &e->eb, prev_byte, cur_byte );
+ else
+ e->trials[1].price += LZeb_price_matched( &e->eb, prev_byte, cur_byte, match_byte );
+ e->trials[1].dis4 = -1; /* literal */
+
+ const int match_price = price1( e->eb.bm_match[state][pos_state] );
+ const int rep_match_price = match_price + price1( e->eb.bm_rep[state] );
+
+ if( match_byte == cur_byte )
+ Tr_update( &e->trials[1], rep_match_price +
+ LZeb_price_shortrep( &e->eb, state, pos_state ), 0, 0 );
+
+ num_trials = max( main_len, replens[rep_index] );
+
+ if( num_trials < min_match_len )
+ {
+ e->trials[0].price = 1;
+ e->trials[0].dis4 = e->trials[1].dis4;
+ if( !Mb_move_pos( &e->eb.mb ) ) return 0;
+ return 1;
+ }
+
+ e->trials[0].state = state;
+ for( i = 0; i < num_rep_distances; ++i )
+ e->trials[0].reps[i] = reps[i];
+
+ for( len = min_match_len; len <= num_trials; ++len )
+ e->trials[len].price = infinite_price;
+
+ for( rep = 0; rep < num_rep_distances; ++rep )
+ {
+ if( replens[rep] < min_match_len ) continue;
+ const int price = rep_match_price + LZeb_price_rep( &e->eb, rep, state, pos_state );
+ for( len = min_match_len; len <= replens[rep]; ++len )
+ Tr_update( &e->trials[len], price +
+ Lp_price( &e->rep_len_prices, len, pos_state ), rep, 0 );
+ }
+
+ if( main_len > replens[0] )
+ {
+ const int normal_match_price = match_price + price0( e->eb.bm_rep[state] );
+ int i = 0, len = max( replens[0] + 1, min_match_len );
+ while( len > e->pairs[i].len ) ++i;
+ while( true )
+ {
+ const int dis = e->pairs[i].dis;
+ Tr_update( &e->trials[len], normal_match_price +
+ LZe_price_pair( e, dis, len, pos_state ),
+ dis + num_rep_distances, 0 );
+ if( ++len > e->pairs[i].len && ++i >= num_pairs ) break;
+ }
+ }
+
+ int cur = 0;
+ while( true ) /* price optimization loop */
+ {
+ if( !Mb_move_pos( &e->eb.mb ) ) return 0;
+ if( ++cur >= num_trials ) /* no more initialized trials */
+ {
+ LZe_backward( e, cur );
+ return cur;
+ }
+
+ const int num_pairs = LZe_read_match_distances( e );
+ const int newlen = ( num_pairs > 0 ) ? e->pairs[num_pairs-1].len : 0;
+ if( newlen >= e->match_len_limit )
+ {
+ e->pending_num_pairs = num_pairs;
+ LZe_backward( e, cur );
+ return cur;
+ }
+
+ /* give final values to current trial */
+ struct Trial * cur_trial = &e->trials[cur];
+ State cur_state;
+ {
+ const int dis4 = cur_trial->dis4;
+ int prev_index = cur_trial->prev_index;
+ const int prev_index2 = cur_trial->prev_index2;
+
+ if( prev_index2 == single_step_trial )
+ {
+ cur_state = e->trials[prev_index].state;
+ if( prev_index + 1 == cur ) /* len == 1 */
+ {
+ if( dis4 == 0 ) cur_state = St_set_short_rep( cur_state );
+ else cur_state = St_set_char( cur_state ); /* literal */
+ }
+ else if( dis4 < num_rep_distances ) cur_state = St_set_rep( cur_state );
+ else cur_state = St_set_match( cur_state );
+ }
+ else
+ {
+ if( prev_index2 == dual_step_trial ) /* dis4 == 0 (rep0) */
+ --prev_index;
+ else /* prev_index2 >= 0 */
+ prev_index = prev_index2;
+ cur_state = St_set_char_rep();
+ }
+ cur_trial->state = cur_state;
+ for( i = 0; i < num_rep_distances; ++i )
+ cur_trial->reps[i] = e->trials[prev_index].reps[i];
+ mtf_reps( dis4, cur_trial->reps ); /* literal is ignored */
+ }
+
+ const int pos_state = Mb_data_position( &e->eb.mb ) & pos_state_mask;
+ const uint8_t prev_byte = Mb_peek( &e->eb.mb, 1 );
+ const uint8_t cur_byte = Mb_peek( &e->eb.mb, 0 );
+ const uint8_t match_byte = Mb_peek( &e->eb.mb, cur_trial->reps[0] + 1 );
+
+ int next_price = cur_trial->price +
+ price0( e->eb.bm_match[cur_state][pos_state] );
+ if( St_is_char( cur_state ) )
+ next_price += LZeb_price_literal( &e->eb, prev_byte, cur_byte );
+ else
+ next_price += LZeb_price_matched( &e->eb, prev_byte, cur_byte, match_byte );
+
+ /* try last updates to next trial */
+ struct Trial * next_trial = &e->trials[cur+1];
+
+ Tr_update( next_trial, next_price, -1, cur ); /* literal */
+
+ const int match_price = cur_trial->price + price1( e->eb.bm_match[cur_state][pos_state] );
+ const int rep_match_price = match_price + price1( e->eb.bm_rep[cur_state] );
+
+ if( match_byte == cur_byte && next_trial->dis4 != 0 &&
+ next_trial->prev_index2 == single_step_trial )
+ {
+ const int price = rep_match_price +
+ LZeb_price_shortrep( &e->eb, cur_state, pos_state );
+ if( price <= next_trial->price )
+ {
+ next_trial->price = price;
+ next_trial->dis4 = 0; /* rep0 */
+ next_trial->prev_index = cur;
+ }
+ }
+
+ const int triable_bytes =
+ min( Mb_available_bytes( &e->eb.mb ), max_num_trials - 1 - cur );
+ if( triable_bytes < min_match_len ) continue;
+
+ const int len_limit = min( e->match_len_limit, triable_bytes );
+
+ /* try literal + rep0 */
+ if( match_byte != cur_byte && next_trial->prev_index != cur )
+ {
+ const uint8_t * const data = Mb_ptr_to_current_pos( &e->eb.mb );
+ const int dis = cur_trial->reps[0] + 1;
+ const int limit = min( e->match_len_limit + 1, triable_bytes );
+ int len = 1;
+ while( len < limit && data[len-dis] == data[len] ) ++len;
+ if( --len >= min_match_len )
+ {
+ const int pos_state2 = ( pos_state + 1 ) & pos_state_mask;
+ const State state2 = St_set_char( cur_state );
+ const int price = next_price +
+ price1( e->eb.bm_match[state2][pos_state2] ) +
+ price1( e->eb.bm_rep[state2] ) +
+ LZe_price_rep0_len( e, len, state2, pos_state2 );
+ while( num_trials < cur + 1 + len )
+ e->trials[++num_trials].price = infinite_price;
+ Tr_update2( &e->trials[cur+1+len], price, cur + 1 );
+ }
+ }
+
+ int start_len = min_match_len;
+
+ /* try rep distances */
+ for( rep = 0; rep < num_rep_distances; ++rep )
+ {
+ const uint8_t * const data = Mb_ptr_to_current_pos( &e->eb.mb );
+ const int dis = cur_trial->reps[rep] + 1;
+
+ if( data[0-dis] != data[0] || data[1-dis] != data[1] ) continue;
+ for( len = min_match_len; len < len_limit; ++len )
+ if( data[len-dis] != data[len] ) break;
+ while( num_trials < cur + len )
+ e->trials[++num_trials].price = infinite_price;
+ int price = rep_match_price + LZeb_price_rep( &e->eb, rep, cur_state, pos_state );
+ for( i = min_match_len; i <= len; ++i )
+ Tr_update( &e->trials[cur+i], price +
+ Lp_price( &e->rep_len_prices, i, pos_state ), rep, cur );
+
+ if( rep == 0 ) start_len = len + 1; /* discard shorter matches */
+
+ /* try rep + literal + rep0 */
+ int len2 = len + 1;
+ const int limit = min( e->match_len_limit + len2, triable_bytes );
+ while( len2 < limit && data[len2-dis] == data[len2] ) ++len2;
+ len2 -= len + 1;
+ if( len2 < min_match_len ) continue;
+
+ int pos_state2 = ( pos_state + len ) & pos_state_mask;
+ State state2 = St_set_rep( cur_state );
+ price += Lp_price( &e->rep_len_prices, len, pos_state ) +
+ price0( e->eb.bm_match[state2][pos_state2] ) +
+ LZeb_price_matched( &e->eb, data[len-1], data[len], data[len-dis] );
+ pos_state2 = ( pos_state2 + 1 ) & pos_state_mask;
+ state2 = St_set_char( state2 );
+ price += price1( e->eb.bm_match[state2][pos_state2] ) +
+ price1( e->eb.bm_rep[state2] ) +
+ LZe_price_rep0_len( e, len2, state2, pos_state2 );
+ while( num_trials < cur + len + 1 + len2 )
+ e->trials[++num_trials].price = infinite_price;
+ Tr_update3( &e->trials[cur+len+1+len2], price, rep, cur + len + 1, cur );
+ }
+
+ /* try matches */
+ if( newlen >= start_len && newlen <= len_limit )
+ {
+ const int normal_match_price = match_price +
+ price0( e->eb.bm_rep[cur_state] );
+
+ while( num_trials < cur + newlen )
+ e->trials[++num_trials].price = infinite_price;
+
+ int i = 0;
+ while( e->pairs[i].len < start_len ) ++i;
+ int dis = e->pairs[i].dis;
+ for( len = start_len; ; ++len )
+ {
+ int price = normal_match_price + LZe_price_pair( e, dis, len, pos_state );
+ Tr_update( &e->trials[cur+len], price, dis + num_rep_distances, cur );
+
+ /* try match + literal + rep0 */
+ if( len == e->pairs[i].len )
+ {
+ const uint8_t * const data = Mb_ptr_to_current_pos( &e->eb.mb );
+ const int dis2 = dis + 1;
+ int len2 = len + 1;
+ const int limit = min( e->match_len_limit + len2, triable_bytes );
+ while( len2 < limit && data[len2-dis2] == data[len2] ) ++len2;
+ len2 -= len + 1;
+ if( len2 >= min_match_len )
+ {
+ int pos_state2 = ( pos_state + len ) & pos_state_mask;
+ State state2 = St_set_match( cur_state );
+ price += price0( e->eb.bm_match[state2][pos_state2] ) +
+ LZeb_price_matched( &e->eb, data[len-1], data[len], data[len-dis2] );
+ pos_state2 = ( pos_state2 + 1 ) & pos_state_mask;
+ state2 = St_set_char( state2 );
+ price += price1( e->eb.bm_match[state2][pos_state2] ) +
+ price1( e->eb.bm_rep[state2] ) +
+ LZe_price_rep0_len( e, len2, state2, pos_state2 );
+
+ while( num_trials < cur + len + 1 + len2 )
+ e->trials[++num_trials].price = infinite_price;
+ Tr_update3( &e->trials[cur+len+1+len2], price,
+ dis + num_rep_distances, cur + len + 1, cur );
+ }
+ if( ++i >= num_pairs ) break;
+ dis = e->pairs[i].dis;
+ }
+ }
+ }
+ }
+ }
+
+
+static bool LZe_encode_member( struct LZ_encoder * const e )
+ {
+ const bool best = ( e->match_len_limit > 12 );
+ const int dis_price_count = best ? 1 : 512;
+ const int align_price_count = best ? 1 : dis_align_size;
+ const int price_count = ( e->match_len_limit > 36 ) ? 1013 : 4093;
+ int i;
+ State * const state = &e->eb.state;
+
+ if( e->eb.member_finished ) return true;
+ if( Re_member_position( &e->eb.renc ) >= e->eb.member_size_limit )
+ { LZeb_try_full_flush( &e->eb ); return true; }
+
+ if( Mb_data_position( &e->eb.mb ) == 0 &&
+ !Mb_data_finished( &e->eb.mb ) ) /* encode first byte */
+ {
+ if( !Mb_enough_available_bytes( &e->eb.mb ) ||
+ !Re_enough_free_bytes( &e->eb.renc ) ) return true;
+ const uint8_t prev_byte = 0;
+ const uint8_t cur_byte = Mb_peek( &e->eb.mb, 0 );
+ Re_encode_bit( &e->eb.renc, &e->eb.bm_match[*state][0], 0 );
+ LZeb_encode_literal( &e->eb, prev_byte, cur_byte );
+ CRC32_update_byte( &e->eb.crc, cur_byte );
+ LZe_get_match_pairs( e, 0 );
+ if( !Mb_move_pos( &e->eb.mb ) ) return false;
+ }
+
+ while( !Mb_data_finished( &e->eb.mb ) )
+ {
+ if( !Mb_enough_available_bytes( &e->eb.mb ) ||
+ !Re_enough_free_bytes( &e->eb.renc ) ) return true;
+ if( e->price_counter <= 0 && e->pending_num_pairs == 0 )
+ {
+ e->price_counter = price_count; /* recalculate prices every price_count bytes */
+ if( e->dis_price_counter <= 0 )
+ { e->dis_price_counter = dis_price_count; LZe_update_distance_prices( e ); }
+ if( e->align_price_counter <= 0 )
+ {
+ e->align_price_counter = align_price_count;
+ for( i = 0; i < dis_align_size; ++i )
+ e->align_prices[i] = price_symbol_reversed( e->eb.bm_align, i, dis_align_bits );
+ }
+ Lp_update_prices( &e->match_len_prices );
+ Lp_update_prices( &e->rep_len_prices );
+ }
+
+ int ahead = LZe_sequence_optimizer( e, e->eb.reps, *state );
+ e->price_counter -= ahead;
+
+ for( i = 0; ahead > 0; )
+ {
+ const int pos_state =
+ ( Mb_data_position( &e->eb.mb ) - ahead ) & pos_state_mask;
+ const int len = e->trials[i].price;
+ int dis = e->trials[i].dis4;
+
+ bool bit = ( dis < 0 );
+ Re_encode_bit( &e->eb.renc, &e->eb.bm_match[*state][pos_state], !bit );
+ if( bit ) /* literal byte */
+ {
+ const uint8_t prev_byte = Mb_peek( &e->eb.mb, ahead + 1 );
+ const uint8_t cur_byte = Mb_peek( &e->eb.mb, ahead );
+ CRC32_update_byte( &e->eb.crc, cur_byte );
+ if( ( *state = St_set_char( *state ) ) < 4 )
+ LZeb_encode_literal( &e->eb, prev_byte, cur_byte );
+ else
+ {
+ const uint8_t match_byte = Mb_peek( &e->eb.mb, ahead + e->eb.reps[0] + 1 );
+ LZeb_encode_matched( &e->eb, prev_byte, cur_byte, match_byte );
+ }
+ }
+ else /* match or repeated match */
+ {
+ CRC32_update_buf( &e->eb.crc, Mb_ptr_to_current_pos( &e->eb.mb ) - ahead, len );
+ mtf_reps( dis, e->eb.reps );
+ bit = ( dis < num_rep_distances );
+ Re_encode_bit( &e->eb.renc, &e->eb.bm_rep[*state], bit );
+ if( bit ) /* repeated match */
+ {
+ bit = ( dis == 0 );
+ Re_encode_bit( &e->eb.renc, &e->eb.bm_rep0[*state], !bit );
+ if( bit )
+ Re_encode_bit( &e->eb.renc, &e->eb.bm_len[*state][pos_state], len > 1 );
+ else
+ {
+ Re_encode_bit( &e->eb.renc, &e->eb.bm_rep1[*state], dis > 1 );
+ if( dis > 1 )
+ Re_encode_bit( &e->eb.renc, &e->eb.bm_rep2[*state], dis > 2 );
+ }
+ if( len == 1 ) *state = St_set_short_rep( *state );
+ else
+ {
+ Re_encode_len( &e->eb.renc, &e->eb.rep_len_model, len, pos_state );
+ Lp_decrement_counter( &e->rep_len_prices, pos_state );
+ *state = St_set_rep( *state );
+ }
+ }
+ else /* match */
+ {
+ dis -= num_rep_distances;
+ LZeb_encode_pair( &e->eb, dis, len, pos_state );
+ if( dis >= modeled_distances ) --e->align_price_counter;
+ --e->dis_price_counter;
+ Lp_decrement_counter( &e->match_len_prices, pos_state );
+ *state = St_set_match( *state );
+ }
+ }
+ ahead -= len; i += len;
+ if( Re_member_position( &e->eb.renc ) >= e->eb.member_size_limit )
+ {
+ if( !Mb_dec_pos( &e->eb.mb, ahead ) ) return false;
+ LZeb_try_full_flush( &e->eb );
+ return true;
+ }
+ }
+ }
+ LZeb_try_full_flush( &e->eb );
+ return true;
+ }
diff --git a/encoder.h b/encoder.h
new file mode 100644
index 0000000..edac815
--- /dev/null
+++ b/encoder.h
@@ -0,0 +1,326 @@
+/* Lzlib - Compression library for the lzip format
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+struct Len_prices
+ {
+ const struct Len_model * lm;
+ int len_symbols;
+ int count;
+ int prices[pos_states][max_len_symbols];
+ int counters[pos_states]; /* may decrement below 0 */
+ };
+
+static inline void Lp_update_low_mid_prices( struct Len_prices * const lp,
+ const int pos_state )
+ {
+ int * const pps = lp->prices[pos_state];
+ int tmp = price0( lp->lm->choice1 );
+ int len = 0;
+ for( ; len < len_low_symbols && len < lp->len_symbols; ++len )
+ pps[len] = tmp + price_symbol3( lp->lm->bm_low[pos_state], len );
+ if( len >= lp->len_symbols ) return;
+ tmp = price1( lp->lm->choice1 ) + price0( lp->lm->choice2 );
+ for( ; len < len_low_symbols + len_mid_symbols && len < lp->len_symbols; ++len )
+ pps[len] = tmp +
+ price_symbol3( lp->lm->bm_mid[pos_state], len - len_low_symbols );
+ }
+
+static inline void Lp_update_high_prices( struct Len_prices * const lp )
+ {
+ const int tmp = price1( lp->lm->choice1 ) + price1( lp->lm->choice2 );
+ int len;
+ for( len = len_low_symbols + len_mid_symbols; len < lp->len_symbols; ++len )
+ /* using 4 slots per value makes "Lp_price" faster */
+ lp->prices[3][len] = lp->prices[2][len] =
+ lp->prices[1][len] = lp->prices[0][len] = tmp +
+ price_symbol8( lp->lm->bm_high, len - len_low_symbols - len_mid_symbols );
+ }
+
+static inline void Lp_reset( struct Len_prices * const lp )
+ { int i; for( i = 0; i < pos_states; ++i ) lp->counters[i] = 0; }
+
+static inline void Lp_init( struct Len_prices * const lp,
+ const struct Len_model * const lm,
+ const int match_len_limit )
+ {
+ lp->lm = lm;
+ lp->len_symbols = match_len_limit + 1 - min_match_len;
+ lp->count = ( match_len_limit > 12 ) ? 1 : lp->len_symbols;
+ Lp_reset( lp );
+ }
+
+static inline void Lp_decrement_counter( struct Len_prices * const lp,
+ const int pos_state )
+ { --lp->counters[pos_state]; }
+
+static inline void Lp_update_prices( struct Len_prices * const lp )
+ {
+ int pos_state;
+ bool high_pending = false;
+ for( pos_state = 0; pos_state < pos_states; ++pos_state )
+ if( lp->counters[pos_state] <= 0 )
+ { lp->counters[pos_state] = lp->count;
+ Lp_update_low_mid_prices( lp, pos_state ); high_pending = true; }
+ if( high_pending && lp->len_symbols > len_low_symbols + len_mid_symbols )
+ Lp_update_high_prices( lp );
+ }
+
+static inline int Lp_price( const struct Len_prices * const lp,
+ const int len, const int pos_state )
+ { return lp->prices[pos_state][len - min_match_len]; }
+
+
+struct Pair /* distance-length pair */
+ {
+ int dis;
+ int len;
+ };
+
+enum { infinite_price = 0x0FFFFFFF,
+ max_num_trials = 1 << 13,
+ single_step_trial = -2,
+ dual_step_trial = -1 };
+
+struct Trial
+ {
+ State state;
+ int price; /* dual use var; cumulative price, match length */
+ int dis4; /* -1 for literal, or rep, or match distance + 4 */
+ int prev_index; /* index of prev trial in trials[] */
+ int prev_index2; /* -2 trial is single step */
+ /* -1 literal + rep0 */
+ /* >= 0 ( rep or match ) + literal + rep0 */
+ int reps[num_rep_distances];
+ };
+
+static inline void Tr_update( struct Trial * const trial, const int pr,
+ const int distance4, const int p_i )
+ {
+ if( pr < trial->price )
+ { trial->price = pr; trial->dis4 = distance4; trial->prev_index = p_i;
+ trial->prev_index2 = single_step_trial; }
+ }
+
+static inline void Tr_update2( struct Trial * const trial, const int pr,
+ const int p_i )
+ {
+ if( pr < trial->price )
+ { trial->price = pr; trial->dis4 = 0; trial->prev_index = p_i;
+ trial->prev_index2 = dual_step_trial; }
+ }
+
+static inline void Tr_update3( struct Trial * const trial, const int pr,
+ const int distance4, const int p_i,
+ const int p_i2 )
+ {
+ if( pr < trial->price )
+ { trial->price = pr; trial->dis4 = distance4; trial->prev_index = p_i;
+ trial->prev_index2 = p_i2; }
+ }
+
+
+struct LZ_encoder
+ {
+ struct LZ_encoder_base eb;
+ int cycles;
+ int match_len_limit;
+ struct Len_prices match_len_prices;
+ struct Len_prices rep_len_prices;
+ int pending_num_pairs;
+ struct Pair pairs[max_match_len+1];
+ struct Trial trials[max_num_trials];
+
+ int dis_slot_prices[len_states][2*max_dictionary_bits];
+ int dis_prices[len_states][modeled_distances];
+ int align_prices[dis_align_size];
+ int num_dis_slots;
+ int price_counter; /* counters may decrement below 0 */
+ int dis_price_counter;
+ int align_price_counter;
+ bool been_flushed;
+ };
+
+static inline bool Mb_dec_pos( struct Matchfinder_base * const mb,
+ const int ahead )
+ {
+ if( ahead < 0 || mb->pos < ahead ) return false;
+ mb->pos -= ahead;
+ if( mb->cyclic_pos < ahead ) mb->cyclic_pos += mb->dictionary_size + 1;
+ mb->cyclic_pos -= ahead;
+ return true;
+ }
+
+static int LZe_get_match_pairs( struct LZ_encoder * const e, struct Pair * pairs );
+
+ /* move-to-front dis in/into reps; do nothing if( dis4 <= 0 ) */
+static inline void mtf_reps( const int dis4, int reps[num_rep_distances] )
+ {
+ if( dis4 >= num_rep_distances ) /* match */
+ {
+ reps[3] = reps[2]; reps[2] = reps[1]; reps[1] = reps[0];
+ reps[0] = dis4 - num_rep_distances;
+ }
+ else if( dis4 > 0 ) /* repeated match */
+ {
+ const int distance = reps[dis4];
+ int i; for( i = dis4; i > 0; --i ) reps[i] = reps[i-1];
+ reps[0] = distance;
+ }
+ }
+
+static inline int LZeb_price_shortrep( const struct LZ_encoder_base * const eb,
+ const State state, const int pos_state )
+ {
+ return price0( eb->bm_rep0[state] ) + price0( eb->bm_len[state][pos_state] );
+ }
+
+static inline int LZeb_price_rep( const struct LZ_encoder_base * const eb,
+ const int rep, const State state,
+ const int pos_state )
+ {
+ if( rep == 0 ) return price0( eb->bm_rep0[state] ) +
+ price1( eb->bm_len[state][pos_state] );
+ int price = price1( eb->bm_rep0[state] );
+ if( rep == 1 )
+ price += price0( eb->bm_rep1[state] );
+ else
+ {
+ price += price1( eb->bm_rep1[state] );
+ price += price_bit( eb->bm_rep2[state], rep - 2 );
+ }
+ return price;
+ }
+
+static inline int LZe_price_rep0_len( const struct LZ_encoder * const e,
+ const int len, const State state,
+ const int pos_state )
+ {
+ return LZeb_price_rep( &e->eb, 0, state, pos_state ) +
+ Lp_price( &e->rep_len_prices, len, pos_state );
+ }
+
+static inline int LZe_price_pair( const struct LZ_encoder * const e,
+ const int dis, const int len,
+ const int pos_state )
+ {
+ const int price = Lp_price( &e->match_len_prices, len, pos_state );
+ const int len_state = get_len_state( len );
+ if( dis < modeled_distances )
+ return price + e->dis_prices[len_state][dis];
+ else
+ return price + e->dis_slot_prices[len_state][get_slot( dis )] +
+ e->align_prices[dis & (dis_align_size - 1)];
+ }
+
+static inline int LZe_read_match_distances( struct LZ_encoder * const e )
+ {
+ const int num_pairs = LZe_get_match_pairs( e, e->pairs );
+ if( num_pairs > 0 )
+ {
+ const int len = e->pairs[num_pairs-1].len;
+ if( len == e->match_len_limit && len < max_match_len )
+ e->pairs[num_pairs-1].len =
+ Mb_true_match_len( &e->eb.mb, len, e->pairs[num_pairs-1].dis + 1 );
+ }
+ return num_pairs;
+ }
+
+static inline bool LZe_move_and_update( struct LZ_encoder * const e, int n )
+ {
+ while( true )
+ {
+ if( !Mb_move_pos( &e->eb.mb ) ) return false;
+ if( --n <= 0 ) break;
+ LZe_get_match_pairs( e, 0 );
+ }
+ return true;
+ }
+
+static inline void LZe_backward( struct LZ_encoder * const e, int cur )
+ {
+ int dis4 = e->trials[cur].dis4;
+ while( cur > 0 )
+ {
+ const int prev_index = e->trials[cur].prev_index;
+ struct Trial * const prev_trial = &e->trials[prev_index];
+
+ if( e->trials[cur].prev_index2 != single_step_trial )
+ {
+ prev_trial->dis4 = -1; /* literal */
+ prev_trial->prev_index = prev_index - 1;
+ prev_trial->prev_index2 = single_step_trial;
+ if( e->trials[cur].prev_index2 >= 0 )
+ {
+ struct Trial * const prev_trial2 = &e->trials[prev_index-1];
+ prev_trial2->dis4 = dis4; dis4 = 0; /* rep0 */
+ prev_trial2->prev_index = e->trials[cur].prev_index2;
+ prev_trial2->prev_index2 = single_step_trial;
+ }
+ }
+ prev_trial->price = cur - prev_index; /* len */
+ cur = dis4; dis4 = prev_trial->dis4; prev_trial->dis4 = cur;
+ cur = prev_index;
+ }
+ }
+
+enum { num_prev_positions3 = 1 << 16,
+ num_prev_positions2 = 1 << 10 };
+
+static inline bool LZe_init( struct LZ_encoder * const e,
+ const int dict_size, const int len_limit,
+ const unsigned long long member_size )
+ {
+ enum { before_size = max_num_trials,
+ /* bytes to keep in buffer after pos */
+ after_size = max_num_trials + ( 2 * max_match_len ) + 1,
+ dict_factor = 2,
+ num_prev_positions23 = num_prev_positions2 + num_prev_positions3,
+ pos_array_factor = 2,
+ min_free_bytes = 2 * max_num_trials };
+
+ if( !LZeb_init( &e->eb, before_size, dict_size, after_size, dict_factor,
+ num_prev_positions23, pos_array_factor, min_free_bytes,
+ member_size ) ) return false;
+ e->cycles = ( len_limit < max_match_len ) ? 16 + ( len_limit / 2 ) : 256;
+ e->match_len_limit = len_limit;
+ Lp_init( &e->match_len_prices, &e->eb.match_len_model, e->match_len_limit );
+ Lp_init( &e->rep_len_prices, &e->eb.rep_len_model, e->match_len_limit );
+ e->pending_num_pairs = 0;
+ e->num_dis_slots = 2 * real_bits( e->eb.mb.dictionary_size - 1 );
+ e->trials[1].prev_index = 0;
+ e->trials[1].prev_index2 = single_step_trial;
+ e->price_counter = 0;
+ e->dis_price_counter = 0;
+ e->align_price_counter = 0;
+ e->been_flushed = false;
+ return true;
+ }
+
+static inline void LZe_reset( struct LZ_encoder * const e,
+ const unsigned long long member_size )
+ {
+ LZeb_reset( &e->eb, member_size );
+ Lp_reset( &e->match_len_prices );
+ Lp_reset( &e->rep_len_prices );
+ e->pending_num_pairs = 0;
+ e->price_counter = 0;
+ e->dis_price_counter = 0;
+ e->align_price_counter = 0;
+ e->been_flushed = false;
+ }
diff --git a/encoder_base.c b/encoder_base.c
new file mode 100644
index 0000000..047f372
--- /dev/null
+++ b/encoder_base.c
@@ -0,0 +1,194 @@
+/* Lzlib - Compression library for the lzip format
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+static bool Mb_normalize_pos( struct Matchfinder_base * const mb )
+ {
+ if( mb->pos > mb->stream_pos )
+ { mb->pos = mb->stream_pos; return false; }
+ if( !mb->at_stream_end )
+ {
+ int i;
+ /* offset is int32_t for the min below */
+ const int32_t offset = mb->pos - mb->before_size - mb->dictionary_size;
+ const int size = mb->stream_pos - offset;
+ memmove( mb->buffer, mb->buffer + offset, size );
+ mb->partial_data_pos += offset;
+ mb->pos -= offset; /* pos = before_size + dictionary_size */
+ mb->stream_pos -= offset;
+ for( i = 0; i < mb->num_prev_positions; ++i )
+ mb->prev_positions[i] -= min( mb->prev_positions[i], offset );
+ for( i = 0; i < mb->pos_array_size; ++i )
+ mb->pos_array[i] -= min( mb->pos_array[i], offset );
+ }
+ return true;
+ }
+
+
+static bool Mb_init( struct Matchfinder_base * const mb, const int before_size,
+ const int dict_size, const int after_size,
+ const int dict_factor, const int num_prev_positions23,
+ const int pos_array_factor )
+ {
+ const int buffer_size_limit =
+ ( dict_factor * dict_size ) + before_size + after_size;
+ int i;
+
+ mb->partial_data_pos = 0;
+ mb->before_size = before_size;
+ mb->after_size = after_size;
+ mb->pos = 0;
+ mb->cyclic_pos = 0;
+ mb->stream_pos = 0;
+ mb->num_prev_positions23 = num_prev_positions23;
+ mb->at_stream_end = false;
+ mb->sync_flush_pending = false;
+
+ mb->buffer_size = max( 65536, buffer_size_limit );
+ mb->buffer = (uint8_t *)malloc( mb->buffer_size );
+ if( !mb->buffer ) return false;
+ mb->saved_dictionary_size = dict_size;
+ mb->dictionary_size = dict_size;
+ mb->pos_limit = mb->buffer_size - after_size;
+ unsigned size = 1 << max( 16, real_bits( mb->dictionary_size - 1 ) - 2 );
+ if( mb->dictionary_size > 1 << 26 ) size >>= 1; /* 64 MiB */
+ mb->key4_mask = size - 1; /* increases with dictionary size */
+ size += num_prev_positions23;
+ mb->num_prev_positions = size;
+
+ mb->pos_array_size = pos_array_factor * ( mb->dictionary_size + 1 );
+ size += mb->pos_array_size;
+ if( size * sizeof mb->prev_positions[0] <= size ) mb->prev_positions = 0;
+ else mb->prev_positions =
+ (int32_t *)malloc( size * sizeof mb->prev_positions[0] );
+ if( !mb->prev_positions ) { free( mb->buffer ); return false; }
+ mb->pos_array = mb->prev_positions + mb->num_prev_positions;
+ for( i = 0; i < mb->num_prev_positions; ++i ) mb->prev_positions[i] = 0;
+ return true;
+ }
+
+
+static void Mb_adjust_array( struct Matchfinder_base * const mb )
+ {
+ int size = 1 << max( 16, real_bits( mb->dictionary_size - 1 ) - 2 );
+ if( mb->dictionary_size > 1 << 26 ) size >>= 1; /* 64 MiB */
+ mb->key4_mask = size - 1;
+ size += mb->num_prev_positions23;
+ mb->num_prev_positions = size;
+ mb->pos_array = mb->prev_positions + mb->num_prev_positions;
+ }
+
+
+static void Mb_adjust_dictionary_size( struct Matchfinder_base * const mb )
+ {
+ if( mb->stream_pos < mb->dictionary_size )
+ {
+ mb->dictionary_size = max( min_dictionary_size, mb->stream_pos );
+ Mb_adjust_array( mb );
+ mb->pos_limit = mb->buffer_size;
+ }
+ }
+
+
+static void Mb_reset( struct Matchfinder_base * const mb )
+ {
+ int i;
+ if( mb->stream_pos > mb->pos )
+ memmove( mb->buffer, mb->buffer + mb->pos, mb->stream_pos - mb->pos );
+ mb->partial_data_pos = 0;
+ mb->stream_pos -= mb->pos;
+ mb->pos = 0;
+ mb->cyclic_pos = 0;
+ mb->at_stream_end = false;
+ mb->sync_flush_pending = false;
+ mb->dictionary_size = mb->saved_dictionary_size;
+ Mb_adjust_array( mb );
+ mb->pos_limit = mb->buffer_size - mb->after_size;
+ for( i = 0; i < mb->num_prev_positions; ++i ) mb->prev_positions[i] = 0;
+ }
+
+
+/* End Of Stream marker => (dis == 0xFFFFFFFFU, len == min_match_len) */
+static void LZeb_try_full_flush( struct LZ_encoder_base * const eb )
+ {
+ if( eb->member_finished ||
+ Cb_free_bytes( &eb->renc.cb ) < max_marker_size + eb->renc.ff_count + Lt_size )
+ return;
+ eb->member_finished = true;
+ const int pos_state = Mb_data_position( &eb->mb ) & pos_state_mask;
+ const State state = eb->state;
+ Re_encode_bit( &eb->renc, &eb->bm_match[state][pos_state], 1 );
+ Re_encode_bit( &eb->renc, &eb->bm_rep[state], 0 );
+ LZeb_encode_pair( eb, 0xFFFFFFFFU, min_match_len, pos_state );
+ Re_flush( &eb->renc );
+ Lzip_trailer trailer;
+ Lt_set_data_crc( trailer, LZeb_crc( eb ) );
+ Lt_set_data_size( trailer, Mb_data_position( &eb->mb ) );
+ Lt_set_member_size( trailer, Re_member_position( &eb->renc ) + Lt_size );
+ int i; for( i = 0; i < Lt_size; ++i ) Cb_put_byte( &eb->renc.cb, trailer[i] );
+ }
+
+
+/* Sync Flush marker => (dis == 0xFFFFFFFFU, len == min_match_len + 1) */
+static void LZeb_try_sync_flush( struct LZ_encoder_base * const eb )
+ {
+ const unsigned min_size = eb->renc.ff_count + max_marker_size;
+ if( eb->member_finished ||
+ Cb_free_bytes( &eb->renc.cb ) < min_size + max_marker_size ) return;
+ eb->mb.sync_flush_pending = false;
+ const unsigned long long old_mpos = Re_member_position( &eb->renc );
+ const int pos_state = Mb_data_position( &eb->mb ) & pos_state_mask;
+ const State state = eb->state;
+ do { /* size of markers must be >= rd_min_available_bytes + 5 */
+ Re_encode_bit( &eb->renc, &eb->bm_match[state][pos_state], 1 );
+ Re_encode_bit( &eb->renc, &eb->bm_rep[state], 0 );
+ LZeb_encode_pair( eb, 0xFFFFFFFFU, min_match_len + 1, pos_state );
+ Re_flush( &eb->renc );
+ }
+ while( Re_member_position( &eb->renc ) - old_mpos < min_size );
+ }
+
+
+static void LZeb_reset( struct LZ_encoder_base * const eb,
+ const unsigned long long member_size )
+ {
+ const unsigned long long min_member_size = min_dictionary_size;
+ const unsigned long long max_member_size = 0x0008000000000000ULL; /* 2 PiB */
+ int i;
+ Mb_reset( &eb->mb );
+ eb->member_size_limit =
+ min( max( min_member_size, member_size ), max_member_size ) -
+ Lt_size - max_marker_size;
+ eb->crc = 0xFFFFFFFFU;
+ Bm_array_init( eb->bm_literal[0], (1 << literal_context_bits) * 0x300 );
+ Bm_array_init( eb->bm_match[0], states * pos_states );
+ Bm_array_init( eb->bm_rep, states );
+ Bm_array_init( eb->bm_rep0, states );
+ Bm_array_init( eb->bm_rep1, states );
+ Bm_array_init( eb->bm_rep2, states );
+ Bm_array_init( eb->bm_len[0], states * pos_states );
+ Bm_array_init( eb->bm_dis_slot[0], len_states * (1 << dis_slot_bits) );
+ Bm_array_init( eb->bm_dis, modeled_distances - end_dis_model + 1 );
+ Bm_array_init( eb->bm_align, dis_align_size );
+ Lm_init( &eb->match_len_model );
+ Lm_init( &eb->rep_len_model );
+ Re_reset( &eb->renc, eb->mb.dictionary_size );
+ for( i = 0; i < num_rep_distances; ++i ) eb->reps[i] = 0;
+ eb->state = 0;
+ eb->member_finished = false;
+ }
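
Note on LZeb_reset above: the member size limit is the requested size clamped to a fixed range, minus the room reserved for the trailer and the End Of Stream marker. The following is a minimal standalone sketch of that arithmetic, not code from lzlib. The helper name member_size_limit is hypothetical, and the values of min_dictionary_size (4 KiB) and Lt_size (20 bytes) are assumptions, since both are defined outside the files shown here. max_marker_size is taken from encoder_base.h.

```c
/* Assumed constants: min_dictionary_size and Lt_size are defined in lzip.h,
   which is not shown here; max_marker_size is taken from encoder_base.h. */
enum { min_dictionary_size = 1 << 12, Lt_size = 20, max_marker_size = 16 };

/* Hypothetical helper mirroring the clamp done in LZeb_reset. */
static unsigned long long member_size_limit( unsigned long long member_size )
  {
  const unsigned long long min_member_size = min_dictionary_size;
  const unsigned long long max_member_size = 0x0008000000000000ULL; /* 2 PiB */
  if( member_size < min_member_size ) member_size = min_member_size;
  if( member_size > max_member_size ) member_size = max_member_size;
  /* leave room for the trailer and the End Of Stream marker */
  return member_size - Lt_size - max_marker_size;
  }
```

Under these assumptions, a request of 0 is raised to 4096 bytes and yields a limit of 4060, so the encoder can always finish a member without exceeding the size asked for.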
diff --git a/encoder_base.h b/encoder_base.h
new file mode 100644
index 0000000..094f679
--- /dev/null
+++ b/encoder_base.h
@@ -0,0 +1,609 @@
+/* Lzlib - Compression library for the lzip format
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+enum { price_shift_bits = 6,
+ price_step_bits = 2 };
+
+static const uint8_t dis_slots[1<<10] =
+ {
+ 0, 1, 2, 3, 4, 4, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7,
+ 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9,
+ 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
+ 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
+ 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
+ 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
+ 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
+ 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
+ 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
+ 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
+ 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
+ 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
+ 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
+ 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
+ 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
+ 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
+ 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
+ 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
+ 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
+ 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
+ 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
+ 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
+ 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
+ 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
+ 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19 };
+
+static inline uint8_t get_slot( const unsigned dis )
+ {
+ if( dis < (1 << 10) ) return dis_slots[dis];
+ if( dis < (1 << 19) ) return dis_slots[dis>> 9] + 18;
+ if( dis < (1 << 28) ) return dis_slots[dis>>18] + 36;
+ return dis_slots[dis>>27] + 54;
+ }
+
+
+static const short prob_prices[bit_model_total >> price_step_bits] =
+{
+640, 539, 492, 461, 438, 419, 404, 390, 379, 369, 359, 351, 343, 336, 330, 323,
+318, 312, 307, 302, 298, 293, 289, 285, 281, 277, 274, 270, 267, 264, 261, 258,
+255, 252, 250, 247, 244, 242, 239, 237, 235, 232, 230, 228, 226, 224, 222, 220,
+218, 216, 214, 213, 211, 209, 207, 206, 204, 202, 201, 199, 198, 196, 195, 193,
+192, 190, 189, 188, 186, 185, 184, 182, 181, 180, 178, 177, 176, 175, 174, 172,
+171, 170, 169, 168, 167, 166, 165, 164, 163, 162, 161, 159, 158, 157, 157, 156,
+155, 154, 153, 152, 151, 150, 149, 148, 147, 146, 145, 145, 144, 143, 142, 141,
+140, 140, 139, 138, 137, 136, 136, 135, 134, 133, 133, 132, 131, 130, 130, 129,
+128, 127, 127, 126, 125, 125, 124, 123, 123, 122, 121, 121, 120, 119, 119, 118,
+117, 117, 116, 115, 115, 114, 114, 113, 112, 112, 111, 111, 110, 109, 109, 108,
+108, 107, 106, 106, 105, 105, 104, 104, 103, 103, 102, 101, 101, 100, 100, 99,
+ 99, 98, 98, 97, 97, 96, 96, 95, 95, 94, 94, 93, 93, 92, 92, 91,
+ 91, 90, 90, 89, 89, 88, 88, 88, 87, 87, 86, 86, 85, 85, 84, 84,
+ 83, 83, 83, 82, 82, 81, 81, 80, 80, 80, 79, 79, 78, 78, 77, 77,
+ 77, 76, 76, 75, 75, 75, 74, 74, 73, 73, 73, 72, 72, 71, 71, 71,
+ 70, 70, 70, 69, 69, 68, 68, 68, 67, 67, 67, 66, 66, 65, 65, 65,
+ 64, 64, 64, 63, 63, 63, 62, 62, 61, 61, 61, 60, 60, 60, 59, 59,
+ 59, 58, 58, 58, 57, 57, 57, 56, 56, 56, 55, 55, 55, 54, 54, 54,
+ 53, 53, 53, 53, 52, 52, 52, 51, 51, 51, 50, 50, 50, 49, 49, 49,
+ 48, 48, 48, 48, 47, 47, 47, 46, 46, 46, 45, 45, 45, 45, 44, 44,
+ 44, 43, 43, 43, 43, 42, 42, 42, 41, 41, 41, 41, 40, 40, 40, 40,
+ 39, 39, 39, 38, 38, 38, 38, 37, 37, 37, 37, 36, 36, 36, 35, 35,
+ 35, 35, 34, 34, 34, 34, 33, 33, 33, 33, 32, 32, 32, 32, 31, 31,
+ 31, 31, 30, 30, 30, 30, 29, 29, 29, 29, 28, 28, 28, 28, 27, 27,
+ 27, 27, 26, 26, 26, 26, 26, 25, 25, 25, 25, 24, 24, 24, 24, 23,
+ 23, 23, 23, 22, 22, 22, 22, 22, 21, 21, 21, 21, 20, 20, 20, 20,
+ 20, 19, 19, 19, 19, 18, 18, 18, 18, 18, 17, 17, 17, 17, 17, 16,
+ 16, 16, 16, 15, 15, 15, 15, 15, 14, 14, 14, 14, 14, 13, 13, 13,
+ 13, 13, 12, 12, 12, 12, 12, 11, 11, 11, 11, 10, 10, 10, 10, 10,
+ 9, 9, 9, 9, 9, 9, 8, 8, 8, 8, 8, 7, 7, 7, 7, 7,
+ 6, 6, 6, 6, 6, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4,
+ 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1 };
+
+static inline int get_price( const int probability )
+ { return prob_prices[probability >> price_step_bits]; }
+
+
+static inline int price0( const Bit_model probability )
+ { return get_price( probability ); }
+
+static inline int price1( const Bit_model probability )
+ { return get_price( bit_model_total - probability ); }
+
+static inline int price_bit( const Bit_model bm, const bool bit )
+ { return bit ? price1( bm ) : price0( bm ); }
+
+
+static inline int price_symbol3( const Bit_model bm[], int symbol )
+ {
+ bool bit = symbol & 1;
+ symbol |= 8; symbol >>= 1;
+ int price = price_bit( bm[symbol], bit );
+ bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+ return price + price_bit( bm[1], symbol & 1 );
+ }
+
+
+static inline int price_symbol6( const Bit_model bm[], unsigned symbol )
+ {
+ bool bit = symbol & 1;
+ symbol |= 64; symbol >>= 1;
+ int price = price_bit( bm[symbol], bit );
+ bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+ bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+ bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+ bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+ return price + price_bit( bm[1], symbol & 1 );
+ }
+
+
+static inline int price_symbol8( const Bit_model bm[], int symbol )
+ {
+ bool bit = symbol & 1;
+ symbol |= 0x100; symbol >>= 1;
+ int price = price_bit( bm[symbol], bit );
+ bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+ bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+ bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+ bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+ bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+ bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+ return price + price_bit( bm[1], symbol & 1 );
+ }
+
+
+static inline int price_symbol_reversed( const Bit_model bm[], int symbol,
+ const int num_bits )
+ {
+ int price = 0;
+ int model = 1;
+ int i;
+ for( i = num_bits; i > 0; --i )
+ {
+ const bool bit = symbol & 1;
+ symbol >>= 1;
+ price += price_bit( bm[model], bit );
+ model <<= 1; model |= bit;
+ }
+ return price;
+ }
+
+
+static inline int price_matched( const Bit_model bm[], unsigned symbol,
+ unsigned match_byte )
+ {
+ int price = 0;
+ unsigned mask = 0x100;
+ symbol |= mask;
+ while( true )
+ {
+ const unsigned match_bit = ( match_byte <<= 1 ) & mask;
+ const bool bit = ( symbol <<= 1 ) & 0x100;
+ price += price_bit( bm[(symbol>>9)+match_bit+mask], bit );
+ if( symbol >= 0x10000 ) return price;
+ mask &= ~(match_bit ^ symbol); /* if( match_bit != bit ) mask = 0; */
+ }
+ }
+
+
+struct Matchfinder_base
+ {
+ unsigned long long partial_data_pos;
+ uint8_t * buffer; /* input buffer */
+ int32_t * prev_positions; /* 1 + last seen position of key. else 0 */
+ int32_t * pos_array; /* may be tree or chain */
+ int before_size; /* bytes to keep in buffer before dictionary */
+ int after_size; /* bytes to keep in buffer after pos */
+ int buffer_size;
+ int dictionary_size; /* bytes to keep in buffer before pos */
+ int pos; /* current pos in buffer */
+ int cyclic_pos; /* cycles through [0, dictionary_size] */
+ int stream_pos; /* first byte not yet read from file */
+ int pos_limit; /* when reached, a new block must be read */
+ int key4_mask;
+ int num_prev_positions23;
+ int num_prev_positions; /* size of prev_positions */
+ int pos_array_size;
+ int saved_dictionary_size; /* dictionary_size restored by Mb_reset */
+ bool at_stream_end; /* stream_pos shows real end of file */
+ bool sync_flush_pending;
+ };
+
+static bool Mb_normalize_pos( struct Matchfinder_base * const mb );
+
+static bool Mb_init( struct Matchfinder_base * const mb, const int before_size,
+ const int dict_size, const int after_size,
+ const int dict_factor, const int num_prev_positions23,
+ const int pos_array_factor );
+
+static inline void Mb_free( struct Matchfinder_base * const mb )
+ { free( mb->prev_positions ); free( mb->buffer ); }
+
+static inline uint8_t Mb_peek( const struct Matchfinder_base * const mb,
+ const int distance )
+ { return mb->buffer[mb->pos-distance]; }
+
+static inline int Mb_available_bytes( const struct Matchfinder_base * const mb )
+ { return mb->stream_pos - mb->pos; }
+
+static inline unsigned long long
+Mb_data_position( const struct Matchfinder_base * const mb )
+ { return mb->partial_data_pos + mb->pos; }
+
+static inline void Mb_finish( struct Matchfinder_base * const mb )
+ { mb->at_stream_end = true; mb->sync_flush_pending = false; }
+
+static inline bool Mb_data_finished( const struct Matchfinder_base * const mb )
+ { return mb->at_stream_end && mb->pos >= mb->stream_pos; }
+
+static inline bool Mb_flushing_or_end( const struct Matchfinder_base * const mb )
+ { return mb->at_stream_end || mb->sync_flush_pending; }
+
+static inline int Mb_free_bytes( const struct Matchfinder_base * const mb )
+ { if( Mb_flushing_or_end( mb ) ) return 0;
+ return mb->buffer_size - mb->stream_pos; }
+
+static inline bool
+Mb_enough_available_bytes( const struct Matchfinder_base * const mb )
+ { return mb->pos + mb->after_size <= mb->stream_pos ||
+ ( Mb_flushing_or_end( mb ) && mb->pos < mb->stream_pos ); }
+
+static inline const uint8_t *
+Mb_ptr_to_current_pos( const struct Matchfinder_base * const mb )
+ { return mb->buffer + mb->pos; }
+
+static int Mb_write_data( struct Matchfinder_base * const mb,
+ const uint8_t * const inbuf, const int size )
+ {
+ const int sz = min( mb->buffer_size - mb->stream_pos, size );
+ if( Mb_flushing_or_end( mb ) || sz <= 0 ) return 0;
+ memcpy( mb->buffer + mb->stream_pos, inbuf, sz );
+ mb->stream_pos += sz;
+ return sz;
+ }
+
+static inline int Mb_true_match_len( const struct Matchfinder_base * const mb,
+ const int index, const int distance )
+ {
+ const uint8_t * const data = mb->buffer + mb->pos;
+ int i = index;
+ const int len_limit = min( Mb_available_bytes( mb ), max_match_len );
+ while( i < len_limit && data[i-distance] == data[i] ) ++i;
+ return i;
+ }
+
+static inline bool Mb_move_pos( struct Matchfinder_base * const mb )
+ {
+ if( ++mb->cyclic_pos > mb->dictionary_size ) mb->cyclic_pos = 0;
+ if( ++mb->pos >= mb->pos_limit ) return Mb_normalize_pos( mb );
+ return true;
+ }
+
+
+struct Range_encoder
+ {
+ struct Circular_buffer cb;
+ unsigned min_free_bytes;
+ uint64_t low;
+ unsigned long long partial_member_pos;
+ uint32_t range;
+ unsigned ff_count;
+ uint8_t cache;
+ Lzip_header header;
+ };
+
+static inline void Re_shift_low( struct Range_encoder * const renc )
+ {
+ if( renc->low >> 24 != 0xFF )
+ {
+ const bool carry = ( renc->low > 0xFFFFFFFFU );
+ Cb_put_byte( &renc->cb, renc->cache + carry );
+ for( ; renc->ff_count > 0; --renc->ff_count )
+ Cb_put_byte( &renc->cb, 0xFF + carry );
+ renc->cache = renc->low >> 24;
+ }
+ else ++renc->ff_count;
+ renc->low = ( renc->low & 0x00FFFFFFU ) << 8;
+ }
+
+static inline void Re_reset( struct Range_encoder * const renc,
+ const unsigned dictionary_size )
+ {
+ Cb_reset( &renc->cb );
+ renc->low = 0;
+ renc->partial_member_pos = 0;
+ renc->range = 0xFFFFFFFFU;
+ renc->ff_count = 0;
+ renc->cache = 0;
+ Lh_set_dictionary_size( renc->header, dictionary_size );
+ int i; for( i = 0; i < Lh_size; ++i ) Cb_put_byte( &renc->cb, renc->header[i] );
+ }
+
+static inline bool Re_init( struct Range_encoder * const renc,
+ const unsigned dictionary_size,
+ const unsigned min_free_bytes )
+ {
+ if( !Cb_init( &renc->cb, 65536 + min_free_bytes ) ) return false;
+ renc->min_free_bytes = min_free_bytes;
+ Lh_set_magic( renc->header );
+ Re_reset( renc, dictionary_size );
+ return true;
+ }
+
+static inline void Re_free( struct Range_encoder * const renc )
+ { Cb_free( &renc->cb ); }
+
+static inline unsigned long long
+Re_member_position( const struct Range_encoder * const renc )
+ { return renc->partial_member_pos + Cb_used_bytes( &renc->cb ) + renc->ff_count; }
+
+static inline bool Re_enough_free_bytes( const struct Range_encoder * const renc )
+ { return Cb_free_bytes( &renc->cb ) >= renc->min_free_bytes + renc->ff_count; }
+
+static inline int Re_read_data( struct Range_encoder * const renc,
+ uint8_t * const out_buffer, const int out_size )
+ {
+ const int size = Cb_read_data( &renc->cb, out_buffer, out_size );
+ if( size > 0 ) renc->partial_member_pos += size;
+ return size;
+ }
+
+static inline void Re_flush( struct Range_encoder * const renc )
+ {
+ int i; for( i = 0; i < 5; ++i ) Re_shift_low( renc );
+ renc->low = 0;
+ renc->range = 0xFFFFFFFFU;
+ renc->ff_count = 0;
+ renc->cache = 0;
+ }
+
+static inline void Re_encode( struct Range_encoder * const renc,
+ const int symbol, const int num_bits )
+ {
+ unsigned mask;
+ for( mask = 1 << ( num_bits - 1 ); mask > 0; mask >>= 1 )
+ {
+ renc->range >>= 1;
+ if( symbol & mask ) renc->low += renc->range;
+ if( renc->range <= 0x00FFFFFFU ) { renc->range <<= 8; Re_shift_low( renc ); }
+ }
+ }
+
+static inline void Re_encode_bit( struct Range_encoder * const renc,
+ Bit_model * const probability, const bool bit )
+ {
+ const uint32_t bound = ( renc->range >> bit_model_total_bits ) * *probability;
+ if( !bit )
+ {
+ renc->range = bound;
+ *probability += (bit_model_total - *probability) >> bit_model_move_bits;
+ }
+ else
+ {
+ renc->low += bound;
+ renc->range -= bound;
+ *probability -= *probability >> bit_model_move_bits;
+ }
+ if( renc->range <= 0x00FFFFFFU ) { renc->range <<= 8; Re_shift_low( renc ); }
+ }
+
+static inline void Re_encode_tree3( struct Range_encoder * const renc,
+ Bit_model bm[], const int symbol )
+ {
+ bool bit = ( symbol >> 2 ) & 1;
+ Re_encode_bit( renc, &bm[1], bit );
+ int model = 2 | bit;
+ bit = ( symbol >> 1 ) & 1;
+ Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit;
+ Re_encode_bit( renc, &bm[model], symbol & 1 );
+ }
+
+static inline void Re_encode_tree6( struct Range_encoder * const renc,
+ Bit_model bm[], const unsigned symbol )
+ {
+ bool bit = ( symbol >> 5 ) & 1;
+ Re_encode_bit( renc, &bm[1], bit );
+ int model = 2 | bit;
+ bit = ( symbol >> 4 ) & 1;
+ Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit;
+ bit = ( symbol >> 3 ) & 1;
+ Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit;
+ bit = ( symbol >> 2 ) & 1;
+ Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit;
+ bit = ( symbol >> 1 ) & 1;
+ Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit;
+ Re_encode_bit( renc, &bm[model], symbol & 1 );
+ }
+
+static inline void Re_encode_tree8( struct Range_encoder * const renc,
+ Bit_model bm[], const int symbol )
+ {
+ int model = 1;
+ int i;
+ for( i = 7; i >= 0; --i )
+ {
+ const bool bit = ( symbol >> i ) & 1;
+ Re_encode_bit( renc, &bm[model], bit );
+ model <<= 1; model |= bit;
+ }
+ }
+
+static inline void Re_encode_tree_reversed( struct Range_encoder * const renc,
+ Bit_model bm[], int symbol, const int num_bits )
+ {
+ int model = 1;
+ int i;
+ for( i = num_bits; i > 0; --i )
+ {
+ const bool bit = symbol & 1;
+ symbol >>= 1;
+ Re_encode_bit( renc, &bm[model], bit );
+ model <<= 1; model |= bit;
+ }
+ }
+
+static inline void Re_encode_matched( struct Range_encoder * const renc,
+ Bit_model bm[], unsigned symbol,
+ unsigned match_byte )
+ {
+ unsigned mask = 0x100;
+ symbol |= mask;
+ while( true )
+ {
+ const unsigned match_bit = ( match_byte <<= 1 ) & mask;
+ const bool bit = ( symbol <<= 1 ) & 0x100;
+ Re_encode_bit( renc, &bm[(symbol>>9)+match_bit+mask], bit );
+ if( symbol >= 0x10000 ) break;
+ mask &= ~(match_bit ^ symbol); /* if( match_bit != bit ) mask = 0; */
+ }
+ }
+
+static inline void Re_encode_len( struct Range_encoder * const renc,
+ struct Len_model * const lm,
+ int symbol, const int pos_state )
+ {
+ bool bit = ( ( symbol -= min_match_len ) >= len_low_symbols );
+ Re_encode_bit( renc, &lm->choice1, bit );
+ if( !bit )
+ Re_encode_tree3( renc, lm->bm_low[pos_state], symbol );
+ else
+ {
+ bit = ( ( symbol -= len_low_symbols ) >= len_mid_symbols );
+ Re_encode_bit( renc, &lm->choice2, bit );
+ if( !bit )
+ Re_encode_tree3( renc, lm->bm_mid[pos_state], symbol );
+ else
+ Re_encode_tree8( renc, lm->bm_high, symbol - len_mid_symbols );
+ }
+ }
+
+
+enum { max_marker_size = 16,
+ num_rep_distances = 4 }; /* must be 4 */
+
+struct LZ_encoder_base
+ {
+ struct Matchfinder_base mb;
+ unsigned long long member_size_limit;
+ uint32_t crc;
+
+ Bit_model bm_literal[1<<literal_context_bits][0x300];
+ Bit_model bm_match[states][pos_states];
+ Bit_model bm_rep[states];
+ Bit_model bm_rep0[states];
+ Bit_model bm_rep1[states];
+ Bit_model bm_rep2[states];
+ Bit_model bm_len[states][pos_states];
+ Bit_model bm_dis_slot[len_states][1<<dis_slot_bits];
+ Bit_model bm_dis[modeled_distances-end_dis_model+1];
+ Bit_model bm_align[dis_align_size];
+ struct Len_model match_len_model;
+ struct Len_model rep_len_model;
+ struct Range_encoder renc;
+ int reps[num_rep_distances];
+ State state;
+ bool member_finished;
+ };
+
+static void LZeb_reset( struct LZ_encoder_base * const eb,
+ const unsigned long long member_size );
+
+static inline bool LZeb_init( struct LZ_encoder_base * const eb,
+ const int before_size, const int dict_size,
+ const int after_size, const int dict_factor,
+ const int num_prev_positions23,
+ const int pos_array_factor,
+ const unsigned min_free_bytes,
+ const unsigned long long member_size )
+ {
+ if( !Mb_init( &eb->mb, before_size, dict_size, after_size, dict_factor,
+ num_prev_positions23, pos_array_factor ) ) return false;
+ if( !Re_init( &eb->renc, eb->mb.dictionary_size, min_free_bytes ) )
+ return false;
+ LZeb_reset( eb, member_size );
+ return true;
+ }
+
+static inline bool LZeb_member_finished( const struct LZ_encoder_base * const eb )
+ { return eb->member_finished && Cb_empty( &eb->renc.cb ); }
+
+static inline void LZeb_free( struct LZ_encoder_base * const eb )
+ { Re_free( &eb->renc ); Mb_free( &eb->mb ); }
+
+static inline unsigned LZeb_crc( const struct LZ_encoder_base * const eb )
+ { return eb->crc ^ 0xFFFFFFFFU; }
+
+static inline int LZeb_price_literal( const struct LZ_encoder_base * const eb,
+ const uint8_t prev_byte, const uint8_t symbol )
+ { return price_symbol8( eb->bm_literal[get_lit_state(prev_byte)], symbol ); }
+
+static inline int LZeb_price_matched( const struct LZ_encoder_base * const eb,
+ const uint8_t prev_byte, const uint8_t symbol, const uint8_t match_byte )
+ { return price_matched( eb->bm_literal[get_lit_state(prev_byte)], symbol,
+ match_byte ); }
+
+static inline void LZeb_encode_literal( struct LZ_encoder_base * const eb,
+ const uint8_t prev_byte, const uint8_t symbol )
+ { Re_encode_tree8( &eb->renc, eb->bm_literal[get_lit_state(prev_byte)], symbol ); }
+
+static inline void LZeb_encode_matched( struct LZ_encoder_base * const eb,
+ const uint8_t prev_byte, const uint8_t symbol, const uint8_t match_byte )
+ { Re_encode_matched( &eb->renc, eb->bm_literal[get_lit_state(prev_byte)],
+ symbol, match_byte ); }
+
+static inline void LZeb_encode_pair( struct LZ_encoder_base * const eb,
+ const unsigned dis, const int len,
+ const int pos_state )
+ {
+ Re_encode_len( &eb->renc, &eb->match_len_model, len, pos_state );
+ const unsigned dis_slot = get_slot( dis );
+ Re_encode_tree6( &eb->renc, eb->bm_dis_slot[get_len_state(len)], dis_slot );
+
+ if( dis_slot >= start_dis_model )
+ {
+ const int direct_bits = ( dis_slot >> 1 ) - 1;
+ const unsigned base = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
+ const unsigned direct_dis = dis - base;
+
+ if( dis_slot < end_dis_model )
+ Re_encode_tree_reversed( &eb->renc, eb->bm_dis + ( base - dis_slot ),
+ direct_dis, direct_bits );
+ else
+ {
+ Re_encode( &eb->renc, direct_dis >> dis_align_bits,
+ direct_bits - dis_align_bits );
+ Re_encode_tree_reversed( &eb->renc, eb->bm_align, direct_dis, dis_align_bits );
+ }
+ }
+ }
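
Note on get_slot and LZeb_encode_pair above: the dis_slots table maps a distance to a slot that encodes the position of its highest set bit plus the bit just below it, and LZeb_encode_pair then splits the distance into a base derived from the slot and a remainder of direct_bits bits. The sketch below computes the same slot without a table and redoes the split. The names slot_of, split_distance and struct Split are hypothetical, and the threshold 4 stands in for start_dis_model, whose value is assumed because it is defined elsewhere.

```c
#include <stdint.h>

/* Table-free distance slot (hypothetical helper, not part of lzlib).
   Distances 0..3 are their own slot; above that the slot is twice the
   index of the highest set bit plus the value of the next lower bit. */
static unsigned slot_of( const uint32_t dis )
  {
  unsigned top = 31;
  if( dis < 4 ) return dis;
  while( ( dis >> top ) == 0 ) --top;	/* index of highest set bit */
  return 2 * top + ( ( dis >> ( top - 1 ) ) & 1 );
  }

struct Split { unsigned slot; int direct_bits; uint32_t base, direct_dis; };

/* Split a distance the way LZeb_encode_pair does for slots >= 4
   (assumed value of start_dis_model). Smaller slots carry no extra bits. */
static struct Split split_distance( const uint32_t dis )
  {
  struct Split s;
  s.slot = slot_of( dis );
  if( s.slot < 4 )
    { s.direct_bits = 0; s.base = dis; s.direct_dis = 0; return s; }
  s.direct_bits = (int)( s.slot >> 1 ) - 1;
  s.base = (uint32_t)( 2 | ( s.slot & 1 ) ) << s.direct_bits;
  s.direct_dis = dis - s.base;		/* fits in direct_bits bits */
  return s;
  }
```

For the marker distance 0xFFFFFFFFU used by the flush functions, this gives slot 63 with 30 direct bits, which agrees with the last branch of get_slot (dis_slots[31] + 54).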
diff --git a/fast_encoder.c b/fast_encoder.c
new file mode 100644
index 0000000..bb6363a
--- /dev/null
+++ b/fast_encoder.c
@@ -0,0 +1,175 @@
+/* Lzlib - Compression library for the lzip format
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+static int FLZe_longest_match_len( struct FLZ_encoder * const fe, int * const distance )
+ {
+ enum { len_limit = 16 };
+ int32_t * ptr0 = fe->eb.mb.pos_array + fe->eb.mb.cyclic_pos;
+ const int available = min( Mb_available_bytes( &fe->eb.mb ), max_match_len );
+ if( available < len_limit ) { *ptr0 = 0; return 0; }
+
+ const uint8_t * const data = Mb_ptr_to_current_pos( &fe->eb.mb );
+ fe->key4 = ( ( fe->key4 << 4 ) ^ data[3] ) & fe->eb.mb.key4_mask;
+ const int pos1 = fe->eb.mb.pos + 1;
+ int newpos1 = fe->eb.mb.prev_positions[fe->key4];
+ fe->eb.mb.prev_positions[fe->key4] = pos1;
+ int maxlen = 0, count;
+
+ for( count = 4; ; )
+ {
+ int delta;
+ if( newpos1 <= 0 || --count < 0 ||
+ ( delta = pos1 - newpos1 ) > fe->eb.mb.dictionary_size )
+ { *ptr0 = 0; break; }
+ int32_t * const newptr = fe->eb.mb.pos_array +
+ ( fe->eb.mb.cyclic_pos - delta +
+ ( ( fe->eb.mb.cyclic_pos >= delta ) ? 0 : fe->eb.mb.dictionary_size + 1 ) );
+
+ if( data[maxlen-delta] == data[maxlen] )
+ {
+ int len = 0;
+ while( len < available && data[len-delta] == data[len] ) ++len;
+ if( maxlen < len )
+ { maxlen = len; *distance = delta - 1;
+ if( maxlen >= len_limit ) { *ptr0 = *newptr; break; } }
+ }
+
+ *ptr0 = newpos1;
+ ptr0 = newptr;
+ newpos1 = *ptr0;
+ }
+ return maxlen;
+ }
+
+
+static bool FLZe_encode_member( struct FLZ_encoder * const fe )
+ {
+ int rep = 0, i;
+ State * const state = &fe->eb.state;
+
+ if( fe->eb.member_finished ) return true;
+ if( Re_member_position( &fe->eb.renc ) >= fe->eb.member_size_limit )
+ { LZeb_try_full_flush( &fe->eb ); return true; }
+
+ if( Mb_data_position( &fe->eb.mb ) == 0 &&
+ !Mb_data_finished( &fe->eb.mb ) ) /* encode first byte */
+ {
+ if( !Mb_enough_available_bytes( &fe->eb.mb ) ||
+ !Re_enough_free_bytes( &fe->eb.renc ) ) return true;
+ const uint8_t prev_byte = 0;
+ const uint8_t cur_byte = Mb_peek( &fe->eb.mb, 0 );
+ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_match[*state][0], 0 );
+ LZeb_encode_literal( &fe->eb, prev_byte, cur_byte );
+ CRC32_update_byte( &fe->eb.crc, cur_byte );
+ FLZe_reset_key4( fe );
+ if( !FLZe_update_and_move( fe, 1 ) ) return false;
+ }
+
+ while( !Mb_data_finished( &fe->eb.mb ) &&
+ Re_member_position( &fe->eb.renc ) < fe->eb.member_size_limit )
+ {
+ if( !Mb_enough_available_bytes( &fe->eb.mb ) ||
+ !Re_enough_free_bytes( &fe->eb.renc ) ) return true;
+ int match_distance = 0; /* avoid warning from gcc 6.1.0 */
+ const int main_len = FLZe_longest_match_len( fe, &match_distance );
+ const int pos_state = Mb_data_position( &fe->eb.mb ) & pos_state_mask;
+ int len = 0;
+
+ for( i = 0; i < num_rep_distances; ++i )
+ {
+ const int tlen = Mb_true_match_len( &fe->eb.mb, 0, fe->eb.reps[i] + 1 );
+ if( tlen > len ) { len = tlen; rep = i; }
+ }
+ if( len > min_match_len && len + 3 > main_len )
+ {
+ CRC32_update_buf( &fe->eb.crc, Mb_ptr_to_current_pos( &fe->eb.mb ), len );
+ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_match[*state][pos_state], 1 );
+ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep[*state], 1 );
+ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep0[*state], rep != 0 );
+ if( rep == 0 )
+ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_len[*state][pos_state], 1 );
+ else
+ {
+ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep1[*state], rep > 1 );
+ if( rep > 1 )
+ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep2[*state], rep > 2 );
+ const int distance = fe->eb.reps[rep];
+ for( i = rep; i > 0; --i ) fe->eb.reps[i] = fe->eb.reps[i-1];
+ fe->eb.reps[0] = distance;
+ }
+ *state = St_set_rep( *state );
+ Re_encode_len( &fe->eb.renc, &fe->eb.rep_len_model, len, pos_state );
+ if( !Mb_move_pos( &fe->eb.mb ) ) return false;
+ if( !FLZe_update_and_move( fe, len - 1 ) ) return false;
+ continue;
+ }
+
+ if( main_len > min_match_len )
+ {
+ CRC32_update_buf( &fe->eb.crc, Mb_ptr_to_current_pos( &fe->eb.mb ), main_len );
+ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_match[*state][pos_state], 1 );
+ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep[*state], 0 );
+ *state = St_set_match( *state );
+ for( i = num_rep_distances - 1; i > 0; --i ) fe->eb.reps[i] = fe->eb.reps[i-1];
+ fe->eb.reps[0] = match_distance;
+ LZeb_encode_pair( &fe->eb, match_distance, main_len, pos_state );
+ if( !Mb_move_pos( &fe->eb.mb ) ) return false;
+ if( !FLZe_update_and_move( fe, main_len - 1 ) ) return false;
+ continue;
+ }
+
+ const uint8_t prev_byte = Mb_peek( &fe->eb.mb, 1 );
+ const uint8_t cur_byte = Mb_peek( &fe->eb.mb, 0 );
+ const uint8_t match_byte = Mb_peek( &fe->eb.mb, fe->eb.reps[0] + 1 );
+ if( !Mb_move_pos( &fe->eb.mb ) ) return false;
+ CRC32_update_byte( &fe->eb.crc, cur_byte );
+
+ if( match_byte == cur_byte )
+ {
+ const int short_rep_price = price1( fe->eb.bm_match[*state][pos_state] ) +
+ price1( fe->eb.bm_rep[*state] ) +
+ price0( fe->eb.bm_rep0[*state] ) +
+ price0( fe->eb.bm_len[*state][pos_state] );
+ int price = price0( fe->eb.bm_match[*state][pos_state] );
+ if( St_is_char( *state ) )
+ price += LZeb_price_literal( &fe->eb, prev_byte, cur_byte );
+ else
+ price += LZeb_price_matched( &fe->eb, prev_byte, cur_byte, match_byte );
+ if( short_rep_price < price )
+ {
+ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_match[*state][pos_state], 1 );
+ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep[*state], 1 );
+ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep0[*state], 0 );
+ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_len[*state][pos_state], 0 );
+ *state = St_set_short_rep( *state );
+ continue;
+ }
+ }
+
+ /* literal byte */
+ Re_encode_bit( &fe->eb.renc, &fe->eb.bm_match[*state][pos_state], 0 );
+ if( ( *state = St_set_char( *state ) ) < 4 )
+ LZeb_encode_literal( &fe->eb, prev_byte, cur_byte );
+ else
+ LZeb_encode_matched( &fe->eb, prev_byte, cur_byte, match_byte );
+ }
+
+ LZeb_try_full_flush( &fe->eb );
+ return true;
+ }
diff --git a/fast_encoder.h b/fast_encoder.h
new file mode 100644
index 0000000..b9421f4
--- /dev/null
+++ b/fast_encoder.h
@@ -0,0 +1,70 @@
+/* Lzlib - Compression library for the lzip format
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+struct FLZ_encoder
+ {
+ struct LZ_encoder_base eb;
+ unsigned key4; /* key made from latest 4 bytes */
+ };
+
+static inline void FLZe_reset_key4( struct FLZ_encoder * const fe )
+ {
+ int i;
+ fe->key4 = 0;
+ for( i = 0; i < 3 && i < Mb_available_bytes( &fe->eb.mb ); ++i )
+ fe->key4 = ( fe->key4 << 4 ) ^ fe->eb.mb.buffer[i];
+ }
+
+static inline bool FLZe_update_and_move( struct FLZ_encoder * const fe, int n )
+ {
+ struct Matchfinder_base * const mb = &fe->eb.mb;
+ while( --n >= 0 )
+ {
+ if( Mb_available_bytes( mb ) >= 4 )
+ {
+ fe->key4 = ( ( fe->key4 << 4 ) ^ mb->buffer[mb->pos+3] ) & mb->key4_mask;
+ mb->pos_array[mb->cyclic_pos] = mb->prev_positions[fe->key4];
+ mb->prev_positions[fe->key4] = mb->pos + 1;
+ }
+ else mb->pos_array[mb->cyclic_pos] = 0;
+ if( !Mb_move_pos( mb ) ) return false;
+ }
+ return true;
+ }
+
+static inline bool FLZe_init( struct FLZ_encoder * const fe,
+ const unsigned long long member_size )
+ {
+ enum { before_size = 0,
+ dict_size = 65536,
+ /* bytes to keep in buffer after pos */
+ after_size = max_match_len,
+ dict_factor = 16,
+ min_free_bytes = max_marker_size,
+ num_prev_positions23 = 0,
+ pos_array_factor = 1 };
+
+ return LZeb_init( &fe->eb, before_size, dict_size, after_size, dict_factor,
+ num_prev_positions23, pos_array_factor, min_free_bytes,
+ member_size );
+ }
+
+static inline void FLZe_reset( struct FLZ_encoder * const fe,
+ const unsigned long long member_size )
+ { LZeb_reset( &fe->eb, member_size ); }
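Note on fast_encoder.h: the key4 update in FLZe_update_and_move shifts the old key left by 4 bits, XORs in the newest byte, and masks the result, so bytes older than the window fall off the top of the key. The sketch below illustrates that property in isolation. It is not part of the patch, and the 16-bit mask (toy_key4_mask) and the helper names are assumptions chosen for the illustration; the real key4_mask is set elsewhere in the library and is not shown in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a 16-bit mask, picked only so that exactly 4 bytes
   (4 shifts of 4 bits) can still influence the key. */
enum { toy_key4_mask = 0xFFFF };

/* One rolling step, same shape as the update in FLZe_update_and_move. */
static unsigned roll_key4( const unsigned key, const uint8_t byte )
  { return ( ( key << 4 ) ^ byte ) & toy_key4_mask; }

/* The same key computed directly from the 4 bytes at p[0..3]. */
static unsigned direct_key4( const uint8_t * const p )
  {
  return ( ( (unsigned)p[0] << 12 ) ^ ( (unsigned)p[1] << 8 ) ^
           ( (unsigned)p[2] << 4 ) ^ p[3] ) & toy_key4_mask;
  }

/* Return 1 if the rolling key equals the directly computed key at every
   position of data[0..size-1], else 0. Requires size >= 4. */
static int keys_agree( const uint8_t * const data, const int size )
  {
  unsigned key = 0;
  int i;
  assert( size >= 4 );
  for( i = 0; i < 3; ++i ) key = roll_key4( key, data[i] );	/* prime */
  for( i = 3; i < size; ++i )
    {
    key = roll_key4( key, data[i] );
    if( key != direct_key4( data + i - 3 ) ) return 0;
    }
  return 1;
  }
```

With a wider mask the key would also retain bits from older bytes, so the "latest 4 bytes" reading depends on the mask width assumed here.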
diff --git a/ffexample.c b/ffexample.c
new file mode 100644
index 0000000..826abcd
--- /dev/null
+++ b/ffexample.c
@@ -0,0 +1,300 @@
+/* File to file example - Test program for the library lzlib
+ Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+ This program is free software: you have unlimited permission
+ to copy, distribute, and modify it.
+
+ Try 'ffexample -h' for usage information.
+
+ This program is an example of how file-to-file
+ compression/decompression can be implemented using lzlib.
+*/
+
+#define _FILE_OFFSET_BITS 64
+
+#include <errno.h>
+#include <limits.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__
+#include <fcntl.h>
+#include <io.h>
+#endif
+
+#include "lzlib.h"
+
+#ifndef min
+ #define min(x,y) ((x) <= (y) ? (x) : (y))
+#endif
+
+
+static void show_help( void )
+ {
+ printf( "ffexample is an example program showing how file-to-file (de)compression can\n"
+ "be implemented using lzlib. The content of infile is compressed,\n"
+ "decompressed, or both, and then written to outfile.\n"
+ "\nUsage: ffexample operation [infile [outfile]]\n" );
+ printf( "\nOperation:\n"
+ " -h display this help and exit\n"
+ " -c compress infile to outfile\n"
+ " -d decompress infile to outfile\n"
+ " -b both (compress then decompress) infile to outfile\n"
+ " -m compress (multimember) infile to outfile\n"
+ " -l compress (1 member per line) infile to outfile\n"
+ " -r decompress with resync if data error or leading garbage\n"
+ "\nIf infile or outfile are omitted, or are specified as '-', standard input or\n"
+ "standard output are used in their place respectively.\n"
+ "\nReport bugs to lzip-bug@nongnu.org\n"
+ "Lzlib home page: http://www.nongnu.org/lzip/lzlib.html\n" );
+ }
+
+
+int ffcompress( struct LZ_Encoder * const encoder,
+ FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384 };
+ uint8_t buffer[buffer_size];
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_compress_write_size( encoder ) );
+ if( size > 0 )
+ {
+ len = fread( buffer, 1, size, infile );
+ ret = LZ_compress_write( encoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) ) LZ_compress_finish( encoder );
+ }
+ ret = LZ_compress_read( encoder, buffer, buffer_size );
+ if( ret < 0 ) break;
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_compress_finished( encoder ) == 1 ) return 0;
+ }
+ return 1;
+ }
+
+
+int ffdecompress( struct LZ_Decoder * const decoder,
+ FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384 };
+ uint8_t buffer[buffer_size];
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_decompress_write_size( decoder ) );
+ if( size > 0 )
+ {
+ len = fread( buffer, 1, size, infile );
+ ret = LZ_decompress_write( decoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) ) LZ_decompress_finish( decoder );
+ }
+ ret = LZ_decompress_read( decoder, buffer, buffer_size );
+ if( ret < 0 ) break;
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_decompress_finished( decoder ) == 1 ) return 0;
+ }
+ return 1;
+ }
+
+
+int ffboth( struct LZ_Encoder * const encoder,
+ struct LZ_Decoder * const decoder,
+ FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384 };
+ uint8_t buffer[buffer_size];
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_compress_write_size( encoder ) );
+ if( size > 0 )
+ {
+ len = fread( buffer, 1, size, infile );
+ ret = LZ_compress_write( encoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) ) LZ_compress_finish( encoder );
+ }
+ size = min( buffer_size, LZ_decompress_write_size( decoder ) );
+ if( size > 0 )
+ {
+ ret = LZ_compress_read( encoder, buffer, size );
+ if( ret < 0 ) break;
+ ret = LZ_decompress_write( decoder, buffer, ret );
+ if( ret < 0 ) break;
+ if( LZ_compress_finished( encoder ) == 1 )
+ LZ_decompress_finish( decoder );
+ }
+ ret = LZ_decompress_read( decoder, buffer, buffer_size );
+ if( ret < 0 ) break;
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_decompress_finished( decoder ) == 1 ) return 0;
+ }
+ return 1;
+ }
+
+
+int ffmmcompress( FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384, member_size = 4096 };
+ uint8_t buffer[buffer_size];
+ bool done = false;
+ struct LZ_Encoder * const encoder =
+ LZ_compress_open( 65535, 16, member_size );
+ if( !encoder || LZ_compress_errno( encoder ) != LZ_ok )
+ { fputs( "ffexample: Not enough memory.\n", stderr );
+ LZ_compress_close( encoder ); return 1; }
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_compress_write_size( encoder ) );
+ if( size > 0 )
+ {
+ len = fread( buffer, 1, size, infile );
+ ret = LZ_compress_write( encoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) ) LZ_compress_finish( encoder );
+ }
+ ret = LZ_compress_read( encoder, buffer, buffer_size );
+ if( ret < 0 ) break;
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_compress_member_finished( encoder ) == 1 )
+ {
+ if( LZ_compress_finished( encoder ) == 1 ) { done = true; break; }
+ if( LZ_compress_restart_member( encoder, member_size ) < 0 ) break;
+ }
+ }
+ if( LZ_compress_close( encoder ) < 0 ) done = false;
+ return !done;	/* 0 if success, 1 if error, like the other ff* functions */
+ }
+
+
+/* Compress 'infile' to 'outfile' as a multimember stream with one member
+ for each line of text terminated by a newline character or by EOF.
+ Return 0 if success, 1 if error.
+*/
+int fflfcompress( struct LZ_Encoder * const encoder,
+ FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384 };
+ uint8_t buffer[buffer_size];
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_compress_write_size( encoder ) );
+ if( size > 0 )
+ {
+ for( len = 0; len < size; )
+ {
+ int ch = getc( infile );
+ if( ch == EOF || ( buffer[len++] = ch ) == '\n' ) break;
+ }
+ /* avoid writing an empty member to outfile */
+ if( len == 0 && LZ_compress_data_position( encoder ) == 0 ) return 0;
+ ret = LZ_compress_write( encoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) || buffer[len-1] == '\n' )
+ LZ_compress_finish( encoder );
+ }
+ ret = LZ_compress_read( encoder, buffer, buffer_size );
+ if( ret < 0 ) break;
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_compress_member_finished( encoder ) == 1 )
+ {
+ if( feof( infile ) && LZ_compress_finished( encoder ) == 1 ) return 0;
+ if( LZ_compress_restart_member( encoder, INT64_MAX ) < 0 ) break;
+ }
+ }
+ return 1;
+ }
+
+
+/* Decompress 'infile' to 'outfile' with automatic resynchronization to
+ next member in case of data error, including the automatic removal of
+ leading garbage.
+*/
+int ffrsdecompress( struct LZ_Decoder * const decoder,
+ FILE * const infile, FILE * const outfile )
+ {
+ enum { buffer_size = 16384 };
+ uint8_t buffer[buffer_size];
+ while( true )
+ {
+ int len, ret;
+ int size = min( buffer_size, LZ_decompress_write_size( decoder ) );
+ if( size > 0 )
+ {
+ len = fread( buffer, 1, size, infile );
+ ret = LZ_decompress_write( decoder, buffer, len );
+ if( ret < 0 || ferror( infile ) ) break;
+ if( feof( infile ) ) LZ_decompress_finish( decoder );
+ }
+ ret = LZ_decompress_read( decoder, buffer, buffer_size );
+ if( ret < 0 )
+ {
+ if( LZ_decompress_errno( decoder ) == LZ_header_error ||
+ LZ_decompress_errno( decoder ) == LZ_data_error )
+ { LZ_decompress_sync_to_member( decoder ); continue; }
+ break;
+ }
+ len = fwrite( buffer, 1, ret, outfile );
+ if( len < ret ) break;
+ if( LZ_decompress_finished( decoder ) == 1 ) return 0;
+ }
+ return 1;
+ }
+
+
+int main( const int argc, const char * const argv[] )
+ {
+#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__
+ setmode( STDIN_FILENO, O_BINARY );
+ setmode( STDOUT_FILENO, O_BINARY );
+#endif
+
+ struct LZ_Encoder * const encoder = LZ_compress_open( 65535, 16, INT64_MAX );
+ struct LZ_Decoder * const decoder = LZ_decompress_open();
+ FILE * const infile = ( argc >= 3 && strcmp( argv[2], "-" ) != 0 ) ?
+ fopen( argv[2], "rb" ) : stdin;
+ FILE * const outfile = ( argc >= 4 && strcmp( argv[3], "-" ) != 0 ) ?
+ fopen( argv[3], "wb" ) : stdout;
+ int retval;
+
+ if( argc < 2 || argc > 4 || strlen( argv[1] ) != 2 || argv[1][0] != '-' )
+ { show_help(); return 1; }
+ if( !encoder || LZ_compress_errno( encoder ) != LZ_ok ||
+ !decoder || LZ_decompress_errno( decoder ) != LZ_ok )
+ { fputs( "ffexample: Not enough memory.\n", stderr );
+ LZ_compress_close( encoder ); LZ_decompress_close( decoder ); return 1; }
+ if( !infile )
+ { fprintf( stderr, "ffexample: %s: Can't open input file: %s\n",
+ argv[2], strerror( errno ) ); return 1; }
+ if( !outfile )
+ { fprintf( stderr, "ffexample: %s: Can't open output file: %s\n",
+ argv[3], strerror( errno ) ); return 1; }
+
+ switch( argv[1][1] )
+ {
+ case 'c': retval = ffcompress( encoder, infile, outfile ); break;
+ case 'd': retval = ffdecompress( decoder, infile, outfile ); break;
+ case 'b': retval = ffboth( encoder, decoder, infile, outfile ); break;
+ case 'm': retval = ffmmcompress( infile, outfile ); break;
+ case 'l': retval = fflfcompress( encoder, infile, outfile ); break;
+ case 'r': retval = ffrsdecompress( decoder, infile, outfile ); break;
+ default: show_help(); return argv[1][1] != 'h';
+ }
+
+ if( LZ_decompress_close( decoder ) < 0 || LZ_compress_close( encoder ) < 0 ||
+ fclose( outfile ) != 0 || fclose( infile ) != 0 ) retval = 1;
+ return retval;
+ }
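Note on ffexample.c: every ff* function above follows the same pump loop. It asks the codec how much it can accept, writes at most that much, signals finish once the input is exhausted, then drains the output until the codec reports that it has finished. The sketch below reproduces only that loop shape, memory to memory, against a hypothetical pass-through "codec" (Toy_codec and the toy_* functions are stand-ins invented for this illustration, not lzlib API), so it can be built without the library.

```c
#include <string.h>

/* Hypothetical stand-in for a streaming codec: a small FIFO that passes
   bytes through unchanged and accepts no more input once finishing. */
enum { toy_capacity = 8 };
struct Toy_codec { unsigned char buf[toy_capacity]; int used; int finishing; };

static int toy_write_size( const struct Toy_codec * const c )
  { return c->finishing ? 0 : toy_capacity - c->used; }

static int toy_write( struct Toy_codec * const c,
                      const unsigned char * const data, const int size )
  {
  const int room = toy_write_size( c );
  const int n = ( size < room ) ? size : room;
  memcpy( c->buf + c->used, data, n ); c->used += n;
  return n;
  }

static void toy_finish( struct Toy_codec * const c ) { c->finishing = 1; }

static int toy_read( struct Toy_codec * const c,
                     unsigned char * const out, const int size )
  {
  const int n = ( size < c->used ) ? size : c->used;
  memcpy( out, c->buf, n );
  memmove( c->buf, c->buf + n, c->used - n ); c->used -= n;
  return n;
  }

static int toy_finished( const struct Toy_codec * const c )
  { return c->finishing && c->used == 0; }

/* Same loop shape as ffcompress. Return bytes produced, or -1 if out is
   too small. */
static int pump( const unsigned char * const in, const int in_size,
                 unsigned char * const out, const int out_cap )
  {
  enum { buffer_size = 4 };
  struct Toy_codec c = { { 0 }, 0, 0 };
  unsigned char buffer[buffer_size];
  int in_pos = 0, out_pos = 0;
  while( 1 )
    {
    const int room = toy_write_size( &c );
    const int size = ( buffer_size < room ) ? buffer_size : room;
    int rd;
    if( size > 0 )
      {
      const int left = in_size - in_pos;
      in_pos += toy_write( &c, in + in_pos, ( size < left ) ? size : left );
      if( in_pos == in_size ) toy_finish( &c );		/* end of input */
      }
    rd = toy_read( &c, buffer, buffer_size );
    if( out_pos + rd > out_cap ) return -1;
    memcpy( out + out_pos, buffer, rd ); out_pos += rd;
    if( toy_finished( &c ) ) return out_pos;
    }
  }

/* Return 1 if text (at most 256 bytes) survives the pump unchanged. */
static int roundtrip_ok( const char * const text )
  {
  unsigned char out[256];
  const int size = (int)strlen( text );
  return size <= 256 &&
         pump( (const unsigned char *)text, size, out, 256 ) == size &&
         memcmp( out, text, size ) == 0;
  }
```

The point of the shape is that writing and reading are interleaved in one loop, so neither side needs a buffer as large as the whole stream.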
diff --git a/lzcheck.c b/lzcheck.c
new file mode 100644
index 0000000..4f4bf9f
--- /dev/null
+++ b/lzcheck.c
@@ -0,0 +1,400 @@
+/* Lzcheck - Test program for the library lzlib
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This program is free software: you have unlimited permission
+ to copy, distribute, and modify it.
+
+ Usage: lzcheck [-m|-s] filename.txt...
+
+ This program reads each text file specified and then compresses it,
+ line by line, to test the flushing mechanism and the member
+ restart/reset/sync functions.
+*/
+
+#define _FILE_OFFSET_BITS 64
+
+#include <ctype.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/stat.h>
+
+#include "lzlib.h"
+
+
+const unsigned long long member_size = INT64_MAX;
+enum { buffer_size = 32749 }; /* largest prime < 32768 */
+uint8_t in_buffer[buffer_size];
+uint8_t mid_buffer[buffer_size];
+uint8_t out_buffer[buffer_size];
+
+
+static void show_line( const uint8_t * const buffer, const int size )
+ {
+ int i;
+ for( i = 0; i < size; ++i )
+ fputc( isprint( buffer[i] ) ? buffer[i] : '.', stderr );
+ fputc( '\n', stderr );
+ }
+
+
+static struct LZ_Encoder * xopen_encoder( const int dictionary_size )
+ {
+ const int match_len_limit = 16;
+ struct LZ_Encoder * const encoder =
+ LZ_compress_open( dictionary_size, match_len_limit, member_size );
+ if( !encoder || LZ_compress_errno( encoder ) != LZ_ok )
+ {
+ const bool bad_arg =
+ encoder && ( LZ_compress_errno( encoder ) == LZ_bad_argument );
+ LZ_compress_close( encoder );
+ if( bad_arg )
+ {
+ fputs( "lzcheck: internal error: Invalid argument to encoder.\n", stderr );
+ exit( 3 );
+ }
+ fputs( "lzcheck: Not enough memory.\n", stderr );
+ exit( 1 );
+ }
+ return encoder;
+ }
+
+
+static struct LZ_Decoder * xopen_decoder( void )
+ {
+ struct LZ_Decoder * const decoder = LZ_decompress_open();
+ if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok )
+ {
+ LZ_decompress_close( decoder );
+ fputs( "lzcheck: Not enough memory.\n", stderr );
+ exit( 1 );
+ }
+ return decoder;
+ }
+
+
+static void xclose_encoder( struct LZ_Encoder * const encoder,
+ const bool finish )
+ {
+ if( finish )
+ {
+ unsigned long long size = 0;
+ LZ_compress_finish( encoder );
+ while( true )
+ {
+ const int rd = LZ_compress_read( encoder, mid_buffer, buffer_size );
+ if( rd < 0 )
+ {
+ fprintf( stderr, "lzcheck: xclose: LZ_compress_read error: %s\n",
+ LZ_strerror( LZ_compress_errno( encoder ) ) );
+ exit( 3 );
+ }
+ size += rd;
+ if( LZ_compress_finished( encoder ) == 1 ) break;
+ }
+ if( size > 0 )
+ {
+ fprintf( stderr, "lzcheck: %llu bytes remain in encoder.\n", size );
+ exit( 3 );
+ }
+ }
+ if( LZ_compress_close( encoder ) < 0 ) exit( 1 );
+ }
+
+
+static void xclose_decoder( struct LZ_Decoder * const decoder,
+ const bool finish )
+ {
+ if( finish )
+ {
+ unsigned long long size = 0;
+ LZ_decompress_finish( decoder );
+ while( true )
+ {
+ const int rd = LZ_decompress_read( decoder, out_buffer, buffer_size );
+ if( rd < 0 )
+ {
+ fprintf( stderr, "lzcheck: xclose: LZ_decompress_read error: %s\n",
+ LZ_strerror( LZ_decompress_errno( decoder ) ) );
+ exit( 3 );
+ }
+ size += rd;
+ if( LZ_decompress_finished( decoder ) == 1 ) break;
+ }
+ if( size > 0 )
+ {
+ fprintf( stderr, "lzcheck: %llu bytes remain in decoder.\n", size );
+ exit( 3 );
+ }
+ }
+ if( LZ_decompress_close( decoder ) < 0 ) exit( 1 );
+ }
+
+
+/* Return the next (usually newline-terminated) chunk of data from file.
+ The size returned in *sizep is always <= buffer_size.
+ If sizep is a null pointer, rewind the file, reset state, and return.
+ If file is at EOF, return an empty line.
+*/
+static const uint8_t * next_line( FILE * const file, int * const sizep )
+ {
+ static int l = 0;
+ static int read_size = 0;
+ int r;
+
+ if( !sizep ) { rewind( file ); l = read_size = 0; return in_buffer; }
+ if( l >= read_size )
+ {
+ l = 0; read_size = fread( in_buffer, 1, buffer_size, file );
+ if( l >= read_size ) { *sizep = 0; return in_buffer; } /* end of file */
+ }
+
+ for( r = l + 1; r < read_size && in_buffer[r-1] != '\n'; ++r );
+ *sizep = r - l; l = r;
+ return in_buffer + l - *sizep;
+ }
+
+
+static int check_sync_flush( FILE * const file, const int dictionary_size )
+ {
+ struct LZ_Encoder * const encoder = xopen_encoder( dictionary_size );
+ struct LZ_Decoder * const decoder = xopen_decoder();
+ int retval = 0;
+
+ while( retval <= 1 ) /* test LZ_compress_sync_flush */
+ {
+ int in_size, mid_size, out_size;
+ int line_size;
+ const uint8_t * const line_buf = next_line( file, &line_size );
+ if( line_size <= 0 ) break; /* end of file */
+
+ in_size = LZ_compress_write( encoder, line_buf, line_size );
+ if( in_size < 0 )
+ {
+ fprintf( stderr, "lzcheck: LZ_compress_write error: %s\n",
+ LZ_strerror( LZ_compress_errno( encoder ) ) );
+ retval = 3; break;
+ }
+ if( in_size < line_size )
+ {
+ fprintf( stderr, "lzcheck: sync: LZ_compress_write only accepted %d "
+ "of %d bytes\n", in_size, line_size );
+ mid_size = LZ_compress_read( encoder, mid_buffer, buffer_size );
+ const int wr =
+ LZ_compress_write( encoder, line_buf + in_size, line_size - in_size );
+ if( wr < 0 )
+ {
+ fprintf( stderr, "lzcheck: LZ_compress_write error: %s\n",
+ LZ_strerror( LZ_compress_errno( encoder ) ) );
+ retval = 3; break;
+ }
+ if( wr + in_size != line_size )
+ {
+ fprintf( stderr, "lzcheck: sync: LZ_compress_write only accepted %d "
+ "of %d remaining bytes\n", wr, line_size - in_size );
+ retval = 3; break;
+ }
+ in_size += wr;
+ LZ_compress_sync_flush( encoder );
+ const int rd = LZ_compress_read( encoder, mid_buffer + mid_size,
+ buffer_size - mid_size );
+ if( rd > 0 ) mid_size += rd;
+ else if( rd < 0 ) mid_size = -1;
+ }
+ else
+ {
+ LZ_compress_sync_flush( encoder );
+ if( line_buf[0] & 1 ) /* read all data at once or byte by byte */
+ mid_size = LZ_compress_read( encoder, mid_buffer, buffer_size );
+ else for( mid_size = 0; mid_size < buffer_size; )
+ {
+ const int rd = LZ_compress_read( encoder, mid_buffer + mid_size, 1 );
+ if( rd > 0 ) mid_size += rd;
+ else { if( rd < 0 ) { mid_size = -1; } break; }
+ }
+ }
+ if( mid_size < 0 )
+ {
+ fprintf( stderr, "lzcheck: LZ_compress_read error: %s\n",
+ LZ_strerror( LZ_compress_errno( encoder ) ) );
+ retval = 3; break;
+ }
+ LZ_decompress_write( decoder, mid_buffer, mid_size );
+ out_size = LZ_decompress_read( decoder, out_buffer, buffer_size );
+ if( out_size < 0 )
+ {
+ fprintf( stderr, "lzcheck: LZ_decompress_read error: %s\n",
+ LZ_strerror( LZ_decompress_errno( decoder ) ) );
+ retval = 3; break;
+ }
+
+ if( out_size != in_size || memcmp( line_buf, out_buffer, out_size ) )
+ {
+ fprintf( stderr, "lzcheck: LZ_compress_sync_flush error: "
+ "in_size = %d, out_size = %d\n", in_size, out_size );
+ show_line( line_buf, in_size );
+ show_line( out_buffer, out_size );
+ retval = 1;
+ }
+ }
+
+ if( retval <= 1 )
+ {
+ int rd = 0;
+ if( LZ_compress_finish( encoder ) < 0 ||
+ ( rd = LZ_compress_read( encoder, mid_buffer, buffer_size ) ) < 0 )
+ {
+ fprintf( stderr, "lzcheck: Can't drain encoder: %s\n",
+ LZ_strerror( LZ_compress_errno( encoder ) ) );
+ retval = 3;
+ }
+ LZ_decompress_write( decoder, mid_buffer, rd );
+ }
+
+ xclose_decoder( decoder, retval == 0 );
+ xclose_encoder( encoder, retval == 0 );
+ return retval;
+ }
+
+
+/* Test member by member decompression without calling LZ_decompress_finish,
+ inserting leading garbage before some members, and resetting the
+ decompressor sometimes. Test that the increase in total_in_size when
+ syncing to member is equal to the size of the leading garbage skipped.
+*/
+static int check_members( FILE * const file, const int dictionary_size )
+ {
+ struct LZ_Encoder * const encoder = xopen_encoder( dictionary_size );
+ struct LZ_Decoder * const decoder = xopen_decoder();
+ int retval = 0;
+
+ while( retval <= 1 ) /* test LZ_compress_restart_member */
+ {
+ unsigned long long garbage_begin = 0; /* avoid warning from gcc 3.3.6 */
+ int leading_garbage, in_size, mid_size, out_size;
+ int line_size;
+ const uint8_t * const line_buf = next_line( file, &line_size );
+ if( line_size <= 0 && /* end of file, write at least 1 member */
+ LZ_decompress_total_in_size( decoder ) != 0 ) break;
+
+ if( LZ_compress_finished( encoder ) == 1 )
+ {
+ if( LZ_compress_restart_member( encoder, member_size ) < 0 )
+ {
+ fprintf( stderr, "lzcheck: Can't restart member: %s\n",
+ LZ_strerror( LZ_compress_errno( encoder ) ) );
+ retval = 3; break;
+ }
+ if( line_size >= 2 && line_buf[1] == 'h' )
+ LZ_decompress_reset( decoder );
+ }
+ in_size = LZ_compress_write( encoder, line_buf, line_size );
+ if( in_size < line_size )
+ fprintf( stderr, "lzcheck: member: LZ_compress_write only accepted %d of %d bytes\n",
+ in_size, line_size );
+ LZ_compress_finish( encoder );
+ if( line_size * 3 < buffer_size && line_buf[0] == 't' )
+ { leading_garbage = line_size;
+ memset( mid_buffer, in_buffer[0], leading_garbage );
+ garbage_begin = LZ_decompress_total_in_size( decoder ); }
+ else leading_garbage = 0;
+ mid_size = LZ_compress_read( encoder, mid_buffer + leading_garbage,
+ buffer_size - leading_garbage );
+ if( mid_size < 0 )
+ {
+ fprintf( stderr, "lzcheck: member: LZ_compress_read error: %s\n",
+ LZ_strerror( LZ_compress_errno( encoder ) ) );
+ retval = 3; break;
+ }
+ LZ_decompress_write( decoder, mid_buffer, leading_garbage + mid_size );
+ out_size = LZ_decompress_read( decoder, out_buffer, buffer_size );
+ if( out_size < 0 )
+ {
+ if( leading_garbage &&
+ ( LZ_decompress_errno( decoder ) == LZ_header_error ||
+ LZ_decompress_errno( decoder ) == LZ_data_error ) )
+ {
+ LZ_decompress_sync_to_member( decoder ); /* skip leading garbage */
+ const unsigned long long garbage_end =
+ LZ_decompress_total_in_size( decoder );
+ if( garbage_end - garbage_begin != (unsigned)leading_garbage )
+ {
+ fprintf( stderr, "lzcheck: member: LZ_decompress_sync_to_member error:\n"
+ " garbage_begin = %llu garbage_end = %llu "
+ "difference = %llu expected = %d\n", garbage_begin,
+ garbage_end, garbage_end - garbage_begin, leading_garbage );
+ retval = 3; break;
+ }
+ out_size = LZ_decompress_read( decoder, out_buffer, buffer_size );
+ }
+ if( out_size < 0 )
+ {
+ fprintf( stderr, "lzcheck: member: LZ_decompress_read error: %s\n",
+ LZ_strerror( LZ_decompress_errno( decoder ) ) );
+ retval = 3; break;
+ }
+ }
+
+ if( out_size != in_size || memcmp( line_buf, out_buffer, out_size ) )
+ {
+ fprintf( stderr, "lzcheck: LZ_compress_restart_member error: "
+ "in_size = %d, out_size = %d\n", in_size, out_size );
+ show_line( line_buf, in_size );
+ show_line( out_buffer, out_size );
+ retval = 1;
+ }
+ }
+
+ xclose_decoder( decoder, retval == 0 );
+ xclose_encoder( encoder, retval == 0 );
+ return retval;
+ }
+
+
+int main( const int argc, const char * const argv[] )
+ {
+ int retval = 0, i;
+ int open_failures = 0;
+ const char opt = ( argc > 2 &&
+ ( strcmp( argv[1], "-m" ) == 0 || strcmp( argv[1], "-s" ) == 0 ) ) ?
+ argv[1][1] : 0;
+ const int first = opt ? 2 : 1;
+ const bool verbose = ( opt != 0 || argc > first + 1 );
+
+ if( argc < 2 )
+ {
+ fputs( "Usage: lzcheck [-m|-s] filename.txt...\n", stderr );
+ return 1;
+ }
+
+ for( i = first; i < argc && retval == 0; ++i )
+ {
+ struct stat st;
+ if( stat( argv[i], &st ) != 0 || !S_ISREG( st.st_mode ) ) continue;
+ FILE * file = fopen( argv[i], "rb" );
+ if( !file )
+ {
+ fprintf( stderr, "lzcheck: %s: Can't open file for reading.\n", argv[i] );
+ ++open_failures; continue;
+ }
+ if( verbose ) fprintf( stderr, " Testing file '%s'\n", argv[i] );
+
+ /* 65535,16 chooses fast encoder */
+ if( opt != 'm' ) retval = check_sync_flush( file, 65535 );
+ if( retval == 0 && opt != 'm' )
+ { next_line( file, 0 ); retval = check_sync_flush( file, 1 << 20 ); }
+ if( retval == 0 && opt != 's' )
+ { next_line( file, 0 ); retval = check_members( file, 65535 ); }
+ if( retval == 0 && opt != 's' )
+ { next_line( file, 0 ); retval = check_members( file, 1 << 20 ); }
+ fclose( file );
+ }
+ if( open_failures > 0 && verbose )
+ fprintf( stderr, "lzcheck: warning: %d %s failed to open.\n",
+ open_failures, ( open_failures == 1 ) ? "file" : "files" );
+ if( retval == 0 && open_failures ) retval = 1;
+ return retval;
+ }
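Note on lzcheck.c: next_line hands out the input in chunks that end at a newline or at the end of the data read so far, using the scan `for( r = l + 1; r < read_size && in_buffer[r-1] != '\n'; ++r );`. The sketch below isolates that scan on a plain buffer, without the static state or the file refill. The names chunk_size and count_chunks are hypothetical helpers introduced for this illustration only.

```c
#include <stdint.h>

/* Size of the chunk starting at buf[start]: up to and including the first
   '\n', or up to the end of the buffer if there is none.
   Requires start < size, so the result is always >= 1. */
static int chunk_size( const uint8_t * const buf, const int start,
                       const int size )
  {
  int r;
  for( r = start + 1; r < size && buf[r-1] != '\n'; ++r ) ;
  return r - start;
  }

/* Number of chunks needed to cover buf[0..size-1]. */
static int count_chunks( const uint8_t * const buf, const int size )
  {
  int start = 0, chunks = 0;
  while( start < size )
    { start += chunk_size( buf, start, size ); ++chunks; }
  return chunks;
  }
```

Starting the scan at start + 1 guarantees that each chunk holds at least one byte, so an empty line (a lone newline) is returned as a chunk of size 1 instead of stalling the loop.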
diff --git a/lzip.h b/lzip.h
new file mode 100644
index 0000000..e0bdae9
--- /dev/null
+++ b/lzip.h
@@ -0,0 +1,294 @@
+/* Lzlib - Compression library for the lzip format
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+#ifndef max
+ #define max(x,y) ((x) >= (y) ? (x) : (y))
+#endif
+#ifndef min
+ #define min(x,y) ((x) <= (y) ? (x) : (y))
+#endif
+
+typedef int State;
+
+enum { states = 12 };
+
+static inline bool St_is_char( const State st ) { return st < 7; }
+
+static inline State St_set_char( const State st )
+ {
+ static const State next[states] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 };
+ return next[st];
+ }
+static inline State St_set_char_rep( void ) { return 8; }
+static inline State St_set_match( const State st )
+ { return ( st < 7 ) ? 7 : 10; }
+static inline State St_set_rep( const State st )
+ { return ( st < 7 ) ? 8 : 11; }
+static inline State St_set_short_rep( const State st )
+ { return ( st < 7 ) ? 9 : 11; }
+
+
+enum {
+ min_dictionary_bits = 12,
+ min_dictionary_size = 1 << min_dictionary_bits, /* >= modeled_distances */
+ max_dictionary_bits = 29,
+ max_dictionary_size = 1 << max_dictionary_bits,
+ literal_context_bits = 3,
+ literal_pos_state_bits = 0, /* not used */
+ pos_state_bits = 2,
+ pos_states = 1 << pos_state_bits,
+ pos_state_mask = pos_states - 1,
+
+ len_states = 4,
+ dis_slot_bits = 6,
+ start_dis_model = 4,
+ end_dis_model = 14,
+ modeled_distances = 1 << (end_dis_model / 2), /* 128 */
+ dis_align_bits = 4,
+ dis_align_size = 1 << dis_align_bits,
+
+ len_low_bits = 3,
+ len_mid_bits = 3,
+ len_high_bits = 8,
+ len_low_symbols = 1 << len_low_bits,
+ len_mid_symbols = 1 << len_mid_bits,
+ len_high_symbols = 1 << len_high_bits,
+ max_len_symbols = len_low_symbols + len_mid_symbols + len_high_symbols,
+
+ min_match_len = 2, /* must be 2 */
+ max_match_len = min_match_len + max_len_symbols - 1, /* 273 */
+ min_match_len_limit = 5 };
+
+static inline int get_len_state( const int len )
+ { return min( len - min_match_len, len_states - 1 ); }
+
+static inline int get_lit_state( const uint8_t prev_byte )
+ { return prev_byte >> ( 8 - literal_context_bits ); }
+
+
+enum { bit_model_move_bits = 5,
+ bit_model_total_bits = 11,
+ bit_model_total = 1 << bit_model_total_bits };
+
+typedef int Bit_model;
+
+static inline void Bm_init( Bit_model * const probability )
+ { *probability = bit_model_total / 2; }
+
+static inline void Bm_array_init( Bit_model bm[], const int size )
+ { int i; for( i = 0; i < size; ++i ) Bm_init( &bm[i] ); }
+
+struct Len_model
+ {
+ Bit_model choice1;
+ Bit_model choice2;
+ Bit_model bm_low[pos_states][len_low_symbols];
+ Bit_model bm_mid[pos_states][len_mid_symbols];
+ Bit_model bm_high[len_high_symbols];
+ };
+
+static inline void Lm_init( struct Len_model * const lm )
+ {
+ Bm_init( &lm->choice1 );
+ Bm_init( &lm->choice2 );
+ Bm_array_init( lm->bm_low[0], pos_states * len_low_symbols );
+ Bm_array_init( lm->bm_mid[0], pos_states * len_mid_symbols );
+ Bm_array_init( lm->bm_high, len_high_symbols );
+ }
+
+
+/* Table of CRCs of all 8-bit messages. */
+static const uint32_t crc32[256] =
+ {
+ 0x00000000, 0x77073096, 0xEE0E612C, 0x990951BA, 0x076DC419, 0x706AF48F,
+ 0xE963A535, 0x9E6495A3, 0x0EDB8832, 0x79DCB8A4, 0xE0D5E91E, 0x97D2D988,
+ 0x09B64C2B, 0x7EB17CBD, 0xE7B82D07, 0x90BF1D91, 0x1DB71064, 0x6AB020F2,
+ 0xF3B97148, 0x84BE41DE, 0x1ADAD47D, 0x6DDDE4EB, 0xF4D4B551, 0x83D385C7,
+ 0x136C9856, 0x646BA8C0, 0xFD62F97A, 0x8A65C9EC, 0x14015C4F, 0x63066CD9,
+ 0xFA0F3D63, 0x8D080DF5, 0x3B6E20C8, 0x4C69105E, 0xD56041E4, 0xA2677172,
+ 0x3C03E4D1, 0x4B04D447, 0xD20D85FD, 0xA50AB56B, 0x35B5A8FA, 0x42B2986C,
+ 0xDBBBC9D6, 0xACBCF940, 0x32D86CE3, 0x45DF5C75, 0xDCD60DCF, 0xABD13D59,
+ 0x26D930AC, 0x51DE003A, 0xC8D75180, 0xBFD06116, 0x21B4F4B5, 0x56B3C423,
+ 0xCFBA9599, 0xB8BDA50F, 0x2802B89E, 0x5F058808, 0xC60CD9B2, 0xB10BE924,
+ 0x2F6F7C87, 0x58684C11, 0xC1611DAB, 0xB6662D3D, 0x76DC4190, 0x01DB7106,
+ 0x98D220BC, 0xEFD5102A, 0x71B18589, 0x06B6B51F, 0x9FBFE4A5, 0xE8B8D433,
+ 0x7807C9A2, 0x0F00F934, 0x9609A88E, 0xE10E9818, 0x7F6A0DBB, 0x086D3D2D,
+ 0x91646C97, 0xE6635C01, 0x6B6B51F4, 0x1C6C6162, 0x856530D8, 0xF262004E,
+ 0x6C0695ED, 0x1B01A57B, 0x8208F4C1, 0xF50FC457, 0x65B0D9C6, 0x12B7E950,
+ 0x8BBEB8EA, 0xFCB9887C, 0x62DD1DDF, 0x15DA2D49, 0x8CD37CF3, 0xFBD44C65,
+ 0x4DB26158, 0x3AB551CE, 0xA3BC0074, 0xD4BB30E2, 0x4ADFA541, 0x3DD895D7,
+ 0xA4D1C46D, 0xD3D6F4FB, 0x4369E96A, 0x346ED9FC, 0xAD678846, 0xDA60B8D0,
+ 0x44042D73, 0x33031DE5, 0xAA0A4C5F, 0xDD0D7CC9, 0x5005713C, 0x270241AA,
+ 0xBE0B1010, 0xC90C2086, 0x5768B525, 0x206F85B3, 0xB966D409, 0xCE61E49F,
+ 0x5EDEF90E, 0x29D9C998, 0xB0D09822, 0xC7D7A8B4, 0x59B33D17, 0x2EB40D81,
+ 0xB7BD5C3B, 0xC0BA6CAD, 0xEDB88320, 0x9ABFB3B6, 0x03B6E20C, 0x74B1D29A,
+ 0xEAD54739, 0x9DD277AF, 0x04DB2615, 0x73DC1683, 0xE3630B12, 0x94643B84,
+ 0x0D6D6A3E, 0x7A6A5AA8, 0xE40ECF0B, 0x9309FF9D, 0x0A00AE27, 0x7D079EB1,
+ 0xF00F9344, 0x8708A3D2, 0x1E01F268, 0x6906C2FE, 0xF762575D, 0x806567CB,
+ 0x196C3671, 0x6E6B06E7, 0xFED41B76, 0x89D32BE0, 0x10DA7A5A, 0x67DD4ACC,
+ 0xF9B9DF6F, 0x8EBEEFF9, 0x17B7BE43, 0x60B08ED5, 0xD6D6A3E8, 0xA1D1937E,
+ 0x38D8C2C4, 0x4FDFF252, 0xD1BB67F1, 0xA6BC5767, 0x3FB506DD, 0x48B2364B,
+ 0xD80D2BDA, 0xAF0A1B4C, 0x36034AF6, 0x41047A60, 0xDF60EFC3, 0xA867DF55,
+ 0x316E8EEF, 0x4669BE79, 0xCB61B38C, 0xBC66831A, 0x256FD2A0, 0x5268E236,
+ 0xCC0C7795, 0xBB0B4703, 0x220216B9, 0x5505262F, 0xC5BA3BBE, 0xB2BD0B28,
+ 0x2BB45A92, 0x5CB36A04, 0xC2D7FFA7, 0xB5D0CF31, 0x2CD99E8B, 0x5BDEAE1D,
+ 0x9B64C2B0, 0xEC63F226, 0x756AA39C, 0x026D930A, 0x9C0906A9, 0xEB0E363F,
+ 0x72076785, 0x05005713, 0x95BF4A82, 0xE2B87A14, 0x7BB12BAE, 0x0CB61B38,
+ 0x92D28E9B, 0xE5D5BE0D, 0x7CDCEFB7, 0x0BDBDF21, 0x86D3D2D4, 0xF1D4E242,
+ 0x68DDB3F8, 0x1FDA836E, 0x81BE16CD, 0xF6B9265B, 0x6FB077E1, 0x18B74777,
+ 0x88085AE6, 0xFF0F6A70, 0x66063BCA, 0x11010B5C, 0x8F659EFF, 0xF862AE69,
+ 0x616BFFD3, 0x166CCF45, 0xA00AE278, 0xD70DD2EE, 0x4E048354, 0x3903B3C2,
+ 0xA7672661, 0xD06016F7, 0x4969474D, 0x3E6E77DB, 0xAED16A4A, 0xD9D65ADC,
+ 0x40DF0B66, 0x37D83BF0, 0xA9BCAE53, 0xDEBB9EC5, 0x47B2CF7F, 0x30B5FFE9,
+ 0xBDBDF21C, 0xCABAC28A, 0x53B39330, 0x24B4A3A6, 0xBAD03605, 0xCDD70693,
+ 0x54DE5729, 0x23D967BF, 0xB3667A2E, 0xC4614AB8, 0x5D681B02, 0x2A6F2B94,
+ 0xB40BBE37, 0xC30C8EA1, 0x5A05DF1B, 0x2D02EF8D };
+
+
+static inline void CRC32_update_byte( uint32_t * const crc, const uint8_t byte )
+ { *crc = crc32[(*crc^byte)&0xFF] ^ ( *crc >> 8 ); }
+
+/* about as fast as possible without messing with endianness */
+static inline void CRC32_update_buf( uint32_t * const crc,
+ const uint8_t * const buffer,
+ const int size )
+ {
+ int i;
+ uint32_t c = *crc;
+ for( i = 0; i < size; ++i )
+ c = crc32[(c^buffer[i])&0xFF] ^ ( c >> 8 );
+ *crc = c;
+ }
+
+
+static inline bool isvalid_ds( const unsigned dictionary_size )
+ { return dictionary_size >= min_dictionary_size &&
+ dictionary_size <= max_dictionary_size; }
+
+
+static inline int real_bits( unsigned value )
+ {
+ int bits = 0;
+ while( value > 0 ) { value >>= 1; ++bits; }
+ return bits;
+ }
+
+
+static const uint8_t lzip_magic[4] = { 0x4C, 0x5A, 0x49, 0x50 }; /* "LZIP" */
+
+enum { Lh_size = 6 };
+typedef uint8_t Lzip_header[Lh_size]; /* 0-3 magic bytes */
+ /* 4 version */
+ /* 5 coded dictionary size */
+
+static inline void Lh_set_magic( Lzip_header data )
+ { memcpy( data, lzip_magic, 4 ); data[4] = 1; }
+
+static inline bool Lh_check_magic( const Lzip_header data )
+ { return memcmp( data, lzip_magic, 4 ) == 0; }
+
+/* detect (truncated) header */
+static inline bool Lh_check_prefix( const Lzip_header data, const int sz )
+ {
+ int i; for( i = 0; i < sz && i < 4; ++i )
+ if( data[i] != lzip_magic[i] ) return false;
+ return sz > 0;
+ }
+
+/* detect corrupt header */
+static inline bool Lh_check_corrupt( const Lzip_header data )
+ {
+ int matches = 0;
+ int i; for( i = 0; i < 4; ++i )
+ if( data[i] == lzip_magic[i] ) ++matches;
+ return matches > 1 && matches < 4;
+ }
+
+static inline uint8_t Lh_version( const Lzip_header data )
+ { return data[4]; }
+
+static inline bool Lh_check_version( const Lzip_header data )
+ { return data[4] == 1; }
+
+static inline unsigned Lh_get_dictionary_size( const Lzip_header data )
+ {
+ unsigned sz = 1U << ( data[5] & 0x1F ); /* unsigned: a shift by 31 must not overflow int */
+ if( sz > min_dictionary_size )
+ sz -= ( sz / 16 ) * ( ( data[5] >> 5 ) & 7 );
+ return sz;
+ }
+
+static inline bool Lh_set_dictionary_size( Lzip_header data, const unsigned sz )
+ {
+ if( !isvalid_ds( sz ) ) return false;
+ data[5] = real_bits( sz - 1 );
+ if( sz > min_dictionary_size )
+ {
+ const unsigned base_size = 1 << data[5];
+ const unsigned fraction = base_size / 16;
+ unsigned i;
+ for( i = 7; i >= 1; --i )
+ if( base_size - ( i * fraction ) >= sz )
+ { data[5] |= i << 5; break; }
+ }
+ return true;
+ }
+
+static inline bool Lh_check( const Lzip_header data )
+ {
+ return Lh_check_magic( data ) && Lh_check_version( data ) &&
+ isvalid_ds( Lh_get_dictionary_size( data ) );
+ }
+
+
+enum { Lt_size = 20 };
+typedef uint8_t Lzip_trailer[Lt_size];
+ /* 0-3 CRC32 of the uncompressed data */
+ /* 4-11 size of the uncompressed data */
+ /* 12-19 member size including header and trailer */
+
+static inline unsigned Lt_get_data_crc( const Lzip_trailer data )
+ {
+ unsigned tmp = 0;
+ int i; for( i = 3; i >= 0; --i ) { tmp <<= 8; tmp += data[i]; }
+ return tmp;
+ }
+
+static inline void Lt_set_data_crc( Lzip_trailer data, unsigned crc )
+ { int i; for( i = 0; i <= 3; ++i ) { data[i] = (uint8_t)crc; crc >>= 8; } }
+
+static inline unsigned long long Lt_get_data_size( const Lzip_trailer data )
+ {
+ unsigned long long tmp = 0;
+ int i; for( i = 11; i >= 4; --i ) { tmp <<= 8; tmp += data[i]; }
+ return tmp;
+ }
+
+static inline void Lt_set_data_size( Lzip_trailer data, unsigned long long sz )
+ { int i; for( i = 4; i <= 11; ++i ) { data[i] = (uint8_t)sz; sz >>= 8; } }
+
+static inline unsigned long long Lt_get_member_size( const Lzip_trailer data )
+ {
+ unsigned long long tmp = 0;
+ int i; for( i = 19; i >= 12; --i ) { tmp <<= 8; tmp += data[i]; }
+ return tmp;
+ }
+
+static inline void Lt_set_member_size( Lzip_trailer data, unsigned long long sz )
+ { int i; for( i = 12; i <= 19; ++i ) { data[i] = (uint8_t)sz; sz >>= 8; } }
diff --git a/lzlib.c b/lzlib.c
new file mode 100644
index 0000000..4105205
--- /dev/null
+++ b/lzlib.c
@@ -0,0 +1,601 @@
+/* Lzlib - Compression library for the lzip format
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "lzlib.h"
+#include "lzip.h"
+#include "cbuffer.c"
+#include "decoder.h"
+#include "decoder.c"
+#include "encoder_base.h"
+#include "encoder_base.c"
+#include "encoder.h"
+#include "encoder.c"
+#include "fast_encoder.h"
+#include "fast_encoder.c"
+
+
+struct LZ_Encoder
+ {
+ unsigned long long partial_in_size;
+ unsigned long long partial_out_size;
+ struct LZ_encoder_base * lz_encoder_base; /* these 3 pointers make a */
+ struct LZ_encoder * lz_encoder; /* polymorphic encoder */
+ struct FLZ_encoder * flz_encoder;
+ enum LZ_Errno lz_errno;
+ bool fatal;
+ };
+
+static void LZ_Encoder_init( struct LZ_Encoder * const e )
+ {
+ e->partial_in_size = 0;
+ e->partial_out_size = 0;
+ e->lz_encoder_base = 0;
+ e->lz_encoder = 0;
+ e->flz_encoder = 0;
+ e->lz_errno = LZ_ok;
+ e->fatal = false;
+ }
+
+
+struct LZ_Decoder
+ {
+ unsigned long long partial_in_size;
+ unsigned long long partial_out_size;
+ struct Range_decoder * rdec;
+ struct LZ_decoder * lz_decoder;
+ enum LZ_Errno lz_errno;
+ Lzip_header member_header; /* header of current member */
+ bool fatal;
+ bool first_header; /* true until first header is read */
+ bool seeking;
+ };
+
+static void LZ_Decoder_init( struct LZ_Decoder * const d )
+ {
+ int i;
+ d->partial_in_size = 0;
+ d->partial_out_size = 0;
+ d->rdec = 0;
+ d->lz_decoder = 0;
+ d->lz_errno = LZ_ok;
+ for( i = 0; i < Lh_size; ++i ) d->member_header[i] = 0;
+ d->fatal = false;
+ d->first_header = true;
+ d->seeking = false;
+ }
+
+
+static bool check_encoder( struct LZ_Encoder * const e )
+ {
+ if( !e ) return false;
+ if( !e->lz_encoder_base || ( !e->lz_encoder && !e->flz_encoder ) ||
+ ( e->lz_encoder && e->flz_encoder ) )
+ { e->lz_errno = LZ_bad_argument; return false; }
+ return true;
+ }
+
+
+static bool check_decoder( struct LZ_Decoder * const d )
+ {
+ if( !d ) return false;
+ if( !d->rdec )
+ { d->lz_errno = LZ_bad_argument; return false; }
+ return true;
+ }
+
+
+/* ------------------------- Misc Functions ------------------------- */
+
+int LZ_api_version( void ) { return LZ_API_VERSION; }
+
+const char * LZ_version( void ) { return LZ_version_string; }
+
+const char * LZ_strerror( const enum LZ_Errno lz_errno )
+ {
+ switch( lz_errno )
+ {
+ case LZ_ok : return "ok";
+ case LZ_bad_argument : return "Bad argument";
+ case LZ_mem_error : return "Not enough memory";
+ case LZ_sequence_error: return "Sequence error";
+ case LZ_header_error : return "Header error";
+ case LZ_unexpected_eof: return "Unexpected EOF";
+ case LZ_data_error : return "Data error";
+ case LZ_library_error : return "Library error";
+ }
+ return "Invalid error code";
+ }
+
+
+int LZ_min_dictionary_bits( void ) { return min_dictionary_bits; }
+int LZ_min_dictionary_size( void ) { return min_dictionary_size; }
+int LZ_max_dictionary_bits( void ) { return max_dictionary_bits; }
+int LZ_max_dictionary_size( void ) { return max_dictionary_size; }
+int LZ_min_match_len_limit( void ) { return min_match_len_limit; }
+int LZ_max_match_len_limit( void ) { return max_match_len; }
+
+
+/* --------------------- Compression Functions --------------------- */
+
+struct LZ_Encoder * LZ_compress_open( const int dictionary_size,
+ const int match_len_limit,
+ const unsigned long long member_size )
+ {
+ Lzip_header header;
+ struct LZ_Encoder * const e =
+ (struct LZ_Encoder *)malloc( sizeof (struct LZ_Encoder) );
+ if( !e ) return 0;
+ LZ_Encoder_init( e );
+ if( !Lh_set_dictionary_size( header, dictionary_size ) ||
+ match_len_limit < min_match_len_limit ||
+ match_len_limit > max_match_len ||
+ member_size < min_dictionary_size )
+ e->lz_errno = LZ_bad_argument;
+ else
+ {
+ if( dictionary_size == 65535 && match_len_limit == 16 )
+ {
+ e->flz_encoder = (struct FLZ_encoder *)malloc( sizeof (struct FLZ_encoder) );
+ if( e->flz_encoder && FLZe_init( e->flz_encoder, member_size ) )
+ { e->lz_encoder_base = &e->flz_encoder->eb; return e; }
+ free( e->flz_encoder ); e->flz_encoder = 0;
+ }
+ else
+ {
+ e->lz_encoder = (struct LZ_encoder *)malloc( sizeof (struct LZ_encoder) );
+ if( e->lz_encoder && LZe_init( e->lz_encoder, Lh_get_dictionary_size( header ),
+ match_len_limit, member_size ) )
+ { e->lz_encoder_base = &e->lz_encoder->eb; return e; }
+ free( e->lz_encoder ); e->lz_encoder = 0;
+ }
+ e->lz_errno = LZ_mem_error;
+ }
+ e->fatal = true;
+ return e;
+ }
+
+
+int LZ_compress_close( struct LZ_Encoder * const e )
+ {
+ if( !e ) return -1;
+ if( e->lz_encoder_base )
+ { LZeb_free( e->lz_encoder_base );
+ free( e->lz_encoder ); free( e->flz_encoder ); }
+ free( e );
+ return 0;
+ }
+
+
+int LZ_compress_finish( struct LZ_Encoder * const e )
+ {
+ if( !check_encoder( e ) || e->fatal ) return -1;
+ Mb_finish( &e->lz_encoder_base->mb );
+ /* if (open --> write --> finish) use same dictionary size as lzip. */
+ /* this does not save any memory. */
+ if( Mb_data_position( &e->lz_encoder_base->mb ) == 0 &&
+ Re_member_position( &e->lz_encoder_base->renc ) == Lh_size )
+ {
+ Mb_adjust_dictionary_size( &e->lz_encoder_base->mb );
+ Lh_set_dictionary_size( e->lz_encoder_base->renc.header,
+ e->lz_encoder_base->mb.dictionary_size );
+ e->lz_encoder_base->renc.cb.buffer[5] = e->lz_encoder_base->renc.header[5];
+ }
+ return 0;
+ }
+
+
+int LZ_compress_restart_member( struct LZ_Encoder * const e,
+ const unsigned long long member_size )
+ {
+ if( !check_encoder( e ) || e->fatal ) return -1;
+ if( !LZeb_member_finished( e->lz_encoder_base ) )
+ { e->lz_errno = LZ_sequence_error; return -1; }
+ if( member_size < min_dictionary_size )
+ { e->lz_errno = LZ_bad_argument; return -1; }
+
+ e->partial_in_size += Mb_data_position( &e->lz_encoder_base->mb );
+ e->partial_out_size += Re_member_position( &e->lz_encoder_base->renc );
+
+ if( e->lz_encoder ) LZe_reset( e->lz_encoder, member_size );
+ else FLZe_reset( e->flz_encoder, member_size );
+ e->lz_errno = LZ_ok;
+ return 0;
+ }
+
+
+int LZ_compress_sync_flush( struct LZ_Encoder * const e )
+ {
+ if( !check_encoder( e ) || e->fatal ) return -1;
+ if( !e->lz_encoder_base->mb.at_stream_end )
+ e->lz_encoder_base->mb.sync_flush_pending = true;
+ return 0;
+ }
+
+
+int LZ_compress_read( struct LZ_Encoder * const e,
+ uint8_t * const buffer, const int size )
+ {
+ if( !check_encoder( e ) || e->fatal ) return -1;
+ if( size < 0 ) return 0;
+
+ { struct LZ_encoder_base * const eb = e->lz_encoder_base;
+ int out_size = Re_read_data( &eb->renc, buffer, size );
+ /* minimize number of calls to encode_member */
+ if( out_size < size || size == 0 )
+ {
+ if( ( e->flz_encoder && !FLZe_encode_member( e->flz_encoder ) ) ||
+ ( e->lz_encoder && !LZe_encode_member( e->lz_encoder ) ) )
+ { e->lz_errno = LZ_library_error; e->fatal = true; return -1; }
+ if( eb->mb.sync_flush_pending && Mb_available_bytes( &eb->mb ) <= 0 )
+ LZeb_try_sync_flush( eb );
+ out_size += Re_read_data( &eb->renc, buffer + out_size, size - out_size );
+ }
+ return out_size; }
+ }
+
+
+int LZ_compress_write( struct LZ_Encoder * const e,
+ const uint8_t * const buffer, const int size )
+ {
+ if( !check_encoder( e ) || e->fatal ) return -1;
+ return Mb_write_data( &e->lz_encoder_base->mb, buffer, size );
+ }
+
+
+int LZ_compress_write_size( struct LZ_Encoder * const e )
+ {
+ if( !check_encoder( e ) || e->fatal ) return -1;
+ return Mb_free_bytes( &e->lz_encoder_base->mb );
+ }
+
+
+enum LZ_Errno LZ_compress_errno( struct LZ_Encoder * const e )
+ {
+ if( !e ) return LZ_bad_argument;
+ return e->lz_errno;
+ }
+
+
+int LZ_compress_finished( struct LZ_Encoder * const e )
+ {
+ if( !check_encoder( e ) ) return -1;
+ return Mb_data_finished( &e->lz_encoder_base->mb ) &&
+ LZeb_member_finished( e->lz_encoder_base );
+ }
+
+
+int LZ_compress_member_finished( struct LZ_Encoder * const e )
+ {
+ if( !check_encoder( e ) ) return -1;
+ return LZeb_member_finished( e->lz_encoder_base );
+ }
+
+
+unsigned long long LZ_compress_data_position( struct LZ_Encoder * const e )
+ {
+ if( !check_encoder( e ) ) return 0;
+ return Mb_data_position( &e->lz_encoder_base->mb );
+ }
+
+
+unsigned long long LZ_compress_member_position( struct LZ_Encoder * const e )
+ {
+ if( !check_encoder( e ) ) return 0;
+ return Re_member_position( &e->lz_encoder_base->renc );
+ }
+
+
+unsigned long long LZ_compress_total_in_size( struct LZ_Encoder * const e )
+ {
+ if( !check_encoder( e ) ) return 0;
+ return e->partial_in_size + Mb_data_position( &e->lz_encoder_base->mb );
+ }
+
+
+unsigned long long LZ_compress_total_out_size( struct LZ_Encoder * const e )
+ {
+ if( !check_encoder( e ) ) return 0;
+ return e->partial_out_size + Re_member_position( &e->lz_encoder_base->renc );
+ }
+
+
+/* -------------------- Decompression Functions -------------------- */
+
+struct LZ_Decoder * LZ_decompress_open( void )
+ {
+ struct LZ_Decoder * const d =
+ (struct LZ_Decoder *)malloc( sizeof (struct LZ_Decoder) );
+ if( !d ) return 0;
+ LZ_Decoder_init( d );
+
+ d->rdec = (struct Range_decoder *)malloc( sizeof (struct Range_decoder) );
+ if( !d->rdec || !Rd_init( d->rdec ) )
+ {
+ if( d->rdec ) { Rd_free( d->rdec ); free( d->rdec ); d->rdec = 0; }
+ d->lz_errno = LZ_mem_error; d->fatal = true;
+ }
+ return d;
+ }
+
+
+int LZ_decompress_close( struct LZ_Decoder * const d )
+ {
+ if( !d ) return -1;
+ if( d->lz_decoder )
+ { LZd_free( d->lz_decoder ); free( d->lz_decoder ); }
+ if( d->rdec ) { Rd_free( d->rdec ); free( d->rdec ); }
+ free( d );
+ return 0;
+ }
+
+
+int LZ_decompress_finish( struct LZ_Decoder * const d )
+ {
+ if( !check_decoder( d ) || d->fatal ) return -1;
+ if( d->seeking )
+ { d->seeking = false; d->partial_in_size += Rd_purge( d->rdec ); }
+ else Rd_finish( d->rdec );
+ return 0;
+ }
+
+
+int LZ_decompress_reset( struct LZ_Decoder * const d )
+ {
+ if( !check_decoder( d ) ) return -1;
+ if( d->lz_decoder )
+ { LZd_free( d->lz_decoder ); free( d->lz_decoder ); d->lz_decoder = 0; }
+ d->partial_in_size = 0;
+ d->partial_out_size = 0;
+ Rd_reset( d->rdec );
+ d->lz_errno = LZ_ok;
+ d->fatal = false;
+ d->first_header = true;
+ d->seeking = false;
+ return 0;
+ }
+
+
+int LZ_decompress_sync_to_member( struct LZ_Decoder * const d )
+ {
+ unsigned skipped = 0;
+ if( !check_decoder( d ) ) return -1;
+ if( d->lz_decoder )
+ { LZd_free( d->lz_decoder ); free( d->lz_decoder ); d->lz_decoder = 0; }
+ if( Rd_find_header( d->rdec, &skipped ) ) d->seeking = false;
+ else
+ {
+ if( !d->rdec->at_stream_end ) d->seeking = true;
+ else { d->seeking = false; d->partial_in_size += Rd_purge( d->rdec ); }
+ }
+ d->partial_in_size += skipped;
+ d->lz_errno = LZ_ok;
+ d->fatal = false;
+ return 0;
+ }
+
+
+int LZ_decompress_read( struct LZ_Decoder * const d,
+ uint8_t * const buffer, const int size )
+ {
+ int result;
+ if( !check_decoder( d ) ) return -1;
+ if( size < 0 ) return 0;
+ if( d->fatal ) /* don't return error until pending bytes are read */
+ { if( d->lz_decoder && !Cb_empty( &d->lz_decoder->cb ) ) goto get_data;
+ return -1; }
+ if( d->seeking ) return 0;
+
+ if( d->lz_decoder && LZd_member_finished( d->lz_decoder ) )
+ {
+ d->partial_out_size += LZd_data_position( d->lz_decoder );
+ LZd_free( d->lz_decoder ); free( d->lz_decoder ); d->lz_decoder = 0;
+ }
+ if( !d->lz_decoder )
+ {
+ int rd;
+ d->partial_in_size += d->rdec->member_position;
+ d->rdec->member_position = 0;
+ if( Rd_available_bytes( d->rdec ) < Lh_size + 5 &&
+ !d->rdec->at_stream_end ) return 0;
+ if( Rd_finished( d->rdec ) && !d->first_header ) return 0;
+ rd = Rd_read_data( d->rdec, d->member_header, Lh_size );
+ if( rd < Lh_size || Rd_finished( d->rdec ) ) /* End Of File */
+ {
+ if( rd <= 0 || Lh_check_prefix( d->member_header, rd ) )
+ d->lz_errno = LZ_unexpected_eof;
+ else
+ d->lz_errno = LZ_header_error;
+ d->fatal = true;
+ return -1;
+ }
+ if( !Lh_check_magic( d->member_header ) )
+ {
+ /* unreading the header prevents sync_to_member from skipping a member
+ if leading garbage is shorter than a full header; "lgLZIP\x01\x0C" */
+ if( Rd_unread_data( d->rdec, rd ) )
+ {
+ if( d->first_header || !Lh_check_corrupt( d->member_header ) )
+ d->lz_errno = LZ_header_error;
+ else
+ d->lz_errno = LZ_data_error; /* corrupt header */
+ }
+ else
+ d->lz_errno = LZ_library_error;
+ d->fatal = true;
+ return -1;
+ }
+ if( !Lh_check_version( d->member_header ) ||
+ !isvalid_ds( Lh_get_dictionary_size( d->member_header ) ) )
+ {
+ /* Skip a possible "LZIP" leading garbage; "LZIPLZIP\x01\x0C".
+ Leave member_pos pointing to the first error. */
+ if( Rd_unread_data( d->rdec, 1 + !Lh_check_version( d->member_header ) ) )
+ d->lz_errno = LZ_data_error; /* bad version or bad dict size */
+ else
+ d->lz_errno = LZ_library_error;
+ d->fatal = true;
+ return -1;
+ }
+ d->first_header = false;
+ if( Rd_available_bytes( d->rdec ) < 5 )
+ {
+ /* set position at EOF */
+ d->rdec->member_position += Cb_used_bytes( &d->rdec->cb );
+ Cb_reset( &d->rdec->cb );
+ d->lz_errno = LZ_unexpected_eof;
+ d->fatal = true;
+ return -1;
+ }
+ d->lz_decoder = (struct LZ_decoder *)malloc( sizeof (struct LZ_decoder) );
+ if( !d->lz_decoder || !LZd_init( d->lz_decoder, d->rdec,
+ Lh_get_dictionary_size( d->member_header ) ) )
+ { /* not enough free memory */
+ if( d->lz_decoder )
+ { LZd_free( d->lz_decoder ); free( d->lz_decoder ); d->lz_decoder = 0; }
+ d->lz_errno = LZ_mem_error;
+ d->fatal = true;
+ return -1;
+ }
+ d->rdec->reload_pending = true;
+ }
+ result = LZd_decode_member( d->lz_decoder );
+ if( result != 0 )
+ {
+ if( result == 2 ) /* set input position at EOF */
+ { d->rdec->member_position += Cb_used_bytes( &d->rdec->cb );
+ Cb_reset( &d->rdec->cb );
+ d->lz_errno = LZ_unexpected_eof; }
+ else if( result == 5 ) d->lz_errno = LZ_library_error;
+ else d->lz_errno = LZ_data_error;
+ d->fatal = true;
+ if( Cb_empty( &d->lz_decoder->cb ) ) return -1;
+ }
+get_data:
+ return Cb_read_data( &d->lz_decoder->cb, buffer, size );
+ }
+
+
+int LZ_decompress_write( struct LZ_Decoder * const d,
+ const uint8_t * const buffer, const int size )
+ {
+ int result;
+ if( !check_decoder( d ) || d->fatal ) return -1;
+ if( size < 0 ) return 0;
+
+ result = Rd_write_data( d->rdec, buffer, size );
+ while( d->seeking )
+ {
+ int size2;
+ unsigned skipped = 0;
+ if( Rd_find_header( d->rdec, &skipped ) ) d->seeking = false;
+ d->partial_in_size += skipped;
+ if( result >= size ) break;
+ size2 = Rd_write_data( d->rdec, buffer + result, size - result );
+ if( size2 > 0 ) result += size2;
+ else break;
+ }
+ return result;
+ }
+
+
+int LZ_decompress_write_size( struct LZ_Decoder * const d )
+ {
+ if( !check_decoder( d ) || d->fatal ) return -1;
+ return Rd_free_bytes( d->rdec );
+ }
+
+
+enum LZ_Errno LZ_decompress_errno( struct LZ_Decoder * const d )
+ {
+ if( !d ) return LZ_bad_argument;
+ return d->lz_errno;
+ }
+
+
+int LZ_decompress_finished( struct LZ_Decoder * const d )
+ {
+ if( !check_decoder( d ) || d->fatal ) return -1;
+ return Rd_finished( d->rdec ) &&
+ ( !d->lz_decoder || LZd_member_finished( d->lz_decoder ) );
+ }
+
+
+int LZ_decompress_member_finished( struct LZ_Decoder * const d )
+ {
+ if( !check_decoder( d ) || d->fatal ) return -1;
+ return d->lz_decoder && LZd_member_finished( d->lz_decoder );
+ }
+
+
+int LZ_decompress_member_version( struct LZ_Decoder * const d )
+ {
+ if( !check_decoder( d ) ) return -1;
+ return Lh_version( d->member_header );
+ }
+
+
+int LZ_decompress_dictionary_size( struct LZ_Decoder * const d )
+ {
+ if( !check_decoder( d ) ) return -1;
+ return Lh_get_dictionary_size( d->member_header );
+ }
+
+
+unsigned LZ_decompress_data_crc( struct LZ_Decoder * const d )
+ {
+ if( check_decoder( d ) && d->lz_decoder )
+ return LZd_crc( d->lz_decoder );
+ return 0;
+ }
+
+
+unsigned long long LZ_decompress_data_position( struct LZ_Decoder * const d )
+ {
+ if( check_decoder( d ) && d->lz_decoder )
+ return LZd_data_position( d->lz_decoder );
+ return 0;
+ }
+
+
+unsigned long long LZ_decompress_member_position( struct LZ_Decoder * const d )
+ {
+ if( !check_decoder( d ) ) return 0;
+ return d->rdec->member_position;
+ }
+
+
+unsigned long long LZ_decompress_total_in_size( struct LZ_Decoder * const d )
+ {
+ if( !check_decoder( d ) ) return 0;
+ return d->partial_in_size + d->rdec->member_position;
+ }
+
+
+unsigned long long LZ_decompress_total_out_size( struct LZ_Decoder * const d )
+ {
+ if( !check_decoder( d ) ) return 0;
+ if( d->lz_decoder )
+ return d->partial_out_size + LZd_data_position( d->lz_decoder );
+ return d->partial_out_size;
+ }
diff --git a/lzlib.h b/lzlib.h
new file mode 100644
index 0000000..b7357d2
--- /dev/null
+++ b/lzlib.h
@@ -0,0 +1,110 @@
+/* Lzlib - Compression library for the lzip format
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This library is free software. Redistribution and use in source and
+ binary forms, with or without modification, are permitted provided
+ that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions, and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* LZ_API_VERSION was first defined in lzlib 1.8, with the value 1.
+ Since lzlib 1.12, LZ_API_VERSION is defined as (major * 1000 + minor). */
+
+#define LZ_API_VERSION 1014
+
+static const char * const LZ_version_string = "1.14";
+
+enum LZ_Errno { LZ_ok = 0, LZ_bad_argument, LZ_mem_error,
+ LZ_sequence_error, LZ_header_error, LZ_unexpected_eof,
+ LZ_data_error, LZ_library_error };
+
+
+int LZ_api_version( void ); /* new in 1.12 */
+const char * LZ_version( void );
+const char * LZ_strerror( const enum LZ_Errno lz_errno );
+
+int LZ_min_dictionary_bits( void );
+int LZ_min_dictionary_size( void );
+int LZ_max_dictionary_bits( void );
+int LZ_max_dictionary_size( void );
+int LZ_min_match_len_limit( void );
+int LZ_max_match_len_limit( void );
+
+
+/* --------------------- Compression Functions --------------------- */
+
+struct LZ_Encoder;
+
+struct LZ_Encoder * LZ_compress_open( const int dictionary_size,
+ const int match_len_limit,
+ const unsigned long long member_size );
+int LZ_compress_close( struct LZ_Encoder * const encoder );
+
+int LZ_compress_finish( struct LZ_Encoder * const encoder );
+int LZ_compress_restart_member( struct LZ_Encoder * const encoder,
+ const unsigned long long member_size );
+int LZ_compress_sync_flush( struct LZ_Encoder * const encoder );
+
+int LZ_compress_read( struct LZ_Encoder * const encoder,
+ uint8_t * const buffer, const int size );
+int LZ_compress_write( struct LZ_Encoder * const encoder,
+ const uint8_t * const buffer, const int size );
+int LZ_compress_write_size( struct LZ_Encoder * const encoder );
+
+enum LZ_Errno LZ_compress_errno( struct LZ_Encoder * const encoder );
+int LZ_compress_finished( struct LZ_Encoder * const encoder );
+int LZ_compress_member_finished( struct LZ_Encoder * const encoder );
+
+unsigned long long LZ_compress_data_position( struct LZ_Encoder * const encoder );
+unsigned long long LZ_compress_member_position( struct LZ_Encoder * const encoder );
+unsigned long long LZ_compress_total_in_size( struct LZ_Encoder * const encoder );
+unsigned long long LZ_compress_total_out_size( struct LZ_Encoder * const encoder );
+
+
+/* -------------------- Decompression Functions -------------------- */
+
+struct LZ_Decoder;
+
+struct LZ_Decoder * LZ_decompress_open( void );
+int LZ_decompress_close( struct LZ_Decoder * const decoder );
+
+int LZ_decompress_finish( struct LZ_Decoder * const decoder );
+int LZ_decompress_reset( struct LZ_Decoder * const decoder );
+int LZ_decompress_sync_to_member( struct LZ_Decoder * const decoder );
+
+int LZ_decompress_read( struct LZ_Decoder * const decoder,
+ uint8_t * const buffer, const int size );
+int LZ_decompress_write( struct LZ_Decoder * const decoder,
+ const uint8_t * const buffer, const int size );
+int LZ_decompress_write_size( struct LZ_Decoder * const decoder );
+
+enum LZ_Errno LZ_decompress_errno( struct LZ_Decoder * const decoder );
+int LZ_decompress_finished( struct LZ_Decoder * const decoder );
+int LZ_decompress_member_finished( struct LZ_Decoder * const decoder );
+
+int LZ_decompress_member_version( struct LZ_Decoder * const decoder );
+int LZ_decompress_dictionary_size( struct LZ_Decoder * const decoder );
+unsigned LZ_decompress_data_crc( struct LZ_Decoder * const decoder );
+
+unsigned long long LZ_decompress_data_position( struct LZ_Decoder * const decoder );
+unsigned long long LZ_decompress_member_position( struct LZ_Decoder * const decoder );
+unsigned long long LZ_decompress_total_in_size( struct LZ_Decoder * const decoder );
+unsigned long long LZ_decompress_total_out_size( struct LZ_Decoder * const decoder );
+
+#ifdef __cplusplus
+}
+#endif
diff --git a/minilzip.c b/minilzip.c
new file mode 100644
index 0000000..a0a6721
--- /dev/null
+++ b/minilzip.c
@@ -0,0 +1,1292 @@
+/* Minilzip - Test program for the library lzlib
+ Copyright (C) 2009-2024 Antonio Diaz Diaz.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
+*/
+/*
+ Exit status: 0 for a normal exit, 1 for environmental problems
+ (file not found, invalid command-line options, I/O errors, etc), 2 to
+ indicate a corrupt or invalid input file, 3 for an internal consistency
+ error (e.g., bug) which caused minilzip to panic.
+*/
+
+#define _FILE_OFFSET_BITS 64
+
+#include <ctype.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <limits.h> /* SSIZE_MAX */
+#include <signal.h>
+#include <stdbool.h>
+#include <stdint.h> /* SIZE_MAX */
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <utime.h>
+#include <sys/stat.h>
+#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__
+#include <io.h>
+#if defined __MSVCRT__
+#define fchmod(x,y) 0
+#define fchown(x,y,z) 0
+#define strtoull strtoul
+#define SIGHUP SIGTERM
+#define S_ISSOCK(x) 0
+#ifndef S_IRGRP
+#define S_IRGRP 0
+#define S_IWGRP 0
+#define S_IROTH 0
+#define S_IWOTH 0
+#endif
+#endif
+#if defined __DJGPP__
+#define S_ISSOCK(x) 0
+#define S_ISVTX 0
+#endif
+#endif
+
+#include "carg_parser.h"
+#include "lzlib.h"
+
+#ifndef O_BINARY
+#define O_BINARY 0
+#endif
+
+#if CHAR_BIT != 8
+#error "Environments where CHAR_BIT != 8 are not supported."
+#endif
+
+#if ( defined SIZE_MAX && SIZE_MAX < UINT_MAX ) || \
+ ( defined SSIZE_MAX && SSIZE_MAX < INT_MAX )
+#error "Environments where 'size_t' is narrower than 'int' are not supported."
+#endif
+
+#ifndef max
+ #define max(x,y) ((x) >= (y) ? (x) : (y))
+#endif
+#ifndef min
+ #define min(x,y) ((x) <= (y) ? (x) : (y))
+#endif
+
+static void cleanup_and_fail( const int retval );
+static void show_error( const char * const msg, const int errcode,
+ const bool help );
+static void show_file_error( const char * const filename,
+ const char * const msg, const int errcode );
+static void internal_error( const char * const msg );
+static const char * const mem_msg = "Not enough memory.";
+
+int verbosity = 0;
+
+static const char * const program_name = "minilzip";
+static const char * const program_year = "2024";
+static const char * invocation_name = "minilzip"; /* default value */
+
+static const struct { const char * from; const char * to; } known_extensions[] = {
+ { ".lz", "" },
+ { ".tlz", ".tar" },
+ { 0, 0 } };
+
+struct Lzma_options
+ {
+ int dictionary_size; /* 4 KiB .. 512 MiB */
+ int match_len_limit; /* 5 .. 273 */
+ };
+
+enum Mode { m_compress, m_decompress, m_test };
+
+/* Variables used in signal handler context.
+ They are not declared volatile because the handler never returns. */
+static char * output_filename = 0;
+static int outfd = -1;
+static bool delete_output_on_interrupt = false;
+
+
+static void show_help( void )
+ {
+ printf( "Minilzip is a test program for the compression library lzlib, compatible\n"
+ "with lzip 1.4 or newer.\n"
+ "\nLzip is a lossless data compressor with a user interface similar to the one\n"
+ "of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov\n"
+ "chain-Algorithm' (LZMA) stream format to maximize interoperability. The\n"
+ "maximum dictionary size is 512 MiB so that any lzip file can be decompressed\n"
+ "on 32-bit machines. Lzip provides accurate and robust 3-factor integrity\n"
+ "checking. Lzip can compress about as fast as gzip (lzip -0) or compress most\n"
+ "files more than bzip2 (lzip -9). Decompression speed is intermediate between\n"
+ "gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery\n"
+ "perspective. Lzip has been designed, written, and tested with great care to\n"
+ "replace gzip and bzip2 as the standard general-purpose compressed format for\n"
+ "Unix-like systems.\n"
+ "\nUsage: %s [options] [files]\n", invocation_name );
+ printf( "\nOptions:\n"
+ " -h, --help display this help and exit\n"
+ " -V, --version output version information and exit\n"
+ " -a, --trailing-error exit with error status if trailing data\n"
+ " -b, --member-size=<bytes> set member size limit in bytes\n"
+ " -c, --stdout write to standard output, keep input files\n"
+ " -d, --decompress decompress, test compressed file integrity\n"
+ " -f, --force overwrite existing output files\n"
+ " -F, --recompress force re-compression of compressed files\n"
+ " -k, --keep keep (don't delete) input files\n"
+ " -m, --match-length=<bytes> set match length limit in bytes [36]\n"
+ " -o, --output=<file> write to <file>, keep input files\n"
+ " -q, --quiet suppress all messages\n"
+ " -s, --dictionary-size=<bytes> set dictionary size limit in bytes [8 MiB]\n"
+ " -S, --volume-size=<bytes> set volume size limit in bytes\n"
+ " -t, --test test compressed file integrity\n"
+ " -v, --verbose be verbose (a 2nd -v gives more)\n"
+ " -0 .. -9 set compression level [default 6]\n"
+ " --fast alias for -0\n"
+ " --best alias for -9\n"
+ " --loose-trailing allow trailing data seeming corrupt header\n"
+ " --check-lib compare version of lzlib.h with liblz.{a,so}\n"
+ "\nIf no file names are given, or if a file is '-', minilzip compresses or\n"
+ "decompresses from standard input to standard output.\n"
+ "Numbers may be followed by a multiplier: k = kB = 10^3 = 1000,\n"
+ "Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc...\n"
+ "Dictionary sizes 12 to 29 are interpreted as powers of two, meaning 2^12 to\n"
+ "2^29 bytes.\n"
+ "\nThe bidimensional parameter space of LZMA can't be mapped to a linear scale\n"
+ "optimal for all files. If your files are large, very repetitive, etc, you\n"
+ "may need to use the options --dictionary-size and --match-length directly\n"
+ "to achieve optimal performance.\n"
+ "\nTo extract all the files from archive 'foo.tar.lz', use the commands\n"
+ "'tar -xf foo.tar.lz' or 'minilzip -cd foo.tar.lz | tar -xf -'.\n"
+ "\nExit status: 0 for a normal exit, 1 for environmental problems\n"
+ "(file not found, invalid command-line options, I/O errors, etc), 2 to\n"
+ "indicate a corrupt or invalid input file, 3 for an internal consistency\n"
+ "error (e.g., bug) which caused minilzip to panic.\n"
+ "\nThe ideas embodied in lzlib are due to (at least) the following people:\n"
+ "Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the\n"
+ "definition of Markov chains), G.N.N. Martin (for the definition of range\n"
+ "encoding), Igor Pavlov (for putting all the above together in LZMA), and\n"
+ "Julian Seward (for bzip2's CLI).\n"
+ "\nReport bugs to lzip-bug@nongnu.org\n"
+ "Lzlib home page: http://www.nongnu.org/lzip/lzlib.html\n" );
+ }
+
+
+static void show_lzlib_version( void )
+ {
+ printf( "Using lzlib %s\n", LZ_version() );
+#if !defined LZ_API_VERSION
+ fputs( "LZ_API_VERSION is not defined.\n", stdout );
+#elif LZ_API_VERSION >= 1012
+ printf( "Using LZ_API_VERSION = %u\n", LZ_api_version() );
+#else
+ printf( "Compiled with LZ_API_VERSION = %u. "
+ "Using an unknown LZ_API_VERSION\n", LZ_API_VERSION );
+#endif
+ }
+
+
+static void show_version( void )
+ {
+ printf( "%s %s\n", program_name, PROGVERSION );
+ printf( "Copyright (C) %s Antonio Diaz Diaz.\n", program_year );
+ show_lzlib_version();
+ printf( "License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>\n"
+ "This is free software: you are free to change and redistribute it.\n"
+ "There is NO WARRANTY, to the extent permitted by law.\n" );
+ }
+
+
+static inline void set_retval( int * retval, const int new_val )
+ { if( *retval < new_val ) *retval = new_val; }
+
+
+static int check_lzlib_ver() /* <major>.<minor> or <major>.<minor>[a-z.-]* */
+ {
+#if defined LZ_API_VERSION && LZ_API_VERSION >= 1012
+ const unsigned char * p = (unsigned char *)LZ_version_string;
+ unsigned major = 0, minor = 0;
+ while( major < 100000 && isdigit( *p ) )
+ { major *= 10; major += *p - '0'; ++p; }
+ if( *p == '.' ) ++p;
+ else
+out: { show_error( "Invalid LZ_version_string in lzlib.h", 0, false ); return 2; }
+ while( minor < 100 && isdigit( *p ) )
+ { minor *= 10; minor += *p - '0'; ++p; }
+ if( *p && *p != '-' && *p != '.' && !islower( *p ) ) goto out;
+ const unsigned version = major * 1000 + minor;
+ if( LZ_API_VERSION != version )
+ {
+ if( verbosity >= 0 )
+ fprintf( stderr, "%s: Version mismatch in lzlib.h: "
+ "LZ_API_VERSION = %u, should be %u.\n",
+ program_name, LZ_API_VERSION, version );
+ return 2;
+ }
+#endif
+ return 0;
+ }
+
+
+static int check_lib()
+ {
+ int retval = check_lzlib_ver();
+ if( strcmp( LZ_version_string, LZ_version() ) != 0 )
+ { set_retval( &retval, 1 );
+ if( verbosity >= 0 )
+ printf( "warning: LZ_version_string != LZ_version() (%s vs %s)\n",
+ LZ_version_string, LZ_version() ); }
+#if defined LZ_API_VERSION && LZ_API_VERSION >= 1012
+ if( LZ_API_VERSION != LZ_api_version() )
+ { set_retval( &retval, 1 );
+ if( verbosity >= 0 )
+ printf( "warning: LZ_API_VERSION != LZ_api_version() (%u vs %u)\n",
+ LZ_API_VERSION, LZ_api_version() ); }
+#endif
+ if( verbosity >= 1 ) show_lzlib_version();
+ return retval;
+ }
+
+
+/* ensure at least a minimum size for buffer 'buf' */
+static void * resize_buffer( void * buf, const unsigned min_size )
+ {
+ if( buf ) buf = realloc( buf, min_size );
+ else buf = malloc( min_size );
+ if( !buf ) { show_error( mem_msg, 0, false ); cleanup_and_fail( 1 ); }
+ return buf;
+ }
+
+
+struct Pretty_print
+ {
+ const char * name;
+ char * padded_name;
+ const char * stdin_name;
+ unsigned longest_name;
+ bool first_post;
+ };
+
+static void Pp_init( struct Pretty_print * const pp,
+ const char * const filenames[], const int num_filenames )
+ {
+ pp->name = 0;
+ pp->padded_name = 0;
+ pp->stdin_name = "(stdin)";
+ pp->longest_name = 0;
+ pp->first_post = false;
+
+ if( verbosity <= 0 ) return;
+ const unsigned stdin_name_len = strlen( pp->stdin_name );
+ int i;
+ for( i = 0; i < num_filenames; ++i )
+ {
+ const char * const s = filenames[i];
+ const unsigned len = (strcmp( s, "-" ) == 0) ? stdin_name_len : strlen( s );
+ if( pp->longest_name < len ) pp->longest_name = len;
+ }
+ if( pp->longest_name == 0 ) pp->longest_name = stdin_name_len;
+ }
+
+static void Pp_set_name( struct Pretty_print * const pp,
+ const char * const filename )
+ {
+ unsigned name_len, padded_name_len, i = 0;
+
+ if( filename && filename[0] && strcmp( filename, "-" ) != 0 )
+ pp->name = filename;
+ else pp->name = pp->stdin_name;
+ name_len = strlen( pp->name );
+ padded_name_len = max( name_len, pp->longest_name ) + 4;
+ pp->padded_name = resize_buffer( pp->padded_name, padded_name_len + 1 );
+ while( i < 2 ) pp->padded_name[i++] = ' ';
+ while( i < name_len + 2 ) { pp->padded_name[i] = pp->name[i-2]; ++i; }
+ pp->padded_name[i++] = ':';
+ while( i < padded_name_len ) pp->padded_name[i++] = ' ';
+ pp->padded_name[i] = 0;
+ pp->first_post = true;
+ }
+
+static void Pp_reset( struct Pretty_print * const pp )
+ { if( pp->name && pp->name[0] ) pp->first_post = true; }
+
+static void Pp_show_msg( struct Pretty_print * const pp, const char * const msg )
+ {
+ if( verbosity < 0 ) return;
+ if( pp->first_post )
+ {
+ pp->first_post = false;
+ fputs( pp->padded_name, stderr );
+ if( !msg ) fflush( stderr );
+ }
+ if( msg ) fprintf( stderr, "%s\n", msg );
+ }
+
+
+static void show_header( const unsigned dictionary_size )
+ {
+ enum { factor = 1024, n = 3 };
+ const char * const prefix[n] = { "Ki", "Mi", "Gi" };
+ const char * p = "";
+ const char * np = " ";
+ unsigned num = dictionary_size;
+ bool exact = ( num % factor == 0 );
+
+ int i; for( i = 0; i < n && ( num > 9999 || ( exact && num >= factor ) ); ++i )
+ { num /= factor; if( num % factor != 0 ) exact = false;
+ p = prefix[i]; np = ""; }
+ fprintf( stderr, "dict %s%4u %sB, ", np, num, p );
+ }
+
+
+/* separate numbers of 5 or more digits in groups of 3 digits using '_' */
+static const char * format_num3( unsigned long long num )
+ {
+ enum { buffers = 8, bufsize = 4 * sizeof num, n = 10 };
+ const char * const si_prefix = "kMGTPEZYRQ";
+ const char * const binary_prefix = "KMGTPEZYRQ";
+ static char buffer[buffers][bufsize]; /* circle of static buffers for printf */
+ static int current = 0;
+ int i;
+ char * const buf = buffer[current++]; current %= buffers;
+ char * p = buf + bufsize - 1; /* fill the buffer backwards */
+ *p = 0; /* terminator */
+ if( num > 1024 )
+ {
+ char prefix = 0; /* try binary first, then si */
+ for( i = 0; i < n && num != 0 && num % 1024 == 0; ++i )
+ { num /= 1024; prefix = binary_prefix[i]; }
+ if( prefix ) *(--p) = 'i';
+ else
+ for( i = 0; i < n && num != 0 && num % 1000 == 0; ++i )
+ { num /= 1000; prefix = si_prefix[i]; }
+ if( prefix ) *(--p) = prefix;
+ }
+ const bool split = num >= 10000;
+
+ for( i = 0; ; )
+ {
+ *(--p) = num % 10 + '0'; num /= 10; if( num == 0 ) break;
+ if( split && ++i >= 3 ) { i = 0; *(--p) = '_'; }
+ }
+ return p;
+ }
+
+
+void show_option_error( const char * const arg, const char * const msg,
+ const char * const option_name )
+ {
+ if( verbosity >= 0 )
+ fprintf( stderr, "%s: '%s': %s option '%s'.\n",
+ program_name, arg, msg, option_name );
+ }
+
+
+/* Recognized formats: <num>k, <num>Ki, <num>[MGTPEZYRQ][i] */
+static unsigned long long getnum( const char * const arg,
+ const char * const option_name,
+ const unsigned long long llimit,
+ const unsigned long long ulimit )
+ {
+ char * tail;
+ errno = 0;
+ unsigned long long result = strtoull( arg, &tail, 0 );
+ if( tail == arg )
+ { show_option_error( arg, "Bad or missing numerical argument in",
+ option_name ); exit( 1 ); }
+
+ if( !errno && tail[0] )
+ {
+ const unsigned factor = ( tail[1] == 'i' ) ? 1024 : 1000;
+ int exponent = 0; /* 0 = bad multiplier */
+ int i;
+ switch( tail[0] )
+ {
+ case 'Q': exponent = 10; break;
+ case 'R': exponent = 9; break;
+ case 'Y': exponent = 8; break;
+ case 'Z': exponent = 7; break;
+ case 'E': exponent = 6; break;
+ case 'P': exponent = 5; break;
+ case 'T': exponent = 4; break;
+ case 'G': exponent = 3; break;
+ case 'M': exponent = 2; break;
+ case 'K': if( factor == 1024 ) exponent = 1; break;
+ case 'k': if( factor == 1000 ) exponent = 1; break;
+ }
+ if( exponent <= 0 )
+ { show_option_error( arg, "Bad multiplier in numerical argument of",
+ option_name ); exit( 1 ); }
+ for( i = 0; i < exponent; ++i )
+ {
+ if( ulimit / factor >= result ) result *= factor;
+ else { errno = ERANGE; break; }
+ }
+ }
+ if( !errno && ( result < llimit || result > ulimit ) ) errno = ERANGE;
+ if( errno )
+ {
+ if( verbosity >= 0 )
+ fprintf( stderr, "%s: '%s': Value out of limits [%s,%s] in "
+ "option '%s'.\n", program_name, arg, format_num3( llimit ),
+ format_num3( ulimit ), option_name );
+ exit( 1 );
+ }
+ return result;
+ }
+
+
+static int get_dict_size( const char * const arg, const char * const option_name )
+ {
+ char * tail;
+ const long bits = strtol( arg, &tail, 0 );
+ if( bits >= LZ_min_dictionary_bits() &&
+ bits <= LZ_max_dictionary_bits() && *tail == 0 )
+ return 1 << bits;
+ int dictionary_size = getnum( arg, option_name, LZ_min_dictionary_size(),
+ LZ_max_dictionary_size() );
+ if( dictionary_size == 65535 ) ++dictionary_size; /* no fast encoder */
+ return dictionary_size;
+ }
+
+
+static void set_mode( enum Mode * const program_modep, const enum Mode new_mode )
+ {
+ if( *program_modep != m_compress && *program_modep != new_mode )
+ {
+ show_error( "Only one operation can be specified.", 0, true );
+ exit( 1 );
+ }
+ *program_modep = new_mode;
+ }
+
+
+static int extension_index( const char * const name )
+ {
+ int eindex;
+ for( eindex = 0; known_extensions[eindex].from; ++eindex )
+ {
+ const char * const ext = known_extensions[eindex].from;
+ const unsigned name_len = strlen( name );
+ const unsigned ext_len = strlen( ext );
+ if( name_len > ext_len &&
+ strncmp( name + name_len - ext_len, ext, ext_len ) == 0 )
+ return eindex;
+ }
+ return -1;
+ }
+
+
+static void set_c_outname( const char * const name, const bool force_ext,
+ const bool multifile )
+ {
+ output_filename = resize_buffer( output_filename, strlen( name ) + 5 +
+ strlen( known_extensions[0].from ) + 1 );
+ strcpy( output_filename, name );
+ if( multifile ) strcat( output_filename, "00001" );
+ if( force_ext || multifile )
+ strcat( output_filename, known_extensions[0].from );
+ }
+
+
+static void set_d_outname( const char * const name, const int eindex )
+ {
+ const unsigned name_len = strlen( name );
+ if( eindex >= 0 )
+ {
+ const char * const from = known_extensions[eindex].from;
+ const unsigned from_len = strlen( from );
+ if( name_len > from_len )
+ {
+ output_filename = resize_buffer( output_filename, name_len +
+ strlen( known_extensions[eindex].to ) + 1 );
+ strcpy( output_filename, name );
+ strcpy( output_filename + name_len - from_len, known_extensions[eindex].to );
+ return;
+ }
+ }
+ output_filename = resize_buffer( output_filename, name_len + 4 + 1 );
+ strcpy( output_filename, name );
+ strcat( output_filename, ".out" );
+ if( verbosity >= 1 )
+ fprintf( stderr, "%s: %s: Can't guess original name -- using '%s'\n",
+ program_name, name, output_filename );
+ }
+
+
+static int open_instream( const char * const name, struct stat * const in_statsp,
+ const enum Mode program_mode, const int eindex,
+ const bool one_to_one, const bool recompress )
+ {
+ if( program_mode == m_compress && !recompress && eindex >= 0 )
+ {
+ if( verbosity >= 0 )
+ fprintf( stderr, "%s: %s: Input file already has '%s' suffix.\n",
+ program_name, name, known_extensions[eindex].from );
+ return -1;
+ }
+ int infd = open( name, O_RDONLY | O_BINARY );
+ if( infd < 0 )
+ show_file_error( name, "Can't open input file", errno );
+ else
+ {
+ const int i = fstat( infd, in_statsp );
+ const mode_t mode = in_statsp->st_mode;
+ const bool can_read = ( i == 0 &&
+ ( S_ISBLK( mode ) || S_ISCHR( mode ) ||
+ S_ISFIFO( mode ) || S_ISSOCK( mode ) ) );
+ if( i != 0 || ( !S_ISREG( mode ) && ( !can_read || one_to_one ) ) )
+ {
+ if( verbosity >= 0 )
+ fprintf( stderr, "%s: %s: Input file is not a regular file%s.\n",
+ program_name, name, ( can_read && one_to_one ) ?
+ ",\n and neither '-c' nor '-o' were specified" : "" );
+ close( infd );
+ infd = -1;
+ }
+ }
+ return infd;
+ }
+
+
+static bool open_outstream( const bool force, const bool protect )
+ {
+ const mode_t usr_rw = S_IRUSR | S_IWUSR;
+ const mode_t all_rw = usr_rw | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH;
+ const mode_t outfd_mode = protect ? usr_rw : all_rw;
+ int flags = O_CREAT | O_WRONLY | O_BINARY;
+ if( force ) flags |= O_TRUNC; else flags |= O_EXCL;
+
+ outfd = open( output_filename, flags, outfd_mode );
+ if( outfd >= 0 ) delete_output_on_interrupt = true;
+ else if( errno == EEXIST )
+ show_file_error( output_filename,
+ "Output file already exists, skipping.", 0 );
+ else
+ show_file_error( output_filename, "Can't create output file", errno );
+ return outfd >= 0;
+ }
+
+
+static void set_signals( void (*action)(int) )
+ {
+ signal( SIGHUP, action );
+ signal( SIGINT, action );
+ signal( SIGTERM, action );
+ }
+
+
+static void cleanup_and_fail( const int retval )
+ {
+ set_signals( SIG_IGN ); /* ignore signals */
+ if( delete_output_on_interrupt )
+ {
+ delete_output_on_interrupt = false;
+ show_file_error( output_filename, "Deleting output file, if it exists.", 0 );
+ if( outfd >= 0 ) { close( outfd ); outfd = -1; }
+ if( remove( output_filename ) != 0 && errno != ENOENT )
+ show_error( "warning: deletion of output file failed", errno, false );
+ }
+ exit( retval );
+ }
+
+
+static void signal_handler( int sig )
+ {
+ if( sig ) {} /* keep compiler happy */
+ show_error( "Control-C or similar caught, quitting.", 0, false );
+ cleanup_and_fail( 1 );
+ }
+
+
+static bool check_tty_in( const char * const input_filename, const int infd,
+ const enum Mode program_mode, int * const retval )
+ {
+ if( ( program_mode == m_decompress || program_mode == m_test ) &&
+ isatty( infd ) ) /* for example /dev/tty */
+ { show_file_error( input_filename,
+ "I won't read compressed data from a terminal.", 0 );
+ close( infd ); set_retval( retval, 2 );
+ if( program_mode != m_test ) cleanup_and_fail( *retval );
+ return false; }
+ return true;
+ }
+
+static bool check_tty_out( const enum Mode program_mode )
+ {
+ if( program_mode == m_compress && isatty( outfd ) )
+ { show_file_error( output_filename[0] ?
+ output_filename : "(stdout)",
+ "I won't write compressed data to a terminal.", 0 );
+ return false; }
+ return true;
+ }
+
+
+/* Set permissions, owner, and times. */
+static void close_and_set_permissions( const struct stat * const in_statsp )
+ {
+ bool warning = false;
+ if( in_statsp )
+ {
+ const mode_t mode = in_statsp->st_mode;
+ /* fchown in many cases returns with EPERM, which can be safely ignored. */
+ if( fchown( outfd, in_statsp->st_uid, in_statsp->st_gid ) == 0 )
+ { if( fchmod( outfd, mode ) != 0 ) warning = true; }
+ else
+ if( errno != EPERM ||
+ fchmod( outfd, mode & ~( S_ISUID | S_ISGID | S_ISVTX ) ) != 0 )
+ warning = true;
+ }
+ if( close( outfd ) != 0 )
+ { show_file_error( output_filename, "Error closing output file", errno );
+ cleanup_and_fail( 1 ); }
+ outfd = -1;
+ delete_output_on_interrupt = false;
+ if( in_statsp )
+ {
+ struct utimbuf t;
+ t.actime = in_statsp->st_atime;
+ t.modtime = in_statsp->st_mtime;
+ if( utime( output_filename, &t ) != 0 ) warning = true;
+ }
+ if( warning && verbosity >= 1 )
+ show_file_error( output_filename,
+ "warning: can't change output file attributes", errno );
+ }
+
+
+/* Return the number of bytes really read.
+ If (value returned < size) and (errno == 0), it means EOF was reached.
+*/
+static int readblock( const int fd, uint8_t * const buf, const int size )
+ {
+ int sz = 0;
+ errno = 0;
+ while( sz < size )
+ {
+ const int n = read( fd, buf + sz, size - sz );
+ if( n > 0 ) sz += n;
+ else if( n == 0 ) break; /* EOF */
+ else if( errno != EINTR ) break;
+ errno = 0;
+ }
+ return sz;
+ }
+
+
+/* Return the number of bytes really written.
+ If (value returned < size), it is always an error.
+*/
+static int writeblock( const int fd, const uint8_t * const buf, const int size )
+ {
+ int sz = 0;
+ errno = 0;
+ while( sz < size )
+ {
+ const int n = write( fd, buf + sz, size - sz );
+ if( n > 0 ) sz += n;
+ else if( n < 0 && errno != EINTR ) break;
+ errno = 0;
+ }
+ return sz;
+ }
+
+
+static bool next_filename( void )
+ {
+ const unsigned name_len = strlen( output_filename );
+ const unsigned ext_len = strlen( known_extensions[0].from );
+ int i, j;
+ if( name_len >= ext_len + 5 ) /* "*00001.lz" */
+ for( i = name_len - ext_len - 1, j = 0; j < 5; --i, ++j )
+ {
+ if( output_filename[i] < '9' ) { ++output_filename[i]; return true; }
+ else output_filename[i] = '0';
+ }
+ return false;
+ }
+
+
+static int do_compress( struct LZ_Encoder * const encoder,
+ const unsigned long long member_size,
+ const unsigned long long volume_size, const int infd,
+ struct Pretty_print * const pp,
+ const struct stat * const in_statsp )
+ {
+ unsigned long long partial_volume_size = 0;
+ enum { buffer_size = 65536 };
+ uint8_t buffer[buffer_size]; /* read/write buffer */
+ if( verbosity >= 1 ) Pp_show_msg( pp, 0 );
+
+ while( true )
+ {
+ int in_size = 0;
+ while( LZ_compress_write_size( encoder ) > 0 )
+ {
+ const int size = min( LZ_compress_write_size( encoder ), buffer_size );
+ const int rd = readblock( infd, buffer, size );
+ if( rd != size && errno )
+ {
+ Pp_show_msg( pp, 0 ); show_error( "Read error", errno, false );
+ return 1;
+ }
+ if( rd > 0 && rd != LZ_compress_write( encoder, buffer, rd ) )
+ internal_error( "library error (LZ_compress_write)." );
+ if( rd < size ) LZ_compress_finish( encoder );
+/* else LZ_compress_sync_flush( encoder ); */
+ in_size += rd;
+ }
+ const int out_size = LZ_compress_read( encoder, buffer, buffer_size );
+ if( out_size < 0 )
+ {
+ Pp_show_msg( pp, 0 );
+ if( verbosity >= 0 )
+ fprintf( stderr, "%s: LZ_compress_read error: %s\n",
+ program_name, LZ_strerror( LZ_compress_errno( encoder ) ) );
+ return 1;
+ }
+ else if( out_size > 0 )
+ {
+ const int wr = writeblock( outfd, buffer, out_size );
+ if( wr != out_size )
+ {
+ Pp_show_msg( pp, 0 ); show_error( "Write error", errno, false );
+ return 1;
+ }
+ }
+ else if( in_size == 0 )
+ internal_error( "library error (LZ_compress_read)." );
+ if( LZ_compress_member_finished( encoder ) )
+ {
+ unsigned long long size;
+ if( LZ_compress_finished( encoder ) == 1 ) break;
+ if( volume_size > 0 )
+ {
+ partial_volume_size += LZ_compress_member_position( encoder );
+ if( partial_volume_size >= volume_size - LZ_min_dictionary_size() )
+ {
+ partial_volume_size = 0;
+ if( delete_output_on_interrupt )
+ {
+ close_and_set_permissions( in_statsp );
+ if( !next_filename() )
+ { Pp_show_msg( pp, "Too many volume files." ); return 1; }
+ if( !open_outstream( true, in_statsp ) ) return 1;
+ }
+ }
+ size = min( member_size, volume_size - partial_volume_size );
+ }
+ else
+ size = member_size;
+ if( LZ_compress_restart_member( encoder, size ) < 0 )
+ {
+ Pp_show_msg( pp, 0 );
+ if( verbosity >= 0 )
+ fprintf( stderr, "%s: LZ_compress_restart_member error: %s\n",
+ program_name, LZ_strerror( LZ_compress_errno( encoder ) ) );
+ return 1;
+ }
+ }
+ }
+
+ if( verbosity >= 1 )
+ {
+ const unsigned long long in_size = LZ_compress_total_in_size( encoder );
+ const unsigned long long out_size = LZ_compress_total_out_size( encoder );
+ if( in_size == 0 || out_size == 0 )
+ fputs( " no data compressed.\n", stderr );
+ else
+ fprintf( stderr, "%6.3f:1, %5.2f%% ratio, %5.2f%% saved, "
+ "%llu in, %llu out.\n",
+ (double)in_size / out_size,
+ ( 100.0 * out_size ) / in_size,
+ 100.0 - ( ( 100.0 * out_size ) / in_size ),
+ in_size, out_size );
+ }
+ return 0;
+ }
+
+
+static int compress( const unsigned long long member_size,
+ const unsigned long long volume_size, const int infd,
+ const struct Lzma_options * const encoder_options,
+ struct Pretty_print * const pp,
+ const struct stat * const in_statsp )
+ {
+ struct LZ_Encoder * const encoder =
+ LZ_compress_open( encoder_options->dictionary_size,
+ encoder_options->match_len_limit, ( volume_size > 0 ) ?
+ min( member_size, volume_size ) : member_size );
+ int retval;
+
+ if( !encoder || LZ_compress_errno( encoder ) != LZ_ok )
+ {
+ if( !encoder || LZ_compress_errno( encoder ) == LZ_mem_error )
+ Pp_show_msg( pp, "Not enough memory. Try a smaller dictionary size." );
+ else
+ internal_error( "invalid argument to encoder." );
+ retval = 1;
+ }
+ else retval = do_compress( encoder, member_size, volume_size,
+ infd, pp, in_statsp );
+ LZ_compress_close( encoder );
+ return retval;
+ }
+
+
+static int do_decompress( struct LZ_Decoder * const decoder, const int infd,
+ struct Pretty_print * const pp, const bool ignore_trailing,
+ const bool loose_trailing, const bool testing )
+ {
+ enum { buffer_size = 65536 };
+ uint8_t buffer[buffer_size]; /* read/write buffer */
+ unsigned long long total_in = 0; /* to detect library stall */
+ bool first_member;
+
+ for( first_member = true; ; )
+ {
+ const int max_in_size =
+ min( LZ_decompress_write_size( decoder ), buffer_size );
+ int in_size = 0, out_size = 0;
+ if( max_in_size > 0 )
+ {
+ in_size = readblock( infd, buffer, max_in_size );
+ if( in_size != max_in_size && errno )
+ {
+ Pp_show_msg( pp, 0 ); show_error( "Read error", errno, false );
+ return 1;
+ }
+ if( in_size > 0 && in_size != LZ_decompress_write( decoder, buffer, in_size ) )
+ internal_error( "library error (LZ_decompress_write)." );
+ if( in_size < max_in_size ) LZ_decompress_finish( decoder );
+ }
+ while( true )
+ {
+ const int rd =
+ LZ_decompress_read( decoder, (outfd >= 0) ? buffer : 0, buffer_size );
+ if( rd > 0 )
+ {
+ out_size += rd;
+ if( outfd >= 0 )
+ {
+ const int wr = writeblock( outfd, buffer, rd );
+ if( wr != rd )
+ {
+ Pp_show_msg( pp, 0 ); show_error( "Write error", errno, false );
+ return 1;
+ }
+ }
+ }
+ else if( rd < 0 ) { out_size = rd; break; }
+ if( LZ_decompress_member_finished( decoder ) == 1 )
+ {
+ if( verbosity >= 1 )
+ {
+ const unsigned long long data_size = LZ_decompress_data_position( decoder );
+ const unsigned long long member_size = LZ_decompress_member_position( decoder );
+ if( verbosity >= 2 || ( verbosity == 1 && first_member ) )
+ Pp_show_msg( pp, 0 );
+ if( verbosity >= 2 )
+ {
+ if( verbosity >= 4 )
+ show_header( LZ_decompress_dictionary_size( decoder ) );
+ if( data_size == 0 || member_size == 0 )
+ fputs( "no data compressed. ", stderr );
+ else
+ fprintf( stderr, "%6.3f:1, %5.2f%% ratio, %5.2f%% saved. ",
+ (double)data_size / member_size,
+ ( 100.0 * member_size ) / data_size,
+ 100.0 - ( ( 100.0 * member_size ) / data_size ) );
+ if( verbosity >= 4 )
+ fprintf( stderr, "CRC %08X, ", LZ_decompress_data_crc( decoder ) );
+ if( verbosity >= 3 )
+ fprintf( stderr, "%9llu out, %8llu in. ", data_size, member_size );
+ fputs( testing ? "ok\n" : "done\n", stderr ); Pp_reset( pp );
+ }
+ }
+ first_member = false; /* member decompressed successfully */
+ }
+ if( rd <= 0 ) break;
+ }
+ if( out_size < 0 || ( first_member && out_size == 0 ) )
+ {
+ const unsigned long long member_pos = LZ_decompress_member_position( decoder );
+ const enum LZ_Errno lz_errno = LZ_decompress_errno( decoder );
+ if( lz_errno == LZ_library_error )
+ internal_error( "library error (LZ_decompress_read)." );
+ if( member_pos <= 6 )
+ {
+ if( lz_errno == LZ_unexpected_eof )
+ {
+ if( first_member )
+ show_file_error( pp->name, "File ends unexpectedly at member header.", 0 );
+ else
+ Pp_show_msg( pp, "Truncated header in multimember file." );
+ return 2;
+ }
+ else if( lz_errno == LZ_data_error )
+ {
+ if( member_pos == 4 )
+ { if( verbosity >= 0 )
+ { Pp_show_msg( pp, 0 );
+ fprintf( stderr, "Version %d member format not supported.\n",
+ LZ_decompress_member_version( decoder ) ); } }
+ else if( member_pos == 5 )
+ Pp_show_msg( pp, "Invalid dictionary size in member header." );
+ else if( first_member ) /* for lzlib older than 1.10 */
+ Pp_show_msg( pp, "Bad version or dictionary size in member header." );
+ else if( !loose_trailing )
+ Pp_show_msg( pp, "Corrupt header in multimember file." );
+ else if( !ignore_trailing )
+ Pp_show_msg( pp, "Trailing data not allowed." );
+ else break; /* trailing data */
+ return 2;
+ }
+ }
+ if( lz_errno == LZ_header_error )
+ {
+ if( first_member )
+ show_file_error( pp->name,
+ "Bad magic number (file not in lzip format).", 0 );
+ else if( !ignore_trailing )
+ Pp_show_msg( pp, "Trailing data not allowed." );
+ else break; /* trailing data */
+ return 2;
+ }
+ if( lz_errno == LZ_mem_error ) { Pp_show_msg( pp, mem_msg ); return 1; }
+ if( verbosity >= 0 )
+ {
+ Pp_show_msg( pp, 0 );
+ fprintf( stderr, "%s at pos %llu\n", ( lz_errno == LZ_unexpected_eof ) ?
+ "File ends unexpectedly" : "Decoder error",
+ LZ_decompress_total_in_size( decoder ) );
+ }
+ return 2;
+ }
+ if( LZ_decompress_finished( decoder ) == 1 ) break;
+ if( in_size == 0 && out_size == 0 )
+ {
+ const unsigned long long size = LZ_decompress_total_in_size( decoder );
+ if( total_in == size ) internal_error( "library error (stalled)." );
+ total_in = size;
+ }
+ }
+ if( verbosity == 1 ) fputs( testing ? "ok\n" : "done\n", stderr );
+ return 0;
+ }
+
+
+static int decompress( const int infd, struct Pretty_print * const pp,
+ const bool ignore_trailing,
+ const bool loose_trailing, const bool testing )
+ {
+ struct LZ_Decoder * const decoder = LZ_decompress_open();
+ int retval;
+
+ if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok )
+ { Pp_show_msg( pp, mem_msg ); retval = 1; }
+ else retval = do_decompress( decoder, infd, pp, ignore_trailing,
+ loose_trailing, testing );
+ LZ_decompress_close( decoder );
+ return retval;
+ }
+
+
+static void show_error( const char * const msg, const int errcode,
+ const bool help )
+ {
+ if( verbosity < 0 ) return;
+ if( msg && msg[0] )
+ fprintf( stderr, "%s: %s%s%s\n", program_name, msg,
+ ( errcode > 0 ) ? ": " : "",
+ ( errcode > 0 ) ? strerror( errcode ) : "" );
+ if( help )
+ fprintf( stderr, "Try '%s --help' for more information.\n",
+ invocation_name );
+ }
+
+
+static void show_file_error( const char * const filename,
+ const char * const msg, const int errcode )
+ {
+ if( verbosity >= 0 )
+ fprintf( stderr, "%s: %s: %s%s%s\n", program_name, filename, msg,
+ ( errcode > 0 ) ? ": " : "",
+ ( errcode > 0 ) ? strerror( errcode ) : "" );
+ }
+
+
+static void internal_error( const char * const msg )
+ {
+ if( verbosity >= 0 )
+ fprintf( stderr, "%s: internal error: %s\n", program_name, msg );
+ exit( 3 );
+ }
+
+
+int main( const int argc, const char * const argv[] )
+ {
+ /* Mapping from gzip/bzip2 style 0..9 compression levels to the
+ corresponding LZMA compression parameters. */
+ const struct Lzma_options option_mapping[] =
+ {
+ { 65535, 16 }, /* -0 (65535,16 chooses fast encoder) */
+ { 1 << 20, 5 }, /* -1 */
+ { 3 << 19, 6 }, /* -2 */
+ { 1 << 21, 8 }, /* -3 */
+ { 3 << 20, 12 }, /* -4 */
+ { 1 << 22, 20 }, /* -5 */
+ { 1 << 23, 36 }, /* -6 */
+ { 1 << 24, 68 }, /* -7 */
+ { 3 << 23, 132 }, /* -8 */
+ { 1 << 25, 273 } }; /* -9 */
+ struct Lzma_options encoder_options = option_mapping[6]; /* default = "-6" */
+ const unsigned long long max_member_size = 0x0008000000000000ULL; /* 2 PiB */
+ const unsigned long long max_volume_size = 0x4000000000000000ULL; /* 4 EiB */
+ unsigned long long member_size = max_member_size;
+ unsigned long long volume_size = 0;
+ const char * default_output_filename = "";
+ enum Mode program_mode = m_compress;
+ int i;
+ bool force = false;
+ bool ignore_trailing = true;
+ bool keep_input_files = false;
+ bool loose_trailing = false;
+ bool recompress = false;
+ bool to_stdout = false;
+ if( argc > 0 ) invocation_name = argv[0];
+
+ enum { opt_chk = 256, opt_lt };
+ const struct ap_Option options[] =
+ {
+ { '0', "fast", ap_no },
+ { '1', 0, ap_no },
+ { '2', 0, ap_no },
+ { '3', 0, ap_no },
+ { '4', 0, ap_no },
+ { '5', 0, ap_no },
+ { '6', 0, ap_no },
+ { '7', 0, ap_no },
+ { '8', 0, ap_no },
+ { '9', "best", ap_no },
+ { 'a', "trailing-error", ap_no },
+ { 'b', "member-size", ap_yes },
+ { 'c', "stdout", ap_no },
+ { 'd', "decompress", ap_no },
+ { 'f', "force", ap_no },
+ { 'F', "recompress", ap_no },
+ { 'h', "help", ap_no },
+ { 'k', "keep", ap_no },
+ { 'm', "match-length", ap_yes },
+ { 'n', "threads", ap_yes },
+ { 'o', "output", ap_yes },
+ { 'q', "quiet", ap_no },
+ { 's', "dictionary-size", ap_yes },
+ { 'S', "volume-size", ap_yes },
+ { 't', "test", ap_no },
+ { 'v', "verbose", ap_no },
+ { 'V', "version", ap_no },
+ { opt_chk, "check-lib", ap_no },
+ { opt_lt, "loose-trailing", ap_no },
+ { 0, 0, ap_no } };
+
+ /* static because valgrind complains and memory management in C sucks */
+ static struct Arg_parser parser;
+ if( !ap_init( &parser, argc, argv, options, 0 ) )
+ { show_error( mem_msg, 0, false ); return 1; }
+ if( ap_error( &parser ) ) /* bad option */
+ { show_error( ap_error( &parser ), 0, true ); return 1; }
+
+ int argind = 0;
+ for( ; argind < ap_arguments( &parser ); ++argind )
+ {
+ const int code = ap_code( &parser, argind );
+ if( !code ) break; /* no more options */
+ const char * const pn = ap_parsed_name( &parser, argind );
+ const char * const arg = ap_argument( &parser, argind );
+ switch( code )
+ {
+ case '0': case '1': case '2': case '3': case '4':
+ case '5': case '6': case '7': case '8': case '9':
+ encoder_options = option_mapping[code-'0']; break;
+ case 'a': ignore_trailing = false; break;
+ case 'b': member_size = getnum( arg, pn, 100000, max_member_size ); break;
+ case 'c': to_stdout = true; break;
+ case 'd': set_mode( &program_mode, m_decompress ); break;
+ case 'f': force = true; break;
+ case 'F': recompress = true; break;
+ case 'h': show_help(); return 0;
+ case 'k': keep_input_files = true; break;
+ case 'm': encoder_options.match_len_limit =
+ getnum( arg, pn, LZ_min_match_len_limit(),
+ LZ_max_match_len_limit() ); break;
+ case 'n': break;
+ case 'o': if( strcmp( arg, "-" ) == 0 ) to_stdout = true;
+ else { default_output_filename = arg; } break;
+ case 'q': verbosity = -1; break;
+ case 's': encoder_options.dictionary_size = get_dict_size( arg, pn );
+ break;
+ case 'S': volume_size = getnum( arg, pn, 100000, max_volume_size ); break;
+ case 't': set_mode( &program_mode, m_test ); break;
+ case 'v': if( verbosity < 4 ) ++verbosity; break;
+ case 'V': show_version(); return 0;
+ case opt_chk: return check_lib();
+ case opt_lt: loose_trailing = true; break;
+ default: internal_error( "uncaught option." );
+ }
+ } /* end process options */
+
+ if( strcmp( PROGVERSION, LZ_version_string ) != 0 )
+ internal_error( "wrong PROGVERSION." );
+#if !defined LZ_API_VERSION || LZ_API_VERSION < 1012
+#error "lzlib 1.12 or newer needed."
+#else
+ if( LZ_api_version() < 1012 ) /* minilzip passes null to LZ_decompress_read */
+ { show_error( "lzlib 1.12 or newer needed. Try --check-lib.", 0, false );
+ return 1; }
+ if( LZ_api_version() != LZ_API_VERSION ) show_error(
+ "warning: wrong library API version. Try --check-lib.", 0, false );
+ else
+#endif
+ if( strcmp( LZ_version_string, LZ_version() ) != 0 ) show_error(
+ "warning: wrong library version_string. Try --check-lib.", 0, false );
+
+#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__
+ setmode( STDIN_FILENO, O_BINARY );
+ setmode( STDOUT_FILENO, O_BINARY );
+#endif
+
+ static const char ** filenames = 0;
+ int num_filenames = max( 1, ap_arguments( &parser ) - argind );
+ filenames = resize_buffer( filenames, num_filenames * sizeof filenames[0] );
+ filenames[0] = "-";
+
+ bool filenames_given = false;
+ for( i = 0; argind + i < ap_arguments( &parser ); ++i )
+ {
+ filenames[i] = ap_argument( &parser, argind + i );
+ if( strcmp( filenames[i], "-" ) != 0 ) filenames_given = true;
+ }
+
+ if( program_mode == m_compress )
+ {
+ if( volume_size > 0 && !to_stdout && default_output_filename[0] &&
+ num_filenames > 1 )
+ { show_error( "Can compress only one file when using '-o' and '-S'.",
+ 0, true ); return 1; }
+ }
+ else volume_size = 0;
+ if( program_mode == m_test ) to_stdout = false; /* apply overrides */
+ if( program_mode == m_test || to_stdout ) default_output_filename = "";
+
+ output_filename = resize_buffer( output_filename, 1 );
+ output_filename[0] = 0;
+ if( to_stdout && program_mode != m_test ) /* check tty only once */
+ { outfd = STDOUT_FILENO; if( !check_tty_out( program_mode ) ) return 1; }
+ else outfd = -1;
+
+ const bool to_file = !to_stdout && program_mode != m_test &&
+ default_output_filename[0];
+ if( !to_stdout && program_mode != m_test && ( filenames_given || to_file ) )
+ set_signals( signal_handler );
+
+ static struct Pretty_print pp;
+ Pp_init( &pp, filenames, num_filenames );
+
+ int failed_tests = 0;
+ int retval = 0;
+ const bool one_to_one = !to_stdout && program_mode != m_test && !to_file;
+ bool stdin_used = false;
+ struct stat in_stats;
+ for( i = 0; i < num_filenames; ++i )
+ {
+ const char * input_filename = "";
+ int infd;
+
+ Pp_set_name( &pp, filenames[i] );
+ if( strcmp( filenames[i], "-" ) == 0 )
+ {
+ if( stdin_used ) continue; else stdin_used = true;
+ infd = STDIN_FILENO;
+ if( !check_tty_in( pp.name, infd, program_mode, &retval ) ) continue;
+ if( one_to_one ) { outfd = STDOUT_FILENO; output_filename[0] = 0; }
+ }
+ else
+ {
+ const int eindex = extension_index( input_filename = filenames[i] );
+ infd = open_instream( input_filename, &in_stats, program_mode,
+ eindex, one_to_one, recompress );
+ if( infd < 0 ) { set_retval( &retval, 1 ); continue; }
+ if( !check_tty_in( pp.name, infd, program_mode, &retval ) ) continue;
+ if( one_to_one ) /* open outfd after checking infd */
+ {
+ if( program_mode == m_compress )
+ set_c_outname( input_filename, true, volume_size > 0 );
+ else set_d_outname( input_filename, eindex );
+ if( !open_outstream( force, true ) )
+ { close( infd ); set_retval( &retval, 1 ); continue; }
+ }
+ }
+
+ if( one_to_one && !check_tty_out( program_mode ) )
+ { set_retval( &retval, 1 ); return retval; } /* don't delete a tty */
+
+ if( to_file && outfd < 0 ) /* open outfd after checking infd */
+ {
+ if( program_mode == m_compress ) set_c_outname( default_output_filename,
+ false, volume_size > 0 );
+ else
+ { output_filename = resize_buffer( output_filename,
+ strlen( default_output_filename ) + 1 );
+ strcpy( output_filename, default_output_filename ); }
+ if( !open_outstream( force, false ) || !check_tty_out( program_mode ) )
+ return 1; /* check tty only once and don't try to delete a tty */
+ }
+
+ const struct stat * const in_statsp =
+ ( input_filename[0] && one_to_one ) ? &in_stats : 0;
+ int tmp;
+ if( program_mode == m_compress )
+ tmp = compress( member_size, volume_size, infd, &encoder_options, &pp,
+ in_statsp );
+ else
+ tmp = decompress( infd, &pp, ignore_trailing, loose_trailing,
+ program_mode == m_test );
+ if( close( infd ) != 0 )
+ { show_file_error( pp.name, "Error closing input file", errno );
+ set_retval( &tmp, 1 ); }
+ set_retval( &retval, tmp );
+ if( tmp )
+ { if( program_mode != m_test ) cleanup_and_fail( retval );
+ else ++failed_tests; }
+
+ if( delete_output_on_interrupt && one_to_one )
+ close_and_set_permissions( in_statsp );
+ if( input_filename[0] && !keep_input_files && one_to_one &&
+ ( program_mode != m_compress || volume_size == 0 ) )
+ remove( input_filename );
+ }
+ if( delete_output_on_interrupt ) /* -o */
+ close_and_set_permissions( ( retval == 0 && !stdin_used &&
+ filenames_given && num_filenames == 1 ) ? &in_stats : 0 );
+ else if( outfd >= 0 && close( outfd ) != 0 ) /* -c */
+ {
+ show_error( "Error closing stdout", errno, false );
+ set_retval( &retval, 1 );
+ }
+ if( failed_tests > 0 && verbosity >= 1 && num_filenames > 1 )
+ fprintf( stderr, "%s: warning: %d %s failed the test.\n",
+ program_name, failed_tests,
+ ( failed_tests == 1 ) ? "file" : "files" );
+ free( output_filename );
+ free( filenames );
+ ap_free( &parser );
+ return retval;
+ }
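Review note: the per-file loop in `main` above folds each file's result into `retval` through `set_retval`, whose definition lies outside this excerpt. Assuming it keeps the larger of the two values (an assumption, though it agrees with the statuses `check.sh` expects when a good file is mixed with a missing or corrupt one), the status a caller sees is the worst one across all inputs. A minimal shell sketch of that folding, where `worst_status` is a hypothetical helper and not part of minilzip:

```shell
#!/bin/sh
# worst_status: hypothetical helper, not part of minilzip or lzlib.
# Prints the highest of the exit statuses given as arguments, which is
# how set_retval is ASSUMED to combine per-file results.
worst_status() {
  worst=0
  for s in "$@" ; do
    [ "${s}" -le "${worst}" ] || worst=${s}
  done
  printf '%s\n' "${worst}"
}

# One good file (0), one corrupt file (2), one missing file (1).
worst_status 0 2 1
```

Under that assumption a single corrupt input dominates the final status even when every other file is processed successfully.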
diff --git a/testsuite/check.sh b/testsuite/check.sh
new file mode 100755
index 0000000..1c5daf7
--- /dev/null
+++ b/testsuite/check.sh
@@ -0,0 +1,449 @@
+#! /bin/sh
+# check script for Lzlib - Compression library for the lzip format
+# Copyright (C) 2009-2024 Antonio Diaz Diaz.
+#
+# This script is free software: you have unlimited permission
+# to copy, distribute, and modify it.
+
+LC_ALL=C
+export LC_ALL
+objdir=`pwd`
+testdir=`cd "$1" ; pwd`
+LZIP="${objdir}"/minilzip
+BBEXAMPLE="${objdir}"/bbexample
+FFEXAMPLE="${objdir}"/ffexample
+LZCHECK="${objdir}"/lzcheck
+framework_failure() { echo "failure in testing framework" ; exit 1 ; }
+
+if [ ! -f "${LZIP}" ] || [ ! -x "${LZIP}" ] ; then
+ echo "${LZIP}: cannot execute"
+ exit 1
+fi
+
+[ -e "${LZIP}" ] 2> /dev/null ||
+ {
+ echo "$0: a POSIX shell is required to run the tests"
+ echo "Try bash -c \"$0 $1 $2\""
+ exit 1
+ }
+
+if [ -d tmp ] ; then rm -rf tmp ; fi
+mkdir tmp
+cd "${objdir}"/tmp || framework_failure
+
+cat "${testdir}"/test.txt > in || framework_failure
+in_lz="${testdir}"/test.txt.lz
+in_em="${testdir}"/test_em.txt.lz
+fox_lf="${testdir}"/fox_lf
+fox_lz="${testdir}"/fox.lz
+fail=0
+test_failed() { fail=1 ; printf " $1" ; [ -z "$2" ] || printf "($2)" ; }
+
+"${LZIP}" --check-lib # just print warning
+[ $? != 2 ] || { test_failed $LINENO ; exit 2 ; } # unless bad lzlib.h
+
+printf "testing lzlib-%s..." "$2"
+
+"${LZIP}" -fkqm4 in
+[ $? = 1 ] || test_failed $LINENO
+[ ! -e in.lz ] || test_failed $LINENO
+"${LZIP}" -fkqm274 in
+[ $? = 1 ] || test_failed $LINENO
+[ ! -e in.lz ] || test_failed $LINENO
+for i in bad_size -1 0 4095 513MiB 1G 1T 1P 1E 1Z 1Y 10KB ; do
+ "${LZIP}" -fkqs $i in
+ [ $? = 1 ] || test_failed $LINENO $i
+ [ ! -e in.lz ] || test_failed $LINENO $i
+done
+"${LZIP}" -tq in
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -tq < in
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -cdq in
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -cdq < in
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -dq -o in < "${in_lz}"
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -dq -o in "${in_lz}"
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -dq -o out nx_file.lz
+[ $? = 1 ] || test_failed $LINENO
+[ ! -e out ] || test_failed $LINENO
+"${LZIP}" -q -o out.lz nx_file
+[ $? = 1 ] || test_failed $LINENO
+[ ! -e out.lz ] || test_failed $LINENO
+"${LZIP}" -qf -S100k -o out in in
+[ $? = 1 ] || test_failed $LINENO
+{ [ ! -e out ] && [ ! -e out.lz ] ; } || test_failed $LINENO
+# these are for code coverage
+"${LZIP}" -cdt "${in_lz}" 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -t -- nx_file.lz 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -t "" < /dev/null 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --help > /dev/null || test_failed $LINENO
+"${LZIP}" -n1 -V > /dev/null || test_failed $LINENO
+"${LZIP}" -m 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -z 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --bad_option 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --t 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --test=2 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --output= 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --output 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+printf "LZIP\001-.............................." | "${LZIP}" -t 2> /dev/null
+printf "LZIP\002-.............................." | "${LZIP}" -t 2> /dev/null
+printf "LZIP\001+.............................." | "${LZIP}" -t 2> /dev/null
+
+printf "\ntesting decompression..."
+
+for i in "${in_lz}" "${in_em}" "${testdir}"/test_sync.lz ; do
+ "${LZIP}" -t "$i" || test_failed $LINENO "$i"
+ "${LZIP}" -d "$i" -o out || test_failed $LINENO "$i"
+ cmp in out || test_failed $LINENO "$i"
+ "${LZIP}" -cd "$i" > out || test_failed $LINENO "$i"
+ cmp in out || test_failed $LINENO "$i"
+ "${LZIP}" -d "$i" -o - > out || test_failed $LINENO "$i"
+ cmp in out || test_failed $LINENO "$i"
+ "${LZIP}" -d < "$i" > out || test_failed $LINENO "$i"
+ cmp in out || test_failed $LINENO "$i"
+ rm -f out || framework_failure
+done
+
+lines=`"${LZIP}" -tvv "${in_em}" 2>&1 | wc -l` || test_failed $LINENO
+[ "${lines}" -eq 8 ] || test_failed $LINENO "${lines}"
+
+cat "${in_lz}" > out.lz || framework_failure
+"${LZIP}" -dk out.lz || test_failed $LINENO
+cmp in out || test_failed $LINENO
+rm -f out || framework_failure
+"${LZIP}" -cd "${fox_lz}" > fox || test_failed $LINENO
+cat fox > copy || framework_failure
+cat "${in_lz}" > copy.lz || framework_failure
+"${LZIP}" -d copy.lz out.lz 2> /dev/null # skip copy, decompress out
+[ $? = 1 ] || test_failed $LINENO
+[ ! -e out.lz ] || test_failed $LINENO
+cmp fox copy || test_failed $LINENO
+cmp in out || test_failed $LINENO
+"${LZIP}" -df copy.lz || test_failed $LINENO
+[ ! -e copy.lz ] || test_failed $LINENO
+cmp in copy || test_failed $LINENO
+rm -f copy out || framework_failure
+
+cat "${in_lz}" > out.lz || framework_failure
+"${LZIP}" -d -S100k out.lz || test_failed $LINENO # ignore -S
+[ ! -e out.lz ] || test_failed $LINENO
+cmp in out || test_failed $LINENO
+
+printf "to be overwritten" > out || framework_failure
+"${LZIP}" -df -o out < "${in_lz}" || test_failed $LINENO
+cmp in out || test_failed $LINENO
+rm -f out || framework_failure
+"${LZIP}" -d -o ./- "${in_lz}" || test_failed $LINENO
+cmp in ./- || test_failed $LINENO
+rm -f ./- || framework_failure
+"${LZIP}" -d -o ./- < "${in_lz}" || test_failed $LINENO
+cmp in ./- || test_failed $LINENO
+rm -f ./- || framework_failure
+
+cat "${in_lz}" > anyothername || framework_failure
+"${LZIP}" -dv - anyothername - < "${in_lz}" > out 2> /dev/null ||
+ test_failed $LINENO
+cmp in out || test_failed $LINENO
+cmp in anyothername.out || test_failed $LINENO
+rm -f out anyothername.out || framework_failure
+
+"${LZIP}" -tq in "${in_lz}"
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -tq nx_file.lz "${in_lz}"
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -cdq in "${in_lz}" > out
+[ $? = 2 ] || test_failed $LINENO
+cat out in | cmp in - || test_failed $LINENO # out must be empty
+"${LZIP}" -cdq nx_file.lz "${in_lz}" > out # skip nx_file, decompress in
+[ $? = 1 ] || test_failed $LINENO
+cmp in out || test_failed $LINENO
+rm -f out || framework_failure
+cat "${in_lz}" > out.lz || framework_failure
+for i in 1 2 3 4 5 6 7 ; do
+ printf "g" >> out.lz || framework_failure
+ "${LZIP}" -atvvvv out.lz "${in_lz}" 2> /dev/null
+ [ $? = 2 ] || test_failed $LINENO $i
+done
+"${LZIP}" -dq in out.lz
+[ $? = 2 ] || test_failed $LINENO
+[ -e out.lz ] || test_failed $LINENO
+[ ! -e out ] || test_failed $LINENO
+[ ! -e in.out ] || test_failed $LINENO
+"${LZIP}" -dq nx_file.lz out.lz
+[ $? = 1 ] || test_failed $LINENO
+[ ! -e out.lz ] || test_failed $LINENO
+[ ! -e nx_file ] || test_failed $LINENO
+cmp in out || test_failed $LINENO
+rm -f out || framework_failure
+
+cat in in > in2 || framework_failure
+"${LZIP}" -t "${in_lz}" "${in_lz}" || test_failed $LINENO
+"${LZIP}" -cd "${in_lz}" "${in_lz}" -o out > out2 || test_failed $LINENO
+[ ! -e out ] || test_failed $LINENO # override -o
+cmp in2 out2 || test_failed $LINENO
+rm -f out2 || framework_failure
+"${LZIP}" -d "${in_lz}" "${in_lz}" -o out2 || test_failed $LINENO
+cmp in2 out2 || test_failed $LINENO
+rm -f out2 || framework_failure
+
+cat "${in_lz}" "${in_lz}" > out2.lz || framework_failure
+printf "\ngarbage" >> out2.lz || framework_failure
+"${LZIP}" -tvvvv out2.lz 2> /dev/null || test_failed $LINENO
+"${LZIP}" -atq out2.lz
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -atq < out2.lz
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -adkq out2.lz
+[ $? = 2 ] || test_failed $LINENO
+[ ! -e out2 ] || test_failed $LINENO
+"${LZIP}" -adkq -o out2 < out2.lz
+[ $? = 2 ] || test_failed $LINENO
+[ ! -e out2 ] || test_failed $LINENO
+printf "to be overwritten" > out2 || framework_failure
+"${LZIP}" -df out2.lz || test_failed $LINENO
+cmp in2 out2 || test_failed $LINENO
+rm -f out2 || framework_failure
+
+printf "\ntesting compression..."
+
+"${LZIP}" -c -0 in in in -S100k -o out3.lz > copy2.lz || test_failed $LINENO
+[ ! -e out3.lz ] || test_failed $LINENO # override -o and -S
+"${LZIP}" -0f in in --output=copy2.lz || test_failed $LINENO
+"${LZIP}" -d copy2.lz -o out2 || test_failed $LINENO
+[ -e copy2.lz ] || test_failed $LINENO
+cmp in2 out2 || test_failed $LINENO
+rm -f out2 copy2.lz || framework_failure
+
+"${LZIP}" -cf "${in_lz}" > lzlz 2> /dev/null # /dev/null is a tty on OS/2
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -Fvvm36 -o - -s16 "${in_lz}" > lzlz 2> /dev/null || test_failed $LINENO
+"${LZIP}" -cd lzlz | "${LZIP}" -d > out || test_failed $LINENO
+cmp in out || test_failed $LINENO
+rm -f lzlz out || framework_failure
+
+"${LZIP}" -0 -o ./- in || test_failed $LINENO
+"${LZIP}" -cd ./- | cmp in - || test_failed $LINENO
+rm -f ./- || framework_failure
+"${LZIP}" -0 -o ./- < in || test_failed $LINENO # don't add .lz
+[ ! -e ./-.lz ] || test_failed $LINENO
+"${LZIP}" -cd ./- | cmp in - || test_failed $LINENO
+rm -f ./- || framework_failure
+
+for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
+ "${LZIP}" -k -$i -s16 in || test_failed $LINENO $i
+ mv in.lz out.lz || test_failed $LINENO $i
+ printf "garbage" >> out.lz || framework_failure
+ "${LZIP}" -df out.lz || test_failed $LINENO $i
+ cmp in out || test_failed $LINENO $i
+
+ "${LZIP}" -$i -s16 in -c > out || test_failed $LINENO $i
+ "${LZIP}" -$i -s16 in -o o_out || test_failed $LINENO $i # don't add .lz
+ [ ! -e o_out.lz ] || test_failed $LINENO
+ cmp out o_out || test_failed $LINENO $i
+ rm -f o_out || framework_failure
+ printf "g" >> out || framework_failure
+ "${LZIP}" -cd out > copy || test_failed $LINENO $i
+ cmp in copy || test_failed $LINENO $i
+
+ "${LZIP}" -$i -s16 < in > out || test_failed $LINENO $i
+ "${LZIP}" -d < out > copy || test_failed $LINENO $i
+ cmp in copy || test_failed $LINENO $i
+
+ rm -f out.lz || framework_failure
+ printf "to be overwritten" > out || framework_failure
+ "${LZIP}" -f -$i -s16 -o out < in || test_failed $LINENO $i # don't add .lz
+ [ ! -e out.lz ] || test_failed $LINENO
+ "${LZIP}" -df -o copy < out || test_failed $LINENO $i
+ cmp in copy || test_failed $LINENO $i
+done
+rm -f copy out || framework_failure
+
+cat in in in in in in in in > in8 || framework_failure
+"${LZIP}" -1s12 -S100k in8 || test_failed $LINENO
+"${LZIP}" -t in800001.lz in800002.lz || test_failed $LINENO
+"${LZIP}" -cd in800001.lz in800002.lz | cmp in8 - || test_failed $LINENO
+[ ! -e in800003.lz ] || test_failed $LINENO
+rm -f in800001.lz in800002.lz || framework_failure
+"${LZIP}" -1s12 -S100k -o out.lz in8 || test_failed $LINENO
+# ignore -S
+"${LZIP}" -d out.lz00001.lz out.lz00002.lz -S100k -o out || test_failed $LINENO
+cmp in8 out || test_failed $LINENO
+"${LZIP}" -t out.lz00001.lz out.lz00002.lz || test_failed $LINENO
+[ ! -e out.lz00003.lz ] || test_failed $LINENO
+rm -f out out.lz00001.lz out.lz00002.lz || framework_failure
+"${LZIP}" -1ks4Ki -b100000 in8 || test_failed $LINENO
+"${LZIP}" -t in8.lz || test_failed $LINENO
+"${LZIP}" -cd in8.lz -o out | cmp in8 - || test_failed $LINENO # override -o
+[ ! -e out ] || test_failed $LINENO
+"${LZIP}" -0 -S100k -o out < in8.lz || test_failed $LINENO
+"${LZIP}" -t out00001.lz out00002.lz || test_failed $LINENO
+"${LZIP}" -cd out00001.lz out00002.lz | cmp in8.lz - || test_failed $LINENO
+[ ! -e out00003.lz ] || test_failed $LINENO
+rm -f out00001.lz || framework_failure
+"${LZIP}" -1 -S100k -o out < in8.lz || test_failed $LINENO
+"${LZIP}" -t out00001.lz out00002.lz || test_failed $LINENO
+"${LZIP}" -cd out00001.lz out00002.lz | cmp in8.lz - || test_failed $LINENO
+[ ! -e out00003.lz ] || test_failed $LINENO
+rm -f out00001.lz out00002.lz || framework_failure
+"${LZIP}" -0 -F -S100k in8.lz || test_failed $LINENO
+"${LZIP}" -t in8.lz00001.lz in8.lz00002.lz || test_failed $LINENO
+"${LZIP}" -cd in8.lz00001.lz in8.lz00002.lz | cmp in8.lz - || test_failed $LINENO
+[ ! -e in8.lz00003.lz ] || test_failed $LINENO
+rm -f in8.lz00001.lz in8.lz00002.lz || framework_failure
+"${LZIP}" -0kF -b100k in8.lz || test_failed $LINENO
+"${LZIP}" -t in8.lz.lz || test_failed $LINENO
+"${LZIP}" -cd in8.lz.lz | cmp in8.lz - || test_failed $LINENO
+rm -f in8.lz in8.lz.lz || framework_failure
+
+"${BBEXAMPLE}" in || test_failed $LINENO
+"${BBEXAMPLE}" "${in_lz}" || test_failed $LINENO
+"${BBEXAMPLE}" "${fox_lf}" || test_failed $LINENO
+
+"${FFEXAMPLE}" -h > /dev/null || test_failed $LINENO
+"${FFEXAMPLE}" > /dev/null && test_failed $LINENO
+rm -f out || framework_failure
+"${FFEXAMPLE}" -b in out || test_failed $LINENO
+cmp in out || test_failed $LINENO
+"${FFEXAMPLE}" -b in | cmp in - || test_failed $LINENO
+"${FFEXAMPLE}" -b in8 | cmp in8 - || test_failed $LINENO
+"${FFEXAMPLE}" -b "${fox_lf}" | cmp "${fox_lf}" - || test_failed $LINENO
+"${FFEXAMPLE}" -d "${in_lz}" - | cmp in - || test_failed $LINENO
+"${FFEXAMPLE}" -d "${in_em}" - | cmp in - || test_failed $LINENO
+"${FFEXAMPLE}" -c in | "${FFEXAMPLE}" -d | cmp in - || test_failed $LINENO
+"${FFEXAMPLE}" -m in | "${FFEXAMPLE}" -d | cmp in - || test_failed $LINENO
+"${FFEXAMPLE}" -l in | "${FFEXAMPLE}" -d | cmp in - || test_failed $LINENO
+cat "${fox_lf}" "${in_lz}" | "${FFEXAMPLE}" -r | cmp in - || test_failed $LINENO
+cat in8 "${in_lz}" | "${FFEXAMPLE}" -r | cmp in - || test_failed $LINENO
+cat "${in_lz}" "${fox_lf}" "${in_lz}" | "${FFEXAMPLE}" -r - | cmp in2 - ||
+ test_failed $LINENO
+cat "${in_lz}" in8 "${in_lz}" | "${FFEXAMPLE}" -r - - | cmp in2 - ||
+ test_failed $LINENO
+
+"${LZCHECK}" in || test_failed $LINENO
+"${LZCHECK}" "${in_lz}" || test_failed $LINENO
+"${LZCHECK}" "${fox_lf}" || test_failed $LINENO
+rm -f in8 || framework_failure
+
+printf "\ntesting bad input..."
+
+headers='LZIp LZiP LZip LzIP LzIp LziP lZIP lZIp lZiP lzIP'
+body='\001\014\000\203\377\373\377\377\300\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000$\000\000\000\000\000\000\000'
+cat "${in_lz}" > int.lz || framework_failure
+printf "LZIP${body}" >> int.lz || framework_failure
+if "${LZIP}" -tq int.lz ; then
+ for header in ${headers} ; do
+ printf "${header}${body}" > int.lz || framework_failure
+ "${LZIP}" -tq int.lz # first member
+ [ $? = 2 ] || test_failed $LINENO ${header}
+ "${LZIP}" -tq < int.lz
+ [ $? = 2 ] || test_failed $LINENO ${header}
+ "${LZIP}" -cdq int.lz > /dev/null
+ [ $? = 2 ] || test_failed $LINENO ${header}
+ "${LZIP}" -tq --loose-trailing int.lz
+ [ $? = 2 ] || test_failed $LINENO ${header}
+ "${LZIP}" -tq --loose-trailing < int.lz
+ [ $? = 2 ] || test_failed $LINENO ${header}
+ "${LZIP}" -cdq --loose-trailing int.lz > /dev/null
+ [ $? = 2 ] || test_failed $LINENO ${header}
+ cat "${in_lz}" > int.lz || framework_failure
+ printf "${header}${body}" >> int.lz || framework_failure
+ "${LZIP}" -tq int.lz # trailing data
+ [ $? = 2 ] || test_failed $LINENO ${header}
+ "${LZIP}" -tq < int.lz
+ [ $? = 2 ] || test_failed $LINENO ${header}
+ "${LZIP}" -cdq int.lz > /dev/null
+ [ $? = 2 ] || test_failed $LINENO ${header}
+ "${LZIP}" -t --loose-trailing int.lz ||
+ test_failed $LINENO ${header}
+ "${LZIP}" -t --loose-trailing < int.lz ||
+ test_failed $LINENO ${header}
+ "${LZIP}" -cd --loose-trailing int.lz > /dev/null ||
+ test_failed $LINENO ${header}
+ "${LZIP}" -tq --loose-trailing --trailing-error int.lz
+ [ $? = 2 ] || test_failed $LINENO ${header}
+ "${LZIP}" -tq --loose-trailing --trailing-error < int.lz
+ [ $? = 2 ] || test_failed $LINENO ${header}
+ "${LZIP}" -cdq --loose-trailing --trailing-error int.lz > /dev/null
+ [ $? = 2 ] || test_failed $LINENO ${header}
+ done
+else
+ printf "\nwarning: skipping header test: 'printf' does not work on your system."
+fi
+rm -f int.lz || framework_failure
+
+for i in fox_v2.lz fox_s11.lz fox_de20.lz \
+ fox_bcrc.lz fox_crc0.lz fox_das46.lz fox_mes81.lz ; do
+ "${LZIP}" -tq "${testdir}"/$i
+ [ $? = 2 ] || test_failed $LINENO $i
+done
+
+for i in fox_bcrc.lz fox_crc0.lz fox_das46.lz fox_mes81.lz ; do
+ "${LZIP}" -cdq "${testdir}"/$i > out
+ [ $? = 2 ] || test_failed $LINENO $i
+ cmp fox out || test_failed $LINENO $i
+done
+rm -f fox out || framework_failure
+
+cat "${in_lz}" "${in_lz}" > in2.lz || framework_failure
+cat "${in_lz}" "${in_lz}" "${in_lz}" > in3.lz || framework_failure
+if dd if=in3.lz of=trunc.lz bs=14752 count=1 2> /dev/null &&
+ [ -e trunc.lz ] && cmp in2.lz trunc.lz > /dev/null 2>&1 ; then
+ for i in 6 20 14734 14753 14754 14755 14756 14757 14758 ; do
+ dd if=in3.lz of=trunc.lz bs=$i count=1 2> /dev/null
+ "${LZIP}" -tq trunc.lz
+ [ $? = 2 ] || test_failed $LINENO $i
+ "${LZIP}" -tq < trunc.lz
+ [ $? = 2 ] || test_failed $LINENO $i
+ "${LZIP}" -cdq trunc.lz > /dev/null
+ [ $? = 2 ] || test_failed $LINENO $i
+ "${LZIP}" -dq < trunc.lz > /dev/null
+ [ $? = 2 ] || test_failed $LINENO $i
+ done
+else
+ printf "\nwarning: skipping truncation test: 'dd' does not work on your system."
+fi
+rm -f in2.lz in3.lz trunc.lz || framework_failure
+
+cat "${in_lz}" > ingin.lz || framework_failure
+printf "g" >> ingin.lz || framework_failure
+cat "${in_lz}" >> ingin.lz || framework_failure
+"${LZIP}" -atq ingin.lz
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -atq < ingin.lz
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -acdq ingin.lz > /dev/null
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -adq < ingin.lz > /dev/null
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -t ingin.lz || test_failed $LINENO
+"${LZIP}" -t < ingin.lz || test_failed $LINENO
+"${LZIP}" -cd ingin.lz > out || test_failed $LINENO
+cmp in out || test_failed $LINENO
+"${LZIP}" -d < ingin.lz > out || test_failed $LINENO
+cmp in out || test_failed $LINENO
+"${FFEXAMPLE}" -d ingin.lz | cmp in - || test_failed $LINENO
+"${FFEXAMPLE}" -r ingin.lz | cmp in2 - || test_failed $LINENO
+rm -f in2 out ingin.lz || framework_failure
+
+echo
+if [ ${fail} = 0 ] ; then
+ echo "tests completed successfully."
+ cd "${objdir}" && rm -r tmp
+else
+ echo "tests failed."
+fi
+exit ${fail}
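Review note: `check.sh` above relies on one idiom throughout: run a command, compare `$?` with the expected status, record a failure without aborting, and report the accumulated flag at the end. The sketch below isolates that idiom. `expect_status` and `note_failure` are hypothetical names introduced here for illustration; the script itself uses inline `[ $? = N ] || test_failed $LINENO`.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the check.sh idiom; not part of lzlib.
fail=0

# Record a failure and keep going, as test_failed does in check.sh.
note_failure() { fail=1 ; printf ' %s' "$1" ; }

# expect_status WANT LABEL CMD [ARGS...]
# Runs CMD quietly and records a failure if its exit status is not WANT.
expect_status() {
  want=$1 ; label=$2 ; shift 2
  got=0
  "$@" > /dev/null 2>&1 || got=$?
  [ "${got}" = "${want}" ] || note_failure "${label}(${got})"
}

expect_status 0 plain-success true
expect_status 1 plain-failure false
expect_status 2 explicit-status sh -c 'exit 2'
```

Recording failures instead of exiting at the first one lets a single run list every failing check, which is why the script ends with `exit ${fail}` rather than using `set -e`.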
diff --git a/testsuite/fox.lz b/testsuite/fox.lz
new file mode 100644
index 0000000..509da82
--- /dev/null
+++ b/testsuite/fox.lz
Binary files differ
diff --git a/testsuite/fox_bcrc.lz b/testsuite/fox_bcrc.lz
new file mode 100644
index 0000000..8f6a7c4
--- /dev/null
+++ b/testsuite/fox_bcrc.lz
Binary files differ
diff --git a/testsuite/fox_crc0.lz b/testsuite/fox_crc0.lz
new file mode 100644
index 0000000..1abe926
--- /dev/null
+++ b/testsuite/fox_crc0.lz
Binary files differ
diff --git a/testsuite/fox_das46.lz b/testsuite/fox_das46.lz
new file mode 100644
index 0000000..43ed9f9
--- /dev/null
+++ b/testsuite/fox_das46.lz
Binary files differ
diff --git a/testsuite/fox_de20.lz b/testsuite/fox_de20.lz
new file mode 100644
index 0000000..10949d8
--- /dev/null
+++ b/testsuite/fox_de20.lz
Binary files differ
diff --git a/testsuite/fox_lf b/testsuite/fox_lf
new file mode 100644
index 0000000..a0b11b5
--- /dev/null
+++ b/testsuite/fox_lf
@@ -0,0 +1,9 @@
+The
+quick
+brown
+fox
+jumps
+over
+the
+lazy
+dog.
diff --git a/testsuite/fox_mes81.lz b/testsuite/fox_mes81.lz
new file mode 100644
index 0000000..d50ef2e
--- /dev/null
+++ b/testsuite/fox_mes81.lz
Binary files differ
diff --git a/testsuite/fox_s11.lz b/testsuite/fox_s11.lz
new file mode 100644
index 0000000..dca909c
--- /dev/null
+++ b/testsuite/fox_s11.lz
Binary files differ
diff --git a/testsuite/fox_v2.lz b/testsuite/fox_v2.lz
new file mode 100644
index 0000000..8620981
--- /dev/null
+++ b/testsuite/fox_v2.lz
Binary files differ
diff --git a/testsuite/test.txt b/testsuite/test.txt
new file mode 100644
index 0000000..9196a3a
--- /dev/null
+++ b/testsuite/test.txt
@@ -0,0 +1,676 @@
+ GNU GENERAL PUBLIC LICENSE
+ Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ Preamble
+
+ The licenses for most software are designed to take away your
+freedom to share and change it. By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users. This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it. (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.) You can apply it to
+your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+ To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have. You must make sure that they, too, receive or can get the
+source code. And you must show them these terms so they know their
+rights.
+
+ We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+ Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software. If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+ Finally, any free program is threatened constantly by software
+patents. We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary. To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+ GNU GENERAL PUBLIC LICENSE
+ TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+ 0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License. The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language. (Hereinafter, translation is included without limitation in
+the term "modification".) Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope. The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+ 1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+ 2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+ a) You must cause the modified files to carry prominent notices
+ stating that you changed the files and the date of any change.
+
+ b) You must cause any work that you distribute or publish, that in
+ whole or in part contains or is derived from the Program or any
+ part thereof, to be licensed as a whole at no charge to all third
+ parties under the terms of this License.
+
+ c) If the modified program normally reads commands interactively
+ when run, you must cause it, when started running for such
+ interactive use in the most ordinary way, to print or display an
+ announcement including an appropriate copyright notice and a
+ notice that there is no warranty (or else, saying that you provide
+ a warranty) and that users may redistribute the program under
+ these conditions, and telling the user how to view a copy of this
+ License. (Exception: if the Program itself is interactive but
+ does not normally print such an announcement, your work based on
+ the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole. If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works. But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+ 3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+ a) Accompany it with the complete corresponding machine-readable
+ source code, which must be distributed under the terms of Sections
+ 1 and 2 above on a medium customarily used for software interchange; or,
+
+ b) Accompany it with a written offer, valid for at least three
+ years, to give any third party, for a charge no more than your
+ cost of physically performing source distribution, a complete
+ machine-readable copy of the corresponding source code, to be
+ distributed under the terms of Sections 1 and 2 above on a medium
+ customarily used for software interchange; or,
+
+ c) Accompany it with the information you received as to the offer
+ to distribute corresponding source code. (This alternative is
+ allowed only for noncommercial distribution and only if you
+ received the program in object code or executable form with such
+ an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it. For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable. However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+ 4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License. Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+ 5. You are not required to accept this License, since you have not
+signed it. However, nothing else grants you permission to modify or
+distribute the Program or its derivative works. These actions are
+prohibited by law if you do not accept this License. Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+ 6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions. You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+ 7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all. For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices. Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+ 8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded. In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+ 9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time. Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number. If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation. If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+ 10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission. For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this. Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+ NO WARRANTY
+
+ 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+ 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+ END OF TERMS AND CONDITIONS
+
+ How to Apply These Terms to Your New Programs
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+ <one line to give the program's name and a brief idea of what it does.>
+ Copyright (C) <year> <name of author>
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+ Gnomovision version 69, Copyright (C) <year> <name of author>
+ Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License. Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary. Here is a sample; alter the names:
+
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+ `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+ <signature of Ty Coon>, 1 April 1989
+ Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs. If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library. If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
+ GNU GENERAL PUBLIC LICENSE
+ Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ Preamble
+
+ The licenses for most software are designed to take away your
+freedom to share and change it. By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users. This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it. (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.) You can apply it to
+your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+ To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have. You must make sure that they, too, receive or can get the
+source code. And you must show them these terms so they know their
+rights.
+
+ We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+ Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software. If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+ Finally, any free program is threatened constantly by software
+patents. We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary. To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+ GNU GENERAL PUBLIC LICENSE
+ TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+ 0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License. The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language. (Hereinafter, translation is included without limitation in
+the term "modification".) Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope. The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+ 1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+ 2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+ a) You must cause the modified files to carry prominent notices
+ stating that you changed the files and the date of any change.
+
+ b) You must cause any work that you distribute or publish, that in
+ whole or in part contains or is derived from the Program or any
+ part thereof, to be licensed as a whole at no charge to all third
+ parties under the terms of this License.
+
+ c) If the modified program normally reads commands interactively
+ when run, you must cause it, when started running for such
+ interactive use in the most ordinary way, to print or display an
+ announcement including an appropriate copyright notice and a
+ notice that there is no warranty (or else, saying that you provide
+ a warranty) and that users may redistribute the program under
+ these conditions, and telling the user how to view a copy of this
+ License. (Exception: if the Program itself is interactive but
+ does not normally print such an announcement, your work based on
+ the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole. If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works. But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+ 3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+ a) Accompany it with the complete corresponding machine-readable
+ source code, which must be distributed under the terms of Sections
+ 1 and 2 above on a medium customarily used for software interchange; or,
+
+ b) Accompany it with a written offer, valid for at least three
+ years, to give any third party, for a charge no more than your
+ cost of physically performing source distribution, a complete
+ machine-readable copy of the corresponding source code, to be
+ distributed under the terms of Sections 1 and 2 above on a medium
+ customarily used for software interchange; or,
+
+ c) Accompany it with the information you received as to the offer
+ to distribute corresponding source code. (This alternative is
+ allowed only for noncommercial distribution and only if you
+ received the program in object code or executable form with such
+ an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it. For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable. However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+ 4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License. Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+ 5. You are not required to accept this License, since you have not
+signed it. However, nothing else grants you permission to modify or
+distribute the Program or its derivative works. These actions are
+prohibited by law if you do not accept this License. Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+ 6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions. You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+ 7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all. For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices. Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+ 8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded. In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+ 9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time. Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number. If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation. If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+ 10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission. For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this. Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+ NO WARRANTY
+
+ 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+ 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+ END OF TERMS AND CONDITIONS
+
+ How to Apply These Terms to Your New Programs
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+ <one line to give the program's name and a brief idea of what it does.>
+ Copyright (C) <year> <name of author>
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+ Gnomovision version 69, Copyright (C) <year> <name of author>
+ Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License. Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary. Here is a sample; alter the names:
+
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+ `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+ <signature of Ty Coon>, 1 April 1989
+ Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs. If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library. If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
diff --git a/testsuite/test.txt.lz b/testsuite/test.txt.lz
new file mode 100644
index 0000000..22cea6e
--- /dev/null
+++ b/testsuite/test.txt.lz
Binary files differ
diff --git a/testsuite/test_em.txt.lz b/testsuite/test_em.txt.lz
new file mode 100644
index 0000000..7e96250
--- /dev/null
+++ b/testsuite/test_em.txt.lz
Binary files differ
diff --git a/testsuite/test_sync.lz b/testsuite/test_sync.lz
new file mode 100644
index 0000000..db680c3
--- /dev/null
+++ b/testsuite/test_sync.lz
Binary files differ