40 files changed, 10942 insertions, 0 deletions
diff --git a/AUTHORS b/AUTHORS
new file mode 100644
index 0000000..04be19e
--- /dev/null
+++ b/AUTHORS
@@ -0,0 +1,7 @@
+Clzip was written by Antonio Diaz Diaz.
+
+The ideas embodied in clzip are due to (at least) the following people:
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
+definition of Markov chains), G.N.N. Martin (for the definition of range
+encoding), Igor Pavlov (for putting all the above together in LZMA), and
+Julian Seward (for bzip2's CLI).
diff --git a/COPYING b/COPYING
new file mode 100644
index 0000000..4ad17ae
--- /dev/null
+++ b/COPYING
@@ -0,0 +1,338 @@
+                    GNU GENERAL PUBLIC LICENSE
+                       Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+                            Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+                    GNU GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term "modification".)  Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+  2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+    a) You must cause the modified files to carry prominent notices
+    stating that you changed the files and the date of any change.
+
+    b) You must cause any work that you distribute or publish, that in
+    whole or in part contains or is derived from the Program or any
+    part thereof, to be licensed as a whole at no charge to all third
+    parties under the terms of this License.
+
+    c) If the modified program normally reads commands interactively
+    when run, you must cause it, when started running for such
+    interactive use in the most ordinary way, to print or display an
+    announcement including an appropriate copyright notice and a
+    notice that there is no warranty (or else, saying that you provide
+    a warranty) and that users may redistribute the program under
+    these conditions, and telling the user how to view a copy of this
+    License.  (Exception: if the Program itself is interactive but
+    does not normally print such an announcement, your work based on
+    the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+  3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+    a) Accompany it with the complete corresponding machine-readable
+    source code, which must be distributed under the terms of Sections
+    1 and 2 above on a medium customarily used for software interchange; or,
+
+    b) Accompany it with a written offer, valid for at least three
+    years, to give any third party, for a charge no more than your
+    cost of physically performing source distribution, a complete
+    machine-readable copy of the corresponding source code, to be
+    distributed under the terms of Sections 1 and 2 above on a medium
+    customarily used for software interchange; or,
+
+    c) Accompany it with the information you received as to the offer
+    to distribute corresponding source code.  (This alternative is
+    allowed only for noncommercial distribution and only if you
+    received the program in object code or executable form with such
+    an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+  4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+  5. You are not required to accept this License, since you have not
+signed it.  However, nothing else grants you permission to modify or
+distribute the Program or its derivative works.  These actions are
+prohibited by law if you do not accept this License.  Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+  6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+  7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices.  Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+  8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+  9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number.  If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation.  If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+  10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission.  For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this.  Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+                            NO WARRANTY
+
+  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+                     END OF TERMS AND CONDITIONS
+
+            How to Apply These Terms to Your New Programs
+
+  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+  To do so, attach the following notices to the program.  It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+    <one line to give the program's name and a brief idea of what it does.>
+    Copyright (C) <year>  <name of author>
+
+    This program is free software: you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation, either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+    Gnomovision version 69, Copyright (C) <year>  <name of author>
+    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+    This is free software, and you are welcome to redistribute it
+    under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License.  Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary.  Here is a sample; alter the names:
+
+  Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+  `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+  <signature of Ty Coon>, 1 April 1989
+  Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs.  If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library.  If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
diff --git a/ChangeLog b/ChangeLog
new file mode 100644
index 0000000..63e6a0d
--- /dev/null
+++ b/ChangeLog
@@ -0,0 +1,179 @@
+2024-01-22  Antonio Diaz Diaz  <antonio@gnu.org>
+
+	* Version 1.14 released.
+	* New options '--empty-error' and '--marking-error'.
+	* main.c: Reformat file diagnostics as 'PROGRAM: FILE: MESSAGE'.
+	  (show_option_error): New function showing argument and option name.
+	  (main): Make -o preserve date/mode/owner if 1 input file.
+	  (open_outstream): Create missing intermediate directories.
+	* lzip.h: Rename verify_* to check_*.
+	* configure, Makefile.in: New variable 'MAKEINFO'.
+	* INSTALL: Document use of CFLAGS+='--std=c99 -D_XOPEN_SOURCE=500'.
+	* testsuite: New test files fox6.lz, fox6_mark.lz.
+
+2022-01-24  Antonio Diaz Diaz  <antonio@gnu.org>
+
+	* Version 1.13 released.
+	* Decompression time has been reduced by 5-12% depending on the file.
+	* main.c (getnum): Show option name and valid range if error.
+	* Improve several descriptions in manual, '--help', and man page.
+	* clzip.texi: Change GNU Texinfo category to 'Compression'.
+	  (Reported by Alfred M. Szmidt).
+
+2021-01-04  Antonio Diaz Diaz  <antonio@gnu.org>
+
+	* Version 1.12 released.
+	* main.c (main): Report an error if a file name is empty.
+	  Make '-o' behave like '-c', but writing to file instead of stdout.
+	  Make '-c' and '-o' check whether the output is a terminal only once.
+	  Do not open output if input is a terminal.
+	* Replace 'decompressed', 'compressed' with 'out', 'in' in output.
+	* lzip_index.c: Improve messages for corruption in last header.
+	* main.c: Set a valid invocation_name even if argc == 0.
+	* Document extraction from tar.lz in manual, '--help', and man page.
+	* clzip.texi (Introduction): Mention plzip and tarlz as alternatives.
+	* clzip.texi: Several fixes and improvements.
+	* testsuite: Add 9 new test files.
+
+2019-01-03  Antonio Diaz Diaz  <antonio@gnu.org>
+
+	* Version 1.11 released.
+	* Rename File_* to Lzip_*.
+	* lzip.h (Lzip_trailer): New function 'Lt_verify_consistency'.
+	* lzip_index.c: Detect some kinds of corrupt trailers.
+	* main.c (main): Check return value of close( infd ).
+	* main.c: Compile on DOS with DJGPP.
+	* clzip.texi: Improve descriptions of '-0..-9', '-m', and '-s'.
+	* configure: Accept appending to CFLAGS; 'CFLAGS+=OPTIONS'.
+	* INSTALL: Document use of CFLAGS+='-D __USE_MINGW_ANSI_STDIO'.
+
+2018-02-06  Antonio Diaz Diaz  <antonio@gnu.org>
+
+	* Version 1.10 released.
+	* New option '--loose-trailing'.
+	* Improve corrupt header detection to HD=3.
+	* main.c: Show corrupt or truncated header in multimember file.
+	* main.c (main): Option '-S, --volume-size' now keeps input files.
+	* encoder_base.*: Adjust dictionary size for each member.
+	* Replace 'bits/byte' with inverse compression ratio in output.
+	* Show progress of decompression at verbosity level 2 (-vv).
+	* Show progress of (de)compression only if stderr is a terminal.
+	* main.c: Show final diagnostic when testing multiple files.
+	* main.c: Do not add a second .lz extension to the arg of -o.
+	* decoder.c (LZd_verify_trailer): Show stored sizes also in hex.
+	  Show dictionary size at verbosity level 4 (-vvvv).
+	* clzip.texi: New chapter 'Meaning of clzip's output'.
+
+2017-04-13  Antonio Diaz Diaz  <antonio@gnu.org>
+
+	* Version 1.9 released.
+	* The option '-l, --list' has been ported from lziprecover.
+	* Don't allow mixing different operations (-d, -l or -t).
+	* Compression time of option '-0' has been reduced by 6%.
+	* Compression time of options -1 to -9 has been reduced by 1%.
+	* Decompression time has been reduced by 7%.
+	* main.c: Continue testing if any input file is a terminal.
+	* main.c: Show trailing data in both hexadecimal and ASCII.
+	* lzip_index.c: Improve detection of bad dict and trailing data.
+	* lzip.h: Unify messages for bad magic, trailing data, etc.
+	* clzip.texi: Add missing chapters from lzip.texi.
+
+2016-05-13  Antonio Diaz Diaz  <antonio@gnu.org>
+
+	* Version 1.8 released.
+	* New option '-a, --trailing-error'.
+	* main.c (decompress): Print up to 6 bytes of trailing data when
+	  '-vvvv' is specified.
+	* decoder.c (LZd_verify_trailer): Remove test of final code.
+	* main.c (main): Delete '--output' file if infd is a terminal.
+	* main.c (main): Don't use stdin more than once.
+	* clzip.texi: New chapter 'Trailing data'.
+	* configure: Avoid warning on some shells when testing for gcc.
+	* Makefile.in: Detect the existence of install-info.
+	* check.sh: A POSIX shell is required to run the tests.
+	* check.sh: Don't check error messages.
+
+2015-07-07  Antonio Diaz Diaz  <antonio@gnu.org>
+
+	* Version 1.7 released.
+	* Port fast encoder and option '-0' from lzip.
+	* Makefile.in: New targets 'install*-compress'.
+
+2014-08-28  Antonio Diaz Diaz  <antonio@gnu.org>
+
+	* Version 1.6 released.
+	* Compression ratio of option '-9' has been slightly increased.
+	* main.c (close_and_set_permissions): Behave like 'cp -p'.
+	* clzip.texinfo: Rename to clzip.texi.
+	* Change license to GPL version 2 or later.
+
+2013-09-17  Antonio Diaz Diaz  <antonio@gnu.org>
+
+	* Version 1.5 released.
+	* Show progress of compression at verbosity level 2 (-vv).
+	* main.c (show_header): Don't show header version.
+	* Ignore option '-n, --threads' for compatibility with plzip.
+	* configure: Options now accept a separate argument.
+
+2013-02-18  Antonio Diaz Diaz  <ant_diaz@teleline.es>
+
+	* Version 1.4 released.
+	* Multi-step trials have been implemented.
+	* Compression ratio has been slightly increased.
+	* Compression time has been reduced by 10%.
+	* Decompression time has been reduced by 8%.
+	* Makefile.in: New targets 'install-as-lzip' and 'install-bin'.
+	* main.c: Use 'setmode' instead of '_setmode' on Windows and OS/2.
+	* main.c: Define 'strtoull' to 'strtoul' on Windows.
+
+2012-02-25  Antonio Diaz Diaz  <ant_diaz@teleline.es>
+
+	* Version 1.3 released.
+	* main.c (close_and_set_permissions): Inability to change output
+	  file attributes has been downgraded from error to warning.
+	* encoder.c (Mf_init): Return false if out of memory instead of
+	  calling cleanup_and_fail.
+	* Small change in '--help' output and man page.
+	* Change quote characters in messages as advised by GNU Standards.
+	* configure: Rename 'datadir' to 'datarootdir'.
+
+2011-05-18  Antonio Diaz Diaz  <ant_diaz@teleline.es>
+
+	* Version 1.2 released.
+	* New option '-F, --recompress'.
+	* main.c (decompress): Print only one status line for each
+	  multimember file when only one '-v' is specified.
+	* encoder.h (Lee_update_prices): Update high length symbol prices
+	  independently of the value of 'pos_state'. This gives better
+	  compression for large values of '--match-length' without being
+	  slower.
+	* encoder.h, encoder.c: Optimize pair price calculations, reducing
+	  compression time for large values of '--match-length' by up to 6%.
+
+2011-01-11  Antonio Diaz Diaz  <ant_diaz@teleline.es>
+
+	* Version 1.1 released.
+	* Code has been converted to 'C89 + long long' from C99.
+	* main.c: Fix warning about fchown return value being ignored.
+	* decoder.c: '-tvvvv' now shows compression ratio.
+	* main.c: Match length limit set by options -1 to -8 has been
+	  reduced to extend range of use towards gzip. Lower numbers now
+	  compress less but faster. (-1 now takes 43% less time for only 20%
+	  larger compressed size).
+	  Exit with status 1 if any output file exists and is skipped.
+	* Compression ratio of option '-9' has been slightly increased.
+	* main.c (open_instream): Don't show the message
+	  " and '--stdout' was not specified" for directories, etc.
+	* New examples have been added to the manual.
+
+2010-04-05  Antonio Diaz Diaz  <ant_diaz@teleline.es>
+
+	* Version 1.0 released.
+	* Initial release.
+	* Translated to C from the C++ source of lzip 1.10.
+
+
+Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+This file is a collection of facts, and thus it is not copyrightable, but just
+in case, you have unlimited permission to copy, distribute, and modify it.
diff --git a/INSTALL b/INSTALL
new file mode 100644
index 0000000..b405598
--- /dev/null
+++ b/INSTALL
@@ -0,0 +1,81 @@
+Requirements
+------------
+You will need a C99 compiler. (gcc 3.3.6 or newer is recommended).
+I use gcc 6.1.0 and 3.3.6, but the code should compile with any standards
+compliant compiler.
+Gcc is available at http://gcc.gnu.org.
+
+The operating system must allow signal handlers read access to objects with
+static storage duration so that the cleanup handler for Control-C can delete
+the partial output file.
+
+
+Procedure
+---------
+1. Unpack the archive if you have not done so already:
+
+	tar -xf clzip[version].tar.lz
+or
+	lzip -cd clzip[version].tar.lz | tar -xf -
+
+This creates the directory ./clzip[version] containing the source code
+extracted from the archive.
+
+2. Change to clzip directory and run configure.
+   (Try 'configure --help' for usage instructions).
+
+	cd clzip[version]
+	./configure
+
+   If you choose a C standard, enable the POSIX features explicitly:
+
+	./configure CFLAGS+='--std=c99 -D_XOPEN_SOURCE=500'
+
+   If you are compiling on MinGW, use:
+
+	./configure CFLAGS+='-D __USE_MINGW_ANSI_STDIO'
+
+3. Run make.
+
+	make
+
+4. Optionally, type 'make check' to run the tests that come with clzip.
+
+5. Type 'make install' to install the program and any data files and
+   documentation. You need root privileges to install into a prefix owned
+   by root.
+
+   Or type 'make install-compress', which additionally compresses the
+   info manual and the man page after installation.
+   (Installing compressed docs may become the default in the future).
+
+   You can install only the program, the info manual, or the man page by
+   typing 'make install-bin', 'make install-info', or 'make install-man'
+   respectively.
+
+   Instead of 'make install', you can type 'make install-as-lzip' to
+   install the program and any data files and documentation, and link
+   the program to the name 'lzip'.
+
+
+Another way
+-----------
+You can also compile clzip into a separate directory.
+To do this, you must use a version of 'make' that supports the variable
+'VPATH', such as GNU 'make'. 'cd' to the directory where you want the
+object files and executables to go and run the 'configure' script.
+'configure' automatically checks for the source code in '.', in '..', and
+in the directory that 'configure' is in.
+
+'configure' recognizes the option '--srcdir=DIR' to control where to look
+for the source code. Usually 'configure' can determine that directory
+automatically.
+
+After running 'configure', you can run 'make' and 'make install' as
+explained above.
+
+
+Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+This file is free documentation: you have unlimited permission to copy,
+distribute, and modify it.
diff --git a/Makefile.in b/Makefile.in
new file mode 100644
index 0000000..55e2bcb
--- /dev/null
+++ b/Makefile.in
@@ -0,0 +1,144 @@
+
+DISTNAME = $(pkgname)-$(pkgversion)
+INSTALL = install
+INSTALL_PROGRAM = $(INSTALL) -m 755
+INSTALL_DATA = $(INSTALL) -m 644
+INSTALL_DIR = $(INSTALL) -d -m 755
+SHELL = /bin/sh
+CAN_RUN_INSTALLINFO = $(SHELL) -c "install-info --version" > /dev/null 2>&1
+
+objs = carg_parser.o lzip_index.o list.o encoder_base.o encoder.o \
+       fast_encoder.o decoder.o main.o
+
+
+.PHONY : all install install-bin install-info install-man \
+         install-strip install-compress install-strip-compress \
+         install-bin-strip install-info-compress install-man-compress \
+         install-as-lzip \
+         uninstall uninstall-bin uninstall-info uninstall-man \
+         doc info man check dist clean distclean
+
+all : $(progname)
+
+$(progname) : $(objs)
+	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(objs)
+
+main.o : main.c
+	$(CC) $(CPPFLAGS) $(CFLAGS) -DPROGVERSION=\"$(pkgversion)\" -c -o $@ $<
+
+%.o : %.c
+	$(CC) $(CPPFLAGS) $(CFLAGS) -c -o $@ $<
+
+# prevent 'make' from trying to remake source files
+$(VPATH)/configure $(VPATH)/Makefile.in $(VPATH)/doc/$(pkgname).texi : ;
+%.h %.c : ;
+
+$(objs)        : Makefile
+carg_parser.o  : carg_parser.h
+decoder.o      : lzip.h decoder.h
+encoder_base.o : lzip.h encoder_base.h
+encoder.o      : lzip.h encoder_base.h encoder.h
+fast_encoder.o : lzip.h encoder_base.h fast_encoder.h
+list.o         : lzip.h lzip_index.h
+lzip_index.o   : lzip.h lzip_index.h
+main.o         : carg_parser.h lzip.h decoder.h encoder_base.h encoder.h fast_encoder.h
+
+doc : info man
+
+info : $(VPATH)/doc/$(pkgname).info
+
+$(VPATH)/doc/$(pkgname).info : $(VPATH)/doc/$(pkgname).texi
+	cd $(VPATH)/doc && $(MAKEINFO) $(pkgname).texi
+
+man : $(VPATH)/doc/$(progname).1
+
+$(VPATH)/doc/$(progname).1 : $(progname)
+	help2man -n 'reduces the size of files' -o $@ ./$(progname)
+
+Makefile : $(VPATH)/configure $(VPATH)/Makefile.in
+	./config.status
+
+check : all
+	@$(VPATH)/testsuite/check.sh $(VPATH)/testsuite $(pkgversion)
+
+install : install-bin install-info install-man
+install-strip : install-bin-strip install-info install-man
+install-compress : install-bin install-info-compress install-man-compress
+install-strip-compress : install-bin-strip install-info-compress install-man-compress
+
+install-bin : all
+	if [ ! -d "$(DESTDIR)$(bindir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(bindir)" ; fi
+	$(INSTALL_PROGRAM) ./$(progname) "$(DESTDIR)$(bindir)/$(progname)"
+
+install-bin-strip : all
+	$(MAKE) INSTALL_PROGRAM='$(INSTALL_PROGRAM) -s' install-bin
+
+install-info :
+	if [ ! -d "$(DESTDIR)$(infodir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(infodir)" ; fi
+	-rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"*
+	$(INSTALL_DATA) $(VPATH)/doc/$(pkgname).info "$(DESTDIR)$(infodir)/$(pkgname).info"
+	-if $(CAN_RUN_INSTALLINFO) ; then \
+	  install-info --info-dir="$(DESTDIR)$(infodir)" "$(DESTDIR)$(infodir)/$(pkgname).info" ; \
+	fi
+
+install-info-compress : install-info
+	lzip -v -9 "$(DESTDIR)$(infodir)/$(pkgname).info"
+
+install-man :
+	if [ ! -d "$(DESTDIR)$(mandir)/man1" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1" ; fi
+	-rm -f "$(DESTDIR)$(mandir)/man1/$(progname).1"*
+	$(INSTALL_DATA) $(VPATH)/doc/$(progname).1 "$(DESTDIR)$(mandir)/man1/$(progname).1"
+
+install-man-compress : install-man
+	lzip -v -9 "$(DESTDIR)$(mandir)/man1/$(progname).1"
+
+install-as-lzip : install
+	-rm -f "$(DESTDIR)$(bindir)/lzip"
+	cd "$(DESTDIR)$(bindir)" && ln -s $(progname) lzip
+
+uninstall : uninstall-man uninstall-info uninstall-bin
+
+uninstall-bin :
+	-rm -f "$(DESTDIR)$(bindir)/$(progname)"
+
+uninstall-info :
+	-if $(CAN_RUN_INSTALLINFO) ; then \
+	  install-info --info-dir="$(DESTDIR)$(infodir)" --remove "$(DESTDIR)$(infodir)/$(pkgname).info" ; \
+	fi
+	-rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"*
+
+uninstall-man :
+	-rm -f "$(DESTDIR)$(mandir)/man1/$(progname).1"*
+
+dist : doc
+	ln -sf $(VPATH) $(DISTNAME)
+	tar -Hustar --owner=root --group=root -cvf $(DISTNAME).tar \
+	  $(DISTNAME)/AUTHORS \
+	  $(DISTNAME)/COPYING \
+	  $(DISTNAME)/ChangeLog \
+	  $(DISTNAME)/INSTALL \
+	  $(DISTNAME)/Makefile.in \
+	  $(DISTNAME)/NEWS \
+	  $(DISTNAME)/README \
+	  $(DISTNAME)/configure \
+	  $(DISTNAME)/doc/$(progname).1 \
+	  $(DISTNAME)/doc/$(pkgname).info \
+	  $(DISTNAME)/doc/$(pkgname).texi \
+	  $(DISTNAME)/*.h \
+	  $(DISTNAME)/*.c \
+	  $(DISTNAME)/testsuite/check.sh \
+	  $(DISTNAME)/testsuite/test.txt \
+	  $(DISTNAME)/testsuite/fox.lz \
+	  $(DISTNAME)/testsuite/fox_*.lz \
+	  $(DISTNAME)/testsuite/fox6.lz \
+	  $(DISTNAME)/testsuite/fox6_mark.lz \
+	  $(DISTNAME)/testsuite/test.txt.lz \
+	  $(DISTNAME)/testsuite/test_em.txt.lz
+	rm -f $(DISTNAME)
+	lzip -v -9 $(DISTNAME).tar
+
+clean :
+	-rm -f $(progname) $(objs)
+
+distclean : clean
+	-rm -f Makefile config.status *.tar *.tar.lz
diff --git a/NEWS b/NEWS
new file mode 100644
index 0000000..83fde4d
--- /dev/null
+++ b/NEWS
@@ -0,0 +1,24 @@
+Changes in version 1.14:
+
+The option '--empty-error', which forces exit status 2 if any empty member
+is found, has been added.
+
+The option '--marking-error', which forces exit status 2 if the first LZMA
+byte is non-zero in any member, has been added.
+
+File diagnostics have been reformatted as 'PROGRAM: FILE: MESSAGE'.
+
+Diagnostics caused by invalid arguments to command-line options now show the
+argument and the name of the option.
+
+The option '-o, --output' now preserves dates, permissions, and ownership of
+the file when (de)compressing exactly one file.
+
+The option '-o, --output' now creates missing intermediate directories when
+writing to a file.
+
+The variable MAKEINFO has been added to configure and Makefile.in.
+
+It has been documented in INSTALL that when choosing a C standard, the POSIX
+features need to be enabled explicitly:
+  ./configure CFLAGS+='--std=c99 -D_XOPEN_SOURCE=500'
diff --git a/README b/README
new file mode 100644
index 0000000..5905364
--- /dev/null
+++ b/README
@@ -0,0 +1,139 @@
+Description
+
+Clzip is a C language version of lzip, compatible with lzip 1.4 or newer. As
+clzip is written in C, it may be easier to integrate in applications like
+package managers, embedded devices, or systems lacking a C++ compiler.
+
+Lzip is a lossless data compressor with a user interface similar to the one
+of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
+chain-Algorithm' (LZMA) stream format to maximize interoperability. The
+maximum dictionary size is 512 MiB so that any lzip file can be decompressed
+on 32-bit machines. Lzip provides accurate and robust 3-factor integrity
+checking. Lzip can compress about as fast as gzip (lzip -0) or compress most
+files more than bzip2 (lzip -9). Decompression speed is intermediate between
+gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
+perspective. Lzip has been designed, written, and tested with great care to
+replace gzip and bzip2 as the standard general-purpose compressed format for
+Unix-like systems.
+
+For compressing/decompressing large files on multiprocessor machines plzip
+can be much faster than lzip at the cost of a slightly reduced compression
+ratio.
+
+For creation and manipulation of compressed tar archives tarlz can be more
+efficient than using tar and plzip because tarlz is able to keep the
+alignment between tar members and lzip members.
+
+The lzip file format is designed for data sharing and long-term archiving,
+taking into account both data integrity and decoder availability:
+
+   * The lzip format provides very safe integrity checking and some data
+     recovery means. The program lziprecover can repair bit flip errors
+     (one of the most common forms of data corruption) in lzip files, and
+     provides data recovery capabilities, including error-checked merging
+     of damaged copies of a file.
+
+   * The lzip format is as simple as possible (but not simpler). The lzip
+     manual provides the source code of a simple decompressor along with a
+     detailed explanation of how it works, so that with only the help of the
+     lzip manual it would be possible for a digital archaeologist to extract
+     the data from a lzip file long after quantum computers eventually
+     render LZMA obsolete.
+
+   * Additionally the lzip reference implementation is copylefted, which
+     guarantees that it will remain free forever.
+
+A nice feature of the lzip format is that a corrupt byte is easier to repair
+the nearer it is to the beginning of the file. Therefore, with the help of
+lziprecover, losing an entire archive just because of a corrupt byte near
+the beginning is a thing of the past.
+
+Clzip uses the same well-defined exit status values used by bzip2, which
+makes it safer than compressors returning ambiguous warning values (like
+gzip) when it is used as a back end for other programs like tar or zutils.
+
+Clzip automatically uses for each file the largest dictionary size that
+exceeds neither the file size nor the limit given. Keep in mind that the
+decompression memory requirement is affected at compression time by the
+choice of dictionary size limit.
+
+The amount of memory required for compression is about 1 or 2 times the
+dictionary size limit (1 if input file size is less than dictionary size
+limit, else 2) plus 9 times the dictionary size actually used. The option '-0'
+is special and requires at most about 1.5 MiB. The amount of memory
+required for decompression is about 46 kB larger than the dictionary size
+actually used.
+
+When compressing, clzip replaces every file given in the command line
+with a compressed version of itself, with the name "original_name.lz".
+When decompressing, clzip attempts to guess the name for the decompressed
+file from that of the compressed file as follows:
+
+filename.lz    becomes   filename
+filename.tlz   becomes   filename.tar
+anyothername   becomes   anyothername.out
+
+(De)compressing a file is much like copying or moving it. Therefore clzip
+preserves the access and modification dates, permissions, and, if you have
+appropriate privileges, ownership of the file just as 'cp -p' does. (If the
+user ID or the group ID can't be duplicated, the file permission bits
+S_ISUID and S_ISGID are cleared).
+
+Clzip is able to read from some types of non-regular files if either the
+option '-c' or the option '-o' is specified.
+
+If no file names are specified, clzip compresses (or decompresses) from
+standard input to standard output. Clzip refuses to read compressed data
+from a terminal or write compressed data to a terminal, as this would be
+entirely incomprehensible and might leave the terminal in an abnormal state.
+
+Clzip correctly decompresses a file which is the concatenation of two or
+more compressed files. The result is the concatenation of the corresponding
+decompressed files. Integrity testing of concatenated compressed files is
+also supported.
+
+Clzip can produce multimember files, and lziprecover can safely recover the
+undamaged members in case of file damage. Clzip can also split the compressed
+output in volumes of a given size, even when reading from standard input.
+This allows the direct creation of multivolume compressed tar archives.
+
+Clzip is able to compress and decompress streams of unlimited size by
+automatically creating multimember output. The members so created are large,
+about 2 PiB each.
+
+In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a
+concrete algorithm; it is more like "any algorithm using the LZMA coding
+scheme". For example, the option '-0' of lzip uses the scheme in almost the
+simplest way possible: issuing the longest match it can find, or a literal
+byte if it can't find a match. Conversely, a much more elaborate way of
+finding coding sequences of minimum size than the one currently used by lzip
+could be developed, and the resulting sequence could also be coded using the
+LZMA coding scheme.
+
+Clzip currently implements two variants of the LZMA algorithm: fast
+(used by option '-0') and normal (used by all other compression levels).
+
+The high compression of LZMA comes from combining two basic, well-proven
+compression ideas: sliding dictionaries (LZ77) and Markov models (the thing
+used by every compression algorithm that uses a range encoder or similar
+order-0 entropy coder as its last stage) with segregation of contexts
+according to what the bits are used for.
+
+The ideas embodied in clzip are due to (at least) the following people:
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
+definition of Markov chains), G.N.N. Martin (for the definition of range
+encoding), Igor Pavlov (for putting all the above together in LZMA), and
+Julian Seward (for bzip2's CLI).
+
+LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never have
+been compressed. Decompressed is used to refer to data which have undergone
+the process of decompression.
+
+
+Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+This file is free documentation: you have unlimited permission to copy,
+distribute, and modify it.
+
+The file Makefile.in is a data file used by configure to produce the Makefile.
+It has the same copyright owner and permissions that configure itself.
diff --git a/carg_parser.c b/carg_parser.c
new file mode 100644
index 0000000..edb4eb9
--- /dev/null
+++ b/carg_parser.c
@@ -0,0 +1,319 @@
+/* Arg_parser - POSIX/GNU command-line argument parser. (C version)
+   Copyright (C) 2006-2024 Antonio Diaz Diaz.
+
+   This library is free software. Redistribution and use in source and
+   binary forms, with or without modification, are permitted provided
+   that the following conditions are met:
+
+   1. Redistributions of source code must retain the above copyright
+   notice, this list of conditions, and the following disclaimer.
+
+   2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions, and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+
+   This library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+#include <stdlib.h>
+#include <string.h>
+
+#include "carg_parser.h"
+
+
+/* ensure at least a minimum size for buffer 'buf' */
+static void * ap_resize_buffer( void * buf, const int min_size )
+  {
+  if( buf ) buf = realloc( buf, min_size );
+  else buf = malloc( min_size );
+  return buf;
+  }
+
+
+static char push_back_record( struct Arg_parser * const ap, const int code,
+                              const char * const long_name,
+                              const char * const argument )
+  {
+  struct ap_Record * p;
+  void * tmp = ap_resize_buffer( ap->data,
+                 ( ap->data_size + 1 ) * sizeof (struct ap_Record) );
+  if( !tmp ) return 0;
+  ap->data = (struct ap_Record *)tmp;
+  p = &(ap->data[ap->data_size]);
+  p->code = code;
+  if( long_name )
+    {
+    const int len = strlen( long_name );
+    p->parsed_name = (char *)malloc( len + 2 + 1 );
+    if( !p->parsed_name ) return 0;
+    p->parsed_name[0] = p->parsed_name[1] = '-';
+    strncpy( p->parsed_name + 2, long_name, len + 1 );
+    }
+  else if( code > 0 && code < 256 )
+    {
+    p->parsed_name = (char *)malloc( 2 + 1 );
+    if( !p->parsed_name ) return 0;
+    p->parsed_name[0] = '-'; p->parsed_name[1] = code; p->parsed_name[2] = 0;
+    }
+  else p->parsed_name = 0;
+  if( argument )
+    {
+    const int len = strlen( argument );
+    p->argument = (char *)malloc( len + 1 );
+    if( !p->argument ) { free( p->parsed_name ); return 0; }
+    strncpy( p->argument, argument, len + 1 );
+    }
+  else p->argument = 0;
+  ++ap->data_size;
+  return 1;
+  }
+
+
+static char add_error( struct Arg_parser * const ap, const char * const msg )
+  {
+  const int len = strlen( msg );
+  void * tmp = ap_resize_buffer( ap->error, ap->error_size + len + 1 );
+  if( !tmp ) return 0;
+  ap->error = (char *)tmp;
+  strncpy( ap->error + ap->error_size, msg, len + 1 );
+  ap->error_size += len;
+  return 1;
+  }
+
+
+static void free_data( struct Arg_parser * const ap )
+  {
+  int i;
+  for( i = 0; i < ap->data_size; ++i )
+    { free( ap->data[i].argument ); free( ap->data[i].parsed_name ); }
+  if( ap->data ) { free( ap->data ); ap->data = 0; }
+  ap->data_size = 0;
+  }
+
+
+/* Return 0 only if out of memory. */
+static char parse_long_option( struct Arg_parser * const ap,
+                               const char * const opt, const char * const arg,
+                               const struct ap_Option options[],
+                               int * const argindp )
+  {
+  unsigned len;
+  int index = -1, i;
+  char exact = 0, ambig = 0;
+
+  for( len = 0; opt[len+2] && opt[len+2] != '='; ++len ) ;
+
+  /* Test all long options for either exact match or abbreviated matches. */
+  for( i = 0; options[i].code != 0; ++i )
+    if( options[i].long_name &&
+        strncmp( options[i].long_name, &opt[2], len ) == 0 )
+      {
+      if( strlen( options[i].long_name ) == len )	/* Exact match found */
+        { index = i; exact = 1; break; }
+      else if( index < 0 ) index = i;		/* First nonexact match found */
+      else if( options[index].code != options[i].code ||
+               options[index].has_arg != options[i].has_arg )
+        ambig = 1;		/* Second or later nonexact match found */
+      }
+
+  if( ambig && !exact )
+    {
+    add_error( ap, "option '" ); add_error( ap, opt );
+    add_error( ap, "' is ambiguous" );
+    return 1;
+    }
+
+  if( index < 0 )		/* nothing found */
+    {
+    add_error( ap, "unrecognized option '" ); add_error( ap, opt );
+    add_error( ap, "'" );
+    return 1;
+    }
+
+  ++*argindp;
+
+  if( opt[len+2] )		/* '--<long_option>=<argument>' syntax */
+    {
+    if( options[index].has_arg == ap_no )
+      {
+      add_error( ap, "option '--" ); add_error( ap, options[index].long_name );
+      add_error( ap, "' doesn't allow an argument" );
+      return 1;
+      }
+    if( options[index].has_arg == ap_yes && !opt[len+3] )
+      {
+      add_error( ap, "option '--" ); add_error( ap, options[index].long_name );
+      add_error( ap, "' requires an argument" );
+      return 1;
+      }
+    return push_back_record( ap, options[index].code,
+                             options[index].long_name, &opt[len+3] );
+    }
+
+  if( options[index].has_arg == ap_yes )
+    {
+    if( !arg || !arg[0] )
+      {
+      add_error( ap, "option '--" ); add_error( ap, options[index].long_name );
+      add_error( ap, "' requires an argument" );
+      return 1;
+      }
+    ++*argindp;
+    return push_back_record( ap, options[index].code,
+                             options[index].long_name, arg );
+    }
+
+  return push_back_record( ap, options[index].code,
+                           options[index].long_name, 0 );
+  }
+
+
+/* Return 0 only if out of memory. */
+static char parse_short_option( struct Arg_parser * const ap,
+                                const char * const opt, const char * const arg,
+                                const struct ap_Option options[],
+                                int * const argindp )
+  {
+  int cind = 1;			/* character index in opt */
+
+  while( cind > 0 )
+    {
+    int index = -1, i;
+    const unsigned char c = opt[cind];
+    char code_str[2];
+    code_str[0] = c; code_str[1] = 0;
+
+    if( c != 0 )
+      for( i = 0; options[i].code; ++i )
+        if( c == options[i].code )
+          { index = i; break; }
+
+    if( index < 0 )
+      {
+      add_error( ap, "invalid option -- '" ); add_error( ap, code_str );
+      add_error( ap, "'" );
+      return 1;
+      }
+
+    if( opt[++cind] == 0 ) { ++*argindp; cind = 0; }	/* opt finished */
+
+    if( options[index].has_arg != ap_no && cind > 0 && opt[cind] )
+      {
+      if( !push_back_record( ap, c, 0, &opt[cind] ) ) return 0;
+      ++*argindp; cind = 0;
+      }
+    else if( options[index].has_arg == ap_yes )
+      {
+      if( !arg || !arg[0] )
+        {
+        add_error( ap, "option requires an argument -- '" );
+        add_error( ap, code_str ); add_error( ap, "'" );
+        return 1;
+        }
+      ++*argindp; cind = 0;
+      if( !push_back_record( ap, c, 0, arg ) ) return 0;
+      }
+    else if( !push_back_record( ap, c, 0, 0 ) ) return 0;
+    }
+  return 1;
+  }
+
+
+char ap_init( struct Arg_parser * const ap,
+              const int argc, const char * const argv[],
+              const struct ap_Option options[], const char in_order )
+  {
+  const char ** non_options = 0;	/* skipped non-options */
+  int non_options_size = 0;		/* number of skipped non-options */
+  int argind = 1;			/* index in argv */
+  char done = 0;			/* false until success */
+
+  ap->data = 0;
+  ap->error = 0;
+  ap->data_size = 0;
+  ap->error_size = 0;
+  if( argc < 2 || !argv || !options ) return 1;
+
+  while( argind < argc )
+    {
+    const unsigned char ch1 = argv[argind][0];
+    const unsigned char ch2 = ch1 ? argv[argind][1] : 0;
+
+    if( ch1 == '-' && ch2 )		/* we found an option */
+      {
+      const char * const opt = argv[argind];
+      const char * const arg = ( argind + 1 < argc ) ? argv[argind+1] : 0;
+      if( ch2 == '-' )
+        {
+        if( !argv[argind][2] ) { ++argind; break; }	/* we found "--" */
+        else if( !parse_long_option( ap, opt, arg, options, &argind ) ) goto out;
+        }
+      else if( !parse_short_option( ap, opt, arg, options, &argind ) ) goto out;
+      if( ap->error ) break;
+      }
+    else
+      {
+      if( in_order )
+        { if( !push_back_record( ap, 0, 0, argv[argind++] ) ) goto out; }
+      else
+        {
+        void * tmp = ap_resize_buffer( non_options,
+                       ( non_options_size + 1 ) * sizeof *non_options );
+        if( !tmp ) goto out;
+        non_options = (const char **)tmp;
+        non_options[non_options_size++] = argv[argind++];
+        }
+      }
+    }
+  if( ap->error ) free_data( ap );
+  else
+    {
+    int i;
+    for( i = 0; i < non_options_size; ++i )
+      if( !push_back_record( ap, 0, 0, non_options[i] ) ) goto out;
+    while( argind < argc )
+      if( !push_back_record( ap, 0, 0, argv[argind++] ) ) goto out;
+    }
+  done = 1;
+out: if( non_options ) free( non_options );
+  return done;
+  }
+
+
+void ap_free( struct Arg_parser * const ap )
+  {
+  free_data( ap );
+  if( ap->error ) { free( ap->error ); ap->error = 0; }
+  ap->error_size = 0;
+  }
+
+
+const char * ap_error( const struct Arg_parser * const ap )
+  { return ap->error; }
+
+
+int ap_arguments( const struct Arg_parser * const ap )
+  { return ap->data_size; }
+
+
+int ap_code( const struct Arg_parser * const ap, const int i )
+  {
+  if( i < 0 || i >= ap_arguments( ap ) ) return 0;
+  return ap->data[i].code;
+  }
+
+
+const char * ap_parsed_name( const struct Arg_parser * const ap, const int i )
+  {
+  if( i < 0 || i >= ap_arguments( ap ) || !ap->data[i].parsed_name ) return "";
+  return ap->data[i].parsed_name;
+  }
+
+
+const char * ap_argument( const struct Arg_parser * const ap, const int i )
+  {
+  if( i < 0 || i >= ap_arguments( ap ) || !ap->data[i].argument ) return "";
+  return ap->data[i].argument;
+  }
diff --git a/carg_parser.h b/carg_parser.h
new file mode 100644
index 0000000..69ce271
--- /dev/null
+++ b/carg_parser.h
@@ -0,0 +1,97 @@
+/* Arg_parser - POSIX/GNU command-line argument parser. (C version)
+   Copyright (C) 2006-2024 Antonio Diaz Diaz.
+
+   This library is free software. Redistribution and use in source and
+   binary forms, with or without modification, are permitted provided
+   that the following conditions are met:
+
+   1. Redistributions of source code must retain the above copyright
+   notice, this list of conditions, and the following disclaimer.
+
+   2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions, and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+
+   This library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+
+/* Arg_parser reads the arguments in 'argv' and creates a number of
+   option codes, option arguments, and non-option arguments.
+
+   In case of error, 'ap_error' returns a non-null pointer to an error
+   message.
+
+   'options' is an array of 'struct ap_Option' terminated by an element
+   containing a code which is zero. A null long_name means a short-only
+   option. A code value outside the unsigned char range means a long-only
+   option.
+
+   Arg_parser normally makes it appear as if all the option arguments
+   were specified before all the non-option arguments for the purposes
+   of parsing, even if the user of your program intermixed option and
+   non-option arguments. If you want the arguments in the exact order
+   the user typed them, call 'ap_init' with 'in_order' = true.
+
+   The argument '--' terminates all options; any following arguments are
+   treated as non-option arguments, even if they begin with a hyphen.
+
+   The syntax for optional option arguments is '-<short_option><argument>'
+   (without whitespace), or '--<long_option>=<argument>'.
+*/
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+enum ap_Has_arg { ap_no, ap_yes, ap_maybe };
+
+struct ap_Option
+  {
+  int code;			/* Short option letter or code ( code != 0 ) */
+  const char * long_name;	/* Long option name (maybe null) */
+  enum ap_Has_arg has_arg;
+  };
+
+
+struct ap_Record
+  {
+  int code;
+  char * parsed_name;
+  char * argument;
+  };
+
+
+struct Arg_parser
+  {
+  struct ap_Record * data;
+  char * error;
+  int data_size;
+  int error_size;
+  };
+
+
+char ap_init( struct Arg_parser * const ap,
+              const int argc, const char * const argv[],
+              const struct ap_Option options[], const char in_order );
+
+void ap_free( struct Arg_parser * const ap );
+
+const char * ap_error( const struct Arg_parser * const ap );
+
+/* The number of arguments parsed. May be different from argc. */
+int ap_arguments( const struct Arg_parser * const ap );
+
+/* If ap_code( i ) is 0, ap_argument( i ) is a non-option.
+   Else ap_argument( i ) is the option's argument (or empty). */
+int ap_code( const struct Arg_parser * const ap, const int i );
+
+/* Full name of the option parsed (short or long). */
+const char * ap_parsed_name( const struct Arg_parser * const ap, const int i );
+
+const char * ap_argument( const struct Arg_parser * const ap, const int i );
+
+#ifdef __cplusplus
+}
+#endif
diff --git a/configure b/configure
new file mode 100755
index 0000000..a95a89e
--- /dev/null
+++ b/configure
@@ -0,0 +1,198 @@
+#! /bin/sh
+# configure script for Clzip - LZMA lossless data compressor
+# Copyright (C) 2010-2024 Antonio Diaz Diaz.
+#
+# This configure script is free software: you have unlimited permission
+# to copy, distribute, and modify it.
+
+pkgname=clzip
+pkgversion=1.14
+progname=clzip
+srctrigger=doc/${pkgname}.texi
+
+# clear some things potentially inherited from environment.
+LC_ALL=C
+export LC_ALL
+srcdir=
+prefix=/usr/local
+exec_prefix='$(prefix)'
+bindir='$(exec_prefix)/bin'
+datarootdir='$(prefix)/share'
+infodir='$(datarootdir)/info'
+mandir='$(datarootdir)/man'
+CC=gcc
+CPPFLAGS=
+CFLAGS='-Wall -W -O2'
+LDFLAGS=
+MAKEINFO=makeinfo
+
+# checking whether we are using GNU C.
+/bin/sh -c "${CC} --version" > /dev/null 2>&1 || { CC=cc ; CFLAGS=-O2 ; }
+
+# Loop over all args
+args=
+no_create=
+while [ $# != 0 ] ; do
+
+	# Get the first arg, and shuffle
+	option=$1 ; arg2=no
+	shift
+
+	# Add the argument quoted to args
+	if [ -z "${args}" ] ; then args="\"${option}\""
+	else args="${args} \"${option}\"" ; fi
+
+	# Split out the argument for options that take them
+	case ${option} in
+	*=*) optarg=`echo "${option}" | sed -e 's,^[^=]*=,,;s,/$,,'` ;;
+	esac
+
+	# Process the options
+	case ${option} in
+	--help | -h)
+		echo "Usage: $0 [OPTION]... [VAR=VALUE]..."
+		echo
+		echo "To assign makefile variables (e.g., CC, CFLAGS...), specify them as"
+		echo "arguments to configure in the form VAR=VALUE."
+		echo
+		echo "Options and variables: [defaults in brackets]"
+		echo "  -h, --help            display this help and exit"
+		echo "  -V, --version         output version information and exit"
+		echo "  --srcdir=DIR          find the source code in DIR [. or ..]"
+		echo "  --prefix=DIR          install into DIR [${prefix}]"
+		echo "  --exec-prefix=DIR     base directory for arch-dependent files [${exec_prefix}]"
+		echo "  --bindir=DIR          user executables directory [${bindir}]"
+		echo "  --datarootdir=DIR     base directory for doc and data [${datarootdir}]"
+		echo "  --infodir=DIR         info files directory [${infodir}]"
+		echo "  --mandir=DIR          man pages directory [${mandir}]"
+		echo "  CC=COMPILER           C compiler to use [${CC}]"
+		echo "  CPPFLAGS=OPTIONS      command-line options for the preprocessor [${CPPFLAGS}]"
+		echo "  CFLAGS=OPTIONS        command-line options for the C compiler [${CFLAGS}]"
+		echo "  CFLAGS+=OPTIONS       append options to the current value of CFLAGS"
+		echo "  LDFLAGS=OPTIONS       command-line options for the linker [${LDFLAGS}]"
+		echo "  MAKEINFO=NAME         makeinfo program to use [${MAKEINFO}]"
+		echo
+		exit 0 ;;
+	--version | -V)
+		echo "Configure script for ${pkgname} version ${pkgversion}"
+		exit 0 ;;
+	--srcdir)            srcdir=$1 ; arg2=yes ;;
+	--prefix)            prefix=$1 ; arg2=yes ;;
+	--exec-prefix)  exec_prefix=$1 ; arg2=yes ;;
+	--bindir)            bindir=$1 ; arg2=yes ;;
+	--datarootdir)  datarootdir=$1 ; arg2=yes ;;
+	--infodir)          infodir=$1 ; arg2=yes ;;
+	--mandir)            mandir=$1 ; arg2=yes ;;
+
+	--srcdir=*)            srcdir=${optarg} ;;
+	--prefix=*)            prefix=${optarg} ;;
+	--exec-prefix=*)  exec_prefix=${optarg} ;;
+	--bindir=*)            bindir=${optarg} ;;
+	--datarootdir=*)  datarootdir=${optarg} ;;
+	--infodir=*)          infodir=${optarg} ;;
+	--mandir=*)            mandir=${optarg} ;;
+	--no-create)              no_create=yes ;;
+
+	CC=*)              CC=${optarg} ;;
+	CPPFLAGS=*)  CPPFLAGS=${optarg} ;;
+	CFLAGS=*)      CFLAGS=${optarg} ;;
+	CFLAGS+=*)     CFLAGS="${CFLAGS} ${optarg}" ;;
+	LDFLAGS=*)    LDFLAGS=${optarg} ;;
+	MAKEINFO=*)  MAKEINFO=${optarg} ;;
+
+	--*)
+		echo "configure: WARNING: unrecognized option: '${option}'" 1>&2 ;;
+	*=* | *-*-*) ;;
+	*)
+		echo "configure: unrecognized option: '${option}'" 1>&2
+		echo "Try 'configure --help' for more information." 1>&2
+		exit 1 ;;
+	esac
+
+	# Check if the option took a separate argument
+	if [ "${arg2}" = yes ] ; then
+		if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift
+		else echo "configure: Missing argument to '${option}'" 1>&2
+			exit 1
+		fi
+	fi
+done
+
+# Find the source code, if location was not specified.
+srcdirtext=
+if [ -z "${srcdir}" ] ; then
+	srcdirtext="or . or .." ; srcdir=.
+	if [ ! -r "${srcdir}/${srctrigger}" ] ; then srcdir=.. ; fi
+	if [ ! -r "${srcdir}/${srctrigger}" ] ; then
+		## the sed command below emulates the dirname command
+		srcdir=`echo "$0" | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'`
+	fi
+fi
+
+if [ ! -r "${srcdir}/${srctrigger}" ] ; then
+	echo "configure: Can't find source code in ${srcdir} ${srcdirtext}" 1>&2
+	echo "configure: (At least ${srctrigger} is missing)." 1>&2
+	exit 1
+fi
+
+# Set srcdir to . if that's what it is.
+if [ "`pwd`" = "`cd "${srcdir}" ; pwd`" ] ; then srcdir=. ; fi
+
+echo
+if [ -z "${no_create}" ] ; then
+	echo "creating config.status"
+	rm -f config.status
+	cat > config.status << EOF
+#! /bin/sh
+# This file was generated automatically by configure. Don't edit.
+# Run this file to recreate the current configuration.
+#
+# This script is free software: you have unlimited permission
+# to copy, distribute, and modify it.
+
+exec /bin/sh "$0" ${args} --no-create
+EOF
+	chmod +x config.status
+fi
+
+echo "creating Makefile"
+echo "VPATH = ${srcdir}"
+echo "prefix = ${prefix}"
+echo "exec_prefix = ${exec_prefix}"
+echo "bindir = ${bindir}"
+echo "datarootdir = ${datarootdir}"
+echo "infodir = ${infodir}"
+echo "mandir = ${mandir}"
+echo "CC = ${CC}"
+echo "CPPFLAGS = ${CPPFLAGS}"
+echo "CFLAGS = ${CFLAGS}"
+echo "LDFLAGS = ${LDFLAGS}"
+echo "MAKEINFO = ${MAKEINFO}"
+rm -f Makefile
+cat > Makefile << EOF
+# Makefile for Clzip - LZMA lossless data compressor
+# Copyright (C) 2010-2024 Antonio Diaz Diaz.
+# This file was generated automatically by configure. Don't edit.
+#
+# This Makefile is free software: you have unlimited permission
+# to copy, distribute, and modify it.
+
+pkgname = ${pkgname}
+pkgversion = ${pkgversion}
+progname = ${progname}
+VPATH = ${srcdir}
+prefix = ${prefix}
+exec_prefix = ${exec_prefix}
+bindir = ${bindir}
+datarootdir = ${datarootdir}
+infodir = ${infodir}
+mandir = ${mandir}
+CC = ${CC}
+CPPFLAGS = ${CPPFLAGS}
+CFLAGS = ${CFLAGS}
+LDFLAGS = ${LDFLAGS}
+MAKEINFO = ${MAKEINFO}
+EOF
+cat "${srcdir}/Makefile.in" >> Makefile
+
+echo "OK. Now you can run make."
diff --git a/decoder.c b/decoder.c
new file mode 100644
index 0000000..baea067
--- /dev/null
+++ b/decoder.c
@@ -0,0 +1,293 @@
+/* Clzip - LZMA lossless data compressor
+   Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+*/
+
+#define _FILE_OFFSET_BITS 64
+
+#include <errno.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "lzip.h"
+#include "decoder.h"
+
+
+/* Return the number of bytes really read.
+   If (value returned < size) and (errno == 0), EOF was reached.
+*/
+int readblock( const int fd, uint8_t * const buf, const int size )
+  {
+  int sz = 0;
+  errno = 0;
+  while( sz < size )
+    {
+    const int n = read( fd, buf + sz, size - sz );
+    if( n > 0 ) sz += n;
+    else if( n == 0 ) break;				/* EOF */
+    else if( errno != EINTR ) break;
+    errno = 0;
+    }
+  return sz;
+  }
+
+
+/* Return the number of bytes really written.
+   If (value returned < size), it is always an error.
+*/
+int writeblock( const int fd, const uint8_t * const buf, const int size )
+  {
+  int sz = 0;
+  errno = 0;
+  while( sz < size )
+    {
+    const int n = write( fd, buf + sz, size - sz );
+    if( n > 0 ) sz += n;
+    else if( n < 0 && errno != EINTR ) break;
+    errno = 0;
+    }
+  return sz;
+  }
+
+
+bool Rd_read_block( struct Range_decoder * const rdec )
+  {
+  if( !rdec->at_stream_end )
+    {
+    rdec->stream_pos = readblock( rdec->infd, rdec->buffer, rd_buffer_size );
+    if( rdec->stream_pos != rd_buffer_size && errno )
+      { show_error( "Read error", errno, false ); cleanup_and_fail( 1 ); }
+    rdec->at_stream_end = ( rdec->stream_pos < rd_buffer_size );
+    rdec->partial_member_pos += rdec->pos;
+    rdec->pos = 0;
+    show_dprogress( 0, 0, 0, 0 );
+    }
+  return rdec->pos < rdec->stream_pos;
+  }
+
+
+void LZd_flush_data( struct LZ_decoder * const d )
+  {
+  if( d->pos > d->stream_pos )
+    {
+    const int size = d->pos - d->stream_pos;
+    CRC32_update_buf( &d->crc, d->buffer + d->stream_pos, size );
+    if( d->outfd >= 0 &&
+        writeblock( d->outfd, d->buffer + d->stream_pos, size ) != size )
+      { show_error( "Write error", errno, false ); cleanup_and_fail( 1 ); }
+    if( d->pos >= d->dictionary_size )
+      { d->partial_data_pos += d->pos; d->pos = 0; d->pos_wrapped = true; }
+    d->stream_pos = d->pos;
+    }
+  }
+
+
+static int LZd_check_trailer( struct LZ_decoder * const d,
+                              struct Pretty_print * const pp,
+                              const bool ignore_empty )
+  {
+  Lzip_trailer trailer;
+  int size = Rd_read_data( d->rdec, trailer, Lt_size );
+  bool error = false;
+
+  if( size < Lt_size )
+    {
+    error = true;
+    if( verbosity >= 0 )
+      { Pp_show_msg( pp, 0 );
+        fprintf( stderr, "Trailer truncated at trailer position %d;"
+                         " some checks may fail.\n", size ); }
+    while( size < Lt_size ) trailer[size++] = 0;
+    }
+
+  const unsigned td_crc = Lt_get_data_crc( trailer );
+  if( td_crc != LZd_crc( d ) )
+    {
+    error = true;
+    if( verbosity >= 0 )
+      { Pp_show_msg( pp, 0 );
+        fprintf( stderr, "CRC mismatch; stored %08X, computed %08X\n",
+                 td_crc, LZd_crc( d ) ); }
+    }
+  const unsigned long long data_size = LZd_data_position( d );
+  const unsigned long long td_size = Lt_get_data_size( trailer );
+  if( td_size != data_size )
+    {
+    error = true;
+    if( verbosity >= 0 )
+      { Pp_show_msg( pp, 0 );
+        fprintf( stderr, "Data size mismatch; stored %llu (0x%llX), computed %llu (0x%llX)\n",
+                 td_size, td_size, data_size, data_size ); }
+    }
+  const unsigned long long member_size = Rd_member_position( d->rdec );
+  const unsigned long long tm_size = Lt_get_member_size( trailer );
+  if( tm_size != member_size )
+    {
+    error = true;
+    if( verbosity >= 0 )
+      { Pp_show_msg( pp, 0 );
+        fprintf( stderr, "Member size mismatch; stored %llu (0x%llX), computed %llu (0x%llX)\n",
+                 tm_size, tm_size, member_size, member_size ); }
+    }
+  if( error ) return 3;
+  if( !ignore_empty && data_size == 0 ) return 5;
+  if( verbosity >= 2 )
+    {
+    if( verbosity >= 4 ) show_header( d->dictionary_size );
+    if( data_size == 0 || member_size == 0 )
+      fputs( "no data compressed. ", stderr );
+    else
+      fprintf( stderr, "%6.3f:1, %5.2f%% ratio, %5.2f%% saved. ",
+               (double)data_size / member_size,
+               ( 100.0 * member_size ) / data_size,
+               100.0 - ( ( 100.0 * member_size ) / data_size ) );
+    if( verbosity >= 4 ) fprintf( stderr, "CRC %08X, ", td_crc );
+    if( verbosity >= 3 )
+      fprintf( stderr, "%9llu out, %8llu in. ", data_size, member_size );
+    }
+  return 0;
+  }
+
+
+/* Return value: 0 = OK, 1 = decoder error, 2 = unexpected EOF,
+                 3 = trailer error, 4 = unknown marker found,
+                 5 = empty member found, 6 = marked member found. */
+int LZd_decode_member( struct LZ_decoder * const d,
+                       const struct Cl_options * const cl_opts,
+                       struct Pretty_print * const pp )
+  {
+  struct Range_decoder * const rdec = d->rdec;
+  Bit_model bm_literal[1<<literal_context_bits][0x300];
+  Bit_model bm_match[states][pos_states];
+  Bit_model bm_rep[states];
+  Bit_model bm_rep0[states];
+  Bit_model bm_rep1[states];
+  Bit_model bm_rep2[states];
+  Bit_model bm_len[states][pos_states];
+  Bit_model bm_dis_slot[len_states][1<<dis_slot_bits];
+  Bit_model bm_dis[modeled_distances-end_dis_model+1];
+  Bit_model bm_align[dis_align_size];
+  struct Len_model match_len_model;
+  struct Len_model rep_len_model;
+  unsigned rep0 = 0;		/* rep[0-3] latest four distances */
+  unsigned rep1 = 0;		/* used for efficient coding of */
+  unsigned rep2 = 0;		/* repeated distances */
+  unsigned rep3 = 0;
+  State state = 0;
+
+  Bm_array_init( bm_literal[0], (1 << literal_context_bits) * 0x300 );
+  Bm_array_init( bm_match[0], states * pos_states );
+  Bm_array_init( bm_rep, states );
+  Bm_array_init( bm_rep0, states );
+  Bm_array_init( bm_rep1, states );
+  Bm_array_init( bm_rep2, states );
+  Bm_array_init( bm_len[0], states * pos_states );
+  Bm_array_init( bm_dis_slot[0], len_states * (1 << dis_slot_bits) );
+  Bm_array_init( bm_dis, modeled_distances - end_dis_model + 1 );
+  Bm_array_init( bm_align, dis_align_size );
+  Lm_init( &match_len_model );
+  Lm_init( &rep_len_model );
+
+  if( !Rd_load( rdec, cl_opts->ignore_marking ) ) return 6;
+  while( !Rd_finished( rdec ) )
+    {
+    const int pos_state = LZd_data_position( d ) & pos_state_mask;
+    if( Rd_decode_bit( rdec, &bm_match[state][pos_state] ) == 0 ) /* 1st bit */
+      {
+      /* literal byte */
+      Bit_model * const bm = bm_literal[get_lit_state(LZd_peek_prev( d ))];
+      if( ( state = St_set_char( state ) ) < 4 )
+        LZd_put_byte( d, Rd_decode_tree8( rdec, bm ) );
+      else
+        LZd_put_byte( d, Rd_decode_matched( rdec, bm, LZd_peek( d, rep0 ) ) );
+      continue;
+      }
+    /* match or repeated match */
+    int len;
+    if( Rd_decode_bit( rdec, &bm_rep[state] ) != 0 )		/* 2nd bit */
+      {
+      if( Rd_decode_bit( rdec, &bm_rep0[state] ) == 0 )		/* 3rd bit */
+        {
+        if( Rd_decode_bit( rdec, &bm_len[state][pos_state] ) == 0 ) /* 4th bit */
+          { state = St_set_short_rep( state );
+            LZd_put_byte( d, LZd_peek( d, rep0 ) ); continue; }
+        }
+      else
+        {
+        unsigned distance;
+        if( Rd_decode_bit( rdec, &bm_rep1[state] ) == 0 )	/* 4th bit */
+          distance = rep1;
+        else
+          {
+          if( Rd_decode_bit( rdec, &bm_rep2[state] ) == 0 )	/* 5th bit */
+            distance = rep2;
+          else
+            { distance = rep3; rep3 = rep2; }
+          rep2 = rep1;
+          }
+        rep1 = rep0;
+        rep0 = distance;
+        }
+      state = St_set_rep( state );
+      len = Rd_decode_len( rdec, &rep_len_model, pos_state );
+      }
+    else					/* match */
+      {
+      len = Rd_decode_len( rdec, &match_len_model, pos_state );
+      unsigned distance = Rd_decode_tree6( rdec, bm_dis_slot[get_len_state(len)] );
+      if( distance >= start_dis_model )
+        {
+        const unsigned dis_slot = distance;
+        const int direct_bits = ( dis_slot >> 1 ) - 1;
+        distance = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
+        if( dis_slot < end_dis_model )
+          distance += Rd_decode_tree_reversed( rdec,
+                      bm_dis + ( distance - dis_slot ), direct_bits );
+        else
+          {
+          distance +=
+            Rd_decode( rdec, direct_bits - dis_align_bits ) << dis_align_bits;
+          distance += Rd_decode_tree_reversed4( rdec, bm_align );
+          if( distance == 0xFFFFFFFFU )		/* marker found */
+            {
+            Rd_normalize( rdec );
+            LZd_flush_data( d );
+            if( len == min_match_len )		/* End Of Stream marker */
+              return LZd_check_trailer( d, pp, cl_opts->ignore_empty );
+            if( len == min_match_len + 1 )	/* Sync Flush marker */
+              { Rd_load( rdec, true ); continue; }
+            if( verbosity >= 0 )
+              {
+              Pp_show_msg( pp, 0 );
+              fprintf( stderr, "Unsupported marker code '%d'\n", len );
+              }
+            return 4;
+            }
+          }
+        }
+      rep3 = rep2; rep2 = rep1; rep1 = rep0; rep0 = distance;
+      state = St_set_match( state );
+      if( rep0 >= d->dictionary_size || ( rep0 >= d->pos && !d->pos_wrapped ) )
+        { LZd_flush_data( d ); return 1; }
+      }
+    LZd_copy_block( d, rep0, len );
+    }
+  LZd_flush_data( d );
+  return 2;
+  }
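
The distance-slot arithmetic used for matches in LZd_decode_member can be looked at in isolation. What follows is a minimal sketch and not code from clzip: the names `slot_direct_bits` and `slot_base_distance` are hypothetical, and the valid slot range 4 to 63 is an assumption based on the 6-bit slot tree (Rd_decode_tree6) and on start_dis_model being 4, a constant defined outside decoder.c.

```c
#include <assert.h>

/* Number of extra bits that follow distance slot 'slot'.
   Assumes 4 <= slot <= 63; smaller slots encode the distance directly. */
static int slot_direct_bits( const unsigned slot )
  {
  assert( slot >= 4 && slot <= 63 );
  return (int)( slot >> 1 ) - 1;
  }

/* Smallest distance representable by 'slot', before the extra bits are
   added. The two top bits are '1x', where x is the low bit of the slot. */
static unsigned slot_base_distance( const unsigned slot )
  {
  return ( 2U | ( slot & 1U ) ) << slot_direct_bits( slot );
  }
```

Under these assumptions slot 63 has 30 extra bits and a base of 0xC0000000, so setting all extra bits yields 0xFFFFFFFF, the value the decoder tests for when it looks for a marker.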
diff --git a/decoder.h b/decoder.h
new file mode 100644
index 0000000..d160135
--- /dev/null
+++ b/decoder.h
@@ -0,0 +1,367 @@
+/* Clzip - LZMA lossless data compressor
+   Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+*/
+
+enum { rd_buffer_size = 16384 };
+
+struct Range_decoder
+  {
+  unsigned long long partial_member_pos;
+  uint8_t * buffer;		/* input buffer */
+  int pos;			/* current pos in buffer */
+  int stream_pos;		/* when reached, a new block must be read */
+  uint32_t code;
+  uint32_t range;
+  int infd;			/* input file descriptor */
+  bool at_stream_end;
+  };
+
+bool Rd_read_block( struct Range_decoder * const rdec );
+
+static inline bool Rd_init( struct Range_decoder * const rdec, const int ifd )
+  {
+  rdec->partial_member_pos = 0;
+  rdec->buffer = (uint8_t *)malloc( rd_buffer_size );
+  if( !rdec->buffer ) return false;
+  rdec->pos = 0;
+  rdec->stream_pos = 0;
+  rdec->code = 0;
+  rdec->range = 0xFFFFFFFFU;
+  rdec->infd = ifd;
+  rdec->at_stream_end = false;
+  return true;
+  }
+
+static inline void Rd_free( struct Range_decoder * const rdec )
+  { free( rdec->buffer ); }
+
+static inline bool Rd_finished( struct Range_decoder * const rdec )
+  { return rdec->pos >= rdec->stream_pos && !Rd_read_block( rdec ); }
+
+static inline unsigned long long
+Rd_member_position( const struct Range_decoder * const rdec )
+  { return rdec->partial_member_pos + rdec->pos; }
+
+static inline void Rd_reset_member_position( struct Range_decoder * const rdec )
+  { rdec->partial_member_pos = 0; rdec->partial_member_pos -= rdec->pos; }
+
+static inline uint8_t Rd_get_byte( struct Range_decoder * const rdec )
+  {
+  /* 0xFF avoids decoder error if member is truncated at EOS marker */
+  if( Rd_finished( rdec ) ) return 0xFF;
+  return rdec->buffer[rdec->pos++];
+  }
+
+static inline int Rd_read_data( struct Range_decoder * const rdec,
+                                uint8_t * const outbuf, const int size )
+  {
+  int sz = 0;
+  while( sz < size && !Rd_finished( rdec ) )
+    {
+    const int rd = min( size - sz, rdec->stream_pos - rdec->pos );
+    memcpy( outbuf + sz, rdec->buffer + rdec->pos, rd );
+    rdec->pos += rd;
+    sz += rd;
+    }
+  return sz;
+  }
+
+static inline bool Rd_load( struct Range_decoder * const rdec,
+                            const bool ignore_marking )
+  {
+  int i;
+  rdec->code = 0;
+  rdec->range = 0xFFFFFFFFU;
+  /* check and discard first byte of the LZMA stream */
+  if( Rd_get_byte( rdec ) != 0 && !ignore_marking ) return false;
+  for( i = 0; i < 4; ++i ) rdec->code = (rdec->code << 8) | Rd_get_byte( rdec );
+  return true;
+  }
+
+static inline void Rd_normalize( struct Range_decoder * const rdec )
+  {
+  if( rdec->range <= 0x00FFFFFFU )
+    { rdec->range <<= 8; rdec->code = (rdec->code << 8) | Rd_get_byte( rdec ); }
+  }
+
+static inline unsigned Rd_decode( struct Range_decoder * const rdec,
+                                  const int num_bits )
+  {
+  unsigned symbol = 0;
+  int i;
+  for( i = num_bits; i > 0; --i )
+    {
+    Rd_normalize( rdec );
+    rdec->range >>= 1;
+/*    symbol <<= 1; */
+/*    if( rdec->code >= rdec->range ) { rdec->code -= rdec->range; symbol |= 1; } */
+    const bool bit = ( rdec->code >= rdec->range );
+    symbol <<= 1; symbol += bit;
+    rdec->code -= rdec->range & ( 0U - bit );
+    }
+  return symbol;
+  }
+
+static inline unsigned Rd_decode_bit( struct Range_decoder * const rdec,
+                                      Bit_model * const probability )
+  {
+  Rd_normalize( rdec );
+  const uint32_t bound = ( rdec->range >> bit_model_total_bits ) * *probability;
+  if( rdec->code < bound )
+    {
+    rdec->range = bound;
+    *probability += ( bit_model_total - *probability ) >> bit_model_move_bits;
+    return 0;
+    }
+  else
+    {
+    rdec->code -= bound;
+    rdec->range -= bound;
+    *probability -= *probability >> bit_model_move_bits;
+    return 1;
+    }
+  }
+
+static inline void Rd_decode_symbol_bit( struct Range_decoder * const rdec,
+                         Bit_model * const probability, unsigned * symbol )
+  {
+  Rd_normalize( rdec );
+  *symbol <<= 1;
+  const uint32_t bound = ( rdec->range >> bit_model_total_bits ) * *probability;
+  if( rdec->code < bound )
+    {
+    rdec->range = bound;
+    *probability += ( bit_model_total - *probability ) >> bit_model_move_bits;
+    }
+  else
+    {
+    rdec->code -= bound;
+    rdec->range -= bound;
+    *probability -= *probability >> bit_model_move_bits;
+    *symbol |= 1;
+    }
+  }
+
+static inline void Rd_decode_symbol_bit_reversed( struct Range_decoder * const rdec,
+                         Bit_model * const probability, unsigned * model,
+                         unsigned * symbol, const int i )
+  {
+  Rd_normalize( rdec );
+  *model <<= 1;
+  const uint32_t bound = ( rdec->range >> bit_model_total_bits ) * *probability;
+  if( rdec->code < bound )
+    {
+    rdec->range = bound;
+    *probability += ( bit_model_total - *probability ) >> bit_model_move_bits;
+    }
+  else
+    {
+    rdec->code -= bound;
+    rdec->range -= bound;
+    *probability -= *probability >> bit_model_move_bits;
+    *model |= 1;
+    *symbol |= 1 << i;
+    }
+  }
+
+static inline unsigned Rd_decode_tree6( struct Range_decoder * const rdec,
+                                        Bit_model bm[] )
+  {
+  unsigned symbol = 1;
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  return symbol & 0x3F;
+  }
+
+static inline unsigned Rd_decode_tree8( struct Range_decoder * const rdec,
+                                        Bit_model bm[] )
+  {
+  unsigned symbol = 1;
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  return symbol & 0xFF;
+  }
+
+static inline unsigned
+Rd_decode_tree_reversed( struct Range_decoder * const rdec,
+                         Bit_model bm[], const int num_bits )
+  {
+  unsigned model = 1;
+  unsigned symbol = 0;
+  int i;
+  for( i = 0; i < num_bits; ++i )
+    Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, i );
+  return symbol;
+  }
+
+static inline unsigned
+Rd_decode_tree_reversed4( struct Range_decoder * const rdec, Bit_model bm[] )
+  {
+  unsigned model = 1;
+  unsigned symbol = 0;
+  Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 0 );
+  Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 1 );
+  Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 2 );
+  Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 3 );
+  return symbol;
+  }
+
+static inline unsigned Rd_decode_matched( struct Range_decoder * const rdec,
+                                          Bit_model bm[], unsigned match_byte )
+  {
+  unsigned symbol = 1;
+  unsigned mask = 0x100;
+  while( true )
+    {
+    const unsigned match_bit = ( match_byte <<= 1 ) & mask;
+    const unsigned bit = Rd_decode_bit( rdec, &bm[symbol+match_bit+mask] );
+    symbol <<= 1; symbol += bit;
+    if( symbol > 0xFF ) return symbol & 0xFF;
+    mask &= ~(match_bit ^ (bit << 8));	/* if( match_bit != bit ) mask = 0; */
+    }
+  }
+
+static inline unsigned Rd_decode_len( struct Range_decoder * const rdec,
+                                      struct Len_model * const lm,
+                                      const int pos_state )
+  {
+  Bit_model * bm;
+  unsigned mask, offset, symbol = 1;
+
+  if( Rd_decode_bit( rdec, &lm->choice1 ) == 0 )
+    { bm = lm->bm_low[pos_state]; mask = 7; offset = 0; goto len3; }
+  if( Rd_decode_bit( rdec, &lm->choice2 ) == 0 )
+    { bm = lm->bm_mid[pos_state]; mask = 7; offset = len_low_symbols; goto len3; }
+  bm = lm->bm_high; mask = 0xFF; offset = len_low_symbols + len_mid_symbols;
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+len3:
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol );
+  return ( symbol & mask ) + min_match_len + offset;
+  }
+
+
+struct LZ_decoder
+  {
+  unsigned long long partial_data_pos;
+  struct Range_decoder * rdec;
+  unsigned dictionary_size;
+  uint8_t * buffer;		/* output buffer */
+  unsigned pos;			/* current pos in buffer */
+  unsigned stream_pos;		/* first byte not yet written to file */
+  uint32_t crc;
+  int outfd;			/* output file descriptor */
+  bool pos_wrapped;
+  };
+
+void LZd_flush_data( struct LZ_decoder * const d );
+
+static inline uint8_t LZd_peek_prev( const struct LZ_decoder * const d )
+  { return d->buffer[((d->pos > 0) ? d->pos : d->dictionary_size)-1]; }
+
+static inline uint8_t LZd_peek( const struct LZ_decoder * const d,
+                                const unsigned distance )
+  {
+  const unsigned i = ( ( d->pos > distance ) ? 0 : d->dictionary_size ) +
+                     d->pos - distance - 1;
+  return d->buffer[i];
+  }
+
+static inline void LZd_put_byte( struct LZ_decoder * const d, const uint8_t b )
+  {
+  d->buffer[d->pos] = b;
+  if( ++d->pos >= d->dictionary_size ) LZd_flush_data( d );
+  }
+
+static inline void LZd_copy_block( struct LZ_decoder * const d,
+                                   const unsigned distance, unsigned len )
+  {
+  unsigned lpos = d->pos, i = lpos - distance - 1;
+  bool fast, fast2;
+  if( lpos > distance )
+    {
+    fast = ( len < d->dictionary_size - lpos );
+    fast2 = ( fast && len <= lpos - i );
+    }
+  else
+    {
+    i += d->dictionary_size;
+    fast = ( len < d->dictionary_size - i );	/* (i == pos) may happen */
+    fast2 = ( fast && len <= i - lpos );
+    }
+  if( fast )					/* no wrap */
+    {
+    d->pos += len;
+    if( fast2 )					/* no wrap, no overlap */
+      memcpy( d->buffer + lpos, d->buffer + i, len );
+    else
+      for( ; len > 0; --len ) d->buffer[lpos++] = d->buffer[i++];
+    }
+  else for( ; len > 0; --len )
+    {
+    d->buffer[d->pos] = d->buffer[i];
+    if( ++d->pos >= d->dictionary_size ) LZd_flush_data( d );
+    if( ++i >= d->dictionary_size ) i = 0;
+    }
+  }
+
+static inline bool LZd_init( struct LZ_decoder * const d,
+                             struct Range_decoder * const rde,
+                             const unsigned dict_size, const int ofd )
+  {
+  d->partial_data_pos = 0;
+  d->rdec = rde;
+  d->dictionary_size = dict_size;
+  d->buffer = (uint8_t *)malloc( d->dictionary_size );
+  if( !d->buffer ) return false;
+  d->pos = 0;
+  d->stream_pos = 0;
+  d->crc = 0xFFFFFFFFU;
+  d->outfd = ofd;
+  d->pos_wrapped = false;
+  /* prev_byte of first byte; also for LZd_peek( 0 ) on corrupt file */
+  d->buffer[d->dictionary_size-1] = 0;
+  return true;
+  }
+
+static inline void LZd_free( struct LZ_decoder * const d )
+  { free( d->buffer ); }
+
+static inline unsigned LZd_crc( const struct LZ_decoder * const d )
+  { return d->crc ^ 0xFFFFFFFFU; }
+
+static inline unsigned long long
+LZd_data_position( const struct LZ_decoder * const d )
+  { return d->partial_data_pos + d->pos; }
+
+int LZd_decode_member( struct LZ_decoder * const d,
+                       const struct Cl_options * const cl_opts,
+                       struct Pretty_print * const pp );
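
The probability update that Rd_decode_bit applies to a Bit_model can be sketched on its own. This is an illustration, not code from clzip: `sk_update` and the `sk_` constants are hypothetical names, and the values 11 (total bits) and 5 (move bits) are assumptions taken from common LZMA practice, since bit_model_total_bits and bit_model_move_bits are defined outside decoder.h.

```c
#include <assert.h>

/* Assumed values; the real constants live outside decoder.h. */
enum { sk_total_bits = 11, sk_total = 1 << sk_total_bits, sk_move_bits = 5 };

/* Return the updated probability (that the next bit is 0) after
   decoding 'bit'. A 0 bit moves the probability toward sk_total,
   a 1 bit moves it toward 0, each by 1/32 of the remaining gap. */
static int sk_update( const int probability, const int bit )
  {
  assert( probability > 0 && probability < sk_total );
  if( bit == 0 )
    return probability + ( ( sk_total - probability ) >> sk_move_bits );
  return probability - ( probability >> sk_move_bits );
  }
```

With these assumed constants, a model at 1024 (an even chance) moves to 1056 after a 0 bit and to 992 after a 1 bit. Because the shift rounds down, the value stops changing once the gap is smaller than 32, so it can reach neither 0 nor sk_total.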
diff --git a/doc/clzip.1 b/doc/clzip.1
new file mode 100644
index 0000000..46f69e9
--- /dev/null
+++ b/doc/clzip.1
@@ -0,0 +1,141 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.49.2.
+.TH CLZIP "1" "January 2024" "clzip 1.14" "User Commands"
+.SH NAME
+clzip \- reduces the size of files
+.SH SYNOPSIS
+.B clzip
+[\fI\,options\/\fR] [\fI\,files\/\fR]
+.SH DESCRIPTION
+Clzip is a C language version of lzip, compatible with lzip 1.4 or newer. As
+clzip is written in C, it may be easier to integrate in applications like
+package managers, embedded devices, or systems lacking a C++ compiler.
+.PP
+Lzip is a lossless data compressor with a user interface similar to the one
+of gzip or bzip2. Lzip uses a simplified form of the 'Lempel\-Ziv\-Markov
+chain\-Algorithm' (LZMA) stream format to maximize interoperability. The
+maximum dictionary size is 512 MiB so that any lzip file can be decompressed
+on 32\-bit machines. Lzip provides accurate and robust 3\-factor integrity
+checking. Lzip can compress about as fast as gzip (lzip \fB\-0\fR) or compress most
+files more than bzip2 (lzip \fB\-9\fR). Decompression speed is intermediate between
+gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
+perspective. Lzip has been designed, written, and tested with great care to
+replace gzip and bzip2 as the standard general\-purpose compressed format for
+Unix\-like systems.
+.SH OPTIONS
+.TP
+\fB\-h\fR, \fB\-\-help\fR
+display this help and exit
+.TP
+\fB\-V\fR, \fB\-\-version\fR
+output version information and exit
+.TP
+\fB\-a\fR, \fB\-\-trailing\-error\fR
+exit with error status if trailing data
+.TP
+\fB\-b\fR, \fB\-\-member\-size=\fR<bytes>
+set member size limit in bytes
+.TP
+\fB\-c\fR, \fB\-\-stdout\fR
+write to standard output, keep input files
+.TP
+\fB\-d\fR, \fB\-\-decompress\fR
+decompress, test compressed file integrity
+.TP
+\fB\-f\fR, \fB\-\-force\fR
+overwrite existing output files
+.TP
+\fB\-F\fR, \fB\-\-recompress\fR
+force re\-compression of compressed files
+.TP
+\fB\-k\fR, \fB\-\-keep\fR
+keep (don't delete) input files
+.TP
+\fB\-l\fR, \fB\-\-list\fR
+print (un)compressed file sizes
+.TP
+\fB\-m\fR, \fB\-\-match\-length=\fR<bytes>
+set match length limit in bytes [36]
+.TP
+\fB\-o\fR, \fB\-\-output=\fR<file>
+write to <file>, keep input files
+.TP
+\fB\-q\fR, \fB\-\-quiet\fR
+suppress all messages
+.TP
+\fB\-s\fR, \fB\-\-dictionary\-size=\fR<bytes>
+set dictionary size limit in bytes [8 MiB]
+.TP
+\fB\-S\fR, \fB\-\-volume\-size=\fR<bytes>
+set volume size limit in bytes
+.TP
+\fB\-t\fR, \fB\-\-test\fR
+test compressed file integrity
+.TP
+\fB\-v\fR, \fB\-\-verbose\fR
+be verbose (a 2nd \fB\-v\fR gives more)
+.TP
+\fB\-0\fR .. \fB\-9\fR
+set compression level [default 6]
+.TP
+\fB\-\-fast\fR
+alias for \fB\-0\fR
+.TP
+\fB\-\-best\fR
+alias for \fB\-9\fR
+.TP
+\fB\-\-empty\-error\fR
+exit with error status if empty member in file
+.TP
+\fB\-\-marking\-error\fR
+exit with error status if 1st LZMA byte not 0
+.TP
+\fB\-\-loose\-trailing\fR
+allow trailing data seeming corrupt header
+.PP
+If no file names are given, or if a file is '\-', clzip compresses or
+decompresses from standard input to standard output.
+Numbers may be followed by a multiplier: k = kB = 10^3 = 1000,
+Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc...
+Dictionary sizes 12 to 29 are interpreted as powers of two, meaning 2^12 to
+2^29 bytes.
+.PP
+The bidimensional parameter space of LZMA can't be mapped to a linear scale
+optimal for all files. If your files are large, very repetitive, etc, you
+may need to use the options \fB\-\-dictionary\-size\fR and \fB\-\-match\-length\fR directly
+to achieve optimal performance.
+.PP
+To extract all the files from archive 'foo.tar.lz', use the commands
+\&'tar \fB\-xf\fR foo.tar.lz' or 'clzip \fB\-cd\fR foo.tar.lz | tar \fB\-xf\fR \-'.
+.PP
+Exit status: 0 for a normal exit, 1 for environmental problems
+(file not found, invalid command\-line options, I/O errors, etc), 2 to
+indicate a corrupt or invalid input file, 3 for an internal consistency
+error (e.g., bug) which caused clzip to panic.
+.PP
+The ideas embodied in clzip are due to (at least) the following people:
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
+definition of Markov chains), G.N.N. Martin (for the definition of range
+encoding), Igor Pavlov (for putting all the above together in LZMA), and
+Julian Seward (for bzip2's CLI).
+.SH "REPORTING BUGS"
+Report bugs to lzip\-bug@nongnu.org
+.br
+Clzip home page: http://www.nongnu.org/lzip/clzip.html
+.SH COPYRIGHT
+Copyright \(co 2024 Antonio Diaz Diaz.
+License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
+.br
+This is free software: you are free to change and redistribute it.
+There is NO WARRANTY, to the extent permitted by law.
+.SH "SEE ALSO"
+The full documentation for
+.B clzip
+is maintained as a Texinfo manual.  If the
+.B info
+and
+.B clzip
+programs are properly installed at your site, the command
+.IP
+.B info clzip
+.PP
+should give you access to the complete manual.
diff --git a/doc/clzip.info b/doc/clzip.info
new file mode 100644
index 0000000..2d83e3c
--- /dev/null
+++ b/doc/clzip.info
@@ -0,0 +1,1732 @@
+This is clzip.info, produced by makeinfo version 4.13+ from clzip.texi.
+
+INFO-DIR-SECTION Compression
+START-INFO-DIR-ENTRY
+* Clzip: (clzip).               LZMA lossless data compressor
+END-INFO-DIR-ENTRY
+
+
+File: clzip.info,  Node: Top,  Next: Introduction,  Up: (dir)
+
+Clzip Manual
+************
+
+This manual is for Clzip (version 1.14, 22 January 2024).
+
+* Menu:
+
+* Introduction::           Purpose and features of clzip
+* Output::                 Meaning of clzip's output
+* Invoking clzip::         Command-line interface
+* Quality assurance::      Design, development, and testing of lzip
+* Algorithm::              How clzip compresses the data
+* File format::            Detailed format of the compressed file
+* Stream format::          Format of the LZMA stream in lzip files
+* Trailing data::          Extra data appended to the file
+* Examples::               A small tutorial with examples
+* Problems::               Reporting bugs
+* Reference source code::  Source code illustrating stream format
+* Concept index::          Index of concepts
+
+
+   Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+   This manual is free documentation: you have unlimited permission to copy,
+distribute, and modify it.
+
+
+File: clzip.info,  Node: Introduction,  Next: Output,  Prev: Top,  Up: Top
+
+1 Introduction
+**************
+
+Clzip is a C language version of lzip, compatible with lzip 1.4 or newer.
+As clzip is written in C, it may be easier to integrate in applications like
+package managers, embedded devices, or systems lacking a C++ compiler.
+
+   Lzip is a lossless data compressor with a user interface similar to the
+one of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
+chain-Algorithm' (LZMA) stream format to maximize interoperability. The
+maximum dictionary size is 512 MiB so that any lzip file can be decompressed
+on 32-bit machines. Lzip provides accurate and robust 3-factor integrity
+checking. Lzip can compress about as fast as gzip (lzip -0) or compress most
+files more than bzip2 (lzip -9). Decompression speed is intermediate between
+gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
+perspective. Lzip has been designed, written, and tested with great care to
+replace gzip and bzip2 as the standard general-purpose compressed format for
+Unix-like systems.
+
+   For compressing/decompressing large files on multiprocessor machines
+plzip can be much faster than lzip at the cost of a slightly reduced
+compression ratio. *Note plzip manual: (plzip)Top.
+
+   For creation and manipulation of compressed tar archives tarlz can be
+more efficient than using tar and plzip because tarlz is able to keep the
+alignment between tar members and lzip members. *Note tarlz manual:
+(tarlz)Top.
+
+   The lzip file format is designed for data sharing and long-term
+archiving, taking into account both data integrity and decoder availability:
+
+   * The lzip format provides very safe integrity checking and some data
+     recovery means. The program lziprecover can repair bit flip errors
+     (one of the most common forms of data corruption) in lzip files, and
+     provides data recovery capabilities, including error-checked merging
+     of damaged copies of a file. *Note Data safety: (lziprecover)Data
+     safety.
+
+   * The lzip format is as simple as possible (but not simpler). The lzip
+     manual provides the source code of a simple decompressor along with a
+     detailed explanation of how it works, so that with only the help of the
+     lzip manual it would be possible for a digital archaeologist to extract
+     the data from a lzip file long after quantum computers eventually
+     render LZMA obsolete.
+
+   * Additionally the lzip reference implementation is copylefted, which
+     guarantees that it will remain free forever.
+
+   A nice feature of the lzip format is that a corrupt byte is easier to
+repair the nearer it is to the beginning of the file. Therefore, with the
+help of lziprecover, losing an entire archive just because of a corrupt
+byte near the beginning is a thing of the past.
+
+   The member trailer stores the 32-bit CRC of the original data, the size
+of the original data, and the size of the member. These values, together
+with the "End Of Stream" marker, provide 3-factor integrity checking, which
+guarantees that the decompressed version of the data is identical to the
+original. This guards against corruption of the compressed data, and against
+undetected bugs in clzip (hopefully very unlikely). The chances of data
+corruption going undetected are microscopic. Be aware, though, that the
+check occurs upon decompression, so it can only tell you that something is
+wrong. It can't help you recover the original uncompressed data.
+
+   Clzip uses the same well-defined exit status values used by bzip2, which
+makes it safer than compressors returning ambiguous warning values (like
+gzip) when it is used as a back end for other programs like tar or zutils.
+
+   Clzip automatically uses for each file the largest dictionary size that
+exceeds neither the file size nor the limit given. Keep in mind that the
+decompression memory requirement is affected at compression time by the
+choice of dictionary size limit.
+
+   The amount of memory required for compression is about 1 or 2 times the
+dictionary size limit (1 if input file size is less than dictionary size
+limit, else 2) plus 9 times the dictionary size really used. The option
+'-0' is special and only requires about 1.5 MiB at most. The amount of
+memory required for decompression is about 46 kB larger than the dictionary
+size really used.
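+
+   As a rough illustration, the rule above can be written as a small Python
+sketch. The function names are hypothetical, the results are only the
+approximations stated in this manual, and the special case of option '-0'
+is not modeled.
+
```python
MIB = 1 << 20

def compression_memory(dict_limit, dict_used, input_size):
    """Approximate bytes needed to compress, following the rule in the text."""
    # 1 x limit if the input is smaller than the limit, else 2 x limit.
    factor = 1 if input_size < dict_limit else 2
    return factor * dict_limit + 9 * dict_used

def decompression_memory(dict_used):
    """Approximate bytes needed to decompress: dictionary plus about 46 kB."""
    return dict_used + 46 * 1000

# A 100 MiB input compressed with an 8 MiB dictionary size limit.
print(compression_memory(8 * MIB, 8 * MIB, 100 * MIB) // MIB, "MiB")
```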
+
+   When compressing, clzip replaces every file given in the command line
+with a compressed version of itself, with the name "original_name.lz". When
+decompressing, clzip attempts to guess the name for the decompressed file
+from that of the compressed file as follows:
+
+filename.lz    becomes   filename
+filename.tlz   becomes   filename.tar
+anyothername   becomes   anyothername.out
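+
+   The table above can be sketched in Python as follows. This is only an
+illustration of the naming rule, not clzip's code; the helper name is
+hypothetical, and treating a name that consists only of the extension as
+"any other name" is an assumption.
+
```python
def decompressed_name(name):
    """Guess the output name for a compressed file name, per the table."""
    # Assumption: a bare ".lz" or ".tlz" falls through to the ".out" rule.
    if name.endswith(".lz") and len(name) > len(".lz"):
        return name[:-len(".lz")]
    if name.endswith(".tlz") and len(name) > len(".tlz"):
        return name[:-len(".tlz")] + ".tar"
    return name + ".out"

for example in ("filename.lz", "filename.tlz", "anyothername"):
    print(example, "becomes", decompressed_name(example))
```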
+
+   (De)compressing a file is much like copying or moving it. Therefore clzip
+preserves the access and modification dates, permissions, and, if you have
+appropriate privileges, ownership of the file just as 'cp -p' does. (If the
+user ID or the group ID can't be duplicated, the file permission bits
+S_ISUID and S_ISGID are cleared).
+
+   Clzip is able to read from some types of non-regular files if either the
+option '-c' or the option '-o' is specified.
+
+   Clzip refuses to read compressed data from a terminal or write compressed
+data to a terminal, as this would be entirely incomprehensible and might
+leave the terminal in an abnormal state.
+
+   Clzip correctly decompresses a file which is the concatenation of two or
+more compressed files. The result is the concatenation of the corresponding
+decompressed files. Integrity testing of concatenated compressed files is
+also supported.
+
+   Clzip can produce multimember files, and lziprecover can safely recover
+the undamaged members in case of file damage. Clzip can also split the
+compressed output in volumes of a given size, even when reading from
+standard input. This allows the direct creation of multivolume compressed
+tar archives.
+
+   Clzip is able to compress and decompress streams of unlimited size by
+automatically creating multimember output. The members so created are large,
+about 2 PiB each.
+
+
+File: clzip.info,  Node: Output,  Next: Invoking clzip,  Prev: Introduction,  Up: Top
+
+2 Meaning of clzip's output
+***************************
+
+The output of clzip looks like this:
+
+     clzip -v foo
+       foo:  6.676:1, 14.98% ratio, 85.02% saved, 450560 in, 67493 out.
+
+     clzip -tvvv foo.lz
+       foo.lz:  6.676:1, 14.98% ratio, 85.02% saved.  450560 out,  67493 in. ok
+
+   The meaning of each field is as follows:
+
+'N:1'
+     The compression ratio (uncompressed_size / compressed_size), shown as
+     N to 1.
+
+'ratio'
+     The inverse compression ratio (compressed_size / uncompressed_size),
+     shown as a percentage. A decimal ratio is easily obtained by moving the
+     decimal point two places to the left; 14.98% = 0.1498.
+
+'saved'
+     The space saved by compression (1 - ratio), shown as a percentage.
+
+'in'
+     Size of the input data. This is the uncompressed size when
+     compressing, or the compressed size when decompressing or testing.
+     Note that clzip always prints the uncompressed size before the
+     compressed size when compressing, decompressing, testing, or listing.
+
+'out'
+     Size of the output data. This is the compressed size when compressing,
+     or the decompressed size when decompressing or testing.
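+
+   The arithmetic behind the first three fields can be checked with a short
+Python sketch. The function name is hypothetical, both sizes are assumed
+to be greater than zero, and the formatting only imitates the example
+lines above.
+
```python
def ratio_fields(uncompressed_size, compressed_size):
    """Format the 'N:1', 'ratio', and 'saved' fields from the two sizes."""
    n_to_1 = uncompressed_size / compressed_size
    ratio = 100.0 * compressed_size / uncompressed_size  # inverse ratio in %
    saved = 100.0 - ratio                                # (1 - ratio) in %
    return f"{n_to_1:.3f}:1, {ratio:.2f}% ratio, {saved:.2f}% saved"

# Sizes taken from the example output for the file 'foo'.
print(ratio_fields(450560, 67493))
```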
+
+
+   When decompressing or testing at verbosity level 4 (-vvvv), the
+dictionary size used to compress the file and the CRC32 of the uncompressed
+data are also shown.
+
+   LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never
+have been compressed. Decompressed is used to refer to data which have
+undergone the process of decompression.
+
+
+File: clzip.info,  Node: Invoking clzip,  Next: Quality assurance,  Prev: Output,  Up: Top
+
+3 Invoking clzip
+****************
+
+The format for running clzip is:
+
+     clzip [OPTIONS] [FILES]
+
+If no file names are specified, clzip compresses (or decompresses) from
+standard input to standard output. A hyphen '-' used as a FILE argument
+means standard input. It can be mixed with other FILES and is read just
+once, the first time it appears in the command line. Remember to prepend
+'./' to any file name beginning with a hyphen, or use '--'.
+
+   clzip supports the following options: *Note Argument syntax:
+(arg_parser)Argument syntax.
+
+'-h'
+'--help'
+     Print an informative help message describing the options and exit.
+
+'-V'
+'--version'
+     Print the version number of clzip on the standard output and exit.
+     This version number should be included in all bug reports.
+
+'-a'
+'--trailing-error'
+     Exit with error status 2 if any remaining input is detected after
+     decompressing the last member. Such remaining input is usually trailing
+     garbage that can be safely ignored. *Note concat-example::.
+
+'-b BYTES'
+'--member-size=BYTES'
+     When compressing, set the member size limit to BYTES. It is advisable
+     to keep members smaller than RAM size so that they can be repaired with
+     lziprecover in case of corruption. A small member size may degrade
+     compression ratio, so use it only when needed. Valid values range from
+     100 kB to 2 PiB. Defaults to 2 PiB.
+
+'-c'
+'--stdout'
+     Compress or decompress to standard output; keep input files unchanged.
+     If compressing several files, each file is compressed independently.
+     (The output consists of a sequence of independently compressed
+     members). This option (or '-o') is needed when reading from a named
+     pipe (fifo) or from a device. Use it also to recover as much of the
+     decompressed data as possible when decompressing a corrupt file. '-c'
+     overrides '-o' and '-S'. '-c' has no effect when testing or listing.
+
+'-d'
+'--decompress'
+     Decompress the files specified. The integrity of the files specified is
+     checked. If a file does not exist, can't be opened, or the destination
+     file already exists and '--force' has not been specified, clzip
+     continues decompressing the rest of the files and exits with error
+     status 1. If a file fails to decompress, or is a terminal, clzip exits
+     immediately with error status 2 without decompressing the rest of the
+     files. A terminal is considered an uncompressed file, and therefore
+     invalid.
+
+'-f'
+'--force'
+     Force overwrite of output files.
+
+'-F'
+'--recompress'
+     When compressing, force re-compression of files whose name already has
+     the '.lz' or '.tlz' suffix.
+
+'-k'
+'--keep'
+     Keep (don't delete) input files during compression or decompression.
+
+'-l'
+'--list'
+     Print the uncompressed size, compressed size, and percentage saved of
+     the files specified. Trailing data are ignored. The values produced
+     are correct even for multimember files. If more than one file is
+     given, a final line containing the cumulative sizes is printed. With
+     '-v', the dictionary size, the number of members in the file, and the
+     amount of trailing data (if any) are also printed. With '-vv', the
+     positions and sizes of each member in multimember files are also
+     printed.
+
+     If any file is damaged, does not exist, can't be opened, or is not
+     regular, the final exit status is > 0. '-lq' can be used to check
+     quickly (without decompressing) the structural integrity of the files
+     specified. (Use '--test' to check the data integrity). '-alq'
+     additionally checks that none of the files specified contain trailing
+     data.
+
+'-m BYTES'
+'--match-length=BYTES'
+     When compressing, set the match length limit in bytes. After a match
+     this long is found, the search is finished. Valid values range from 5
+     to 273. Larger values usually give better compression ratios but
+     longer compression times.
+
+'-o FILE'
+'--output=FILE'
+     If '-c' has not also been specified, write the (de)compressed output
+     to FILE, automatically creating any missing parent directories; keep
+     input files unchanged. If compressing several files, each file is
+     compressed independently. (The output consists of a sequence of
+     independently compressed members). This option (or '-c') is needed
+     when reading from a named pipe (fifo) or from a device. '-o -' is
+     equivalent to '-c'. '-o' has no effect when testing or listing.
+
+     In order to keep backward compatibility with clzip versions prior to
+     1.12, when compressing from standard input and no other file names are
+     given, the extension '.lz' is appended to FILE unless it already ends
+     in '.lz' or '.tlz'. This feature will be removed in a future version
+     of clzip. Meanwhile, redirection may be used instead of '-o' to write
+     the compressed output to a file without the extension '.lz' in its
+     name: 'clzip < file > foo'.
+
+     When compressing and splitting the output in volumes, FILE is used as
+     a prefix, and several files named 'FILE00001.lz', 'FILE00002.lz', etc,
+     are created. In this case, only one input file is allowed.
+
+'-q'
+'--quiet'
+     Quiet operation. Suppress all messages.
+
+'-s BYTES'
+'--dictionary-size=BYTES'
+     When compressing, set the dictionary size limit in bytes. Clzip uses
+     for each file the largest dictionary size that exceeds neither the
+     file size nor this limit. Valid values range from 4 KiB to 512 MiB.
+     Values 12 to 29 are interpreted as powers of two, meaning 2^12 to 2^29
+     bytes. Dictionary sizes are quantized so that they can be coded in
+     just one byte (*note coded-dict-size::). If the size specified does
+     not match one of the valid sizes, it is rounded upwards by adding up
+     to (BYTES / 8) to it.
+
+     For maximum compression you should use a dictionary size limit as large
+     as possible, but keep in mind that the decompression memory requirement
+     is affected at compression time by the choice of dictionary size limit.
+
+'-S BYTES'
+'--volume-size=BYTES'
+     When compressing, and '-c' has not also been specified, split the
+     compressed output into several volume files with names
+     'original_name00001.lz', 'original_name00002.lz', etc, and set the
+     volume size limit to BYTES. Input files are kept unchanged. Each
+     volume is a complete, maybe multimember, lzip file. A small volume
+     size may degrade compression ratio, so use it only when needed. Valid
+     values range from 100 kB to 4 EiB.
+
+'-t'
+'--test'
+     Check integrity of the files specified, but don't decompress them. This
+     really performs a trial decompression and throws away the result. Use
+     it together with '-v' to see information about the files. If a file
+     fails the test, does not exist, can't be opened, or is a terminal,
+     clzip continues testing the rest of the files. A final diagnostic is
+     shown at verbosity level 1 or higher if any file fails the test when
+     testing multiple files.
+
+'-v'
+'--verbose'
+     Verbose mode.
+     When compressing, show the compression ratio and size for each file
+     processed.
+     When decompressing or testing, further -v's (up to 4) increase the
+     verbosity level, showing status, compression ratio, dictionary size,
+     trailer contents (CRC, data size, member size), and up to 6 bytes of
+     trailing data (if any) both in hexadecimal and as a string of printable
+     ASCII characters.
+     Two or more '-v' options show the progress of (de)compression.
+
+'-0 .. -9'
+     Compression level. Set the compression parameters (dictionary size and
+     match length limit) as shown in the table below. The default
+     compression level is '-6', equivalent to '-s8MiB -m36'. Note that '-9'
+     can be much slower than '-0'. These options have no effect when
+     decompressing, testing, or listing.
+
+     The bidimensional parameter space of LZMA can't be mapped to a linear
+     scale optimal for all files. If your files are large, very repetitive,
+     etc, you may need to use the options '--dictionary-size' and
+     '--match-length' directly to achieve optimal performance.
+
+     If several compression levels or '-s' or '-m' options are given, the
+     last setting is used. For example '-9 -s64MiB' is equivalent to
+     '-s64MiB -m273'.
+
+     Level   Dictionary size (-s)   Match length limit (-m)
+     -0      64 KiB                 16 bytes
+     -1      1 MiB                  5 bytes
+     -2      1.5 MiB                6 bytes
+     -3      2 MiB                  8 bytes
+     -4      3 MiB                  12 bytes
+     -5      4 MiB                  20 bytes
+     -6      8 MiB                  36 bytes
+     -7      16 MiB                 68 bytes
+     -8      24 MiB                 132 bytes
+     -9      32 MiB                 273 bytes
+
+'--fast'
+'--best'
+     Aliases for GNU gzip compatibility.
+
+'--empty-error'
+     Exit with error status 2 if any empty member is found in the input
+     files.
+
+'--marking-error'
+     Exit with error status 2 if the first LZMA byte is non-zero in any
+     member of the input files. This may be caused by data corruption or by
+     deliberate insertion of tracking information in the file. Use
+     'lziprecover --clear-marking' to clear any such non-zero bytes.
+
+'--loose-trailing'
+     When decompressing, testing, or listing, allow trailing data whose
+     first bytes are so similar to the magic bytes of a lzip header that
+     they can be confused with a corrupt header. Use this option if a file
+     triggers a "corrupt header" error and the cause is not actually a
+     corrupt header.
+
+
+   Numbers given as arguments to options may be expressed in decimal,
+hexadecimal, or octal (using the same syntax as integer constants in C++),
+and may be followed by a multiplier and an optional 'B' for "byte".
+
+   Table of SI and binary prefixes (unit multipliers):
+
+Prefix   Value                      |   Prefix   Value
+k        kilobyte   (10^3 = 1000)   |   Ki       kibibyte  (2^10 = 1024)
+M        megabyte   (10^6)          |   Mi       mebibyte  (2^20)
+G        gigabyte   (10^9)          |   Gi       gibibyte  (2^30)
+T        terabyte   (10^12)         |   Ti       tebibyte  (2^40)
+P        petabyte   (10^15)         |   Pi       pebibyte  (2^50)
+E        exabyte    (10^18)         |   Ei       exbibyte  (2^60)
+Z        zettabyte  (10^21)         |   Zi       zebibyte  (2^70)
+Y        yottabyte  (10^24)         |   Yi       yobibyte  (2^80)
+R        ronnabyte  (10^27)         |   Ri       robibyte  (2^90)
+Q        quettabyte (10^30)         |   Qi       quebibyte (2^100)
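+
+   The number syntax described above can be approximated with the Python
+sketch below. It is an illustration of the rules in this section, not
+clzip's parser: the function name is hypothetical, only the prefixes in
+the table are recognized, and accepting a bare 'B' without a multiplier
+is an assumption.
+
```python
import re

SI_PREFIXES = "kMGTPEZYRQ"      # powers of 1000
BINARY_PREFIXES = "KMGTPEZYRQ"  # powers of 1024, written as 'Ki', 'Mi', ...

_NUMBER = re.compile(r"(0[xX][0-9a-fA-F]+|[0-9]+)([A-Za-z]*)")

def parse_size(text):
    """Parse strings such as '8MiB', '100kB', '0x1000', or '010'."""
    match = _NUMBER.fullmatch(text)
    if match is None:
        raise ValueError(f"bad numerical argument: {text!r}")
    digits, suffix = match.groups()
    if digits[:2].lower() == "0x":
        value = int(digits, 16)          # hexadecimal, as in C++
    elif len(digits) > 1 and digits[0] == "0":
        value = int(digits, 8)           # leading zero means octal in C++
    else:
        value = int(digits, 10)
    if suffix.endswith("B"):             # optional 'B' for "byte"
        suffix = suffix[:-1]
    if suffix == "":
        return value
    if len(suffix) == 1 and suffix in SI_PREFIXES:
        return value * 1000 ** (SI_PREFIXES.index(suffix) + 1)
    if len(suffix) == 2 and suffix[1] == "i" and suffix[0] in BINARY_PREFIXES:
        return value * 1024 ** (BINARY_PREFIXES.index(suffix[0]) + 1)
    raise ValueError(f"bad multiplier in: {text!r}")

print(parse_size("8MiB"), parse_size("100kB"), parse_size("0x1000"))
```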
+
+
+   Exit status: 0 for a normal exit, 1 for environmental problems (file not
+found, invalid command-line options, I/O errors, etc), 2 to indicate a
+corrupt or invalid input file, 3 for an internal consistency error (e.g.,
+bug) which caused clzip to panic.
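+
+   A program using clzip as a back end might interpret these values as in
+the following Python sketch. The helper names are hypothetical, and
+'check_file' assumes that an executable named 'clzip' can be found in the
+search path; only the options '-t' and '-q' described above are used.
+
```python
import subprocess

EXIT_MEANING = {
    0: "normal exit",
    1: "environmental problem (file not found, invalid options, I/O error)",
    2: "corrupt or invalid input file",
    3: "internal consistency error",
}

def describe_exit_status(status):
    """Map an exit status to the meaning documented above."""
    return EXIT_MEANING.get(status, "undocumented exit status")

def check_file(path, program="clzip"):
    """Run an integrity test on one file and describe the outcome."""
    # '--' protects against file names beginning with a hyphen.
    result = subprocess.run([program, "-t", "-q", "--", path])
    return result.returncode, describe_exit_status(result.returncode)

print(describe_exit_status(2))
```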
+
+
+File: clzip.info,  Node: Quality assurance,  Next: Algorithm,  Prev: Invoking clzip,  Up: Top
+
+4 Design, development, and testing of lzip
+******************************************
+
+There are two ways of constructing a software design: One way is to make it
+so simple that there are obviously no deficiencies and the other way is to
+make it so complicated that there are no obvious deficiencies. The first
+method is far more difficult.
+-- C.A.R. Hoare
+
+   Lzip has been designed, written, and tested with great care to replace
+gzip and bzip2 as the standard general-purpose compressed format for
+Unix-like systems. This chapter describes the lessons learned from these
+previous formats, and their application to the design of lzip. The lzip
+format specification has been reviewed carefully and is believed to be free
+from design errors.
+
+
+4.1 Format design
+=================
+
+When gzip was designed in 1992, computers and operating systems were much
+less capable than they are today. The designers of gzip tried to work around
+some of those limitations, like 8.3 file names, with additional fields in
+the file format.
+
+   Today those limitations have mostly disappeared, and the format of gzip
+has proved to be unnecessarily complicated. It includes fields that were
+never used, others that have lost their usefulness, and finally others that
+have become too limited.
+
+   Bzip2 was designed 5 years later, and its format is simpler than the one
+of gzip.
+
+   Probably the worst defect of the gzip format from the point of view of
+data safety is the variable size of its header. If the byte at offset 3
+(flags) of a gzip member gets corrupted, it may become difficult to recover
+the data, even if the compressed blocks are intact, because it can't be
+known with certainty where the compressed blocks begin.
+
+   By contrast, the header of a lzip member has a fixed length of 6 bytes.
+The LZMA stream in a lzip member always starts at offset 6, making it
+trivial to recover the data even if the whole header becomes corrupt.
+
+   Bzip2 also provides a header of fixed length and marks the beginning and
+end of each compressed block with six magic bytes, making it possible to
+find the compressed blocks even in case of file damage. But bzip2 does not
+store the size of each compressed block, as lzip does.
+
+   Lziprecover is able to provide unique data recovery capabilities because
+the lzip format is extraordinarily safe. The simple and safe design of the
+file format complements the embedded error detection provided by the LZMA
+data stream. Any distance larger than the dictionary size acts as a
+forbidden symbol, allowing the decompressor to detect the approximate
+position of errors, and leaving very little work for the check sequence
+(CRC and data sizes) in the detection of errors. Lzip is usually able to
+detect all possible bit flips in the compressed data without resorting to
+the check sequence. It would be difficult to write an automatic recovery
+tool like lziprecover for the gzip format. And, as far as I know, it has
+never been written.
+
+   Lzip, like gzip and bzip2, uses a CRC32 to check the integrity of the
+decompressed data because it provides optimal accuracy in the detection of
+errors up to a compressed size of about 16 GiB, a size larger than that of
+most files. In the case of lzip, the additional detection capability of the
+decompressor reduces the probability of undetected errors several million
+times more, resulting in a combined integrity checking optimally accurate
+for any member size produced by lzip. Preliminary results suggest that the
+lzip format is safe enough to be used in critical safety avionics systems.
+
+   The lzip format is designed for long-term archiving. Therefore it
+excludes any unneeded features that may interfere with the future
+extraction of the decompressed data.
+
+
+4.1.1 Gzip format (mis)features not present in lzip
+---------------------------------------------------
+
+'Multiple algorithms'
+     Gzip provides a CM (Compression Method) field that has never been used
+     because it is a bad idea to begin with. New compression methods may
+     require additional fields, making it impossible to implement new
+     methods and, at the same time, keep the same format. This field does
+     not solve the problem of format proliferation; it just makes the
+     problem less obvious.
+
+'Optional fields in header'
+     Unless special precautions are taken, optional fields are generally a
+     bad idea because they produce a header of variable size. The gzip
+     header has 2 fields that, in addition to being optional, are
+     zero-terminated. This means that if any byte inside the field gets
+     zeroed, or if the terminating zero gets altered, gzip will be able to
+     find neither the header CRC nor the compressed blocks.
+
+'Optional CRC for the header'
+     Using an optional CRC for the header is not only a bad idea, it is an
+     error; it circumvents the Hamming distance (HD) of the CRC and may
+     prevent the extraction of perfectly good data. For example, if the CRC
+     is used and the bit enabling it is reset by a bit flip, then the
+     header seems to be intact (in spite of being corrupt) while the
+     compressed blocks seem to be totally unrecoverable (in spite of being
+     intact). Very misleading indeed.
+
+'Metadata'
+     The gzip format stores some metadata, like the modification time of the
+     original file or the operating system on which compression took place.
+     This complicates reproducible compression (obtaining identical
+     compressed output from identical input).
+
+
+4.1.2 Lzip format improvements over gzip and bzip2
+--------------------------------------------------
+
+'64-bit size field'
+     Probably the most frequently reported shortcoming of the gzip format
+     is that it only stores the least significant 32 bits of the
+     uncompressed size. The size of any file larger than or equal to 4 GiB
+     gets truncated.
+
+     Bzip2 does not store the uncompressed size of the file.
+
+     The lzip format provides a 64-bit field for the uncompressed size.
+     Additionally, lzip produces multimember output automatically when the
+     size is too large for a single member, allowing for an unlimited
+     uncompressed size.
+
+'Distributed index'
+     The lzip format provides a distributed index that, among other things,
+     helps plzip to decompress several times faster than pigz and helps
+     lziprecover do its job. Neither the gzip format nor the bzip2 format
+     provides an index.
+
+     A distributed index is safer and more scalable than a monolithic
+     index. The monolithic index introduces a single point of failure in
+     the compressed file and may limit the number of members or the total
+     uncompressed size.
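+
+   To illustrate how a distributed index can be used, the Python sketch
+below locates every member of a multimember file by walking backwards
+from the end of the file, reading only the trailers. The function name is
+hypothetical, and the sketch assumes intact trailers and no trailing data
+after the last member; it is not how any lzip tool is implemented.
+
```python
import struct

HEADER_SIZE = 6    # ID string (4) + VN (1) + DS (1)
TRAILER_SIZE = 20  # CRC32 (4) + data size (8) + member size (8)

def list_members(data):
    """Return (offset, member_size, data_size) per member, in file order."""
    members = []
    end = len(data)
    while end > 0:
        if end < HEADER_SIZE + TRAILER_SIZE:
            raise ValueError("too short to hold a member")
        # The last 16 bytes of a member are data size and member size,
        # both stored in little endian order.
        data_size, member_size = struct.unpack_from("<QQ", data, end - 16)
        start = end - member_size
        if member_size < HEADER_SIZE + TRAILER_SIZE or start < 0:
            raise ValueError("bad member size in trailer")
        if data[start:start + 4] != b"LZIP":
            raise ValueError("member header not found")
        members.append((start, member_size, data_size))
        end = start  # the previous member, if any, ends where this begins
    members.reverse()
    return members
```
+
+   No compressed data need to be decoded to find the members, which is
+what makes per-member operations such as recovery or parallel
+decompression practical.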
+
+
+4.2 Quality of implementation
+=============================
+
+Our civilization depends critically on software; it had better be quality
+software.
+-- Bjarne Stroustrup
+
+'Accurate and robust error detection'
+     The lzip format provides 3-factor integrity checking, and the
+     decompressors report mismatches in each factor separately. This method
+     detects most false positives for corruption. If just one byte in one
+     factor fails but the other two factors match the data, it probably
+     means that the data are intact and the corruption just affects the
+     mismatching factor (CRC, data size, or member size) in the member
+     trailer.
+
+'Multiple implementations'
+     Just like the lzip format provides 3-factor protection against
+     undetected data corruption, the development methodology of the lzip
+     family of compressors provides 3-factor protection against undetected
+     programming errors.
+
+     Three related but independent compressor implementations, lzip, clzip,
+     and minilzip/lzlib, are developed concurrently. Every stable release
+     of any of them is tested to check that it produces identical output to
+     the other two. This guarantees that all three implement the same
+     algorithm, and makes it unlikely that any of them may contain serious
+     undiscovered errors. In fact, no errors have been discovered in lzip
+     since 2009.
+
+     Additionally, the three implementations have been extensively tested
+     with unzcrash, valgrind, and 'american fuzzy lop' without finding a
+     single vulnerability or false negative. *Note Unzcrash:
+     (lziprecover)Unzcrash.
+
+'Dictionary size'
+     Lzip automatically adapts the dictionary size to the size of each file.
+     In addition to reducing the amount of memory required for
+     decompression, this feature also minimizes the probability of being
+     affected by RAM errors during compression.
+
+'Exit status'
+     Returning a warning status of 2 is a design flaw of compress that
+     leaked into the design of gzip. Both bzip2 and lzip are free from this
+     flaw.
+
+
+
+File: clzip.info,  Node: Algorithm,  Next: File format,  Prev: Quality assurance,  Up: Top
+
+5 Algorithm
+***********
+
+In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a
+concrete algorithm; it is more like "any algorithm using the LZMA coding
+scheme". LZMA compression consists of describing the uncompressed data as a
+succession of coding sequences from the set shown in Section 'What is
+coded' (*note what-is-coded::), and then encoding them using a range
+encoder. For example, the option '-0' of clzip uses the scheme in almost
+the simplest way possible; issuing the longest match it can find, or a
+literal byte if it can't find a match. Conversely, a much more elaborate way
+of finding coding sequences of minimum size than the one currently used by
+clzip could be developed, and the resulting sequence could also be coded
+using the LZMA coding scheme.
+
+   Clzip currently implements two variants of the LZMA algorithm: fast
+(used by option '-0') and normal (used by all other compression levels).
+
+   The high compression of LZMA comes from combining two basic, well-proven
+compression ideas: sliding dictionaries (LZ77) and Markov models (the thing
+used by every compression algorithm that uses a range encoder or similar
+order-0 entropy coder as its last stage) with segregation of contexts
+according to what the bits are used for.
+
+   Clzip is a two stage compressor. The first stage is a Lempel-Ziv coder,
+which reduces redundancy by translating chunks of data to their
+corresponding distance-length pairs. The second stage is a range encoder
+that uses a different probability model for each type of data: distances,
+lengths, literal bytes, etc.
+
+   Here is how it works, step by step:
+
+   1) The member header is written to the output stream.
+
+   2) The first byte is coded literally, because there are no previous
+bytes to which the match finder can refer.
+
+   3) The main encoder advances to the next byte in the input data and
+calls the match finder.
+
+   4) The match finder fills an array with the minimum distances before the
+current byte where a match of a given length can be found.
+
+   5) Go back to step 3 until a sequence (formed of pairs, repeated
+distances, and literal bytes) of minimum price has been formed, where the
+price represents the number of output bits produced.
+
+   6) The range encoder encodes the sequence produced by the main encoder
+and sends the bytes produced to the output stream.
+
+   7) Go back to step 3 until the input data are finished or until the
+member or volume size limits are reached.
+
+   8) The range encoder is flushed.
+
+   9) The member trailer is written to the output stream.
+
+   10) If there are more data to compress, go back to step 1.
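+
+   The first stage can be illustrated with a toy greedy parser in Python,
+in the spirit of the "longest match or literal" strategy mentioned at the
+start of this chapter. This is only a sketch under simplifying
+assumptions: it uses a brute-force search instead of a real match finder,
+has no price optimization, no repeated distances, and no range encoder,
+and all names are hypothetical.
+
```python
def greedy_parse(data, min_len=2, max_len=273, window=1 << 16):
    """Split data into literal bytes and (distance, length) matches."""
    tokens = []
    pos = 0
    while pos < len(data):
        best_len, best_dist = 0, 0
        for cand in range(max(0, pos - window), pos):
            length = 0
            while (length < max_len and pos + length < len(data)
                   and data[cand + length] == data[pos + length]):
                length += 1
            # '>=' prefers the nearest candidate among equally long matches.
            if length > 0 and length >= best_len:
                best_len, best_dist = length, pos - cand
        if best_len >= min_len:
            tokens.append(("match", best_dist, best_len))
            pos += best_len
        else:
            tokens.append(("literal", data[pos]))
            pos += 1
    return tokens

def rebuild(tokens):
    """Invert greedy_parse by replaying literals and matches."""
    out = bytearray()
    for token in tokens:
        if token[0] == "literal":
            out.append(token[1])
        else:
            _, distance, length = token
            for _ in range(length):  # byte by byte, so matches may overlap
                out.append(out[-distance])
    return bytes(out)

sample = b"abcabcabcabcx"
print(greedy_parse(sample))
```
+
+   Note that the first byte always comes out as a literal, as in step 2,
+and that a match may overlap the position being coded.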
+
+
+   During compression, clzip reads data in large blocks (one dictionary
+size at a time). Therefore it may block for up to tens of seconds any
+process feeding data to it through a pipe. This is normal. The blocking
+intervals get longer with higher compression levels because dictionary size
+increases (and compression speed decreases) with compression level.
+
+The ideas embodied in clzip are due to (at least) the following people:
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
+definition of Markov chains), G.N.N. Martin (for the definition of range
+encoding), Igor Pavlov (for putting all the above together in LZMA), and
+Julian Seward (for bzip2's CLI).
+
+
+File: clzip.info,  Node: File format,  Next: Stream format,  Prev: Algorithm,  Up: Top
+
+6 File format
+*************
+
+Perfection is reached, not when there is no longer anything to add, but
+when there is no longer anything to take away.
+-- Antoine de Saint-Exupery
+
+
+   In the diagram below, a box like this:
+
++---+
+|   | <-- the vertical bars might be missing
++---+
+
+   represents one byte; a box like this:
+
++==============+
+|              |
++==============+
+
+   represents a variable number of bytes.
+
+
+   A lzip file consists of one or more independent "members" (compressed
+data sets). The members simply appear one after another in the file, with no
+additional information before, between, or after them. Each member can
+encode in compressed form up to 16 EiB - 1 byte of uncompressed data. The
+size of a multimember file is unlimited.
+
+   Each member has the following structure:
+
++--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+| ID string | VN | DS | LZMA stream | CRC32 |   Data size   |  Member size  |
++--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+   All multibyte values are stored in little endian order.
+
+'ID string (the "magic" bytes)'
+     A four byte string, identifying the lzip format, with the value "LZIP"
+     (0x4C, 0x5A, 0x49, 0x50).
+
+'VN (version number, 1 byte)'
+     Just in case something needs to be modified in the future. 1 for now.
+
+'DS (coded dictionary size, 1 byte)'
+     The dictionary size is calculated by taking a power of 2 (the base
+     size) and subtracting from it a fraction between 0/16 and 7/16 of the
+     base size.
+     Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).
+     Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
+     from the base size to obtain the dictionary size.
+     Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB
+     Valid values for dictionary size range from 4 KiB to 512 MiB.
+
+'LZMA stream'
+     The LZMA stream, finished by an "End Of Stream" marker. Uses default
+     values for encoder properties. *Note Stream format::, for a complete
+     description.
+
+'CRC32 (4 bytes)'
+     Cyclic Redundancy Check (CRC) of the original uncompressed data.
+
+'Data size (8 bytes)'
+     Size of the original uncompressed data.
+
+'Member size (8 bytes)'
+     Total size of the member, including header and trailer. This field acts
+     as a distributed index, improves the checking of stream integrity, and
+     facilitates the safe recovery of undamaged members from multimember
+     files. Lzip limits the member size to 2 PiB to prevent the data size
+     field from overflowing.
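+
+   The member structure described above can be read with the Python sketch
+below. It only parses the framing (header and trailer) and decodes the
+DS byte; it does not decode the LZMA stream or verify the CRC. The
+function names are hypothetical, and rejecting decoded sizes outside the
+range 4 KiB to 512 MiB is an assumption based on the valid range stated
+above.
+
```python
import struct

MIN_DICT = 1 << 12  # 4 KiB
MAX_DICT = 1 << 29  # 512 MiB

def decode_dictionary_size(ds):
    """Decode the coded dictionary size byte (DS)."""
    base_size = 1 << (ds & 0x1F)  # bits 4-0: log2 of the base size
    numerator = ds >> 5           # bits 7-5: sixteenths to subtract
    size = base_size - numerator * (base_size // 16)
    if not MIN_DICT <= size <= MAX_DICT:
        raise ValueError("invalid dictionary size")
    return size

def parse_member(member):
    """Return the header and trailer fields of one complete member."""
    if len(member) < 6 + 20:
        raise ValueError("too short to be a lzip member")
    magic, version, ds = struct.unpack_from("<4sBB", member, 0)
    if magic != b"LZIP":
        raise ValueError("bad magic bytes")
    crc32, data_size, member_size = struct.unpack_from(
        "<IQQ", member, len(member) - 20)
    return {
        "version": version,
        "dictionary_size": decode_dictionary_size(ds),
        "crc32": crc32,
        "data_size": data_size,
        "member_size": member_size,
        "size_matches": member_size == len(member),
    }

print(decode_dictionary_size(0xD3) // 1024, "KiB")
```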
+
+
+
+File: clzip.info,  Node: Stream format,  Next: Trailing data,  Prev: File format,  Up: Top
+
+7 Format of the LZMA stream in lzip files
+*****************************************
+
+The LZMA algorithm has three parameters, called "special LZMA properties",
+to adjust it for some kinds of binary data. These parameters are:
+'literal_context_bits' (with a default value of 3),
+'literal_pos_state_bits' (with a default value of 0), and 'pos_state_bits'
+(with a default value of 2). As a general purpose compressor, lzip only
+uses the default values for these parameters. In particular
+'literal_pos_state_bits' has been optimized away and does not even appear
+in the code.
+
+   Lzip finishes the LZMA stream with an "End Of Stream" (EOS) marker (the
+distance-length pair 0xFFFFFFFFU, 2), which in conjunction with the 'member
+size' field in the member trailer allows the checking of stream integrity.
+The EOS marker is the only LZMA marker allowed in lzip files. The LZMA
+stream in lzip files always has these two features (default properties and
+EOS marker) and is referred to in this document as LZMA-302eos. This
+simplified and marker-terminated form of the LZMA stream format has been
+chosen to maximize interoperability and safety.
+
+   The second stage of LZMA is a range encoder that uses a different
+probability model for each type of symbol: distances, lengths, literal
+bytes, etc. Range encoding conceptually encodes all the symbols of the
+message into one number. Unlike Huffman coding, which assigns to each
+symbol a bit-pattern and concatenates all the bit-patterns together, range
+encoding can compress one symbol to less than one bit. Therefore the
+compressed data produced by a range encoder can't be split in pieces that
+could be described individually.
+
+   It seems that the only way of describing the LZMA-302eos stream is to
+describe the algorithm that decodes it. And given the many details about
+the range decoder that need to be described accurately, the source code of
+a real decompressor seems the only appropriate reference to use.
+
+   What follows is a description of the decoding algorithm for LZMA-302eos
+streams using as reference the source code of "lzd", an educational
+decompressor for lzip files, included in appendix A. *Note Reference source
+code::. Lzd is written in C++11 and can be downloaded from the lzip download
+directory.
+
+
+7.1 What is coded
+=================
+
+The LZMA stream includes literals, matches, and repeated matches (matches
+reusing a recently used distance). There are 7 different coding sequences:
+
+Bit sequence                Name        Description
+-----------------------------------------------------------------------------
+0 + byte                    literal     literal byte
+1 + 0 + len + dis           match       distance-length pair
+1 + 1 + 0 + 0               shortrep    1 byte match at latest used distance
+1 + 1 + 0 + 1 + len         rep0        len bytes match at latest used distance
+1 + 1 + 1 + 0 + len         rep1        len bytes match at second latest used
+                                        distance
+1 + 1 + 1 + 1 + 0 + len     rep2        len bytes match at third latest used
+                                        distance
+1 + 1 + 1 + 1 + 1 + len     rep3        len bytes match at fourth latest used
+                                        distance
+
+
+   In the following tables, multibit sequences are coded in normal order,
+from most significant bit (MSB) to least significant bit (LSB), except
+where noted otherwise.
+
+   Lengths (the 'len' in the table above) are coded as follows:
+
+Bit sequence                           Description
+----------------------------------------------------------------------------
+0 + 3 bits                             lengths from 2 to 9
+1 + 0 + 3 bits                         lengths from 10 to 17
+1 + 1 + 8 bits                         lengths from 18 to 273
+
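+   The length table above can be read as a mapping from a match length to
+a pair of choice bits and a bit tree value. The following C++ sketch shows
+that mapping from the encoder's point of view. The names 'Len_code' and
+'split_len' are hypothetical; lzd only implements the decoding direction
+(see 'decode_len' in the source).
+
```cpp
#include <cassert>

// Illustrative only: how a match length (2 to 273) is split into choice
// bits and a bit tree value. These names are not part of lzd.
struct Len_code
  {
  unsigned prefix;      // choice bits read MSB first: 0, 10 (2), or 11 (3)
  int prefix_bits;      // number of choice bits (1 or 2)
  int tree_bits;        // size of the bit tree that follows (3 or 8)
  unsigned value;       // value coded in the bit tree
  };

inline Len_code split_len( const unsigned len )
  {
  assert( len >= 2 && len <= 273 );
  if( len < 10 ) return Len_code{ 0, 1, 3, len - 2 };    // 0 + 3 bits
  if( len < 18 ) return Len_code{ 2, 2, 3, len - 10 };   // 1 + 0 + 3 bits
  return Len_code{ 3, 2, 8, len - 18 };                  // 1 + 1 + 8 bits
  }
```
+
+   For example, a length of 12 is coded as the choice bits 1, 0 followed
+by the 3-bit value 2.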
+
+   The coding of distances is a little more complicated, so I'll begin by
+explaining a simpler version of the encoding.
+
+   Imagine you need to encode a number from 0 to 2^32 - 1, and you want to
+do it in a way that produces shorter codes for the smaller numbers. You may
+first encode the position of the most significant bit that is set to 1,
+which you may find by making a bit scan from the left (from the MSB). A
+position of 0 means that the number is 0 (no bit is set), 1 means the LSB is
+the first bit set (the number is 1), and 32 means the MSB is set (i.e., the
+number is >= 0x80000000). Then, if the position is >= 2, you encode the
+remaining position - 1 bits. Let's call these bits "direct bits" because
+they are coded directly by value instead of indirectly by position.
+
+   The drawback of this simple method is that it needs 6 bits to encode the
+position, but it uses just 33 of the 64 possible values, wasting almost
+half of the codes.
+
+   The clever trick of LZMA is that it encodes in what it calls a "slot"
+the position of the most significant bit set, along with the value of the
+next bit, using the same 6 bits that it would take to encode the position
+alone. This seems to need 66 slots (twice the number of positions), but for
+positions 0 and 1 there is no next bit, so the number of slots needed is 64
+(0 to 63).
+
+   The 6 bits representing this "slot number" are then context-coded. If
+the distance is >= 4, the remaining bits are encoded as follows.
+'direct_bits' is the number of remaining bits (from 1 to 30) needed to form
+a complete distance, and is calculated as (slot >> 1) - 1. If a distance
+needs 6 or more direct_bits, the last 4 bits are encoded separately. The
+last piece is context-coded in reverse order (from LSB to MSB). This piece
+is all the direct_bits for distances 4 to 127 (slots 4 to 13), or the last
+4 bits for distances >= 128 (slots 14 to 63). For distances >= 128, the
+'direct_bits - 4' part is encoded with fixed 0.5 probability.
+
+Bit sequence                           Description
+----------------------------------------------------------------------------
+slot                                   distances from 0 to 3
+slot + direct_bits                     distances from 4 to 127
+slot + (direct_bits - 4) + 4 bits      distances from 128 to 2^32 - 1
+
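+   The slot scheme described above can be sketched in C++ as follows. The
+helper names 'dis_slot', 'direct_bits', and 'slot_base' are hypothetical
+and do not appear in lzd, which only needs the decoding direction. The
+expression in 'slot_base' is the one lzd uses in 'decode_member' before
+adding the direct bits.
+
```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical helpers (not part of lzd) illustrating the slot scheme.

// Slot for a distance: twice the 0-based index of the most significant
// bit set, plus the value of the next lower bit. Distances 0 to 3 are
// their own slot.
inline int dis_slot( const uint32_t distance )
  {
  if( distance < 4 ) return (int)distance;
  int msb = 31;
  while( ( distance >> msb ) == 0 ) --msb;
  return 2 * msb + (int)( ( distance >> ( msb - 1 ) ) & 1 );
  }

// Number of bits that follow the slot: (slot >> 1) - 1, for slots >= 4.
inline int direct_bits( const int slot )
  {
  assert( slot >= 4 && slot <= 63 );
  return ( slot >> 1 ) - 1;
  }

// Smallest distance belonging to a slot >= 4.
inline uint32_t slot_base( const int slot )
  {
  return (uint32_t)( 2 | ( slot & 1 ) ) << direct_bits( slot );
  }
```
+
+   For example, distance 128 falls in slot 14, which has 6 direct bits: 2
+coded with fixed probability and 4 context-coded in reverse order.
+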
+
+7.2 The coding contexts
+=======================
+
+These contexts ('Bit_model' in the source) are integers or arrays of
+integers representing the probability of the corresponding bit being 0.
+
+   The indices used in these arrays are:
+
+'state'
+     A state machine ('State' in the source) with 12 states (0 to 11),
+     coding the latest 2 to 4 types of sequences processed. The initial
+     state is 0.
+
+'pos_state'
+     Value of the 2 least significant bits of the current position in the
+     decoded data.
+
+'literal_state'
+     Value of the 3 most significant bits of the latest byte decoded.
+
+'len_state'
+     Value of the current match length minus 2 (length - 2), capped at 3.
+     The resulting value is in the range 0 to 3.
+
+
+   The types of previous sequences corresponding to each state are shown in
+the following table. '!literal' is any sequence except a literal byte.
+'rep' is any one of 'rep0', 'rep1', 'rep2', or 'rep3'. The last type in
+each line is the most recent.
+
+State   Types of previous sequences
+------------------------------------------------------
+0       literal, literal, literal
+1       match, literal, literal
+2       rep or (!literal, shortrep), literal, literal
+3       literal, shortrep, literal, literal
+4       match, literal
+5       rep or (!literal, shortrep), literal
+6       literal, shortrep, literal
+7       literal, match
+8       literal, rep
+9       literal, shortrep
+10      !literal, match
+11      !literal, (rep or shortrep)
+
+
+   The contexts for decoding the type of coding sequence are:
+
+Name            Indices                     Used when
+----------------------------------------------------------------------------
+bm_match        state, pos_state            sequence start
+bm_rep          state                       after sequence 1
+bm_rep0         state                       after sequence 11
+bm_rep1         state                       after sequence 111
+bm_rep2         state                       after sequence 1111
+bm_len          state, pos_state            after sequence 110
+
+
+   The contexts for decoding distances are:
+
+Name            Indices                 Used when
+----------------------------------------------------------------------------
+bm_dis_slot     len_state, bit tree     distance start
+bm_dis          reverse bit tree        after slots 4 to 13
+bm_align        reverse bit tree        for distances >= 128, after fixed
+                                        probability bits
+
+
+   There are two separate sets of contexts for lengths ('Len_model' in the
+source): one for normal matches and the other for repeated matches. The
+contexts in each Len_model are (see 'decode_len' in the source):
+
+Name            Indices                        Used when
+---------------------------------------------------------------------------
+choice1         none                           length start
+choice2         none                           after sequence 1
+bm_low          pos_state, bit tree            after sequence 0
+bm_mid          pos_state, bit tree            after sequence 10
+bm_high         bit tree                       after sequence 11
+
+
+   The context array 'bm_literal' is special. In principle it acts as a
+normal bit tree context, the one selected by 'literal_state'. But if the
+previous decoded byte was not a literal, two other bit tree contexts are
+used depending on the value of each bit in 'match_byte' (the byte at the
+latest used distance), until a bit is decoded that is different from its
+corresponding bit in 'match_byte'. After the first difference is found, the
+rest of the byte is decoded using the normal bit tree context. (See
+'decode_matched' in the source).
+
+
+7.3 The range decoder
+=====================
+
+The LZMA stream is consumed one byte at a time by the range decoder. (See
+the lines marked 'normalize' in the source). Every byte consumed produces a
+variable number of decoded bits, depending on how well these bits agree
+with their context. (See 'decode_bit' in the source).
+
+   The range decoder state consists of two unsigned 32-bit variables:
+'range' (representing the most significant part of the range size not yet
+decoded) and 'code' (representing the current point within 'range').
+'range' is initialized to 2^32 - 1, and 'code' is initialized to 0.
+
+   The range encoder produces a first 0 byte that must be ignored by the
+range decoder. (See the 'Range_decoder' constructor in the source).
+
+
+7.4 Decoding and checking the LZMA stream
+=========================================
+
+After decoding the member header and obtaining the dictionary size, the
+range decoder is initialized and then the LZMA decoder enters a loop (see
+'decode_member' in the source) where it invokes the range decoder with the
+appropriate contexts to decode the different coding sequences (matches,
+repeated matches, and literal bytes), until the "End Of Stream" marker is
+decoded.
+
+   Once the "End Of Stream" marker has been decoded, the decompressor reads
+and decodes the member trailer, and checks that the three integrity factors
+stored there (CRC, data size, and member size) match those computed from the
+data.
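+
+   The three trailer fields are stored least significant byte first, as
+can be seen in the 'main' function of lzd. The following C++ sketch
+extracts them from a 20-byte trailer. The names 'Trailer_fields', 'get_le',
+and 'parse_trailer' are hypothetical and are not part of lzd, which reads
+the fields inline.
+
```cpp
#include <stdint.h>

// Hypothetical helper type (not part of lzd) holding the trailer fields.
struct Trailer_fields
  {
  uint32_t crc;             // bytes 0-3, CRC32 of the uncompressed data
  uint64_t data_size;       // bytes 4-11, size of the uncompressed data
  uint64_t member_size;     // bytes 12-19, member size
  };

// Read 'size' bytes starting at 'p' as a little-endian unsigned integer.
inline uint64_t get_le( const uint8_t * const p, const int size )
  {
  uint64_t value = 0;
  for( int i = size - 1; i >= 0; --i ) value = ( value << 8 ) + p[i];
  return value;
  }

inline Trailer_fields parse_trailer( const uint8_t trailer[20] )
  {
  Trailer_fields f;
  f.crc = (uint32_t)get_le( trailer, 4 );
  f.data_size = get_le( trailer + 4, 8 );
  f.member_size = get_le( trailer + 12, 8 );
  return f;
  }
```
+
+   A decompressor would then compare each field with the value it computed
+while decoding, and report an error if any of them differs.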
+
+
+File: clzip.info,  Node: Trailing data,  Next: Examples,  Prev: Stream format,  Up: Top
+
+8 Extra data appended to the file
+*********************************
+
+Sometimes extra data are found appended to a lzip file after the last
+member. Such trailing data may be:
+
+   * Padding added to make the file size a multiple of some block size, for
+     example when writing to a tape. It is safe to append any amount of
+     padding zero bytes to a lzip file.
+
+   * Useful data added by the user; an "End Of File" string (to check that
+     the file has not been truncated), a cryptographically secure hash, a
+     description of file contents, etc. It is safe to append any amount of
+     text to a lzip file as long as none of the first four bytes of the
+     text matches the corresponding byte in the string "LZIP", and the text
+     does not contain any zero bytes (null characters). Nonzero bytes and
+     zero bytes can't be safely mixed in trailing data.
+
+   * Garbage added by a copy operation that did not complete successfully.
+
+   * Malicious data added to the file in order to make its total size and
+     hash value (for a chosen hash) coincide with those of another file.
+
+   * In rare cases, trailing data could be the corrupt header of another
+     member. In multimember or concatenated files the probability of
+     corruption happening in the magic bytes is 5 times smaller than the
+     probability of getting a false positive caused by the corruption of the
+     integrity information itself. Therefore it can be considered to be
+     below the noise level. Additionally, the test used by clzip to
+     discriminate trailing data from a corrupt header has a Hamming
+     distance (HD) of 3, and the 3 bit flips must happen in different magic
+     bytes for the test to fail. In any case, the option '--trailing-error'
+     guarantees that any corrupt header is detected.
+
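+   The rule given above for user-added text can be expressed as a small
+check. The following C++ sketch only models that rule as stated in this
+section; it is not the test that clzip itself applies, and the name
+'is_safe_trailing_text' is hypothetical. Padding made only of zero bytes is
+a separate case and is not covered here.
+
```cpp
#include <cstddef>
#include <string>

// Hypothetical check (not part of clzip): does 'text' follow the rule for
// user-added trailing text? None of its first four bytes may match the
// corresponding byte of "LZIP", and it must not contain zero bytes.
inline bool is_safe_trailing_text( const std::string & text )
  {
  const char magic[4] = { 'L', 'Z', 'I', 'P' };
  for( std::size_t i = 0; i < text.size(); ++i )
    {
    if( text[i] == 0 ) return false;                    // zero byte
    if( i < 4 && text[i] == magic[i] ) return false;    // matches magic
    }
  return true;
  }
```
+
+   Under this rule a string such as "END OF FILE" is acceptable, while any
+text starting with 'L' is not.
+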
+   Trailing data are in no way part of the lzip file format, but tools
+reading lzip files are expected to behave as correctly and usefully as
+possible in the presence of trailing data.
+
+   Trailing data can be safely ignored in most cases. In some cases, like
+that of user-added data, they are expected to be ignored. In those cases
+where a file containing trailing data must be rejected, the option
+'--trailing-error' can be used. *Note --trailing-error::.
+
+
+File: clzip.info,  Node: Examples,  Next: Problems,  Prev: Trailing data,  Up: Top
+
+9 A small tutorial with examples
+********************************
+
+WARNING! Even if clzip is bug-free, other causes may result in a corrupt
+compressed file (bugs in the system libraries, memory errors, etc).
+Therefore, if the data you are going to compress are important, give the
+option '--keep' to clzip and don't remove the original file until you check
+the compressed file with a command like 'clzip -cd file.lz | cmp file -'.
+Most RAM errors happening during compression can only be detected by
+comparing the compressed file with the original because the corruption
+happens before clzip compresses the RAM contents, resulting in a valid
+compressed file containing wrong data.
+
+
+Example 1: Extract all the files from archive 'foo.tar.lz'.
+
+       tar -xf foo.tar.lz
+     or
+       clzip -cd foo.tar.lz | tar -xf -
+
+
+Example 2: Replace a regular file with its compressed version 'file.lz' and
+show the compression ratio.
+
+     clzip -v file
+
+
+Example 3: Like example 2 but the created 'file.lz' is multimember with a
+member size of 1 MiB. The compression ratio is not shown.
+
+     clzip -b 1MiB file
+
+
+Example 4: Restore a regular file from its compressed version 'file.lz'. If
+the operation is successful, 'file.lz' is removed.
+
+     clzip -d file.lz
+
+
+Example 5: Check the integrity of the compressed file 'file.lz' and show
+status.
+
+     clzip -tv file.lz
+
+
+Example 6: The right way of concatenating the decompressed output of two or
+more compressed files. *Note Trailing data::.
+
+     Don't do this
+       cat file1.lz file2.lz file3.lz | clzip -d -
+     Do this instead
+       clzip -cd file1.lz file2.lz file3.lz
+
+
+Example 7: Decompress 'file.lz' partially until 10 KiB of decompressed data
+are produced.
+
+     clzip -cd file.lz | dd bs=1024 count=10
+
+
+Example 8: Decompress 'file.lz' partially from decompressed byte at offset
+10000 to decompressed byte at offset 14999 (5000 bytes are produced).
+
+     clzip -cd file.lz | dd bs=1000 skip=10 count=5
+
+
+Example 9: Compress a whole device in /dev/sdc and send the output to
+'file.lz'.
+
+       clzip -c /dev/sdc > file.lz
+     or
+       clzip /dev/sdc -o file.lz
+
+
+Example 10: Create a multivolume compressed tar archive with a volume size
+of 1440 KiB.
+
+     tar -c some_directory | clzip -S 1440KiB -o volume_name -
+
+
+Example 11: Extract a multivolume compressed tar archive.
+
+     clzip -cd volume_name*.lz | tar -xf -
+
+
+Example 12: Create a multivolume compressed backup of a large database file
+with a volume size of 650 MB, where each volume is a multimember file with
+a member size of 32 MiB.
+
+     clzip -b 32MiB -S 650MB big_db
+
+
+File: clzip.info,  Node: Problems,  Next: Reference source code,  Prev: Examples,  Up: Top
+
+10 Reporting bugs
+*****************
+
+There are probably bugs in clzip. There are certainly errors and omissions
+in this manual. If you report them, they will get fixed. If you don't, no
+one will ever know about them and they will remain unfixed for all
+eternity, if not longer.
+
+   If you find a bug in clzip, please send electronic mail to
+<lzip-bug@nongnu.org>. Include the version number, which you can find by
+running 'clzip --version'.
+
+
+File: clzip.info,  Node: Reference source code,  Next: Concept index,  Prev: Problems,  Up: Top
+
+Appendix A Reference source code
+********************************
+
+/* Lzd - Educational decompressor for the lzip format
+   Copyright (C) 2013-2024 Antonio Diaz Diaz.
+
+   This program is free software. Redistribution and use in source and
+   binary forms, with or without modification, are permitted provided
+   that the following conditions are met:
+
+   1. Redistributions of source code must retain the above copyright
+   notice, this list of conditions, and the following disclaimer.
+
+   2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions, and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+/*
+   Exit status: 0 for a normal exit, 1 for environmental problems
+   (file not found, invalid command-line options, I/O errors, etc), 2 to
+   indicate a corrupt or invalid input file.
+*/
+
+#include <algorithm>
+#include <cerrno>
+#include <cstdio>
+#include <cstdlib>
+#include <cstring>
+#include <stdint.h>
+#include <unistd.h>
+#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__
+#include <fcntl.h>
+#include <io.h>
+#endif
+
+
+class State
+  {
+  int st;
+
+public:
+  enum { states = 12 };
+  State() : st( 0 ) {}
+  int operator()() const { return st; }
+  bool is_char() const { return st < 7; }
+
+  void set_char()
+    {
+    const int next[states] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 };
+    st = next[st];
+    }
+  void set_match()     { st = ( st < 7 ) ? 7 : 10; }
+  void set_rep()       { st = ( st < 7 ) ? 8 : 11; }
+  void set_short_rep() { st = ( st < 7 ) ? 9 : 11; }
+  };
+
+
+enum {
+  min_dictionary_size = 1 << 12,
+  max_dictionary_size = 1 << 29,
+  literal_context_bits = 3,
+  literal_pos_state_bits = 0,				// not used
+  pos_state_bits = 2,
+  pos_states = 1 << pos_state_bits,
+  pos_state_mask = pos_states - 1,
+
+  len_states = 4,
+  dis_slot_bits = 6,
+  start_dis_model = 4,
+  end_dis_model = 14,
+  modeled_distances = 1 << ( end_dis_model / 2 ),	// 128
+  dis_align_bits = 4,
+  dis_align_size = 1 << dis_align_bits,
+
+  len_low_bits = 3,
+  len_mid_bits = 3,
+  len_high_bits = 8,
+  len_low_symbols = 1 << len_low_bits,
+  len_mid_symbols = 1 << len_mid_bits,
+  len_high_symbols = 1 << len_high_bits,
+  max_len_symbols = len_low_symbols + len_mid_symbols + len_high_symbols,
+
+  min_match_len = 2,					// must be 2
+
+  bit_model_move_bits = 5,
+  bit_model_total_bits = 11,
+  bit_model_total = 1 << bit_model_total_bits };
+
+struct Bit_model
+  {
+  int probability;
+  Bit_model() : probability( bit_model_total / 2 ) {}
+  };
+
+struct Len_model
+  {
+  Bit_model choice1;
+  Bit_model choice2;
+  Bit_model bm_low[pos_states][len_low_symbols];
+  Bit_model bm_mid[pos_states][len_mid_symbols];
+  Bit_model bm_high[len_high_symbols];
+  };
+
+
+class CRC32
+  {
+  uint32_t data[256];		// Table of CRCs of all 8-bit messages.
+
+public:
+  CRC32()
+    {
+    for( unsigned n = 0; n < 256; ++n )
+      {
+      unsigned c = n;
+      for( int k = 0; k < 8; ++k )
+        { if( c & 1 ) c = 0xEDB88320U ^ ( c >> 1 ); else c >>= 1; }
+      data[n] = c;
+      }
+    }
+
+  void update_buf( uint32_t & crc, const uint8_t * const buffer,
+                   const int size ) const
+    {
+    for( int i = 0; i < size; ++i )
+      crc = data[(crc^buffer[i])&0xFF] ^ ( crc >> 8 );
+    }
+  };
+
+const CRC32 crc32;
+
+
+enum { header_size = 6, trailer_size = 20 };
+typedef uint8_t Lzip_header[header_size]; // 0-3 magic bytes
+					  //   4 version
+					  //   5 coded dictionary size
+typedef uint8_t Lzip_trailer[trailer_size];
+			//  0-3  CRC32 of the uncompressed data
+			//  4-11 size of the uncompressed data
+			// 12-19 member size including header and trailer
+
+class Range_decoder
+  {
+  unsigned long long member_pos;
+  uint32_t code;
+  uint32_t range;
+
+public:
+  Range_decoder()
+    : member_pos( header_size ), code( 0 ), range( 0xFFFFFFFFU )
+    {
+    get_byte();			// discard first byte of the LZMA stream
+    for( int i = 0; i < 4; ++i ) code = ( code << 8 ) | get_byte();
+    }
+
+  uint8_t get_byte() { ++member_pos; return std::getc( stdin ); }
+  unsigned long long member_position() const { return member_pos; }
+
+  unsigned decode( const int num_bits )
+    {
+    unsigned symbol = 0;
+    for( int i = num_bits; i > 0; --i )
+      {
+      range >>= 1;
+      symbol <<= 1;
+      if( code >= range ) { code -= range; symbol |= 1; }
+      if( range <= 0x00FFFFFFU )			// normalize
+        { range <<= 8; code = ( code << 8 ) | get_byte(); }
+      }
+    return symbol;
+    }
+
+  bool decode_bit( Bit_model & bm )
+    {
+    bool symbol;
+    const uint32_t bound = ( range >> bit_model_total_bits ) * bm.probability;
+    if( code < bound )
+      {
+      range = bound;
+      bm.probability +=
+        ( bit_model_total - bm.probability ) >> bit_model_move_bits;
+      symbol = 0;
+      }
+    else
+      {
+      code -= bound;
+      range -= bound;
+      bm.probability -= bm.probability >> bit_model_move_bits;
+      symbol = 1;
+      }
+    if( range <= 0x00FFFFFFU )				// normalize
+      { range <<= 8; code = ( code << 8 ) | get_byte(); }
+    return symbol;
+    }
+
+  unsigned decode_tree( Bit_model bm[], const int num_bits )
+    {
+    unsigned symbol = 1;
+    for( int i = 0; i < num_bits; ++i )
+      symbol = ( symbol << 1 ) | decode_bit( bm[symbol] );
+    return symbol - ( 1 << num_bits );
+    }
+
+  unsigned decode_tree_reversed( Bit_model bm[], const int num_bits )
+    {
+    unsigned symbol = decode_tree( bm, num_bits );
+    unsigned reversed_symbol = 0;
+    for( int i = 0; i < num_bits; ++i )
+      {
+      reversed_symbol = ( reversed_symbol << 1 ) | ( symbol & 1 );
+      symbol >>= 1;
+      }
+    return reversed_symbol;
+    }
+
+  unsigned decode_matched( Bit_model bm[], const unsigned match_byte )
+    {
+    unsigned symbol = 1;
+    for( int i = 7; i >= 0; --i )
+      {
+      const bool match_bit = ( match_byte >> i ) & 1;
+      const bool bit = decode_bit( bm[symbol+(match_bit<<8)+0x100] );
+      symbol = ( symbol << 1 ) | bit;
+      if( match_bit != bit )
+        {
+        while( symbol < 0x100 )
+          symbol = ( symbol << 1 ) | decode_bit( bm[symbol] );
+        break;
+        }
+      }
+    return symbol & 0xFF;
+    }
+
+  unsigned decode_len( Len_model & lm, const int pos_state )
+    {
+    if( decode_bit( lm.choice1 ) == 0 )
+      return min_match_len +
+             decode_tree( lm.bm_low[pos_state], len_low_bits );
+    if( decode_bit( lm.choice2 ) == 0 )
+      return min_match_len + len_low_symbols +
+             decode_tree( lm.bm_mid[pos_state], len_mid_bits );
+    return min_match_len + len_low_symbols + len_mid_symbols +
+           decode_tree( lm.bm_high, len_high_bits );
+    }
+  };
+
+
+class LZ_decoder
+  {
+  unsigned long long partial_data_pos;
+  Range_decoder rdec;
+  const unsigned dictionary_size;
+  uint8_t * const buffer;	// output buffer
+  unsigned pos;			// current pos in buffer
+  unsigned stream_pos;		// first byte not yet written to stdout
+  uint32_t crc_;
+  bool pos_wrapped;
+
+  void flush_data();
+
+  uint8_t peek( const unsigned distance ) const
+    {
+    if( pos > distance ) return buffer[pos - distance - 1];
+    if( pos_wrapped ) return buffer[dictionary_size + pos - distance - 1];
+    return 0;			// prev_byte of first byte
+    }
+
+  void put_byte( const uint8_t b )
+    {
+    buffer[pos] = b;
+    if( ++pos >= dictionary_size ) flush_data();
+    }
+
+public:
+  explicit LZ_decoder( const unsigned dict_size )
+    :
+    partial_data_pos( 0 ),
+    dictionary_size( dict_size ),
+    buffer( new uint8_t[dictionary_size] ),
+    pos( 0 ),
+    stream_pos( 0 ),
+    crc_( 0xFFFFFFFFU ),
+    pos_wrapped( false )
+    {}
+
+  ~LZ_decoder() { delete[] buffer; }
+
+  unsigned crc() const { return crc_ ^ 0xFFFFFFFFU; }
+  unsigned long long data_position() const
+    { return partial_data_pos + pos; }
+  uint8_t get_byte() { return rdec.get_byte(); }
+  unsigned long long member_position() const
+    { return rdec.member_position(); }
+
+  bool decode_member();
+  };
+
+
+void LZ_decoder::flush_data()
+  {
+  if( pos > stream_pos )
+    {
+    const unsigned size = pos - stream_pos;
+    crc32.update_buf( crc_, buffer + stream_pos, size );
+    if( std::fwrite( buffer + stream_pos, 1, size, stdout ) != size )
+      { std::fprintf( stderr, "Write error: %s\n", std::strerror( errno ) );
+        std::exit( 1 ); }
+    if( pos >= dictionary_size )
+      { partial_data_pos += pos; pos = 0; pos_wrapped = true; }
+    stream_pos = pos;
+    }
+  }
+
+
+bool LZ_decoder::decode_member()	// Return false if error
+  {
+  Bit_model bm_literal[1<<literal_context_bits][0x300];
+  Bit_model bm_match[State::states][pos_states];
+  Bit_model bm_rep[State::states];
+  Bit_model bm_rep0[State::states];
+  Bit_model bm_rep1[State::states];
+  Bit_model bm_rep2[State::states];
+  Bit_model bm_len[State::states][pos_states];
+  Bit_model bm_dis_slot[len_states][1<<dis_slot_bits];
+  Bit_model bm_dis[modeled_distances-end_dis_model+1];
+  Bit_model bm_align[dis_align_size];
+  Len_model match_len_model;
+  Len_model rep_len_model;
+  unsigned rep0 = 0;		// rep[0-3] latest four distances
+  unsigned rep1 = 0;		// used for efficient coding of
+  unsigned rep2 = 0;		// repeated distances
+  unsigned rep3 = 0;
+  State state;
+
+  while( !std::feof( stdin ) && !std::ferror( stdin ) )
+    {
+    const int pos_state = data_position() & pos_state_mask;
+    if( rdec.decode_bit( bm_match[state()][pos_state] ) == 0 )	// 1st bit
+      {
+      // literal byte
+      const uint8_t prev_byte = peek( 0 );
+      const int literal_state = prev_byte >> ( 8 - literal_context_bits );
+      Bit_model * const bm = bm_literal[literal_state];
+      if( state.is_char() )
+        put_byte( rdec.decode_tree( bm, 8 ) );
+      else
+        put_byte( rdec.decode_matched( bm, peek( rep0 ) ) );
+      state.set_char();
+      continue;
+      }
+    // match or repeated match
+    int len;
+    if( rdec.decode_bit( bm_rep[state()] ) != 0 )		// 2nd bit
+      {
+      if( rdec.decode_bit( bm_rep0[state()] ) == 0 )		// 3rd bit
+        {
+        if( rdec.decode_bit( bm_len[state()][pos_state] ) == 0 ) // 4th bit
+          { state.set_short_rep(); put_byte( peek( rep0 ) ); continue; }
+        }
+      else
+        {
+        unsigned distance;
+        if( rdec.decode_bit( bm_rep1[state()] ) == 0 )		// 4th bit
+          distance = rep1;
+        else
+          {
+          if( rdec.decode_bit( bm_rep2[state()] ) == 0 )	// 5th bit
+            distance = rep2;
+          else
+            { distance = rep3; rep3 = rep2; }
+          rep2 = rep1;
+          }
+        rep1 = rep0;
+        rep0 = distance;
+        }
+      state.set_rep();
+      len = rdec.decode_len( rep_len_model, pos_state );
+      }
+    else					// match
+      {
+      rep3 = rep2; rep2 = rep1; rep1 = rep0;
+      len = rdec.decode_len( match_len_model, pos_state );
+      const int len_state = std::min( len - min_match_len, len_states - 1 );
+      rep0 = rdec.decode_tree( bm_dis_slot[len_state], dis_slot_bits );
+      if( rep0 >= start_dis_model )
+        {
+        const unsigned dis_slot = rep0;
+        const int direct_bits = ( dis_slot >> 1 ) - 1;
+        rep0 = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
+        if( dis_slot < end_dis_model )
+          rep0 += rdec.decode_tree_reversed( bm_dis + ( rep0 - dis_slot ),
+                                             direct_bits );
+        else
+          {
+          rep0 +=
+            rdec.decode( direct_bits - dis_align_bits ) << dis_align_bits;
+          rep0 += rdec.decode_tree_reversed( bm_align, dis_align_bits );
+          if( rep0 == 0xFFFFFFFFU )		// marker found
+            {
+            flush_data();
+            return len == min_match_len;	// End Of Stream marker
+            }
+          }
+        }
+      state.set_match();
+      if( rep0 >= dictionary_size || ( rep0 >= pos && !pos_wrapped ) )
+        { flush_data(); return false; }
+      }
+    for( int i = 0; i < len; ++i ) put_byte( peek( rep0 ) );
+    }
+  flush_data();
+  return false;
+  }
+
+
+int main( const int argc, const char * const argv[] )
+  {
+  if( argc > 2 || ( argc == 2 && std::strcmp( argv[1], "-d" ) != 0 ) )
+    {
+    std::printf(
+      "Lzd %s - Educational decompressor for the lzip format.\n"
+      "Study the source code to learn how a lzip decompressor works.\n"
+      "See the lzip manual for an explanation of the code.\n"
+      "\nUsage: %s [-d] < file.lz > file\n"
+      "Lzd decompresses from standard input to standard output.\n"
+      "\nCopyright (C) 2024 Antonio Diaz Diaz.\n"
+      "License 2-clause BSD.\n"
+      "This is free software: you are free to change and redistribute it.\n"
+      "There is NO WARRANTY, to the extent permitted by law.\n"
+      "Report bugs to lzip-bug@nongnu.org\n"
+      "Lzd home page: http://www.nongnu.org/lzip/lzd.html\n",
+      PROGVERSION, argv[0] );
+    return 0;
+    }
+
+#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__
+  setmode( STDIN_FILENO, O_BINARY );
+  setmode( STDOUT_FILENO, O_BINARY );
+#endif
+
+  for( bool first_member = true; ; first_member = false )
+    {
+    Lzip_header header;				// check header
+    for( int i = 0; i < header_size; ++i ) header[i] = std::getc( stdin );
+    if( std::feof( stdin ) || std::memcmp( header, "LZIP\x01", 5 ) != 0 )
+      {
+      if( first_member )
+        { std::fputs( "Bad magic number (file not in lzip format).\n",
+                      stderr ); return 2; }
+      break;					// ignore trailing data
+      }
+    unsigned dict_size = 1 << ( header[5] & 0x1F );
+    dict_size -= ( dict_size / 16 ) * ( ( header[5] >> 5 ) & 7 );
+    if( dict_size < min_dictionary_size || dict_size > max_dictionary_size )
+      { std::fputs( "Invalid dictionary size in member header.\n", stderr );
+        return 2; }
+
+    LZ_decoder decoder( dict_size );		// decode LZMA stream
+    if( !decoder.decode_member() )
+      { std::fputs( "Data error\n", stderr ); return 2; }
+
+    Lzip_trailer trailer;			// check trailer
+    for( int i = 0; i < trailer_size; ++i ) trailer[i] = decoder.get_byte();
+    int retval = 0;
+    unsigned crc = 0;
+    for( int i = 3; i >= 0; --i ) crc = ( crc << 8 ) + trailer[i];
+    if( crc != decoder.crc() )
+      { std::fputs( "CRC mismatch\n", stderr ); retval = 2; }
+
+    unsigned long long data_size = 0;
+    for( int i = 11; i >= 4; --i )
+      data_size = ( data_size << 8 ) + trailer[i];
+    if( data_size != decoder.data_position() )
+      { std::fputs( "Data size mismatch\n", stderr ); retval = 2; }
+
+    unsigned long long member_size = 0;
+    for( int i = 19; i >= 12; --i )
+      member_size = ( member_size << 8 ) + trailer[i];
+    if( member_size != decoder.member_position() )
+      { std::fputs( "Member size mismatch\n", stderr ); retval = 2; }
+    if( retval ) return retval;
+    }
+
+  if( std::fclose( stdout ) != 0 )
+    { std::fprintf( stderr, "Error closing stdout: %s\n",
+                    std::strerror( errno ) ); return 1; }
+  return 0;
+  }
+
+
+File: clzip.info,  Node: Concept index,  Prev: Reference source code,  Up: Top
+
+Concept index
+*************
+
+
+* Menu:
+
+* algorithm:                             Algorithm.                 (line 6)
+* bugs:                                  Problems.                  (line 6)
+* examples:                              Examples.                  (line 6)
+* file format:                           File format.               (line 6)
+* format of the LZMA stream:             Stream format.             (line 6)
+* getting help:                          Problems.                  (line 6)
+* introduction:                          Introduction.              (line 6)
+* invoking:                              Invoking clzip.            (line 6)
+* options:                               Invoking clzip.            (line 6)
+* output:                                Output.                    (line 6)
+* quality assurance:                     Quality assurance.         (line 6)
+* reference source code:                 Reference source code.     (line 6)
+* trailing data:                         Trailing data.             (line 6)
+* usage:                                 Invoking clzip.            (line 6)
+* version:                               Invoking clzip.            (line 6)
+
+
+
+Tag Table:
+Node: Top205
+Node: Introduction1207
+Node: Output7331
+Node: Invoking clzip8934
+Ref: --trailing-error9812
+Node: Quality assurance19918
+Node: Algorithm28733
+Node: File format32141
+Ref: coded-dict-size33571
+Node: Stream format34802
+Ref: what-is-coded37198
+Node: Trailing data46072
+Node: Examples48410
+Ref: concat-example49860
+Node: Problems51090
+Node: Reference source code51626
+Node: Concept index66672
+
+End Tag Table
+
+
+Local Variables:
+coding: iso-8859-15
+End:
diff --git a/doc/clzip.texi b/doc/clzip.texi
new file mode 100644
index 0000000..c98e026
--- /dev/null
+++ b/doc/clzip.texi
@@ -0,0 +1,1805 @@
+\input texinfo @c -*-texinfo-*-
+@c %**start of header
+@setfilename clzip.info
+@documentencoding ISO-8859-15
+@settitle Clzip Manual
+@finalout
+@c %**end of header
+
+@set UPDATED 22 January 2024
+@set VERSION 1.14
+
+@dircategory Compression
+@direntry
+* Clzip: (clzip).               LZMA lossless data compressor
+@end direntry
+
+
+@ifnothtml
+@titlepage
+@title Clzip
+@subtitle LZMA lossless data compressor
+@subtitle for Clzip version @value{VERSION}, @value{UPDATED}
+@author by Antonio Diaz Diaz
+
+@page
+@vskip 0pt plus 1filll
+@end titlepage
+
+@contents
+@end ifnothtml
+
+@ifnottex
+@node Top
+@top
+
+This manual is for Clzip (version @value{VERSION}, @value{UPDATED}).
+
+@menu
+* Introduction::           Purpose and features of clzip
+* Output::                 Meaning of clzip's output
+* Invoking clzip::         Command-line interface
+* Quality assurance::      Design, development, and testing of lzip
+* Algorithm::              How clzip compresses the data
+* File format::            Detailed format of the compressed file
+* Stream format::          Format of the LZMA stream in lzip files
+* Trailing data::          Extra data appended to the file
+* Examples::               A small tutorial with examples
+* Problems::               Reporting bugs
+* Reference source code::  Source code illustrating stream format
+* Concept index::          Index of concepts
+@end menu
+
+@sp 1
+Copyright @copyright{} 2010-2024 Antonio Diaz Diaz.
+
+This manual is free documentation: you have unlimited permission to copy,
+distribute, and modify it.
+@end ifnottex
+
+
+@node Introduction
+@chapter Introduction
+@cindex introduction
+
+@uref{http://www.nongnu.org/lzip/clzip.html,,Clzip}
+is a C language version of lzip, compatible with @w{lzip 1.4} or newer.
+As clzip is written in C, it may be easier to integrate in applications like
+package managers, embedded devices, or systems lacking a C++ compiler.
+
+@uref{http://www.nongnu.org/lzip/lzip.html,,Lzip}
+is a lossless data compressor with a user interface similar to that
+of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov
+chain-Algorithm' (LZMA) stream format to maximize interoperability. The
+maximum dictionary size is 512 MiB so that any lzip file can be decompressed
+on 32-bit machines. Lzip provides accurate and robust 3-factor integrity
+checking. Lzip can compress about as fast as gzip @w{(lzip -0)} or compress most
+files more than bzip2 @w{(lzip -9)}. Decompression speed is intermediate between
+gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery
+perspective. Lzip has been designed, written, and tested with great care to
+replace gzip and bzip2 as the standard general-purpose compressed format for
+Unix-like systems.
+
+For compressing/decompressing large files on multiprocessor machines
+@uref{http://www.nongnu.org/lzip/manual/plzip_manual.html,,plzip} can be
+much faster than lzip at the cost of a slightly reduced compression ratio.
+@ifnothtml
+@xref{Top,plzip manual,,plzip}.
+@end ifnothtml
+
+For creation and manipulation of compressed tar archives
+@uref{http://www.nongnu.org/lzip/manual/tarlz_manual.html,,tarlz} can be more
+efficient than using tar and plzip because tarlz is able to keep the
+alignment between tar members and lzip members.
+@ifnothtml
+@xref{Top,tarlz manual,,tarlz}.
+@end ifnothtml
+
+The lzip file format is designed for data sharing and long-term archiving,
+taking into account both data integrity and decoder availability:
+
+@itemize @bullet
+@item
+The lzip format provides very safe integrity checking and some data
+recovery means. The program
+@uref{http://www.nongnu.org/lzip/manual/lziprecover_manual.html#Data-safety,,lziprecover}
+can repair bit flip errors (one of the most common forms of data corruption)
+in lzip files, and provides data recovery capabilities, including
+error-checked merging of damaged copies of a file.
+@ifnothtml
+@xref{Data safety,,,lziprecover}.
+@end ifnothtml
+
+@item
+The lzip format is as simple as possible (but not simpler). The lzip
+manual provides the source code of a simple decompressor along with a
+detailed explanation of how it works, so that with only the help of the
+lzip manual it would be possible for a digital archaeologist to extract
+the data from a lzip file long after quantum computers eventually
+render LZMA obsolete.
+
+@item
+Additionally the lzip reference implementation is copylefted, which
+guarantees that it will remain free forever.
+@end itemize
+
+A nice feature of the lzip format is that a corrupt byte is easier to repair
+the nearer it is to the beginning of the file. Therefore, with the help of
+lziprecover, losing an entire archive just because of a corrupt byte near
+the beginning is a thing of the past.
+
+The member trailer stores the 32-bit CRC of the original data, the size of
+the original data, and the size of the member. These values, together with
+the "End Of Stream" marker, provide 3-factor integrity checking which
+guarantees that the decompressed version of the data is identical to the
+original. This guards against corruption of the compressed data, and against
+undetected bugs in clzip (hopefully very unlikely). The chances of data
+corruption going undetected are microscopic. Be aware, though, that the
+check occurs upon decompression, so it can only tell you that something is
+wrong. It can't help you recover the original uncompressed data.
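As a rough illustration of the three checks, the sketch below decodes a 20-byte member trailer and reports which factors disagree with the values computed during decompression. The field layout (little-endian CRC32 in bytes 0 to 3, data size in bytes 4 to 11, member size in bytes 12 to 19) follows the reference decoder listed in this manual. The names get_le and check_trailer, and the bit-mask return value, are hypothetical and are not part of clzip.

```c
#include <assert.h>
#include <stdint.h>

enum { trailer_size = 20 };  /* CRC32 (4) + data size (8) + member size (8) */

/* Read an n-byte little-endian unsigned integer (1 <= n <= 8). */
static uint64_t get_le(const uint8_t *p, int n)
{
  assert(n >= 1 && n <= 8);
  uint64_t value = 0;
  for (int i = n - 1; i >= 0; --i) value = (value << 8) | p[i];
  return value;
}

/* Compare the trailer with the values computed while decoding.
   Returns a bit mask of the mismatching factors (hypothetical encoding):
   1 = CRC, 2 = data size, 4 = member size; 0 means all three match. */
int check_trailer(const uint8_t trailer[trailer_size], uint32_t crc,
                  uint64_t data_size, uint64_t member_size)
{
  int mismatches = 0;
  if (get_le(trailer, 4) != crc) mismatches |= 1;
  if (get_le(trailer + 4, 8) != data_size) mismatches |= 2;
  if (get_le(trailer + 12, 8) != member_size) mismatches |= 4;
  return mismatches;
}
```

Reporting each factor separately, as the mask does here, lets a caller tell a damaged trailer from damaged data: if only one factor mismatches, the data are probably intact.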
+
+Clzip uses the same well-defined exit status values used by bzip2, which
+makes it safer than compressors returning ambiguous warning values (like
+gzip) when it is used as a back end for other programs like tar or zutils.
+
+Clzip automatically uses for each file the largest dictionary size that
+exceeds neither the file size nor the limit given. Keep in mind that the
+decompression memory requirement is affected at compression time by the
+choice of dictionary size limit.
+
+The amount of memory required for compression is about 1 or 2 times the
+dictionary size limit (1 if input file size is less than dictionary size
+limit, else 2) plus 9 times the dictionary size really used. The option
+@option{-0} is special and only requires about @w{1.5 MiB} at most. The
+amount of memory required for decompression is about @w{46 kB} larger
+than the dictionary size really used.
+
+When compressing, clzip replaces every file given in the command line
+with a compressed version of itself, with the name "original_name.lz".
+When decompressing, clzip attempts to guess the name for the decompressed
+file from that of the compressed file as follows:
+
+@multitable {anyothername} {becomes} {anyothername.out}
+@item filename.lz  @tab becomes @tab filename
+@item filename.tlz @tab becomes @tab filename.tar
+@item anyothername @tab becomes @tab anyothername.out
+@end multitable
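The renaming rules in the table above can be sketched as follows. This is only an illustration of the three rules, not the code clzip uses. The function name guess_output_name is hypothetical, and the choice to treat a name consisting only of the suffix (such as ".lz") as "any other name" is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the guessed name of the decompressed file into 'out'.
   Returns 0 on success, or -1 if 'out' is too small. */
int guess_output_name(const char *name, char *out, size_t out_size)
{
  assert(name != NULL && out != NULL);
  const size_t len = strlen(name);
  int n;
  if (len > 3 && strcmp(name + len - 3, ".lz") == 0)
    n = snprintf(out, out_size, "%.*s", (int)(len - 3), name);      /* drop .lz */
  else if (len > 4 && strcmp(name + len - 4, ".tlz") == 0)
    n = snprintf(out, out_size, "%.*s.tar", (int)(len - 4), name);  /* .tlz -> .tar */
  else
    n = snprintf(out, out_size, "%s.out", name);                    /* append .out */
  return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

For example, calling the function with "filename.tlz" stores "filename.tar" in the output buffer.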
+
+(De)compressing a file is much like copying or moving it. Therefore clzip
+preserves the access and modification dates, permissions, and, if you have
+appropriate privileges, ownership of the file just as @w{@samp{cp -p}} does.
+(If the user ID or the group ID can't be duplicated, the file permission
+bits S_ISUID and S_ISGID are cleared).
+
+Clzip is able to read from some types of non-regular files if either the
+option @option{-c} or the option @option{-o} is specified.
+
+Clzip refuses to read compressed data from a terminal or write compressed
+data to a terminal, as this would be entirely incomprehensible and might
+leave the terminal in an abnormal state.
+
+Clzip correctly decompresses a file which is the concatenation of two or
+more compressed files. The result is the concatenation of the corresponding
+decompressed files. Integrity testing of concatenated compressed files is
+also supported.
+
+Clzip can produce multimember files, and lziprecover can safely recover the
+undamaged members in case of file damage. Clzip can also split the compressed
+output in volumes of a given size, even when reading from standard input.
+This allows the direct creation of multivolume compressed tar archives.
+
+Clzip is able to compress and decompress streams of unlimited size by
+automatically creating multimember output. The members so created are large,
+about @w{2 PiB} each.
+
+
+@node Output
+@chapter Meaning of clzip's output
+@cindex output
+
+The output of clzip looks like this:
+
+@example
+clzip -v foo
+  foo:  6.676:1, 14.98% ratio, 85.02% saved, 450560 in, 67493 out.
+
+clzip -tvvv foo.lz
+  foo.lz:  6.676:1, 14.98% ratio, 85.02% saved.  450560 out,  67493 in. ok
+@end example
+
+The meaning of each field is as follows:
+
+@table @code
+@item N:1
+The compression ratio @w{(uncompressed_size / compressed_size)}, shown as
+@w{N to 1}.
+
+@item ratio
+The inverse compression ratio @w{(compressed_size / uncompressed_size)},
+shown as a percentage. A decimal ratio is easily obtained by moving the
+decimal point two places to the left; @w{14.98% = 0.1498}.
+
+@item saved
+The space saved by compression @w{(1 - ratio)}, shown as a percentage.
+
+@item in
+Size of the input data. This is the uncompressed size when compressing, or
+the compressed size when decompressing or testing. Note that clzip always
+prints the uncompressed size before the compressed size when compressing,
+decompressing, testing, or listing.
+
+@item out
+Size of the output data. This is the compressed size when compressing, or
+the decompressed size when decompressing or testing.
+
+@end table
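The first three fields can be reproduced from the two sizes. The sketch below applies the formulas given in the table to the sizes of the example above (450560 bytes uncompressed, 67493 bytes compressed). The names ratio_fields, compute_ratio, and format_ratio are hypothetical, and returning zeros for an empty input is an assumption, not a description of what clzip prints in that case.

```c
#include <assert.h>
#include <stdio.h>

struct ratio_fields {
  double n_to_1;     /* uncompressed_size / compressed_size */
  double ratio_pct;  /* compressed_size / uncompressed_size, as a percentage */
  double saved_pct;  /* 1 - ratio, as a percentage */
};

struct ratio_fields compute_ratio(unsigned long long uncompressed,
                                  unsigned long long compressed)
{
  struct ratio_fields r = { 0.0, 0.0, 0.0 };
  if (uncompressed > 0 && compressed > 0) {  /* avoid division by zero */
    r.n_to_1 = (double)uncompressed / (double)compressed;
    r.ratio_pct = 100.0 * (double)compressed / (double)uncompressed;
    r.saved_pct = 100.0 - r.ratio_pct;
  }
  return r;
}

/* Format the three fields in the style of the example output. */
int format_ratio(char *buf, size_t size, unsigned long long uncompressed,
                 unsigned long long compressed)
{
  assert(buf != NULL && size > 0);
  const struct ratio_fields r = compute_ratio(uncompressed, compressed);
  return snprintf(buf, size, "%.3f:1, %.2f%% ratio, %.2f%% saved",
                  r.n_to_1, r.ratio_pct, r.saved_pct);
}
```

With the sizes from the example, format_ratio produces "6.676:1, 14.98% ratio, 85.02% saved".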
+
+When decompressing or testing at verbosity level 4 (-vvvv), the dictionary
+size used to compress the file and the CRC32 of the uncompressed data are
+also shown.
+
+LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never have
+been compressed. Decompressed is used to refer to data which have undergone
+the process of decompression.
+
+
+@node Invoking clzip
+@chapter Invoking clzip
+@cindex invoking
+@cindex options
+@cindex usage
+@cindex version
+
+The format for running clzip is:
+
+@example
+clzip [@var{options}] [@var{files}]
+@end example
+
+@noindent
+If no file names are specified, clzip compresses (or decompresses) from
+standard input to standard output. A hyphen @samp{-} used as a @var{file}
+argument means standard input. It can be mixed with other @var{files} and is
+read just once, the first time it appears in the command line. Remember to
+prepend @file{./} to any file name beginning with a hyphen, or use @samp{--}.
+
+clzip supports the following
+@uref{http://www.nongnu.org/arg-parser/manual/arg_parser_manual.html#Argument-syntax,,options}:
+@ifnothtml
+@xref{Argument syntax,,,arg_parser}.
+@end ifnothtml
+
+@table @code
+@item -h
+@itemx --help
+Print an informative help message describing the options and exit.
+
+@item -V
+@itemx --version
+Print the version number of clzip on the standard output and exit.
+This version number should be included in all bug reports.
+
+@anchor{--trailing-error}
+@item -a
+@itemx --trailing-error
+Exit with error status 2 if any remaining input is detected after
+decompressing the last member. Such remaining input is usually trailing
+garbage that can be safely ignored. @xref{concat-example}.
+
+@item -b @var{bytes}
+@itemx --member-size=@var{bytes}
+When compressing, set the member size limit to @var{bytes}. It is advisable
+to keep members smaller than RAM size so that they can be repaired with
+lziprecover in case of corruption. A small member size may degrade
+compression ratio, so use it only when needed. Valid values range from
+@w{100 kB} to @w{2 PiB}. Defaults to @w{2 PiB}.
+
+@item -c
+@itemx --stdout
+Compress or decompress to standard output; keep input files unchanged. If
+compressing several files, each file is compressed independently. (The
+output consists of a sequence of independently compressed members). This
+option (or @option{-o}) is needed when reading from a named pipe (fifo) or
+from a device. Use it also to recover as much of the decompressed data as
+possible when decompressing a corrupt file. @option{-c} overrides @option{-o}
+and @option{-S}. @option{-c} has no effect when testing or listing.
+
+@item -d
+@itemx --decompress
+Decompress the files specified. The integrity of the files specified is
+checked. If a file does not exist, can't be opened, or the destination file
+already exists and @option{--force} has not been specified, clzip continues
+decompressing the rest of the files and exits with error status 1. If a file
+fails to decompress, or is a terminal, clzip exits immediately with error
+status 2 without decompressing the rest of the files. A terminal is
+considered an uncompressed file, and therefore invalid.
+
+@item -f
+@itemx --force
+Force overwrite of output files.
+
+@item -F
+@itemx --recompress
+When compressing, force re-compression of files whose name already has
+the @samp{.lz} or @samp{.tlz} suffix.
+
+@item -k
+@itemx --keep
+Keep (don't delete) input files during compression or decompression.
+
+@item -l
+@itemx --list
+Print the uncompressed size, compressed size, and percentage saved of the
+files specified. Trailing data are ignored. The values produced are correct
+even for multimember files. If more than one file is given, a final line
+containing the cumulative sizes is printed. With @option{-v}, the dictionary
+size, the number of members in the file, and the amount of trailing data (if
+any) are also printed. With @option{-vv}, the positions and sizes of each
+member in multimember files are also printed.
+
+If any file is damaged, does not exist, can't be opened, or is not regular,
+the final exit status is @w{> 0}. @option{-lq} can be used to check quickly
+(without decompressing) the structural integrity of the files specified.
+(Use @option{--test} to check the data integrity). @option{-alq}
+additionally checks that none of the files specified contain trailing data.
+
+@item -m @var{bytes}
+@itemx --match-length=@var{bytes}
+When compressing, set the match length limit in bytes. After a match this
+long is found, the search is finished. Valid values range from 5 to 273.
+Larger values usually give better compression ratios but longer compression
+times.
+
+@item -o @var{file}
+@itemx --output=@var{file}
+If @option{-c} has not also been specified, write the (de)compressed output
+to @var{file}, automatically creating any missing parent directories; keep
+input files unchanged. If compressing several files, each file is compressed
+independently. (The output consists of a sequence of independently
+compressed members). This option (or @option{-c}) is needed when reading
+from a named pipe (fifo) or from a device. @w{@option{-o -}} is equivalent
+to @option{-c}. @option{-o} has no effect when testing or listing.
+
+In order to keep backward compatibility with clzip versions prior to 1.12,
+when compressing from standard input and no other file names are given, the
+extension @samp{.lz} is appended to @var{file} unless it already ends in
+@samp{.lz} or @samp{.tlz}. This feature will be removed in a future version
+of clzip. Meanwhile, redirection may be used instead of @option{-o} to write
+the compressed output to a file without the extension @samp{.lz} in its
+name: @w{@samp{clzip < file > foo}}.
+
+When compressing and splitting the output in volumes, @var{file} is used as
+a prefix, and several files named @samp{@var{file}00001.lz},
+@samp{@var{file}00002.lz}, etc, are created. In this case, only one input
+file is allowed.
+
+@item -q
+@itemx --quiet
+Quiet operation. Suppress all messages.
+
+@item -s @var{bytes}
+@itemx --dictionary-size=@var{bytes}
+When compressing, set the dictionary size limit in bytes. Clzip uses for
+each file the largest dictionary size that exceeds neither the file
+size nor this limit. Valid values range from @w{4 KiB} to @w{512 MiB}.
+Values 12 to 29 are interpreted as powers of two, meaning 2^12 to 2^29
+bytes. Dictionary sizes are quantized so that they can be coded in just one
+byte (@pxref{coded-dict-size}). If the size specified does not match one of
+the valid sizes, it is rounded upwards by adding up to @w{(@var{bytes} / 8)}
+to it.
+
+For maximum compression you should use a dictionary size limit as large
+as possible, but keep in mind that the decompression memory requirement
+is affected at compression time by the choice of dictionary size limit.
+
+@item -S @var{bytes}
+@itemx --volume-size=@var{bytes}
+When compressing, and @option{-c} has not also been specified, split the
+compressed output into several volume files with names
+@samp{original_name00001.lz}, @samp{original_name00002.lz}, etc, and set the
+volume size limit to @var{bytes}. Input files are kept unchanged. Each
+volume is a complete, maybe multimember, lzip file. A small volume size may
+degrade compression ratio, so use it only when needed. Valid values range
+from @w{100 kB} to @w{4 EiB}.
+
+@item -t
+@itemx --test
+Check integrity of the files specified, but don't decompress them. This
+really performs a trial decompression and throws away the result. Use it
+together with @option{-v} to see information about the files. If a file
+fails the test, does not exist, can't be opened, or is a terminal, clzip
+continues testing the rest of the files. A final diagnostic is shown at
+verbosity level 1 or higher if any file fails the test when testing multiple
+files.
+
+@item -v
+@itemx --verbose
+Verbose mode.@*
+When compressing, show the compression ratio and size for each file
+processed.@*
+When decompressing or testing, further -v's (up to 4) increase the
+verbosity level, showing status, compression ratio, dictionary size,
+trailer contents (CRC, data size, member size), and up to 6 bytes of
+trailing data (if any) both in hexadecimal and as a string of printable
+ASCII characters.@*
+Two or more @option{-v} options show the progress of (de)compression.
+
+@item -0 .. -9
+Compression level. Set the compression parameters (dictionary size and
+match length limit) as shown in the table below. The default compression
+level is @option{-6}, equivalent to @w{@option{-s8MiB -m36}}. Note that
+@option{-9} can be much slower than @option{-0}. These options have no
+effect when decompressing, testing, or listing.
+
+The bidimensional parameter space of LZMA can't be mapped to a linear scale
+optimal for all files. If your files are large, very repetitive, etc, you
+may need to use the options @option{--dictionary-size} and
+@option{--match-length} directly to achieve optimal performance.
+
+If several compression levels or @option{-s} or @option{-m} options are
+given, the last setting is used. For example @w{@option{-9 -s64MiB}} is
+equivalent to @w{@option{-s64MiB -m273}}.
+
+@multitable {Level} {Dictionary size (-s)} {Match length limit (-m)}
+@item Level @tab Dictionary size (-s) @tab Match length limit (-m)
+@item -0 @tab 64 KiB @tab  16 bytes
+@item -1 @tab  1 MiB @tab   5 bytes
+@item -2 @tab  1.5 MiB @tab   6 bytes
+@item -3 @tab  2 MiB @tab   8 bytes
+@item -4 @tab  3 MiB @tab  12 bytes
+@item -5 @tab  4 MiB @tab  20 bytes
+@item -6 @tab  8 MiB @tab  36 bytes
+@item -7 @tab 16 MiB @tab  68 bytes
+@item -8 @tab 24 MiB @tab 132 bytes
+@item -9 @tab 32 MiB @tab 273 bytes
+@end multitable
+
+@item --fast
+@itemx --best
+Aliases for GNU gzip compatibility.
+
+@item --empty-error
+Exit with error status 2 if any empty member is found in the input files.
+
+@item --marking-error
+Exit with error status 2 if the first LZMA byte is non-zero in any member of
+the input files. This may be caused by data corruption or by deliberate
+insertion of tracking information in the file. Use
+@w{@samp{lziprecover --clear-marking}} to clear any such non-zero bytes.
+
+@item --loose-trailing
+When decompressing, testing, or listing, allow trailing data whose first
+bytes are so similar to the magic bytes of a lzip header that they can
+be confused with a corrupt header. Use this option if a file triggers a
+"corrupt header" error and the cause is not indeed a corrupt header.
+
+@end table
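The quantization of dictionary sizes mentioned under the option -s can be sketched as follows. The decoding formula is the one used by the reference decoder listed in this manual: the low 5 bits of the coded byte give a power of two, and the high 3 bits subtract between 0 and 7 sixteenths of it. The encoder is an illustrative brute-force search, not the method clzip uses, and the names decode_dict_size and encode_dict_size are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { min_dictionary_bits = 12, max_dictionary_bits = 29 };  /* 4 KiB, 512 MiB */

/* Decode the one-byte coded dictionary size. */
uint32_t decode_dict_size(uint8_t coded)
{
  uint32_t size = (uint32_t)1 << (coded & 0x1F);
  size -= (size / 16) * ((coded >> 5) & 7);
  return size;
}

/* Return the coded byte of the smallest valid size >= wanted,
   or -1 if wanted is outside the valid range. */
int encode_dict_size(uint32_t wanted)
{
  const uint32_t min_size = (uint32_t)1 << min_dictionary_bits;
  const uint32_t max_size = (uint32_t)1 << max_dictionary_bits;
  if (wanted < min_size || wanted > max_size) return -1;
  /* Candidates are visited in increasing size order. */
  for (int bits = min_dictionary_bits; bits <= max_dictionary_bits; ++bits)
    for (int fraction = 7; fraction >= 0; --fraction) {
      const uint8_t coded = (uint8_t)((fraction << 5) | bits);
      const uint32_t size = decode_dict_size(coded);
      if (size >= min_size && size >= wanted) {
        /* Rounding adds at most wanted / 8, as stated for option -s. */
        assert(size - wanted <= wanted / 8);
        return coded;
      }
    }
  return -1;  /* not reached: max_size itself is a valid size */
}
```

For example, the 24 MiB dictionary of level -8 is coded as 0x99 (32 MiB minus 4 sixteenths), and a request for 5000 bytes is rounded up to 5120 bytes.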
+
+Numbers given as arguments to options may be expressed in decimal,
+hexadecimal, or octal (using the same syntax as integer constants in C++),
+and may be followed by a multiplier and an optional @samp{B} for "byte".
+
+Table of SI and binary prefixes (unit multipliers):
+
+@multitable {Prefix} {kilobyte   (10^3 = 1000)} {|} {Prefix} {kibibyte  (2^10 = 1024)}
+@item Prefix @tab Value               @tab | @tab Prefix @tab Value
+@item k @tab kilobyte   (10^3 = 1000) @tab | @tab Ki @tab kibibyte  (2^10 = 1024)
+@item M @tab megabyte   (10^6)        @tab | @tab Mi @tab mebibyte  (2^20)
+@item G @tab gigabyte   (10^9)        @tab | @tab Gi @tab gibibyte  (2^30)
+@item T @tab terabyte   (10^12)       @tab | @tab Ti @tab tebibyte  (2^40)
+@item P @tab petabyte   (10^15)       @tab | @tab Pi @tab pebibyte  (2^50)
+@item E @tab exabyte    (10^18)       @tab | @tab Ei @tab exbibyte  (2^60)
+@item Z @tab zettabyte  (10^21)       @tab | @tab Zi @tab zebibyte  (2^70)
+@item Y @tab yottabyte  (10^24)       @tab | @tab Yi @tab yobibyte  (2^80)
+@item R @tab ronnabyte  (10^27)       @tab | @tab Ri @tab robibyte  (2^90)
+@item Q @tab quettabyte (10^30)       @tab | @tab Qi @tab quebibyte (2^100)
+@end multitable
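A minimal sketch of how such an argument could be parsed is shown below. It relies on the standard function strtoull with base 0, which accepts decimal, hexadecimal, and octal constants, and then applies the multiplier. This is not the parser used by clzip: the name parse_size is hypothetical, only the prefixes up to E and Ei are handled because larger ones do not fit in 64 bits, and the strictness about the spelling of k and Ki is an assumption taken from the table.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Parse NUMBER[PREFIX][B] into *result.
   Returns 0 on success, -1 on a syntax error or overflow. */
int parse_size(const char *arg, unsigned long long *result)
{
  assert(arg != NULL && result != NULL);
  if (!isdigit((unsigned char)arg[0])) return -1;  /* no sign, no blanks */
  char *tail;
  errno = 0;
  unsigned long long value = strtoull(arg, &tail, 0);
  if (tail == arg || errno == ERANGE) return -1;

  if (*tail != '\0' && *tail != 'B') {
    static const char prefixes[] = "MGTPE";  /* exponents 2 to 6 */
    const int binary = (tail[1] == 'i');
    int exponent;
    if (*tail == 'k' || *tail == 'K') {
      /* The table lists 'k' for 10^3 and 'Ki' for 2^10. */
      if ((*tail == 'k') == binary) return -1;
      exponent = 1;
    } else {
      const char *p = strchr(prefixes, *tail);
      if (p == NULL) return -1;
      exponent = (int)(p - prefixes) + 2;
    }
    tail += binary ? 2 : 1;
    const unsigned long long factor = binary ? 1024 : 1000;
    for (int i = 0; i < exponent; ++i) {
      if (value > ULLONG_MAX / factor) return -1;  /* would overflow */
      value *= factor;
    }
  }
  if (*tail == 'B') ++tail;      /* optional "B" for byte */
  if (*tail != '\0') return -1;  /* trailing junk */
  *result = value;
  return 0;
}
```

With this sketch "8MiB" yields 8388608, "100kB" yields 100000, and "0x1000" yields 4096. Note that in a hexadecimal number a final B is read as a digit, not as the byte suffix.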
+
+@sp 1
+Exit status: 0 for a normal exit, 1 for environmental problems
+(file not found, invalid command-line options, I/O errors, etc), 2 to
+indicate a corrupt or invalid input file, 3 for an internal consistency
+error (e.g., bug) which caused clzip to panic.
+
+
+@node Quality assurance
+@chapter Design, development, and testing of lzip
+@cindex quality assurance
+
+There are two ways of constructing a software design: One way is to make it
+so simple that there are obviously no deficiencies and the other way is to
+make it so complicated that there are no obvious deficiencies. The first
+method is far more difficult.@*
+--- C.A.R. Hoare
+
+Lzip has been designed, written, and tested with great care to replace gzip
+and bzip2 as the standard general-purpose compressed format for Unix-like
+systems. This chapter describes the lessons learned from these previous
+formats, and their application to the design of lzip. The lzip format
+specification has been reviewed carefully and is believed to be free from
+design errors.
+
+@sp 1
+@section Format design
+
+When gzip was designed in 1992, computers and operating systems were much
+less capable than they are today. The designers of gzip tried to work around
+some of those limitations, like 8.3 file names, with additional fields in
+the file format.
+
+Today those limitations have mostly disappeared, and the format of gzip has
+proved to be unnecessarily complicated. It includes fields that were never
+used, others that have lost their usefulness, and finally others that have
+become too limited.
+
+Bzip2 was designed 5 years later, and its format is simpler than that of
+gzip.
+
+Probably the worst defect of the gzip format from the point of view of data
+safety is the variable size of its header. If the byte at offset 3 (flags)
+of a gzip member gets corrupted, it may become difficult to recover the
+data, even if the compressed blocks are intact, because it can't be known
+with certainty where the compressed blocks begin.
+
+By contrast, the header of a lzip member has a fixed length of 6. The LZMA
+stream in a lzip member always starts at offset 6, making it trivial to
+recover the data even if the whole header becomes corrupt.
+
+Bzip2 also provides a header of fixed length and marks the beginning and end of
+each compressed block with six magic bytes, making it possible to find the
+compressed blocks even in case of file damage. But bzip2 does not store the
+size of each compressed block, as lzip does.
+
+Lziprecover is able to provide unique data recovery capabilities because the
+lzip format is extraordinarily safe. The simple and safe design of the file
+format complements the embedded error detection provided by the LZMA data
+stream. Any distance larger than the dictionary size acts as a forbidden
+symbol, allowing the decompressor to detect the approximate position of
+errors, and leaving very little work for the check sequence (CRC and data
+sizes) in the detection of errors. Lzip is usually able to detect all
+possible bit flips in the compressed data without resorting to the check
+sequence. It would be difficult to write an automatic recovery tool like
+lziprecover for the gzip format and, as far as I know, none has ever been
+written.
+
+Lzip, like gzip and bzip2, uses a CRC32 to check the integrity of the
+decompressed data because it provides optimal accuracy in the detection of
+errors up to a compressed size of about @w{16 GiB}, a size larger than that
+of most files. In the case of lzip, the additional detection capability of
+the decompressor reduces the probability of undetected errors several
+million times more, resulting in a combined integrity checking optimally
+accurate for any member size produced by lzip. Preliminary results suggest
+that the lzip format is safe enough to be used in safety-critical avionics
+systems.
+
+The lzip format is designed for long-term archiving. Therefore it excludes
+any unneeded features that may interfere with the future extraction of the
+decompressed data.
+
+@sp 1
+@subsection Gzip format (mis)features not present in lzip
+
+@table @samp
+@item Multiple algorithms
+
+Gzip provides a CM (Compression Method) field that has never been used
+because it is a bad idea to begin with. New compression methods may require
+additional fields, making it impossible to implement new methods and, at the
+same time, keep the same format. This field does not solve the problem of
+format proliferation; it just makes the problem less obvious.
+
+@item Optional fields in header
+
+Unless special precautions are taken, optional fields are generally a bad
+idea because they produce a header of variable size. The gzip header has 2
+fields that, in addition to being optional, are zero-terminated. This means
+that if any byte inside the field gets zeroed, or if the terminating zero
+gets altered, gzip will be able to find neither the header CRC nor the
+compressed blocks.
+
+@item Optional CRC for the header
+
+Using an optional CRC for the header is not only a bad idea, it is an error;
+it circumvents the Hamming distance (HD) of the CRC and may prevent the
+extraction of perfectly good data. For example, if the CRC is used and the
+bit enabling it is reset by a bit flip, then the header seems to be intact
+(in spite of being corrupt) while the compressed blocks seem to be totally
+unrecoverable (in spite of being intact). Very misleading indeed.
+
+@item Metadata
+
+The gzip format stores some metadata, like the modification time of the
+original file or the operating system on which compression took place. This
+complicates reproducible compression (obtaining identical compressed output
+from identical input).
+
+@end table
+
+@subsection Lzip format improvements over gzip and bzip2
+
+@table @samp
+@item 64-bit size field
+
+Probably the most frequently reported shortcoming of the gzip format is that
+it only stores the least significant 32 bits of the uncompressed size. The
+size of any file larger than or equal to @w{4 GiB} gets truncated.
+
+Bzip2 does not store the uncompressed size of the file.
+
+The lzip format provides a 64-bit field for the uncompressed size.
+Additionally, lzip produces multimember output automatically when the size
+is too large for a single member, allowing for an unlimited uncompressed
+size.
+
+@item Distributed index
+
+The lzip format provides a distributed index that, among other things, helps
+plzip to decompress several times faster than pigz and helps lziprecover do
+its job. Neither the gzip format nor the bzip2 format provides an index.
+
+A distributed index is safer and more scalable than a monolithic index. The
+monolithic index introduces a single point of failure in the compressed file
+and may limit the number of members or the total uncompressed size.
+
+@end table
+
+@section Quality of implementation
+
+Our civilization depends critically on software; it had better be quality
+software.@*
+--- Bjarne Stroustrup
+
+@table @samp
+@item Accurate and robust error detection
+
+The lzip format provides 3-factor integrity checking, and the decompressors
+report mismatches in each factor separately. This method detects most false
+positives for corruption. If just one byte in one factor fails but the other
+two factors match the data, it probably means that the data are intact and
+the corruption just affects the mismatching factor (CRC, data size, or
+member size) in the member trailer.
+
+@item Multiple implementations
+
+Just like the lzip format provides 3-factor protection against undetected
+data corruption, the development methodology of the lzip family of
+compressors provides 3-factor protection against undetected programming
+errors.
+
+Three related but independent compressor implementations, lzip, clzip, and
+minilzip/lzlib, are developed concurrently. Every stable release of any of
+them is tested to check that it produces identical output to the other two.
+This guarantees that all three implement the same algorithm, and makes it
+unlikely that any of them may contain serious undiscovered errors. In fact,
+no errors have been discovered in lzip since 2009.
+
+Additionally, the three implementations have been extensively tested with
+@uref{http://www.nongnu.org/lzip/manual/lziprecover_manual.html#Unzcrash,,unzcrash},
+valgrind, and @samp{american fuzzy lop} without finding a single
+vulnerability or false negative.
+@ifnothtml
+@xref{Unzcrash,,,lziprecover}.
+@end ifnothtml
+
+@item Dictionary size
+
+Lzip automatically adapts the dictionary size to the size of each file.
+In addition to reducing the amount of memory required for decompression,
+this feature also minimizes the probability of being affected by RAM errors
+during compression. @c key4_mask
+
+@item Exit status
+
+Returning a warning status of 2 is a design flaw of compress that leaked
+into the design of gzip. Both bzip2 and lzip are free from this flaw.
+
+@end table
+
+
+@node Algorithm
+@chapter Algorithm
+@cindex algorithm
+
+In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a
+concrete algorithm; it is more like "any algorithm using the LZMA coding
+scheme". LZMA compression consists in describing the uncompressed data as a
+succession of coding sequences from the set shown in Section @samp{What is
+coded} (@pxref{what-is-coded}), and then encoding them using a range
+encoder. For example, the option @option{-0} of clzip uses the scheme in almost
+the simplest way possible: issuing the longest match it can find, or a
+literal byte if it can't find a match. Conversely, a much more elaborate way
+of finding coding sequences of minimum size than the one currently used by
+clzip could be developed, and the resulting sequence could also be coded
+using the LZMA coding scheme.
+
+Clzip currently implements two variants of the LZMA algorithm: fast
+(used by option @option{-0}) and normal (used by all other compression levels).
+
+The high compression of LZMA comes from combining two basic, well-proven
+compression ideas: sliding dictionaries (LZ77) and Markov models (the thing
+used by every compression algorithm that uses a range encoder or similar
+order-0 entropy coder as its last stage) with segregation of contexts
+according to what the bits are used for.
+
+Clzip is a two-stage compressor. The first stage is a Lempel-Ziv coder,
+which reduces redundancy by translating chunks of data to their
+corresponding distance-length pairs. The second stage is a range encoder
+that uses a different probability model for each type of data:
+distances, lengths, literal bytes, etc.
+
+Here is how it works, step by step:
+
+1) The member header is written to the output stream.
+
+2) The first byte is coded literally, because there are no previous
+bytes to which the match finder can refer.
+
+3) The main encoder advances to the next byte in the input data and
+calls the match finder.
+
+4) The match finder fills an array with the minimum distances before the
+current byte where a match of a given length can be found.
+
+5) Go back to step 3 until a sequence (formed of pairs, repeated
+distances, and literal bytes) of minimum price has been formed, where the
+price represents the number of output bits produced.
+
+6) The range encoder encodes the sequence produced by the main encoder
+and sends the bytes produced to the output stream.
+
+7) Go back to step 3 until the input data are finished or until the
+member or volume size limits are reached.
+
+8) The range encoder is flushed.
+
+9) The member trailer is written to the output stream.
+
+10) If there are more data to compress, go back to step 1.
+
+@sp 1
+During compression, clzip reads data in large blocks (one dictionary size at
+a time). Therefore it may block any process feeding data to it through a
+pipe for up to tens of seconds. This is normal. The blocking intervals
+get longer with higher compression levels because dictionary size increases
+(and compression speed decreases) with compression level.
+
+@noindent
+The ideas embodied in clzip are due to (at least) the following people:
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the
+definition of Markov chains), G.N.N. Martin (for the definition of range
+encoding), Igor Pavlov (for putting all the above together in LZMA), and
+Julian Seward (for bzip2's CLI).
+
+
+@node File format
+@chapter File format
+@cindex file format
+
+Perfection is reached, not when there is no longer anything to add, but
+when there is no longer anything to take away.@*
+--- Antoine de Saint-Exupery
+
+@sp 1
+In the diagram below, a box like this:
+
+@verbatim
++---+
+|   | <-- the vertical bars might be missing
++---+
+@end verbatim
+
+represents one byte; a box like this:
+
+@verbatim
++==============+
+|              |
++==============+
+@end verbatim
+
+represents a variable number of bytes.
+
+@sp 1
+A lzip file consists of one or more independent "members" (compressed data
+sets). The members simply appear one after another in the file, with no
+additional information before, between, or after them. Each member can
+encode in compressed form up to @w{16 EiB - 1 byte} of uncompressed data.
+The size of a multimember file is unlimited.
+
+Each member has the following structure:
+
+@verbatim
++--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+| ID string | VN | DS | LZMA stream | CRC32 |   Data size   |  Member size  |
++--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+@end verbatim
+
+All multibyte values are stored in little endian order.
+
+@table @samp
+@item ID string (the "magic" bytes)
+A four byte string, identifying the lzip format, with the value "LZIP"
+(0x4C, 0x5A, 0x49, 0x50).
+
+@item VN (version number, 1 byte)
+Just in case something needs to be modified in the future. 1 for now.
+
+@anchor{coded-dict-size}
+@item DS (coded dictionary size, 1 byte)
+The dictionary size is calculated by taking a power of 2 (the base size)
+and subtracting from it a fraction between 0/16 and 7/16 of the base size.@*
+Bits 4-0 contain the base 2 logarithm of the base size (12 to 29).@*
+Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract
+from the base size to obtain the dictionary size.@*
+Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB@*
+Valid values for dictionary size range from 4 KiB to 512 MiB.
+
+@item LZMA stream
+The LZMA stream, finished by an "End Of Stream" marker. Uses default values
+for encoder properties. @xref{Stream format}, for a complete description.
+
+@item CRC32 (4 bytes)
+Cyclic Redundancy Check (CRC) of the original uncompressed data.
+
+@item Data size (8 bytes)
+Size of the original uncompressed data.
+
+@item Member size (8 bytes)
+Total size of the member, including header and trailer. This field acts
+as a distributed index, improves the checking of stream integrity, and
+facilitates the safe recovery of undamaged members from multimember files.
+Lzip limits the member size to @w{2 PiB} to prevent the data size field from
+overflowing.
+
+@end table
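+
+@sp 1
+@noindent
+As an illustration of the @samp{DS} field described above, here is a
+minimal C++ sketch that decodes a coded dictionary size byte. It is not
+taken from the reference source in this manual; the function names
+@samp{decode_dictionary_size} and @samp{valid_dictionary_size} are
+hypothetical and chosen only for this example.
+

```cpp
#include <cstdint>

// Illustrative sketch; the function names are hypothetical.

// Decode the coded dictionary size byte (DS) of a member header.
// Bits 4-0 hold the base 2 logarithm of the base size.
// Bits 7-5 hold the number of sixteenths of the base size to subtract.
constexpr unsigned decode_dictionary_size( const uint8_t ds )
  {
  return ( 1U << ( ds & 0x1F ) ) -
         ( ( 1U << ( ds & 0x1F ) ) / 16 ) * ( ( ds >> 5 ) & 7 );
  }

// Range check taken from the text: 4 KiB to 512 MiB.
constexpr bool valid_dictionary_size( const unsigned size )
  { return size >= ( 1U << 12 ) && size <= ( 1U << 29 ); }
```

+
+With this sketch, the byte 0xD3 from the example above decodes to 327680
+bytes (320 KiB), which lies inside the valid range.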
+
+
+@node Stream format
+@chapter Format of the LZMA stream in lzip files
+@cindex format of the LZMA stream
+
+The LZMA algorithm has three parameters, called "special LZMA
+properties", to adjust it for some kinds of binary data. These
+parameters are: @samp{literal_context_bits} (with a default value of 3),
+@samp{literal_pos_state_bits} (with a default value of 0), and
+@samp{pos_state_bits} (with a default value of 2). As a general purpose
+compressor, lzip only uses the default values for these parameters. In
+particular @samp{literal_pos_state_bits} has been optimized away and
+does not even appear in the code.
+
+Lzip finishes the LZMA stream with an "End Of Stream" (EOS) marker (the
+distance-length pair @w{0xFFFFFFFFU, 2}), which in conjunction with the
+@samp{member size} field in the member trailer allows the checking of stream
+integrity. The EOS marker is the only LZMA marker allowed in lzip files. The
+LZMA stream in lzip files always has these two features (default properties
+and EOS marker) and is referred to in this document as LZMA-302eos. This
+simplified and marker-terminated form of the LZMA stream format has been
+chosen to maximize interoperability and safety.
+
+The second stage of LZMA is a range encoder that uses a different
+probability model for each type of symbol: distances, lengths, literal
+bytes, etc. Range encoding conceptually encodes all the symbols of the
+message into one number. Unlike Huffman coding, which assigns to each
+symbol a bit-pattern and concatenates all the bit-patterns together,
+range encoding can compress one symbol to less than one bit. Therefore
+the compressed data produced by a range encoder can't be split in pieces
+that could be described individually.
+
+It seems that the only way of describing the LZMA-302eos stream is to
+describe the algorithm that decodes it. And given the many details
+about the range decoder that need to be described accurately, the source
+code of a real decompressor seems the only appropriate reference to use.
+
+What follows is a description of the decoding algorithm for LZMA-302eos
+streams using as reference the source code of "lzd", an educational
+decompressor for lzip files, included in appendix A. @xref{Reference source
+code}. Lzd is written in C++11 and can be downloaded from the lzip download
+directory.
+
+@sp 1
+@section What is coded
+
+@anchor{what-is-coded}
+The LZMA stream includes literals, matches, and repeated matches (matches
+reusing a recently used distance). There are 7 different coding sequences:
+
+@multitable @columnfractions .35 .14 .51
+@headitem Bit sequence @tab Name @tab Description
+@item 0 + byte @tab literal @tab literal byte
+@item 1 + 0 + len + dis @tab match @tab distance-length pair
+@item 1 + 1 + 0 + 0 @tab shortrep @tab 1 byte match at latest used distance
+@item 1 + 1 + 0 + 1 + len @tab rep0 @tab len bytes match at latest used distance
+@item 1 + 1 + 1 + 0 + len @tab rep1 @tab len bytes match at second
+latest used distance
+@item 1 + 1 + 1 + 1 + 0 + len @tab rep2 @tab len bytes match at third
+latest used distance
+@item 1 + 1 + 1 + 1 + 1 + len @tab rep3 @tab len bytes match at fourth
+latest used distance
+@end multitable
+
+@sp 1
+In the following tables, multibit sequences are coded in normal order,
+from most significant bit (MSB) to least significant bit (LSB), except
+where noted otherwise.
+
+Lengths (the @samp{len} in the table above) are coded as follows:
+
+@multitable @columnfractions .5 .5
+@headitem Bit sequence @tab Description
+@item 0 + 3 bits @tab lengths from 2 to 9
+@item 1 + 0 + 3 bits @tab lengths from 10 to 17
+@item 1 + 1 + 8 bits @tab lengths from 18 to 273
+@end multitable
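+
+@sp 1
+@noindent
+The three rows of the table above can be summarized by the following
+C++ sketch. The function names @samp{len_coded_bits} and
+@samp{len_field_value} are hypothetical. The sketch counts binary
+decisions before range coding, not bits in the compressed output, because
+each decision is later coded by the range encoder.
+

```cpp
// Illustrative sketch; the function names are hypothetical.

// Number of binary decisions (choice bits plus field bits) used to code
// a match length. Returns 0 for lengths outside the range 2 to 273.
constexpr int len_coded_bits( const int len )
  {
  return ( len < 2 || len > 273 ) ? 0 :
         ( len <= 9 )  ? 1 + 3 :	// 0 + 3 bits
         ( len <= 17 ) ? 2 + 3 :	// 1 + 0 + 3 bits
                         2 + 8;		// 1 + 1 + 8 bits
  }

// Value stored in the 3-bit or 8-bit field of a valid length.
constexpr int len_field_value( const int len )
  { return ( len <= 9 ) ? len - 2 : ( len <= 17 ) ? len - 10 : len - 18; }
```

+
+For instance, the maximum length 273 takes 10 binary decisions and stores
+255 in its 8-bit field.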
+
+@sp 1
+The coding of distances is a little more complicated, so I'll begin by
+explaining a simpler version of the encoding.
+
+Imagine you need to encode a number from 0 to @w{2^32 - 1}, and you want to
+do it in a way that produces shorter codes for the smaller numbers. You may
+first encode the position of the most significant bit that is set to 1,
+which you may find by making a bit scan from the left (from the MSB). A
+position of 0 means that the number is 0 (no bit is set), 1 means the LSB is
+the first bit set (the number is 1), and 32 means the MSB is set (i.e., the
+number is @w{>= 0x80000000}). Then, if the position is @w{>= 2}, you encode
+the remaining @w{position - 1} bits. Let's call these bits "direct bits"
+because they are coded directly by value instead of indirectly by position.
+
+The drawback of this simple method is that it needs 6 bits to encode the
+position, but it just uses 33 of the 64 possible values, wasting almost half
+of the codes.
+
+The clever trick of LZMA is that it encodes in what it calls a "slot"
+the position of the most significant bit set, along with the value of the
+next bit, using the same 6 bits that it would take to encode the position
+alone. This seems to need 66 slots (twice the number of positions), but for
+positions 0 and 1 there is no next bit, so the number of slots needed is 64
+(0 to 63).
+
+The 6 bits representing this "slot number" are then context-coded. If
+the distance is @w{>= 4}, the remaining bits are encoded as follows.
+@samp{direct_bits} is the number of remaining bits (from 1 to 30) needed
+to form a complete distance, and is calculated as @w{(slot >> 1) - 1}.
+If a distance needs 6 or more direct_bits, the last 4 bits are encoded
+separately. The last piece (all the direct_bits for distances 4 to 127
+(slots 4 to 13), or the last 4 bits for distances @w{>= 128}
+@w{(slot >= 14)}) is context-coded in reverse order (from LSB to MSB). For
+distances @w{>= 128}, the @w{@samp{direct_bits - 4}} part is encoded with
+fixed 0.5 probability.
+
+@multitable @columnfractions .5 .5
+@headitem Bit sequence @tab Description
+@item slot @tab distances from 0 to 3
+@item slot + direct_bits @tab distances from 4 to 127
+@item slot + (direct_bits - 4) + 4 bits @tab distances from 128 to 2^32 - 1
+@end multitable
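+
+@sp 1
+@noindent
+The relation between a distance, its slot, and its direct bits can be
+sketched in C++ as follows. This is an encoder-side illustration written
+for this explanation, not code from clzip; the names @samp{msb_index},
+@samp{dis_slot}, and @samp{slot_base} are hypothetical. Only the formula
+@w{(slot >> 1) - 1} comes from the text above.
+

```cpp
#include <cstdint>

// Illustrative sketch; the function names are hypothetical.

// Index (0 to 31) of the most significant bit set in a nonzero value.
constexpr int msb_index( const uint32_t value, const int index = 0 )
  { return ( value > 1 ) ? msb_index( value >> 1, index + 1 ) : index; }

// Slot (0 to 63) of a distance. Distances 0 to 3 are their own slot.
// For larger distances the slot packs the index of the most significant
// bit set together with the value of the bit that follows it.
constexpr int dis_slot( const uint32_t distance )
  {
  return ( distance < 4 ) ? (int)distance :
         2 * msb_index( distance ) +
         (int)( ( distance >> ( msb_index( distance ) - 1 ) ) & 1 );
  }

// Number of remaining bits needed to complete a distance (slot >= 4).
constexpr int direct_bits( const int slot ) { return ( slot >> 1 ) - 1; }

// Smallest distance belonging to a slot >= 4 (all direct bits zero).
constexpr uint32_t slot_base( const int slot )
  { return ( 2U | ( slot & 1 ) ) << direct_bits( slot ); }
```

+
+Under these definitions distance 127 falls in slot 13 with 5 direct bits,
+and distance 128 is the first distance of slot 14, which needs 6 direct
+bits. This agrees with the boundaries in the table above.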
+
+@sp 1
+@section The coding contexts
+
+These contexts (@samp{Bit_model} in the source) are integers or arrays
+of integers representing the probability of the corresponding bit being 0.
+
+The indices used in these arrays are:
+
+@table @samp
+@item state
+A state machine (@samp{State} in the source) with 12 states (0 to 11),
+coding the latest 2 to 4 types of sequences processed. The initial state
+is 0.
+
+@item pos_state
+Value of the 2 least significant bits of the current position in the
+decoded data.
+
+@item literal_state
+Value of the 3 most significant bits of the latest byte decoded.
+
+@item len_state
+Coded value of the current match length @w{(length - 2)}, with a maximum
+of 3. The resulting value is in the range 0 to 3.
+
+@end table
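+
+@sp 1
+@noindent
+The three computed indices above (@samp{pos_state}, @samp{literal_state},
+and @samp{len_state}) can be sketched as small C++ functions. The
+constants repeat values used by the reference source; wrapping the
+expressions in standalone functions with these names is an assumption
+made only for this illustration.
+

```cpp
#include <cstdint>

// Illustrative sketch; the constants mirror the reference source, the
// standalone functions are hypothetical.
enum { pos_state_bits = 2,
       pos_state_mask = ( 1 << pos_state_bits ) - 1,
       literal_context_bits = 3,
       min_match_len = 2,
       len_states = 4 };

// 2 least significant bits of the current position in the decoded data.
constexpr int pos_state( const unsigned long long data_position )
  { return (int)( data_position & pos_state_mask ); }

// 3 most significant bits of the latest byte decoded.
constexpr int literal_state( const uint8_t prev_byte )
  { return prev_byte >> ( 8 - literal_context_bits ); }

// Match length minus 2, limited to a maximum of 3.
constexpr int len_state( const int len )
  {
  return ( len - min_match_len < len_states - 1 ) ?
         len - min_match_len : len_states - 1;
  }
```

+
+For example, a previous byte of 0x41 selects @samp{literal_state} 2, and
+every match length of 5 or more selects @samp{len_state} 3.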
+
+
+The types of previous sequences corresponding to each state are shown in the
+following table. @samp{!literal} is any sequence except a literal byte.
+@samp{rep} is any one of @samp{rep0}, @samp{rep1}, @samp{rep2}, or
+@samp{rep3}. The last type in each line is the most recent.
+
+@multitable {State} {rep or (!literal, shortrep), literal, literal}
+@headitem State @tab Types of previous sequences
+@item  0 @tab literal, literal, literal
+@item  1 @tab match, literal, literal
+@item  2 @tab rep or (!literal, shortrep), literal, literal
+@item  3 @tab literal, shortrep, literal, literal
+@item  4 @tab match, literal
+@item  5 @tab rep or (!literal, shortrep), literal
+@item  6 @tab literal, shortrep, literal
+@item  7 @tab literal, match
+@item  8 @tab literal, rep
+@item  9 @tab literal, shortrep
+@item 10 @tab !literal, match
+@item 11 @tab !literal, (rep or shortrep)
+@end multitable
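+
+@sp 1
+@noindent
+The transitions between the states in the table above can be sketched as
+pure C++ functions. The transition values are those of the @samp{State}
+class in the reference source; the free functions and their names
+(@samp{after_literal}, @samp{after_match}, @samp{after_rep}, and
+@samp{after_short_rep}) are hypothetical and used only here.
+

```cpp
// Illustrative sketch; same transitions as class State in the reference
// source, rewritten as hypothetical free functions.

// Next state after a literal byte, indexed by the current state.
constexpr int next_after_literal[12] =
  { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 };

constexpr int after_literal( const int st )
  { return next_after_literal[st]; }

// States 0 to 6 mean that the latest sequence was a literal.
constexpr int after_match( const int st )     { return ( st < 7 ) ? 7 : 10; }
constexpr int after_rep( const int st )       { return ( st < 7 ) ? 8 : 11; }
constexpr int after_short_rep( const int st ) { return ( st < 7 ) ? 9 : 11; }
```

+
+Starting from the initial state 0, a match followed by three literals
+visits states 7, 4, 1, and 0, which correspond to the rows
+@samp{literal, match}, @samp{match, literal},
+@samp{match, literal, literal}, and @samp{literal, literal, literal}.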
+
+@sp 1
+The contexts for decoding the type of coding sequence are:
+
+@multitable @columnfractions .2 .35 .45
+@headitem Name @tab Indices @tab Used when
+@item bm_match @tab state, pos_state @tab sequence start
+@item bm_rep @tab state @tab after sequence 1
+@item bm_rep0 @tab state @tab after sequence 11
+@item bm_rep1 @tab state @tab after sequence 111
+@item bm_rep2 @tab state @tab after sequence 1111
+@item bm_len @tab state, pos_state @tab after sequence 110
+@end multitable
+
+@sp 1
+The contexts for decoding distances are:
+
+@multitable @columnfractions .2 .3 .5
+@headitem Name @tab Indices @tab Used when
+@item bm_dis_slot @tab len_state, bit tree @tab distance start
+@item bm_dis @tab reverse bit tree @tab after slots 4 to 13
+@item bm_align @tab reverse bit tree @tab for distances >= 128, after
+fixed probability bits
+@end multitable
+
+@sp 1
+There are two separate sets of contexts for lengths (@samp{Len_model} in
+the source). One for normal matches, the other for repeated matches. The
+contexts in each Len_model are (see @samp{decode_len} in the source):
+
+@multitable @columnfractions .2 .4 .4
+@headitem Name @tab Indices @tab Used when
+@item choice1 @tab none @tab length start
+@item choice2 @tab none @tab after sequence 1
+@item bm_low @tab pos_state, bit tree @tab after sequence 0
+@item bm_mid @tab pos_state, bit tree @tab after sequence 10
+@item bm_high @tab bit tree @tab after sequence 11
+@end multitable
+
+@sp 1
+The context array @samp{bm_literal} is special. In principle it acts as
+a normal bit tree context, the one selected by @samp{literal_state}. But
+if the previous decoded byte was not a literal, two other bit tree
+contexts are used depending on the value of each bit in
+@samp{match_byte} (the byte at the latest used distance), until a bit is
+decoded that is different from its corresponding bit in
+@samp{match_byte}. After the first difference is found, the rest of the
+byte is decoded using the normal bit tree context. (See
+@samp{decode_matched} in the source).
+
+@sp 1
+@section The range decoder
+
+The LZMA stream is consumed one byte at a time by the range decoder.
+(See @samp{normalize} in the source). Every byte consumed produces a
+variable number of decoded bits, depending on how well these bits agree
+with their context. (See @samp{decode_bit} in the source).
+
+The range decoder state consists of two unsigned 32-bit variables:
+@samp{range} (representing the most significant part of the range size
+not yet decoded) and @samp{code} (representing the current point within
+@samp{range}). @samp{range} is initialized to @w{2^32 - 1}, and
+@samp{code} is initialized to 0.
+
+The range encoder produces a first 0 byte that must be ignored by the
+range decoder. (See the @samp{Range_decoder} constructor in the source).
+
+@sp 1
+@section Decoding and checking the LZMA stream
+
+After decoding the member header and obtaining the dictionary size, the
+range decoder is initialized and then the LZMA decoder enters a loop
+(see @samp{decode_member} in the source) where it invokes the range
+decoder with the appropriate contexts to decode the different coding
+sequences (matches, repeated matches, and literal bytes), until the "End
+Of Stream" marker is decoded.
+
+Once the "End Of Stream" marker has been decoded, the decompressor reads and
+decodes the member trailer, and checks that the three integrity factors
+stored there (CRC, data size, and member size) match those computed from the
+data.
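+
+@sp 1
+@noindent
+The trailer check described above can be sketched in C++ as follows. The
+field offsets are those documented for the member trailer (CRC32 in bytes
+0 to 3, data size in bytes 4 to 11, member size in bytes 12 to 19, all
+little endian). The function names @samp{get_le} and
+@samp{trailer_matches} are hypothetical, and the sketch assumes that the
+caller has already computed the CRC and both sizes while decoding.
+

```cpp
#include <cstdint>

// Illustrative sketch; the function names are hypothetical.
enum { trailer_size = 20 };

// Read a little endian value of 'bytes' bytes starting at p[0].
constexpr unsigned long long get_le( const uint8_t * const p,
                                     const int bytes )
  {
  return ( bytes <= 0 ) ? 0 :
         ( p[0] | ( get_le( p + 1, bytes - 1 ) << 8 ) );
  }

// Return true if the three integrity factors stored in the trailer match
// the values computed from the decoded data.
constexpr bool trailer_matches( const uint8_t * const trailer,
                                const uint32_t crc,
                                const unsigned long long data_size,
                                const unsigned long long member_size )
  {
  return get_le( trailer, 4 ) == crc &&
         get_le( trailer + 4, 8 ) == data_size &&
         get_le( trailer + 12, 8 ) == member_size;
  }
```

+
+A decompressor following this sketch would report a corrupt member when
+@samp{trailer_matches} returns false for any of the three factors.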
+
+
+@node Trailing data
+@chapter Extra data appended to the file
+@cindex trailing data
+
+Sometimes extra data are found appended to a lzip file after the last
+member. Such trailing data may be:
+
+@itemize @bullet
+@item
+Padding added to make the file size a multiple of some block size, for
+example when writing to a tape. It is safe to append any amount of
+padding zero bytes to a lzip file.
+
+@item
+Useful data added by the user; an "End Of File" string (to check that the
+file has not been truncated), a cryptographically secure hash, a description
+of file contents, etc. It is safe to append any amount of text to a lzip
+file as long as none of the first four bytes of the text matches the
+corresponding byte in the string "LZIP", and the text does not contain any
+zero bytes (null characters). Nonzero bytes and zero bytes can't be safely
+mixed in trailing data.
+
+@item
+Garbage added by some not totally successful copy operation.
+
+@item
+Malicious data added to the file in order to make its total size and
+hash value (for a chosen hash) coincide with those of another file.
+
+@item
+In rare cases, trailing data could be the corrupt header of another
+member. In multimember or concatenated files the probability of
+corruption happening in the magic bytes is 5 times smaller than the
+probability of getting a false positive caused by the corruption of the
+integrity information itself. Therefore it can be considered to be below
+the noise level. Additionally, the test used by clzip to discriminate
+trailing data from a corrupt header has a Hamming distance (HD) of 3,
+and the 3 bit flips must happen in different magic bytes for the test to
+fail. In any case, the option @option{--trailing-error} guarantees that
+any corrupt header is detected.
+@end itemize
+
+Trailing data are in no way part of the lzip file format, but tools
+reading lzip files are expected to behave as correctly and usefully as
+possible in the presence of trailing data.
+
+Trailing data can be safely ignored in most cases. In some cases, like
+that of user-added data, they are expected to be ignored. In those cases
+where a file containing trailing data must be rejected, the option
+@option{--trailing-error} can be used. @xref{--trailing-error}.
+
+
+@node Examples
+@chapter A small tutorial with examples
+@cindex examples
+
+WARNING! Even if clzip is bug-free, other causes may result in a corrupt
+compressed file (bugs in the system libraries, memory errors, etc).
+Therefore, if the data you are going to compress are important, give the
+option @option{--keep} to clzip and don't remove the original file until you
+check the compressed file with a command like
+@w{@samp{clzip -cd file.lz | cmp file -}}. Most RAM errors happening during
+compression can only be detected by comparing the compressed file with the
+original because the corruption happens before clzip compresses the RAM
+contents, resulting in a valid compressed file containing wrong data.
+
+@sp 1
+@noindent
+Example 1: Extract all the files from archive @samp{foo.tar.lz}.
+
+@example
+  tar -xf foo.tar.lz
+or
+  clzip -cd foo.tar.lz | tar -xf -
+@end example
+
+@sp 1
+@noindent
+Example 2: Replace a regular file with its compressed version @samp{file.lz}
+and show the compression ratio.
+
+@example
+clzip -v file
+@end example
+
+@sp 1
+@noindent
+Example 3: Like example 2 but the created @samp{file.lz} is multimember with
+a member size of @w{1 MiB}. The compression ratio is not shown.
+
+@example
+clzip -b 1MiB file
+@end example
+
+@sp 1
+@noindent
+Example 4: Restore a regular file from its compressed version
+@samp{file.lz}. If the operation is successful, @samp{file.lz} is removed.
+
+@example
+clzip -d file.lz
+@end example
+
+@sp 1
+@noindent
+Example 5: Check the integrity of the compressed file @samp{file.lz} and
+show status.
+
+@example
+clzip -tv file.lz
+@end example
+
+@sp 1
+@anchor{concat-example}
+@noindent
+Example 6: The right way of concatenating the decompressed output of two or
+more compressed files. @xref{Trailing data}.
+
+@example
+Don't do this
+  cat file1.lz file2.lz file3.lz | clzip -d -
+Do this instead
+  clzip -cd file1.lz file2.lz file3.lz
+@end example
+
+@sp 1
+@noindent
+Example 7: Decompress @samp{file.lz} partially until @w{10 KiB} of
+decompressed data are produced.
+
+@example
+clzip -cd file.lz | dd bs=1024 count=10
+@end example
+
+@sp 1
+@noindent
+Example 8: Decompress @samp{file.lz} partially from decompressed byte at
+offset 10000 to decompressed byte at offset 14999 (5000 bytes are produced).
+
+@example
+clzip -cd file.lz | dd bs=1000 skip=10 count=5
+@end example
+
+@sp 1
+@noindent
+Example 9: Compress a whole device in /dev/sdc and send the output to
+@samp{file.lz}.
+
+@example
+  clzip -c /dev/sdc > file.lz
+or
+  clzip /dev/sdc -o file.lz
+@end example
+
+@sp 1
+@noindent
+Example 10: Create a multivolume compressed tar archive with a volume size
+of @w{1440 KiB}.
+
+@example
+tar -c some_directory | clzip -S 1440KiB -o volume_name -
+@end example
+
+@sp 1
+@noindent
+Example 11: Extract a multivolume compressed tar archive.
+
+@example
+clzip -cd volume_name*.lz | tar -xf -
+@end example
+
+@sp 1
+@noindent
+Example 12: Create a multivolume compressed backup of a large database file
+with a volume size of @w{650 MB}, where each volume is a multimember file
+with a member size of @w{32 MiB}.
+
+@example
+clzip -b 32MiB -S 650MB big_db
+@end example
+
+
+@node Problems
+@chapter Reporting bugs
+@cindex bugs
+@cindex getting help
+
+There are probably bugs in clzip. There are certainly errors and
+omissions in this manual. If you report them, they will get fixed. If
+you don't, no one will ever know about them and they will remain unfixed
+for all eternity, if not longer.
+
+If you find a bug in clzip, please send electronic mail to
+@email{lzip-bug@@nongnu.org}. Include the version number, which you can
+find by running @w{@samp{clzip --version}}.
+
+
+@node Reference source code
+@appendix Reference source code
+@cindex reference source code
+
+@verbatim
+/* Lzd - Educational decompressor for the lzip format
+   Copyright (C) 2013-2024 Antonio Diaz Diaz.
+
+   This program is free software. Redistribution and use in source and
+   binary forms, with or without modification, are permitted provided
+   that the following conditions are met:
+
+   1. Redistributions of source code must retain the above copyright
+   notice, this list of conditions, and the following disclaimer.
+
+   2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions, and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+*/
+/*
+   Exit status: 0 for a normal exit, 1 for environmental problems
+   (file not found, invalid command-line options, I/O errors, etc), 2 to
+   indicate a corrupt or invalid input file.
+*/
+
+#include <algorithm>
+#include <cerrno>
+#include <cstdio>
+#include <cstdlib>
+#include <cstring>
+#include <stdint.h>
+#include <unistd.h>
+#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__
+#include <fcntl.h>
+#include <io.h>
+#endif
+
+
+class State
+  {
+  int st;
+
+public:
+  enum { states = 12 };
+  State() : st( 0 ) {}
+  int operator()() const { return st; }
+  bool is_char() const { return st < 7; }
+
+  void set_char()
+    {
+    const int next[states] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 };
+    st = next[st];
+    }
+  void set_match()     { st = ( st < 7 ) ? 7 : 10; }
+  void set_rep()       { st = ( st < 7 ) ? 8 : 11; }
+  void set_short_rep() { st = ( st < 7 ) ? 9 : 11; }
+  };
+
+
+enum {
+  min_dictionary_size = 1 << 12,
+  max_dictionary_size = 1 << 29,
+  literal_context_bits = 3,
+  literal_pos_state_bits = 0,				// not used
+  pos_state_bits = 2,
+  pos_states = 1 << pos_state_bits,
+  pos_state_mask = pos_states - 1,
+
+  len_states = 4,
+  dis_slot_bits = 6,
+  start_dis_model = 4,
+  end_dis_model = 14,
+  modeled_distances = 1 << ( end_dis_model / 2 ),	// 128
+  dis_align_bits = 4,
+  dis_align_size = 1 << dis_align_bits,
+
+  len_low_bits = 3,
+  len_mid_bits = 3,
+  len_high_bits = 8,
+  len_low_symbols = 1 << len_low_bits,
+  len_mid_symbols = 1 << len_mid_bits,
+  len_high_symbols = 1 << len_high_bits,
+  max_len_symbols = len_low_symbols + len_mid_symbols + len_high_symbols,
+
+  min_match_len = 2,					// must be 2
+
+  bit_model_move_bits = 5,
+  bit_model_total_bits = 11,
+  bit_model_total = 1 << bit_model_total_bits };
+
+struct Bit_model
+  {
+  int probability;
+  Bit_model() : probability( bit_model_total / 2 ) {}
+  };
+
+struct Len_model
+  {
+  Bit_model choice1;
+  Bit_model choice2;
+  Bit_model bm_low[pos_states][len_low_symbols];
+  Bit_model bm_mid[pos_states][len_mid_symbols];
+  Bit_model bm_high[len_high_symbols];
+  };
+
+
+class CRC32
+  {
+  uint32_t data[256];		// Table of CRCs of all 8-bit messages.
+
+public:
+  CRC32()
+    {
+    for( unsigned n = 0; n < 256; ++n )
+      {
+      unsigned c = n;
+      for( int k = 0; k < 8; ++k )
+        { if( c & 1 ) c = 0xEDB88320U ^ ( c >> 1 ); else c >>= 1; }
+      data[n] = c;
+      }
+    }
+
+  void update_buf( uint32_t & crc, const uint8_t * const buffer,
+                   const int size ) const
+    {
+    for( int i = 0; i < size; ++i )
+      crc = data[(crc^buffer[i])&0xFF] ^ ( crc >> 8 );
+    }
+  };
+
+const CRC32 crc32;
+
+
+enum { header_size = 6, trailer_size = 20 };
+typedef uint8_t Lzip_header[header_size]; // 0-3 magic bytes
+					  //   4 version
+					  //   5 coded dictionary size
+typedef uint8_t Lzip_trailer[trailer_size];
+			//  0-3  CRC32 of the uncompressed data
+			//  4-11 size of the uncompressed data
+			// 12-19 member size including header and trailer
+
+class Range_decoder
+  {
+  unsigned long long member_pos;
+  uint32_t code;
+  uint32_t range;
+
+public:
+  Range_decoder()
+    : member_pos( header_size ), code( 0 ), range( 0xFFFFFFFFU )
+    {
+    get_byte();			// discard first byte of the LZMA stream
+    for( int i = 0; i < 4; ++i ) code = ( code << 8 ) | get_byte();
+    }
+
+  uint8_t get_byte() { ++member_pos; return std::getc( stdin ); }
+  unsigned long long member_position() const { return member_pos; }
+
+  unsigned decode( const int num_bits )
+    {
+    unsigned symbol = 0;
+    for( int i = num_bits; i > 0; --i )
+      {
+      range >>= 1;
+      symbol <<= 1;
+      if( code >= range ) { code -= range; symbol |= 1; }
+      if( range <= 0x00FFFFFFU )			// normalize
+        { range <<= 8; code = ( code << 8 ) | get_byte(); }
+      }
+    return symbol;
+    }
+
+  bool decode_bit( Bit_model & bm )
+    {
+    bool symbol;
+    const uint32_t bound = ( range >> bit_model_total_bits ) * bm.probability;
+    if( code < bound )
+      {
+      range = bound;
+      bm.probability +=
+        ( bit_model_total - bm.probability ) >> bit_model_move_bits;
+      symbol = 0;
+      }
+    else
+      {
+      code -= bound;
+      range -= bound;
+      bm.probability -= bm.probability >> bit_model_move_bits;
+      symbol = 1;
+      }
+    if( range <= 0x00FFFFFFU )				// normalize
+      { range <<= 8; code = ( code << 8 ) | get_byte(); }
+    return symbol;
+    }
+
+  unsigned decode_tree( Bit_model bm[], const int num_bits )
+    {
+    unsigned symbol = 1;
+    for( int i = 0; i < num_bits; ++i )
+      symbol = ( symbol << 1 ) | decode_bit( bm[symbol] );
+    return symbol - ( 1 << num_bits );
+    }
+
+  unsigned decode_tree_reversed( Bit_model bm[], const int num_bits )
+    {
+    unsigned symbol = decode_tree( bm, num_bits );
+    unsigned reversed_symbol = 0;
+    for( int i = 0; i < num_bits; ++i )
+      {
+      reversed_symbol = ( reversed_symbol << 1 ) | ( symbol & 1 );
+      symbol >>= 1;
+      }
+    return reversed_symbol;
+    }
+
+  unsigned decode_matched( Bit_model bm[], const unsigned match_byte )
+    {
+    unsigned symbol = 1;
+    for( int i = 7; i >= 0; --i )
+      {
+      const bool match_bit = ( match_byte >> i ) & 1;
+      const bool bit = decode_bit( bm[symbol+(match_bit<<8)+0x100] );
+      symbol = ( symbol << 1 ) | bit;
+      if( match_bit != bit )
+        {
+        while( symbol < 0x100 )
+          symbol = ( symbol << 1 ) | decode_bit( bm[symbol] );
+        break;
+        }
+      }
+    return symbol & 0xFF;
+    }
+
+  unsigned decode_len( Len_model & lm, const int pos_state )
+    {
+    if( decode_bit( lm.choice1 ) == 0 )
+      return min_match_len +
+             decode_tree( lm.bm_low[pos_state], len_low_bits );
+    if( decode_bit( lm.choice2 ) == 0 )
+      return min_match_len + len_low_symbols +
+             decode_tree( lm.bm_mid[pos_state], len_mid_bits );
+    return min_match_len + len_low_symbols + len_mid_symbols +
+           decode_tree( lm.bm_high, len_high_bits );
+    }
+  };
+
+
+class LZ_decoder
+  {
+  unsigned long long partial_data_pos;
+  Range_decoder rdec;
+  const unsigned dictionary_size;
+  uint8_t * const buffer;	// output buffer
+  unsigned pos;			// current pos in buffer
+  unsigned stream_pos;		// first byte not yet written to stdout
+  uint32_t crc_;
+  bool pos_wrapped;
+
+  void flush_data();
+
+  uint8_t peek( const unsigned distance ) const
+    {
+    if( pos > distance ) return buffer[pos - distance - 1];
+    if( pos_wrapped ) return buffer[dictionary_size + pos - distance - 1];
+    return 0;			// prev_byte of first byte
+    }
+
+  void put_byte( const uint8_t b )
+    {
+    buffer[pos] = b;
+    if( ++pos >= dictionary_size ) flush_data();
+    }
+
+public:
+  explicit LZ_decoder( const unsigned dict_size )
+    :
+    partial_data_pos( 0 ),
+    dictionary_size( dict_size ),
+    buffer( new uint8_t[dictionary_size] ),
+    pos( 0 ),
+    stream_pos( 0 ),
+    crc_( 0xFFFFFFFFU ),
+    pos_wrapped( false )
+    {}
+
+  ~LZ_decoder() { delete[] buffer; }
+
+  unsigned crc() const { return crc_ ^ 0xFFFFFFFFU; }
+  unsigned long long data_position() const
+    { return partial_data_pos + pos; }
+  uint8_t get_byte() { return rdec.get_byte(); }
+  unsigned long long member_position() const
+    { return rdec.member_position(); }
+
+  bool decode_member();
+  };
+
+
+void LZ_decoder::flush_data()
+  {
+  if( pos > stream_pos )
+    {
+    const unsigned size = pos - stream_pos;
+    crc32.update_buf( crc_, buffer + stream_pos, size );
+    if( std::fwrite( buffer + stream_pos, 1, size, stdout ) != size )
+      { std::fprintf( stderr, "Write error: %s\n", std::strerror( errno ) );
+        std::exit( 1 ); }
+    if( pos >= dictionary_size )
+      { partial_data_pos += pos; pos = 0; pos_wrapped = true; }
+    stream_pos = pos;
+    }
+  }
+
+
+bool LZ_decoder::decode_member()	// Return false if error
+  {
+  Bit_model bm_literal[1<<literal_context_bits][0x300];
+  Bit_model bm_match[State::states][pos_states];
+  Bit_model bm_rep[State::states];
+  Bit_model bm_rep0[State::states];
+  Bit_model bm_rep1[State::states];
+  Bit_model bm_rep2[State::states];
+  Bit_model bm_len[State::states][pos_states];
+  Bit_model bm_dis_slot[len_states][1<<dis_slot_bits];
+  Bit_model bm_dis[modeled_distances-end_dis_model+1];
+  Bit_model bm_align[dis_align_size];
+  Len_model match_len_model;
+  Len_model rep_len_model;
+  unsigned rep0 = 0;		// rep[0-3] latest four distances
+  unsigned rep1 = 0;		// used for efficient coding of
+  unsigned rep2 = 0;		// repeated distances
+  unsigned rep3 = 0;
+  State state;
+
+  while( !std::feof( stdin ) && !std::ferror( stdin ) )
+    {
+    const int pos_state = data_position() & pos_state_mask;
+    if( rdec.decode_bit( bm_match[state()][pos_state] ) == 0 )	// 1st bit
+      {
+      // literal byte
+      const uint8_t prev_byte = peek( 0 );
+      const int literal_state = prev_byte >> ( 8 - literal_context_bits );
+      Bit_model * const bm = bm_literal[literal_state];
+      if( state.is_char() )
+        put_byte( rdec.decode_tree( bm, 8 ) );
+      else
+        put_byte( rdec.decode_matched( bm, peek( rep0 ) ) );
+      state.set_char();
+      continue;
+      }
+    // match or repeated match
+    int len;
+    if( rdec.decode_bit( bm_rep[state()] ) != 0 )		// 2nd bit
+      {
+      if( rdec.decode_bit( bm_rep0[state()] ) == 0 )		// 3rd bit
+        {
+        if( rdec.decode_bit( bm_len[state()][pos_state] ) == 0 ) // 4th bit
+          { state.set_short_rep(); put_byte( peek( rep0 ) ); continue; }
+        }
+      else
+        {
+        unsigned distance;
+        if( rdec.decode_bit( bm_rep1[state()] ) == 0 )		// 4th bit
+          distance = rep1;
+        else
+          {
+          if( rdec.decode_bit( bm_rep2[state()] ) == 0 )	// 5th bit
+            distance = rep2;
+          else
+            { distance = rep3; rep3 = rep2; }
+          rep2 = rep1;
+          }
+        rep1 = rep0;
+        rep0 = distance;
+        }
+      state.set_rep();
+      len = rdec.decode_len( rep_len_model, pos_state );
+      }
+    else					// match
+      {
+      rep3 = rep2; rep2 = rep1; rep1 = rep0;
+      len = rdec.decode_len( match_len_model, pos_state );
+      const int len_state = std::min( len - min_match_len, len_states - 1 );
+      rep0 = rdec.decode_tree( bm_dis_slot[len_state], dis_slot_bits );
+      if( rep0 >= start_dis_model )
+        {
+        const unsigned dis_slot = rep0;
+        const int direct_bits = ( dis_slot >> 1 ) - 1;
+        rep0 = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
+        if( dis_slot < end_dis_model )
+          rep0 += rdec.decode_tree_reversed( bm_dis + ( rep0 - dis_slot ),
+                                             direct_bits );
+        else
+          {
+          rep0 +=
+            rdec.decode( direct_bits - dis_align_bits ) << dis_align_bits;
+          rep0 += rdec.decode_tree_reversed( bm_align, dis_align_bits );
+          if( rep0 == 0xFFFFFFFFU )		// marker found
+            {
+            flush_data();
+            return len == min_match_len;	// End Of Stream marker
+            }
+          }
+        }
+      state.set_match();
+      if( rep0 >= dictionary_size || ( rep0 >= pos && !pos_wrapped ) )
+        { flush_data(); return false; }
+      }
+    for( int i = 0; i < len; ++i ) put_byte( peek( rep0 ) );
+    }
+  flush_data();
+  return false;
+  }
+
+
+int main( const int argc, const char * const argv[] )
+  {
+  if( argc > 2 || ( argc == 2 && std::strcmp( argv[1], "-d" ) != 0 ) )
+    {
+    std::printf(
+      "Lzd %s - Educational decompressor for the lzip format.\n"
+      "Study the source code to learn how a lzip decompressor works.\n"
+      "See the lzip manual for an explanation of the code.\n"
+      "\nUsage: %s [-d] < file.lz > file\n"
+      "Lzd decompresses from standard input to standard output.\n"
+      "\nCopyright (C) 2024 Antonio Diaz Diaz.\n"
+      "License 2-clause BSD.\n"
+      "This is free software: you are free to change and redistribute it.\n"
+      "There is NO WARRANTY, to the extent permitted by law.\n"
+      "Report bugs to lzip-bug@nongnu.org\n"
+      "Lzd home page: http://www.nongnu.org/lzip/lzd.html\n",
+      PROGVERSION, argv[0] );
+    return 0;
+    }
+
+#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__
+  setmode( STDIN_FILENO, O_BINARY );
+  setmode( STDOUT_FILENO, O_BINARY );
+#endif
+
+  for( bool first_member = true; ; first_member = false )
+    {
+    Lzip_header header;				// check header
+    for( int i = 0; i < header_size; ++i ) header[i] = std::getc( stdin );
+    if( std::feof( stdin ) || std::memcmp( header, "LZIP\x01", 5 ) != 0 )
+      {
+      if( first_member )
+        { std::fputs( "Bad magic number (file not in lzip format).\n",
+                      stderr ); return 2; }
+      break;					// ignore trailing data
+      }
+    unsigned dict_size = 1 << ( header[5] & 0x1F );
+    dict_size -= ( dict_size / 16 ) * ( ( header[5] >> 5 ) & 7 );
+    if( dict_size < min_dictionary_size || dict_size > max_dictionary_size )
+      { std::fputs( "Invalid dictionary size in member header.\n", stderr );
+        return 2; }
+
+    LZ_decoder decoder( dict_size );		// decode LZMA stream
+    if( !decoder.decode_member() )
+      { std::fputs( "Data error\n", stderr ); return 2; }
+
+    Lzip_trailer trailer;			// check trailer
+    for( int i = 0; i < trailer_size; ++i ) trailer[i] = decoder.get_byte();
+    int retval = 0;
+    unsigned crc = 0;
+    for( int i = 3; i >= 0; --i ) crc = ( crc << 8 ) + trailer[i];
+    if( crc != decoder.crc() )
+      { std::fputs( "CRC mismatch\n", stderr ); retval = 2; }
+
+    unsigned long long data_size = 0;
+    for( int i = 11; i >= 4; --i )
+      data_size = ( data_size << 8 ) + trailer[i];
+    if( data_size != decoder.data_position() )
+      { std::fputs( "Data size mismatch\n", stderr ); retval = 2; }
+
+    unsigned long long member_size = 0;
+    for( int i = 19; i >= 12; --i )
+      member_size = ( member_size << 8 ) + trailer[i];
+    if( member_size != decoder.member_position() )
+      { std::fputs( "Member size mismatch\n", stderr ); retval = 2; }
+    if( retval ) return retval;
+    }
+
+  if( std::fclose( stdout ) != 0 )
+    { std::fprintf( stderr, "Error closing stdout: %s\n",
+                    std::strerror( errno ) ); return 1; }
+  return 0;
+  }
+@end verbatim
+
+
+@node Concept index
+@unnumbered Concept index
+
+@printindex cp
+
+@bye
diff --git a/encoder.c b/encoder.c
new file mode 100644
index 0000000..1e2bd64
--- /dev/null
+++ b/encoder.c
@@ -0,0 +1,602 @@
+/* Clzip - LZMA lossless data compressor
+   Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+*/
+
+#define _FILE_OFFSET_BITS 64
+
+#include <errno.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "lzip.h"
+#include "encoder_base.h"
+#include "encoder.h"
+
+
+CRC32 crc32;
+
+
+int LZe_get_match_pairs( struct LZ_encoder * const e, struct Pair * pairs )
+  {
+  int len_limit = e->match_len_limit;
+  if( len_limit > Mb_available_bytes( &e->eb.mb ) )
+    {
+    len_limit = Mb_available_bytes( &e->eb.mb );
+    if( len_limit < 4 ) return 0;
+    }
+
+  int maxlen = 3;			/* only used if pairs != 0 */
+  int num_pairs = 0;
+  const int min_pos = ( e->eb.mb.pos > e->eb.mb.dictionary_size ) ?
+                        e->eb.mb.pos - e->eb.mb.dictionary_size : 0;
+  const uint8_t * const data = Mb_ptr_to_current_pos( &e->eb.mb );
+
+  unsigned tmp = crc32[data[0]] ^ data[1];
+  const int key2 = tmp & ( num_prev_positions2 - 1 );
+  tmp ^= (unsigned)data[2] << 8;
+  const int key3 = num_prev_positions2 + ( tmp & ( num_prev_positions3 - 1 ) );
+  const int key4 = num_prev_positions2 + num_prev_positions3 +
+                   ( ( tmp ^ ( crc32[data[3]] << 5 ) ) & e->eb.mb.key4_mask );
+
+  if( pairs )
+    {
+    const int np2 = e->eb.mb.prev_positions[key2];
+    const int np3 = e->eb.mb.prev_positions[key3];
+    if( np2 > min_pos && e->eb.mb.buffer[np2-1] == data[0] )
+      {
+      pairs[0].dis = e->eb.mb.pos - np2;
+      pairs[0].len = maxlen = 2 + ( np2 == np3 );
+      num_pairs = 1;
+      }
+    if( np2 != np3 && np3 > min_pos && e->eb.mb.buffer[np3-1] == data[0] )
+      {
+      maxlen = 3;
+      pairs[num_pairs++].dis = e->eb.mb.pos - np3;
+      }
+    if( num_pairs > 0 )
+      {
+      const int delta = pairs[num_pairs-1].dis + 1;
+      while( maxlen < len_limit && data[maxlen-delta] == data[maxlen] )
+        ++maxlen;
+      pairs[num_pairs-1].len = maxlen;
+      if( maxlen < 3 ) maxlen = 3;
+      if( maxlen >= len_limit ) pairs = 0;	/* done. now just skip */
+      }
+    }
+
+  const int pos1 = e->eb.mb.pos + 1;
+  e->eb.mb.prev_positions[key2] = pos1;
+  e->eb.mb.prev_positions[key3] = pos1;
+  int newpos1 = e->eb.mb.prev_positions[key4];
+  e->eb.mb.prev_positions[key4] = pos1;
+
+  int32_t * ptr0 = e->eb.mb.pos_array + ( e->eb.mb.cyclic_pos << 1 );
+  int32_t * ptr1 = ptr0 + 1;
+  int len = 0, len0 = 0, len1 = 0;
+
+  int count;
+  for( count = e->cycles; ; )
+    {
+    if( newpos1 <= min_pos || --count < 0 ) { *ptr0 = *ptr1 = 0; break; }
+
+    const int delta = pos1 - newpos1;
+    int32_t * const newptr = e->eb.mb.pos_array +
+      ( ( e->eb.mb.cyclic_pos - delta +
+          ( (e->eb.mb.cyclic_pos >= delta) ? 0 : e->eb.mb.dictionary_size + 1 ) ) << 1 );
+    if( data[len-delta] == data[len] )
+      {
+      while( ++len < len_limit && data[len-delta] == data[len] ) {}
+      if( pairs && maxlen < len )
+        {
+        pairs[num_pairs].dis = delta - 1;
+        pairs[num_pairs].len = maxlen = len;
+        ++num_pairs;
+        }
+      if( len >= len_limit )
+        {
+        *ptr0 = newptr[0];
+        *ptr1 = newptr[1];
+        break;
+        }
+      }
+    if( data[len-delta] < data[len] )
+      {
+      *ptr0 = newpos1;
+      ptr0 = newptr + 1;
+      newpos1 = *ptr0;
+      len0 = len; if( len1 < len ) len = len1;
+      }
+    else
+      {
+      *ptr1 = newpos1;
+      ptr1 = newptr;
+      newpos1 = *ptr1;
+      len1 = len; if( len0 < len ) len = len0;
+      }
+    }
+  return num_pairs;
+  }
+
+
+static void LZe_update_distance_prices( struct LZ_encoder * const e )
+  {
+  int dis, len_state;
+  for( dis = start_dis_model; dis < modeled_distances; ++dis )
+    {
+    const int dis_slot = dis_slots[dis];
+    const int direct_bits = ( dis_slot >> 1 ) - 1;
+    const int base = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
+    const int price = price_symbol_reversed( e->eb.bm_dis + ( base - dis_slot ),
+                                             dis - base, direct_bits );
+    for( len_state = 0; len_state < len_states; ++len_state )
+      e->dis_prices[len_state][dis] = price;
+    }
+
+  for( len_state = 0; len_state < len_states; ++len_state )
+    {
+    int * const dsp = e->dis_slot_prices[len_state];
+    const Bit_model * const bmds = e->eb.bm_dis_slot[len_state];
+    int slot = 0;
+    for( ; slot < end_dis_model; ++slot )
+      dsp[slot] = price_symbol6( bmds, slot );
+    for( ; slot < e->num_dis_slots; ++slot )
+      dsp[slot] = price_symbol6( bmds, slot ) +
+                  (((( slot >> 1 ) - 1 ) - dis_align_bits ) << price_shift_bits );
+
+    int * const dp = e->dis_prices[len_state];
+    for( dis = 0; dis < start_dis_model; ++dis )
+      dp[dis] = dsp[dis];
+    for( ; dis < modeled_distances; ++dis )
+      dp[dis] += dsp[dis_slots[dis]];
+    }
+  }
+
+
+/* Return the number of bytes advanced (ahead).
+   trials[0]..trials[ahead-1] contain the steps to encode.
+   ( trials[0].dis4 == -1 ) means literal.
+   A match/rep of length >= match_len_limit finishes the sequence.
+*/
+static int LZe_sequence_optimizer( struct LZ_encoder * const e,
+                                   const int reps[num_rep_distances],
+                                   const State state )
+  {
+  int num_pairs, num_trials;
+  int i, rep, len;
+
+  if( e->pending_num_pairs > 0 )		/* from previous call */
+    {
+    num_pairs = e->pending_num_pairs;
+    e->pending_num_pairs = 0;
+    }
+  else
+    num_pairs = LZe_read_match_distances( e );
+  const int main_len = ( num_pairs > 0 ) ? e->pairs[num_pairs-1].len : 0;
+
+  int replens[num_rep_distances];
+  int rep_index = 0;
+  for( i = 0; i < num_rep_distances; ++i )
+    {
+    replens[i] = Mb_true_match_len( &e->eb.mb, 0, reps[i] + 1 );
+    if( replens[i] > replens[rep_index] ) rep_index = i;
+    }
+  if( replens[rep_index] >= e->match_len_limit )
+    {
+    e->trials[0].price = replens[rep_index];
+    e->trials[0].dis4 = rep_index;
+    LZe_move_and_update( e, replens[rep_index] );
+    return replens[rep_index];
+    }
+
+  if( main_len >= e->match_len_limit )
+    {
+    e->trials[0].price = main_len;
+    e->trials[0].dis4 = e->pairs[num_pairs-1].dis + num_rep_distances;
+    LZe_move_and_update( e, main_len );
+    return main_len;
+    }
+
+  const int pos_state = Mb_data_position( &e->eb.mb ) & pos_state_mask;
+  const uint8_t prev_byte = Mb_peek( &e->eb.mb, 1 );
+  const uint8_t cur_byte = Mb_peek( &e->eb.mb, 0 );
+  const uint8_t match_byte = Mb_peek( &e->eb.mb, reps[0] + 1 );
+
+  e->trials[1].price = price0( e->eb.bm_match[state][pos_state] );
+  if( St_is_char( state ) )
+    e->trials[1].price += LZeb_price_literal( &e->eb, prev_byte, cur_byte );
+  else
+    e->trials[1].price += LZeb_price_matched( &e->eb, prev_byte, cur_byte, match_byte );
+  e->trials[1].dis4 = -1;				/* literal */
+
+  const int match_price = price1( e->eb.bm_match[state][pos_state] );
+  const int rep_match_price = match_price + price1( e->eb.bm_rep[state] );
+
+  if( match_byte == cur_byte )
+    Tr_update( &e->trials[1], rep_match_price +
+               LZeb_price_shortrep( &e->eb, state, pos_state ), 0, 0 );
+
+  num_trials = max( main_len, replens[rep_index] );
+
+  if( num_trials < min_match_len )
+    {
+    e->trials[0].price = 1;
+    e->trials[0].dis4 = e->trials[1].dis4;
+    Mb_move_pos( &e->eb.mb );
+    return 1;
+    }
+
+  e->trials[0].state = state;
+  for( i = 0; i < num_rep_distances; ++i )
+    e->trials[0].reps[i] = reps[i];
+
+  for( len = min_match_len; len <= num_trials; ++len )
+    e->trials[len].price = infinite_price;
+
+  for( rep = 0; rep < num_rep_distances; ++rep )
+    {
+    if( replens[rep] < min_match_len ) continue;
+    const int price = rep_match_price + LZeb_price_rep( &e->eb, rep, state, pos_state );
+    for( len = min_match_len; len <= replens[rep]; ++len )
+      Tr_update( &e->trials[len], price +
+                 Lp_price( &e->rep_len_prices, len, pos_state ), rep, 0 );
+    }
+
+  if( main_len > replens[0] )
+    {
+    const int normal_match_price = match_price + price0( e->eb.bm_rep[state] );
+    int i = 0, len = max( replens[0] + 1, min_match_len );
+    while( len > e->pairs[i].len ) ++i;
+    while( true )
+      {
+      const int dis = e->pairs[i].dis;
+      Tr_update( &e->trials[len], normal_match_price +
+                 LZe_price_pair( e, dis, len, pos_state ),
+                 dis + num_rep_distances, 0 );
+      if( ++len > e->pairs[i].len && ++i >= num_pairs ) break;
+      }
+    }
+
+  int cur = 0;
+  while( true )				/* price optimization loop */
+    {
+    Mb_move_pos( &e->eb.mb );
+    if( ++cur >= num_trials )		/* no more initialized trials */
+      {
+      LZe_backward( e, cur );
+      return cur;
+      }
+
+    const int num_pairs = LZe_read_match_distances( e );
+    const int newlen = ( num_pairs > 0 ) ? e->pairs[num_pairs-1].len : 0;
+    if( newlen >= e->match_len_limit )
+      {
+      e->pending_num_pairs = num_pairs;
+      LZe_backward( e, cur );
+      return cur;
+      }
+
+    /* give final values to current trial */
+    struct Trial * cur_trial = &e->trials[cur];
+    State cur_state;
+    {
+    const int dis4 = cur_trial->dis4;
+    int prev_index = cur_trial->prev_index;
+    const int prev_index2 = cur_trial->prev_index2;
+
+    if( prev_index2 == single_step_trial )
+      {
+      cur_state = e->trials[prev_index].state;
+      if( prev_index + 1 == cur )			/* len == 1 */
+        {
+        if( dis4 == 0 ) cur_state = St_set_short_rep( cur_state );
+        else cur_state = St_set_char( cur_state );	/* literal */
+        }
+      else if( dis4 < num_rep_distances ) cur_state = St_set_rep( cur_state );
+      else cur_state = St_set_match( cur_state );
+      }
+    else
+      {
+      if( prev_index2 == dual_step_trial )	/* dis4 == 0 (rep0) */
+        --prev_index;
+      else					/* prev_index2 >= 0 */
+        prev_index = prev_index2;
+      cur_state = St_set_char_rep();
+      }
+    cur_trial->state = cur_state;
+    for( i = 0; i < num_rep_distances; ++i )
+      cur_trial->reps[i] = e->trials[prev_index].reps[i];
+    mtf_reps( dis4, cur_trial->reps );		/* literal is ignored */
+    }
+
+    const int pos_state = Mb_data_position( &e->eb.mb ) & pos_state_mask;
+    const uint8_t prev_byte = Mb_peek( &e->eb.mb, 1 );
+    const uint8_t cur_byte = Mb_peek( &e->eb.mb, 0 );
+    const uint8_t match_byte = Mb_peek( &e->eb.mb, cur_trial->reps[0] + 1 );
+
+    int next_price = cur_trial->price +
+                     price0( e->eb.bm_match[cur_state][pos_state] );
+    if( St_is_char( cur_state ) )
+      next_price += LZeb_price_literal( &e->eb, prev_byte, cur_byte );
+    else
+      next_price += LZeb_price_matched( &e->eb, prev_byte, cur_byte, match_byte );
+
+    /* try last updates to next trial */
+    struct Trial * next_trial = &e->trials[cur+1];
+
+    Tr_update( next_trial, next_price, -1, cur );	/* literal */
+
+    const int match_price = cur_trial->price + price1( e->eb.bm_match[cur_state][pos_state] );
+    const int rep_match_price = match_price + price1( e->eb.bm_rep[cur_state] );
+
+    if( match_byte == cur_byte && next_trial->dis4 != 0 &&
+        next_trial->prev_index2 == single_step_trial )
+      {
+      const int price = rep_match_price +
+                        LZeb_price_shortrep( &e->eb, cur_state, pos_state );
+      if( price <= next_trial->price )
+        {
+        next_trial->price = price;
+        next_trial->dis4 = 0;				/* rep0 */
+        next_trial->prev_index = cur;
+        }
+      }
+
+    const int triable_bytes =
+      min( Mb_available_bytes( &e->eb.mb ), max_num_trials - 1 - cur );
+    if( triable_bytes < min_match_len ) continue;
+
+    const int len_limit = min( e->match_len_limit, triable_bytes );
+
+    /* try literal + rep0 */
+    if( match_byte != cur_byte && next_trial->prev_index != cur )
+      {
+      const uint8_t * const data = Mb_ptr_to_current_pos( &e->eb.mb );
+      const int dis = cur_trial->reps[0] + 1;
+      const int limit = min( e->match_len_limit + 1, triable_bytes );
+      int len = 1;
+      while( len < limit && data[len-dis] == data[len] ) ++len;
+      if( --len >= min_match_len )
+        {
+        const int pos_state2 = ( pos_state + 1 ) & pos_state_mask;
+        const State state2 = St_set_char( cur_state );
+        const int price = next_price +
+                          price1( e->eb.bm_match[state2][pos_state2] ) +
+                          price1( e->eb.bm_rep[state2] ) +
+                          LZe_price_rep0_len( e, len, state2, pos_state2 );
+        while( num_trials < cur + 1 + len )
+          e->trials[++num_trials].price = infinite_price;
+        Tr_update2( &e->trials[cur+1+len], price, cur + 1 );
+        }
+      }
+
+    int start_len = min_match_len;
+
+    /* try rep distances */
+    for( rep = 0; rep < num_rep_distances; ++rep )
+      {
+      const uint8_t * const data = Mb_ptr_to_current_pos( &e->eb.mb );
+      const int dis = cur_trial->reps[rep] + 1;
+
+      if( data[0-dis] != data[0] || data[1-dis] != data[1] ) continue;
+      for( len = min_match_len; len < len_limit; ++len )
+        if( data[len-dis] != data[len] ) break;
+      while( num_trials < cur + len )
+        e->trials[++num_trials].price = infinite_price;
+      int price = rep_match_price + LZeb_price_rep( &e->eb, rep, cur_state, pos_state );
+      for( i = min_match_len; i <= len; ++i )
+        Tr_update( &e->trials[cur+i], price +
+                   Lp_price( &e->rep_len_prices, i, pos_state ), rep, cur );
+
+      if( rep == 0 ) start_len = len + 1;	/* discard shorter matches */
+
+      /* try rep + literal + rep0 */
+      int len2 = len + 1;
+      const int limit = min( e->match_len_limit + len2, triable_bytes );
+      while( len2 < limit && data[len2-dis] == data[len2] ) ++len2;
+      len2 -= len + 1;
+      if( len2 < min_match_len ) continue;
+
+      int pos_state2 = ( pos_state + len ) & pos_state_mask;
+      State state2 = St_set_rep( cur_state );
+      price += Lp_price( &e->rep_len_prices, len, pos_state ) +
+               price0( e->eb.bm_match[state2][pos_state2] ) +
+               LZeb_price_matched( &e->eb, data[len-1], data[len], data[len-dis] );
+      pos_state2 = ( pos_state2 + 1 ) & pos_state_mask;
+      state2 = St_set_char( state2 );
+      price += price1( e->eb.bm_match[state2][pos_state2] ) +
+               price1( e->eb.bm_rep[state2] ) +
+               LZe_price_rep0_len( e, len2, state2, pos_state2 );
+      while( num_trials < cur + len + 1 + len2 )
+        e->trials[++num_trials].price = infinite_price;
+      Tr_update3( &e->trials[cur+len+1+len2], price, rep, cur + len + 1, cur );
+      }
+
+    /* try matches */
+    if( newlen >= start_len && newlen <= len_limit )
+      {
+      const int normal_match_price = match_price +
+                                     price0( e->eb.bm_rep[cur_state] );
+
+      while( num_trials < cur + newlen )
+        e->trials[++num_trials].price = infinite_price;
+
+      int i = 0;
+      while( e->pairs[i].len < start_len ) ++i;
+      int dis = e->pairs[i].dis;
+      for( len = start_len; ; ++len )
+        {
+        int price = normal_match_price + LZe_price_pair( e, dis, len, pos_state );
+        Tr_update( &e->trials[cur+len], price, dis + num_rep_distances, cur );
+
+        /* try match + literal + rep0 */
+        if( len == e->pairs[i].len )
+          {
+          const uint8_t * const data = Mb_ptr_to_current_pos( &e->eb.mb );
+          const int dis2 = dis + 1;
+          int len2 = len + 1;
+          const int limit = min( e->match_len_limit + len2, triable_bytes );
+          while( len2 < limit && data[len2-dis2] == data[len2] ) ++len2;
+          len2 -= len + 1;
+          if( len2 >= min_match_len )
+            {
+            int pos_state2 = ( pos_state + len ) & pos_state_mask;
+            State state2 = St_set_match( cur_state );
+            price += price0( e->eb.bm_match[state2][pos_state2] ) +
+                     LZeb_price_matched( &e->eb, data[len-1], data[len], data[len-dis2] );
+            pos_state2 = ( pos_state2 + 1 ) & pos_state_mask;
+            state2 = St_set_char( state2 );
+            price += price1( e->eb.bm_match[state2][pos_state2] ) +
+                     price1( e->eb.bm_rep[state2] ) +
+                     LZe_price_rep0_len( e, len2, state2, pos_state2 );
+
+            while( num_trials < cur + len + 1 + len2 )
+              e->trials[++num_trials].price = infinite_price;
+            Tr_update3( &e->trials[cur+len+1+len2], price,
+                        dis + num_rep_distances, cur + len + 1, cur );
+            }
+          if( ++i >= num_pairs ) break;
+          dis = e->pairs[i].dis;
+          }
+        }
+      }
+    }
+  }
+
+
+bool LZe_encode_member( struct LZ_encoder * const e,
+                        const unsigned long long member_size )
+  {
+  const unsigned long long member_size_limit =
+    member_size - Lt_size - max_marker_size;
+  const bool best = ( e->match_len_limit > 12 );
+  const int dis_price_count = best ? 1 : 512;
+  const int align_price_count = best ? 1 : dis_align_size;
+  const int price_count = ( e->match_len_limit > 36 ) ? 1013 : 4093;
+  int price_counter = 0;		/* counters may decrement below 0 */
+  int dis_price_counter = 0;
+  int align_price_counter = 0;
+  int i;
+  int reps[num_rep_distances];
+  State state = 0;
+  for( i = 0; i < num_rep_distances; ++i ) reps[i] = 0;
+
+  if( Mb_data_position( &e->eb.mb ) != 0 ||
+      Re_member_position( &e->eb.renc ) != Lh_size )
+    return false;				/* can be called only once */
+
+  if( !Mb_data_finished( &e->eb.mb ) )		/* encode first byte */
+    {
+    const uint8_t prev_byte = 0;
+    const uint8_t cur_byte = Mb_peek( &e->eb.mb, 0 );
+    Re_encode_bit( &e->eb.renc, &e->eb.bm_match[state][0], 0 );
+    LZeb_encode_literal( &e->eb, prev_byte, cur_byte );
+    CRC32_update_byte( &e->eb.crc, cur_byte );
+    LZe_get_match_pairs( e, 0 );
+    Mb_move_pos( &e->eb.mb );
+    }
+
+  while( !Mb_data_finished( &e->eb.mb ) )
+    {
+    if( price_counter <= 0 && e->pending_num_pairs == 0 )
+      {
+      price_counter = price_count;	/* recalculate prices every price_count bytes */
+      if( dis_price_counter <= 0 )
+        { dis_price_counter = dis_price_count; LZe_update_distance_prices( e ); }
+      if( align_price_counter <= 0 )
+        {
+        align_price_counter = align_price_count;
+        for( i = 0; i < dis_align_size; ++i )
+          e->align_prices[i] = price_symbol_reversed( e->eb.bm_align, i, dis_align_bits );
+        }
+      Lp_update_prices( &e->match_len_prices );
+      Lp_update_prices( &e->rep_len_prices );
+      }
+
+    int ahead = LZe_sequence_optimizer( e, reps, state );
+    price_counter -= ahead;
+
+    for( i = 0; ahead > 0; )
+      {
+      const int pos_state =
+        ( Mb_data_position( &e->eb.mb ) - ahead ) & pos_state_mask;
+      const int len = e->trials[i].price;
+      int dis = e->trials[i].dis4;
+
+      bool bit = ( dis < 0 );
+      Re_encode_bit( &e->eb.renc, &e->eb.bm_match[state][pos_state], !bit );
+      if( bit )					/* literal byte */
+        {
+        const uint8_t prev_byte = Mb_peek( &e->eb.mb, ahead + 1 );
+        const uint8_t cur_byte = Mb_peek( &e->eb.mb, ahead );
+        CRC32_update_byte( &e->eb.crc, cur_byte );
+        if( ( state = St_set_char( state ) ) < 4 )
+          LZeb_encode_literal( &e->eb, prev_byte, cur_byte );
+        else
+          {
+          const uint8_t match_byte = Mb_peek( &e->eb.mb, ahead + reps[0] + 1 );
+          LZeb_encode_matched( &e->eb, prev_byte, cur_byte, match_byte );
+          }
+        }
+      else					/* match or repeated match */
+        {
+        CRC32_update_buf( &e->eb.crc, Mb_ptr_to_current_pos( &e->eb.mb ) - ahead, len );
+        mtf_reps( dis, reps );
+        bit = ( dis < num_rep_distances );
+        Re_encode_bit( &e->eb.renc, &e->eb.bm_rep[state], bit );
+        if( bit )				/* repeated match */
+          {
+          bit = ( dis == 0 );
+          Re_encode_bit( &e->eb.renc, &e->eb.bm_rep0[state], !bit );
+          if( bit )
+            Re_encode_bit( &e->eb.renc, &e->eb.bm_len[state][pos_state], len > 1 );
+          else
+            {
+            Re_encode_bit( &e->eb.renc, &e->eb.bm_rep1[state], dis > 1 );
+            if( dis > 1 )
+              Re_encode_bit( &e->eb.renc, &e->eb.bm_rep2[state], dis > 2 );
+            }
+          if( len == 1 ) state = St_set_short_rep( state );
+          else
+            {
+            Re_encode_len( &e->eb.renc, &e->eb.rep_len_model, len, pos_state );
+            Lp_decrement_counter( &e->rep_len_prices, pos_state );
+            state = St_set_rep( state );
+            }
+          }
+        else					/* match */
+          {
+          dis -= num_rep_distances;
+          LZeb_encode_pair( &e->eb, dis, len, pos_state );
+          if( dis >= modeled_distances ) --align_price_counter;
+          --dis_price_counter;
+          Lp_decrement_counter( &e->match_len_prices, pos_state );
+          state = St_set_match( state );
+          }
+        }
+      ahead -= len; i += len;
+      if( Re_member_position( &e->eb.renc ) >= member_size_limit )
+        {
+        if( !Mb_dec_pos( &e->eb.mb, ahead ) ) return false;
+        LZeb_full_flush( &e->eb, state );
+        return true;
+        }
+      }
+    }
+  LZeb_full_flush( &e->eb, state );
+  return true;
+  }
diff --git a/encoder.h b/encoder.h
new file mode 100644
index 0000000..be36341
--- /dev/null
+++ b/encoder.h
@@ -0,0 +1,312 @@
+/* Clzip - LZMA lossless data compressor
+   Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+*/
+
+struct Len_prices
+  {
+  const struct Len_model * lm;
+  int len_symbols;
+  int count;
+  int prices[pos_states][max_len_symbols];
+  int counters[pos_states];			/* may decrement below 0 */
+  };
+
+static inline void Lp_update_low_mid_prices( struct Len_prices * const lp,
+                                             const int pos_state )
+  {
+  int * const pps = lp->prices[pos_state];
+  int tmp = price0( lp->lm->choice1 );
+  int len = 0;
+  for( ; len < len_low_symbols && len < lp->len_symbols; ++len )
+    pps[len] = tmp + price_symbol3( lp->lm->bm_low[pos_state], len );
+  if( len >= lp->len_symbols ) return;
+  tmp = price1( lp->lm->choice1 ) + price0( lp->lm->choice2 );
+  for( ; len < len_low_symbols + len_mid_symbols && len < lp->len_symbols; ++len )
+    pps[len] = tmp +
+               price_symbol3( lp->lm->bm_mid[pos_state], len - len_low_symbols );
+  }
+
+static inline void Lp_update_high_prices( struct Len_prices * const lp )
+  {
+  const int tmp = price1( lp->lm->choice1 ) + price1( lp->lm->choice2 );
+  int len;
+  for( len = len_low_symbols + len_mid_symbols; len < lp->len_symbols; ++len )
+    /* using 4 slots per value makes "Lp_price" faster */
+    lp->prices[3][len] = lp->prices[2][len] =
+    lp->prices[1][len] = lp->prices[0][len] = tmp +
+      price_symbol8( lp->lm->bm_high, len - len_low_symbols - len_mid_symbols );
+  }
+
+static inline void Lp_reset( struct Len_prices * const lp )
+  { int i; for( i = 0; i < pos_states; ++i ) lp->counters[i] = 0; }
+
+static inline void Lp_init( struct Len_prices * const lp,
+                            const struct Len_model * const lm,
+                            const int match_len_limit )
+  {
+  lp->lm = lm;
+  lp->len_symbols = match_len_limit + 1 - min_match_len;
+  lp->count = ( match_len_limit > 12 ) ? 1 : lp->len_symbols;
+  Lp_reset( lp );
+  }
+
+static inline void Lp_decrement_counter( struct Len_prices * const lp,
+                                         const int pos_state )
+  { --lp->counters[pos_state]; }
+
+static inline void Lp_update_prices( struct Len_prices * const lp )
+  {
+  int pos_state;
+  bool high_pending = false;
+  for( pos_state = 0; pos_state < pos_states; ++pos_state )
+    if( lp->counters[pos_state] <= 0 )
+      { lp->counters[pos_state] = lp->count;
+        Lp_update_low_mid_prices( lp, pos_state ); high_pending = true; }
+  if( high_pending && lp->len_symbols > len_low_symbols + len_mid_symbols )
+    Lp_update_high_prices( lp );
+  }
+
+static inline int Lp_price( const struct Len_prices * const lp,
+                            const int len, const int pos_state )
+  { return lp->prices[pos_state][len - min_match_len]; }
+
+
+struct Pair			/* distance-length pair */
+  {
+  int dis;
+  int len;
+  };
+
+enum { infinite_price = 0x0FFFFFFF,
+       max_num_trials = 1 << 13,
+       single_step_trial = -2,
+       dual_step_trial = -1 };
+
+struct Trial
+  {
+  State state;
+  int price;		/* dual use var; cumulative price, match length */
+  int dis4;		/* -1 for literal, or rep, or match distance + 4 */
+  int prev_index;	/* index of prev trial in trials[] */
+  int prev_index2;	/*   -2  trial is single step */
+			/*   -1  literal + rep0 */
+			/* >= 0  ( rep or match ) + literal + rep0 */
+  int reps[num_rep_distances];
+  };
+
+static inline void Tr_update( struct Trial * const trial, const int pr,
+                              const int distance4, const int p_i )
+  {
+  if( pr < trial->price )
+    { trial->price = pr; trial->dis4 = distance4; trial->prev_index = p_i;
+      trial->prev_index2 = single_step_trial; }
+  }
+
+static inline void Tr_update2( struct Trial * const trial, const int pr,
+                               const int p_i )
+  {
+  if( pr < trial->price )
+    { trial->price = pr; trial->dis4 = 0; trial->prev_index = p_i;
+      trial->prev_index2 = dual_step_trial; }
+  }
+
+static inline void Tr_update3( struct Trial * const trial, const int pr,
+                               const int distance4, const int p_i,
+                               const int p_i2 )
+  {
+  if( pr < trial->price )
+    { trial->price = pr; trial->dis4 = distance4; trial->prev_index = p_i;
+      trial->prev_index2 = p_i2; }
+  }
+
+
+struct LZ_encoder
+  {
+  struct LZ_encoder_base eb;
+  int cycles;
+  int match_len_limit;
+  struct Len_prices match_len_prices;
+  struct Len_prices rep_len_prices;
+  int pending_num_pairs;
+  struct Pair pairs[max_match_len+1];
+  struct Trial trials[max_num_trials];
+
+  int dis_slot_prices[len_states][2*max_dictionary_bits];
+  int dis_prices[len_states][modeled_distances];
+  int align_prices[dis_align_size];
+  int num_dis_slots;
+  };
+
+static inline bool Mb_dec_pos( struct Matchfinder_base * const mb,
+                               const int ahead )
+  {
+  if( ahead < 0 || mb->pos < ahead ) return false;
+  mb->pos -= ahead;
+  if( mb->cyclic_pos < ahead ) mb->cyclic_pos += mb->dictionary_size + 1;
+  mb->cyclic_pos -= ahead;
+  return true;
+  }
+
+int LZe_get_match_pairs( struct LZ_encoder * const e, struct Pair * pairs );
+
+       /* move-to-front dis in/into reps; do nothing if( dis4 <= 0 ) */
+static inline void mtf_reps( const int dis4, int reps[num_rep_distances] )
+  {
+  if( dis4 >= num_rep_distances )			/* match */
+    {
+    reps[3] = reps[2]; reps[2] = reps[1]; reps[1] = reps[0];
+    reps[0] = dis4 - num_rep_distances;
+    }
+  else if( dis4 > 0 )				/* repeated match */
+    {
+    const int distance = reps[dis4];
+    int i; for( i = dis4; i > 0; --i ) reps[i] = reps[i-1];
+    reps[0] = distance;
+    }
+  }
+
+static inline int LZeb_price_shortrep( const struct LZ_encoder_base * const eb,
+                                       const State state, const int pos_state )
+  {
+  return price0( eb->bm_rep0[state] ) + price0( eb->bm_len[state][pos_state] );
+  }
+
+static inline int LZeb_price_rep( const struct LZ_encoder_base * const eb,
+                                  const int rep, const State state,
+                                  const int pos_state )
+  {
+  if( rep == 0 ) return price0( eb->bm_rep0[state] ) +
+                        price1( eb->bm_len[state][pos_state] );
+  int price = price1( eb->bm_rep0[state] );
+  if( rep == 1 )
+    price += price0( eb->bm_rep1[state] );
+  else
+    {
+    price += price1( eb->bm_rep1[state] );
+    price += price_bit( eb->bm_rep2[state], rep - 2 );
+    }
+  return price;
+  }
+
+static inline int LZe_price_rep0_len( const struct LZ_encoder * const e,
+                                      const int len, const State state,
+                                      const int pos_state )
+  {
+  return LZeb_price_rep( &e->eb, 0, state, pos_state ) +
+         Lp_price( &e->rep_len_prices, len, pos_state );
+  }
+
+static inline int LZe_price_pair( const struct LZ_encoder * const e,
+                                  const int dis, const int len,
+                                  const int pos_state )
+  {
+  const int price = Lp_price( &e->match_len_prices, len, pos_state );
+  const int len_state = get_len_state( len );
+  if( dis < modeled_distances )
+    return price + e->dis_prices[len_state][dis];
+  else
+    return price + e->dis_slot_prices[len_state][get_slot( dis )] +
+           e->align_prices[dis & (dis_align_size - 1)];
+  }
+
+static inline int LZe_read_match_distances( struct LZ_encoder * const e )
+  {
+  const int num_pairs = LZe_get_match_pairs( e, e->pairs );
+  if( num_pairs > 0 )
+    {
+    const int len = e->pairs[num_pairs-1].len;
+    if( len == e->match_len_limit && len < max_match_len )
+      e->pairs[num_pairs-1].len =
+        Mb_true_match_len( &e->eb.mb, len, e->pairs[num_pairs-1].dis + 1 );
+    }
+  return num_pairs;
+  }
+
+static inline void LZe_move_and_update( struct LZ_encoder * const e, int n )
+  {
+  while( true )
+    {
+    Mb_move_pos( &e->eb.mb );
+    if( --n <= 0 ) break;
+    LZe_get_match_pairs( e, 0 );
+    }
+  }
+
+static inline void LZe_backward( struct LZ_encoder * const e, int cur )
+  {
+  int dis4 = e->trials[cur].dis4;
+  while( cur > 0 )
+    {
+    const int prev_index = e->trials[cur].prev_index;
+    struct Trial * const prev_trial = &e->trials[prev_index];
+
+    if( e->trials[cur].prev_index2 != single_step_trial )
+      {
+      prev_trial->dis4 = -1;					/* literal */
+      prev_trial->prev_index = prev_index - 1;
+      prev_trial->prev_index2 = single_step_trial;
+      if( e->trials[cur].prev_index2 >= 0 )
+        {
+        struct Trial * const prev_trial2 = &e->trials[prev_index-1];
+        prev_trial2->dis4 = dis4; dis4 = 0;			/* rep0 */
+        prev_trial2->prev_index = e->trials[cur].prev_index2;
+        prev_trial2->prev_index2 = single_step_trial;
+        }
+      }
+    prev_trial->price = cur - prev_index;			/* len */
+    cur = dis4; dis4 = prev_trial->dis4; prev_trial->dis4 = cur;	/* swap */
+    cur = prev_index;
+    }
+  }
+
+enum { num_prev_positions3 = 1 << 16,
+       num_prev_positions2 = 1 << 10 };
+
+static inline bool LZe_init( struct LZ_encoder * const e,
+                             const int dict_size, const int len_limit,
+                             const int ifd, const int outfd )
+  {
+  enum { before_size = max_num_trials,
+         /* bytes to keep in buffer after pos */
+         after_size = ( 2 * max_match_len ) + 1,
+         dict_factor = 2,
+         num_prev_positions23 = num_prev_positions2 + num_prev_positions3,
+         pos_array_factor = 2 };
+
+  if( !LZeb_init( &e->eb, before_size, dict_size, after_size, dict_factor,
+                  num_prev_positions23, pos_array_factor, ifd, outfd ) )
+    return false;
+  e->cycles = ( len_limit < max_match_len ) ? 16 + ( len_limit / 2 ) : 256;
+  e->match_len_limit = len_limit;
+  Lp_init( &e->match_len_prices, &e->eb.match_len_model, e->match_len_limit );
+  Lp_init( &e->rep_len_prices, &e->eb.rep_len_model, e->match_len_limit );
+  e->pending_num_pairs = 0;
+  e->num_dis_slots = 2 * real_bits( e->eb.mb.dictionary_size - 1 );
+  e->trials[1].prev_index = 0;
+  e->trials[1].prev_index2 = single_step_trial;
+  return true;
+  }
+
+static inline void LZe_reset( struct LZ_encoder * const e )
+  {
+  LZeb_reset( &e->eb );
+  Lp_reset( &e->match_len_prices );
+  Lp_reset( &e->rep_len_prices );
+  e->pending_num_pairs = 0;
+  }
+
+bool LZe_encode_member( struct LZ_encoder * const e,
+                        const unsigned long long member_size );
diff --git a/encoder_base.c b/encoder_base.c
new file mode 100644
index 0000000..5f40f9b
--- /dev/null
+++ b/encoder_base.c
@@ -0,0 +1,198 @@
+/* Clzip - LZMA lossless data compressor
+   Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+*/
+
+#define _FILE_OFFSET_BITS 64
+
+#include <errno.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "lzip.h"
+#include "encoder_base.h"
+
+
+Dis_slots dis_slots;
+Prob_prices prob_prices;
+
+
+bool Mb_read_block( struct Matchfinder_base * const mb )
+  {
+  if( !mb->at_stream_end && mb->stream_pos < mb->buffer_size )
+    {
+    const int size = mb->buffer_size - mb->stream_pos;
+    const int rd = readblock( mb->infd, mb->buffer + mb->stream_pos, size );
+    mb->stream_pos += rd;
+    if( rd != size && errno )
+      { show_error( "Read error", errno, false ); cleanup_and_fail( 1 ); }
+    if( rd < size ) { mb->at_stream_end = true; mb->pos_limit = mb->buffer_size; }
+    }
+  return mb->pos < mb->stream_pos;
+  }
+
+
+void Mb_normalize_pos( struct Matchfinder_base * const mb )
+  {
+  if( mb->pos > mb->stream_pos )
+    internal_error( "pos > stream_pos in Mb_normalize_pos." );
+  if( !mb->at_stream_end )
+    {
+    int i;
+    /* offset is int32_t for the min below */
+    const int32_t offset = mb->pos - mb->before_size - mb->dictionary_size;
+    const int size = mb->stream_pos - offset;
+    memmove( mb->buffer, mb->buffer + offset, size );
+    mb->partial_data_pos += offset;
+    mb->pos -= offset;		/* pos = before_size + dictionary_size */
+    mb->stream_pos -= offset;
+    for( i = 0; i < mb->num_prev_positions; ++i )
+      mb->prev_positions[i] -= min( mb->prev_positions[i], offset );
+    for( i = 0; i < mb->pos_array_size; ++i )
+      mb->pos_array[i] -= min( mb->pos_array[i], offset );
+    Mb_read_block( mb );
+    }
+  }
+
+
+bool Mb_init( struct Matchfinder_base * const mb, const int before_size,
+              const int dict_size, const int after_size,
+              const int dict_factor, const int num_prev_positions23,
+              const int pos_array_factor, const int ifd )
+  {
+  const int buffer_size_limit =
+    ( dict_factor * dict_size ) + before_size + after_size;
+  int i;
+
+  mb->partial_data_pos = 0;
+  mb->before_size = before_size;
+  mb->pos = 0;
+  mb->cyclic_pos = 0;
+  mb->stream_pos = 0;
+  mb->num_prev_positions23 = num_prev_positions23;
+  mb->infd = ifd;
+  mb->at_stream_end = false;
+
+  mb->buffer_size = max( 65536, dict_size );
+  mb->buffer = (uint8_t *)malloc( mb->buffer_size );
+  if( !mb->buffer ) return false;
+  if( Mb_read_block( mb ) && !mb->at_stream_end &&
+      mb->buffer_size < buffer_size_limit )
+    {
+    uint8_t * const tmp = (uint8_t *)realloc( mb->buffer, buffer_size_limit );
+    if( !tmp ) { free( mb->buffer ); return false; }
+    mb->buffer = tmp;
+    mb->buffer_size = buffer_size_limit;
+    Mb_read_block( mb );
+    }
+  if( mb->at_stream_end && mb->stream_pos < dict_size )
+    mb->dictionary_size = max( min_dictionary_size, mb->stream_pos );
+  else
+    mb->dictionary_size = dict_size;
+  mb->pos_limit = mb->buffer_size;
+  if( !mb->at_stream_end ) mb->pos_limit -= after_size;
+  unsigned size = 1 << max( 16, real_bits( mb->dictionary_size - 1 ) - 2 );
+  if( mb->dictionary_size > 1 << 26 ) size >>= 1;	/* 64 MiB */
+  mb->key4_mask = size - 1;		/* increases with dictionary size */
+  size += num_prev_positions23;
+  mb->num_prev_positions = size;
+
+  mb->pos_array_size = pos_array_factor * ( mb->dictionary_size + 1 );
+  size += mb->pos_array_size;
+  if( size * sizeof mb->prev_positions[0] <= size ) mb->prev_positions = 0;
+  else mb->prev_positions =
+    (int32_t *)malloc( size * sizeof mb->prev_positions[0] );
+  if( !mb->prev_positions ) { free( mb->buffer ); return false; }
+  mb->pos_array = mb->prev_positions + mb->num_prev_positions;
+  for( i = 0; i < mb->num_prev_positions; ++i ) mb->prev_positions[i] = 0;
+  return true;
+  }
+
+
+void Mb_reset( struct Matchfinder_base * const mb )
+  {
+  int i;
+  if( mb->stream_pos > mb->pos )
+    memmove( mb->buffer, mb->buffer + mb->pos, mb->stream_pos - mb->pos );
+  mb->partial_data_pos = 0;
+  mb->stream_pos -= mb->pos;
+  mb->pos = 0;
+  mb->cyclic_pos = 0;
+  Mb_read_block( mb );
+  if( mb->at_stream_end && mb->stream_pos < mb->dictionary_size )
+    {
+    mb->dictionary_size = max( min_dictionary_size, mb->stream_pos );
+    int size = 1 << max( 16, real_bits( mb->dictionary_size - 1 ) - 2 );
+    if( mb->dictionary_size > 1 << 26 ) size >>= 1;	/* 64 MiB */
+    mb->key4_mask = size - 1;
+    size += mb->num_prev_positions23;
+    mb->num_prev_positions = size;
+    mb->pos_array = mb->prev_positions + mb->num_prev_positions;
+    }
+  for( i = 0; i < mb->num_prev_positions; ++i ) mb->prev_positions[i] = 0;
+  }
+
+
+void Re_flush_data( struct Range_encoder * const renc )
+  {
+  if( renc->pos > 0 )
+    {
+    if( renc->outfd >= 0 &&
+        writeblock( renc->outfd, renc->buffer, renc->pos ) != renc->pos )
+      { show_error( "Write error", errno, false ); cleanup_and_fail( 1 ); }
+    renc->partial_member_pos += renc->pos;
+    renc->pos = 0;
+    show_cprogress( 0, 0, 0, 0 );
+    }
+  }
+
+
+/* End Of Stream marker => (dis == 0xFFFFFFFFU, len == min_match_len) */
+void LZeb_full_flush( struct LZ_encoder_base * const eb, const State state )
+  {
+  const int pos_state = Mb_data_position( &eb->mb ) & pos_state_mask;
+  Re_encode_bit( &eb->renc, &eb->bm_match[state][pos_state], 1 );
+  Re_encode_bit( &eb->renc, &eb->bm_rep[state], 0 );
+  LZeb_encode_pair( eb, 0xFFFFFFFFU, min_match_len, pos_state );
+  Re_flush( &eb->renc );
+  Lzip_trailer trailer;
+  Lt_set_data_crc( trailer, LZeb_crc( eb ) );
+  Lt_set_data_size( trailer, Mb_data_position( &eb->mb ) );
+  Lt_set_member_size( trailer, Re_member_position( &eb->renc ) + Lt_size );
+  int i; for( i = 0; i < Lt_size; ++i ) Re_put_byte( &eb->renc, trailer[i] );
+  Re_flush_data( &eb->renc );
+  }
+
+
+void LZeb_reset( struct LZ_encoder_base * const eb )
+  {
+  Mb_reset( &eb->mb );
+  eb->crc = 0xFFFFFFFFU;
+  Bm_array_init( eb->bm_literal[0], (1 << literal_context_bits) * 0x300 );
+  Bm_array_init( eb->bm_match[0], states * pos_states );
+  Bm_array_init( eb->bm_rep, states );
+  Bm_array_init( eb->bm_rep0, states );
+  Bm_array_init( eb->bm_rep1, states );
+  Bm_array_init( eb->bm_rep2, states );
+  Bm_array_init( eb->bm_len[0], states * pos_states );
+  Bm_array_init( eb->bm_dis_slot[0], len_states * (1 << dis_slot_bits) );
+  Bm_array_init( eb->bm_dis, modeled_distances - end_dis_model + 1 );
+  Bm_array_init( eb->bm_align, dis_align_size );
+  Lm_init( &eb->match_len_model );
+  Lm_init( &eb->rep_len_model );
+  Re_reset( &eb->renc, eb->mb.dictionary_size );
+  }
diff --git a/encoder_base.h b/encoder_base.h
new file mode 100644
index 0000000..c947904
--- /dev/null
+++ b/encoder_base.h
@@ -0,0 +1,507 @@
+/* Clzip - LZMA lossless data compressor
+   Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+*/
+
+enum { price_shift_bits = 6,
+       price_step_bits = 2,
+       price_step = 1 << price_step_bits };
+
+typedef uint8_t Dis_slots[1<<10];
+
+extern Dis_slots dis_slots;
+
+static inline void Dis_slots_init( void )
+  {
+  int i, size, slot;
+  for( slot = 0; slot < 4; ++slot ) dis_slots[slot] = slot;
+  for( i = 4, size = 2, slot = 4; slot < 20; slot += 2 )
+    {
+    memset( &dis_slots[i], slot, size );
+    memset( &dis_slots[i+size], slot + 1, size );
+    size <<= 1;
+    i += size;
+    }
+  }
+
+static inline uint8_t get_slot( const unsigned dis )
+  {
+  if( dis < (1 << 10) ) return dis_slots[dis];
+  if( dis < (1 << 19) ) return dis_slots[dis>> 9] + 18;
+  if( dis < (1 << 28) ) return dis_slots[dis>>18] + 36;
+  return dis_slots[dis>>27] + 54;
+  }
+
+
+typedef short Prob_prices[bit_model_total >> price_step_bits];
+
+extern Prob_prices prob_prices;
+
+static inline void Prob_prices_init( void )
+  {
+  int i, j;
+  for( i = 0; i < bit_model_total >> price_step_bits; ++i )
+    {
+    unsigned val = ( i * price_step ) + ( price_step / 2 );
+    int bits = 0;				/* base 2 logarithm of val */
+    for( j = 0; j < price_shift_bits; ++j )
+      {
+      val = val * val;
+      bits <<= 1;
+      while( val >= 1 << 16 ) { val >>= 1; ++bits; }
+      }
+    bits += 15;					/* remaining bits in val */
+    prob_prices[i] = ( bit_model_total_bits << price_shift_bits ) - bits;
+    }
+  }
+
+static inline int get_price( const int probability )
+  { return prob_prices[probability >> price_step_bits]; }
+
+
+static inline int price0( const Bit_model probability )
+  { return get_price( probability ); }
+
+static inline int price1( const Bit_model probability )
+  { return get_price( bit_model_total - probability ); }
+
+static inline int price_bit( const Bit_model bm, const bool bit )
+  { return bit ? price1( bm ) : price0( bm ); }
+
+
+static inline int price_symbol3( const Bit_model bm[], int symbol )
+  {
+  bool bit = symbol & 1;
+  symbol |= 8; symbol >>= 1;
+  int price = price_bit( bm[symbol], bit );
+  bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+  return price + price_bit( bm[1], symbol & 1 );
+  }
+
+
+static inline int price_symbol6( const Bit_model bm[], unsigned symbol )
+  {
+  bool bit = symbol & 1;
+  symbol |= 64; symbol >>= 1;
+  int price = price_bit( bm[symbol], bit );
+  bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+  bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+  bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+  bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+  return price + price_bit( bm[1], symbol & 1 );
+  }
+
+
+static inline int price_symbol8( const Bit_model bm[], int symbol )
+  {
+  bool bit = symbol & 1;
+  symbol |= 0x100; symbol >>= 1;
+  int price = price_bit( bm[symbol], bit );
+  bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+  bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+  bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+  bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+  bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+  bit = symbol & 1; symbol >>= 1; price += price_bit( bm[symbol], bit );
+  return price + price_bit( bm[1], symbol & 1 );
+  }
+
+
+static inline int price_symbol_reversed( const Bit_model bm[], int symbol,
+                                         const int num_bits )
+  {
+  int price = 0;
+  int model = 1;
+  int i;
+  for( i = num_bits; i > 0; --i )
+    {
+    const bool bit = symbol & 1;
+    symbol >>= 1;
+    price += price_bit( bm[model], bit );
+    model <<= 1; model |= bit;
+    }
+  return price;
+  }
+
+
+static inline int price_matched( const Bit_model bm[], unsigned symbol,
+                                 unsigned match_byte )
+  {
+  int price = 0;
+  unsigned mask = 0x100;
+  symbol |= mask;
+  while( true )
+    {
+    const unsigned match_bit = ( match_byte <<= 1 ) & mask;
+    const bool bit = ( symbol <<= 1 ) & 0x100;
+    price += price_bit( bm[(symbol>>9)+match_bit+mask], bit );
+    if( symbol >= 0x10000 ) return price;
+    mask &= ~(match_bit ^ symbol);	/* if( match_bit != bit ) mask = 0; */
+    }
+  }
+
+
+struct Matchfinder_base
+  {
+  unsigned long long partial_data_pos;
+  uint8_t * buffer;		/* input buffer */
+  int32_t * prev_positions;	/* 1 + last seen position of key. else 0 */
+  int32_t * pos_array;		/* may be tree or chain */
+  int before_size;		/* bytes to keep in buffer before dictionary */
+  int buffer_size;
+  int dictionary_size;		/* bytes to keep in buffer before pos */
+  int pos;			/* current pos in buffer */
+  int cyclic_pos;		/* cycles through [0, dictionary_size] */
+  int stream_pos;		/* first byte not yet read from file */
+  int pos_limit;		/* when reached, a new block must be read */
+  int key4_mask;
+  int num_prev_positions23;
+  int num_prev_positions;	/* size of prev_positions */
+  int pos_array_size;
+  int infd;			/* input file descriptor */
+  bool at_stream_end;		/* stream_pos shows real end of file */
+  };
+
+bool Mb_read_block( struct Matchfinder_base * const mb );
+void Mb_normalize_pos( struct Matchfinder_base * const mb );
+
+bool Mb_init( struct Matchfinder_base * const mb, const int before_size,
+              const int dict_size, const int after_size,
+              const int dict_factor, const int num_prev_positions23,
+              const int pos_array_factor, const int ifd );
+
+static inline void Mb_free( struct Matchfinder_base * const mb )
+  { free( mb->prev_positions ); free( mb->buffer ); }
+
+static inline uint8_t Mb_peek( const struct Matchfinder_base * const mb,
+                               const int distance )
+  { return mb->buffer[mb->pos-distance]; }
+
+static inline int Mb_available_bytes( const struct Matchfinder_base * const mb )
+  { return mb->stream_pos - mb->pos; }
+
+static inline unsigned long long
+Mb_data_position( const struct Matchfinder_base * const mb )
+  { return mb->partial_data_pos + mb->pos; }
+
+static inline bool Mb_data_finished( const struct Matchfinder_base * const mb )
+  { return mb->at_stream_end && mb->pos >= mb->stream_pos; }
+
+static inline const uint8_t *
+Mb_ptr_to_current_pos( const struct Matchfinder_base * const mb )
+  { return mb->buffer + mb->pos; }
+
+static inline int Mb_true_match_len( const struct Matchfinder_base * const mb,
+                                     const int index, const int distance )
+  {
+  const uint8_t * const data = mb->buffer + mb->pos;
+  int i = index;
+  const int len_limit = min( Mb_available_bytes( mb ), max_match_len );
+  while( i < len_limit && data[i-distance] == data[i] ) ++i;
+  return i;
+  }
+
+static inline void Mb_move_pos( struct Matchfinder_base * const mb )
+  {
+  if( ++mb->cyclic_pos > mb->dictionary_size ) mb->cyclic_pos = 0;
+  if( ++mb->pos >= mb->pos_limit ) Mb_normalize_pos( mb );
+  }
+
+void Mb_reset( struct Matchfinder_base * const mb );
+
+
+enum { re_buffer_size = 65536 };
+
+struct Range_encoder
+  {
+  uint64_t low;
+  unsigned long long partial_member_pos;
+  uint8_t * buffer;		/* output buffer */
+  int pos;			/* current pos in buffer */
+  uint32_t range;
+  unsigned ff_count;
+  int outfd;			/* output file descriptor */
+  uint8_t cache;
+  Lzip_header header;
+  };
+
+void Re_flush_data( struct Range_encoder * const renc );
+
+static inline void Re_put_byte( struct Range_encoder * const renc,
+                                const uint8_t b )
+  {
+  renc->buffer[renc->pos] = b;
+  if( ++renc->pos >= re_buffer_size ) Re_flush_data( renc );
+  }
+
+static inline void Re_shift_low( struct Range_encoder * const renc )
+  {
+  if( renc->low >> 24 != 0xFF )
+    {
+    const bool carry = ( renc->low > 0xFFFFFFFFU );
+    Re_put_byte( renc, renc->cache + carry );
+    for( ; renc->ff_count > 0; --renc->ff_count )
+      Re_put_byte( renc, 0xFF + carry );
+    renc->cache = renc->low >> 24;
+    }
+  else ++renc->ff_count;
+  renc->low = ( renc->low & 0x00FFFFFFU ) << 8;
+  }
+
+static inline void Re_reset( struct Range_encoder * const renc,
+                             const unsigned dictionary_size )
+  {
+  renc->low = 0;
+  renc->partial_member_pos = 0;
+  renc->pos = 0;
+  renc->range = 0xFFFFFFFFU;
+  renc->ff_count = 0;
+  renc->cache = 0;
+  Lh_set_dictionary_size( renc->header, dictionary_size );
+  int i; for( i = 0; i < Lh_size; ++i ) Re_put_byte( renc, renc->header[i] );
+  }
+
+static inline bool Re_init( struct Range_encoder * const renc,
+                            const unsigned dictionary_size, const int ofd )
+  {
+  renc->buffer = (uint8_t *)malloc( re_buffer_size );
+  if( !renc->buffer ) return false;
+  renc->outfd = ofd;
+  Lh_set_magic( renc->header );
+  Re_reset( renc, dictionary_size );
+  return true;
+  }
+
+static inline void Re_free( struct Range_encoder * const renc )
+  { free( renc->buffer ); }
+
+static inline unsigned long long
+Re_member_position( const struct Range_encoder * const renc )
+  { return renc->partial_member_pos + renc->pos + renc->ff_count; }
+
+static inline void Re_flush( struct Range_encoder * const renc )
+  { int i; for( i = 0; i < 5; ++i ) Re_shift_low( renc ); }
+
+static inline void Re_encode( struct Range_encoder * const renc,
+                              const int symbol, const int num_bits )
+  {
+  unsigned mask;
+  for( mask = 1 << ( num_bits - 1 ); mask > 0; mask >>= 1 )
+    {
+    renc->range >>= 1;
+    if( symbol & mask ) renc->low += renc->range;
+    if( renc->range <= 0x00FFFFFFU ) { renc->range <<= 8; Re_shift_low( renc ); }
+    }
+  }
+
+static inline void Re_encode_bit( struct Range_encoder * const renc,
+                                  Bit_model * const probability, const bool bit )
+  {
+  const uint32_t bound = ( renc->range >> bit_model_total_bits ) * *probability;
+  if( !bit )
+    {
+    renc->range = bound;
+    *probability += (bit_model_total - *probability) >> bit_model_move_bits;
+    }
+  else
+    {
+    renc->low += bound;
+    renc->range -= bound;
+    *probability -= *probability >> bit_model_move_bits;
+    }
+  if( renc->range <= 0x00FFFFFFU ) { renc->range <<= 8; Re_shift_low( renc ); }
+  }
+
+static inline void Re_encode_tree3( struct Range_encoder * const renc,
+                                    Bit_model bm[], const int symbol )
+  {
+  bool bit = ( symbol >> 2 ) & 1;
+  Re_encode_bit( renc, &bm[1], bit );
+  int model = 2 | bit;
+  bit = ( symbol >> 1 ) & 1;
+  Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit;
+  Re_encode_bit( renc, &bm[model], symbol & 1 );
+  }
+
+static inline void Re_encode_tree6( struct Range_encoder * const renc,
+                                    Bit_model bm[], const unsigned symbol )
+  {
+  bool bit = ( symbol >> 5 ) & 1;
+  Re_encode_bit( renc, &bm[1], bit );
+  int model = 2 | bit;
+  bit = ( symbol >> 4 ) & 1;
+  Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit;
+  bit = ( symbol >> 3 ) & 1;
+  Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit;
+  bit = ( symbol >> 2 ) & 1;
+  Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit;
+  bit = ( symbol >> 1 ) & 1;
+  Re_encode_bit( renc, &bm[model], bit ); model <<= 1; model |= bit;
+  Re_encode_bit( renc, &bm[model], symbol & 1 );
+  }
+
+static inline void Re_encode_tree8( struct Range_encoder * const renc,
+                                    Bit_model bm[], const int symbol )
+  {
+  int model = 1;
+  int i;
+  for( i = 7; i >= 0; --i )
+    {
+    const bool bit = ( symbol >> i ) & 1;
+    Re_encode_bit( renc, &bm[model], bit );
+    model <<= 1; model |= bit;
+    }
+  }
+
+static inline void Re_encode_tree_reversed( struct Range_encoder * const renc,
+                     Bit_model bm[], int symbol, const int num_bits )
+  {
+  int model = 1;
+  int i;
+  for( i = num_bits; i > 0; --i )
+    {
+    const bool bit = symbol & 1;
+    symbol >>= 1;
+    Re_encode_bit( renc, &bm[model], bit );
+    model <<= 1; model |= bit;
+    }
+  }
+
+static inline void Re_encode_matched( struct Range_encoder * const renc,
+                                      Bit_model bm[], unsigned symbol,
+                                      unsigned match_byte )
+  {
+  unsigned mask = 0x100;
+  symbol |= mask;
+  while( true )
+    {
+    const unsigned match_bit = ( match_byte <<= 1 ) & mask;
+    const bool bit = ( symbol <<= 1 ) & 0x100;
+    Re_encode_bit( renc, &bm[(symbol>>9)+match_bit+mask], bit );
+    if( symbol >= 0x10000 ) break;
+    mask &= ~(match_bit ^ symbol);	/* if( match_bit != bit ) mask = 0; */
+    }
+  }
+
+static inline void Re_encode_len( struct Range_encoder * const renc,
+                                  struct Len_model * const lm,
+                                  int symbol, const int pos_state )
+  {
+  bool bit = ( ( symbol -= min_match_len ) >= len_low_symbols );
+  Re_encode_bit( renc, &lm->choice1, bit );
+  if( !bit )
+    Re_encode_tree3( renc, lm->bm_low[pos_state], symbol );
+  else
+    {
+    bit = ( ( symbol -= len_low_symbols ) >= len_mid_symbols );
+    Re_encode_bit( renc, &lm->choice2, bit );
+    if( !bit )
+      Re_encode_tree3( renc, lm->bm_mid[pos_state], symbol );
+    else
+      Re_encode_tree8( renc, lm->bm_high, symbol - len_mid_symbols );
+    }
+  }
+
+
+enum { max_marker_size = 16,
+       num_rep_distances = 4 };		/* must be 4 */
+
+struct LZ_encoder_base
+  {
+  struct Matchfinder_base mb;
+  uint32_t crc;
+
+  Bit_model bm_literal[1<<literal_context_bits][0x300];
+  Bit_model bm_match[states][pos_states];
+  Bit_model bm_rep[states];
+  Bit_model bm_rep0[states];
+  Bit_model bm_rep1[states];
+  Bit_model bm_rep2[states];
+  Bit_model bm_len[states][pos_states];
+  Bit_model bm_dis_slot[len_states][1<<dis_slot_bits];
+  Bit_model bm_dis[modeled_distances-end_dis_model+1];
+  Bit_model bm_align[dis_align_size];
+  struct Len_model match_len_model;
+  struct Len_model rep_len_model;
+  struct Range_encoder renc;
+  };
+
+void LZeb_reset( struct LZ_encoder_base * const eb );
+
+static inline bool LZeb_init( struct LZ_encoder_base * const eb,
+                              const int before_size, const int dict_size,
+                              const int after_size, const int dict_factor,
+                              const int num_prev_positions23,
+                              const int pos_array_factor,
+                              const int ifd, const int outfd )
+  {
+  if( !Mb_init( &eb->mb, before_size, dict_size, after_size, dict_factor,
+                num_prev_positions23, pos_array_factor, ifd ) ) return false;
+  if( !Re_init( &eb->renc, eb->mb.dictionary_size, outfd ) ) return false;
+  LZeb_reset( eb );
+  return true;
+  }
+
+static inline void LZeb_free( struct LZ_encoder_base * const eb )
+  { Re_free( &eb->renc ); Mb_free( &eb->mb ); }
+
+static inline unsigned LZeb_crc( const struct LZ_encoder_base * const eb )
+  { return eb->crc ^ 0xFFFFFFFFU; }
+
+static inline int LZeb_price_literal( const struct LZ_encoder_base * const eb,
+                            const uint8_t prev_byte, const uint8_t symbol )
+  { return price_symbol8( eb->bm_literal[get_lit_state(prev_byte)], symbol ); }
+
+static inline int LZeb_price_matched( const struct LZ_encoder_base * const eb,
+  const uint8_t prev_byte, const uint8_t symbol, const uint8_t match_byte )
+  { return price_matched( eb->bm_literal[get_lit_state(prev_byte)], symbol,
+                          match_byte ); }
+
+static inline void LZeb_encode_literal( struct LZ_encoder_base * const eb,
+                            const uint8_t prev_byte, const uint8_t symbol )
+  { Re_encode_tree8( &eb->renc, eb->bm_literal[get_lit_state(prev_byte)], symbol ); }
+
+static inline void LZeb_encode_matched( struct LZ_encoder_base * const eb,
+  const uint8_t prev_byte, const uint8_t symbol, const uint8_t match_byte )
+  { Re_encode_matched( &eb->renc, eb->bm_literal[get_lit_state(prev_byte)],
+                       symbol, match_byte ); }
+
+static inline void LZeb_encode_pair( struct LZ_encoder_base * const eb,
+                                     const unsigned dis, const int len,
+                                     const int pos_state )
+  {
+  Re_encode_len( &eb->renc, &eb->match_len_model, len, pos_state );
+  const unsigned dis_slot = get_slot( dis );
+  Re_encode_tree6( &eb->renc, eb->bm_dis_slot[get_len_state(len)], dis_slot );
+
+  if( dis_slot >= start_dis_model )
+    {
+    const int direct_bits = ( dis_slot >> 1 ) - 1;
+    const unsigned base = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
+    const unsigned direct_dis = dis - base;
+
+    if( dis_slot < end_dis_model )
+      Re_encode_tree_reversed( &eb->renc, eb->bm_dis + ( base - dis_slot ),
+                               direct_dis, direct_bits );
+    else
+      {
+      Re_encode( &eb->renc, direct_dis >> dis_align_bits,
+                 direct_bits - dis_align_bits );
+      Re_encode_tree_reversed( &eb->renc, eb->bm_align, direct_dis, dis_align_bits );
+      }
+    }
+  }
+
+void LZeb_full_flush( struct LZ_encoder_base * const eb, const State state );
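Note on LZeb_encode_pair above: the distance is split into a slot plus some low-order "direct" bits. The following is a minimal sketch of that split, not part of the patch. `slot_of` and `split_distance` are hypothetical names, and `slot_of` is only an assumed stand-in for `get_slot`, whose real definition is not shown in this hunk.

```c
#include <stdint.h>

enum { start_dis_model = 4, end_dis_model = 14, dis_align_bits = 4 };

/* Hypothetical stand-in for get_slot(): slot = 2 * floor(log2(dis)) plus
   the bit just below the leading one; distances 0..3 are their own slot. */
static unsigned slot_of( const uint32_t dis )
  {
  if( dis < start_dis_model ) return dis;
  int n = 0;			/* index of the most significant set bit */
  while( n < 31 && ( dis >> ( n + 1 ) ) != 0 ) ++n;
  return 2 * n + ( ( dis >> ( n - 1 ) ) & 1 );
  }

struct Dis_parts { unsigned slot; int direct_bits; uint32_t base, direct_dis; };

/* Same arithmetic as in LZeb_encode_pair; for slots below start_dis_model
   the slot is the whole distance and the other fields stay at 0. */
static struct Dis_parts split_distance( const uint32_t dis )
  {
  struct Dis_parts p = { slot_of( dis ), 0, 0, 0 };
  if( p.slot >= start_dis_model )
    {
    p.direct_bits = ( p.slot >> 1 ) - 1;
    p.base = ( 2u | ( p.slot & 1 ) ) << p.direct_bits;
    p.direct_dis = dis - p.base;
    }
  return p;
  }
```

Under these assumptions a distance of 100 falls in slot 13 with 5 direct bits (base 96, remainder 4), so it is still below end_dis_model and would go through the reversed bit tree. A distance of 128 lands in slot 14, where the encoder writes the upper direct bits unmodeled and only the low dis_align_bits through bm_align.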
diff --git a/fast_encoder.c b/fast_encoder.c
new file mode 100644
index 0000000..bab87ca
--- /dev/null
+++ b/fast_encoder.c
@@ -0,0 +1,186 @@
+/* Clzip - LZMA lossless data compressor
+   Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+*/
+
+#define _FILE_OFFSET_BITS 64
+
+#include <errno.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "lzip.h"
+#include "encoder_base.h"
+#include "fast_encoder.h"
+
+
+int FLZe_longest_match_len( struct FLZ_encoder * const fe, int * const distance )
+  {
+  enum { len_limit = 16 };
+  const int available = min( Mb_available_bytes( &fe->eb.mb ), max_match_len );
+  if( available < len_limit ) return 0;
+
+  const uint8_t * const data = Mb_ptr_to_current_pos( &fe->eb.mb );
+  fe->key4 = ( ( fe->key4 << 4 ) ^ data[3] ) & fe->eb.mb.key4_mask;
+  const int pos1 = fe->eb.mb.pos + 1;
+  int newpos1 = fe->eb.mb.prev_positions[fe->key4];
+  fe->eb.mb.prev_positions[fe->key4] = pos1;
+  int32_t * ptr0 = fe->eb.mb.pos_array + fe->eb.mb.cyclic_pos;
+  int maxlen = 0, count;
+
+  for( count = 4; ; )
+    {
+    int delta;
+    if( newpos1 <= 0 || --count < 0 ||
+        ( delta = pos1 - newpos1 ) > fe->eb.mb.dictionary_size )
+      { *ptr0 = 0; break; }
+    int32_t * const newptr = fe->eb.mb.pos_array +
+      ( fe->eb.mb.cyclic_pos - delta +
+        ( ( fe->eb.mb.cyclic_pos >= delta ) ? 0 : fe->eb.mb.dictionary_size + 1 ) );
+
+    if( data[maxlen-delta] == data[maxlen] )
+      {
+      int len = 0;
+      while( len < available && data[len-delta] == data[len] ) ++len;
+      if( maxlen < len )
+        { maxlen = len; *distance = delta - 1;
+          if( maxlen >= len_limit ) { *ptr0 = *newptr; break; } }
+      }
+
+    *ptr0 = newpos1;
+    ptr0 = newptr;
+    newpos1 = *ptr0;
+    }
+  return maxlen;
+  }
+
+
+bool FLZe_encode_member( struct FLZ_encoder * const fe,
+                         const unsigned long long member_size )
+  {
+  const unsigned long long member_size_limit =
+    member_size - Lt_size - max_marker_size;
+  int rep = 0, i;
+  int reps[num_rep_distances];
+  State state = 0;
+  for( i = 0; i < num_rep_distances; ++i ) reps[i] = 0;
+
+  if( Mb_data_position( &fe->eb.mb ) != 0 ||
+      Re_member_position( &fe->eb.renc ) != Lh_size )
+    return false;				/* can be called only once */
+
+  if( !Mb_data_finished( &fe->eb.mb ) )		/* encode first byte */
+    {
+    const uint8_t prev_byte = 0;
+    const uint8_t cur_byte = Mb_peek( &fe->eb.mb, 0 );
+    Re_encode_bit( &fe->eb.renc, &fe->eb.bm_match[state][0], 0 );
+    LZeb_encode_literal( &fe->eb, prev_byte, cur_byte );
+    CRC32_update_byte( &fe->eb.crc, cur_byte );
+    FLZe_reset_key4( fe );
+    FLZe_update_and_move( fe, 1 );
+    }
+
+  while( !Mb_data_finished( &fe->eb.mb ) &&
+         Re_member_position( &fe->eb.renc ) < member_size_limit )
+    {
+    int match_distance;
+    const int main_len = FLZe_longest_match_len( fe, &match_distance );
+    const int pos_state = Mb_data_position( &fe->eb.mb ) & pos_state_mask;
+    int len = 0;
+
+    for( i = 0; i < num_rep_distances; ++i )
+      {
+      const int tlen = Mb_true_match_len( &fe->eb.mb, 0, reps[i] + 1 );
+      if( tlen > len ) { len = tlen; rep = i; }
+      }
+    if( len > min_match_len && len + 3 > main_len )
+      {
+      CRC32_update_buf( &fe->eb.crc, Mb_ptr_to_current_pos( &fe->eb.mb ), len );
+      Re_encode_bit( &fe->eb.renc, &fe->eb.bm_match[state][pos_state], 1 );
+      Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep[state], 1 );
+      Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep0[state], rep != 0 );
+      if( rep == 0 )
+        Re_encode_bit( &fe->eb.renc, &fe->eb.bm_len[state][pos_state], 1 );
+      else
+        {
+        Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep1[state], rep > 1 );
+        if( rep > 1 )
+          Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep2[state], rep > 2 );
+        const int distance = reps[rep];
+        for( i = rep; i > 0; --i ) reps[i] = reps[i-1];
+        reps[0] = distance;
+        }
+      state = St_set_rep( state );
+      Re_encode_len( &fe->eb.renc, &fe->eb.rep_len_model, len, pos_state );
+      Mb_move_pos( &fe->eb.mb );
+      FLZe_update_and_move( fe, len - 1 );
+      continue;
+      }
+
+    if( main_len > min_match_len )
+      {
+      CRC32_update_buf( &fe->eb.crc, Mb_ptr_to_current_pos( &fe->eb.mb ), main_len );
+      Re_encode_bit( &fe->eb.renc, &fe->eb.bm_match[state][pos_state], 1 );
+      Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep[state], 0 );
+      state = St_set_match( state );
+      for( i = num_rep_distances - 1; i > 0; --i ) reps[i] = reps[i-1];
+      reps[0] = match_distance;
+      LZeb_encode_pair( &fe->eb, match_distance, main_len, pos_state );
+      Mb_move_pos( &fe->eb.mb );
+      FLZe_update_and_move( fe, main_len - 1 );
+      continue;
+      }
+
+    const uint8_t prev_byte = Mb_peek( &fe->eb.mb, 1 );
+    const uint8_t cur_byte = Mb_peek( &fe->eb.mb, 0 );
+    const uint8_t match_byte = Mb_peek( &fe->eb.mb, reps[0] + 1 );
+    Mb_move_pos( &fe->eb.mb );
+    CRC32_update_byte( &fe->eb.crc, cur_byte );
+
+    if( match_byte == cur_byte )
+      {
+      const int short_rep_price = price1( fe->eb.bm_match[state][pos_state] ) +
+                                  price1( fe->eb.bm_rep[state] ) +
+                                  price0( fe->eb.bm_rep0[state] ) +
+                                  price0( fe->eb.bm_len[state][pos_state] );
+      int price = price0( fe->eb.bm_match[state][pos_state] );
+      if( St_is_char( state ) )
+        price += LZeb_price_literal( &fe->eb, prev_byte, cur_byte );
+      else
+        price += LZeb_price_matched( &fe->eb, prev_byte, cur_byte, match_byte );
+      if( short_rep_price < price )
+        {
+        Re_encode_bit( &fe->eb.renc, &fe->eb.bm_match[state][pos_state], 1 );
+        Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep[state], 1 );
+        Re_encode_bit( &fe->eb.renc, &fe->eb.bm_rep0[state], 0 );
+        Re_encode_bit( &fe->eb.renc, &fe->eb.bm_len[state][pos_state], 0 );
+        state = St_set_short_rep( state );
+        continue;
+        }
+      }
+
+    /* literal byte */
+    Re_encode_bit( &fe->eb.renc, &fe->eb.bm_match[state][pos_state], 0 );
+    if( ( state = St_set_char( state ) ) < 4 )
+      LZeb_encode_literal( &fe->eb, prev_byte, cur_byte );
+    else
+      LZeb_encode_matched( &fe->eb, prev_byte, cur_byte, match_byte );
+    }
+
+  LZeb_full_flush( &fe->eb, state );
+  return true;
+  }
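Note on FLZe_encode_member above: the reps[] array behaves like a small move-to-front list of recent distances. The sketch below isolates the two update rules used in the loop. It is an illustration only: `promote_rep` and `push_match` are hypothetical helper names, and `num_rep_distances = 4` is assumed from the rep0/rep1/rep2 models rather than taken from a definition visible here.

```c
enum { num_rep_distances = 4 };		/* assumed value */

/* A repeated match at index 'rep' moves that distance to the front and
   shifts the entries that were before it down by one position. */
static void promote_rep( int reps[num_rep_distances], const int rep )
  {
  const int distance = reps[rep];
  int i;
  for( i = rep; i > 0; --i ) reps[i] = reps[i-1];
  reps[0] = distance;
  }

/* A new (non-rep) match pushes its distance in front and drops the oldest. */
static void push_match( int reps[num_rep_distances], const int distance )
  {
  int i;
  for( i = num_rep_distances - 1; i > 0; --i ) reps[i] = reps[i-1];
  reps[0] = distance;
  }
```

Starting from { 10, 20, 30, 40 }, promoting index 2 gives { 30, 10, 20, 40 }; pushing a new distance 7 afterwards gives { 7, 30, 10, 20 }.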
diff --git a/fast_encoder.h b/fast_encoder.h
new file mode 100644
index 0000000..e4e4000
--- /dev/null
+++ b/fast_encoder.h
@@ -0,0 +1,68 @@
+/* Clzip - LZMA lossless data compressor
+   Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+*/
+
+struct FLZ_encoder
+  {
+  struct LZ_encoder_base eb;
+  unsigned key4;			/* key made from latest 4 bytes */
+  };
+
+static inline void FLZe_reset_key4( struct FLZ_encoder * const fe )
+  {
+  int i;
+  fe->key4 = 0;
+  for( i = 0; i < 3 && i < Mb_available_bytes( &fe->eb.mb ); ++i )
+    fe->key4 = ( fe->key4 << 4 ) ^ fe->eb.mb.buffer[i];
+  }
+
+int FLZe_longest_match_len( struct FLZ_encoder * const fe, int * const distance );
+
+static inline void FLZe_update_and_move( struct FLZ_encoder * const fe, int n )
+  {
+  struct Matchfinder_base * const mb = &fe->eb.mb;
+  while( --n >= 0 )
+    {
+    if( Mb_available_bytes( mb ) >= 4 )
+      {
+      fe->key4 = ( ( fe->key4 << 4 ) ^ mb->buffer[mb->pos+3] ) & mb->key4_mask;
+      mb->pos_array[mb->cyclic_pos] = mb->prev_positions[fe->key4];
+      mb->prev_positions[fe->key4] = mb->pos + 1;
+      }
+    Mb_move_pos( mb );
+    }
+  }
+
+static inline bool FLZe_init( struct FLZ_encoder * const fe,
+                              const int ifd, const int outfd )
+  {
+  enum { before_size = 0,
+         dict_size = 65536,
+         /* bytes to keep in buffer after pos */
+         after_size = max_match_len,
+         dict_factor = 16,
+         num_prev_positions23 = 0,
+         pos_array_factor = 1 };
+
+  return LZeb_init( &fe->eb, before_size, dict_size, after_size, dict_factor,
+                    num_prev_positions23, pos_array_factor, ifd, outfd );
+  }
+
+static inline void FLZe_reset( struct FLZ_encoder * const fe )
+  { LZeb_reset( &fe->eb ); }
+
+bool FLZe_encode_member( struct FLZ_encoder * const fe,
+                         const unsigned long long member_size );
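Note on key4 above: the key is updated one byte at a time by shifting 4 bits and XORing in the new byte. The sketch below shows why such a key can depend only on the most recent bytes. It assumes a 16-bit `key4_mask` (the real mask is set in the matchfinder, which is not shown here), and `roll_key4` and `key4_of` are hypothetical names.

```c
#include <stdint.h>

enum { key4_mask = 0xFFFF };	/* assumed 16-bit table index */

/* One update step, as in FLZe_update_and_move. */
static unsigned roll_key4( const unsigned key, const uint8_t byte )
  { return ( ( key << 4 ) ^ byte ) & key4_mask; }

/* Key after feeding 'size' bytes, starting from an empty key. */
static unsigned key4_of( const uint8_t * const buf, const int size )
  {
  unsigned key = 0;
  int i;
  for( i = 0; i < size; ++i ) key = roll_key4( key, buf[i] );
  return key;
  }
```

With a 16-bit mask, four updates shift every older bit out of the key, so the sequences "zzabcd" and "abcd" end with the same key, while "abce" ends with a different one.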
diff --git a/list.c b/list.c
new file mode 100644
index 0000000..22fea92
--- /dev/null
+++ b/list.c
@@ -0,0 +1,112 @@
+/* Clzip - LZMA lossless data compressor
+   Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+*/
+
+#define _FILE_OFFSET_BITS 64
+
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/stat.h>
+
+#include "lzip.h"
+#include "lzip_index.h"
+
+
+static void list_line( const unsigned long long uncomp_size,
+                       const unsigned long long comp_size,
+                       const char * const input_filename )
+  {
+  if( uncomp_size > 0 )
+    printf( "%14llu %14llu %6.2f%%  %s\n", uncomp_size, comp_size,
+            100.0 - ( ( 100.0 * comp_size ) / uncomp_size ),
+            input_filename );
+  else
+    printf( "%14llu %14llu   -INF%%  %s\n", uncomp_size, comp_size,
+            input_filename );
+  }
+
+
+int list_files( const char * const filenames[], const int num_filenames,
+                const struct Cl_options * const cl_opts )
+  {
+  unsigned long long total_comp = 0, total_uncomp = 0;
+  int files = 0, retval = 0;
+  int i;
+  bool first_post = true;
+  bool stdin_used = false;
+
+  for( i = 0; i < num_filenames; ++i )
+    {
+    const bool from_stdin = ( strcmp( filenames[i], "-" ) == 0 );
+    if( from_stdin ) { if( stdin_used ) continue; else stdin_used = true; }
+    const char * const input_filename = from_stdin ? "(stdin)" : filenames[i];
+    struct stat in_stats;				/* not used */
+    const int infd = from_stdin ? STDIN_FILENO :
+      open_instream( input_filename, &in_stats, false, true );
+    if( infd < 0 ) { set_retval( &retval, 1 ); continue; }
+
+    struct Lzip_index lzip_index;
+    Li_init( &lzip_index, infd, cl_opts );
+    close( infd );
+    if( lzip_index.retval != 0 )
+      {
+      show_file_error( input_filename, lzip_index.error, 0 );
+      set_retval( &retval, lzip_index.retval );
+      Li_free( &lzip_index ); continue;
+      }
+    if( verbosity < 0 ) { Li_free( &lzip_index ); continue; }
+    const unsigned long long udata_size = Li_udata_size( &lzip_index );
+    const unsigned long long cdata_size = Li_cdata_size( &lzip_index );
+    total_comp += cdata_size; total_uncomp += udata_size; ++files;
+    const long members = lzip_index.members;
+    if( first_post )
+      {
+      first_post = false;
+      if( verbosity >= 1 ) fputs( "   dict   memb  trail ", stdout );
+      fputs( "  uncompressed     compressed   saved  name\n", stdout );
+      }
+    if( verbosity >= 1 )
+      printf( "%s %5ld %6lld ", format_ds( lzip_index.dictionary_size ),
+              members, Li_file_size( &lzip_index ) - cdata_size );
+    list_line( udata_size, cdata_size, input_filename );
+
+    if( verbosity >= 2 && members > 1 )
+      {
+      long i;
+      fputs( " member      data_pos      data_size     member_pos    member_size\n", stdout );
+      for( i = 0; i < members; ++i )
+        {
+        const struct Block * db = Li_dblock( &lzip_index, i );
+        const struct Block * mb = Li_mblock( &lzip_index, i );
+        printf( "%6ld %14llu %14llu %14llu %14llu\n",
+                i + 1, db->pos, db->size, mb->pos, mb->size );
+        }
+      first_post = true;	/* reprint heading after list of members */
+      }
+    fflush( stdout );
+    Li_free( &lzip_index );
+    }
+  if( verbosity >= 0 && files > 1 )
+    {
+    if( verbosity >= 1 ) fputs( "                      ", stdout );
+    list_line( total_uncomp, total_comp, "(totals)" );
+    fflush( stdout );
+    }
+  return retval;
+  }
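Note on list_line above: the "saved" column is 100 minus the compressed size as a percentage of the uncompressed size. A minimal sketch of that arithmetic follows; `saved_percent` is a hypothetical helper, and the caller is assumed to handle uncomp_size == 0 separately, as list_line does.

```c
/* Percentage of space saved; negative when the compressed data is larger.
   The caller must ensure uncomp_size > 0. */
static double saved_percent( const unsigned long long uncomp_size,
                             const unsigned long long comp_size )
  { return 100.0 - ( ( 100.0 * comp_size ) / uncomp_size ); }
```

For example, 1000 bytes compressed to 250 gives 75.0, and 100 bytes "compressed" to 120 gives -20.0.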
diff --git a/lzip.h b/lzip.h
new file mode 100644
index 0000000..abf1a27
--- /dev/null
+++ b/lzip.h
@@ -0,0 +1,336 @@
+/* Clzip - LZMA lossless data compressor
+   Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+*/
+
+#ifndef max
+  #define max(x,y) ((x) >= (y) ? (x) : (y))
+#endif
+#ifndef min
+  #define min(x,y) ((x) <= (y) ? (x) : (y))
+#endif
+
+typedef int State;
+
+enum { states = 12 };
+static inline bool St_is_char( const State st ) { return st < 7; }
+
+static inline State St_set_char( const State st )
+  {
+  static const State next[states] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 };
+  return next[st];
+  }
+static inline State St_set_char_rep( void ) { return 8; }
+static inline State St_set_match( const State st )
+  { return ( st < 7 ) ? 7 : 10; }
+static inline State St_set_rep( const State st )
+  { return ( st < 7 ) ? 8 : 11; }
+static inline State St_set_short_rep( const State st )
+  { return ( st < 7 ) ? 9 : 11; }
+
+
+enum {
+  min_dictionary_bits = 12,
+  min_dictionary_size = 1 << min_dictionary_bits,	/* >= modeled_distances */
+  max_dictionary_bits = 29,
+  max_dictionary_size = 1 << max_dictionary_bits,
+  min_member_size = 36,
+  literal_context_bits = 3,
+  literal_pos_state_bits = 0,				/* not used */
+  pos_state_bits = 2,
+  pos_states = 1 << pos_state_bits,
+  pos_state_mask = pos_states - 1,
+
+  len_states = 4,
+  dis_slot_bits = 6,
+  start_dis_model = 4,
+  end_dis_model = 14,
+  modeled_distances = 1 << ( end_dis_model / 2 ),	/* 128 */
+  dis_align_bits = 4,
+  dis_align_size = 1 << dis_align_bits,
+
+  len_low_bits = 3,
+  len_mid_bits = 3,
+  len_high_bits = 8,
+  len_low_symbols = 1 << len_low_bits,
+  len_mid_symbols = 1 << len_mid_bits,
+  len_high_symbols = 1 << len_high_bits,
+  max_len_symbols = len_low_symbols + len_mid_symbols + len_high_symbols,
+
+  min_match_len = 2,					/* must be 2 */
+  max_match_len = min_match_len + max_len_symbols - 1,	/* 273 */
+  min_match_len_limit = 5 };
+
+static inline int get_len_state( const int len )
+  { return min( len - min_match_len, len_states - 1 ); }
+
+static inline int get_lit_state( const uint8_t prev_byte )
+  { return prev_byte >> ( 8 - literal_context_bits ); }
+
+
+enum { bit_model_move_bits = 5,
+       bit_model_total_bits = 11,
+       bit_model_total = 1 << bit_model_total_bits };
+
+typedef int Bit_model;
+
+static inline void Bm_init( Bit_model * const probability )
+  { *probability = bit_model_total / 2; }
+
+static inline void Bm_array_init( Bit_model bm[], const int size )
+  { int i; for( i = 0; i < size; ++i ) Bm_init( &bm[i] ); }
+
+struct Len_model
+  {
+  Bit_model choice1;
+  Bit_model choice2;
+  Bit_model bm_low[pos_states][len_low_symbols];
+  Bit_model bm_mid[pos_states][len_mid_symbols];
+  Bit_model bm_high[len_high_symbols];
+  };
+
+static inline void Lm_init( struct Len_model * const lm )
+  {
+  Bm_init( &lm->choice1 );
+  Bm_init( &lm->choice2 );
+  Bm_array_init( lm->bm_low[0], pos_states * len_low_symbols );
+  Bm_array_init( lm->bm_mid[0], pos_states * len_mid_symbols );
+  Bm_array_init( lm->bm_high, len_high_symbols );
+  }
+
+
+typedef uint32_t CRC32[256];	/* Table of CRCs of all 8-bit messages. */
+
+extern CRC32 crc32;
+
+static inline void CRC32_init( void )
+  {
+  unsigned n;
+  for( n = 0; n < 256; ++n )
+    {
+    unsigned c = n;
+    int k;
+    for( k = 0; k < 8; ++k )
+      { if( c & 1 ) c = 0xEDB88320U ^ ( c >> 1 ); else c >>= 1; }
+    crc32[n] = c;
+    }
+  }
+
+static inline void CRC32_update_byte( uint32_t * const crc, const uint8_t byte )
+  { *crc = crc32[(*crc^byte)&0xFF] ^ ( *crc >> 8 ); }
+
+/* about as fast as possible without messing with endianness */
+static inline void CRC32_update_buf( uint32_t * const crc,
+                                     const uint8_t * const buffer,
+                                     const int size )
+  {
+  int i;
+  uint32_t c = *crc;
+  for( i = 0; i < size; ++i )
+    c = crc32[(c^buffer[i])&0xFF] ^ ( c >> 8 );
+  *crc = c;
+  }
+
+
+static inline bool isvalid_ds( const unsigned dictionary_size )
+  { return dictionary_size >= min_dictionary_size &&
+           dictionary_size <= max_dictionary_size; }
+
+
+static inline int real_bits( unsigned value )
+  {
+  int bits = 0;
+  while( value > 0 ) { value >>= 1; ++bits; }
+  return bits;
+  }
+
+
+static const uint8_t lzip_magic[4] = { 0x4C, 0x5A, 0x49, 0x50 }; /* "LZIP" */
+
+enum { Lh_size = 6 };
+typedef uint8_t Lzip_header[Lh_size];	/* 0-3 magic bytes */
+					/*   4 version */
+					/*   5 coded dictionary size */
+
+static inline void Lh_set_magic( Lzip_header data )
+  { memcpy( data, lzip_magic, 4 ); data[4] = 1; }
+
+static inline bool Lh_check_magic( const Lzip_header data )
+  { return memcmp( data, lzip_magic, 4 ) == 0; }
+
+/* detect (truncated) header */
+static inline bool Lh_check_prefix( const Lzip_header data, const int sz )
+  {
+  int i; for( i = 0; i < sz && i < 4; ++i )
+    if( data[i] != lzip_magic[i] ) return false;
+  return sz > 0;
+  }
+
+/* detect corrupt header */
+static inline bool Lh_check_corrupt( const Lzip_header data )
+  {
+  int matches = 0;
+  int i; for( i = 0; i < 4; ++i )
+    if( data[i] == lzip_magic[i] ) ++matches;
+  return matches > 1 && matches < 4;
+  }
+
+static inline uint8_t Lh_version( const Lzip_header data )
+  { return data[4]; }
+
+static inline bool Lh_check_version( const Lzip_header data )
+  { return data[4] == 1; }
+
+static inline unsigned Lh_get_dictionary_size( const Lzip_header data )
+  {
+  unsigned sz = 1U << ( data[5] & 0x1F );
+  if( sz > min_dictionary_size )
+    sz -= ( sz / 16 ) * ( ( data[5] >> 5 ) & 7 );
+  return sz;
+  }
+
+static inline bool Lh_set_dictionary_size( Lzip_header data, const unsigned sz )
+  {
+  if( !isvalid_ds( sz ) ) return false;
+  data[5] = real_bits( sz - 1 );
+  if( sz > min_dictionary_size )
+    {
+    const unsigned base_size = 1 << data[5];
+    const unsigned fraction = base_size / 16;
+    unsigned i;
+    for( i = 7; i >= 1; --i )
+      if( base_size - ( i * fraction ) >= sz )
+        { data[5] |= i << 5; break; }
+    }
+  return true;
+  }
+
+static inline bool Lh_check( const Lzip_header data )
+  {
+  return Lh_check_magic( data ) && Lh_check_version( data ) &&
+         isvalid_ds( Lh_get_dictionary_size( data ) );
+  }
+
+
+enum { Lt_size = 20 };
+typedef uint8_t Lzip_trailer[Lt_size];
+			/*  0-3  CRC32 of the uncompressed data */
+			/*  4-11 size of the uncompressed data */
+			/* 12-19 member size including header and trailer */
+
+static inline unsigned Lt_get_data_crc( const Lzip_trailer data )
+  {
+  unsigned tmp = 0;
+  int i; for( i = 3; i >= 0; --i ) { tmp <<= 8; tmp += data[i]; }
+  return tmp;
+  }
+
+static inline void Lt_set_data_crc( Lzip_trailer data, unsigned crc )
+  { int i; for( i = 0; i <= 3; ++i ) { data[i] = (uint8_t)crc; crc >>= 8; } }
+
+static inline unsigned long long Lt_get_data_size( const Lzip_trailer data )
+  {
+  unsigned long long tmp = 0;
+  int i; for( i = 11; i >= 4; --i ) { tmp <<= 8; tmp += data[i]; }
+  return tmp;
+  }
+
+static inline void Lt_set_data_size( Lzip_trailer data, unsigned long long sz )
+  { int i; for( i = 4; i <= 11; ++i ) { data[i] = (uint8_t)sz; sz >>= 8; } }
+
+static inline unsigned long long Lt_get_member_size( const Lzip_trailer data )
+  {
+  unsigned long long tmp = 0;
+  int i; for( i = 19; i >= 12; --i ) { tmp <<= 8; tmp += data[i]; }
+  return tmp;
+  }
+
+static inline void Lt_set_member_size( Lzip_trailer data, unsigned long long sz )
+  { int i; for( i = 12; i <= 19; ++i ) { data[i] = (uint8_t)sz; sz >>= 8; } }
+
+/* check internal consistency */
+static inline bool Lt_check_consistency( const Lzip_trailer data )
+  {
+  const unsigned crc = Lt_get_data_crc( data );
+  const unsigned long long dsize = Lt_get_data_size( data );
+  if( ( crc == 0 ) != ( dsize == 0 ) ) return false;
+  const unsigned long long msize = Lt_get_member_size( data );
+  if( msize < min_member_size ) return false;
+  const unsigned long long mlimit = ( 9 * dsize + 7 ) / 8 + min_member_size;
+  if( mlimit > dsize && msize > mlimit ) return false;
+  const unsigned long long dlimit = 7090 * ( msize - 26 ) - 1;
+  if( dlimit > msize && dsize > dlimit ) return false;
+  return true;
+  }
+
+
+struct Cl_options		/* command-line options */
+  {
+  bool ignore_empty;
+  bool ignore_marking;
+  bool ignore_trailing;
+  bool loose_trailing;
+  };
+
+static inline void Cl_options_init( struct Cl_options * cl_opts )
+  { cl_opts->ignore_empty = true; cl_opts->ignore_marking = true;
+    cl_opts->ignore_trailing = true; cl_opts->loose_trailing = false; }
+
+
+static inline void set_retval( int * retval, const int new_val )
+  { if( *retval < new_val ) *retval = new_val; }
+
+static const char * const bad_magic_msg = "Bad magic number (file not in lzip format).";
+static const char * const bad_dict_msg = "Invalid dictionary size in member header.";
+static const char * const corrupt_mm_msg = "Corrupt header in multimember file.";
+static const char * const empty_msg = "Empty member not allowed.";
+static const char * const marking_msg = "Marking data not allowed.";
+static const char * const trailing_msg = "Trailing data not allowed.";
+static const char * const mem_msg = "Not enough memory.";
+
+/* defined in decoder.c */
+int readblock( const int fd, uint8_t * const buf, const int size );
+int writeblock( const int fd, const uint8_t * const buf, const int size );
+
+/* defined in list.c */
+int list_files( const char * const filenames[], const int num_filenames,
+                const struct Cl_options * const cl_opts );
+
+/* defined in main.c */
+struct stat;
+struct Pretty_print;
+extern int verbosity;
+void * resize_buffer( void * buf, const unsigned min_size );
+void Pp_show_msg( struct Pretty_print * const pp, const char * const msg );
+const char * bad_version( const unsigned version );
+const char * format_ds( const unsigned dictionary_size );
+void show_header( const unsigned dictionary_size );
+int open_instream( const char * const name, struct stat * const in_statsp,
+                   const bool one_to_one, const bool reg_only );
+void cleanup_and_fail( const int retval );
+void show_error( const char * const msg, const int errcode, const bool help );
+void show_file_error( const char * const filename, const char * const msg,
+                      const int errcode );
+void internal_error( const char * const msg );
+struct Matchfinder_base;
+void show_cprogress( const unsigned long long cfile_size,
+                     const unsigned long long partial_size,
+                     const struct Matchfinder_base * const m,
+                     struct Pretty_print * const p );
+struct Range_decoder;
+void show_dprogress( const unsigned long long cfile_size,
+                     const unsigned long long partial_size,
+                     const struct Range_decoder * const d,
+                     struct Pretty_print * const p );
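Note on the coded dictionary size in lzip.h above: byte 5 of the header stores the base-2 logarithm of a base size in its low 5 bits and, in its high 3 bits, how many sixteenths of that base size to subtract. The sketch below restates Lh_get_dictionary_size and Lh_set_dictionary_size as standalone functions for experimentation. `decode_ds` and `encode_ds` are hypothetical names, and `encode_ds` returns the byte instead of writing into a header.

```c
#include <stdint.h>

enum { min_dictionary_size = 1 << 12, max_dictionary_size = 1 << 29 };

/* Coded byte -> dictionary size in bytes. */
static unsigned decode_ds( const uint8_t coded )
  {
  unsigned sz = 1U << ( coded & 0x1F );
  if( sz > min_dictionary_size )
    sz -= ( sz / 16 ) * ( ( coded >> 5 ) & 7 );
  return sz;
  }

/* Dictionary size -> coded byte, or -1 if sz is outside the valid range.
   The result is the smallest representable size that is >= sz. */
static int encode_ds( const unsigned sz )
  {
  if( sz < min_dictionary_size || sz > max_dictionary_size ) return -1;
  int bits = 0;
  unsigned v = sz - 1, i;
  while( v > 0 ) { v >>= 1; ++bits; }	/* bits needed to hold sz - 1 */
  int coded = bits;
  if( sz > min_dictionary_size )
    {
    const unsigned base_size = 1U << bits, fraction = base_size / 16;
    for( i = 7; i >= 1; --i )
      if( base_size - ( i * fraction ) >= sz ) { coded |= i << 5; break; }
    }
  return coded;
  }
```

As a worked case, the byte 0xD3 has low bits 19 and high bits 6, so it decodes to 2^19 - 6 * 2^15 = 327680 bytes (320 KiB), and encoding 327680 gives 0xD3 back.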
diff --git a/lzip_index.c b/lzip_index.c
new file mode 100644
index 0000000..b7d594c
--- /dev/null
+++ b/lzip_index.c
@@ -0,0 +1,289 @@
+/* Clzip - LZMA lossless data compressor
+   Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+*/
+
+#define _FILE_OFFSET_BITS 64
+
+#include <errno.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "lzip.h"
+#include "lzip_index.h"
+
+
+static int seek_read( const int fd, uint8_t * const buf, const int size,
+                      const long long pos )
+  {
+  if( lseek( fd, pos, SEEK_SET ) == pos )
+    return readblock( fd, buf, size );
+  return 0;
+  }
+
+
+static bool add_error( struct Lzip_index * const li, const char * const msg )
+  {
+  const int len = strlen( msg );
+  void * tmp = resize_buffer( li->error, li->error_size + len + 1 );
+  if( !tmp ) return false;
+  li->error = (char *)tmp;
+  strncpy( li->error + li->error_size, msg, len + 1 );
+  li->error_size += len;
+  return true;
+  }
+
+
+static bool push_back_member( struct Lzip_index * const li,
+                              const long long dp, const long long ds,
+                              const long long mp, const long long ms,
+                              const unsigned dict_size )
+  {
+  struct Member * p;
+  void * tmp = resize_buffer( li->member_vector,
+               ( li->members + 1 ) * sizeof li->member_vector[0] );
+  if( !tmp ) { add_error( li, mem_msg ); li->retval = 1; return false; }
+  li->member_vector = (struct Member *)tmp;
+  p = &(li->member_vector[li->members]);
+  init_member( p, dp, ds, mp, ms, dict_size );
+  ++li->members;
+  return true;
+  }
+
+
+static void Li_free_member_vector( struct Lzip_index * const li )
+  {
+  if( li->member_vector )
+    { free( li->member_vector ); li->member_vector = 0; }
+  li->members = 0;
+  }
+
+
+static void Li_reverse_member_vector( struct Lzip_index * const li )
+  {
+  struct Member tmp;
+  long i;
+  for( i = 0; i < li->members / 2; ++i )
+    {
+    tmp = li->member_vector[i];
+    li->member_vector[i] = li->member_vector[li->members-i-1];
+    li->member_vector[li->members-i-1] = tmp;
+    }
+  }
+
+
+static bool Li_check_header( struct Lzip_index * const li,
+                             const Lzip_header header )
+  {
+  if( !Lh_check_magic( header ) )
+    { add_error( li, bad_magic_msg ); li->retval = 2; return false; }
+  if( !Lh_check_version( header ) )
+    { add_error( li, bad_version( Lh_version( header ) ) ); li->retval = 2;
+      return false; }
+  if( !isvalid_ds( Lh_get_dictionary_size( header ) ) )
+    { add_error( li, bad_dict_msg ); li->retval = 2; return false; }
+  return true;
+  }
+
+static void Li_set_errno_error( struct Lzip_index * const li,
+                                const char * const msg )
+  {
+  add_error( li, msg ); add_error( li, strerror( errno ) );
+  li->retval = 1;
+  }
+
+static void Li_set_num_error( struct Lzip_index * const li,
+                              const char * const msg, unsigned long long num )
+  {
+  char buf[80];
+  snprintf( buf, sizeof buf, "%s%llu", msg, num );
+  add_error( li, buf );
+  li->retval = 2;
+  }
+
+
+static bool Li_read_header( struct Lzip_index * const li, const int fd,
+            Lzip_header header, const long long pos, const bool ignore_marking )
+  {
+  if( seek_read( fd, header, Lh_size, pos ) != Lh_size )
+    { Li_set_errno_error( li, "Error reading member header: " ); return false; }
+  uint8_t byte;
+  if( !ignore_marking && readblock( fd, &byte, 1 ) == 1 && byte != 0 )
+    { add_error( li, marking_msg ); li->retval = 2; return false; }
+  return true;
+  }
+
+
+/* If successful, push last member and set pos to member header. */
+static bool Li_skip_trailing_data( struct Lzip_index * const li, const int fd,
+                                   unsigned long long * const pos,
+                                   const struct Cl_options * const cl_opts )
+  {
+  if( *pos < min_member_size ) return false;
+  enum { block_size = 16384,
+         buffer_size = block_size + Lt_size - 1 + Lh_size };
+  uint8_t buffer[buffer_size];
+  int bsize = *pos % block_size;		/* total bytes in buffer */
+  if( bsize <= buffer_size - block_size ) bsize += block_size;
+  int search_size = bsize;			/* bytes to search for trailer */
+  int rd_size = bsize;				/* bytes to read from file */
+  unsigned long long ipos = *pos - rd_size;	/* aligned to block_size */
+
+  while( true )
+    {
+    if( seek_read( fd, buffer, rd_size, ipos ) != rd_size )
+      { Li_set_errno_error( li, "Error seeking member trailer: " ); return false; }
+    const uint8_t max_msb = ( ipos + search_size ) >> 56;
+    int i;
+    for( i = search_size; i >= Lt_size; --i )
+      if( buffer[i-1] <= max_msb )	/* most significant byte of member_size */
+        {
+        const Lzip_trailer * const trailer =
+          (const Lzip_trailer *)( buffer + i - Lt_size );
+        const unsigned long long member_size = Lt_get_member_size( *trailer );
+        if( member_size == 0 )			/* skip trailing zeros */
+          { while( i > Lt_size && buffer[i-9] == 0 ) --i; continue; }
+        if( member_size > ipos + i || !Lt_check_consistency( *trailer ) )
+          continue;
+        Lzip_header header;
+        if( !Li_read_header( li, fd, header, ipos + i - member_size,
+                             cl_opts->ignore_marking ) ) return false;
+        if( !Lh_check( header ) ) continue;
+        const Lzip_header * header2 = (const Lzip_header *)( buffer + i );
+        const bool full_h2 = bsize - i >= Lh_size;
+        if( Lh_check_prefix( *header2, bsize - i ) )	/* last member */
+          {
+          if( !full_h2 ) add_error( li, "Last member in input file is truncated." );
+          else if( Li_check_header( li, *header2 ) )
+            add_error( li, "Last member in input file is truncated or corrupt." );
+          li->retval = 2; return false;
+          }
+        if( !cl_opts->loose_trailing && full_h2 && Lh_check_corrupt( *header2 ) )
+          { add_error( li, corrupt_mm_msg ); li->retval = 2; return false; }
+        if( !cl_opts->ignore_trailing )
+          { add_error( li, trailing_msg ); li->retval = 2; return false; }
+        const unsigned long long data_size = Lt_get_data_size( *trailer );
+        if( !cl_opts->ignore_empty && data_size == 0 )
+          { add_error( li, empty_msg ); li->retval = 2; return false; }
+        *pos = ipos + i - member_size;			/* good member */
+        const unsigned dictionary_size = Lh_get_dictionary_size( header );
+        if( li->dictionary_size < dictionary_size )
+          li->dictionary_size = dictionary_size;
+        return push_back_member( li, 0, data_size, *pos, member_size,
+                                 dictionary_size );
+        }
+    if( ipos == 0 )
+      { Li_set_num_error( li, "Bad trailer at pos ", *pos - Lt_size );
+        return false; }
+    bsize = buffer_size;
+    search_size = bsize - Lh_size;
+    rd_size = block_size;
+    ipos -= rd_size;
+    memcpy( buffer + rd_size, buffer, buffer_size - rd_size );
+    }
+  }
+
+
+bool Li_init( struct Lzip_index * const li, const int infd,
+              const struct Cl_options * const cl_opts )
+  {
+  li->member_vector = 0;
+  li->error = 0;
+  li->insize = lseek( infd, 0, SEEK_END );
+  li->members = 0;
+  li->error_size = 0;
+  li->retval = 0;
+  li->dictionary_size = 0;
+  if( li->insize < 0 )
+    { Li_set_errno_error( li, "Input file is not seekable: " ); return false; }
+  if( li->insize < min_member_size )
+    { add_error( li, "Input file is too short." ); li->retval = 2;
+      return false; }
+  if( li->insize > INT64_MAX )
+    { add_error( li, "Input file is too long (2^63 bytes or more)." );
+      li->retval = 2; return false; }
+
+  Lzip_header header;
+  if( !Li_read_header( li, infd, header, 0, cl_opts->ignore_marking ) ||
+      !Li_check_header( li, header ) ) return false;
+
+  unsigned long long pos = li->insize;	/* always points to a header or to EOF */
+  while( pos >= min_member_size )
+    {
+    Lzip_trailer trailer;
+    if( seek_read( infd, trailer, Lt_size, pos - Lt_size ) != Lt_size )
+      { Li_set_errno_error( li, "Error reading member trailer: " ); break; }
+    const unsigned long long member_size = Lt_get_member_size( trailer );
+    if( member_size > pos || !Lt_check_consistency( trailer ) )
+      {							/* bad trailer */
+      if( li->members <= 0 )
+        { if( Li_skip_trailing_data( li, infd, &pos, cl_opts ) ) continue;
+          return false; }
+      Li_set_num_error( li, "Bad trailer at pos ", pos - Lt_size ); break;
+      }
+    if( !Li_read_header( li, infd, header, pos - member_size,
+                         cl_opts->ignore_marking ) ) break;
+    if( !Lh_check( header ) )				/* bad header */
+      {
+      if( li->members <= 0 )
+        { if( Li_skip_trailing_data( li, infd, &pos, cl_opts ) ) continue;
+          return false; }
+      Li_set_num_error( li, "Bad header at pos ", pos - member_size ); break;
+      }
+    const unsigned long long data_size = Lt_get_data_size( trailer );
+    if( !cl_opts->ignore_empty && data_size == 0 )
+      { add_error( li, empty_msg ); li->retval = 2; break; }
+    pos -= member_size;					/* good member */
+    const unsigned dictionary_size = Lh_get_dictionary_size( header );
+    if( li->dictionary_size < dictionary_size )
+      li->dictionary_size = dictionary_size;
+    if( !push_back_member( li, 0, data_size, pos, member_size,
+                           dictionary_size ) ) return false;
+    }
+  if( pos != 0 || li->members <= 0 || li->retval != 0 )
+    {
+    Li_free_member_vector( li );
+    if( li->retval == 0 )
+      { add_error( li, "Can't create file index." ); li->retval = 2; }
+    return false;
+    }
+  Li_reverse_member_vector( li );
+  long i;
+  for( i = 0; ; ++i )
+    {
+    const long long end = block_end( li->member_vector[i].dblock );
+    if( end < 0 || end > INT64_MAX )
+      {
+      Li_free_member_vector( li );
+      add_error( li, "Data in input file is too long (2^63 bytes or more)." );
+      li->retval = 2; return false;
+      }
+    if( i + 1 >= li->members ) break;
+    li->member_vector[i+1].dblock.pos = end;
+    }
+  return true;
+  }
+
+
+void Li_free( struct Lzip_index * const li )
+  {
+  Li_free_member_vector( li );
+  if( li->error ) { free( li->error ); li->error = 0; }
+  li->error_size = 0;
+  }
diff --git a/lzip_index.h b/lzip_index.h
new file mode 100644
index 0000000..e273eaf
--- /dev/null
+++ b/lzip_index.h
@@ -0,0 +1,91 @@
+/* Clzip - LZMA lossless data compressor
+   Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+*/
+
+#ifndef INT64_MAX
+#define INT64_MAX 0x7FFFFFFFFFFFFFFFLL
+#endif
+
+
+struct Block
+  {
+  long long pos, size;		/* pos >= 0, size >= 0, pos + size <= INT64_MAX */
+  };
+
+static inline void init_block( struct Block * const b,
+                               const long long p, const long long s )
+  { b->pos = p; b->size = s; }
+
+static inline long long block_end( const struct Block b )
+  { return b.pos + b.size; }
+
+
+struct Member
+  {
+  struct Block dblock, mblock;		/* data block, member block */
+  unsigned dictionary_size;
+  };
+
+static inline void init_member( struct Member * const m,
+                                const long long dpos, const long long dsize,
+                                const long long mpos, const long long msize,
+                                const unsigned dict_size )
+  { init_block( &m->dblock, dpos, dsize ); init_block( &m->mblock, mpos, msize );
+    m->dictionary_size = dict_size; }
+
+struct Lzip_index
+  {
+  struct Member * member_vector;
+  char * error;
+  long long insize;
+  long members;
+  int error_size;
+  int retval;
+  unsigned dictionary_size;	/* largest dictionary size in the file */
+  };
+
+bool Li_init( struct Lzip_index * const li, const int infd,
+              const struct Cl_options * const cl_opts );
+
+void Li_free( struct Lzip_index * const li );
+
+static inline long long Li_udata_size( const struct Lzip_index * const li )
+  {
+  if( li->members <= 0 ) return 0;
+  return block_end( li->member_vector[li->members-1].dblock );
+  }
+
+static inline long long Li_cdata_size( const struct Lzip_index * const li )
+  {
+  if( li->members <= 0 ) return 0;
+  return block_end( li->member_vector[li->members-1].mblock );
+  }
+
+  /* total size including trailing data (if any) */
+static inline long long Li_file_size( const struct Lzip_index * const li )
+  { if( li->insize >= 0 ) return li->insize; else return 0; }
+
+static inline const struct Block * Li_dblock( const struct Lzip_index * const li,
+                                              const long i )
+  { return &li->member_vector[i].dblock; }
+
+static inline const struct Block * Li_mblock( const struct Lzip_index * const li,
+                                              const long i )
+  { return &li->member_vector[i].mblock; }
+
+static inline unsigned Li_dictionary_size( const struct Lzip_index * const li,
+                                           const long i )
+  { return li->member_vector[i].dictionary_size; }
diff --git a/main.c b/main.c
new file mode 100644
index 0000000..788699f
--- /dev/null
+++ b/main.c
@@ -0,0 +1,1223 @@
+/* Clzip - LZMA lossless data compressor
+   Copyright (C) 2010-2024 Antonio Diaz Diaz.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+*/
+/*
+   Exit status: 0 for a normal exit, 1 for environmental problems
+   (file not found, invalid command-line options, I/O errors, etc), 2 to
+   indicate a corrupt or invalid input file, 3 for an internal consistency
+   error (e.g., bug) which caused clzip to panic.
+*/
+
+#define _FILE_OFFSET_BITS 64
+
+#include <ctype.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <limits.h>		/* SSIZE_MAX */
+#include <signal.h>
+#include <stdbool.h>
+#include <stdint.h>		/* SIZE_MAX */
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <utime.h>
+#include <sys/stat.h>
+#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__
+#include <io.h>
+#if defined __MSVCRT__
+#define fchmod(x,y) 0
+#define fchown(x,y,z) 0
+#define strtoull strtoul
+#define SIGHUP SIGTERM
+#define S_ISSOCK(x) 0
+#ifndef S_IRGRP
+#define S_IRGRP 0
+#define S_IWGRP 0
+#define S_IROTH 0
+#define S_IWOTH 0
+#endif
+#endif
+#if defined __DJGPP__
+#define S_ISSOCK(x) 0
+#define S_ISVTX 0
+#endif
+#endif
+
+#include "carg_parser.h"
+#include "lzip.h"
+#include "decoder.h"
+#include "encoder_base.h"
+#include "encoder.h"
+#include "fast_encoder.h"
+
+#ifndef O_BINARY
+#define O_BINARY 0
+#endif
+
+#if CHAR_BIT != 8
+#error "Environments where CHAR_BIT != 8 are not supported."
+#endif
+
+#if ( defined  SIZE_MAX &&  SIZE_MAX < UINT_MAX ) || \
+    ( defined SSIZE_MAX && SSIZE_MAX <  INT_MAX )
+#error "Environments where 'size_t' is narrower than 'int' are not supported."
+#endif
+
+int verbosity = 0;
+
+static const char * const program_name = "clzip";
+static const char * const program_year = "2024";
+static const char * invocation_name = "clzip";		/* default value */
+
+static const struct { const char * from; const char * to; } known_extensions[] = {
+  { ".lz",  ""     },
+  { ".tlz", ".tar" },
+  { 0,      0      } };
+
+struct Lzma_options
+  {
+  int dictionary_size;		/* 4 KiB .. 512 MiB */
+  int match_len_limit;		/* 5 .. 273 */
+  };
+
+enum Mode { m_compress, m_decompress, m_list, m_test };
+
+/* Variables used in signal handler context.
+   They are not declared volatile because the handler never returns. */
+static char * output_filename = 0;
+static int outfd = -1;
+static bool delete_output_on_interrupt = false;
+
+
+static void show_help( void )
+  {
+  printf( "Clzip is a C language version of lzip, compatible with lzip 1.4 or newer. As\n"
+          "clzip is written in C, it may be easier to integrate in applications like\n"
+          "package managers, embedded devices, or systems lacking a C++ compiler.\n"
+          "\nLzip is a lossless data compressor with a user interface similar to the one\n"
+          "of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov\n"
+          "chain-Algorithm' (LZMA) stream format to maximize interoperability. The\n"
+          "maximum dictionary size is 512 MiB so that any lzip file can be decompressed\n"
+          "on 32-bit machines. Lzip provides accurate and robust 3-factor integrity\n"
+          "checking. Lzip can compress about as fast as gzip (lzip -0) or compress most\n"
+          "files more than bzip2 (lzip -9). Decompression speed is intermediate between\n"
+          "gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery\n"
+          "perspective. Lzip has been designed, written, and tested with great care to\n"
+          "replace gzip and bzip2 as the standard general-purpose compressed format for\n"
+          "Unix-like systems.\n"
+          "\nUsage: %s [options] [files]\n", invocation_name );
+  printf( "\nOptions:\n"
+          "  -h, --help                     display this help and exit\n"
+          "  -V, --version                  output version information and exit\n"
+          "  -a, --trailing-error           exit with error status if trailing data\n"
+          "  -b, --member-size=<bytes>      set member size limit in bytes\n"
+          "  -c, --stdout                   write to standard output, keep input files\n"
+          "  -d, --decompress               decompress, test compressed file integrity\n"
+          "  -f, --force                    overwrite existing output files\n"
+          "  -F, --recompress               force re-compression of compressed files\n"
+          "  -k, --keep                     keep (don't delete) input files\n"
+          "  -l, --list                     print (un)compressed file sizes\n"
+          "  -m, --match-length=<bytes>     set match length limit in bytes [36]\n"
+          "  -o, --output=<file>            write to <file>, keep input files\n"
+          "  -q, --quiet                    suppress all messages\n"
+          "  -s, --dictionary-size=<bytes>  set dictionary size limit in bytes [8 MiB]\n"
+          "  -S, --volume-size=<bytes>      set volume size limit in bytes\n"
+          "  -t, --test                     test compressed file integrity\n"
+          "  -v, --verbose                  be verbose (a 2nd -v gives more)\n"
+          "  -0 .. -9                       set compression level [default 6]\n"
+          "      --fast                     alias for -0\n"
+          "      --best                     alias for -9\n"
+          "      --empty-error              exit with error status if empty member in file\n"
+          "      --marking-error            exit with error status if 1st LZMA byte not 0\n"
+          "      --loose-trailing           allow trailing data seeming corrupt header\n"
+          "\nIf no file names are given, or if a file is '-', clzip compresses or\n"
+          "decompresses from standard input to standard output.\n"
+          "Numbers may be followed by a multiplier: k = kB = 10^3 = 1000,\n"
+          "Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc...\n"
+          "Dictionary sizes 12 to 29 are interpreted as powers of two, meaning 2^12 to\n"
+          "2^29 bytes.\n"
+          "\nThe bidimensional parameter space of LZMA can't be mapped to a linear scale\n"
+          "optimal for all files. If your files are large, very repetitive, etc, you\n"
+          "may need to use the options --dictionary-size and --match-length directly\n"
+          "to achieve optimal performance.\n"
+          "\nTo extract all the files from archive 'foo.tar.lz', use the commands\n"
+          "'tar -xf foo.tar.lz' or 'clzip -cd foo.tar.lz | tar -xf -'.\n"
+          "\nExit status: 0 for a normal exit, 1 for environmental problems\n"
+          "(file not found, invalid command-line options, I/O errors, etc), 2 to\n"
+          "indicate a corrupt or invalid input file, 3 for an internal consistency\n"
+          "error (e.g., bug) which caused clzip to panic.\n"
+          "\nThe ideas embodied in clzip are due to (at least) the following people:\n"
+          "Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrei Markov (for the\n"
+          "definition of Markov chains), G.N.N. Martin (for the definition of range\n"
+          "encoding), Igor Pavlov (for putting all the above together in LZMA), and\n"
+          "Julian Seward (for bzip2's CLI).\n"
+          "\nReport bugs to lzip-bug@nongnu.org\n"
+          "Clzip home page: http://www.nongnu.org/lzip/clzip.html\n" );
+  }
+
+
+static void show_version( void )
+  {
+  printf( "%s %s\n", program_name, PROGVERSION );
+  printf( "Copyright (C) %s Antonio Diaz Diaz.\n", program_year );
+  printf( "License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>\n"
+          "This is free software: you are free to change and redistribute it.\n"
+          "There is NO WARRANTY, to the extent permitted by law.\n" );
+  }
+
+
+/* (re)allocate 'buf' to hold 'min_size' bytes; exit on allocation failure */
+void * resize_buffer( void * buf, const unsigned min_size )
+  {
+  if( buf ) buf = realloc( buf, min_size );
+  else buf = malloc( min_size );
+  if( !buf ) { show_error( mem_msg, 0, false ); cleanup_and_fail( 1 ); }
+  return buf;
+  }
+
+
+struct Pretty_print
+  {
+  const char * name;
+  char * padded_name;
+  const char * stdin_name;
+  unsigned longest_name;
+  bool first_post;
+  };
+
+static void Pp_init( struct Pretty_print * const pp,
+                     const char * const filenames[], const int num_filenames )
+  {
+  pp->name = 0;
+  pp->padded_name = 0;
+  pp->stdin_name = "(stdin)";
+  pp->longest_name = 0;
+  pp->first_post = false;
+
+  if( verbosity <= 0 ) return;
+  const unsigned stdin_name_len = strlen( pp->stdin_name );
+  int i;
+  for( i = 0; i < num_filenames; ++i )
+    {
+    const char * const s = filenames[i];
+    const unsigned len = (strcmp( s, "-" ) == 0) ? stdin_name_len : strlen( s );
+    if( pp->longest_name < len ) pp->longest_name = len;
+    }
+  if( pp->longest_name == 0 ) pp->longest_name = stdin_name_len;
+  }
+
+static void Pp_set_name( struct Pretty_print * const pp,
+                         const char * const filename )
+  {
+  unsigned name_len, padded_name_len, i = 0;
+
+  if( filename && filename[0] && strcmp( filename, "-" ) != 0 )
+    pp->name = filename;
+  else pp->name = pp->stdin_name;
+  name_len = strlen( pp->name );
+  padded_name_len = max( name_len, pp->longest_name ) + 4;
+  pp->padded_name = resize_buffer( pp->padded_name, padded_name_len + 1 );
+  while( i < 2 ) pp->padded_name[i++] = ' ';
+  while( i < name_len + 2 ) { pp->padded_name[i] = pp->name[i-2]; ++i; }
+  pp->padded_name[i++] = ':';
+  while( i < padded_name_len ) pp->padded_name[i++] = ' ';
+  pp->padded_name[i] = 0;
+  pp->first_post = true;
+  }
+
+static void Pp_reset( struct Pretty_print * const pp )
+  { if( pp->name && pp->name[0] ) pp->first_post = true; }
+
+void Pp_show_msg( struct Pretty_print * const pp, const char * const msg )
+  {
+  if( verbosity < 0 ) return;
+  if( pp->first_post )
+    {
+    pp->first_post = false;
+    fputs( pp->padded_name, stderr );
+    if( !msg ) fflush( stderr );
+    }
+  if( msg ) fprintf( stderr, "%s\n", msg );
+  }
+
+
+const char * bad_version( const unsigned version )
+  {
+  static char buf[80];
+  snprintf( buf, sizeof buf, "Version %u member format not supported.",
+            version );
+  return buf;
+  }
+
+
+const char * format_ds( const unsigned dictionary_size )
+  {
+  enum { bufsize = 16, factor = 1024, n = 3 };
+  static char buf[bufsize];
+  const char * const prefix[n] = { "Ki", "Mi", "Gi" };
+  const char * p = "";
+  const char * np = "  ";
+  unsigned num = dictionary_size;
+  bool exact = ( num % factor == 0 );
+
+  int i; for( i = 0; i < n && ( num > 9999 || ( exact && num >= factor ) ); ++i )
+    { num /= factor; if( num % factor != 0 ) exact = false;
+      p = prefix[i]; np = ""; }
+  snprintf( buf, bufsize, "%s%4u %sB", np, num, p );
+  return buf;
+  }
+
+
+void show_header( const unsigned dictionary_size )
+  {
+  fprintf( stderr, "dict %s, ", format_ds( dictionary_size ) );
+  }
+
+
+/* separate numbers of 5 or more digits in groups of 3 digits using '_' */
+static const char * format_num3( unsigned long long num )
+  {
+  enum { buffers = 8, bufsize = 4 * sizeof num, n = 10 };
+  const char * const si_prefix = "kMGTPEZYRQ";
+  const char * const binary_prefix = "KMGTPEZYRQ";
+  static char buffer[buffers][bufsize];	/* circle of static buffers for printf */
+  static int current = 0;
+  int i;
+  char * const buf = buffer[current++]; current %= buffers;
+  char * p = buf + bufsize - 1;		/* fill the buffer backwards */
+  *p = 0;	/* terminator */
+  if( num > 1024 )
+    {
+    char prefix = 0;			/* try binary first, then si */
+    for( i = 0; i < n && num != 0 && num % 1024 == 0; ++i )
+      { num /= 1024; prefix = binary_prefix[i]; }
+    if( prefix ) *(--p) = 'i';
+    else
+      for( i = 0; i < n && num != 0 && num % 1000 == 0; ++i )
+        { num /= 1000; prefix = si_prefix[i]; }
+    if( prefix ) *(--p) = prefix;
+    }
+  const bool split = num >= 10000;
+
+  for( i = 0; ; )
+    {
+    *(--p) = num % 10 + '0'; num /= 10; if( num == 0 ) break;
+    if( split && ++i >= 3 ) { i = 0; *(--p) = '_'; }
+    }
+  return p;
+  }
+
+
+void show_option_error( const char * const arg, const char * const msg,
+                        const char * const option_name )
+  {
+  if( verbosity >= 0 )
+    fprintf( stderr, "%s: '%s': %s option '%s'.\n",
+             program_name, arg, msg, option_name );
+  }
+
+
+/* Recognized formats: <num>k, <num>Ki, <num>[MGTPEZYRQ][i] */
+static unsigned long long getnum( const char * const arg,
+                                  const char * const option_name,
+                                  const unsigned long long llimit,
+                                  const unsigned long long ulimit )
+  {
+  char * tail;
+  errno = 0;
+  unsigned long long result = strtoull( arg, &tail, 0 );
+  if( tail == arg )
+    { show_option_error( arg, "Bad or missing numerical argument in",
+                         option_name ); exit( 1 ); }
+
+  if( !errno && tail[0] )
+    {
+    const unsigned factor = ( tail[1] == 'i' ) ? 1024 : 1000;
+    int exponent = 0;				/* 0 = bad multiplier */
+    int i;
+    switch( tail[0] )
+      {
+      case 'Q': exponent = 10; break;
+      case 'R': exponent = 9; break;
+      case 'Y': exponent = 8; break;
+      case 'Z': exponent = 7; break;
+      case 'E': exponent = 6; break;
+      case 'P': exponent = 5; break;
+      case 'T': exponent = 4; break;
+      case 'G': exponent = 3; break;
+      case 'M': exponent = 2; break;
+      case 'K': if( factor == 1024 ) exponent = 1; break;
+      case 'k': if( factor == 1000 ) exponent = 1; break;
+      }
+    if( exponent <= 0 )
+      { show_option_error( arg, "Bad multiplier in numerical argument of",
+                           option_name ); exit( 1 ); }
+    for( i = 0; i < exponent; ++i )
+      {
+      if( ulimit / factor >= result ) result *= factor;
+      else { errno = ERANGE; break; }
+      }
+    }
+  if( !errno && ( result < llimit || result > ulimit ) ) errno = ERANGE;
+  if( errno )
+    {
+    if( verbosity >= 0 )
+      fprintf( stderr, "%s: '%s': Value out of limits [%s,%s] in "
+               "option '%s'.\n", program_name, arg, format_num3( llimit ),
+               format_num3( ulimit ), option_name );
+    exit( 1 );
+    }
+  return result;
+  }
+
+
+static int get_dict_size( const char * const arg, const char * const option_name )
+  {
+  char * tail;
+  const long bits = strtol( arg, &tail, 0 );
+  if( bits >= min_dictionary_bits &&
+      bits <= max_dictionary_bits && *tail == 0 )
+    return 1 << bits;
+  return getnum( arg, option_name, min_dictionary_size, max_dictionary_size );
+  }
+
+
+static void set_mode( enum Mode * const program_modep, const enum Mode new_mode )
+  {
+  if( *program_modep != m_compress && *program_modep != new_mode )
+    {
+    show_error( "Only one operation can be specified.", 0, true );
+    exit( 1 );
+    }
+  *program_modep = new_mode;
+  }
+
+
+static int extension_index( const char * const name )
+  {
+  int eindex;
+  for( eindex = 0; known_extensions[eindex].from; ++eindex )
+    {
+    const char * const ext = known_extensions[eindex].from;
+    const unsigned name_len = strlen( name );
+    const unsigned ext_len = strlen( ext );
+    if( name_len > ext_len &&
+        strncmp( name + name_len - ext_len, ext, ext_len ) == 0 )
+      return eindex;
+    }
+  return -1;
+  }
+
+
+static void set_c_outname( const char * const name, const bool filenames_given,
+                           const bool force_ext, const bool multifile )
+  {
+  /* zupdate < 1.9 depends on lzip adding the extension '.lz' to name when
+     reading from standard input. */
+  output_filename = resize_buffer( output_filename, strlen( name ) + 5 +
+                                   strlen( known_extensions[0].from ) + 1 );
+  strcpy( output_filename, name );
+  if( multifile ) strcat( output_filename, "00001" );
+  if( force_ext || multifile ||
+      ( !filenames_given && extension_index( output_filename ) < 0 ) )
+    strcat( output_filename, known_extensions[0].from );
+  }
+
+
+static void set_d_outname( const char * const name, const int eindex )
+  {
+  const unsigned name_len = strlen( name );
+  if( eindex >= 0 )
+    {
+    const char * const from = known_extensions[eindex].from;
+    const unsigned from_len = strlen( from );
+    if( name_len > from_len )
+      {
+      output_filename = resize_buffer( output_filename, name_len +
+                                       strlen( known_extensions[eindex].to ) + 1 );
+      strcpy( output_filename, name );
+      strcpy( output_filename + name_len - from_len, known_extensions[eindex].to );
+      return;
+      }
+    }
+  output_filename = resize_buffer( output_filename, name_len + 4 + 1 );
+  strcpy( output_filename, name );
+  strcat( output_filename, ".out" );
+  if( verbosity >= 1 )
+    fprintf( stderr, "%s: %s: Can't guess original name -- using '%s'\n",
+             program_name, name, output_filename );
+  }
+
+
+int open_instream( const char * const name, struct stat * const in_statsp,
+                   const bool one_to_one, const bool reg_only )
+  {
+  int infd = open( name, O_RDONLY | O_BINARY );
+  if( infd < 0 )
+    show_file_error( name, "Can't open input file", errno );
+  else
+    {
+    const int i = fstat( infd, in_statsp );
+    const mode_t mode = in_statsp->st_mode;
+    const bool can_read = ( i == 0 && !reg_only &&
+                            ( S_ISBLK( mode ) || S_ISCHR( mode ) ||
+                              S_ISFIFO( mode ) || S_ISSOCK( mode ) ) );
+    if( i != 0 || ( !S_ISREG( mode ) && ( !can_read || one_to_one ) ) )
+      {
+      if( verbosity >= 0 )
+        fprintf( stderr, "%s: %s: Input file is not a regular file%s.\n",
+                 program_name, name, ( can_read && one_to_one ) ?
+                 ",\n  and neither '-c' nor '-o' were specified" : "" );
+      close( infd );
+      infd = -1;
+      }
+    }
+  return infd;
+  }
+
+
+static int open_instream2( const char * const name, struct stat * const in_statsp,
+                           const enum Mode program_mode, const int eindex,
+                           const bool one_to_one, const bool recompress )
+  {
+  if( program_mode == m_compress && !recompress && eindex >= 0 )
+    {
+    if( verbosity >= 0 )
+      fprintf( stderr, "%s: %s: Input file already has '%s' suffix.\n",
+               program_name, name, known_extensions[eindex].from );
+    return -1;
+    }
+  return open_instream( name, in_statsp, one_to_one, false );
+  }
+
+
+static bool make_dirs( const char * const name )
+  {
+  int i = strlen( name );
+  while( i > 0 && name[i-1] != '/' ) --i;	/* remove last component */
+  while( i > 0 && name[i-1] == '/' ) --i;	/* remove slash(es) */
+  const int dirsize = i;	/* size of dirname without trailing slash(es) */
+
+  for( i = 0; i < dirsize; )	/* if dirsize == 0, dirname is '/' or empty */
+    {
+    while( i < dirsize && name[i] == '/' ) ++i;
+    const int first = i;
+    while( i < dirsize && name[i] != '/' ) ++i;
+    if( first < i )
+      {
+      char partial[i+1]; memcpy( partial, name, i ); partial[i] = 0;
+      const mode_t mode = S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH;
+      struct stat st;
+      if( stat( partial, &st ) == 0 )
+        { if( !S_ISDIR( st.st_mode ) ) { errno = ENOTDIR; return false; } }
+      else if( mkdir( partial, mode ) != 0 && errno != EEXIST )
+        return false;		/* if EEXIST, another process created the dir */
+      }
+    }
+  return true;
+  }
+
+
+static bool open_outstream( const bool force, const bool protect )
+  {
+  const mode_t usr_rw = S_IRUSR | S_IWUSR;
+  const mode_t all_rw = usr_rw | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH;
+  const mode_t outfd_mode = protect ? usr_rw : all_rw;
+  int flags = O_CREAT | O_WRONLY | O_BINARY;
+  if( force ) flags |= O_TRUNC; else flags |= O_EXCL;
+
+  outfd = -1;
+  const int len = strlen( output_filename );
+  if( len > 0 && output_filename[len-1] == '/' ) errno = EISDIR;
+  else {
+    if( !protect && !make_dirs( output_filename ) )
+      { show_file_error( output_filename,
+          "Error creating intermediate directory", errno ); return false; }
+    outfd = open( output_filename, flags, outfd_mode );
+    if( outfd >= 0 ) { delete_output_on_interrupt = true; return true; }
+    if( errno == EEXIST )
+      { show_file_error( output_filename,
+          "Output file already exists, skipping.", 0 ); return false; }
+    }
+  show_file_error( output_filename, "Can't create output file", errno );
+  return false;
+  }
+
+
+static void set_signals( void (*action)(int) )
+  {
+  signal( SIGHUP, action );
+  signal( SIGINT, action );
+  signal( SIGTERM, action );
+  }
+
+
+void cleanup_and_fail( const int retval )
+  {
+  set_signals( SIG_IGN );			/* ignore signals */
+  if( delete_output_on_interrupt )
+    {
+    delete_output_on_interrupt = false;
+    show_file_error( output_filename, "Deleting output file, if it exists.", 0 );
+    if( outfd >= 0 ) { close( outfd ); outfd = -1; }
+    if( remove( output_filename ) != 0 && errno != ENOENT )
+      show_error( "warning: deletion of output file failed", errno, false );
+    }
+  exit( retval );
+  }
+
+
+static void signal_handler( int sig )
+  {
+  if( sig ) {}				/* keep compiler happy */
+  show_error( "Control-C or similar caught, quitting.", 0, false );
+  cleanup_and_fail( 1 );
+  }
+
+
+static bool check_tty_in( const char * const input_filename, const int infd,
+                          const enum Mode program_mode, int * const retval )
+  {
+  if( ( program_mode == m_decompress || program_mode == m_test ) &&
+      isatty( infd ) )				/* for example /dev/tty */
+    { show_file_error( input_filename,
+                       "I won't read compressed data from a terminal.", 0 );
+      close( infd ); set_retval( retval, 2 );
+      if( program_mode != m_test ) cleanup_and_fail( *retval );
+      return false; }
+  return true;
+  }
+
+static bool check_tty_out( const enum Mode program_mode )
+  {
+  if( program_mode == m_compress && isatty( outfd ) )
+    { show_file_error( output_filename[0] ?
+                       output_filename : "(stdout)",
+                       "I won't write compressed data to a terminal.", 0 );
+      return false; }
+  return true;
+  }
+
+
+/* Set permissions, owner, and times. */
+static void close_and_set_permissions( const struct stat * const in_statsp )
+  {
+  bool warning = false;
+  if( in_statsp )
+    {
+    const mode_t mode = in_statsp->st_mode;
+    /* fchown in many cases returns with EPERM, which can be safely ignored. */
+    if( fchown( outfd, in_statsp->st_uid, in_statsp->st_gid ) == 0 )
+      { if( fchmod( outfd, mode ) != 0 ) warning = true; }
+    else
+      if( errno != EPERM ||
+          fchmod( outfd, mode & ~( S_ISUID | S_ISGID | S_ISVTX ) ) != 0 )
+        warning = true;
+    }
+  if( close( outfd ) != 0 )
+    { show_file_error( output_filename, "Error closing output file", errno );
+      cleanup_and_fail( 1 ); }
+  outfd = -1;
+  delete_output_on_interrupt = false;
+  if( in_statsp )
+    {
+    struct utimbuf t;
+    t.actime = in_statsp->st_atime;
+    t.modtime = in_statsp->st_mtime;
+    if( utime( output_filename, &t ) != 0 ) warning = true;
+    }
+  if( warning && verbosity >= 1 )
+    show_file_error( output_filename,
+                     "warning: can't change output file attributes", errno );
+  }
+
+
+static bool next_filename( void )
+  {
+  const unsigned name_len = strlen( output_filename );
+  const unsigned ext_len = strlen( known_extensions[0].from );
+  int i, j;
+  if( name_len >= ext_len + 5 )				/* "*00001.lz" */
+    for( i = name_len - ext_len - 1, j = 0; j < 5; --i, ++j )
+      {
+      if( output_filename[i] < '9' ) { ++output_filename[i]; return true; }
+      else output_filename[i] = '0';
+      }
+  return false;
+  }
+
+
+struct Poly_encoder
+  {
+  struct LZ_encoder_base * eb;
+  struct LZ_encoder * e;
+  struct FLZ_encoder * fe;
+  };
+
+
+static int compress( const unsigned long long cfile_size,
+                     const unsigned long long member_size,
+                     const unsigned long long volume_size, const int infd,
+                     const struct Lzma_options * const encoder_options,
+                     struct Pretty_print * const pp,
+                     const struct stat * const in_statsp, const bool zero )
+  {
+  int retval = 0;
+  struct Poly_encoder encoder = { 0, 0, 0 };	/* polymorphic encoder */
+  if( verbosity >= 1 ) Pp_show_msg( pp, 0 );
+
+  {
+  bool error = false;
+  if( zero )
+    {
+    encoder.fe = (struct FLZ_encoder *)malloc( sizeof *encoder.fe );
+    if( !encoder.fe || !FLZe_init( encoder.fe, infd, outfd ) ) error = true;
+    else encoder.eb = &encoder.fe->eb;
+    }
+  else
+    {
+    Lzip_header header;
+    if( Lh_set_dictionary_size( header, encoder_options->dictionary_size ) &&
+        encoder_options->match_len_limit >= min_match_len_limit &&
+        encoder_options->match_len_limit <= max_match_len )
+      encoder.e = (struct LZ_encoder *)malloc( sizeof *encoder.e );
+    else internal_error( "invalid argument to encoder." );
+    if( !encoder.e || !LZe_init( encoder.e, Lh_get_dictionary_size( header ),
+                                 encoder_options->match_len_limit, infd, outfd ) )
+      error = true;
+    else encoder.eb = &encoder.e->eb;
+    }
+  if( error )
+    {
+    Pp_show_msg( pp, "Not enough memory. Try a smaller dictionary size." );
+    return 1;
+    }
+  }
+
+  unsigned long long in_size = 0, out_size = 0, partial_volume_size = 0;
+  while( true )			/* encode one member per iteration */
+    {
+    const unsigned long long size = ( volume_size > 0 ) ?
+      min( member_size, volume_size - partial_volume_size ) : member_size;
+    show_cprogress( cfile_size, in_size, &encoder.eb->mb, pp );	/* init */
+    if( ( zero && !FLZe_encode_member( encoder.fe, size ) ) ||
+        ( !zero && !LZe_encode_member( encoder.e, size ) ) )
+      { Pp_show_msg( pp, "Encoder error." ); retval = 1; break; }
+    in_size += Mb_data_position( &encoder.eb->mb );
+    out_size += Re_member_position( &encoder.eb->renc );
+    if( Mb_data_finished( &encoder.eb->mb ) ) break;
+    if( volume_size > 0 )
+      {
+      partial_volume_size += Re_member_position( &encoder.eb->renc );
+      if( partial_volume_size >= volume_size - min_dictionary_size )
+        {
+        partial_volume_size = 0;
+        if( delete_output_on_interrupt )
+          {
+          close_and_set_permissions( in_statsp );
+          if( !next_filename() )
+            { Pp_show_msg( pp, "Too many volume files." ); retval = 1; break; }
+          if( !open_outstream( true, in_statsp ) ) { retval = 1; break; }
+          }
+        }
+      }
+    if( zero ) FLZe_reset( encoder.fe ); else LZe_reset( encoder.e );
+    }
+
+  if( retval == 0 && verbosity >= 1 )
+    {
+    if( in_size == 0 || out_size == 0 )
+      fputs( " no data compressed.\n", stderr );
+    else
+      fprintf( stderr, "%6.3f:1, %5.2f%% ratio, %5.2f%% saved, "
+                       "%llu in, %llu out.\n",
+               (double)in_size / out_size,
+               ( 100.0 * out_size ) / in_size,
+               100.0 - ( ( 100.0 * out_size ) / in_size ),
+               in_size, out_size );
+    }
+  LZeb_free( encoder.eb );
+  if( zero ) free( encoder.fe ); else free( encoder.e );
+  return retval;
+  }
+
+
+static unsigned char xdigit( const unsigned value )	/* hex digit for 'value' */
+  {
+  if( value <= 9 ) return '0' + value;
+  if( value <= 15 ) return 'A' + value - 10;
+  return 0;
+  }
+
+
+static bool show_trailing_data( const uint8_t * const data, const int size,
+                                struct Pretty_print * const pp, const bool all,
+                                const int ignore_trailing )	/* -1 = show */
+  {
+  if( verbosity >= 4 || ignore_trailing <= 0 )
+    {
+    int i;
+    char buf[80];
+    unsigned len = max( 0, snprintf( buf, sizeof buf, "%strailing data = ",
+                                     all ? "" : "first bytes of " ) );
+    for( i = 0; i < size && len + 2 < sizeof buf; ++i )
+      {
+      buf[len++] = xdigit( data[i] >> 4 );
+      buf[len++] = xdigit( data[i] & 0x0F );
+      buf[len++] = ' ';
+      }
+    if( len < sizeof buf ) buf[len++] = '\'';
+    for( i = 0; i < size && len < sizeof buf; ++i )
+      { if( isprint( data[i] ) ) buf[len++] = data[i]; else buf[len++] = '.'; }
+    if( len < sizeof buf ) buf[len++] = '\'';
+    if( len < sizeof buf ) buf[len] = 0; else buf[sizeof buf - 1] = 0;
+    Pp_show_msg( pp, buf );
+    if( ignore_trailing == 0 ) show_file_error( pp->name, trailing_msg, 0 );
+    }
+  return ignore_trailing > 0;
+  }
+
+
+static int decompress( const unsigned long long cfile_size, const int infd,
+                       const struct Cl_options * const cl_opts,
+                       struct Pretty_print * const pp, const bool testing )
+  {
+  unsigned long long partial_file_pos = 0;
+  struct Range_decoder rdec;
+  int retval = 0;
+  bool first_member;
+  if( !Rd_init( &rdec, infd ) )
+    { show_error( mem_msg, 0, false ); cleanup_and_fail( 1 ); }
+
+  for( first_member = true; ; first_member = false )
+    {
+    Lzip_header header;
+    Rd_reset_member_position( &rdec );
+    const int size = Rd_read_data( &rdec, header, Lh_size );
+    if( Rd_finished( &rdec ) )			/* End Of File */
+      {
+      if( first_member )
+        { show_file_error( pp->name, "File ends unexpectedly at member header.", 0 );
+          retval = 2; }
+      else if( Lh_check_prefix( header, size ) )
+        { Pp_show_msg( pp, "Truncated header in multimember file." );
+          show_trailing_data( header, size, pp, true, -1 ); retval = 2; }
+      else if( size > 0 && !show_trailing_data( header, size, pp, true,
+                                 cl_opts->ignore_trailing ) ) retval = 2;
+      break;
+      }
+    if( !Lh_check_magic( header ) )
+      {
+      if( first_member )
+        { show_file_error( pp->name, bad_magic_msg, 0 ); retval = 2; }
+      else if( !cl_opts->loose_trailing && Lh_check_corrupt( header ) )
+        { Pp_show_msg( pp, corrupt_mm_msg );
+          show_trailing_data( header, size, pp, false, -1 ); retval = 2; }
+      else if( !show_trailing_data( header, size, pp, false,
+                                    cl_opts->ignore_trailing ) ) retval = 2;
+      break;
+      }
+    if( !Lh_check_version( header ) )
+      { Pp_show_msg( pp, bad_version( Lh_version( header ) ) );
+        retval = 2; break; }
+    const unsigned dictionary_size = Lh_get_dictionary_size( header );
+    if( !isvalid_ds( dictionary_size ) )
+      { Pp_show_msg( pp, bad_dict_msg ); retval = 2; break; }
+
+    if( verbosity >= 2 || ( verbosity == 1 && first_member ) )
+      Pp_show_msg( pp, 0 );
+
+    struct LZ_decoder decoder;
+    if( !LZd_init( &decoder, &rdec, dictionary_size, outfd ) )
+      { Pp_show_msg( pp, mem_msg ); retval = 1; break; }
+    show_dprogress( cfile_size, partial_file_pos, &rdec, pp );	/* init */
+    const int result = LZd_decode_member( &decoder, cl_opts, pp );
+    partial_file_pos += Rd_member_position( &rdec );
+    LZd_free( &decoder );
+    if( result != 0 )
+      {
+      if( verbosity >= 0 && result <= 2 )
+        {
+        Pp_show_msg( pp, 0 );
+        fprintf( stderr, "%s at pos %llu\n", ( result == 2 ) ?
+                 "File ends unexpectedly" : "Decoder error",
+                 partial_file_pos );
+        }
+      else if( result == 5 ) Pp_show_msg( pp, empty_msg );
+      else if( result == 6 ) Pp_show_msg( pp, marking_msg );
+      retval = 2; break;
+      }
+    if( verbosity >= 2 )
+      { fputs( testing ? "ok\n" : "done\n", stderr ); Pp_reset( pp ); }
+    }
+  Rd_free( &rdec );
+  if( verbosity == 1 && retval == 0 )
+    fputs( testing ? "ok\n" : "done\n", stderr );
+  return retval;
+  }
+
+
+void show_error( const char * const msg, const int errcode, const bool help )
+  {
+  if( verbosity < 0 ) return;
+  if( msg && msg[0] )
+    fprintf( stderr, "%s: %s%s%s\n", program_name, msg,
+             ( errcode > 0 ) ? ": " : "",
+             ( errcode > 0 ) ? strerror( errcode ) : "" );
+  if( help )
+    fprintf( stderr, "Try '%s --help' for more information.\n",
+             invocation_name );
+  }
+
+
+void show_file_error( const char * const filename, const char * const msg,
+                      const int errcode )
+  {
+  if( verbosity >= 0 )
+    fprintf( stderr, "%s: %s: %s%s%s\n", program_name, filename, msg,
+             ( errcode > 0 ) ? ": " : "",
+             ( errcode > 0 ) ? strerror( errcode ) : "" );
+  }
+
+
+void internal_error( const char * const msg )
+  {
+  if( verbosity >= 0 )
+    fprintf( stderr, "%s: internal error: %s\n", program_name, msg );
+  exit( 3 );
+  }
+
+
+void show_cprogress( const unsigned long long cfile_size,
+                     const unsigned long long partial_size,
+                     const struct Matchfinder_base * const m,
+                     struct Pretty_print * const p )
+  {
+  static unsigned long long csize = 0;		/* file_size / 100 */
+  static unsigned long long psize = 0;
+  static const struct Matchfinder_base * mb = 0;
+  static struct Pretty_print * pp = 0;
+  static bool enabled = true;
+
+  if( !enabled ) return;
+  if( p )					/* initialize static vars */
+    {
+    if( verbosity < 2 || !isatty( STDERR_FILENO ) ) { enabled = false; return; }
+    csize = cfile_size; psize = partial_size; mb = m; pp = p;
+    }
+  if( mb && pp )
+    {
+    const unsigned long long pos = psize + Mb_data_position( mb );
+    if( csize > 0 )
+      fprintf( stderr, "%4llu%%  %.1f MB\r", pos / csize, pos / 1000000.0 );
+    else
+      fprintf( stderr, "  %.1f MB\r", pos / 1000000.0 );
+    Pp_reset( pp ); Pp_show_msg( pp, 0 );	/* restore cursor position */
+    }
+  }
+
+
+void show_dprogress( const unsigned long long cfile_size,
+                     const unsigned long long partial_size,
+                     const struct Range_decoder * const d,
+                     struct Pretty_print * const p )
+  {
+  static unsigned long long csize = 0;		/* file_size / 100 */
+  static unsigned long long psize = 0;
+  static const struct Range_decoder * rdec = 0;
+  static struct Pretty_print * pp = 0;
+  static int counter = 0;
+  static bool enabled = true;
+
+  if( !enabled ) return;
+  if( p )					/* initialize static vars */
+    {
+    if( verbosity < 2 || !isatty( STDERR_FILENO ) ) { enabled = false; return; }
+    csize = cfile_size; psize = partial_size; rdec = d; pp = p; counter = 0;
+    }
+  if( rdec && pp && --counter <= 0 )
+    {
+    const unsigned long long pos = psize + Rd_member_position( rdec );
+    counter = 7;		/* update display every 114688 bytes */
+    if( csize > 0 )
+      fprintf( stderr, "%4llu%%  %.1f MB\r", pos / csize, pos / 1000000.0 );
+    else
+      fprintf( stderr, "  %.1f MB\r", pos / 1000000.0 );
+    Pp_reset( pp ); Pp_show_msg( pp, 0 );	/* restore cursor position */
+    }
+  }
+
+
+int main( const int argc, const char * const argv[] )
+  {
+  /* Mapping from gzip/bzip2 style 0..9 compression levels to the
+     corresponding LZMA compression parameters. */
+  const struct Lzma_options option_mapping[] =
+    {
+    { 1 << 16,  16 },		/* -0 */
+    { 1 << 20,   5 },		/* -1 */
+    { 3 << 19,   6 },		/* -2 */
+    { 1 << 21,   8 },		/* -3 */
+    { 3 << 20,  12 },		/* -4 */
+    { 1 << 22,  20 },		/* -5 */
+    { 1 << 23,  36 },		/* -6 */
+    { 1 << 24,  68 },		/* -7 */
+    { 3 << 23, 132 },		/* -8 */
+    { 1 << 25, 273 } };		/* -9 */
+  struct Lzma_options encoder_options = option_mapping[6];  /* default = "-6" */
+  const unsigned long long max_member_size = 0x0008000000000000ULL; /* 2 PiB */
+  const unsigned long long max_volume_size = 0x4000000000000000ULL; /* 4 EiB */
+  unsigned long long member_size = max_member_size;
+  unsigned long long volume_size = 0;
+  const char * default_output_filename = "";
+  enum Mode program_mode = m_compress;
+  int i;
+  struct Cl_options cl_opts;		/* command-line options */
+  Cl_options_init( &cl_opts );
+  bool force = false;
+  bool keep_input_files = false;
+  bool recompress = false;
+  bool to_stdout = false;
+  bool zero = false;
+  if( argc > 0 ) invocation_name = argv[0];
+
+  enum { opt_eer = 256, opt_lt, opt_mer };
+  const struct ap_Option options[] =
+    {
+    { '0', "fast",               ap_no  },
+    { '1', 0,                    ap_no  },
+    { '2', 0,                    ap_no  },
+    { '3', 0,                    ap_no  },
+    { '4', 0,                    ap_no  },
+    { '5', 0,                    ap_no  },
+    { '6', 0,                    ap_no  },
+    { '7', 0,                    ap_no  },
+    { '8', 0,                    ap_no  },
+    { '9', "best",               ap_no  },
+    { 'a', "trailing-error",     ap_no  },
+    { 'b', "member-size",        ap_yes },
+    { 'c', "stdout",             ap_no  },
+    { 'd', "decompress",         ap_no  },
+    { 'f', "force",              ap_no  },
+    { 'F', "recompress",         ap_no  },
+    { 'h', "help",               ap_no  },
+    { 'k', "keep",               ap_no  },
+    { 'l', "list",               ap_no  },
+    { 'm', "match-length",       ap_yes },
+    { 'n', "threads",            ap_yes },
+    { 'o', "output",             ap_yes },
+    { 'q', "quiet",              ap_no  },
+    { 's', "dictionary-size",    ap_yes },
+    { 'S', "volume-size",        ap_yes },
+    { 't', "test",               ap_no  },
+    { 'v', "verbose",            ap_no  },
+    { 'V', "version",            ap_no  },
+    { opt_eer, "empty-error",    ap_no  },
+    { opt_lt,  "loose-trailing", ap_no  },
+    { opt_mer, "marking-error",  ap_no  },
+    {  0, 0,                     ap_no  } };
+
+  CRC32_init();
+
+  /* static because valgrind complains and memory management in C sucks */
+  static struct Arg_parser parser;
+  if( !ap_init( &parser, argc, argv, options, 0 ) )
+    { show_error( mem_msg, 0, false ); return 1; }
+  if( ap_error( &parser ) )				/* bad option */
+    { show_error( ap_error( &parser ), 0, true ); return 1; }
+
+  int argind = 0;
+  for( ; argind < ap_arguments( &parser ); ++argind )
+    {
+    const int code = ap_code( &parser, argind );
+    if( !code ) break;					/* no more options */
+    const char * const pn = ap_parsed_name( &parser, argind );
+    const char * const arg = ap_argument( &parser, argind );
+    switch( code )
+      {
+      case '0': case '1': case '2': case '3': case '4':
+      case '5': case '6': case '7': case '8': case '9':
+                zero = ( code == '0' );
+                encoder_options = option_mapping[code-'0']; break;
+      case 'a': cl_opts.ignore_trailing = false; break;
+      case 'b': member_size = getnum( arg, pn, 100000, max_member_size ); break;
+      case 'c': to_stdout = true; break;
+      case 'd': set_mode( &program_mode, m_decompress ); break;
+      case 'f': force = true; break;
+      case 'F': recompress = true; break;
+      case 'h': show_help(); return 0;
+      case 'k': keep_input_files = true; break;
+      case 'l': set_mode( &program_mode, m_list ); break;
+      case 'm': encoder_options.match_len_limit =
+                  getnum( arg, pn, min_match_len_limit, max_match_len );
+                zero = false; break;
+      case 'n': break;				/* accepted and ignored */
+      case 'o': if( strcmp( arg, "-" ) == 0 ) to_stdout = true;
+                else { default_output_filename = arg; } break;
+      case 'q': verbosity = -1; break;
+      case 's': encoder_options.dictionary_size = get_dict_size( arg, pn );
+                zero = false; break;
+      case 'S': volume_size = getnum( arg, pn, 100000, max_volume_size ); break;
+      case 't': set_mode( &program_mode, m_test ); break;
+      case 'v': if( verbosity < 4 ) ++verbosity; break;
+      case 'V': show_version(); return 0;
+      case opt_eer: cl_opts.ignore_empty = false; break;
+      case opt_lt:  cl_opts.loose_trailing = true; break;
+      case opt_mer: cl_opts.ignore_marking = false; break;
+      default: internal_error( "uncaught option." );
+      }
+    } /* end process options */
+
+#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__
+  setmode( STDIN_FILENO, O_BINARY );
+  setmode( STDOUT_FILENO, O_BINARY );
+#endif
+
+  static const char ** filenames = 0;
+  int num_filenames = max( 1, ap_arguments( &parser ) - argind );
+  filenames = resize_buffer( filenames, num_filenames * sizeof filenames[0] );
+  filenames[0] = "-";
+
+  bool filenames_given = false;
+  for( i = 0; argind + i < ap_arguments( &parser ); ++i )
+    {
+    filenames[i] = ap_argument( &parser, argind + i );
+    if( strcmp( filenames[i], "-" ) != 0 ) filenames_given = true;
+    }
+
+  if( program_mode == m_list )
+    return list_files( filenames, num_filenames, &cl_opts );
+
+  if( program_mode == m_compress )
+    {
+    if( volume_size > 0 && !to_stdout && default_output_filename[0] &&
+        num_filenames > 1 )
+      { show_error( "Can only compress one file when using '-o' and '-S'.",
+                    0, true ); return 1; }
+    Dis_slots_init();
+    Prob_prices_init();
+    }
+  else volume_size = 0;
+  if( program_mode == m_test ) to_stdout = false;	/* apply overrides */
+  if( program_mode == m_test || to_stdout ) default_output_filename = "";
+
+  output_filename = resize_buffer( output_filename, 1 );
+  output_filename[0] = 0;
+  if( to_stdout && program_mode != m_test )	/* check tty only once */
+    { outfd = STDOUT_FILENO; if( !check_tty_out( program_mode ) ) return 1; }
+  else outfd = -1;
+
+  const bool to_file = !to_stdout && program_mode != m_test &&
+                       default_output_filename[0];
+  if( !to_stdout && program_mode != m_test && ( filenames_given || to_file ) )
+    set_signals( signal_handler );
+
+  static struct Pretty_print pp;
+  Pp_init( &pp, filenames, num_filenames );
+
+  int failed_tests = 0;
+  int retval = 0;
+  const bool one_to_one = !to_stdout && program_mode != m_test && !to_file;
+  bool stdin_used = false;
+  struct stat in_stats;
+  for( i = 0; i < num_filenames; ++i )
+    {
+    const char * input_filename = "";
+    int infd;
+
+    Pp_set_name( &pp, filenames[i] );
+    if( strcmp( filenames[i], "-" ) == 0 )
+      {
+      if( stdin_used ) continue; else stdin_used = true;
+      infd = STDIN_FILENO;
+      if( !check_tty_in( pp.name, infd, program_mode, &retval ) ) continue;
+      if( one_to_one ) { outfd = STDOUT_FILENO; output_filename[0] = 0; }
+      }
+    else
+      {
+      const int eindex = extension_index( input_filename = filenames[i] );
+      infd = open_instream2( input_filename, &in_stats, program_mode,
+                             eindex, one_to_one, recompress );
+      if( infd < 0 ) { set_retval( &retval, 1 ); continue; }
+      if( !check_tty_in( pp.name, infd, program_mode, &retval ) ) continue;
+      if( one_to_one )			/* open outfd after checking infd */
+        {
+        if( program_mode == m_compress )
+          set_c_outname( input_filename, true, true, volume_size > 0 );
+        else set_d_outname( input_filename, eindex );
+        if( !open_outstream( force, true ) )
+          { close( infd ); set_retval( &retval, 1 ); continue; }
+        }
+      }
+
+    if( one_to_one && !check_tty_out( program_mode ) )
+      { set_retval( &retval, 1 ); return retval; }	/* don't delete a tty */
+
+    if( to_file && outfd < 0 )		/* open outfd after checking infd */
+      {
+      if( program_mode == m_compress ) set_c_outname( default_output_filename,
+                                       filenames_given, false, volume_size > 0 );
+      else
+        { output_filename = resize_buffer( output_filename,
+                            strlen( default_output_filename ) + 1 );
+          strcpy( output_filename, default_output_filename ); }
+      if( !open_outstream( force, false ) || !check_tty_out( program_mode ) )
+        return 1;	/* check tty only once and don't try to delete a tty */
+      }
+
+    const struct stat * const in_statsp =
+      ( input_filename[0] && one_to_one ) ? &in_stats : 0;
+    const unsigned long long cfile_size =
+      ( input_filename[0] && S_ISREG( in_stats.st_mode ) ) ?
+        ( in_stats.st_size + 99 ) / 100 : 0;
+    int tmp;
+    if( program_mode == m_compress )
+      tmp = compress( cfile_size, member_size, volume_size, infd,
+                      &encoder_options, &pp, in_statsp, zero );
+    else
+      tmp = decompress( cfile_size, infd, &cl_opts, &pp, program_mode == m_test );
+    if( close( infd ) != 0 )
+      { show_file_error( pp.name, "Error closing input file", errno );
+        set_retval( &tmp, 1 ); }
+    set_retval( &retval, tmp );
+    if( tmp )
+      { if( program_mode != m_test ) cleanup_and_fail( retval );
+        else ++failed_tests; }
+
+    if( delete_output_on_interrupt && one_to_one )
+      close_and_set_permissions( in_statsp );
+    if( input_filename[0] && !keep_input_files && one_to_one &&
+        ( program_mode != m_compress || volume_size == 0 ) )
+      remove( input_filename );
+    }
+  if( delete_output_on_interrupt )					/* -o */
+    close_and_set_permissions( ( retval == 0 && !stdin_used &&
+      filenames_given && num_filenames == 1 ) ? &in_stats : 0 );
+  else if( outfd >= 0 && close( outfd ) != 0 )				/* -c */
+    {
+    show_error( "Error closing stdout", errno, false );
+    set_retval( &retval, 1 );
+    }
+  if( failed_tests > 0 && verbosity >= 1 && num_filenames > 1 )
+    fprintf( stderr, "%s: warning: %d %s failed the test.\n",
+             program_name, failed_tests,
+             ( failed_tests == 1 ) ? "file" : "files" );
+  free( output_filename );
+  free( filenames );
+  ap_free( &parser );
+  return retval;
+  }
diff --git a/testsuite/check.sh b/testsuite/check.sh
new file mode 100755
index 0000000..100deae
--- /dev/null
+++ b/testsuite/check.sh
@@ -0,0 +1,478 @@
+#! /bin/sh
+# check script for Clzip - LZMA lossless data compressor
+# Copyright (C) 2010-2024 Antonio Diaz Diaz.
+#
+# This script is free software: you have unlimited permission
+# to copy, distribute, and modify it.
+
+LC_ALL=C
+export LC_ALL
+objdir=`pwd`
+testdir=`cd "$1" ; pwd`
+LZIP="${objdir}"/clzip
+framework_failure() { echo "failure in testing framework" ; exit 1 ; }
+
+if [ ! -f "${LZIP}" ] || [ ! -x "${LZIP}" ] ; then
+	echo "${LZIP}: cannot execute"
+	exit 1
+fi
+
+[ -e "${LZIP}" ] 2> /dev/null ||
+	{
+	echo "$0: a POSIX shell is required to run the tests"
+	echo "Try bash -c \"$0 $1 $2\""
+	exit 1
+	}
+
+if [ -d tmp ] ; then rm -rf tmp ; fi
+mkdir tmp
+cd "${objdir}"/tmp || framework_failure
+
+cat "${testdir}"/test.txt > in || framework_failure
+in_lz="${testdir}"/test.txt.lz
+in_em="${testdir}"/test_em.txt.lz
+fox_lz="${testdir}"/fox.lz
+fox6_lz="${testdir}"/fox6.lz
+f6mk_lz="${testdir}"/fox6_mark.lz
+fail=0
+test_failed() { fail=1 ; printf " $1" ; [ -z "$2" ] || printf "($2)" ; }
+
+printf "testing clzip-%s..." "$2"
+
+"${LZIP}" -fkqm4 in
+[ $? = 1 ] || test_failed $LINENO
+[ ! -e in.lz ] || test_failed $LINENO
+"${LZIP}" -fkqm274 in
+[ $? = 1 ] || test_failed $LINENO
+[ ! -e in.lz ] || test_failed $LINENO
+for i in bad_size -1 0 4095 513MiB 1G 1T 1P 1E 1Z 1Y 10KB ; do
+	"${LZIP}" -fkqs $i in
+	[ $? = 1 ] || test_failed $LINENO $i
+	[ ! -e in.lz ] || test_failed $LINENO $i
+done
+"${LZIP}" -lq in
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -tq in
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -tq < in
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -cdq in
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -cdq < in
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -dq -o in < "${in_lz}"
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -dq -o in "${in_lz}"
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -dq -o out nx_file.lz
+[ $? = 1 ] || test_failed $LINENO
+[ ! -e out ] || test_failed $LINENO
+"${LZIP}" -q -o out.lz nx_file
+[ $? = 1 ] || test_failed $LINENO
+[ ! -e out.lz ] || test_failed $LINENO
+"${LZIP}" -qf -S100k -o out in in
+[ $? = 1 ] || test_failed $LINENO
+{ [ ! -e out ] && [ ! -e out.lz ] ; } || test_failed $LINENO
+# these are for code coverage
+"${LZIP}" -lt "${in_lz}" 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -cdl "${in_lz}" 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -cdt "${in_lz}" 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -t -- nx_file.lz 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -t "" < /dev/null 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --help > /dev/null || test_failed $LINENO
+"${LZIP}" -n1 -V > /dev/null || test_failed $LINENO
+"${LZIP}" -m 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -z 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --bad_option 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --t 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --test=2 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --output= 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --output 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+printf "LZIP\001-.............................." | "${LZIP}" -t 2> /dev/null
+printf "LZIP\002-.............................." | "${LZIP}" -t 2> /dev/null
+printf "LZIP\001+.............................." | "${LZIP}" -t 2> /dev/null
+
+printf "\ntesting decompression..."
+
+for i in "${in_lz}" "${in_em}" ; do
+	"${LZIP}" -lq "$i" || test_failed $LINENO "$i"
+	"${LZIP}" -t "$i" || test_failed $LINENO "$i"
+	"${LZIP}" -d "$i" -o out || test_failed $LINENO "$i"
+	cmp in out || test_failed $LINENO "$i"
+	"${LZIP}" -cd "$i" > out || test_failed $LINENO "$i"
+	cmp in out || test_failed $LINENO "$i"
+	"${LZIP}" -d "$i" -o - > out || test_failed $LINENO "$i"
+	cmp in out || test_failed $LINENO "$i"
+	"${LZIP}" -d < "$i" > out || test_failed $LINENO "$i"
+	cmp in out || test_failed $LINENO "$i"
+	rm -f out || framework_failure
+done
+
+lines=`"${LZIP}" -tvv "${in_em}" 2>&1 | wc -l` || test_failed $LINENO
+[ "${lines}" -eq 8 ] || test_failed $LINENO "${lines}"
+"${LZIP}" -tq "${in_em}" --empty-error
+[ $? = 2 ] || test_failed $LINENO
+
+lines=`"${LZIP}" -lvv "${in_em}" | wc -l` || test_failed $LINENO
+[ "${lines}" -eq 11 ] || test_failed $LINENO "${lines}"
+"${LZIP}" -lq "${in_em}" --empty-error
+[ $? = 2 ] || test_failed $LINENO
+
+cat "${in_lz}" > out.lz || framework_failure
+"${LZIP}" -dk out.lz || test_failed $LINENO
+cmp in out || test_failed $LINENO
+rm -f out || framework_failure
+"${LZIP}" -cd "${fox_lz}" > fox || test_failed $LINENO
+cat fox > copy || framework_failure
+cat "${in_lz}" > copy.lz || framework_failure
+"${LZIP}" -d copy.lz out.lz 2> /dev/null	# skip copy, decompress out
+[ $? = 1 ] || test_failed $LINENO
+[ ! -e out.lz ] || test_failed $LINENO
+cmp fox copy || test_failed $LINENO
+cmp in out || test_failed $LINENO
+"${LZIP}" -df copy.lz || test_failed $LINENO
+[ ! -e copy.lz ] || test_failed $LINENO
+cmp in copy || test_failed $LINENO
+rm -f copy out || framework_failure
+
+cat "${in_lz}" > out.lz || framework_failure
+"${LZIP}" -d -S100k out.lz || test_failed $LINENO	# ignore -S
+[ ! -e out.lz ] || test_failed $LINENO
+cmp in out || test_failed $LINENO
+
+printf "to be overwritten" > out || framework_failure
+"${LZIP}" -df -o out < "${in_lz}" || test_failed $LINENO
+cmp in out || test_failed $LINENO
+rm -f out || framework_failure
+"${LZIP}" -d -o ./- "${in_lz}" || test_failed $LINENO
+cmp in ./- || test_failed $LINENO
+rm -f ./- || framework_failure
+"${LZIP}" -d -o ./- < "${in_lz}" || test_failed $LINENO
+cmp in ./- || test_failed $LINENO
+rm -f ./- || framework_failure
+
+cat "${in_lz}" > anyothername || framework_failure
+"${LZIP}" -dv - anyothername - < "${in_lz}" > out 2> /dev/null ||
+	test_failed $LINENO
+cmp in out || test_failed $LINENO
+cmp in anyothername.out || test_failed $LINENO
+rm -f out anyothername.out || framework_failure
+
+"${LZIP}" -lq in "${in_lz}"
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -lq nx_file.lz "${in_lz}"
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -tq in "${in_lz}"
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -tq nx_file.lz "${in_lz}"
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -cdq in "${in_lz}" > out
+[ $? = 2 ] || test_failed $LINENO
+cat out in | cmp in - || test_failed $LINENO		# out must be empty
+"${LZIP}" -cdq nx_file.lz "${in_lz}" > out	# skip nx_file, decompress in
+[ $? = 1 ] || test_failed $LINENO
+cmp in out || test_failed $LINENO
+rm -f out || framework_failure
+cat "${in_lz}" > out.lz || framework_failure
+for i in 1 2 3 4 5 6 7 ; do
+	printf "g" >> out.lz || framework_failure
+	"${LZIP}" -alvv out.lz "${in_lz}" > /dev/null 2>&1
+	[ $? = 2 ] || test_failed $LINENO $i
+	"${LZIP}" -atvvvv out.lz "${in_lz}" 2> /dev/null
+	[ $? = 2 ] || test_failed $LINENO $i
+done
+"${LZIP}" -dq in out.lz
+[ $? = 2 ] || test_failed $LINENO
+[ -e out.lz ] || test_failed $LINENO
+[ ! -e out ] || test_failed $LINENO
+[ ! -e in.out ] || test_failed $LINENO
+"${LZIP}" -dq nx_file.lz out.lz
+[ $? = 1 ] || test_failed $LINENO
+[ ! -e out.lz ] || test_failed $LINENO
+[ ! -e nx_file ] || test_failed $LINENO
+cmp in out || test_failed $LINENO
+rm -f out || framework_failure
+
+cat in in > in2 || framework_failure
+"${LZIP}" -lq "${in_lz}" "${in_lz}" || test_failed $LINENO
+"${LZIP}" -t "${in_lz}" "${in_lz}" || test_failed $LINENO
+"${LZIP}" -cd "${in_lz}" "${in_lz}" -o out > out2 || test_failed $LINENO
+[ ! -e out ] || test_failed $LINENO			# override -o
+cmp in2 out2 || test_failed $LINENO
+rm -f out2 || framework_failure
+"${LZIP}" -d "${in_lz}" "${in_lz}" -o out2 || test_failed $LINENO
+cmp in2 out2 || test_failed $LINENO
+rm -f out2 || framework_failure
+
+cat "${in_lz}" "${in_lz}" > out2.lz || framework_failure
+printf "\ngarbage" >> out2.lz || framework_failure
+"${LZIP}" -tvvvv out2.lz 2> /dev/null || test_failed $LINENO
+"${LZIP}" -alq out2.lz
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -atq out2.lz
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -atq < out2.lz
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -adkq out2.lz
+[ $? = 2 ] || test_failed $LINENO
+[ ! -e out2 ] || test_failed $LINENO
+"${LZIP}" -adkq -o out2 < out2.lz
+[ $? = 2 ] || test_failed $LINENO
+[ ! -e out2 ] || test_failed $LINENO
+printf "to be overwritten" > out2 || framework_failure
+"${LZIP}" -df out2.lz || test_failed $LINENO
+cmp in2 out2 || test_failed $LINENO
+rm -f out2 || framework_failure
+
+"${LZIP}" -cd "${fox6_lz}" > out || test_failed $LINENO
+"${LZIP}" -cd "${f6mk_lz}" > copy || test_failed $LINENO
+cmp copy out || test_failed $LINENO
+rm -f copy out || framework_failure
+"${LZIP}" -lq "${f6mk_lz}" --marking-error
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -tq "${f6mk_lz}" --marking-error
+[ $? = 2 ] || test_failed $LINENO
+
+"${LZIP}" -d "${fox_lz}" -o a/b/c/fox || test_failed $LINENO
+cmp fox a/b/c/fox || test_failed $LINENO
+rm -rf a || framework_failure
+"${LZIP}" -d -o a/b/c/fox < "${fox_lz}" || test_failed $LINENO
+cmp fox a/b/c/fox || test_failed $LINENO
+rm -rf a || framework_failure
+"${LZIP}" -dq "${fox_lz}" -o a/b/c/
+[ $? = 1 ] || test_failed $LINENO
+[ ! -e a ] || test_failed $LINENO
+
+printf "\ntesting   compression..."
+
+"${LZIP}" -c -0 in in in -S100k -o out3.lz > copy2.lz || test_failed $LINENO
+[ ! -e out3.lz ] || test_failed $LINENO			# override -o and -S
+"${LZIP}" -0f in in --output=copy2.lz || test_failed $LINENO
+"${LZIP}" -d copy2.lz -o out2 || test_failed $LINENO
+[ -e copy2.lz ] || test_failed $LINENO
+cmp in2 out2 || test_failed $LINENO
+rm -f in2 out2 copy2.lz || framework_failure
+
+"${LZIP}" -cf "${in_lz}" > lzlz 2> /dev/null	# /dev/null is a tty on OS/2
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -Fvvm36 -o - "${in_lz}" > lzlz 2> /dev/null || test_failed $LINENO
+"${LZIP}" -cd lzlz | "${LZIP}" -d > out || test_failed $LINENO
+cmp in out || test_failed $LINENO
+rm -f lzlz out || framework_failure
+
+"${LZIP}" -0 -o ./- in || test_failed $LINENO
+"${LZIP}" -cd ./- | cmp in - || test_failed $LINENO
+rm -f ./- || framework_failure
+"${LZIP}" -0 -o ./- < in || test_failed $LINENO			# add .lz
+[ ! -e ./- ] || test_failed $LINENO
+"${LZIP}" -cd -- -.lz | cmp in - || test_failed $LINENO
+rm -f ./-.lz || framework_failure
+
+for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
+	"${LZIP}" -k -$i in || test_failed $LINENO $i
+	mv in.lz out.lz || test_failed $LINENO $i
+	printf "garbage" >> out.lz || framework_failure
+	"${LZIP}" -df out.lz || test_failed $LINENO $i
+	cmp in out || test_failed $LINENO $i
+
+	"${LZIP}" -$i in -c > out || test_failed $LINENO $i
+	"${LZIP}" -$i in -o o_out || test_failed $LINENO $i	# don't add .lz
+	[ ! -e o_out.lz ] || test_failed $LINENO
+	cmp out o_out || test_failed $LINENO $i
+	rm -f o_out || framework_failure
+	printf "g" >> out || framework_failure
+	"${LZIP}" -cd out > copy || test_failed $LINENO $i
+	cmp in copy || test_failed $LINENO $i
+
+	"${LZIP}" -$i < in > out || test_failed $LINENO $i
+	"${LZIP}" -d < out > copy || test_failed $LINENO $i
+	cmp in copy || test_failed $LINENO $i
+
+	rm -f out || framework_failure
+	printf "to be overwritten" > out.lz || framework_failure
+	"${LZIP}" -f -$i -o out < in || test_failed $LINENO $i	# add .lz
+	[ ! -e out ] || test_failed $LINENO
+	"${LZIP}" -df -o copy < out.lz || test_failed $LINENO $i
+	cmp in copy || test_failed $LINENO $i
+done
+rm -f copy out.lz || framework_failure
+
+cat in in in in in in in in > in8 || framework_failure
+"${LZIP}" -1s12 -S100k in8 || test_failed $LINENO
+"${LZIP}" -t in800001.lz in800002.lz || test_failed $LINENO
+"${LZIP}" -cd in800001.lz in800002.lz | cmp in8 - || test_failed $LINENO
+[ ! -e in800003.lz ] || test_failed $LINENO
+rm -f in800001.lz in800002.lz || framework_failure
+"${LZIP}" -1s12 -S100k -o out.lz in8 || test_failed $LINENO
+# ignore -S
+"${LZIP}" -d out.lz00001.lz out.lz00002.lz -S100k -o out || test_failed $LINENO
+cmp in8 out || test_failed $LINENO
+"${LZIP}" -t out.lz00001.lz out.lz00002.lz || test_failed $LINENO
+[ ! -e out.lz00003.lz ] || test_failed $LINENO
+rm -f out out.lz00001.lz out.lz00002.lz || framework_failure
+"${LZIP}" -1ks4Ki -b100000 in8 || test_failed $LINENO
+"${LZIP}" -t in8.lz || test_failed $LINENO
+"${LZIP}" -cd in8.lz -o out | cmp in8 - || test_failed $LINENO	# override -o
+[ ! -e out ] || test_failed $LINENO
+rm -f in8 || framework_failure
+"${LZIP}" -0 -S100k -o out < in8.lz || test_failed $LINENO
+"${LZIP}" -t out00001.lz out00002.lz || test_failed $LINENO
+"${LZIP}" -cd out00001.lz out00002.lz | cmp in8.lz - || test_failed $LINENO
+[ ! -e out00003.lz ] || test_failed $LINENO
+rm -f out00001.lz || framework_failure
+"${LZIP}" -1 -S100k -o a/b/c/out < in8.lz || test_failed $LINENO
+"${LZIP}" -t a/b/c/out00001.lz a/b/c/out00002.lz || test_failed $LINENO
+"${LZIP}" -cd a/b/c/out00001.lz a/b/c/out00002.lz | cmp in8.lz - ||
+	test_failed $LINENO
+[ ! -e a/b/c/out00003.lz ] || test_failed $LINENO
+rm -rf a || framework_failure
+"${LZIP}" -0 -F -S100k in8.lz || test_failed $LINENO
+"${LZIP}" -t in8.lz00001.lz in8.lz00002.lz || test_failed $LINENO
+"${LZIP}" -cd in8.lz00001.lz in8.lz00002.lz | cmp in8.lz - || test_failed $LINENO
+[ ! -e in8.lz00003.lz ] || test_failed $LINENO
+rm -f in8.lz00001.lz in8.lz00002.lz || framework_failure
+"${LZIP}" -0kF -b100k in8.lz || test_failed $LINENO
+"${LZIP}" -t in8.lz.lz || test_failed $LINENO
+"${LZIP}" -cd in8.lz.lz | cmp in8.lz - || test_failed $LINENO
+rm -f in8.lz in8.lz.lz || framework_failure
+
+"${LZIP}" fox -o a/b/c/fox.lz || test_failed $LINENO
+cmp "${fox_lz}" a/b/c/fox.lz || test_failed $LINENO
+rm -rf a || framework_failure
+"${LZIP}" -o a/b/c/fox.lz < fox || test_failed $LINENO
+cmp "${fox_lz}" a/b/c/fox.lz || test_failed $LINENO
+rm -rf a || framework_failure
+
+printf "\ntesting bad input..."
+
+headers='LZIp LZiP LZip LzIP LzIp LziP lZIP lZIp lZiP lzIP'
+body='\001\014\000\203\377\373\377\377\300\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000$\000\000\000\000\000\000\000'
+cat "${in_lz}" > int.lz || framework_failure
+printf "LZIP${body}" >> int.lz || framework_failure
+if "${LZIP}" -tq int.lz ; then
+	for header in ${headers} ; do
+		printf "${header}${body}" > int.lz || framework_failure
+		"${LZIP}" -lq int.lz			# first member
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		"${LZIP}" -tq int.lz
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		"${LZIP}" -tq < int.lz
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		"${LZIP}" -cdq int.lz > /dev/null
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		"${LZIP}" -lq --loose-trailing int.lz
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		"${LZIP}" -tq --loose-trailing int.lz
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		"${LZIP}" -tq --loose-trailing < int.lz
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		"${LZIP}" -cdq --loose-trailing int.lz > /dev/null
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		cat "${in_lz}" > int.lz || framework_failure
+		printf "${header}${body}" >> int.lz || framework_failure
+		"${LZIP}" -lq int.lz			# trailing data
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		"${LZIP}" -tq int.lz
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		"${LZIP}" -tq < int.lz
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		"${LZIP}" -cdq int.lz > /dev/null
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		"${LZIP}" -lq --loose-trailing int.lz ||
+			test_failed $LINENO ${header}
+		"${LZIP}" -t --loose-trailing int.lz ||
+			test_failed $LINENO ${header}
+		"${LZIP}" -t --loose-trailing < int.lz ||
+			test_failed $LINENO ${header}
+		"${LZIP}" -cd --loose-trailing int.lz > /dev/null ||
+			test_failed $LINENO ${header}
+		"${LZIP}" -lq --loose-trailing --trailing-error int.lz
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		"${LZIP}" -tq --loose-trailing --trailing-error int.lz
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		"${LZIP}" -tq --loose-trailing --trailing-error < int.lz
+		[ $? = 2 ] || test_failed $LINENO ${header}
+		"${LZIP}" -cdq --loose-trailing --trailing-error int.lz > /dev/null
+		[ $? = 2 ] || test_failed $LINENO ${header}
+	done
+else
+	printf "\nwarning: skipping header test: 'printf' does not work on your system."
+fi
+rm -f int.lz || framework_failure
+
+for i in fox_v2.lz fox_s11.lz fox_de20.lz \
+         fox_bcrc.lz fox_crc0.lz fox_das46.lz fox_mes81.lz ; do
+	"${LZIP}" -tq "${testdir}"/$i
+	[ $? = 2 ] || test_failed $LINENO $i
+done
+
+for i in fox_bcrc.lz fox_crc0.lz fox_das46.lz fox_mes81.lz ; do
+	"${LZIP}" -cdq "${testdir}"/$i > out
+	[ $? = 2 ] || test_failed $LINENO $i
+	cmp fox out || test_failed $LINENO $i
+done
+rm -f fox out || framework_failure
+
+cat "${in_lz}" "${in_lz}" > in2.lz || framework_failure
+cat "${in_lz}" "${in_lz}" "${in_lz}" > in3.lz || framework_failure
+if dd if=in3.lz of=trunc.lz bs=14752 count=1 2> /dev/null &&
+   [ -e trunc.lz ] && cmp in2.lz trunc.lz > /dev/null 2>&1 ; then
+	for i in 6 20 14734 14753 14754 14755 14756 14757 14758 ; do
+		dd if=in3.lz of=trunc.lz bs=$i count=1 2> /dev/null
+		"${LZIP}" -lq trunc.lz
+		[ $? = 2 ] || test_failed $LINENO $i
+		"${LZIP}" -tq trunc.lz
+		[ $? = 2 ] || test_failed $LINENO $i
+		"${LZIP}" -tq < trunc.lz
+		[ $? = 2 ] || test_failed $LINENO $i
+		"${LZIP}" -cdq trunc.lz > /dev/null
+		[ $? = 2 ] || test_failed $LINENO $i
+		"${LZIP}" -dq < trunc.lz > /dev/null
+		[ $? = 2 ] || test_failed $LINENO $i
+	done
+else
+	printf "\nwarning: skipping truncation test: 'dd' does not work on your system."
+fi
+rm -f in2.lz in3.lz trunc.lz || framework_failure
+
+cat "${in_lz}" > ingin.lz || framework_failure
+printf "g" >> ingin.lz || framework_failure
+cat "${in_lz}" >> ingin.lz || framework_failure
+"${LZIP}" -lq ingin.lz
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -atq ingin.lz
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -atq < ingin.lz
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -acdq ingin.lz > /dev/null
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -adq < ingin.lz > /dev/null
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -t ingin.lz || test_failed $LINENO
+"${LZIP}" -t < ingin.lz || test_failed $LINENO
+"${LZIP}" -cd ingin.lz > out || test_failed $LINENO
+cmp in out || test_failed $LINENO
+"${LZIP}" -d < ingin.lz > out || test_failed $LINENO
+cmp in out || test_failed $LINENO
+rm -f out ingin.lz || framework_failure
+
+echo
+if [ ${fail} = 0 ] ; then
+	echo "tests completed successfully."
+	cd "${objdir}" && rm -r tmp
+else
+	echo "tests failed."
+fi
+exit ${fail}
diff --git a/testsuite/fox.lz b/testsuite/fox.lz
new file mode 100644
index 0000000..509da82
--- /dev/null
+++ b/testsuite/fox.lz
diff --git a/testsuite/fox6.lz b/testsuite/fox6.lz
new file mode 100644
index 0000000..8401b99
--- /dev/null
+++ b/testsuite/fox6.lz
diff --git a/testsuite/fox6_mark.lz b/testsuite/fox6_mark.lz
new file mode 100644
index 0000000..32b2ac0
--- /dev/null
+++ b/testsuite/fox6_mark.lz
diff --git a/testsuite/fox_bcrc.lz b/testsuite/fox_bcrc.lz
new file mode 100644
index 0000000..8f6a7c4
--- /dev/null
+++ b/testsuite/fox_bcrc.lz
diff --git a/testsuite/fox_crc0.lz b/testsuite/fox_crc0.lz
new file mode 100644
index 0000000..1abe926
--- /dev/null
+++ b/testsuite/fox_crc0.lz
diff --git a/testsuite/fox_das46.lz b/testsuite/fox_das46.lz
new file mode 100644
index 0000000..43ed9f9
--- /dev/null
+++ b/testsuite/fox_das46.lz
diff --git a/testsuite/fox_de20.lz b/testsuite/fox_de20.lz
new file mode 100644
index 0000000..10949d8
--- /dev/null
+++ b/testsuite/fox_de20.lz
diff --git a/testsuite/fox_mes81.lz b/testsuite/fox_mes81.lz
new file mode 100644
index 0000000..d50ef2e
--- /dev/null
+++ b/testsuite/fox_mes81.lz
diff --git a/testsuite/fox_s11.lz b/testsuite/fox_s11.lz
new file mode 100644
index 0000000..dca909c
--- /dev/null
+++ b/testsuite/fox_s11.lz
diff --git a/testsuite/fox_v2.lz b/testsuite/fox_v2.lz
new file mode 100644
index 0000000..8620981
--- /dev/null
+++ b/testsuite/fox_v2.lz
diff --git a/testsuite/test.txt b/testsuite/test.txt
new file mode 100644
index 0000000..9196a3a
--- /dev/null
+++ b/testsuite/test.txt
@@ -0,0 +1,676 @@
+                    GNU GENERAL PUBLIC LICENSE
+                       Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+                            Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+                    GNU GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term "modification".)  Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+  2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+    a) You must cause the modified files to carry prominent notices
+    stating that you changed the files and the date of any change.
+
+    b) You must cause any work that you distribute or publish, that in
+    whole or in part contains or is derived from the Program or any
+    part thereof, to be licensed as a whole at no charge to all third
+    parties under the terms of this License.
+
+    c) If the modified program normally reads commands interactively
+    when run, you must cause it, when started running for such
+    interactive use in the most ordinary way, to print or display an
+    announcement including an appropriate copyright notice and a
+    notice that there is no warranty (or else, saying that you provide
+    a warranty) and that users may redistribute the program under
+    these conditions, and telling the user how to view a copy of this
+    License.  (Exception: if the Program itself is interactive but
+    does not normally print such an announcement, your work based on
+    the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+  3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+    a) Accompany it with the complete corresponding machine-readable
+    source code, which must be distributed under the terms of Sections
+    1 and 2 above on a medium customarily used for software interchange; or,
+
+    b) Accompany it with a written offer, valid for at least three
+    years, to give any third party, for a charge no more than your
+    cost of physically performing source distribution, a complete
+    machine-readable copy of the corresponding source code, to be
+    distributed under the terms of Sections 1 and 2 above on a medium
+    customarily used for software interchange; or,
+
+    c) Accompany it with the information you received as to the offer
+    to distribute corresponding source code.  (This alternative is
+    allowed only for noncommercial distribution and only if you
+    received the program in object code or executable form with such
+    an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+  4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+  5. You are not required to accept this License, since you have not
+signed it.  However, nothing else grants you permission to modify or
+distribute the Program or its derivative works.  These actions are
+prohibited by law if you do not accept this License.  Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+  6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+  7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices.  Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+  8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+  9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number.  If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation.  If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+  10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission.  For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this.  Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+                            NO WARRANTY
+
+  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+                     END OF TERMS AND CONDITIONS
+
+            How to Apply These Terms to Your New Programs
+
+  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+  To do so, attach the following notices to the program.  It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+    <one line to give the program's name and a brief idea of what it does.>
+    Copyright (C) <year>  <name of author>
+
+    This program is free software: you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation, either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+    Gnomovision version 69, Copyright (C) <year>  <name of author>
+    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+    This is free software, and you are welcome to redistribute it
+    under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License.  Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary.  Here is a sample; alter the names:
+
+  Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+  `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+  <signature of Ty Coon>, 1 April 1989
+  Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs.  If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library.  If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
+                    GNU GENERAL PUBLIC LICENSE
+                       Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+                            Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+                    GNU GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term "modification".)  Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+  2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+    a) You must cause the modified files to carry prominent notices
+    stating that you changed the files and the date of any change.
+
+    b) You must cause any work that you distribute or publish, that in
+    whole or in part contains or is derived from the Program or any
+    part thereof, to be licensed as a whole at no charge to all third
+    parties under the terms of this License.
+
+    c) If the modified program normally reads commands interactively
+    when run, you must cause it, when started running for such
+    interactive use in the most ordinary way, to print or display an
+    announcement including an appropriate copyright notice and a
+    notice that there is no warranty (or else, saying that you provide
+    a warranty) and that users may redistribute the program under
+    these conditions, and telling the user how to view a copy of this
+    License.  (Exception: if the Program itself is interactive but
+    does not normally print such an announcement, your work based on
+    the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+  3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+    a) Accompany it with the complete corresponding machine-readable
+    source code, which must be distributed under the terms of Sections
+    1 and 2 above on a medium customarily used for software interchange; or,
+
+    b) Accompany it with a written offer, valid for at least three
+    years, to give any third party, for a charge no more than your
+    cost of physically performing source distribution, a complete
+    machine-readable copy of the corresponding source code, to be
+    distributed under the terms of Sections 1 and 2 above on a medium
+    customarily used for software interchange; or,
+
+    c) Accompany it with the information you received as to the offer
+    to distribute corresponding source code.  (This alternative is
+    allowed only for noncommercial distribution and only if you
+    received the program in object code or executable form with such
+    an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+  4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+  5. You are not required to accept this License, since you have not
+signed it.  However, nothing else grants you permission to modify or
+distribute the Program or its derivative works.  These actions are
+prohibited by law if you do not accept this License.  Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+  6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+  7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices.  Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+  8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+  9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number.  If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation.  If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+  10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission.  For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this.  Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+                            NO WARRANTY
+
+  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+                     END OF TERMS AND CONDITIONS
+
+            How to Apply These Terms to Your New Programs
+
+  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+  To do so, attach the following notices to the program.  It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+    <one line to give the program's name and a brief idea of what it does.>
+    Copyright (C) <year>  <name of author>
+
+    This program is free software: you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation, either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+    Gnomovision version 69, Copyright (C) <year>  <name of author>
+    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+    This is free software, and you are welcome to redistribute it
+    under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License.  Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary.  Here is a sample; alter the names:
+
+  Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+  `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+  <signature of Ty Coon>, 1 April 1989
+  Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs.  If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library.  If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
diff --git a/testsuite/test.txt.lz b/testsuite/test.txt.lz
new file mode 100644
index 0000000..22cea6e
--- /dev/null
+++ b/testsuite/test.txt.lz
diff --git a/testsuite/test_em.txt.lz b/testsuite/test_em.txt.lz
new file mode 100644
index 0000000..7e96250
--- /dev/null
+++ b/testsuite/test_em.txt.lz