From 9cec982f0c43414daa40c7c2abcdba1b15c656ce Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Sat, 4 May 2024 16:24:33 +0200 Subject: Adding upstream version 1.13. Signed-off-by: Daniel Baumann --- AUTHORS | 7 + COPYING | 338 +++++++++++++++++ ChangeLog | 128 +++++++ INSTALL | 75 ++++ Makefile.in | 133 +++++++ NEWS | 6 + README | 99 +++++ carg_parser.c | 319 ++++++++++++++++ carg_parser.h | 97 +++++ configure | 193 ++++++++++ decoder.c | 317 ++++++++++++++++ decoder.h | 393 ++++++++++++++++++++ doc/lunzip.1 | 97 +++++ list.c | 111 ++++++ lzip.h | 278 ++++++++++++++ lzip_index.c | 283 ++++++++++++++ lzip_index.h | 91 +++++ main.c | 948 +++++++++++++++++++++++++++++++++++++++++++++++ testsuite/check.sh | 351 ++++++++++++++++++ testsuite/fox.lz | Bin 0 -> 80 bytes testsuite/fox_bcrc.lz | Bin 0 -> 80 bytes testsuite/fox_crc0.lz | Bin 0 -> 80 bytes testsuite/fox_das46.lz | Bin 0 -> 80 bytes testsuite/fox_de20.lz | Bin 0 -> 80 bytes testsuite/fox_mes81.lz | Bin 0 -> 80 bytes testsuite/fox_s11.lz | Bin 0 -> 80 bytes testsuite/fox_v2.lz | Bin 0 -> 80 bytes testsuite/test.txt | 676 +++++++++++++++++++++++++++++++++ testsuite/test.txt.lz | Bin 0 -> 7376 bytes testsuite/test_em.txt.lz | Bin 0 -> 14024 bytes 30 files changed, 4940 insertions(+) create mode 100644 AUTHORS create mode 100644 COPYING create mode 100644 ChangeLog create mode 100644 INSTALL create mode 100644 Makefile.in create mode 100644 NEWS create mode 100644 README create mode 100644 carg_parser.c create mode 100644 carg_parser.h create mode 100755 configure create mode 100644 decoder.c create mode 100644 decoder.h create mode 100644 doc/lunzip.1 create mode 100644 list.c create mode 100644 lzip.h create mode 100644 lzip_index.c create mode 100644 lzip_index.h create mode 100644 main.c create mode 100755 testsuite/check.sh create mode 100644 testsuite/fox.lz create mode 100644 testsuite/fox_bcrc.lz create mode 100644 testsuite/fox_crc0.lz create mode 100644 testsuite/fox_das46.lz create mode 100644 
testsuite/fox_de20.lz create mode 100644 testsuite/fox_mes81.lz create mode 100644 testsuite/fox_s11.lz create mode 100644 testsuite/fox_v2.lz create mode 100644 testsuite/test.txt create mode 100644 testsuite/test.txt.lz create mode 100644 testsuite/test_em.txt.lz diff --git a/AUTHORS b/AUTHORS new file mode 100644 index 0000000..e39119d --- /dev/null +++ b/AUTHORS @@ -0,0 +1,7 @@ +Lunzip was written by Antonio Diaz Diaz. + +The ideas embodied in lunzip are due to (at least) the following people: +Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for the +definition of Markov chains), G.N.N. Martin (for the definition of range +encoding), Igor Pavlov (for putting all the above together in LZMA), and +Julian Seward (for bzip2's CLI). diff --git a/COPYING b/COPYING new file mode 100644 index 0000000..4ad17ae --- /dev/null +++ b/COPYING @@ -0,0 +1,338 @@ + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Lesser General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. 
Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. 
+ + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. 
You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. (Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. 
+ +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. (This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. 
However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. 
+You are not responsible for enforcing compliance by third parties to +this License. + + 7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. 
If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + Copyright (C) + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. 
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may +be called something other than `show w' and `show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + , 1 April 1989 + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License. diff --git a/ChangeLog b/ChangeLog new file mode 100644 index 0000000..07d8d6e --- /dev/null +++ b/ChangeLog @@ -0,0 +1,128 @@ +2022-01-22 Antonio Diaz Diaz + + * Version 1.13 released. 
+ * Decompression time has been reduced by 5-12% depending on the file. + * main.c (getnum): Show option name and valid range if error. + +2021-01-01 Antonio Diaz Diaz + + * Version 1.12 released. + * main.c (main): Report an error if a file name is empty. + Make '-o' behave like '-c', but writing to file instead of stdout. + Do not open output if input is a terminal. + * Replace 'decompressed', 'compressed' with 'out', 'in' in output. + * lzip_index.c: Improve messages for corruption in last header. + * main.c: Set a valid invocation_name even if argc == 0. + * Document extraction from tar.lz in '--help' output and man page. + * testsuite: Add 9 new test files. + +2019-01-01 Antonio Diaz Diaz + + * Version 1.11 released. + * Rename File_* to Lzip_*. + * lzip.h (Lzip_trailer): New function 'Lt_verify_consistency'. + * lzip_index.c: Detect some kinds of corrupt trailers. + * main.c (main): Check return value of close( infd ). + * main.c: Compile on DOS with DJGPP. + * configure: Accept appending to CFLAGS; 'CFLAGS+=OPTIONS'. + * INSTALL: Document use of CFLAGS+='-D __USE_MINGW_ANSI_STDIO'. + +2018-02-05 Antonio Diaz Diaz + + * Version 1.10 released. + * main.c: New option '--loose-trailing'. + * Improve corrupt header detection to HD=3. + * main.c: Show corrupt or truncated header in multimember file. + * Replace 'bits/byte' with inverse compression ratio in output. + * Show progress of decompression at verbosity level 2 (-vv). + * Show progress of decompression only if stderr is a terminal. + * main.c: Show final diagnostic when testing multiple files. + * decoder.c (LZd_verify_trailer): Show stored sizes also in hex. + Show dictionary size at verbosity level 4 (-vvvv). + +2017-04-13 Antonio Diaz Diaz + + * Version 1.9 released. + * The option '-l, --list' has been ported from lziprecover. + * Don't allow mixing different operations (-d, -l or -t). + * Decompression time has been reduced by 7%. + * main.c: Continue testing if any input file is a terminal. 
+ * main.c: Show trailing data in both hexadecimal and ASCII. + * lzip_index.c: Improve detection of bad dict and trailing data. + * lzip.h: Unify messages for bad magic, trailing data, etc. + +2016-05-12 Antonio Diaz Diaz + + * Version 1.8 released. + * main.c: New option '-a, --trailing-error'. + * main.c (main): With '-u', verify that output file is regular. + * main.c (decompress): Print up to 6 bytes of trailing data + when '-vvvv' is specified. + * decoder.c (LZd_verify_trailer): Remove test of final code. + * main.c (main): Delete '--output' file if infd is a terminal. + * main.c (main): Don't use stdin more than once. + * Error messages synced with lzip-1.18. + * configure: Avoid warning on some shells when testing for gcc. + * check.sh: A POSIX shell is required to run the tests. + * check.sh: Don't check error messages. + +2015-05-27 Antonio Diaz Diaz + + * Version 1.7 released. + * Minor changes. + * Makefile.in: New targets 'install*-compress'. + +2014-07-01 Antonio Diaz Diaz + + * Version 1.6 released. + * Change license to GPL version 2 or later. + +2014-04-11 Antonio Diaz Diaz + + * Version 1.5 released. + * main.c: New option '-u, --buffer-size' (low memory mode). + * main.c (close_and_set_permissions): Behave like 'cp -p'. + +2013-09-17 Antonio Diaz Diaz + + * Version 1.4 released. + * main.c (show_header): Don't show header version. + * Minor fixes. + +2013-06-18 Antonio Diaz Diaz + + * Version 1.3 released. + * Decompression time has been reduced by 1%. + * main.c (show_header): Show header version if verbosity >= 4. + * Ignore option '-n, --threads' for compatibility with plzip. + * configure: Options now accept a separate argument. + +2013-02-18 Antonio Diaz Diaz + + * Version 1.2 released. + * Decompression time has been reduced by 12%. + * Makefile.in: New targets 'install-as-lzip' and 'install-bin'. + * main.c: Use 'setmode' instead of '_setmode' on Windows and OS/2. + +2012-02-26 Antonio Diaz Diaz + + * Version 1.1 released. 
+ * main.c (decompress): Print only one status line for each + multi-member file when only one '-v' is specified. + * main.c (close_and_set_permissions): Inability to change output + file attributes has been downgraded from error to warning. + * Change quote characters in messages as advised by GNU Standards. + * configure: Rename 'datadir' to 'datarootdir'. + +2011-01-17 Antonio Diaz Diaz + + * Version 1.0 released. + * Initial release. + * Created from the decompression code of clzip 1.1. + + +Copyright (C) 2010-2022 Antonio Diaz Diaz. + +This file is a collection of facts, and thus it is not copyrightable, +but just in case, you have unlimited permission to copy, distribute, and +modify it. diff --git a/INSTALL b/INSTALL new file mode 100644 index 0000000..4282b1e --- /dev/null +++ b/INSTALL @@ -0,0 +1,75 @@ +Requirements +------------ +You will need a C99 compiler. (gcc 3.3.6 or newer is recommended). +I use gcc 6.1.0 and 3.3.6, but the code should compile with any standards +compliant compiler. +Gcc is available at http://gcc.gnu.org. + +The operating system must allow signal handlers read access to objects with +static storage duration so that the cleanup handler for Control-C can delete +the partial output file. + + +Procedure +--------- +1. Unpack the archive if you have not done so already: + + tar -xf lunzip[version].tar.lz +or + lzip -cd lunzip[version].tar.lz | tar -xf - + +This creates the directory ./lunzip[version] containing the source from +the main archive. + +2. Change to lunzip directory and run configure. + (Try 'configure --help' for usage instructions). + + cd lunzip[version] + ./configure + + If you are compiling on MinGW, use: + + ./configure CFLAGS+='-D __USE_MINGW_ANSI_STDIO' + +3. Run make. + + make + +4. Optionally, type 'make check' to run the tests that come with lunzip. + +5. Type 'make install' to install the program and any data files and + documentation. 
+ + Or type 'make install-compress', which additionally compresses the + man page after installation. + (Installing compressed docs may become the default in the future). + + You can install only the program or the man page by typing + 'make install-bin' or 'make install-man' respectively. + + Instead of 'make install', you can type 'make install-as-lzip' to + install the program and any data files and documentation, and link + the program to the name 'lzip'. + + +Another way +----------- +You can also compile lunzip into a separate directory. +To do this, you must use a version of 'make' that supports the variable +'VPATH', such as GNU 'make'. 'cd' to the directory where you want the +object files and executables to go and run the 'configure' script. +'configure' automatically checks for the source code in '.', in '..', and +in the directory that 'configure' is in. + +'configure' recognizes the option '--srcdir=DIR' to control where to +look for the sources. Usually 'configure' can determine that directory +automatically. + +After running 'configure', you can run 'make' and 'make install' as +explained above. + + +Copyright (C) 2010-2022 Antonio Diaz Diaz. + +This file is free documentation: you have unlimited permission to copy, +distribute, and modify it. 
diff --git a/Makefile.in b/Makefile.in new file mode 100644 index 0000000..ffc4ce8 --- /dev/null +++ b/Makefile.in @@ -0,0 +1,133 @@ + +DISTNAME = $(pkgname)-$(pkgversion) +INSTALL = install +INSTALL_PROGRAM = $(INSTALL) -m 755 +INSTALL_DATA = $(INSTALL) -m 644 +INSTALL_DIR = $(INSTALL) -d -m 755 +SHELL = /bin/sh +CAN_RUN_INSTALLINFO = $(SHELL) -c "install-info --version" > /dev/null 2>&1 + +objs = carg_parser.o lzip_index.o list.o decoder.o main.o + + +.PHONY : all install install-bin install-info install-man \ + install-strip install-compress install-strip-compress \ + install-bin-strip install-info-compress install-man-compress \ + install-as-lzip \ + uninstall uninstall-bin uninstall-info uninstall-man \ + doc info man check dist clean distclean + +all : $(progname) + +$(progname) : $(objs) + $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(objs) + +main.o : main.c + $(CC) $(CPPFLAGS) $(CFLAGS) -DPROGVERSION=\"$(pkgversion)\" -c -o $@ $< + +%.o : %.c + $(CC) $(CPPFLAGS) $(CFLAGS) -c -o $@ $< + +$(objs) : Makefile +carg_parser.o : carg_parser.h +decoder.o : lzip.h decoder.h +list.o : lzip.h lzip_index.h +lzip_index.o : lzip.h lzip_index.h +main.o : carg_parser.h lzip.h decoder.h + + +doc : man + +info : $(VPATH)/doc/$(pkgname).info + +$(VPATH)/doc/$(pkgname).info : $(VPATH)/doc/$(pkgname).texi + cd $(VPATH)/doc && makeinfo $(pkgname).texi + +man : $(VPATH)/doc/$(progname).1 + +$(VPATH)/doc/$(progname).1 : $(progname) + help2man -n 'decompressor for the lzip format' -o $@ --no-info ./$(progname) + +Makefile : $(VPATH)/configure $(VPATH)/Makefile.in + ./config.status + +check : all + @$(VPATH)/testsuite/check.sh $(VPATH)/testsuite $(pkgversion) + +install : install-bin install-man +install-strip : install-bin-strip install-man +install-compress : install-bin install-man-compress +install-strip-compress : install-bin-strip install-man-compress + +install-bin : all + if [ ! 
-d "$(DESTDIR)$(bindir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(bindir)" ; fi + $(INSTALL_PROGRAM) ./$(progname) "$(DESTDIR)$(bindir)/$(progname)" + +install-bin-strip : all + $(MAKE) INSTALL_PROGRAM='$(INSTALL_PROGRAM) -s' install-bin + +install-info : + if [ ! -d "$(DESTDIR)$(infodir)" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(infodir)" ; fi + -rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"* + $(INSTALL_DATA) $(VPATH)/doc/$(pkgname).info "$(DESTDIR)$(infodir)/$(pkgname).info" + -if $(CAN_RUN_INSTALLINFO) ; then \ + install-info --info-dir="$(DESTDIR)$(infodir)" "$(DESTDIR)$(infodir)/$(pkgname).info" ; \ + fi + +install-info-compress : install-info + lzip -v -9 "$(DESTDIR)$(infodir)/$(pkgname).info" + +install-man : + if [ ! -d "$(DESTDIR)$(mandir)/man1" ] ; then $(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1" ; fi + -rm -f "$(DESTDIR)$(mandir)/man1/$(progname).1"* + $(INSTALL_DATA) $(VPATH)/doc/$(progname).1 "$(DESTDIR)$(mandir)/man1/$(progname).1" + +install-man-compress : install-man + lzip -v -9 "$(DESTDIR)$(mandir)/man1/$(progname).1" + +install-as-lzip : install + -rm -f "$(DESTDIR)$(bindir)/lzip" + cd "$(DESTDIR)$(bindir)" && ln -s $(progname) lzip + +uninstall : uninstall-man uninstall-bin + +uninstall-bin : + -rm -f "$(DESTDIR)$(bindir)/$(progname)" + +uninstall-info : + -if $(CAN_RUN_INSTALLINFO) ; then \ + install-info --info-dir="$(DESTDIR)$(infodir)" --remove "$(DESTDIR)$(infodir)/$(pkgname).info" ; \ + fi + -rm -f "$(DESTDIR)$(infodir)/$(pkgname).info"* + +uninstall-man : + -rm -f "$(DESTDIR)$(mandir)/man1/$(progname).1"* + +dist : doc + ln -sf $(VPATH) $(DISTNAME) + tar -Hustar --owner=root --group=root -cvf $(DISTNAME).tar \ + $(DISTNAME)/AUTHORS \ + $(DISTNAME)/COPYING \ + $(DISTNAME)/ChangeLog \ + $(DISTNAME)/INSTALL \ + $(DISTNAME)/Makefile.in \ + $(DISTNAME)/NEWS \ + $(DISTNAME)/README \ + $(DISTNAME)/configure \ + $(DISTNAME)/doc/$(progname).1 \ + $(DISTNAME)/*.h \ + $(DISTNAME)/*.c \ + $(DISTNAME)/testsuite/check.sh \ + $(DISTNAME)/testsuite/test.txt \ + 
$(DISTNAME)/testsuite/fox.lz \ + $(DISTNAME)/testsuite/fox_*.lz \ + $(DISTNAME)/testsuite/test.txt.lz \ + $(DISTNAME)/testsuite/test_em.txt.lz + rm -f $(DISTNAME) + lzip -v -9 $(DISTNAME).tar + +clean : + -rm -f $(progname) $(objs) + +distclean : clean + -rm -f Makefile config.status *.tar *.tar.lz diff --git a/NEWS b/NEWS new file mode 100644 index 0000000..fee658c --- /dev/null +++ b/NEWS @@ -0,0 +1,6 @@ +Changes in version 1.13: + +Decompression time has been reduced by 5-12% depending on the file. + +In case of error in a numerical argument to a command line option, lunzip +now shows the name of the option and the range of valid values. diff --git a/README b/README new file mode 100644 index 0000000..39cd00d --- /dev/null +++ b/README @@ -0,0 +1,99 @@ +Description + +Lunzip is a decompressor for the lzip format written in C. Its small size +makes it well suited for embedded devices or software installers that need +to decompress files but don't need compression capabilities. Lunzip is fully +compatible with lzip 1.4 or newer. + +The lzip file format is designed for data sharing and long-term archiving, +taking into account both data integrity and decoder availability: + + * The lzip format provides very safe integrity checking and some data + recovery means. The program lziprecover can repair bit flip errors + (one of the most common forms of data corruption) in lzip files, and + provides data recovery capabilities, including error-checked merging + of damaged copies of a file. + + * The lzip format is as simple as possible (but not simpler). The lzip + manual provides the source code of a simple decompressor along with a + detailed explanation of how it works, so that with the only help of the + lzip manual it would be possible for a digital archaeologist to extract + the data from a lzip file long after quantum computers eventually + render LZMA obsolete. 
+ + * Additionally the lzip reference implementation is copylefted, which + guarantees that it will remain free forever. + +A nice feature of the lzip format is that a corrupt byte is easier to repair +the nearer it is from the beginning of the file. Therefore, with the help of +lziprecover, losing an entire archive just because of a corrupt byte near +the beginning is a thing of the past. + +Lunzip uses the same well-defined exit status values used by bzip2, which +makes it safer than decompressors returning ambiguous warning values (like +gunzip) when it is used as a back end for other programs like tar or zutils. + +Lunzip provides a 'low memory' mode able to decompress any file using as +little memory as 50 kB, irrespective of the dictionary size used to +compress the file. To activate it, specify the size of the output buffer +with the option '--buffer-size' and lunzip will use the decompressed +file as dictionary for distances beyond the buffer size. Of course, the +larger the difference between the buffer size and the dictionary size, the +more accesses to disk are needed and the slower the decompression is. +This 'low memory' mode only works when decompressing to a regular file +and is intended for systems without enough memory (RAM + swap) to keep +the whole dictionary at once. It has been tested on a laptop with a 486 +processor and 4 MiB of RAM. + +The option '--buffer-size' may help to decompress a file erroneously created +with a dictionary size much larger than the uncompressed size. (Lzip adjusts +the dictionary size to the uncompressed size, but third-party tools may not). + +The amount of memory required by lunzip to decompress a file is about 46 kB +larger than the dictionary size used to compress that file, unless +'--buffer-size' is specified. 
+ +Lunzip attempts to guess the name for the decompressed file from that of +the compressed file as follows: + +filename.lz becomes filename +filename.tlz becomes filename.tar +anyothername becomes anyothername.out + +Decompressing a file is much like copying or moving it. Therefore lunzip +preserves the access and modification dates, permissions, and, when +possible, ownership of the file just as 'cp -p' does. (If the user ID or +the group ID can't be duplicated, the file permission bits S_ISUID and +S_ISGID are cleared). + +Lunzip is able to read from some types of non-regular files if either the +option '-c' or the option '-o' is specified. + +If no file names are specified, lunzip decompresses from standard input to +standard output. In this case, lunzip will refuse to read compressed input +from a terminal, as this might leave the terminal in an abnormal state. + +Lunzip will correctly decompress a file which is the concatenation of two or +more compressed files. The result is the concatenation of the corresponding +decompressed files. Integrity testing of concatenated compressed files is +also supported. + +The ideas embodied in lunzip are due to (at least) the following people: +Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for the +definition of Markov chains), G.N.N. Martin (for the definition of range +encoding), Igor Pavlov (for putting all the above together in LZMA), and +Julian Seward (for bzip2's CLI). + +LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never have +been compressed. Decompressed is used to refer to data which have undergone +the process of decompression. + + +Copyright (C) 2010-2022 Antonio Diaz Diaz. + +This file is free documentation: you have unlimited permission to copy, +distribute, and modify it. + +The file Makefile.in is a data file used by configure to produce the +Makefile. It has the same copyright owner and permissions that configure +itself. 
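Note on the README above: its file name guessing rule (filename.lz becomes filename, filename.tlz becomes filename.tar, anything else gets .out appended) can be sketched in C as follows. This is a minimal illustration and not lunzip's own code: the helper name guess_output_name is hypothetical, and requiring the name to be longer than the suffix itself is an assumption made here.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not taken from lunzip): derive the output name
   from the input name following the three rules listed in the README.
   Returns a malloc'd string that the caller must free, or NULL if out
   of memory. */
static char * guess_output_name( const char * const name )
  {
  const size_t len = strlen( name );
  const char * suffix = ".out";		/* default rule: append ".out" */
  size_t keep = len;			/* bytes of 'name' to keep */

  /* Assumption: a name consisting only of the extension is not stripped. */
  if( len > 4 && strcmp( name + len - 4, ".tlz" ) == 0 )
    { keep = len - 4; suffix = ".tar"; }	/* .tlz becomes .tar */
  else if( len > 3 && strcmp( name + len - 3, ".lz" ) == 0 )
    { keep = len - 3; suffix = ""; }		/* .lz is removed */

  char * const result = malloc( keep + strlen( suffix ) + 1 );
  if( !result ) return NULL;
  memcpy( result, name, keep );
  strcpy( result + keep, suffix );	/* also writes the terminating 0 */
  return result;
  }
```

A caller would pass the name of the compressed file, use the returned string as the name of the file to create, and free it afterwards.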
diff --git a/carg_parser.c b/carg_parser.c new file mode 100644 index 0000000..181ba23 --- /dev/null +++ b/carg_parser.c @@ -0,0 +1,319 @@ +/* Arg_parser - POSIX/GNU command line argument parser. (C version) + Copyright (C) 2006-2022 Antonio Diaz Diaz. + + This library is free software. Redistribution and use in source and + binary forms, with or without modification, are permitted provided + that the following conditions are met: + + 1. Redistributions of source code must retain the above copyright + notice, this list of conditions, and the following disclaimer. + + 2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions, and the following disclaimer in the + documentation and/or other materials provided with the distribution. + + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. +*/ + +#include <stdlib.h> +#include <string.h> + +#include "carg_parser.h" + + +/* assure at least a minimum size for buffer 'buf' */ +static void * ap_resize_buffer( void * buf, const int min_size ) + { + if( buf ) buf = realloc( buf, min_size ); + else buf = malloc( min_size ); + return buf; + } + + +static char push_back_record( struct Arg_parser * const ap, const int code, + const char * const long_name, + const char * const argument ) + { + struct ap_Record * p; + void * tmp = ap_resize_buffer( ap->data, + ( ap->data_size + 1 ) * sizeof (struct ap_Record) ); + if( !tmp ) return 0; + ap->data = (struct ap_Record *)tmp; + p = &(ap->data[ap->data_size]); + p->code = code; + if( long_name ) + { + const int len = strlen( long_name ); + p->parsed_name = (char *)malloc( len + 2 + 1 ); + if( !p->parsed_name ) return 0; + p->parsed_name[0] = p->parsed_name[1] = '-'; + strncpy( p->parsed_name + 2, long_name, len + 1 ); + } + else if( code > 0 && code < 256 ) + { + p->parsed_name = (char *)malloc( 2 + 1 ); + if( !p->parsed_name ) return
0; + p->parsed_name[0] = '-'; p->parsed_name[1] = code; p->parsed_name[2] = 0; + } + else p->parsed_name = 0; + if( argument ) + { + const int len = strlen( argument ); + p->argument = (char *)malloc( len + 1 ); + if( !p->argument ) { free( p->parsed_name ); return 0; } + strncpy( p->argument, argument, len + 1 ); + } + else p->argument = 0; + ++ap->data_size; + return 1; + } + + +static char add_error( struct Arg_parser * const ap, const char * const msg ) + { + const int len = strlen( msg ); + void * tmp = ap_resize_buffer( ap->error, ap->error_size + len + 1 ); + if( !tmp ) return 0; + ap->error = (char *)tmp; + strncpy( ap->error + ap->error_size, msg, len + 1 ); + ap->error_size += len; + return 1; + } + + +static void free_data( struct Arg_parser * const ap ) + { + int i; + for( i = 0; i < ap->data_size; ++i ) + { free( ap->data[i].argument ); free( ap->data[i].parsed_name ); } + if( ap->data ) { free( ap->data ); ap->data = 0; } + ap->data_size = 0; + } + + +/* Return 0 only if out of memory. */ +static char parse_long_option( struct Arg_parser * const ap, + const char * const opt, const char * const arg, + const struct ap_Option options[], + int * const argindp ) + { + unsigned len; + int index = -1, i; + char exact = 0, ambig = 0; + + for( len = 0; opt[len+2] && opt[len+2] != '='; ++len ) ; + + /* Test all long options for either exact match or abbreviated matches. 
*/ + for( i = 0; options[i].code != 0; ++i ) + if( options[i].long_name && + strncmp( options[i].long_name, &opt[2], len ) == 0 ) + { + if( strlen( options[i].long_name ) == len ) /* Exact match found */ + { index = i; exact = 1; break; } + else if( index < 0 ) index = i; /* First nonexact match found */ + else if( options[index].code != options[i].code || + options[index].has_arg != options[i].has_arg ) + ambig = 1; /* Second or later nonexact match found */ + } + + if( ambig && !exact ) + { + add_error( ap, "option '" ); add_error( ap, opt ); + add_error( ap, "' is ambiguous" ); + return 1; + } + + if( index < 0 ) /* nothing found */ + { + add_error( ap, "unrecognized option '" ); add_error( ap, opt ); + add_error( ap, "'" ); + return 1; + } + + ++*argindp; + + if( opt[len+2] ) /* '--<long_option>=<arg>' syntax */ + { + if( options[index].has_arg == ap_no ) + { + add_error( ap, "option '--" ); add_error( ap, options[index].long_name ); + add_error( ap, "' doesn't allow an argument" ); + return 1; + } + if( options[index].has_arg == ap_yes && !opt[len+3] ) + { + add_error( ap, "option '--" ); add_error( ap, options[index].long_name ); + add_error( ap, "' requires an argument" ); + return 1; + } + return push_back_record( ap, options[index].code, + options[index].long_name, &opt[len+3] ); + } + + if( options[index].has_arg == ap_yes ) + { + if( !arg || !arg[0] ) + { + add_error( ap, "option '--" ); add_error( ap, options[index].long_name ); + add_error( ap, "' requires an argument" ); + return 1; + } + ++*argindp; + return push_back_record( ap, options[index].code, + options[index].long_name, arg ); + } + + return push_back_record( ap, options[index].code, + options[index].long_name, 0 ); + } + + +/* Return 0 only if out of memory.
*/ +static char parse_short_option( struct Arg_parser * const ap, + const char * const opt, const char * const arg, + const struct ap_Option options[], + int * const argindp ) + { + int cind = 1; /* character index in opt */ + + while( cind > 0 ) + { + int index = -1, i; + const unsigned char c = opt[cind]; + char code_str[2]; + code_str[0] = c; code_str[1] = 0; + + if( c != 0 ) + for( i = 0; options[i].code; ++i ) + if( c == options[i].code ) + { index = i; break; } + + if( index < 0 ) + { + add_error( ap, "invalid option -- '" ); add_error( ap, code_str ); + add_error( ap, "'" ); + return 1; + } + + if( opt[++cind] == 0 ) { ++*argindp; cind = 0; } /* opt finished */ + + if( options[index].has_arg != ap_no && cind > 0 && opt[cind] ) + { + if( !push_back_record( ap, c, 0, &opt[cind] ) ) return 0; + ++*argindp; cind = 0; + } + else if( options[index].has_arg == ap_yes ) + { + if( !arg || !arg[0] ) + { + add_error( ap, "option requires an argument -- '" ); + add_error( ap, code_str ); add_error( ap, "'" ); + return 1; + } + ++*argindp; cind = 0; + if( !push_back_record( ap, c, 0, arg ) ) return 0; + } + else if( !push_back_record( ap, c, 0, 0 ) ) return 0; + } + return 1; + } + + +char ap_init( struct Arg_parser * const ap, + const int argc, const char * const argv[], + const struct ap_Option options[], const char in_order ) + { + const char ** non_options = 0; /* skipped non-options */ + int non_options_size = 0; /* number of skipped non-options */ + int argind = 1; /* index in argv */ + char done = 0; /* false until success */ + + ap->data = 0; + ap->error = 0; + ap->data_size = 0; + ap->error_size = 0; + if( argc < 2 || !argv || !options ) return 1; + + while( argind < argc ) + { + const unsigned char ch1 = argv[argind][0]; + const unsigned char ch2 = ch1 ? argv[argind][1] : 0; + + if( ch1 == '-' && ch2 ) /* we found an option */ + { + const char * const opt = argv[argind]; + const char * const arg = ( argind + 1 < argc ) ? 
argv[argind+1] : 0; + if( ch2 == '-' ) + { + if( !argv[argind][2] ) { ++argind; break; } /* we found "--" */ + else if( !parse_long_option( ap, opt, arg, options, &argind ) ) goto out; + } + else if( !parse_short_option( ap, opt, arg, options, &argind ) ) goto out; + if( ap->error ) break; + } + else + { + if( in_order ) + { if( !push_back_record( ap, 0, 0, argv[argind++] ) ) goto out; } + else + { + void * tmp = ap_resize_buffer( non_options, + ( non_options_size + 1 ) * sizeof *non_options ); + if( !tmp ) goto out; + non_options = (const char **)tmp; + non_options[non_options_size++] = argv[argind++]; + } + } + } + if( ap->error ) free_data( ap ); + else + { + int i; + for( i = 0; i < non_options_size; ++i ) + if( !push_back_record( ap, 0, 0, non_options[i] ) ) goto out; + while( argind < argc ) + if( !push_back_record( ap, 0, 0, argv[argind++] ) ) goto out; + } + done = 1; +out: if( non_options ) free( non_options ); + return done; + } + + +void ap_free( struct Arg_parser * const ap ) + { + free_data( ap ); + if( ap->error ) { free( ap->error ); ap->error = 0; } + ap->error_size = 0; + } + + +const char * ap_error( const struct Arg_parser * const ap ) + { return ap->error; } + + +int ap_arguments( const struct Arg_parser * const ap ) + { return ap->data_size; } + + +int ap_code( const struct Arg_parser * const ap, const int i ) + { + if( i < 0 || i >= ap_arguments( ap ) ) return 0; + return ap->data[i].code; + } + + +const char * ap_parsed_name( const struct Arg_parser * const ap, const int i ) + { + if( i < 0 || i >= ap_arguments( ap ) || !ap->data[i].parsed_name ) return ""; + return ap->data[i].parsed_name; + } + + +const char * ap_argument( const struct Arg_parser * const ap, const int i ) + { + if( i < 0 || i >= ap_arguments( ap ) || !ap->data[i].argument ) return ""; + return ap->data[i].argument; + } diff --git a/carg_parser.h b/carg_parser.h new file mode 100644 index 0000000..0c64861 --- /dev/null +++ b/carg_parser.h @@ -0,0 +1,97 @@ +/* Arg_parser - 
POSIX/GNU command line argument parser. (C version) + Copyright (C) 2006-2022 Antonio Diaz Diaz. + + This library is free software. Redistribution and use in source and + binary forms, with or without modification, are permitted provided + that the following conditions are met: + + 1. Redistributions of source code must retain the above copyright + notice, this list of conditions, and the following disclaimer. + + 2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions, and the following disclaimer in the + documentation and/or other materials provided with the distribution. + + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. +*/ + +/* Arg_parser reads the arguments in 'argv' and creates a number of + option codes, option arguments, and non-option arguments. + + In case of error, 'ap_error' returns a non-null pointer to an error + message. + + 'options' is an array of 'struct ap_Option' terminated by an element + containing a code which is zero. A null long_name means a short-only + option. A code value outside the unsigned char range means a long-only + option. + + Arg_parser normally makes it appear as if all the option arguments + were specified before all the non-option arguments for the purposes + of parsing, even if the user of your program intermixed option and + non-option arguments. If you want the arguments in the exact order + the user typed them, call 'ap_init' with 'in_order' = true. + + The argument '--' terminates all options; any following arguments are + treated as non-option arguments, even if they begin with a hyphen. + + The syntax for optional option arguments is '-<short_option><argument>' + (without whitespace), or '--<long_option>=<argument>'.
+*/ + +#ifdef __cplusplus +extern "C" { +#endif + +enum ap_Has_arg { ap_no, ap_yes, ap_maybe }; + +struct ap_Option + { + int code; /* Short option letter or code ( code != 0 ) */ + const char * long_name; /* Long option name (maybe null) */ + enum ap_Has_arg has_arg; + }; + + +struct ap_Record + { + int code; + char * parsed_name; + char * argument; + }; + + +struct Arg_parser + { + struct ap_Record * data; + char * error; + int data_size; + int error_size; + }; + + +char ap_init( struct Arg_parser * const ap, + const int argc, const char * const argv[], + const struct ap_Option options[], const char in_order ); + +void ap_free( struct Arg_parser * const ap ); + +const char * ap_error( const struct Arg_parser * const ap ); + +/* The number of arguments parsed. May be different from argc. */ +int ap_arguments( const struct Arg_parser * const ap ); + +/* If ap_code( i ) is 0, ap_argument( i ) is a non-option. + Else ap_argument( i ) is the option's argument (or empty). */ +int ap_code( const struct Arg_parser * const ap, const int i ); + +/* Full name of the option parsed (short or long). */ +const char * ap_parsed_name( const struct Arg_parser * const ap, const int i ); + +const char * ap_argument( const struct Arg_parser * const ap, const int i ); + +#ifdef __cplusplus +} +#endif diff --git a/configure b/configure new file mode 100755 index 0000000..e241235 --- /dev/null +++ b/configure @@ -0,0 +1,193 @@ +#! /bin/sh +# configure script for Lunzip - Decompressor for the lzip format +# Copyright (C) 2010-2022 Antonio Diaz Diaz. +# +# This configure script is free software: you have unlimited permission +# to copy, distribute, and modify it. + +pkgname=lunzip +pkgversion=1.13 +progname=lunzip +srctrigger=doc/${progname}.1 + +# clear some things potentially inherited from environment. 
+LC_ALL=C +export LC_ALL +srcdir= +prefix=/usr/local +exec_prefix='$(prefix)' +bindir='$(exec_prefix)/bin' +datarootdir='$(prefix)/share' +infodir='$(datarootdir)/info' +mandir='$(datarootdir)/man' +CC=gcc +CPPFLAGS= +CFLAGS='-Wall -W -O2' +LDFLAGS= + +# checking whether we are using GNU C. +/bin/sh -c "${CC} --version" > /dev/null 2>&1 || { CC=cc ; CFLAGS=-O2 ; } + +# Loop over all args +args= +no_create= +while [ $# != 0 ] ; do + + # Get the first arg, and shuffle + option=$1 ; arg2=no + shift + + # Add the argument quoted to args + if [ -z "${args}" ] ; then args="\"${option}\"" + else args="${args} \"${option}\"" ; fi + + # Split out the argument for options that take them + case ${option} in + *=*) optarg=`echo "${option}" | sed -e 's,^[^=]*=,,;s,/$,,'` ;; + esac + + # Process the options + case ${option} in + --help | -h) + echo "Usage: $0 [OPTION]... [VAR=VALUE]..." + echo + echo "To assign makefile variables (e.g., CC, CFLAGS...), specify them as" + echo "arguments to configure in the form VAR=VALUE." + echo + echo "Options and variables: [defaults in brackets]" + echo " -h, --help display this help and exit" + echo " -V, --version output version information and exit" + echo " --srcdir=DIR find the sources in DIR [. 
or ..]" + echo " --prefix=DIR install into DIR [${prefix}]" + echo " --exec-prefix=DIR base directory for arch-dependent files [${exec_prefix}]" + echo " --bindir=DIR user executables directory [${bindir}]" + echo " --datarootdir=DIR base directory for doc and data [${datarootdir}]" + echo " --infodir=DIR info files directory [${infodir}]" + echo " --mandir=DIR man pages directory [${mandir}]" + echo " CC=COMPILER C compiler to use [${CC}]" + echo " CPPFLAGS=OPTIONS command line options for the preprocessor [${CPPFLAGS}]" + echo " CFLAGS=OPTIONS command line options for the C compiler [${CFLAGS}]" + echo " CFLAGS+=OPTIONS append options to the current value of CFLAGS" + echo " LDFLAGS=OPTIONS command line options for the linker [${LDFLAGS}]" + echo + exit 0 ;; + --version | -V) + echo "Configure script for ${pkgname} version ${pkgversion}" + exit 0 ;; + --srcdir) srcdir=$1 ; arg2=yes ;; + --prefix) prefix=$1 ; arg2=yes ;; + --exec-prefix) exec_prefix=$1 ; arg2=yes ;; + --bindir) bindir=$1 ; arg2=yes ;; + --datarootdir) datarootdir=$1 ; arg2=yes ;; + --infodir) infodir=$1 ; arg2=yes ;; + --mandir) mandir=$1 ; arg2=yes ;; + + --srcdir=*) srcdir=${optarg} ;; + --prefix=*) prefix=${optarg} ;; + --exec-prefix=*) exec_prefix=${optarg} ;; + --bindir=*) bindir=${optarg} ;; + --datarootdir=*) datarootdir=${optarg} ;; + --infodir=*) infodir=${optarg} ;; + --mandir=*) mandir=${optarg} ;; + --no-create) no_create=yes ;; + + CC=*) CC=${optarg} ;; + CPPFLAGS=*) CPPFLAGS=${optarg} ;; + CFLAGS=*) CFLAGS=${optarg} ;; + CFLAGS+=*) CFLAGS="${CFLAGS} ${optarg}" ;; + LDFLAGS=*) LDFLAGS=${optarg} ;; + + --*) + echo "configure: WARNING: unrecognized option: '${option}'" 1>&2 ;; + *=* | *-*-*) ;; + *) + echo "configure: unrecognized option: '${option}'" 1>&2 + echo "Try 'configure --help' for more information." 
1>&2 + exit 1 ;; + esac + + # Check if the option took a separate argument + if [ "${arg2}" = yes ] ; then + if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift + else echo "configure: Missing argument to '${option}'" 1>&2 + exit 1 + fi + fi +done + +# Find the source files, if location was not specified. +srcdirtext= +if [ -z "${srcdir}" ] ; then + srcdirtext="or . or .." ; srcdir=. + if [ ! -r "${srcdir}/${srctrigger}" ] ; then srcdir=.. ; fi + if [ ! -r "${srcdir}/${srctrigger}" ] ; then + ## the sed command below emulates the dirname command + srcdir=`echo "$0" | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'` + fi +fi + +if [ ! -r "${srcdir}/${srctrigger}" ] ; then + echo "configure: Can't find sources in ${srcdir} ${srcdirtext}" 1>&2 + echo "configure: (At least ${srctrigger} is missing)." 1>&2 + exit 1 +fi + +# Set srcdir to . if that's what it is. +if [ "`pwd`" = "`cd "${srcdir}" ; pwd`" ] ; then srcdir=. ; fi + +echo +if [ -z "${no_create}" ] ; then + echo "creating config.status" + rm -f config.status + cat > config.status << EOF +#! /bin/sh +# This file was generated automatically by configure. Don't edit. +# Run this file to recreate the current configuration. +# +# This script is free software: you have unlimited permission +# to copy, distribute, and modify it. + +exec /bin/sh $0 ${args} --no-create +EOF + chmod +x config.status +fi + +echo "creating Makefile" +echo "VPATH = ${srcdir}" +echo "prefix = ${prefix}" +echo "exec_prefix = ${exec_prefix}" +echo "bindir = ${bindir}" +echo "datarootdir = ${datarootdir}" +echo "infodir = ${infodir}" +echo "mandir = ${mandir}" +echo "CC = ${CC}" +echo "CPPFLAGS = ${CPPFLAGS}" +echo "CFLAGS = ${CFLAGS}" +echo "LDFLAGS = ${LDFLAGS}" +rm -f Makefile +cat > Makefile << EOF +# Makefile for Lunzip - Decompressor for the lzip format +# Copyright (C) 2010-2022 Antonio Diaz Diaz. +# This file was generated automatically by configure. Don't edit. 
+# +# This Makefile is free software: you have unlimited permission +# to copy, distribute, and modify it. + +pkgname = ${pkgname} +pkgversion = ${pkgversion} +progname = ${progname} +VPATH = ${srcdir} +prefix = ${prefix} +exec_prefix = ${exec_prefix} +bindir = ${bindir} +datarootdir = ${datarootdir} +infodir = ${infodir} +mandir = ${mandir} +CC = ${CC} +CPPFLAGS = ${CPPFLAGS} +CFLAGS = ${CFLAGS} +LDFLAGS = ${LDFLAGS} +EOF +cat "${srcdir}/Makefile.in" >> Makefile + +echo "OK. Now you can run make." diff --git a/decoder.c b/decoder.c new file mode 100644 index 0000000..b52b35f --- /dev/null +++ b/decoder.c @@ -0,0 +1,317 @@ +/* Lunzip - Decompressor for the lzip format + Copyright (C) 2010-2022 Antonio Diaz Diaz. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . +*/ + +#define _FILE_OFFSET_BITS 64 + +#include <errno.h> +#include <stdbool.h> +#include <stdint.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#include "lzip.h" +#include "decoder.h" + + +CRC32 crc32; + + +/* Return the number of bytes really read. + If (value returned < size) and (errno == 0), means EOF was reached. +*/ +int readblock( const int fd, uint8_t * const buf, const int size ) + { + int sz = 0; + errno = 0; + while( sz < size ) + { + const int n = read( fd, buf + sz, size - sz ); + if( n > 0 ) sz += n; + else if( n == 0 ) break; /* EOF */ + else if( errno != EINTR ) break; + errno = 0; + } + return sz; + } + + +/* Return the number of bytes really written.
+ If (value returned < size), it is always an error. +*/ +static int writeblock( const int fd, const uint8_t * const buf, const int size ) + { + int sz = 0; + errno = 0; + while( sz < size ) + { + const int n = write( fd, buf + sz, size - sz ); + if( n > 0 ) sz += n; + else if( n < 0 && errno != EINTR ) break; + errno = 0; + } + return sz; + } + + +unsigned seek_read_back( const int fd, uint8_t * const buf, const int size, + const int offset ) + { + if( lseek( fd, -offset, SEEK_END ) >= 0 ) + return readblock( fd, buf, size ); + return 0; + } + + +bool Rd_read_block( struct Range_decoder * const rdec ) + { + if( !rdec->at_stream_end ) + { + rdec->stream_pos = readblock( rdec->infd, rdec->buffer, rd_buffer_size ); + if( rdec->stream_pos != rd_buffer_size && errno ) + { show_error( "Read error", errno, false ); cleanup_and_fail( 1 ); } + rdec->at_stream_end = ( rdec->stream_pos < rd_buffer_size ); + rdec->partial_member_pos += rdec->pos; + rdec->pos = 0; + show_dprogress( 0, 0, 0, 0 ); + } + return rdec->pos < rdec->stream_pos; + } + + +void LZd_flush_data( struct LZ_decoder * const d ) + { + if( d->pos > d->stream_pos ) + { + const int size = d->pos - d->stream_pos; + CRC32_update_buf( &d->crc, d->buffer + d->stream_pos, size ); + if( d->outfd >= 0 && + writeblock( d->outfd, d->buffer + d->stream_pos, size ) != size ) + { show_error( "Write error", errno, false ); cleanup_and_fail( 1 ); } + if( d->pos >= d->buffer_size ) + { d->partial_data_pos += d->pos; d->pos = 0; + if( d->partial_data_pos >= d->dictionary_size ) d->pos_wrapped = true; } + d->stream_pos = d->pos; + } + } + + +static bool LZd_verify_trailer( struct LZ_decoder * const d, + struct Pretty_print * const pp ) + { + Lzip_trailer trailer; + int size = Rd_read_data( d->rdec, trailer, Lt_size ); + const unsigned long long data_size = LZd_data_position( d ); + const unsigned long long member_size = Rd_member_position( d->rdec ); + bool error = false; + + if( size < Lt_size ) + { + error = true; + if( 
verbosity >= 0 ) + { + Pp_show_msg( pp, 0 ); + fprintf( stderr, "Trailer truncated at trailer position %d;" + " some checks may fail.\n", size ); + } + while( size < Lt_size ) trailer[size++] = 0; + } + + const unsigned td_crc = Lt_get_data_crc( trailer ); + if( td_crc != LZd_crc( d ) ) + { + error = true; + if( verbosity >= 0 ) + { + Pp_show_msg( pp, 0 ); + fprintf( stderr, "CRC mismatch; stored %08X, computed %08X\n", + td_crc, LZd_crc( d ) ); + } + } + const unsigned long long td_size = Lt_get_data_size( trailer ); + if( td_size != data_size ) + { + error = true; + if( verbosity >= 0 ) + { + Pp_show_msg( pp, 0 ); + fprintf( stderr, "Data size mismatch; stored %llu (0x%llX), computed %llu (0x%llX)\n", + td_size, td_size, data_size, data_size ); + } + } + const unsigned long long tm_size = Lt_get_member_size( trailer ); + if( tm_size != member_size ) + { + error = true; + if( verbosity >= 0 ) + { + Pp_show_msg( pp, 0 ); + fprintf( stderr, "Member size mismatch; stored %llu (0x%llX), computed %llu (0x%llX)\n", + tm_size, tm_size, member_size, member_size ); + } + } + if( error ) return false; + if( verbosity >= 2 ) + { + if( verbosity >= 4 ) show_header( d->dictionary_size ); + if( data_size == 0 || member_size == 0 ) + fputs( "no data compressed. ", stderr ); + else + fprintf( stderr, "%6.3f:1, %5.2f%% ratio, %5.2f%% saved. ", + (double)data_size / member_size, + ( 100.0 * member_size ) / data_size, + 100.0 - ( ( 100.0 * member_size ) / data_size ) ); + if( verbosity >= 4 ) fprintf( stderr, "CRC %08X, ", td_crc ); + if( verbosity >= 3 ) + fprintf( stderr, "%9llu out, %8llu in. ", data_size, member_size ); + } + return true; + } + + +/* Return value: 0 = OK, 1 = decoder error, 2 = unexpected EOF, + 3 = trailer error, 4 = unknown marker found. 
*/ +int LZd_decode_member( struct LZ_decoder * const d, + struct Pretty_print * const pp ) + { + struct Range_decoder * const rdec = d->rdec; + Bit_model bm_literal[1<buffer_size >= d->dictionary_size; + + Bm_array_init( bm_literal[0], (1 << literal_context_bits) * 0x300 ); + Bm_array_init( bm_match[0], states * pos_states ); + Bm_array_init( bm_rep, states ); + Bm_array_init( bm_rep0, states ); + Bm_array_init( bm_rep1, states ); + Bm_array_init( bm_rep2, states ); + Bm_array_init( bm_len[0], states * pos_states ); + Bm_array_init( bm_dis_slot[0], len_states * (1 << dis_slot_bits) ); + Bm_array_init( bm_dis, modeled_distances - end_dis_model + 1 ); + Bm_array_init( bm_align, dis_align_size ); + Lm_init( &match_len_model ); + Lm_init( &rep_len_model ); + + Rd_load( rdec ); + while( !Rd_finished( rdec ) ) + { + const int pos_state = LZd_data_position( d ) & pos_state_mask; + if( Rd_decode_bit( rdec, &bm_match[state][pos_state] ) == 0 ) /* 1st bit */ + { + /* literal byte */ + Bit_model * const bm = bm_literal[get_lit_state(LZd_peek_prev( d ))]; + if( ( state = St_set_char( state ) ) < 4 ) + LZd_put_byte( d, Rd_decode_tree8( rdec, bm ) ); + else + LZd_put_byte( d, Rd_decode_matched( rdec, bm, LZd_peek( d, rep0 ) ) ); + continue; + } + /* match or repeated match */ + int len; + if( Rd_decode_bit( rdec, &bm_rep[state] ) != 0 ) /* 2nd bit */ + { + if( Rd_decode_bit( rdec, &bm_rep0[state] ) == 0 ) /* 3rd bit */ + { + if( Rd_decode_bit( rdec, &bm_len[state][pos_state] ) == 0 ) /* 4th bit */ + { state = St_set_short_rep( state ); + LZd_put_byte( d, LZd_peek( d, rep0 ) ); continue; } + } + else + { + unsigned distance; + if( Rd_decode_bit( rdec, &bm_rep1[state] ) == 0 ) /* 4th bit */ + distance = rep1; + else + { + if( Rd_decode_bit( rdec, &bm_rep2[state] ) == 0 ) /* 5th bit */ + distance = rep2; + else + { distance = rep3; rep3 = rep2; } + rep2 = rep1; + } + rep1 = rep0; + rep0 = distance; + } + state = St_set_rep( state ); + len = Rd_decode_len( rdec, &rep_len_model, 
pos_state ); + } + else /* match */ + { + len = Rd_decode_len( rdec, &match_len_model, pos_state ); + unsigned distance = Rd_decode_tree6( rdec, bm_dis_slot[get_len_state(len)] ); + if( distance >= start_dis_model ) + { + const unsigned dis_slot = distance; + const int direct_bits = ( dis_slot >> 1 ) - 1; + distance = ( 2 | ( dis_slot & 1 ) ) << direct_bits; + if( dis_slot < end_dis_model ) + distance += Rd_decode_tree_reversed( rdec, + bm_dis + ( distance - dis_slot ), direct_bits ); + else + { + distance += + Rd_decode( rdec, direct_bits - dis_align_bits ) << dis_align_bits; + distance += Rd_decode_tree_reversed4( rdec, bm_align ); + if( distance == 0xFFFFFFFFU ) /* marker found */ + { + Rd_normalize( rdec ); + LZd_flush_data( d ); + if( len == min_match_len ) /* End Of Stream marker */ + { + if( LZd_verify_trailer( d, pp ) ) return 0; else return 3; + } + if( len == min_match_len + 1 ) /* Sync Flush marker */ + { + Rd_load( rdec ); continue; + } + if( verbosity >= 0 ) + { + Pp_show_msg( pp, 0 ); + fprintf( stderr, "Unsupported marker code '%d'\n", len ); + } + return 4; + } + } + } + rep3 = rep2; rep2 = rep1; rep1 = rep0; rep0 = distance; + state = St_set_match( state ); + if( rep0 >= d->dictionary_size || + ( !d->pos_wrapped && rep0 >= LZd_data_position( d ) ) ) + { LZd_flush_data( d ); return 1; } + } + if( full_buffer || rep0 < d->buffer_size ) LZd_copy_block( d, rep0, len ); + else LZd_copy_block2( d, rep0, len ); + } + LZd_flush_data( d ); + return 2; + } diff --git a/decoder.h b/decoder.h new file mode 100644 index 0000000..0afdd83 --- /dev/null +++ b/decoder.h @@ -0,0 +1,393 @@ +/* Lunzip - Decompressor for the lzip format + Copyright (C) 2010-2022 Antonio Diaz Diaz. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. 
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . +*/ + +enum { rd_buffer_size = 16384 }; + +struct Range_decoder + { + unsigned long long partial_member_pos; + uint8_t * buffer; /* input buffer */ + int pos; /* current pos in buffer */ + int stream_pos; /* when reached, a new block must be read */ + uint32_t code; + uint32_t range; + int infd; /* input file descriptor */ + bool at_stream_end; + }; + +bool Rd_read_block( struct Range_decoder * const rdec ); + +static inline bool Rd_init( struct Range_decoder * const rdec, const int ifd ) + { + rdec->partial_member_pos = 0; + rdec->buffer = (uint8_t *)malloc( rd_buffer_size ); + if( !rdec->buffer ) return false; + rdec->pos = 0; + rdec->stream_pos = 0; + rdec->code = 0; + rdec->range = 0xFFFFFFFFU; + rdec->infd = ifd; + rdec->at_stream_end = false; + return true; + } + +static inline void Rd_free( struct Range_decoder * const rdec ) + { free( rdec->buffer ); } + +static inline bool Rd_finished( struct Range_decoder * const rdec ) + { return rdec->pos >= rdec->stream_pos && !Rd_read_block( rdec ); } + +static inline unsigned long long +Rd_member_position( const struct Range_decoder * const rdec ) + { return rdec->partial_member_pos + rdec->pos; } + +static inline void Rd_reset_member_position( struct Range_decoder * const rdec ) + { rdec->partial_member_pos = 0; rdec->partial_member_pos -= rdec->pos; } + +static inline uint8_t Rd_get_byte( struct Range_decoder * const rdec ) + { + /* 0xFF avoids decoder error if member is truncated at EOS marker */ + if( Rd_finished( rdec ) ) return 0xFF; + return rdec->buffer[rdec->pos++]; + } + +static inline int Rd_read_data( struct Range_decoder * const rdec, + uint8_t * const 
outbuf, const int size ) + { + int sz = 0; + while( sz < size && !Rd_finished( rdec ) ) + { + const int rd = min( size - sz, rdec->stream_pos - rdec->pos ); + memcpy( outbuf + sz, rdec->buffer + rdec->pos, rd ); + rdec->pos += rd; + sz += rd; + } + return sz; + } + +static inline void Rd_load( struct Range_decoder * const rdec ) + { + int i; + rdec->code = 0; + for( i = 0; i < 5; ++i ) rdec->code = (rdec->code << 8) | Rd_get_byte( rdec ); + rdec->range = 0xFFFFFFFFU; + rdec->code &= rdec->range; /* make sure that first byte is discarded */ + } + +static inline void Rd_normalize( struct Range_decoder * const rdec ) + { + if( rdec->range <= 0x00FFFFFFU ) + { rdec->range <<= 8; rdec->code = (rdec->code << 8) | Rd_get_byte( rdec ); } + } + +static inline unsigned Rd_decode( struct Range_decoder * const rdec, + const int num_bits ) + { + unsigned symbol = 0; + int i; + for( i = num_bits; i > 0; --i ) + { + Rd_normalize( rdec ); + rdec->range >>= 1; +/* symbol <<= 1; */ +/* if( rdec->code >= rdec->range ) { rdec->code -= rdec->range; symbol |= 1; } */ + const bool bit = ( rdec->code >= rdec->range ); + symbol <<= 1; symbol += bit; + rdec->code -= rdec->range & ( 0U - bit ); + } + return symbol; + } + +static inline unsigned Rd_decode_bit( struct Range_decoder * const rdec, + Bit_model * const probability ) + { + Rd_normalize( rdec ); + const uint32_t bound = ( rdec->range >> bit_model_total_bits ) * *probability; + if( rdec->code < bound ) + { + rdec->range = bound; + *probability += ( bit_model_total - *probability ) >> bit_model_move_bits; + return 0; + } + else + { + rdec->code -= bound; + rdec->range -= bound; + *probability -= *probability >> bit_model_move_bits; + return 1; + } + } + +static inline void Rd_decode_symbol_bit( struct Range_decoder * const rdec, + Bit_model * const probability, unsigned * symbol ) + { + Rd_normalize( rdec ); + *symbol <<= 1; + const uint32_t bound = ( rdec->range >> bit_model_total_bits ) * *probability; + if( rdec->code < bound ) + { 
+ rdec->range = bound; + *probability += ( bit_model_total - *probability ) >> bit_model_move_bits; + } + else + { + rdec->code -= bound; + rdec->range -= bound; + *probability -= *probability >> bit_model_move_bits; + *symbol |= 1; + } + } + +static inline void Rd_decode_symbol_bit_reversed( struct Range_decoder * const rdec, + Bit_model * const probability, unsigned * model, + unsigned * symbol, const int i ) + { + Rd_normalize( rdec ); + *model <<= 1; + const uint32_t bound = ( rdec->range >> bit_model_total_bits ) * *probability; + if( rdec->code < bound ) + { + rdec->range = bound; + *probability += ( bit_model_total - *probability ) >> bit_model_move_bits; + } + else + { + rdec->code -= bound; + rdec->range -= bound; + *probability -= *probability >> bit_model_move_bits; + *model |= 1; + *symbol |= 1 << i; + } + } + +static inline unsigned Rd_decode_tree6( struct Range_decoder * const rdec, + Bit_model bm[] ) + { + unsigned symbol = 1; + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + return symbol & 0x3F; + } + +static inline unsigned Rd_decode_tree8( struct Range_decoder * const rdec, + Bit_model bm[] ) + { + unsigned symbol = 1; + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + return symbol & 0xFF; + } + +static inline unsigned +Rd_decode_tree_reversed( struct Range_decoder * const rdec, + Bit_model bm[], 
const int num_bits ) + { + unsigned model = 1; + unsigned symbol = 0; + int i; + for( i = 0; i < num_bits; ++i ) + Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, i ); + return symbol; + } + +static inline unsigned +Rd_decode_tree_reversed4( struct Range_decoder * const rdec, Bit_model bm[] ) + { + unsigned model = 1; + unsigned symbol = 0; + Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 0 ); + Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 1 ); + Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 2 ); + Rd_decode_symbol_bit_reversed( rdec, &bm[model], &model, &symbol, 3 ); + return symbol; + } + +static inline unsigned Rd_decode_matched( struct Range_decoder * const rdec, + Bit_model bm[], unsigned match_byte ) + { + unsigned symbol = 1; + unsigned mask = 0x100; + while( true ) + { + const unsigned match_bit = ( match_byte <<= 1 ) & mask; + const unsigned bit = Rd_decode_bit( rdec, &bm[symbol+match_bit+mask] ); + symbol <<= 1; symbol += bit; + if( symbol > 0xFF ) return symbol & 0xFF; + mask &= ~(match_bit ^ (bit << 8)); /* if( match_bit != bit ) mask = 0; */ + } + } + +static inline unsigned Rd_decode_len( struct Range_decoder * const rdec, + struct Len_model * const lm, + const int pos_state ) + { + Bit_model * bm; + unsigned mask, offset, symbol = 1; + + if( Rd_decode_bit( rdec, &lm->choice1 ) == 0 ) + { bm = lm->bm_low[pos_state]; mask = 7; offset = 0; goto len3; } + if( Rd_decode_bit( rdec, &lm->choice2 ) == 0 ) + { bm = lm->bm_mid[pos_state]; mask = 7; offset = len_low_symbols; goto len3; } + bm = lm->bm_high; mask = 0xFF; offset = len_low_symbols + len_mid_symbols; + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); +len3: + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol 
); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + Rd_decode_symbol_bit( rdec, &bm[symbol], &symbol ); + return ( symbol & mask ) + min_match_len + offset; + } + + +struct LZ_decoder + { + unsigned long long partial_data_pos; + struct Range_decoder * rdec; + unsigned dictionary_size; + unsigned buffer_size; + uint8_t * buffer; /* output buffer */ + unsigned pos; /* current pos in buffer */ + unsigned stream_pos; /* first byte not yet written to file */ + uint32_t crc; + int outfd; /* output file descriptor */ + bool pos_wrapped; + }; + +void LZd_flush_data( struct LZ_decoder * const d ); + +unsigned seek_read_back( const int fd, uint8_t * const buf, const int size, + const int offset ); + +static inline uint8_t LZd_peek_prev( const struct LZ_decoder * const d ) + { return d->buffer[((d->pos > 0) ? d->pos : d->buffer_size)-1]; } + +static inline uint8_t LZd_peek( const struct LZ_decoder * const d, + const unsigned distance ) + { + uint8_t b; + if( d->pos > distance ) b = d->buffer[d->pos-distance-1]; + else if( d->buffer_size > distance ) + b = d->buffer[d->buffer_size+d->pos-distance-1]; + else if( seek_read_back( d->outfd, &b, 1, + distance + 1 + d->stream_pos - d->pos ) != 1 ) + { show_error( "Seek error", errno, false ); cleanup_and_fail( 1 ); } + return b; + } + +static inline void LZd_put_byte( struct LZ_decoder * const d, const uint8_t b ) + { + d->buffer[d->pos] = b; + if( ++d->pos >= d->buffer_size ) LZd_flush_data( d ); + } + +static inline void LZd_copy_block( struct LZ_decoder * const d, + const unsigned distance, unsigned len ) + { + unsigned lpos = d->pos, i = lpos - distance - 1; + bool fast, fast2; + if( lpos > distance ) + { + fast = ( len < d->buffer_size - lpos ); + fast2 = ( fast && len <= lpos - i ); + } + else + { + i += d->buffer_size; + fast = ( len < d->buffer_size - i ); /* (i == pos) may happen */ + fast2 = ( fast && len <= i - lpos ); + } + if( fast ) /* no wrap */ + { + d->pos += len; + if( fast2 ) /* no wrap, no overlap */ + 
memcpy( d->buffer + lpos, d->buffer + i, len ); + else + for( ; len > 0; --len ) d->buffer[lpos++] = d->buffer[i++]; + } + else for( ; len > 0; --len ) + { + d->buffer[d->pos] = d->buffer[i]; + if( ++d->pos >= d->buffer_size ) LZd_flush_data( d ); + if( ++i >= d->buffer_size ) i = 0; + } + } + +/* block is (at least partially) outside of the buffer */ +static inline void LZd_copy_block2( struct LZ_decoder * const d, + const unsigned distance, unsigned len ) + { + if( len < d->buffer_size - d->pos ) /* no wrap */ + { + const unsigned offset = distance + 1 + d->stream_pos - d->pos; + if( len <= offset ) /* block is in file */ + { + if( seek_read_back( d->outfd, d->buffer + d->pos, len, offset ) != len ) + { show_error( "Seek error", errno, false ); cleanup_and_fail( 1 ); } + d->pos += len; + return; + } + } + for( ; len > 0; --len ) + LZd_put_byte( d, LZd_peek( d, distance ) ); + } + +static inline bool LZd_init( struct LZ_decoder * const d, + struct Range_decoder * const rde, + const unsigned buffer_size, + const unsigned dict_size, const int ofd ) + { + d->partial_data_pos = 0; + d->rdec = rde; + d->dictionary_size = dict_size; + d->buffer_size = min( buffer_size, dict_size ); + d->buffer = (uint8_t *)malloc( d->buffer_size ); + if( !d->buffer ) return false; + d->pos = 0; + d->stream_pos = 0; + d->crc = 0xFFFFFFFFU; + d->outfd = ofd; + d->pos_wrapped = false; + /* prev_byte of first byte; also for LZd_peek( 0 ) on corrupt file */ + d->buffer[d->buffer_size-1] = 0; + return true; + } + +static inline void LZd_free( struct LZ_decoder * const d ) + { free( d->buffer ); } + +static inline unsigned LZd_crc( const struct LZ_decoder * const d ) + { return d->crc ^ 0xFFFFFFFFU; } + +static inline unsigned long long +LZd_data_position( const struct LZ_decoder * const d ) + { return d->partial_data_pos + d->pos; } + +int LZd_decode_member( struct LZ_decoder * const d, + struct Pretty_print * const pp ); diff --git a/doc/lunzip.1 b/doc/lunzip.1 new file mode 100644 index 
0000000..9aa0300 --- /dev/null +++ b/doc/lunzip.1 @@ -0,0 +1,97 @@ +.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.16. +.TH LUNZIP "1" "January 2022" "lunzip 1.13" "User Commands" +.SH NAME +lunzip \- decompressor for the lzip format +.SH SYNOPSIS +.B lunzip +[\fI\,options\/\fR] [\fI\,files\/\fR] +.SH DESCRIPTION +Lunzip is a decompressor for the lzip format written in C. Its small size +makes it well suited for embedded devices or software installers that need +to decompress files but don't need compression capabilities. Lunzip is fully +compatible with lzip 1.4 or newer. +.PP +Lunzip provides a 'low memory' mode able to decompress any file using as +little memory as 50 kB, irrespective of the dictionary size used to +compress the file. To activate it, specify the size of the output buffer +with the option \fB\-\-buffer\-size\fR and lunzip will use the decompressed +file as dictionary for distances beyond the buffer size. Of course, the +larger the difference between the buffer size and the dictionary size, the +more accesses to disk are needed and the slower the decompression is. +This 'low memory' mode only works when decompressing to a regular file +and is intended for systems without enough memory (RAM + swap) to keep +the whole dictionary at once. 
+.SH OPTIONS +.TP +\fB\-h\fR, \fB\-\-help\fR +display this help and exit +.TP +\fB\-V\fR, \fB\-\-version\fR +output version information and exit +.TP +\fB\-a\fR, \fB\-\-trailing\-error\fR +exit with error status if trailing data +.TP +\fB\-c\fR, \fB\-\-stdout\fR +write to standard output, keep input files +.TP +\fB\-d\fR, \fB\-\-decompress\fR +decompress (this is the default) +.TP +\fB\-f\fR, \fB\-\-force\fR +overwrite existing output files +.TP +\fB\-k\fR, \fB\-\-keep\fR +keep (don't delete) input files +.TP +\fB\-l\fR, \fB\-\-list\fR +print (un)compressed file sizes +.TP +\fB\-o\fR, \fB\-\-output=\fR<file> +write to <file>, keep input files +.TP +\fB\-q\fR, \fB\-\-quiet\fR +suppress all messages +.TP +\fB\-t\fR, \fB\-\-test\fR +test compressed file integrity +.TP +\fB\-u\fR, \fB\-\-buffer\-size=\fR<bytes> +set output buffer size in bytes +.TP +\fB\-v\fR, \fB\-\-verbose\fR +be verbose (a 2nd \fB\-v\fR gives more) +.TP +\fB\-\-loose\-trailing\fR +allow trailing data seeming corrupt header +.PP +If no file names are given, or if a file is '\-', lunzip decompresses +from standard input to standard output. +Numbers may be followed by a multiplier: k = kB = 10^3 = 1000, +Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc... +Buffer sizes 12 to 29 are interpreted as powers of two, meaning 2^12 +to 2^29 bytes. +.PP +To extract all the files from archive 'foo.tar.lz', use the commands +\&'tar \fB\-xf\fR foo.tar.lz' or 'lunzip \fB\-cd\fR foo.tar.lz | tar \fB\-xf\fR \-'. +.PP +Exit status: 0 for a normal exit, 1 for environmental problems (file +not found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or +invalid input file, 3 for an internal consistency error (e.g., bug) which +caused lunzip to panic. +.PP +The ideas embodied in lunzip are due to (at least) the following people: +Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for the +definition of Markov chains), G.N.N.
Martin (for the definition of range +encoding), Igor Pavlov (for putting all the above together in LZMA), and +Julian Seward (for bzip2's CLI). +.SH "REPORTING BUGS" +Report bugs to lzip\-bug@nongnu.org +.br +Lunzip home page: http://www.nongnu.org/lzip/lunzip.html +.SH COPYRIGHT +Copyright \(co 2022 Antonio Diaz Diaz. +License GPLv2+: GNU GPL version 2 or later +.br +This is free software: you are free to change and redistribute it. +There is NO WARRANTY, to the extent permitted by law. diff --git a/list.c b/list.c new file mode 100644 index 0000000..33de75c --- /dev/null +++ b/list.c @@ -0,0 +1,111 @@ +/* Lunzip - Decompressor for the lzip format + Copyright (C) 2010-2022 Antonio Diaz Diaz. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . 
+*/ + +#define _FILE_OFFSET_BITS 64 + +#include +#include +#include +#include +#include +#include + +#include "lzip.h" +#include "lzip_index.h" + + +static void list_line( const unsigned long long uncomp_size, + const unsigned long long comp_size, + const char * const input_filename ) + { + if( uncomp_size > 0 ) + printf( "%14llu %14llu %6.2f%% %s\n", uncomp_size, comp_size, + 100.0 - ( ( 100.0 * comp_size ) / uncomp_size ), + input_filename ); + else + printf( "%14llu %14llu -INF%% %s\n", uncomp_size, comp_size, + input_filename ); + } + + +int list_files( const char * const filenames[], const int num_filenames, + const bool ignore_trailing, const bool loose_trailing ) + { + unsigned long long total_comp = 0, total_uncomp = 0; + int files = 0, retval = 0; + int i; + bool first_post = true; + bool stdin_used = false; + for( i = 0; i < num_filenames; ++i ) + { + const bool from_stdin = ( strcmp( filenames[i], "-" ) == 0 ); + if( from_stdin ) { if( stdin_used ) continue; else stdin_used = true; } + const char * const input_filename = from_stdin ? "(stdin)" : filenames[i]; + struct stat in_stats; /* not used */ + const int infd = from_stdin ? 
STDIN_FILENO : + open_instream( input_filename, &in_stats, false, true ); + if( infd < 0 ) { set_retval( &retval, 1 ); continue; } + + struct Lzip_index lzip_index; + Li_init( &lzip_index, infd, ignore_trailing, loose_trailing ); + close( infd ); + if( lzip_index.retval != 0 ) + { + show_file_error( input_filename, lzip_index.error, 0 ); + set_retval( &retval, lzip_index.retval ); + Li_free( &lzip_index ); continue; + } + if( verbosity < 0 ) { Li_free( &lzip_index ); continue; } + const unsigned long long udata_size = Li_udata_size( &lzip_index ); + const unsigned long long cdata_size = Li_cdata_size( &lzip_index ); + total_comp += cdata_size; total_uncomp += udata_size; ++files; + const long members = lzip_index.members; + if( first_post ) + { + first_post = false; + if( verbosity >= 1 ) fputs( " dict memb trail ", stdout ); + fputs( " uncompressed compressed saved name\n", stdout ); + } + if( verbosity >= 1 ) + printf( "%s %5ld %6lld ", format_ds( lzip_index.dictionary_size ), + members, Li_file_size( &lzip_index ) - cdata_size ); + list_line( udata_size, cdata_size, input_filename ); + + if( verbosity >= 2 && members > 1 ) + { + long i; + fputs( " member data_pos data_size member_pos member_size\n", stdout ); + for( i = 0; i < members; ++i ) + { + const struct Block * db = Li_dblock( &lzip_index, i ); + const struct Block * mb = Li_mblock( &lzip_index, i ); + printf( "%6ld %14llu %14llu %14llu %14llu\n", + i + 1, db->pos, db->size, mb->pos, mb->size ); + } + first_post = true; /* reprint heading after list of members */ + } + fflush( stdout ); + Li_free( &lzip_index ); + } + if( verbosity >= 0 && files > 1 ) + { + if( verbosity >= 1 ) fputs( " ", stdout ); + list_line( total_uncomp, total_comp, "(totals)" ); + fflush( stdout ); + } + return retval; + } diff --git a/lzip.h b/lzip.h new file mode 100644 index 0000000..4b77be8 --- /dev/null +++ b/lzip.h @@ -0,0 +1,278 @@ +/* Lunzip - Decompressor for the lzip format + Copyright (C) 2010-2022 Antonio Diaz Diaz. 
+ + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . +*/ + +#ifndef max + #define max(x,y) ((x) >= (y) ? (x) : (y)) +#endif +#ifndef min + #define min(x,y) ((x) <= (y) ? (x) : (y)) +#endif + +typedef int State; + +enum { states = 12 }; + +static inline bool St_is_char( const State st ) { return st < 7; } + +static inline State St_set_char( const State st ) + { + static const State next[states] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 }; + return next[st]; + } + +static inline State St_set_match( const State st ) + { return ( ( st < 7 ) ? 7 : 10 ); } + +static inline State St_set_rep( const State st ) + { return ( ( st < 7 ) ? 8 : 11 ); } + +static inline State St_set_short_rep( const State st ) + { return ( ( st < 7 ) ? 
9 : 11 ); } + + +enum { + min_dictionary_bits = 12, + min_dictionary_size = 1 << min_dictionary_bits, /* >= modeled_distances */ + max_dictionary_bits = 29, + max_dictionary_size = 1 << max_dictionary_bits, + min_member_size = 36, + literal_context_bits = 3, + literal_pos_state_bits = 0, /* not used */ + pos_state_bits = 2, + pos_states = 1 << pos_state_bits, + pos_state_mask = pos_states - 1, + + len_states = 4, + dis_slot_bits = 6, + start_dis_model = 4, + end_dis_model = 14, + modeled_distances = 1 << (end_dis_model / 2), /* 128 */ + dis_align_bits = 4, + dis_align_size = 1 << dis_align_bits, + + len_low_bits = 3, + len_mid_bits = 3, + len_high_bits = 8, + len_low_symbols = 1 << len_low_bits, + len_mid_symbols = 1 << len_mid_bits, + len_high_symbols = 1 << len_high_bits, + max_len_symbols = len_low_symbols + len_mid_symbols + len_high_symbols, + + min_match_len = 2, /* must be 2 */ + max_match_len = min_match_len + max_len_symbols - 1, /* 273 */ + min_match_len_limit = 5 }; + +static inline int get_len_state( const int len ) + { return min( len - min_match_len, len_states - 1 ); } + +static inline int get_lit_state( const uint8_t prev_byte ) + { return prev_byte >> ( 8 - literal_context_bits ); } + + +enum { bit_model_move_bits = 5, + bit_model_total_bits = 11, + bit_model_total = 1 << bit_model_total_bits }; + +typedef int Bit_model; + +static inline void Bm_init( Bit_model * const probability ) + { *probability = bit_model_total / 2; } + +static inline void Bm_array_init( Bit_model bm[], const int size ) + { int i; for( i = 0; i < size; ++i ) Bm_init( &bm[i] ); } + +struct Len_model + { + Bit_model choice1; + Bit_model choice2; + Bit_model bm_low[pos_states][len_low_symbols]; + Bit_model bm_mid[pos_states][len_mid_symbols]; + Bit_model bm_high[len_high_symbols]; + }; + +static inline void Lm_init( struct Len_model * const lm ) + { + Bm_init( &lm->choice1 ); + Bm_init( &lm->choice2 ); + Bm_array_init( lm->bm_low[0], pos_states * len_low_symbols ); + 
Bm_array_init( lm->bm_mid[0], pos_states * len_mid_symbols ); + Bm_array_init( lm->bm_high, len_high_symbols ); + } + + +typedef uint32_t CRC32[256]; /* Table of CRCs of all 8-bit messages. */ + +extern CRC32 crc32; + +static inline void CRC32_init( void ) + { + unsigned n; + for( n = 0; n < 256; ++n ) + { + unsigned c = n; + int k; + for( k = 0; k < 8; ++k ) + { if( c & 1 ) c = 0xEDB88320U ^ ( c >> 1 ); else c >>= 1; } + crc32[n] = c; + } + } + +/* about as fast as it is possible without messing with endianness */ +static inline void CRC32_update_buf( uint32_t * const crc, + const uint8_t * const buffer, + const int size ) + { + int i; + uint32_t c = *crc; + for( i = 0; i < size; ++i ) + c = crc32[(c^buffer[i])&0xFF] ^ ( c >> 8 ); + *crc = c; + } + + +static inline bool isvalid_ds( const unsigned dictionary_size ) + { return ( dictionary_size >= min_dictionary_size && + dictionary_size <= max_dictionary_size ); } + + +static const uint8_t lzip_magic[4] = { 0x4C, 0x5A, 0x49, 0x50 }; /* "LZIP" */ + +typedef uint8_t Lzip_header[6]; /* 0-3 magic bytes */ + /* 4 version */ + /* 5 coded dictionary size */ +enum { Lh_size = 6 }; + +static inline bool Lh_verify_magic( const Lzip_header data ) + { return ( memcmp( data, lzip_magic, 4 ) == 0 ); } + +/* detect (truncated) header */ +static inline bool Lh_verify_prefix( const Lzip_header data, const int sz ) + { + int i; for( i = 0; i < sz && i < 4; ++i ) + if( data[i] != lzip_magic[i] ) return false; + return ( sz > 0 ); + } + +/* detect corrupt header */ +static inline bool Lh_verify_corrupt( const Lzip_header data ) + { + int matches = 0; + int i; for( i = 0; i < 4; ++i ) + if( data[i] == lzip_magic[i] ) ++matches; + return ( matches > 1 && matches < 4 ); + } + +static inline uint8_t Lh_version( const Lzip_header data ) + { return data[4]; } + +static inline bool Lh_verify_version( const Lzip_header data ) + { return ( data[4] == 1 ); } + +static inline unsigned Lh_get_dictionary_size( const Lzip_header data ) + { + 
unsigned sz = ( 1 << ( data[5] & 0x1F ) ); + if( sz > min_dictionary_size ) + sz -= ( sz / 16 ) * ( ( data[5] >> 5 ) & 7 ); + return sz; + } + +static inline bool Lh_verify( const Lzip_header data ) + { + return Lh_verify_magic( data ) && Lh_verify_version( data ) && + isvalid_ds( Lh_get_dictionary_size( data ) ); + } + + +typedef uint8_t Lzip_trailer[20]; + /* 0-3 CRC32 of the uncompressed data */ + /* 4-11 size of the uncompressed data */ + /* 12-19 member size including header and trailer */ +enum { Lt_size = 20 }; + +static inline unsigned Lt_get_data_crc( const Lzip_trailer data ) + { + unsigned tmp = 0; + int i; for( i = 3; i >= 0; --i ) { tmp <<= 8; tmp += data[i]; } + return tmp; + } + +static inline unsigned long long Lt_get_data_size( const Lzip_trailer data ) + { + unsigned long long tmp = 0; + int i; for( i = 11; i >= 4; --i ) { tmp <<= 8; tmp += data[i]; } + return tmp; + } + +static inline unsigned long long Lt_get_member_size( const Lzip_trailer data ) + { + unsigned long long tmp = 0; + int i; for( i = 19; i >= 12; --i ) { tmp <<= 8; tmp += data[i]; } + return tmp; + } + +/* check internal consistency */ +static inline bool Lt_verify_consistency( const Lzip_trailer data ) + { + const unsigned crc = Lt_get_data_crc( data ); + const unsigned long long dsize = Lt_get_data_size( data ); + if( ( crc == 0 ) != ( dsize == 0 ) ) return false; + const unsigned long long msize = Lt_get_member_size( data ); + if( msize < min_member_size ) return false; + const unsigned long long mlimit = ( 9 * dsize + 7 ) / 8 + min_member_size; + if( mlimit > dsize && msize > mlimit ) return false; + const unsigned long long dlimit = 7090 * ( msize - 26 ) - 1; + if( dlimit > msize && dsize > dlimit ) return false; + return true; + } + + +static inline void set_retval( int * retval, const int new_val ) + { if( *retval < new_val ) *retval = new_val; } + +static const char * const bad_magic_msg = "Bad magic number (file not in lzip format)."; +static const char * const 
bad_dict_msg = "Invalid dictionary size in member header."; +static const char * const corrupt_mm_msg = "Corrupt header in multimember file."; +static const char * const trailing_msg = "Trailing data not allowed."; +static const char * const mem_msg = "Not enough memory."; + +/* defined in decoder.c */ +int readblock( const int fd, uint8_t * const buf, const int size ); + +/* defined in list.c */ +int list_files( const char * const filenames[], const int num_filenames, + const bool ignore_trailing, const bool loose_trailing ); + +/* defined in main.c */ +struct stat; +struct Pretty_print; +extern int verbosity; +void * resize_buffer( void * buf, const unsigned min_size ); +void Pp_show_msg( struct Pretty_print * const pp, const char * const msg ); +const char * bad_version( const unsigned version ); +const char * format_ds( const unsigned dictionary_size ); +void show_header( const unsigned dictionary_size ); +int open_instream( const char * const name, struct stat * const in_statsp, + const bool one_to_one, const bool reg_only ); +void cleanup_and_fail( const int retval ); +void show_error( const char * const msg, const int errcode, const bool help ); +void show_file_error( const char * const filename, const char * const msg, + const int errcode ); +struct Range_decoder; +void show_dprogress( const unsigned long long cfile_size, + const unsigned long long partial_size, + const struct Range_decoder * const d, + struct Pretty_print * const p ); diff --git a/lzip_index.c b/lzip_index.c new file mode 100644 index 0000000..559fd7a --- /dev/null +++ b/lzip_index.c @@ -0,0 +1,283 @@ +/* Lunzip - Decompressor for the lzip format + Copyright (C) 2010-2022 Antonio Diaz Diaz. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. 
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . +*/ + +#define _FILE_OFFSET_BITS 64 + +#include +#include +#include +#include +#include +#include +#include + +#include "lzip.h" +#include "lzip_index.h" + + +static int seek_read( const int fd, uint8_t * const buf, const int size, + const long long pos ) + { + if( lseek( fd, pos, SEEK_SET ) == pos ) + return readblock( fd, buf, size ); + return 0; + } + + +static bool add_error( struct Lzip_index * const li, const char * const msg ) + { + const int len = strlen( msg ); + void * tmp = resize_buffer( li->error, li->error_size + len + 1 ); + if( !tmp ) return false; + li->error = (char *)tmp; + strncpy( li->error + li->error_size, msg, len + 1 ); + li->error_size += len; + return true; + } + + +static bool push_back_member( struct Lzip_index * const li, + const long long dp, const long long ds, + const long long mp, const long long ms, + const unsigned dict_size ) + { + struct Member * p; + void * tmp = resize_buffer( li->member_vector, + ( li->members + 1 ) * sizeof li->member_vector[0] ); + if( !tmp ) { add_error( li, mem_msg ); li->retval = 1; return false; } + li->member_vector = (struct Member *)tmp; + p = &(li->member_vector[li->members]); + init_member( p, dp, ds, mp, ms, dict_size ); + ++li->members; + return true; + } + + +static void Li_free_member_vector( struct Lzip_index * const li ) + { + if( li->member_vector ) + { free( li->member_vector ); li->member_vector = 0; } + li->members = 0; + } + + +static void Li_reverse_member_vector( struct Lzip_index * const li ) + { + struct Member tmp; + long i; + for( i = 0; i < li->members / 2; ++i ) + { + tmp = li->member_vector[i]; + li->member_vector[i] = 
li->member_vector[li->members-i-1]; + li->member_vector[li->members-i-1] = tmp; + } + } + + +static bool Li_check_header_error( struct Lzip_index * const li, + const Lzip_header header ) + { + if( !Lh_verify_magic( header ) ) + { add_error( li, bad_magic_msg ); li->retval = 2; return true; } + if( !Lh_verify_version( header ) ) + { add_error( li, bad_version( Lh_version( header ) ) ); li->retval = 2; + return true; } + if( !isvalid_ds( Lh_get_dictionary_size( header ) ) ) + { add_error( li, bad_dict_msg ); li->retval = 2; return true; } + return false; + } + +static void Li_set_errno_error( struct Lzip_index * const li, + const char * const msg ) + { + add_error( li, msg ); add_error( li, strerror( errno ) ); + li->retval = 1; + } + +static void Li_set_num_error( struct Lzip_index * const li, + const char * const msg, unsigned long long num ) + { + char buf[80]; + snprintf( buf, sizeof buf, "%s%llu", msg, num ); + add_error( li, buf ); + li->retval = 2; + } + + +static bool Li_read_header( struct Lzip_index * const li, const int fd, + Lzip_header header, const long long pos ) + { + if( seek_read( fd, header, Lh_size, pos ) != Lh_size ) + { Li_set_errno_error( li, "Error reading member header: " ); return false; } + return true; + } + + +/* If successful, push last member and set pos to member header. 
*/ +static bool Li_skip_trailing_data( struct Lzip_index * const li, const int fd, + unsigned long long * const pos, + const bool ignore_trailing, + const bool loose_trailing ) + { + if( *pos < min_member_size ) return false; + enum { block_size = 16384, + buffer_size = block_size + Lt_size - 1 + Lh_size }; + uint8_t buffer[buffer_size]; + int bsize = *pos % block_size; /* total bytes in buffer */ + if( bsize <= buffer_size - block_size ) bsize += block_size; + int search_size = bsize; /* bytes to search for trailer */ + int rd_size = bsize; /* bytes to read from file */ + unsigned long long ipos = *pos - rd_size; /* aligned to block_size */ + + while( true ) + { + if( seek_read( fd, buffer, rd_size, ipos ) != rd_size ) + { Li_set_errno_error( li, "Error seeking member trailer: " ); return false; } + const uint8_t max_msb = ( ipos + search_size ) >> 56; + int i; + for( i = search_size; i >= Lt_size; --i ) + if( buffer[i-1] <= max_msb ) /* most significant byte of member_size */ + { + const Lzip_trailer * const trailer = + (const Lzip_trailer *)( buffer + i - Lt_size ); + const unsigned long long member_size = Lt_get_member_size( *trailer ); + if( member_size == 0 ) /* skip trailing zeros */ + { while( i > Lt_size && buffer[i-9] == 0 ) --i; continue; } + if( member_size > ipos + i || !Lt_verify_consistency( *trailer ) ) + continue; + Lzip_header header; + if( !Li_read_header( li, fd, header, ipos + i - member_size ) ) + return false; + if( !Lh_verify( header ) ) continue; + const Lzip_header * header2 = (const Lzip_header *)( buffer + i ); + const bool full_h2 = bsize - i >= Lh_size; + if( Lh_verify_prefix( *header2, bsize - i ) ) /* last member */ + { + if( !full_h2 ) add_error( li, "Last member in input file is truncated." ); + else if( !Li_check_header_error( li, *header2 ) ) + add_error( li, "Last member in input file is truncated or corrupt." 
); + li->retval = 2; return false; + } + if( !loose_trailing && full_h2 && Lh_verify_corrupt( *header2 ) ) + { add_error( li, corrupt_mm_msg ); li->retval = 2; return false; } + if( !ignore_trailing ) + { add_error( li, trailing_msg ); li->retval = 2; return false; } + *pos = ipos + i - member_size; + const unsigned dictionary_size = Lh_get_dictionary_size( header ); + if( li->dictionary_size < dictionary_size ) + li->dictionary_size = dictionary_size; + return push_back_member( li, 0, Lt_get_data_size( *trailer ), *pos, + member_size, dictionary_size ); + } + if( ipos == 0 ) + { Li_set_num_error( li, "Bad trailer at pos ", *pos - Lt_size ); + return false; } + bsize = buffer_size; + search_size = bsize - Lh_size; + rd_size = block_size; + ipos -= rd_size; + memcpy( buffer + rd_size, buffer, buffer_size - rd_size ); + } + } + + +bool Li_init( struct Lzip_index * const li, const int infd, + const bool ignore_trailing, const bool loose_trailing ) + { + li->member_vector = 0; + li->error = 0; + li->insize = lseek( infd, 0, SEEK_END ); + li->members = 0; + li->error_size = 0; + li->retval = 0; + li->dictionary_size = 0; + if( li->insize < 0 ) + { Li_set_errno_error( li, "Input file is not seekable: " ); return false; } + if( li->insize < min_member_size ) + { add_error( li, "Input file is too short." ); li->retval = 2; + return false; } + if( li->insize > INT64_MAX ) + { add_error( li, "Input file is too long (2^63 bytes or more)." 
); + li->retval = 2; return false; } + + Lzip_header header; + if( !Li_read_header( li, infd, header, 0 ) ) return false; + if( Li_check_header_error( li, header ) ) return false; + + unsigned long long pos = li->insize; /* always points to a header or to EOF */ + while( pos >= min_member_size ) + { + Lzip_trailer trailer; + if( seek_read( infd, trailer, Lt_size, pos - Lt_size ) != Lt_size ) + { Li_set_errno_error( li, "Error reading member trailer: " ); break; } + const unsigned long long member_size = Lt_get_member_size( trailer ); + if( member_size > pos || !Lt_verify_consistency( trailer ) ) + { /* bad trailer */ + if( li->members <= 0 ) + { if( Li_skip_trailing_data( li, infd, &pos, ignore_trailing, + loose_trailing ) ) continue; else return false; } + Li_set_num_error( li, "Bad trailer at pos ", pos - Lt_size ); + break; + } + if( !Li_read_header( li, infd, header, pos - member_size ) ) break; + if( !Lh_verify( header ) ) /* bad header */ + { + if( li->members <= 0 ) + { if( Li_skip_trailing_data( li, infd, &pos, ignore_trailing, + loose_trailing ) ) continue; else return false; } + Li_set_num_error( li, "Bad header at pos ", pos - member_size ); + break; + } + pos -= member_size; + const unsigned dictionary_size = Lh_get_dictionary_size( header ); + if( li->dictionary_size < dictionary_size ) + li->dictionary_size = dictionary_size; + if( !push_back_member( li, 0, Lt_get_data_size( trailer ), pos, + member_size, dictionary_size ) ) + return false; + } + if( pos != 0 || li->members <= 0 ) + { + Li_free_member_vector( li ); + if( li->retval == 0 ) + { add_error( li, "Can't create file index." ); li->retval = 2; } + return false; + } + Li_reverse_member_vector( li ); + long i; + for( i = 0; ; ++i ) + { + const long long end = block_end( li->member_vector[i].dblock ); + if( end < 0 || end > INT64_MAX ) + { + Li_free_member_vector( li ); + add_error( li, "Data in input file is too long (2^63 bytes or more)." 
); + li->retval = 2; return false; + } + if( i + 1 >= li->members ) break; + li->member_vector[i+1].dblock.pos = end; + } + return true; + } + + +void Li_free( struct Lzip_index * const li ) + { + Li_free_member_vector( li ); + if( li->error ) { free( li->error ); li->error = 0; } + li->error_size = 0; + } diff --git a/lzip_index.h b/lzip_index.h new file mode 100644 index 0000000..0938533 --- /dev/null +++ b/lzip_index.h @@ -0,0 +1,91 @@ +/* Lunzip - Decompressor for the lzip format + Copyright (C) 2010-2022 Antonio Diaz Diaz. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>.
+*/ + +#ifndef INT64_MAX +#define INT64_MAX 0x7FFFFFFFFFFFFFFFLL +#endif + + +struct Block + { + long long pos, size; /* pos + size <= INT64_MAX */ + }; + +static inline void init_block( struct Block * const b, + const long long p, const long long s ) + { b->pos = p; b->size = s; } + +static inline long long block_end( const struct Block b ) + { return b.pos + b.size; } + + +struct Member + { + struct Block dblock, mblock; /* data block, member block */ + unsigned dictionary_size; + }; + +static inline void init_member( struct Member * const m, + const long long dp, const long long ds, + const long long mp, const long long ms, + const unsigned dict_size ) + { init_block( &m->dblock, dp, ds ); init_block( &m->mblock, mp, ms ); + m->dictionary_size = dict_size; } + +struct Lzip_index + { + struct Member * member_vector; + char * error; + long long insize; + long members; + int error_size; + int retval; + unsigned dictionary_size; /* largest dictionary size in the file */ + }; + +bool Li_init( struct Lzip_index * const li, const int infd, + const bool ignore_trailing, const bool loose_trailing ); + +void Li_free( struct Lzip_index * const li ); + +static inline long long Li_udata_size( const struct Lzip_index * const li ) + { + if( li->members <= 0 ) return 0; + return block_end( li->member_vector[li->members-1].dblock ); + } + +static inline long long Li_cdata_size( const struct Lzip_index * const li ) + { + if( li->members <= 0 ) return 0; + return block_end( li->member_vector[li->members-1].mblock ); + } + + /* total size including trailing data (if any) */ +static inline long long Li_file_size( const struct Lzip_index * const li ) + { if( li->insize >= 0 ) return li->insize; else return 0; } + +static inline const struct Block * Li_dblock( const struct Lzip_index * const li, + const long i ) + { return &li->member_vector[i].dblock; } + +static inline const struct Block * Li_mblock( const struct Lzip_index * const li, + const long i ) + { return 
&li->member_vector[i].mblock; } + +static inline unsigned Li_dictionary_size( const struct Lzip_index * const li, + const long i ) + { return li->member_vector[i].dictionary_size; } diff --git a/main.c b/main.c new file mode 100644 index 0000000..73e29b4 --- /dev/null +++ b/main.c @@ -0,0 +1,948 @@ +/* Lunzip - Decompressor for the lzip format + Copyright (C) 2010-2022 Antonio Diaz Diaz. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. +*/ +/* + Exit status: 0 for a normal exit, 1 for environmental problems + (file not found, invalid flags, I/O errors, etc), 2 to indicate a + corrupt or invalid input file, 3 for an internal consistency error + (e.g., bug) which caused lunzip to panic. +*/ + +#define _FILE_OFFSET_BITS 64 + +#include <ctype.h> +#include <errno.h> +#include <fcntl.h> +#include <limits.h> +#include <signal.h> +#include <stdbool.h> +#include <stdint.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <utime.h> +#include <sys/stat.h> +#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__ +#include <io.h> +#if defined __MSVCRT__ +#define fchmod(x,y) 0 +#define fchown(x,y,z) 0 +#define SIGHUP SIGTERM +#define S_ISSOCK(x) 0 +#ifndef S_IRGRP +#define S_IRGRP 0 +#define S_IWGRP 0 +#define S_IROTH 0 +#define S_IWOTH 0 +#endif +#endif +#if defined __DJGPP__ +#define S_ISSOCK(x) 0 +#define S_ISVTX 0 +#endif +#endif + +#include "carg_parser.h" +#include "lzip.h" +#include "decoder.h" + +#ifndef O_BINARY +#define O_BINARY 0 +#endif + +#if CHAR_BIT != 8 +#error "Environments where CHAR_BIT != 8 are not supported."
+#endif + +#if ( defined SIZE_MAX && SIZE_MAX < UINT_MAX ) || \ + ( defined SSIZE_MAX && SSIZE_MAX < INT_MAX ) +#error "Environments where 'size_t' is narrower than 'int' are not supported." +#endif + +int verbosity = 0; + +static const char * const program_name = "lunzip"; +static const char * const program_year = "2022"; +static const char * invocation_name = "lunzip"; /* default value */ + +static const struct { const char * from; const char * to; } known_extensions[] = { + { ".lz", "" }, + { ".tlz", ".tar" }, + { 0, 0 } }; + +enum Mode { m_compress, m_decompress, m_list, m_test }; + +/* Variables used in signal handler context. + They are not declared volatile because the handler never returns. */ +static char * output_filename = 0; +static int outfd = -1; +static bool delete_output_on_interrupt = false; + + +static void show_help( void ) + { + printf( "Lunzip is a decompressor for the lzip format written in C. Its small size\n" + "makes it well suited for embedded devices or software installers that need\n" + "to decompress files but don't need compression capabilities. Lunzip is fully\n" + "compatible with lzip 1.4 or newer.\n" + "\nLunzip provides a 'low memory' mode able to decompress any file using as\n" + "little memory as 50 kB, irrespective of the dictionary size used to\n" + "compress the file. To activate it, specify the size of the output buffer\n" + "with the option --buffer-size and lunzip will use the decompressed\n" + "file as dictionary for distances beyond the buffer size. 
Of course, the\n" + "larger the difference between the buffer size and the dictionary size, the\n" + "more accesses to disk are needed and the slower the decompression is.\n" + "This 'low memory' mode only works when decompressing to a regular file\n" + "and is intended for systems without enough memory (RAM + swap) to keep\n" + "the whole dictionary at once.\n" + "\nUsage: %s [options] [files]\n", invocation_name ); + printf( "\nOptions:\n" + " -h, --help display this help and exit\n" + " -V, --version output version information and exit\n" + " -a, --trailing-error exit with error status if trailing data\n" + " -c, --stdout write to standard output, keep input files\n" + " -d, --decompress decompress (this is the default)\n" + " -f, --force overwrite existing output files\n" + " -k, --keep keep (don't delete) input files\n" + " -l, --list print (un)compressed file sizes\n" + " -o, --output=<file> write to <file>, keep input files\n" + " -q, --quiet suppress all messages\n" + " -t, --test test compressed file integrity\n" + " -u, --buffer-size=<bytes> set output buffer size in bytes\n" + " -v, --verbose be verbose (a 2nd -v gives more)\n" + " --loose-trailing allow trailing data seeming corrupt header\n" + "\nIf no file names are given, or if a file is '-', lunzip decompresses\n" + "from standard input to standard output.\n" + "Numbers may be followed by a multiplier: k = kB = 10^3 = 1000,\n" + "Ki = KiB = 2^10 = 1024, M = 10^6, Mi = 2^20, G = 10^9, Gi = 2^30, etc...\n" + "Buffer sizes 12 to 29 are interpreted as powers of two, meaning 2^12\n" + "to 2^29 bytes.\n" + "\nTo extract all the files from archive 'foo.tar.lz', use the commands\n" + "'tar -xf foo.tar.lz' or 'lunzip -cd foo.tar.lz | tar -xf -'.\n" + "\nExit status: 0 for a normal exit, 1 for environmental problems (file\n" + "not found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or\n" + "invalid input file, 3 for an internal consistency error (e.g., bug) which\n" + "caused lunzip to panic.\n" + "\nThe ideas
embodied in lunzip are due to (at least) the following people:\n" + "Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for the\n" + "definition of Markov chains), G.N.N. Martin (for the definition of range\n" + "encoding), Igor Pavlov (for putting all the above together in LZMA), and\n" + "Julian Seward (for bzip2's CLI).\n" + "\nReport bugs to lzip-bug@nongnu.org\n" + "Lunzip home page: http://www.nongnu.org/lzip/lunzip.html\n" ); + } + + +static void show_version( void ) + { + printf( "%s %s\n", program_name, PROGVERSION ); + printf( "Copyright (C) %s Antonio Diaz Diaz.\n", program_year ); + printf( "License GPLv2+: GNU GPL version 2 or later \n" + "This is free software: you are free to change and redistribute it.\n" + "There is NO WARRANTY, to the extent permitted by law.\n" ); + } + + +/* assure at least a minimum size for buffer 'buf' */ +void * resize_buffer( void * buf, const unsigned min_size ) + { + if( buf ) buf = realloc( buf, min_size ); + else buf = malloc( min_size ); + if( !buf ) { show_error( mem_msg, 0, false ); cleanup_and_fail( 1 ); } + return buf; + } + + +struct Pretty_print + { + const char * name; + char * padded_name; + const char * stdin_name; + unsigned longest_name; + bool first_post; + }; + +static void Pp_init( struct Pretty_print * const pp, + const char * const filenames[], const int num_filenames ) + { + pp->name = 0; + pp->padded_name = 0; + pp->stdin_name = "(stdin)"; + pp->longest_name = 0; + pp->first_post = false; + + if( verbosity <= 0 ) return; + const unsigned stdin_name_len = strlen( pp->stdin_name ); + int i; + for( i = 0; i < num_filenames; ++i ) + { + const char * const s = filenames[i]; + const unsigned len = (strcmp( s, "-" ) == 0) ? 
stdin_name_len : strlen( s ); + if( pp->longest_name < len ) pp->longest_name = len; + } + if( pp->longest_name == 0 ) pp->longest_name = stdin_name_len; + } + +static void Pp_set_name( struct Pretty_print * const pp, + const char * const filename ) + { + unsigned name_len, padded_name_len, i = 0; + + if( filename && filename[0] && strcmp( filename, "-" ) != 0 ) + pp->name = filename; + else pp->name = pp->stdin_name; + name_len = strlen( pp->name ); + padded_name_len = max( name_len, pp->longest_name ) + 4; + pp->padded_name = resize_buffer( pp->padded_name, padded_name_len + 1 ); + while( i < 2 ) pp->padded_name[i++] = ' '; + while( i < name_len + 2 ) { pp->padded_name[i] = pp->name[i-2]; ++i; } + pp->padded_name[i++] = ':'; + while( i < padded_name_len ) pp->padded_name[i++] = ' '; + pp->padded_name[i] = 0; + pp->first_post = true; + } + +static void Pp_reset( struct Pretty_print * const pp ) + { if( pp->name && pp->name[0] ) pp->first_post = true; } + +void Pp_show_msg( struct Pretty_print * const pp, const char * const msg ) + { + if( verbosity < 0 ) return; + if( pp->first_post ) + { + pp->first_post = false; + fputs( pp->padded_name, stderr ); + if( !msg ) fflush( stderr ); + } + if( msg ) fprintf( stderr, "%s\n", msg ); + } + + +const char * bad_version( const unsigned version ) + { + static char buf[80]; + snprintf( buf, sizeof buf, "Version %u member format not supported.", + version ); + return buf; + } + + +const char * format_ds( const unsigned dictionary_size ) + { + enum { bufsize = 16, factor = 1024 }; + static char buf[bufsize]; + const char * const prefix[8] = + { "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi" }; + const char * p = ""; + const char * np = " "; + unsigned num = dictionary_size; + bool exact = ( num % factor == 0 ); + + int i; for( i = 0; i < 8 && ( num > 9999 || ( exact && num >= factor ) ); ++i ) + { num /= factor; if( num % factor != 0 ) exact = false; + p = prefix[i]; np = ""; } + snprintf( buf, bufsize, "%s%4u %sB", np, num, p 
); + return buf; + } + + +void show_header( const unsigned dictionary_size ) + { + fprintf( stderr, "dict %s, ", format_ds( dictionary_size ) ); + } + + +/* separate large numbers >= 100_000 in groups of 3 digits using '_' */ +static const char * format_num3( unsigned long long num ) + { + const char * const si_prefix = "kMGTPEZY"; + const char * const binary_prefix = "KMGTPEZY"; + enum { buffers = 8, bufsize = 4 * sizeof (long long) }; + static char buffer[buffers][bufsize]; /* circle of static buffers for printf */ + static int current = 0; + int i; + char * const buf = buffer[current++]; current %= buffers; + char * p = buf + bufsize - 1; /* fill the buffer backwards */ + *p = 0; /* terminator */ + if( num > 1024 ) + { + char prefix = 0; /* try binary first, then si */ + for( i = 0; i < 8 && num >= 1024 && num % 1024 == 0; ++i ) + { num /= 1024; prefix = binary_prefix[i]; } + if( prefix ) *(--p) = 'i'; + else + for( i = 0; i < 8 && num >= 1000 && num % 1000 == 0; ++i ) + { num /= 1000; prefix = si_prefix[i]; } + if( prefix ) *(--p) = prefix; + } + const bool split = num >= 100000; + + for( i = 0; ; ) + { + *(--p) = num % 10 + '0'; num /= 10; if( num == 0 ) break; + if( split && ++i >= 3 ) { i = 0; *(--p) = '_'; } + } + return p; + } + + +static unsigned long getnum( const char * const arg, + const char * const option_name, + const unsigned long llimit, + const unsigned long ulimit ) + { + char * tail; + errno = 0; + unsigned long long result = strtoul( arg, &tail, 0 ); + if( tail == arg ) + { + if( verbosity >= 0 ) + fprintf( stderr, "%s: Bad or missing numerical argument in " + "option '%s'.\n", program_name, option_name ); + exit( 1 ); + } + + if( !errno && tail[0] ) + { + const unsigned factor = ( tail[1] == 'i' ) ? 
1024 : 1000; + int exponent = 0; /* 0 = bad multiplier */ + int i; + switch( tail[0] ) + { + case 'Y': exponent = 8; break; + case 'Z': exponent = 7; break; + case 'E': exponent = 6; break; + case 'P': exponent = 5; break; + case 'T': exponent = 4; break; + case 'G': exponent = 3; break; + case 'M': exponent = 2; break; + case 'K': if( factor == 1024 ) exponent = 1; break; + case 'k': if( factor == 1000 ) exponent = 1; break; + } + if( exponent <= 0 ) + { + if( verbosity >= 0 ) + fprintf( stderr, "%s: Bad multiplier in numerical argument of " + "option '%s'.\n", program_name, option_name ); + exit( 1 ); + } + for( i = 0; i < exponent; ++i ) + { + if( ulimit / factor >= result ) result *= factor; + else { errno = ERANGE; break; } + } + } + if( !errno && ( result < llimit || result > ulimit ) ) errno = ERANGE; + if( errno ) + { + if( verbosity >= 0 ) + fprintf( stderr, "%s: Numerical argument out of limits [%s,%s] " + "in option '%s'.\n", program_name, format_num3( llimit ), + format_num3( ulimit ), option_name ); + exit( 1 ); + } + return result; + } + + +static int get_dict_size( const char * const arg, const char * const option_name ) + { + char * tail; + const long bits = strtol( arg, &tail, 0 ); + if( bits >= min_dictionary_bits && + bits <= max_dictionary_bits && *tail == 0 ) + return 1 << bits; + return getnum( arg, option_name, min_dictionary_size, max_dictionary_size ); + } + + +static void set_mode( enum Mode * const program_modep, const enum Mode new_mode ) + { + if( *program_modep != m_compress && *program_modep != new_mode ) + { + show_error( "Only one operation can be specified.", 0, true ); + exit( 1 ); + } + *program_modep = new_mode; + } + + +static int extension_index( const char * const name ) + { + int eindex; + for( eindex = 0; known_extensions[eindex].from; ++eindex ) + { + const char * const ext = known_extensions[eindex].from; + const unsigned name_len = strlen( name ); + const unsigned ext_len = strlen( ext ); + if( name_len > ext_len && + 
strncmp( name + name_len - ext_len, ext, ext_len ) == 0 ) + return eindex; + } + return -1; + } + + +static void set_d_outname( const char * const name, const int eindex ) + { + const unsigned name_len = strlen( name ); + if( eindex >= 0 ) + { + const char * const from = known_extensions[eindex].from; + const unsigned from_len = strlen( from ); + if( name_len > from_len ) + { + output_filename = resize_buffer( output_filename, name_len + + strlen( known_extensions[eindex].to ) + 1 ); + strcpy( output_filename, name ); + strcpy( output_filename + name_len - from_len, known_extensions[eindex].to ); + return; + } + } + output_filename = resize_buffer( output_filename, name_len + 4 + 1 ); + strcpy( output_filename, name ); + strcat( output_filename, ".out" ); + if( verbosity >= 1 ) + fprintf( stderr, "%s: Can't guess original name for '%s' -- using '%s'\n", + program_name, name, output_filename ); + } + + +int open_instream( const char * const name, struct stat * const in_statsp, + const bool one_to_one, const bool reg_only ) + { + int infd = open( name, O_RDONLY | O_BINARY ); + if( infd < 0 ) + show_file_error( name, "Can't open input file", errno ); + else + { + const int i = fstat( infd, in_statsp ); + const mode_t mode = in_statsp->st_mode; + const bool can_read = ( i == 0 && !reg_only && + ( S_ISBLK( mode ) || S_ISCHR( mode ) || + S_ISFIFO( mode ) || S_ISSOCK( mode ) ) ); + if( i != 0 || ( !S_ISREG( mode ) && ( !can_read || one_to_one ) ) ) + { + if( verbosity >= 0 ) + fprintf( stderr, "%s: Input file '%s' is not a regular file%s.\n", + program_name, name, ( can_read && one_to_one ) ? + ",\n and neither '-c' nor '-o' were specified" : "" ); + close( infd ); + infd = -1; + } + } + return infd; + } + + +static bool open_outstream( const bool force, const bool protect ) + { + const mode_t usr_rw = S_IRUSR | S_IWUSR; + const mode_t all_rw = usr_rw | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH; + const mode_t outfd_mode = protect ? 
usr_rw : all_rw; + int flags = O_APPEND | O_CREAT | O_RDWR | O_BINARY; + if( force ) flags |= O_TRUNC; else flags |= O_EXCL; + + outfd = open( output_filename, flags, outfd_mode ); + if( outfd >= 0 ) delete_output_on_interrupt = true; + else if( verbosity >= 0 ) + { + if( errno == EEXIST ) + fprintf( stderr, "%s: Output file '%s' already exists, skipping.\n", + program_name, output_filename ); + else + fprintf( stderr, "%s: Can't create output file '%s': %s\n", + program_name, output_filename, strerror( errno ) ); + } + return ( outfd >= 0 ); + } + + +static void set_signals( void (*action)(int) ) + { + signal( SIGHUP, action ); + signal( SIGINT, action ); + signal( SIGTERM, action ); + } + + +void cleanup_and_fail( const int retval ) + { + set_signals( SIG_IGN ); /* ignore signals */ + if( delete_output_on_interrupt ) + { + delete_output_on_interrupt = false; + if( verbosity >= 0 ) + fprintf( stderr, "%s: Deleting output file '%s', if it exists.\n", + program_name, output_filename ); + if( outfd >= 0 ) { close( outfd ); outfd = -1; } + if( remove( output_filename ) != 0 && errno != ENOENT ) + show_error( "WARNING: deletion of output file (apparently) failed.", 0, false ); + } + exit( retval ); + } + + +static void signal_handler( int sig ) + { + if( sig ) {} /* keep compiler happy */ + show_error( "Control-C or similar caught, quitting.", 0, false ); + cleanup_and_fail( 1 ); + } + + +static bool check_tty_in( const char * const input_filename, const int infd, + const enum Mode program_mode, int * const retval ) + { + if( isatty( infd ) ) /* for example /dev/tty */ + { show_file_error( input_filename, + "I won't read compressed data from a terminal.", 0 ); + close( infd ); set_retval( retval, 2 ); + if( program_mode != m_test ) cleanup_and_fail( *retval ); + return false; } + return true; + } + + +/* Set permissions, owner, and times. 
*/ +static void close_and_set_permissions( const struct stat * const in_statsp ) + { + bool warning = false; + if( in_statsp ) + { + const mode_t mode = in_statsp->st_mode; + /* fchown will in many cases return with EPERM, which can be safely ignored. */ + if( fchown( outfd, in_statsp->st_uid, in_statsp->st_gid ) == 0 ) + { if( fchmod( outfd, mode ) != 0 ) warning = true; } + else + if( errno != EPERM || + fchmod( outfd, mode & ~( S_ISUID | S_ISGID | S_ISVTX ) ) != 0 ) + warning = true; + } + if( close( outfd ) != 0 ) + { + show_error( "Error closing output file", errno, false ); + cleanup_and_fail( 1 ); + } + outfd = -1; + delete_output_on_interrupt = false; + if( in_statsp ) + { + struct utimbuf t; + t.actime = in_statsp->st_atime; + t.modtime = in_statsp->st_mtime; + if( utime( output_filename, &t ) != 0 ) warning = true; + } + if( warning && verbosity >= 1 ) + show_error( "Can't change output file attributes.", 0, false ); + } + + +static unsigned char xdigit( const unsigned value ) + { + if( value <= 9 ) return '0' + value; + if( value <= 15 ) return 'A' + value - 10; + return 0; + } + + +static bool show_trailing_data( const uint8_t * const data, const int size, + struct Pretty_print * const pp, const bool all, + const int ignore_trailing ) /* -1 = show */ + { + if( verbosity >= 4 || ignore_trailing <= 0 ) + { + int i; + char buf[80]; + unsigned len = max( 0, snprintf( buf, sizeof buf, "%strailing data = ", + all ? 
"" : "first bytes of " ) ); + for( i = 0; i < size && len + 2 < sizeof buf; ++i ) + { + buf[len++] = xdigit( data[i] >> 4 ); + buf[len++] = xdigit( data[i] & 0x0F ); + buf[len++] = ' '; + } + if( len < sizeof buf ) buf[len++] = '\''; + for( i = 0; i < size && len < sizeof buf; ++i ) + { if( isprint( data[i] ) ) buf[len++] = data[i]; else buf[len++] = '.'; } + if( len < sizeof buf ) buf[len++] = '\''; + if( len < sizeof buf ) buf[len] = 0; else buf[sizeof buf - 1] = 0; + Pp_show_msg( pp, buf ); + if( ignore_trailing == 0 ) show_file_error( pp->name, trailing_msg, 0 ); + } + return ( ignore_trailing > 0 ); + } + + +static int decompress( const unsigned long long cfile_size, const int infd, + struct Pretty_print * const pp, const unsigned buffer_size, + const bool ignore_trailing, const bool loose_trailing, + const bool testing ) + { + unsigned long long partial_file_pos = 0; + struct Range_decoder rdec; + int retval = 0; + bool first_member; + if( !Rd_init( &rdec, infd ) ) + { show_error( mem_msg, 0, false ); cleanup_and_fail( 1 ); } + + for( first_member = true; ; first_member = false ) + { + Lzip_header header; + Rd_reset_member_position( &rdec ); + const int size = Rd_read_data( &rdec, header, Lh_size ); + if( Rd_finished( &rdec ) ) /* End Of File */ + { + if( first_member ) + { show_file_error( pp->name, "File ends unexpectedly at member header.", 0 ); + retval = 2; } + else if( Lh_verify_prefix( header, size ) ) + { Pp_show_msg( pp, "Truncated header in multimember file." 
); + show_trailing_data( header, size, pp, true, -1 ); + retval = 2; } + else if( size > 0 && !show_trailing_data( header, size, pp, + true, ignore_trailing ) ) + retval = 2; + break; + } + if( !Lh_verify_magic( header ) ) + { + if( first_member ) + { show_file_error( pp->name, bad_magic_msg, 0 ); retval = 2; } + else if( !loose_trailing && Lh_verify_corrupt( header ) ) + { Pp_show_msg( pp, corrupt_mm_msg ); + show_trailing_data( header, size, pp, false, -1 ); + retval = 2; } + else if( !show_trailing_data( header, size, pp, false, ignore_trailing ) ) + retval = 2; + break; + } + if( !Lh_verify_version( header ) ) + { Pp_show_msg( pp, bad_version( Lh_version( header ) ) ); + retval = 2; break; } + const unsigned dictionary_size = Lh_get_dictionary_size( header ); + if( !isvalid_ds( dictionary_size ) ) + { Pp_show_msg( pp, bad_dict_msg ); retval = 2; break; } + + if( verbosity >= 2 || ( verbosity == 1 && first_member ) ) + Pp_show_msg( pp, 0 ); + + struct LZ_decoder decoder; + if( !LZd_init( &decoder, &rdec, buffer_size, dictionary_size, outfd ) ) + { + Pp_show_msg( pp, "Not enough memory. Try a smaller output buffer size." ); + retval = 1; break; + } + show_dprogress( cfile_size, partial_file_pos, &rdec, pp ); /* init */ + const int result = LZd_decode_member( &decoder, pp ); + partial_file_pos += Rd_member_position( &rdec ); + LZd_free( &decoder ); + if( result != 0 ) + { + if( verbosity >= 0 && result <= 2 ) + { + Pp_show_msg( pp, 0 ); + fprintf( stderr, "%s at pos %llu\n", ( result == 2 ) ? + "File ends unexpectedly" : "Decoder error", + partial_file_pos ); + } + retval = 2; break; + } + if( verbosity >= 2 ) + { fputs( testing ? "ok\n" : "done\n", stderr ); Pp_reset( pp ); } + } + Rd_free( &rdec ); + if( verbosity == 1 && retval == 0 ) + fputs( testing ? 
"ok\n" : "done\n", stderr ); + return retval; + } + + +void show_error( const char * const msg, const int errcode, const bool help ) + { + if( verbosity < 0 ) return; + if( msg && msg[0] ) + fprintf( stderr, "%s: %s%s%s\n", program_name, msg, + ( errcode > 0 ) ? ": " : "", + ( errcode > 0 ) ? strerror( errcode ) : "" ); + if( help ) + fprintf( stderr, "Try '%s --help' for more information.\n", + invocation_name ); + } + + +void show_file_error( const char * const filename, const char * const msg, + const int errcode ) + { + if( verbosity >= 0 ) + fprintf( stderr, "%s: %s: %s%s%s\n", program_name, filename, msg, + ( errcode > 0 ) ? ": " : "", + ( errcode > 0 ) ? strerror( errcode ) : "" ); + } + + +static void internal_error( const char * const msg ) + { + if( verbosity >= 0 ) + fprintf( stderr, "%s: internal error: %s\n", program_name, msg ); + exit( 3 ); + } + + +void show_dprogress( const unsigned long long cfile_size, + const unsigned long long partial_size, + const struct Range_decoder * const d, + struct Pretty_print * const p ) + { + static unsigned long long csize = 0; /* file_size / 100 */ + static unsigned long long psize = 0; + static const struct Range_decoder * rdec = 0; + static struct Pretty_print * pp = 0; + static int counter = 0; + static bool enabled = true; + + if( !enabled ) return; + if( p ) /* initialize static vars */ + { + if( verbosity < 2 || !isatty( STDERR_FILENO ) ) { enabled = false; return; } + csize = cfile_size; psize = partial_size; rdec = d; pp = p; counter = 0; + } + if( rdec && pp && --counter <= 0 ) + { + const unsigned long long pos = psize + Rd_member_position( rdec ); + counter = 7; /* update display every 114688 bytes */ + if( csize > 0 ) + fprintf( stderr, "%4llu%% %.1f MB\r", pos / csize, pos / 1000000.0 ); + else + fprintf( stderr, " %.1f MB\r", pos / 1000000.0 ); + Pp_reset( pp ); Pp_show_msg( pp, 0 ); /* restore cursor position */ + } + } + + +int main( const int argc, const char * const argv[] ) + { + const char * 
default_output_filename = ""; + unsigned buffer_size = max_dictionary_size; + enum Mode program_mode = m_compress; + int i; + bool force = false; + bool ignore_trailing = true; + bool keep_input_files = false; + bool loose_trailing = false; + bool to_stdout = false; + if( argc > 0 ) invocation_name = argv[0]; + + enum { opt_lt = 256 }; + const struct ap_Option options[] = + { + { 'a', "trailing-error", ap_no }, + { 'c', "stdout", ap_no }, + { 'd', "decompress", ap_no }, + { 'f', "force", ap_no }, + { 'h', "help", ap_no }, + { 'k', "keep", ap_no }, + { 'l', "list", ap_no }, + { 'n', "threads", ap_yes }, + { 'o', "output", ap_yes }, + { 'q', "quiet", ap_no }, + { 't', "test", ap_no }, + { 'u', "buffer-size", ap_yes }, + { 'v', "verbose", ap_no }, + { 'V', "version", ap_no }, + { opt_lt, "loose-trailing", ap_no }, + { 0 , 0, ap_no } }; + + CRC32_init(); + + /* static because valgrind complains and memory management in C sucks */ + static struct Arg_parser parser; + if( !ap_init( &parser, argc, argv, options, 0 ) ) + { show_error( mem_msg, 0, false ); return 1; } + if( ap_error( &parser ) ) /* bad option */ + { show_error( ap_error( &parser ), 0, true ); return 1; } + + int argind = 0; + for( ; argind < ap_arguments( &parser ); ++argind ) + { + const int code = ap_code( &parser, argind ); + if( !code ) break; /* no more options */ + const char * const pn = ap_parsed_name( &parser, argind ); + const char * const arg = ap_argument( &parser, argind ); + switch( code ) + { + case 'a': ignore_trailing = false; break; + case 'c': to_stdout = true; break; + case 'd': set_mode( &program_mode, m_decompress ); break; + case 'f': force = true; break; + case 'h': show_help(); return 0; + case 'k': keep_input_files = true; break; + case 'l': set_mode( &program_mode, m_list ); break; + case 'n': break; + case 'o': if( strcmp( arg, "-" ) == 0 ) to_stdout = true; + else { default_output_filename = arg; } break; + case 'q': verbosity = -1; break; + case 't': set_mode( &program_mode, 
m_test ); break; + case 'u': buffer_size = get_dict_size( arg, pn ); break; + case 'v': if( verbosity < 4 ) ++verbosity; break; + case 'V': show_version(); return 0; + case opt_lt: loose_trailing = true; break; + default : internal_error( "uncaught option." ); + } + } /* end process options */ + +#if defined __MSVCRT__ || defined __OS2__ || defined __DJGPP__ + setmode( STDIN_FILENO, O_BINARY ); + setmode( STDOUT_FILENO, O_BINARY ); +#endif + + static const char ** filenames = 0; + int num_filenames = max( 1, ap_arguments( &parser ) - argind ); + filenames = resize_buffer( filenames, num_filenames * sizeof filenames[0] ); + filenames[0] = "-"; + + bool filenames_given = false; + for( i = 0; argind + i < ap_arguments( &parser ); ++i ) + { + filenames[i] = ap_argument( &parser, argind + i ); + if( strcmp( filenames[i], "-" ) != 0 ) filenames_given = true; + } + + if( program_mode == m_list ) + return list_files( filenames, num_filenames, ignore_trailing, loose_trailing ); + + if( program_mode == m_compress ) + program_mode = m_decompress; /* default mode */ + if( program_mode == m_test ) to_stdout = false; /* apply overrides */ + if( program_mode == m_test || to_stdout ) default_output_filename = ""; + + if( buffer_size < max_dictionary_size ) + { + bool from_stdin = false; + if( to_stdout || program_mode == m_test ) + { show_error( "'--buffer-size' is incompatible with '--stdout' and '--test'.", + 0, false ); return 1; } + for( i = 0; i < num_filenames; ++i ) + if( !filenames[i][0] || strcmp( filenames[i], "-" ) == 0 ) + { from_stdin = true; break; } + if( from_stdin && !default_output_filename[0] ) + { show_error( "Output file must be specified when decompressing from standard input\n" + " with a reduced buffer size.", 0, false ); return 1; } + } + + output_filename = resize_buffer( output_filename, 1 ); + output_filename[0] = 0; + if( to_stdout && program_mode != m_test ) outfd = STDOUT_FILENO; + else outfd = -1; + + const bool to_file = !to_stdout && program_mode 
!= m_test && + default_output_filename[0]; + if( !to_stdout && program_mode != m_test && ( filenames_given || to_file ) ) + set_signals( signal_handler ); + + static struct Pretty_print pp; + Pp_init( &pp, filenames, num_filenames ); + + int failed_tests = 0; + int retval = 0; + const bool one_to_one = !to_stdout && program_mode != m_test && !to_file; + bool stdin_used = false; + for( i = 0; i < num_filenames; ++i ) + { + const char * input_filename = ""; + int infd; + struct stat in_stats; + + Pp_set_name( &pp, filenames[i] ); + if( strcmp( filenames[i], "-" ) == 0 ) + { + if( stdin_used ) continue; else stdin_used = true; + infd = STDIN_FILENO; + if( !check_tty_in( pp.name, infd, program_mode, &retval ) ) continue; + if( one_to_one ) { outfd = STDOUT_FILENO; output_filename[0] = 0; } + } + else + { + input_filename = filenames[i]; + infd = open_instream( input_filename, &in_stats, one_to_one, false ); + if( infd < 0 ) { set_retval( &retval, 1 ); continue; } + if( !check_tty_in( pp.name, infd, program_mode, &retval ) ) continue; + if( one_to_one ) /* open outfd after verifying infd */ + { + set_d_outname( input_filename, extension_index( input_filename ) ); + if( !open_outstream( force, true ) ) + { close( infd ); set_retval( &retval, 1 ); continue; } + } + } + + if( to_file && outfd < 0 ) /* open outfd after verifying infd */ + { + output_filename = resize_buffer( output_filename, + strlen( default_output_filename ) + 1 ); + strcpy( output_filename, default_output_filename ); + if( !open_outstream( force, false ) ) return 1; + } + + if( delete_output_on_interrupt && buffer_size < max_dictionary_size ) + { + struct stat st; + if( fstat( outfd, &st ) != 0 || !S_ISREG( st.st_mode ) ) + { + if( verbosity >= 0 ) + fprintf( stderr, "%s: Output file '%s' is not a regular file,\n" + " and 'low memory' mode has been requested.\n", + program_name, output_filename ); + set_retval( &retval, 1 ); + return retval; /* don't try to delete a non-regular file */ + } + } + + const 
struct stat * const in_statsp = + ( input_filename[0] && one_to_one ) ? &in_stats : 0; + const unsigned long long cfile_size = + ( input_filename[0] && S_ISREG( in_stats.st_mode ) ) ? + ( in_stats.st_size + 99 ) / 100 : 0; + int tmp = decompress( cfile_size, infd, &pp, buffer_size, ignore_trailing, + loose_trailing, program_mode == m_test ); + if( close( infd ) != 0 ) + { show_file_error( pp.name, "Error closing input file", errno ); + set_retval( &tmp, 1 ); } + set_retval( &retval, tmp ); + if( tmp ) + { if( program_mode != m_test ) cleanup_and_fail( retval ); + else ++failed_tests; } + + if( delete_output_on_interrupt && one_to_one ) + close_and_set_permissions( in_statsp ); + if( input_filename[0] && !keep_input_files && one_to_one ) + remove( input_filename ); + } + if( delete_output_on_interrupt ) close_and_set_permissions( 0 ); /* -o */ + else if( outfd >= 0 && close( outfd ) != 0 ) /* -c */ + { + show_error( "Error closing stdout", errno, false ); + set_retval( &retval, 1 ); + } + if( failed_tests > 0 && verbosity >= 1 && num_filenames > 1 ) + fprintf( stderr, "%s: warning: %d %s failed the test.\n", + program_name, failed_tests, + ( failed_tests == 1 ) ? "file" : "files" ); + free( output_filename ); + free( filenames ); + ap_free( &parser ); + return retval; + } diff --git a/testsuite/check.sh b/testsuite/check.sh new file mode 100755 index 0000000..c495ba1 --- /dev/null +++ b/testsuite/check.sh @@ -0,0 +1,351 @@ +#! /bin/sh +# check script for Lunzip - Decompressor for the lzip format +# Copyright (C) 2010-2022 Antonio Diaz Diaz. +# +# This script is free software: you have unlimited permission +# to copy, distribute, and modify it. + +LC_ALL=C +export LC_ALL +objdir=`pwd` +testdir=`cd "$1" ; pwd` +LZIP="${objdir}"/lunzip +framework_failure() { echo "failure in testing framework" ; exit 1 ; } + +if [ ! -f "${LZIP}" ] || [ ! 
-x "${LZIP}" ] ; then + echo "${LZIP}: cannot execute" + exit 1 +fi + +[ -e "${LZIP}" ] 2> /dev/null || + { + echo "$0: a POSIX shell is required to run the tests" + echo "Try bash -c \"$0 $1 $2\"" + exit 1 + } + +if [ -d tmp ] ; then rm -rf tmp ; fi +mkdir tmp +cd "${objdir}"/tmp || framework_failure + +cat "${testdir}"/test.txt > in || framework_failure +in_lz="${testdir}"/test.txt.lz +in_em="${testdir}"/test_em.txt.lz +fox_lz="${testdir}"/fox.lz +fail=0 +test_failed() { fail=1 ; printf " $1" ; [ -z "$2" ] || printf "($2)" ; } + +printf "testing lunzip-%s..." "$2" + +cat "${in_lz}" > uin.lz || framework_failure +for i in bad_size -1 0 4095 513MiB 1G 1T 1P 1E 1Z 1Y 10KB ; do + "${LZIP}" -dfkq -u $i uin.lz + [ $? = 1 ] || test_failed $LINENO $i + [ ! -e uin ] || test_failed $LINENO $i +done +rm -f uin.lz || framework_failure +"${LZIP}" -lq in +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -tq in +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -tq < in +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -cdq in +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -cdq < in +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -dq -o in < "${in_lz}" +[ $? = 1 ] || test_failed $LINENO +"${LZIP}" -dq -o in "${in_lz}" +[ $? = 1 ] || test_failed $LINENO +"${LZIP}" -dq -o out nx_file.lz +[ $? = 1 ] || test_failed $LINENO +[ ! -e out ] || test_failed $LINENO +# these are for code coverage +"${LZIP}" -lt "${in_lz}" 2> /dev/null +[ $? = 1 ] || test_failed $LINENO +"${LZIP}" -cdl "${in_lz}" > out 2> /dev/null +[ $? = 1 ] || test_failed $LINENO +"${LZIP}" -cdt "${in_lz}" > out 2> /dev/null +[ $? = 1 ] || test_failed $LINENO +"${LZIP}" -t -- nx_file.lz 2> /dev/null +[ $? = 1 ] || test_failed $LINENO +"${LZIP}" -t "" < /dev/null 2> /dev/null +[ $? = 1 ] || test_failed $LINENO +"${LZIP}" --help > /dev/null || test_failed $LINENO +"${LZIP}" -n1 -V > /dev/null || test_failed $LINENO +"${LZIP}" -m 2> /dev/null +[ $? = 1 ] || test_failed $LINENO +"${LZIP}" -z 2> /dev/null +[ $? 
= 1 ] || test_failed $LINENO +"${LZIP}" --bad_option 2> /dev/null +[ $? = 1 ] || test_failed $LINENO +"${LZIP}" --t 2> /dev/null +[ $? = 1 ] || test_failed $LINENO +"${LZIP}" --test=2 2> /dev/null +[ $? = 1 ] || test_failed $LINENO +"${LZIP}" --output= 2> /dev/null +[ $? = 1 ] || test_failed $LINENO +"${LZIP}" --output 2> /dev/null +[ $? = 1 ] || test_failed $LINENO +printf "LZIP\001-.............................." | "${LZIP}" -t 2> /dev/null +printf "LZIP\002-.............................." | "${LZIP}" -t 2> /dev/null +printf "LZIP\001+.............................." | "${LZIP}" -t 2> /dev/null +rm -f out || framework_failure + +printf "\ntesting decompression..." + +for i in "${in_lz}" "${in_em}" ; do + "${LZIP}" -lq "$i" || test_failed $LINENO "$i" + "${LZIP}" -t "$i" || test_failed $LINENO "$i" + "${LZIP}" -d "$i" -o copy || test_failed $LINENO "$i" + cmp in copy || test_failed $LINENO "$i" + "${LZIP}" -cd "$i" > copy || test_failed $LINENO "$i" + cmp in copy || test_failed $LINENO "$i" + "${LZIP}" -d "$i" -o - > copy || test_failed $LINENO "$i" + cmp in copy || test_failed $LINENO "$i" + "${LZIP}" -d < "$i" > copy || test_failed $LINENO "$i" + cmp in copy || test_failed $LINENO "$i" + rm -f copy || framework_failure +done + +lines=$("${LZIP}" -tvv "${in_em}" 2>&1 | wc -l) || test_failed $LINENO +[ "${lines}" -eq 8 ] || test_failed $LINENO "${lines}" + +lines=$("${LZIP}" -lvv "${in_em}" | wc -l) || test_failed $LINENO +[ "${lines}" -eq 11 ] || test_failed $LINENO "${lines}" + +"${LZIP}" -cd "${fox_lz}" > fox || test_failed $LINENO +cat "${in_lz}" > copy.lz || framework_failure +"${LZIP}" -dk copy.lz || test_failed $LINENO +cmp in copy || test_failed $LINENO +cat fox > copy || framework_failure +cat "${in_lz}" > out.lz || framework_failure +rm -f out || framework_failure +"${LZIP}" -d copy.lz out.lz 2> /dev/null # skip copy, decompress out +[ $? 
= 1 ] || test_failed $LINENO +cmp fox copy || test_failed $LINENO +cmp in out || test_failed $LINENO +"${LZIP}" -df copy.lz || test_failed $LINENO +[ ! -e copy.lz ] || test_failed $LINENO +cmp in copy || test_failed $LINENO +rm -f out || framework_failure + +printf "to be overwritten" > copy || framework_failure +"${LZIP}" -df -o copy < "${in_lz}" || test_failed $LINENO +cmp in copy || test_failed $LINENO +rm -f out copy || framework_failure +"${LZIP}" -d -o ./- "${in_lz}" || test_failed $LINENO +cmp in ./- || test_failed $LINENO +rm -f ./- || framework_failure +"${LZIP}" -d -o ./- < "${in_lz}" || test_failed $LINENO +cmp in ./- || test_failed $LINENO +rm -f ./- || framework_failure + +cat "${in_lz}" > anyothername || framework_failure +"${LZIP}" -dv - anyothername - < "${in_lz}" > copy 2> /dev/null || + test_failed $LINENO +cmp in copy || test_failed $LINENO +cmp in anyothername.out || test_failed $LINENO +rm -f copy anyothername.out || framework_failure + +"${LZIP}" -lq in "${in_lz}" +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -lq nx_file.lz "${in_lz}" +[ $? = 1 ] || test_failed $LINENO +"${LZIP}" -tq in "${in_lz}" +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -tq nx_file.lz "${in_lz}" +[ $? = 1 ] || test_failed $LINENO +"${LZIP}" -cdq in "${in_lz}" > copy +[ $? = 2 ] || test_failed $LINENO +cat copy in | cmp in - || test_failed $LINENO # copy must be empty +"${LZIP}" -cdq nx_file.lz "${in_lz}" > copy +[ $? = 1 ] || test_failed $LINENO +cmp in copy || test_failed $LINENO +rm -f copy || framework_failure +cat "${in_lz}" > copy.lz || framework_failure +for i in 1 2 3 4 5 6 7 ; do + printf "g" >> copy.lz || framework_failure + "${LZIP}" -alvv copy.lz "${in_lz}" > /dev/null 2>&1 + [ $? = 2 ] || test_failed $LINENO $i + "${LZIP}" -atvvvv copy.lz "${in_lz}" 2> /dev/null + [ $? = 2 ] || test_failed $LINENO $i +done +"${LZIP}" -dq in copy.lz +[ $? = 2 ] || test_failed $LINENO +[ -e copy.lz ] || test_failed $LINENO +[ ! -e copy ] || test_failed $LINENO +[ ! 
-e in.out ] || test_failed $LINENO +"${LZIP}" -dq nx_file.lz copy.lz +[ $? = 1 ] || test_failed $LINENO +[ ! -e copy.lz ] || test_failed $LINENO +[ ! -e nx_file ] || test_failed $LINENO +cmp in copy || test_failed $LINENO + +cat in in > in2 || framework_failure +"${LZIP}" -lq "${in_lz}" "${in_lz}" || test_failed $LINENO +"${LZIP}" -t "${in_lz}" "${in_lz}" || test_failed $LINENO +"${LZIP}" -cd "${in_lz}" "${in_lz}" -o out > copy2 || test_failed $LINENO +[ ! -e out ] || test_failed $LINENO # override -o +cmp in2 copy2 || test_failed $LINENO +rm -f copy2 || framework_failure +"${LZIP}" -d "${in_lz}" "${in_lz}" -o copy2 || test_failed $LINENO +cmp in2 copy2 || test_failed $LINENO +rm -f copy2 || framework_failure + +cat "${in_lz}" "${in_lz}" > copy2.lz || framework_failure +printf "\ngarbage" >> copy2.lz || framework_failure +"${LZIP}" -tvvvv copy2.lz 2> /dev/null || test_failed $LINENO +"${LZIP}" -alq copy2.lz +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -atq copy2.lz +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -atq < copy2.lz +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -adkq copy2.lz +[ $? = 2 ] || test_failed $LINENO +[ ! -e copy2 ] || test_failed $LINENO +"${LZIP}" -adkq -o copy2 < copy2.lz +[ $? = 2 ] || test_failed $LINENO +[ ! 
-e copy2 ] || test_failed $LINENO +printf "to be overwritten" > copy2 || framework_failure +"${LZIP}" -df copy2.lz || test_failed $LINENO +cmp in2 copy2 || test_failed $LINENO +rm -f copy2 || framework_failure + +for i in 12 5120 6Ki 29 512KiB ; do + printf "to be overwritten" > copy || framework_failure + "${LZIP}" -df -u$i -o copy < "${in_lz}" || test_failed $LINENO $i + cmp in copy || test_failed $LINENO $i + rm -f copy || framework_failure + "${LZIP}" -d -u$i -o copy "${in_lz}" || test_failed $LINENO $i + cmp in copy || test_failed $LINENO $i + "${LZIP}" -d -u$i -o copy2 "${in_lz}" "${in_lz}" || + test_failed $LINENO $i + cmp in2 copy2 || test_failed $LINENO $i + rm -f copy2 || framework_failure +done +rm -f in2 copy || framework_failure + +printf "\ntesting bad input..." + +headers='LZIp LZiP LZip LzIP LzIp LziP lZIP lZIp lZiP lzIP' +body='\001\014\000\203\377\373\377\377\300\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000$\000\000\000\000\000\000\000' +cat "${in_lz}" > int.lz +printf "LZIP${body}" >> int.lz +if "${LZIP}" -tq int.lz ; then + for header in ${headers} ; do + printf "${header}${body}" > int.lz # first member + "${LZIP}" -lq int.lz + [ $? = 2 ] || test_failed $LINENO ${header} + "${LZIP}" -tq int.lz + [ $? = 2 ] || test_failed $LINENO ${header} + "${LZIP}" -tq < int.lz + [ $? = 2 ] || test_failed $LINENO ${header} + "${LZIP}" -cdq int.lz > /dev/null + [ $? = 2 ] || test_failed $LINENO ${header} + "${LZIP}" -lq --loose-trailing int.lz + [ $? = 2 ] || test_failed $LINENO ${header} + "${LZIP}" -tq --loose-trailing int.lz + [ $? = 2 ] || test_failed $LINENO ${header} + "${LZIP}" -tq --loose-trailing < int.lz + [ $? = 2 ] || test_failed $LINENO ${header} + "${LZIP}" -cdq --loose-trailing int.lz > /dev/null + [ $? = 2 ] || test_failed $LINENO ${header} + cat "${in_lz}" > int.lz + printf "${header}${body}" >> int.lz # trailing data + "${LZIP}" -lq int.lz + [ $? = 2 ] || test_failed $LINENO ${header} + "${LZIP}" -tq int.lz + [ $? 
= 2 ] || test_failed $LINENO ${header} + "${LZIP}" -tq < int.lz + [ $? = 2 ] || test_failed $LINENO ${header} + "${LZIP}" -cdq int.lz > /dev/null + [ $? = 2 ] || test_failed $LINENO ${header} + "${LZIP}" -lq --loose-trailing int.lz || + test_failed $LINENO ${header} + "${LZIP}" -t --loose-trailing int.lz || + test_failed $LINENO ${header} + "${LZIP}" -t --loose-trailing < int.lz || + test_failed $LINENO ${header} + "${LZIP}" -cd --loose-trailing int.lz > /dev/null || + test_failed $LINENO ${header} + "${LZIP}" -lq --loose-trailing --trailing-error int.lz + [ $? = 2 ] || test_failed $LINENO ${header} + "${LZIP}" -tq --loose-trailing --trailing-error int.lz + [ $? = 2 ] || test_failed $LINENO ${header} + "${LZIP}" -tq --loose-trailing --trailing-error < int.lz + [ $? = 2 ] || test_failed $LINENO ${header} + "${LZIP}" -cdq --loose-trailing --trailing-error int.lz > /dev/null + [ $? = 2 ] || test_failed $LINENO ${header} + done +else + printf "\nwarning: skipping header test: 'printf' does not work on your system." +fi +rm -f int.lz || framework_failure + +for i in fox_v2.lz fox_s11.lz fox_de20.lz \ + fox_bcrc.lz fox_crc0.lz fox_das46.lz fox_mes81.lz ; do + "${LZIP}" -tq "${testdir}"/$i + [ $? = 2 ] || test_failed $LINENO $i +done + +for i in fox_bcrc.lz fox_crc0.lz fox_das46.lz fox_mes81.lz ; do + "${LZIP}" -cdq "${testdir}"/$i > out + [ $? = 2 ] || test_failed $LINENO $i + cmp fox out || test_failed $LINENO $i +done +rm -f fox out || framework_failure + +cat "${in_lz}" "${in_lz}" > in2.lz || framework_failure +cat "${in_lz}" "${in_lz}" "${in_lz}" > in3.lz || framework_failure +if dd if=in3.lz of=trunc.lz bs=14752 count=1 2> /dev/null && + [ -e trunc.lz ] && cmp in2.lz trunc.lz > /dev/null 2>&1 ; then + for i in 6 20 14734 14753 14754 14755 14756 14757 14758 ; do + dd if=in3.lz of=trunc.lz bs=$i count=1 2> /dev/null + "${LZIP}" -lq trunc.lz + [ $? = 2 ] || test_failed $LINENO $i + "${LZIP}" -tq trunc.lz + [ $? 
= 2 ] || test_failed $LINENO $i + "${LZIP}" -tq < trunc.lz + [ $? = 2 ] || test_failed $LINENO $i + "${LZIP}" -cdq trunc.lz > out + [ $? = 2 ] || test_failed $LINENO $i + "${LZIP}" -dq < trunc.lz > out + [ $? = 2 ] || test_failed $LINENO $i + done +else + printf "\nwarning: skipping truncation test: 'dd' does not work on your system." +fi +rm -f in2.lz in3.lz trunc.lz out || framework_failure + +cat "${in_lz}" > ingin.lz || framework_failure +printf "g" >> ingin.lz || framework_failure +cat "${in_lz}" >> ingin.lz || framework_failure +"${LZIP}" -lq ingin.lz +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -atq ingin.lz +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -atq < ingin.lz +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -acdq ingin.lz > out +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -adq < ingin.lz > out +[ $? = 2 ] || test_failed $LINENO +"${LZIP}" -t ingin.lz || test_failed $LINENO +"${LZIP}" -t < ingin.lz || test_failed $LINENO +"${LZIP}" -cd ingin.lz > copy || test_failed $LINENO +cmp in copy || test_failed $LINENO +"${LZIP}" -d < ingin.lz > copy || test_failed $LINENO +cmp in copy || test_failed $LINENO +rm -f copy ingin.lz out || framework_failure + +echo +if [ ${fail} = 0 ] ; then + echo "tests completed successfully." + cd "${objdir}" && rm -r tmp +else + echo "tests failed." 
+fi +exit ${fail} diff --git a/testsuite/fox.lz b/testsuite/fox.lz new file mode 100644 index 0000000..509da82 Binary files /dev/null and b/testsuite/fox.lz differ diff --git a/testsuite/fox_bcrc.lz b/testsuite/fox_bcrc.lz new file mode 100644 index 0000000..8f6a7c4 Binary files /dev/null and b/testsuite/fox_bcrc.lz differ diff --git a/testsuite/fox_crc0.lz b/testsuite/fox_crc0.lz new file mode 100644 index 0000000..1abe926 Binary files /dev/null and b/testsuite/fox_crc0.lz differ diff --git a/testsuite/fox_das46.lz b/testsuite/fox_das46.lz new file mode 100644 index 0000000..43ed9f9 Binary files /dev/null and b/testsuite/fox_das46.lz differ diff --git a/testsuite/fox_de20.lz b/testsuite/fox_de20.lz new file mode 100644 index 0000000..10949d8 Binary files /dev/null and b/testsuite/fox_de20.lz differ diff --git a/testsuite/fox_mes81.lz b/testsuite/fox_mes81.lz new file mode 100644 index 0000000..d50ef2e Binary files /dev/null and b/testsuite/fox_mes81.lz differ diff --git a/testsuite/fox_s11.lz b/testsuite/fox_s11.lz new file mode 100644 index 0000000..dca909c Binary files /dev/null and b/testsuite/fox_s11.lz differ diff --git a/testsuite/fox_v2.lz b/testsuite/fox_v2.lz new file mode 100644 index 0000000..8620981 Binary files /dev/null and b/testsuite/fox_v2.lz differ diff --git a/testsuite/test.txt b/testsuite/test.txt new file mode 100644 index 0000000..9196a3a --- /dev/null +++ b/testsuite/test.txt @@ -0,0 +1,676 @@ + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. 
By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Lesser General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. 
If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. + + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. 
You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. (Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. 
If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. 
(This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. 
Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties to +this License. + + 7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. 
+ +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. 
If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. 
+ + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + Copyright (C) + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may +be called something other than `show w' and `show c'; they could even be +mouse-clicks or menu items--whatever suits your program. 
+ +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + , 1 April 1989 + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License.
diff --git a/testsuite/test.txt.lz b/testsuite/test.txt.lz new file mode 100644 index 0000000..22cea6e Binary files /dev/null and b/testsuite/test.txt.lz differ diff --git a/testsuite/test_em.txt.lz b/testsuite/test_em.txt.lz new file mode 100644 index 0000000..7e96250 Binary files /dev/null and b/testsuite/test_em.txt.lz differ -- cgit v1.2.3