author    Daniel Baumann <mail@daniel-baumann.ch>  2015-11-06 11:39:02 +0000
committer Daniel Baumann <mail@daniel-baumann.ch>  2015-11-06 11:39:02 +0000
commit    243f444e517f5319ff1fc0cd5c5145388a883940
tree      ab207ec478cb32a6ba58a9538275ade26e750f3d
parent    Adding debian version 1.5~pre1-3.
Merging upstream version 1.5~pre2.
Signed-off-by: Daniel Baumann <mail@daniel-baumann.ch>
-rw-r--r--  ChangeLog             5
-rw-r--r--  INSTALL               6
-rw-r--r--  NEWS                  4
-rw-r--r--  README               52
-rwxr-xr-x  configure            28
-rw-r--r--  decoder.c            69
-rw-r--r--  decoder.h             4
-rw-r--r--  doc/clzip.1           4
-rw-r--r--  doc/clzip.info      123
-rw-r--r--  doc/clzip.texinfo    93
-rw-r--r--  encoder.c            45
-rw-r--r--  encoder.h            21
-rw-r--r--  lzip.h               17
-rw-r--r--  main.c               46
-rwxr-xr-x  testsuite/check.sh   29
15 files changed, 296 insertions, 250 deletions
diff --git a/ChangeLog b/ChangeLog
index f1ce217..753fc4e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2013-07-17 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.5-pre2 released.
+ * Show progress of compression at verbosity level 2 (-vv).
+
2013-05-13 Antonio Diaz Diaz <antonio@gnu.org>
* Version 1.5-pre1 released.
diff --git a/INSTALL b/INSTALL
index 7670406..bf7eb2b 100644
--- a/INSTALL
+++ b/INSTALL
@@ -1,7 +1,7 @@
Requirements
------------
You will need a C compiler.
-I use gcc 4.8.0 and 3.3.6, but the code should compile with any
+I use gcc 4.8.1 and 3.3.6, but the code should compile with any
standards compliant compiler.
Gcc is available at http://gcc.gnu.org.
@@ -10,9 +10,9 @@ Procedure
---------
1. Unpack the archive if you have not done so already:
- lzip -cd clzip[version].tar.lz | tar -xf -
+ tar -xf clzip[version].tar.lz
or
- gzip -cd clzip[version].tar.gz | tar -xf -
+ lzip -cd clzip[version].tar.lz | tar -xf -
This creates the directory ./clzip[version] containing the source from
the main archive.
diff --git a/NEWS b/NEWS
index ec9961a..25a7276 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,7 @@
Changes in version 1.5:
+Clzip now shows the progress of compression at verbosity level 2 (-vv).
+
Decompression time has been reduced by 1%.
File version is now shown only if verbosity >= 4.
@@ -7,4 +9,4 @@ File version is now shown only if verbosity >= 4.
Option "-n, --threads" is now accepted and ignored for compatibility
with plzip.
-"configure" now accepts options with a separate argument.
+The configure script now accepts options with a separate argument.
diff --git a/README b/README
index 26d527d..0043c8c 100644
--- a/README
+++ b/README
@@ -1,22 +1,38 @@
Description
-Clzip is a lossless data compressor based on the LZMA algorithm, with
-very safe integrity checking and a user interface similar to the one of
-gzip or bzip2. Clzip decompresses almost as fast as gzip and compresses
-better than bzip2, which makes it well suited for software distribution
-and data archiving.
+Clzip is a lossless data compressor with a user interface similar to the
+one of gzip or bzip2. Clzip decompresses almost as fast as gzip and
+compresses more than bzip2, which makes it well suited for software
+distribution and data archiving. Clzip is a clean implementation of the
+LZMA algorithm.
-Clzip uses the same well-defined exit status values used by bzip2, which
-makes it safer when used in pipes or scripts than compressors returning
-ambiguous warning values, like gzip.
+Clzip uses the same well-defined exit status values used by lzip and
+bzip2, which makes it safer when used in pipes or scripts than
+compressors returning ambiguous warning values, like gzip.
Clzip uses the lzip file format; the files produced by clzip are fully
-compatible with lzip-1.4 or newer. Clzip is in fact a C language version
-of lzip, intended for embedded devices or systems lacking a C++
-compiler.
+compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
+Clzip is in fact a C language version of lzip, intended for embedded
+devices or systems lacking a C++ compiler.
+
+The lzip file format is designed for long-term data archiving and
+provides very safe integrity checking. The member trailer stores the
+32-bit CRC of the original data, the size of the original data and the
+size of the member. These values, together with the value remaining in
+the range decoder and the end-of-stream marker, provide a 4 factor
+integrity checking which guarantees that the decompressed version of the
+data is identical to the original. This guards against corruption of the
+compressed data, and against undetected bugs in clzip (hopefully very
+unlikely). The chances of data corruption going undetected are
+microscopic. Be aware, though, that the check occurs upon decompression,
+so it can only tell you that something is wrong. It can't help you
+recover the original uncompressed data.
If you ever need to recover data from a damaged lzip file, try the
-lziprecover program.
+lziprecover program. Lziprecover makes lzip files resistant to bit-flip
+(one of the most common forms of data corruption), and provides data
+recovery capabilities, including error-checked merging of damaged copies
+of a file.
Clzip replaces every file given in the command line with a compressed
version of itself, with the name "original_name.lz". Each compressed
@@ -50,18 +66,6 @@ without exceeding the given limit. Keep in mind that the decompression
memory requirement is affected at compression time by the choice of
dictionary size limit.
-As a self-check for your protection, clzip stores in the member trailer
-the 32-bit CRC of the original data, the size of the original data and
-the size of the member. These values, together with the value remaining
-in the range decoder and the end-of-stream marker, provide a very safe 4
-factor integrity checking which guarantees that the decompressed version
-of the data is identical to the original. This guards against corruption
-of the compressed data, and against undetected bugs in clzip (hopefully
-very unlikely). The chances of data corruption going undetected are
-microscopic. Be aware, though, that the check occurs upon decompression,
-so it can only tell you that something is wrong. It can't help you
-recover the original uncompressed data.
-
Clzip implements a simplified version of the LZMA (Lempel-Ziv-Markov
chain-Algorithm) algorithm. The high compression of LZMA comes from
combining two basic, well-proven compression ideas: sliding dictionaries
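
Note on the README text above: the paragraph moved into the Description says the member trailer stores the 32-bit CRC of the original data, the size of the original data and the size of the member. The following is a minimal sketch of those three trailer checks, not the clzip implementation. The names `TRAILER_SIZE`, `read_le`, `crc32_of` and `check_trailer` are hypothetical, and the layout (CRC at offset 0, data size at offset 4, member size at offset 12, all little-endian) is an assumption taken from the lzip format description rather than from this diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: 4-byte CRC32 + 8-byte data size + 8-byte member size. */
enum { TRAILER_SIZE = 20 };

/* Read an n-byte little-endian unsigned integer (n <= 8). */
static uint64_t read_le( const uint8_t * const p, const int n )
  {
  uint64_t v = 0;
  int i;
  for( i = n - 1; i >= 0; --i ) v = ( v << 8 ) | p[i];
  return v;
  }

/* Bitwise CRC32 with the reflected polynomial 0xEDB88320.
   Slow but table-free; a real decoder would use a lookup table. */
static uint32_t crc32_of( const uint8_t * const data, const size_t len )
  {
  uint32_t crc = 0xFFFFFFFFU;
  size_t i;
  int k;
  for( i = 0; i < len; ++i )
    {
    crc ^= data[i];
    for( k = 0; k < 8; ++k )
      crc = ( crc >> 1 ) ^ ( 0xEDB88320U & ( 0U - ( crc & 1U ) ) );
    }
  return crc ^ 0xFFFFFFFFU;
  }

/* Return the number of trailer checks that failed (0 means all passed). */
static int check_trailer( const uint8_t * const trailer,
                          const uint8_t * const data, const size_t data_size,
                          const uint64_t member_size )
  {
  int failures = 0;
  assert( trailer != NULL );
  if( (uint32_t)read_le( trailer, 4 ) != crc32_of( data, data_size ) )
    ++failures;                                   /* CRC mismatch */
  if( read_le( trailer + 4, 8 ) != data_size )
    ++failures;                                   /* data size mismatch */
  if( read_le( trailer + 12, 8 ) != member_size )
    ++failures;                                   /* member size mismatch */
  return failures;
  }
```

The sketch covers only the three stored values. The other two factors named in the README (the value remaining in the range decoder and the end-of-stream marker) depend on decoder state and are left out.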
diff --git a/configure b/configure
index 81068f8..846ca8b 100755
--- a/configure
+++ b/configure
@@ -1,14 +1,14 @@
#! /bin/sh
-# configure script for Clzip - Data compressor based on the LZMA algorithm
+# configure script for Clzip - LZMA lossless data compressor
# Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
#
# This configure script is free software: you have unlimited permission
# to copy, distribute and modify it.
pkgname=clzip
-pkgversion=1.5-pre1
+pkgversion=1.5-pre2
progname=clzip
-srctrigger=doc/clzip.texinfo
+srctrigger=doc/${pkgname}.texinfo
# clear some things potentially inherited from environment.
LC_ALL=C
@@ -26,9 +26,8 @@ CFLAGS='-Wall -W -O2'
LDFLAGS=
# checking whether we are using GNU C.
-if [ ! -x /bin/gcc ] &&
- [ ! -x /usr/bin/gcc ] &&
- [ ! -x /usr/local/bin/gcc ] ; then
+${CC} --version > /dev/null 2>&1
+if [ $? != 0 ] ; then
CC=cc
CFLAGS='-W -O2'
fi
@@ -96,16 +95,19 @@ while [ $# != 0 ] ; do
CFLAGS=*) CFLAGS=${optarg} ;;
LDFLAGS=*) LDFLAGS=${optarg} ;;
- --* | *=* | *-*-*) ;;
+ --*)
+ echo "configure: WARNING: unrecognized option: '${option}'" 1>&2 ;;
+ *=* | *-*-*) ;;
*)
- echo "configure: Unrecognized option: \"${option}\"; use --help for usage." 1>&2
+ echo "configure: unrecognized option: '${option}'" 1>&2
+ echo "Try 'configure --help' for more information." 1>&2
exit 1 ;;
esac
# Check if the option took a separate argument
if [ "${arg2}" = yes ] ; then
if [ $# != 0 ] ; then args="${args} \"$1\"" ; shift
- else echo "configure: Missing argument to \"${option}\"" 1>&2
+ else echo "configure: Missing argument to '${option}'" 1>&2
exit 1
fi
fi
@@ -123,10 +125,8 @@ if [ -z "${srcdir}" ] ; then
fi
if [ ! -r "${srcdir}/${srctrigger}" ] ; then
- exec 1>&2
- echo
- echo "configure: Can't find sources in ${srcdir} ${srcdirtext}"
- echo "configure: (At least ${srctrigger} is missing)."
+ echo "configure: Can't find sources in ${srcdir} ${srcdirtext}" 1>&2
+ echo "configure: (At least ${srctrigger} is missing)." 1>&2
exit 1
fi
@@ -164,7 +164,7 @@ echo "CFLAGS = ${CFLAGS}"
echo "LDFLAGS = ${LDFLAGS}"
rm -f Makefile
cat > Makefile << EOF
-# Makefile for Clzip - Data compressor based on the LZMA algorithm
+# Makefile for Clzip - LZMA lossless data compressor
# Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
# This file was generated automatically by configure. Do not edit.
#
diff --git a/decoder.c b/decoder.c
index d3f2bf0..d1df32f 100644
--- a/decoder.c
+++ b/decoder.c
@@ -1,4 +1,4 @@
-/* Clzip - Data compressor based on the LZMA algorithm
+/* Clzip - LZMA lossless data compressor
Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
This program is free software: you can redistribute it and/or modify
@@ -34,7 +34,7 @@ CRC32 crc32;
void Pp_show_msg( struct Pretty_print * const pp, const char * const msg )
{
- if( pp->verbosity >= 0 )
+ if( verbosity >= 0 )
{
if( pp->first_post )
{
@@ -122,26 +122,23 @@ bool LZd_verify_trailer( struct LZ_decoder * const decoder,
struct Pretty_print * const pp )
{
File_trailer trailer;
- const int trailer_size = Ft_versioned_size( decoder->member_version );
const unsigned long long member_size =
- Rd_member_position( decoder->rdec ) + trailer_size;
+ Rd_member_position( decoder->rdec ) + Ft_size;
bool error = false;
- int size = Rd_read_data( decoder->rdec, trailer, trailer_size );
- if( size < trailer_size )
+ int size = Rd_read_data( decoder->rdec, trailer, Ft_size );
+ if( size < Ft_size )
{
error = true;
- if( pp->verbosity >= 0 )
+ if( verbosity >= 0 )
{
Pp_show_msg( pp, 0 );
fprintf( stderr, "Trailer truncated at trailer position %d;"
" some checks may fail.\n", size );
}
- while( size < trailer_size ) trailer[size++] = 0;
+ while( size < Ft_size ) trailer[size++] = 0;
}
- if( decoder->member_version == 0 ) Ft_set_member_size( trailer, member_size );
-
if( decoder->rdec->code != 0 )
{
error = true;
@@ -150,7 +147,7 @@ bool LZd_verify_trailer( struct LZ_decoder * const decoder,
if( Ft_get_data_crc( trailer ) != LZd_crc( decoder ) )
{
error = true;
- if( pp->verbosity >= 0 )
+ if( verbosity >= 0 )
{
Pp_show_msg( pp, 0 );
fprintf( stderr, "CRC mismatch; trailer says %08X, data CRC is %08X.\n",
@@ -160,7 +157,7 @@ bool LZd_verify_trailer( struct LZ_decoder * const decoder,
if( Ft_get_data_size( trailer ) != LZd_data_position( decoder ) )
{
error = true;
- if( pp->verbosity >= 0 )
+ if( verbosity >= 0 )
{
Pp_show_msg( pp, 0 );
fprintf( stderr, "Data size mismatch; trailer says %llu, data size is %llu (0x%llX).\n",
@@ -170,19 +167,19 @@ bool LZd_verify_trailer( struct LZ_decoder * const decoder,
if( Ft_get_member_size( trailer ) != member_size )
{
error = true;
- if( pp->verbosity >= 0 )
+ if( verbosity >= 0 )
{
Pp_show_msg( pp, 0 );
fprintf( stderr, "Member size mismatch; trailer says %llu, member size is %llu (0x%llX).\n",
Ft_get_member_size( trailer ), member_size, member_size );
}
}
- if( !error && pp->verbosity >= 2 && LZd_data_position( decoder ) > 0 && member_size > 0 )
+ if( !error && verbosity >= 2 && LZd_data_position( decoder ) > 0 && member_size > 0 )
fprintf( stderr, "%6.3f:1, %6.3f bits/byte, %5.2f%% saved. ",
(double)LZd_data_position( decoder ) / member_size,
( 8.0 * member_size ) / LZd_data_position( decoder ),
100.0 * ( 1.0 - ( (double)member_size / LZd_data_position( decoder ) ) ) );
- if( !error && pp->verbosity >= 4 )
+ if( !error && verbosity >= 4 )
fprintf( stderr, "data CRC %08X, data size %9llu, member size %8llu. ",
Ft_get_data_crc( trailer ),
Ft_get_data_size( trailer ), Ft_get_member_size( trailer ) );
@@ -195,29 +192,30 @@ bool LZd_verify_trailer( struct LZ_decoder * const decoder,
int LZd_decode_member( struct LZ_decoder * const decoder,
struct Pretty_print * const pp )
{
+ struct Range_decoder * const rdec = decoder->rdec;
unsigned rep0 = 0; /* rep[0-3] latest four distances */
unsigned rep1 = 0; /* used for efficient coding of */
unsigned rep2 = 0; /* repeated distances */
unsigned rep3 = 0;
State state = 0;
- Rd_load( decoder->rdec );
- while( !Rd_finished( decoder->rdec ) )
+ Rd_load( rdec );
+ while( !Rd_finished( rdec ) )
{
const int pos_state = LZd_data_position( decoder ) & pos_state_mask;
- if( Rd_decode_bit( decoder->rdec, &decoder->bm_match[state][pos_state] ) == 0 ) /* 1st bit */
+ if( Rd_decode_bit( rdec, &decoder->bm_match[state][pos_state] ) == 0 ) /* 1st bit */
{
const uint8_t prev_byte = LZd_get_prev_byte( decoder );
if( St_is_char( state ) )
{
state -= ( state < 4 ) ? state : 3;
- LZd_put_byte( decoder, Rd_decode_tree( decoder->rdec,
+ LZd_put_byte( decoder, Rd_decode_tree( rdec,
decoder->bm_literal[get_lit_state(prev_byte)], 8 ) );
}
else
{
state -= ( state < 10 ) ? 3 : 6;
- LZd_put_byte( decoder, Rd_decode_matched( decoder->rdec,
+ LZd_put_byte( decoder, Rd_decode_matched( rdec,
decoder->bm_literal[get_lit_state(prev_byte)],
LZd_get_byte( decoder, rep0 ) ) );
}
@@ -225,22 +223,22 @@ int LZd_decode_member( struct LZ_decoder * const decoder,
else
{
int len;
- if( Rd_decode_bit( decoder->rdec, &decoder->bm_rep[state] ) == 1 ) /* 2nd bit */
+ if( Rd_decode_bit( rdec, &decoder->bm_rep[state] ) == 1 ) /* 2nd bit */
{
- if( Rd_decode_bit( decoder->rdec, &decoder->bm_rep0[state] ) == 0 ) /* 3rd bit */
+ if( Rd_decode_bit( rdec, &decoder->bm_rep0[state] ) == 0 ) /* 3rd bit */
{
- if( Rd_decode_bit( decoder->rdec, &decoder->bm_len[state][pos_state] ) == 0 ) /* 4th bit */
+ if( Rd_decode_bit( rdec, &decoder->bm_len[state][pos_state] ) == 0 ) /* 4th bit */
{ state = St_set_short_rep( state );
LZd_put_byte( decoder, LZd_get_byte( decoder, rep0 ) ); continue; }
}
else
{
unsigned distance;
- if( Rd_decode_bit( decoder->rdec, &decoder->bm_rep1[state] ) == 0 ) /* 4th bit */
+ if( Rd_decode_bit( rdec, &decoder->bm_rep1[state] ) == 0 ) /* 4th bit */
distance = rep1;
else
{
- if( Rd_decode_bit( decoder->rdec, &decoder->bm_rep2[state] ) == 0 ) /* 5th bit */
+ if( Rd_decode_bit( rdec, &decoder->bm_rep2[state] ) == 0 ) /* 5th bit */
distance = rep2;
else
{ distance = rep3; rep3 = rep2; }
@@ -250,31 +248,30 @@ int LZd_decode_member( struct LZ_decoder * const decoder,
rep0 = distance;
}
state = St_set_rep( state );
- len = min_match_len + Rd_decode_len( decoder->rdec, &decoder->rep_len_model, pos_state );
+ len = min_match_len + Rd_decode_len( rdec, &decoder->rep_len_model, pos_state );
}
else
{
int dis_slot;
const unsigned rep0_saved = rep0;
- len = min_match_len + Rd_decode_len( decoder->rdec, &decoder->match_len_model, pos_state );
- dis_slot = Rd_decode_tree6( decoder->rdec, decoder->bm_dis_slot[get_dis_state(len)] );
+ len = min_match_len + Rd_decode_len( rdec, &decoder->match_len_model, pos_state );
+ dis_slot = Rd_decode_tree6( rdec, decoder->bm_dis_slot[get_dis_state(len)] );
if( dis_slot < start_dis_model ) rep0 = dis_slot;
else
{
const int direct_bits = ( dis_slot >> 1 ) - 1;
rep0 = ( 2 | ( dis_slot & 1 ) ) << direct_bits;
if( dis_slot < end_dis_model )
- rep0 += Rd_decode_tree_reversed( decoder->rdec,
- decoder->bm_dis + rep0 - dis_slot - 1,
- direct_bits );
+ rep0 += Rd_decode_tree_reversed( rdec,
+ decoder->bm_dis + rep0 - dis_slot - 1, direct_bits );
else
{
- rep0 += Rd_decode( decoder->rdec, direct_bits - dis_align_bits ) << dis_align_bits;
- rep0 += Rd_decode_tree_reversed4( decoder->rdec, decoder->bm_align );
+ rep0 += Rd_decode( rdec, direct_bits - dis_align_bits ) << dis_align_bits;
+ rep0 += Rd_decode_tree_reversed4( rdec, decoder->bm_align );
if( rep0 == 0xFFFFFFFFU ) /* Marker found */
{
rep0 = rep0_saved;
- Rd_normalize( decoder->rdec );
+ Rd_normalize( rdec );
LZd_flush_data( decoder );
if( len == min_match_len ) /* End Of Stream marker */
{
@@ -282,9 +279,9 @@ int LZd_decode_member( struct LZ_decoder * const decoder,
}
if( len == min_match_len + 1 ) /* Sync Flush marker */
{
- Rd_load( decoder->rdec ); continue;
+ Rd_load( rdec ); continue;
}
- if( pp->verbosity >= 0 )
+ if( verbosity >= 0 )
{
Pp_show_msg( pp, 0 );
fprintf( stderr, "Unsupported marker code '%d'.\n", len );
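
Note on the LZd_verify_trailer hunk above: at verbosity >= 2 the decoder prints a ratio, a bits/byte figure and a percentage saved, all derived from the data position and the member size. The arithmetic can be sketched in isolation as below. This is an illustration only; `struct Ratio_stats` and `ratio_stats` are hypothetical names that do not appear in clzip.

```c
#include <assert.h>

/* The three figures printed by the "%6.3f:1, %6.3f bits/byte, %5.2f%% saved"
   format string in the hunk above. */
struct Ratio_stats
  {
  double ratio;           /* uncompressed size : compressed size */
  double bits_per_byte;   /* compressed bits per uncompressed byte */
  double percent_saved;   /* space saved relative to the original */
  };

/* Both sizes must be positive, matching the guard in the original code
   ( LZd_data_position( decoder ) > 0 && member_size > 0 ). */
static struct Ratio_stats ratio_stats( const unsigned long long data_size,
                                       const unsigned long long member_size )
  {
  struct Ratio_stats s;
  assert( data_size > 0 && member_size > 0 );
  s.ratio = (double)data_size / member_size;
  s.bits_per_byte = ( 8.0 * member_size ) / data_size;
  s.percent_saved = 100.0 * ( 1.0 - ( (double)member_size / data_size ) );
  return s;
  }
```

For example, 1000 bytes of data stored in a 250-byte member give a ratio of 4:1, 2 bits/byte and 75% saved.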
diff --git a/decoder.h b/decoder.h
index 1c6ed3d..280af82 100644
--- a/decoder.h
+++ b/decoder.h
@@ -1,4 +1,4 @@
-/* Clzip - Data compressor based on the LZMA algorithm
+/* Clzip - LZMA lossless data compressor
Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
This program is free software: you can redistribute it and/or modify
@@ -237,7 +237,6 @@ struct LZ_decoder
int stream_pos; /* first byte not yet written to file */
uint32_t crc;
int outfd; /* output file descriptor */
- int member_version;
Bit_model bm_literal[1<<literal_context_bits][0x300];
Bit_model bm_match[states][pos_states];
@@ -314,7 +313,6 @@ static inline bool LZd_init( struct LZ_decoder * const decoder,
decoder->stream_pos = 0;
decoder->crc = 0xFFFFFFFFU;
decoder->outfd = ofd;
- decoder->member_version = Fh_version( header );
Bm_array_init( decoder->bm_literal[0], (1 << literal_context_bits) * 0x300 );
Bm_array_init( decoder->bm_match[0], states * pos_states );
diff --git a/doc/clzip.1 b/doc/clzip.1
index 4fc2a26..6ad560c 100644
--- a/doc/clzip.1
+++ b/doc/clzip.1
@@ -1,12 +1,12 @@
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.37.1.
-.TH CLZIP "1" "May 2013" "Clzip 1.5-pre1" "User Commands"
+.TH CLZIP "1" "July 2013" "Clzip 1.5-pre2" "User Commands"
.SH NAME
Clzip \- reduces the size of files
.SH SYNOPSIS
.B clzip
[\fIoptions\fR] [\fIfiles\fR]
.SH DESCRIPTION
-Clzip \- Data compressor based on the LZMA algorithm.
+Clzip \- LZMA lossless data compressor.
.SH OPTIONS
.TP
\fB\-h\fR, \fB\-\-help\fR
diff --git a/doc/clzip.info b/doc/clzip.info
index 41723f3..263affa 100644
--- a/doc/clzip.info
+++ b/doc/clzip.info
@@ -3,7 +3,7 @@ clzip.texinfo.
INFO-DIR-SECTION Data Compression
START-INFO-DIR-ENTRY
-* Clzip: (clzip). Data compressor based on the LZMA algorithm
+* Clzip: (clzip). LZMA lossless data compressor
END-INFO-DIR-ENTRY

@@ -12,17 +12,17 @@ File: clzip.info, Node: Top, Next: Introduction, Up: (dir)
Clzip Manual
************
-This manual is for Clzip (version 1.5-pre1, 13 May 2013).
+This manual is for Clzip (version 1.5-pre2, 17 July 2013).
* Menu:
-* Introduction:: Purpose and features of clzip
-* Algorithm:: How clzip compresses the data
-* Invoking Clzip:: Command line interface
-* File Format:: Detailed format of the compressed file
-* Examples:: A small tutorial with examples
-* Problems:: Reporting bugs
-* Concept Index:: Index of concepts
+* Introduction:: Purpose and features of clzip
+* Algorithm:: How clzip compresses the data
+* Invoking clzip:: Command line interface
+* File format:: Detailed format of the compressed file
+* Examples:: A small tutorial with examples
+* Problems:: Reporting bugs
+* Concept index:: Index of concepts
Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
@@ -36,23 +36,39 @@ File: clzip.info, Node: Introduction, Next: Algorithm, Prev: Top, Up: Top
1 Introduction
**************
-Clzip is a lossless data compressor based on the LZMA algorithm, with
-very safe integrity checking and a user interface similar to the one of
-gzip or bzip2. Clzip decompresses almost as fast as gzip and compresses
-better than bzip2, which makes it well suited for software distribution
-and data archiving.
+Clzip is a lossless data compressor with a user interface similar to the
+one of gzip or bzip2. Clzip decompresses almost as fast as gzip and
+compresses more than bzip2, which makes it well suited for software
+distribution and data archiving. Clzip is a clean implementation of the
+LZMA algorithm.
- Clzip uses the same well-defined exit status values used by bzip2,
-which makes it safer when used in pipes or scripts than compressors
-returning ambiguous warning values, like gzip.
+ Clzip uses the same well-defined exit status values used by lzip and
+bzip2, which makes it safer when used in pipes or scripts than
+compressors returning ambiguous warning values, like gzip.
Clzip uses the lzip file format; the files produced by clzip are
-fully compatible with lzip-1.4 or newer. Clzip is in fact a C language
-version of lzip, intended for embedded devices or systems lacking a C++
-compiler.
+fully compatible with lzip-1.4 or newer, and can be rescued with
+lziprecover. Clzip is in fact a C language version of lzip, intended
+for embedded devices or systems lacking a C++ compiler.
+
+ The lzip file format is designed for long-term data archiving and
+provides very safe integrity checking. The member trailer stores the
+32-bit CRC of the original data, the size of the original data and the
+size of the member. These values, together with the value remaining in
+the range decoder and the end-of-stream marker, provide a 4 factor
+integrity checking which guarantees that the decompressed version of the
+data is identical to the original. This guards against corruption of the
+compressed data, and against undetected bugs in clzip (hopefully very
+unlikely). The chances of data corruption going undetected are
+microscopic. Be aware, though, that the check occurs upon decompression,
+so it can only tell you that something is wrong. It can't help you
+recover the original uncompressed data.
If you ever need to recover data from a damaged lzip file, try the
-lziprecover program.
+lziprecover program. Lziprecover makes lzip files resistant to bit-flip
+(one of the most common forms of data corruption), and provides data
+recovery capabilities, including error-checked merging of damaged copies
+of a file.
Clzip replaces every file given in the command line with a compressed
version of itself, with the name "original_name.lz". Each compressed
@@ -99,20 +115,8 @@ filename.lz becomes filename
filename.tlz becomes filename.tar
anyothername becomes anyothername.out
- As a self-check for your protection, clzip stores in the member
-trailer the 32-bit CRC of the original data, the size of the original
-data and the size of the member. These values, together with the value
-remaining in the range decoder and the end-of-stream marker, provide a
-very safe 4 factor integrity checking which guarantees that the
-decompressed version of the data is identical to the original. This
-guards against corruption of the compressed data, and against
-undetected bugs in clzip (hopefully very unlikely). The chances of data
-corruption going undetected are microscopic. Be aware, though, that the
-check occurs upon decompression, so it can only tell you that something
-is wrong. It can't help you recover the original uncompressed data.
-

-File: clzip.info, Node: Algorithm, Next: Invoking Clzip, Prev: Introduction, Up: Top
+File: clzip.info, Node: Algorithm, Next: Invoking clzip, Prev: Introduction, Up: Top
2 Algorithm
***********
@@ -173,9 +177,9 @@ range encoding), Igor Pavlov (for putting all the above together in
LZMA), and Julian Seward (for bzip2's CLI and the idea of unzcrash).

-File: clzip.info, Node: Invoking Clzip, Next: File Format, Prev: Algorithm, Up: Top
+File: clzip.info, Node: Invoking clzip, Next: File format, Prev: Algorithm, Up: Top
-3 Invoking Clzip
+3 Invoking clzip
****************
The format for running clzip is:
@@ -278,10 +282,10 @@ The format for running clzip is:
`--verbose'
Verbose mode.
When compressing, show the compression ratio for each file
- processed.
+ processed. A second -v shows the progress of compression.
When decompressing or testing, further -v's (up to 4) increase the
- verbosity level, showing status, dictionary size, compression
- ratio, and trailer contents (CRC, data size, member size).
+ verbosity level, showing status, compression ratio, dictionary
+ size, and trailer contents (CRC, data size, member size).
`-1 .. -9'
Set the compression parameters (dictionary size and match length
@@ -333,9 +337,9 @@ invalid input file, 3 for an internal consistency error (eg, bug) which
caused clzip to panic.

-File: clzip.info, Node: File Format, Next: Examples, Prev: Invoking Clzip, Up: Top
+File: clzip.info, Node: File format, Next: Examples, Prev: Invoking clzip, Up: Top
-4 File Format
+4 File format
*************
Perfection is reached, not when there is no longer anything to add, but
@@ -389,7 +393,8 @@ additional information before, between, or after them.
`Lzma stream'
The lzma stream, finished by an end of stream marker. Uses default
- values for encoder properties.
+ values for encoder properties. See the lzip manual for a full
+ description.
`CRC32 (4 bytes)'
CRC of the uncompressed original data.
@@ -405,7 +410,7 @@ additional information before, between, or after them.

-File: clzip.info, Node: Examples, Next: Problems, Prev: File Format, Up: Top
+File: clzip.info, Node: Examples, Next: Problems, Prev: File format, Up: Top
5 A small tutorial with examples
********************************
@@ -478,7 +483,7 @@ file with a member size of 32MiB.
clzip -b 32MiB -S 650MB big_db

-File: clzip.info, Node: Problems, Next: Concept Index, Prev: Examples, Up: Top
+File: clzip.info, Node: Problems, Next: Concept index, Prev: Examples, Up: Top
6 Reporting Bugs
****************
@@ -493,9 +498,9 @@ for all eternity, if not longer.
by running `clzip --version'.

-File: clzip.info, Node: Concept Index, Prev: Problems, Up: Top
+File: clzip.info, Node: Concept index, Prev: Problems, Up: Top
-Concept Index
+Concept index
*************
@@ -504,25 +509,25 @@ Concept Index
* algorithm: Algorithm. (line 6)
* bugs: Problems. (line 6)
* examples: Examples. (line 6)
-* file format: File Format. (line 6)
+* file format: File format. (line 6)
* getting help: Problems. (line 6)
* introduction: Introduction. (line 6)
-* invoking: Invoking Clzip. (line 6)
-* options: Invoking Clzip. (line 6)
-* usage: Invoking Clzip. (line 6)
-* version: Invoking Clzip. (line 6)
+* invoking: Invoking clzip. (line 6)
+* options: Invoking clzip. (line 6)
+* usage: Invoking clzip. (line 6)
+* version: Invoking clzip. (line 6)

Tag Table:
-Node: Top226
-Node: Introduction920
-Node: Algorithm4811
-Node: Invoking Clzip7335
-Node: File Format12847
-Node: Examples15277
-Node: Problems17238
-Node: Concept Index17764
+Node: Top212
+Node: Introduction914
+Node: Algorithm5096
+Node: Invoking clzip7620
+Node: File format13179
+Node: Examples15658
+Node: Problems17619
+Node: Concept index18145

End Tag Table
diff --git a/doc/clzip.texinfo b/doc/clzip.texinfo
index e372d60..49d0761 100644
--- a/doc/clzip.texinfo
+++ b/doc/clzip.texinfo
@@ -6,19 +6,19 @@
@finalout
@c %**end of header
-@set UPDATED 13 May 2013
-@set VERSION 1.5-pre1
+@set UPDATED 17 July 2013
+@set VERSION 1.5-pre2
@dircategory Data Compression
@direntry
-* Clzip: (clzip). Data compressor based on the LZMA algorithm
+* Clzip: (clzip). LZMA lossless data compressor
@end direntry
@ifnothtml
@titlepage
@title Clzip
-@subtitle Data compressor based on the LZMA algorithm
+@subtitle LZMA lossless data compressor
@subtitle for Clzip version @value{VERSION}, @value{UPDATED}
@author by Antonio Diaz Diaz
@@ -35,13 +35,13 @@
This manual is for Clzip (version @value{VERSION}, @value{UPDATED}).
@menu
-* Introduction:: Purpose and features of clzip
-* Algorithm:: How clzip compresses the data
-* Invoking Clzip:: Command line interface
-* File Format:: Detailed format of the compressed file
-* Examples:: A small tutorial with examples
-* Problems:: Reporting bugs
-* Concept Index:: Index of concepts
+* Introduction:: Purpose and features of clzip
+* Algorithm:: How clzip compresses the data
+* Invoking clzip:: Command line interface
+* File format:: Detailed format of the compressed file
+* Examples:: A small tutorial with examples
+* Problems:: Reporting bugs
+* Concept index:: Index of concepts
@end menu
@sp 1
@@ -55,23 +55,39 @@ to copy, distribute and modify it.
@chapter Introduction
@cindex introduction
-Clzip is a lossless data compressor based on the LZMA algorithm, with
-very safe integrity checking and a user interface similar to the one of
-gzip or bzip2. Clzip decompresses almost as fast as gzip and compresses
-better than bzip2, which makes it well suited for software distribution
-and data archiving.
+Clzip is a lossless data compressor with a user interface similar to the
+one of gzip or bzip2. Clzip decompresses almost as fast as gzip and
+compresses more than bzip2, which makes it well suited for software
+distribution and data archiving. Clzip is a clean implementation of the
+LZMA algorithm.
-Clzip uses the same well-defined exit status values used by bzip2, which
-makes it safer when used in pipes or scripts than compressors returning
-ambiguous warning values, like gzip.
+Clzip uses the same well-defined exit status values used by lzip and
+bzip2, which makes it safer when used in pipes or scripts than
+compressors returning ambiguous warning values, like gzip.
Clzip uses the lzip file format; the files produced by clzip are fully
-compatible with lzip-1.4 or newer. Clzip is in fact a C language version
-of lzip, intended for embedded devices or systems lacking a C++
-compiler.
+compatible with lzip-1.4 or newer, and can be rescued with lziprecover.
+Clzip is in fact a C language version of lzip, intended for embedded
+devices or systems lacking a C++ compiler.
+
+The lzip file format is designed for long-term data archiving and
+provides very safe integrity checking. The member trailer stores the
+32-bit CRC of the original data, the size of the original data and the
+size of the member. These values, together with the value remaining in
+the range decoder and the end-of-stream marker, provide a 4 factor
+integrity checking which guarantees that the decompressed version of the
+data is identical to the original. This guards against corruption of the
+compressed data, and against undetected bugs in clzip (hopefully very
+unlikely). The chances of data corruption going undetected are
+microscopic. Be aware, though, that the check occurs upon decompression,
+so it can only tell you that something is wrong. It can't help you
+recover the original uncompressed data.
If you ever need to recover data from a damaged lzip file, try the
-lziprecover program.
+lziprecover program. Lziprecover makes lzip files resistant to bit-flip
+(one of the most common forms of data corruption), and provides data
+recovery capabilities, including error-checked merging of damaged copies
+of a file.
Clzip replaces every file given in the command line with a compressed
version of itself, with the name "original_name.lz". Each compressed
@@ -120,18 +136,6 @@ file from that of the compressed file as follows:
@item anyothername @tab becomes @tab anyothername.out
@end multitable
-As a self-check for your protection, clzip stores in the member trailer
-the 32-bit CRC of the original data, the size of the original data and
-the size of the member. These values, together with the value remaining
-in the range decoder and the end-of-stream marker, provide a very safe 4
-factor integrity checking which guarantees that the decompressed version
-of the data is identical to the original. This guards against corruption
-of the compressed data, and against undetected bugs in clzip (hopefully
-very unlikely). The chances of data corruption going undetected are
-microscopic. Be aware, though, that the check occurs upon decompression,
-so it can only tell you that something is wrong. It can't help you
-recover the original uncompressed data.
-
@node Algorithm
@chapter Algorithm
@@ -194,8 +198,8 @@ range encoding), Igor Pavlov (for putting all the above together in
LZMA), and Julian Seward (for bzip2's CLI and the idea of unzcrash).
-@node Invoking Clzip
-@chapter Invoking Clzip
+@node Invoking clzip
+@chapter Invoking clzip
@cindex invoking
@cindex options
@cindex usage
@@ -296,9 +300,10 @@ Use it together with @samp{-v} to see information about the file.
@item -v
@itemx --verbose
Verbose mode.@*
-When compressing, show the compression ratio for each file processed.@*
+When compressing, show the compression ratio for each file processed. A
+second -v shows the progress of compression.@*
When decompressing or testing, further -v's (up to 4) increase the
-verbosity level, showing status, dictionary size, compression ratio,
+verbosity level, showing status, compression ratio, dictionary size,
and trailer contents (CRC, data size, member size).
@item -1 .. -9
@@ -356,8 +361,8 @@ invalid input file, 3 for an internal consistency error (eg, bug) which
caused clzip to panic.
-@node File Format
-@chapter File Format
+@node File format
+@chapter File format
@cindex file format
Perfection is reached, not when there is no longer anything to add, but
@@ -415,7 +420,7 @@ Valid values for dictionary size range from 4KiB to 512MiB.
@item Lzma stream
The lzma stream, finished by an end of stream marker. Uses default values
-for encoder properties.
+for encoder properties. See the lzip manual for a full description.
@item CRC32 (4 bytes)
CRC of the uncompressed original data.
@@ -549,8 +554,8 @@ If you find a bug in clzip, please send electronic mail to
find by running @w{@samp{clzip --version}}.
-@node Concept Index
-@unnumbered Concept Index
+@node Concept index
+@unnumbered Concept index
@printindex cp
diff --git a/encoder.c b/encoder.c
index 5b005b0..312c569 100644
--- a/encoder.c
+++ b/encoder.c
@@ -1,4 +1,4 @@
-/* Clzip - Data compressor based on the LZMA algorithm
+/* Clzip - LZMA lossless data compressor
Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
This program is free software: you can redistribute it and/or modify
@@ -248,6 +248,7 @@ void Re_flush_data( struct Range_encoder * const renc )
{ show_error( "Write error", errno, false ); cleanup_and_fail( 1 ); }
renc->partial_member_pos += renc->pos;
renc->pos = 0;
+ if( verbosity >= 2 ) show_progress( 0, 0, 0, 0 );
}
}
@@ -289,17 +290,16 @@ static void LZe_full_flush( struct LZ_encoder * const encoder, const State state
int i;
const int pos_state = Mf_data_position( encoder->matchfinder ) & pos_state_mask;
File_trailer trailer;
- Re_encode_bit( &encoder->range_encoder, &encoder->bm_match[state][pos_state], 1 );
- Re_encode_bit( &encoder->range_encoder, &encoder->bm_rep[state], 0 );
+ Re_encode_bit( &encoder->renc, &encoder->bm_match[state][pos_state], 1 );
+ Re_encode_bit( &encoder->renc, &encoder->bm_rep[state], 0 );
LZe_encode_pair( encoder, 0xFFFFFFFFU, min_match_len, pos_state );
- Re_flush( &encoder->range_encoder );
+ Re_flush( &encoder->renc );
Ft_set_data_crc( trailer, LZe_crc( encoder ) );
Ft_set_data_size( trailer, Mf_data_position( encoder->matchfinder ) );
- Ft_set_member_size( trailer, Re_member_position( &encoder->range_encoder ) +
- Ft_size );
+ Ft_set_member_size( trailer, Re_member_position( &encoder->renc ) + Ft_size );
for( i = 0; i < Ft_size; ++i )
- Re_put_byte( &encoder->range_encoder, trailer[i] );
- Re_flush_data( &encoder->range_encoder );
+ Re_put_byte( &encoder->renc, trailer[i] );
+ Re_flush_data( &encoder->renc );
}
@@ -368,7 +368,7 @@ bool LZe_init( struct LZ_encoder * const encoder,
Bm_array_init( encoder->bm_align, dis_align_size );
encoder->matchfinder = mf;
- if( !Re_init( &encoder->range_encoder, outfd ) ) return false;
+ if( !Re_init( &encoder->renc, outfd ) ) return false;
Lee_init( &encoder->match_len_encoder, encoder->matchfinder->match_len_limit );
Lee_init( &encoder->rep_len_encoder, encoder->matchfinder->match_len_limit );
encoder->num_dis_slots =
@@ -377,13 +377,13 @@ bool LZe_init( struct LZ_encoder * const encoder,
encoder->align_price_count = 0;
for( i = 0; i < Fh_size; ++i )
- Re_put_byte( &encoder->range_encoder, header[i] );
+ Re_put_byte( &encoder->renc, header[i] );
return true;
}
/* Return value == number of bytes advanced (ahead).
- trials[0]..trials[retval-1] contain the steps to encode.
+ trials[0]..trials[ahead-1] contain the steps to encode.
( trials[0].dis == -1 && trials[0].price == 1 ) means literal.
*/
static int LZe_sequence_optimizer( struct LZ_encoder * const encoder,
@@ -584,8 +584,7 @@ static int LZe_sequence_optimizer( struct LZ_encoder * const encoder,
if( St_is_char( cur_state ) )
next_price += LZe_price_literal( encoder, prev_byte, cur_byte );
else
- next_price += LZe_price_matched( encoder,
- prev_byte, cur_byte, match_byte );
+ next_price += LZe_price_matched( encoder, prev_byte, cur_byte, match_byte );
Mf_move_pos( encoder->matchfinder );
/* try last updates to next trial */
@@ -756,14 +755,14 @@ bool LZe_encode_member( struct LZ_encoder * const encoder,
for( i = 0; i < num_rep_distances; ++i ) rep_distances[i] = 0;
if( Mf_data_position( encoder->matchfinder ) != 0 ||
- Re_member_position( &encoder->range_encoder ) != Fh_size )
+ Re_member_position( &encoder->renc ) != Fh_size )
return false; /* can be called only once */
if( !Mf_finished( encoder->matchfinder ) ) /* encode first byte */
{
const uint8_t prev_byte = 0;
const uint8_t cur_byte = Mf_peek( encoder->matchfinder, 0 );
- Re_encode_bit( &encoder->range_encoder, &encoder->bm_match[state][0], 0 );
+ Re_encode_bit( &encoder->renc, &encoder->bm_match[state][0], 0 );
LZe_encode_literal( encoder, prev_byte, cur_byte );
CRC32_update_byte( &encoder->crc, cur_byte );
Mf_get_match_pairs( encoder->matchfinder, 0 );
@@ -791,7 +790,7 @@ bool LZe_encode_member( struct LZ_encoder * const encoder,
const int len = encoder->trials[i].price;
bool bit = ( dis < 0 && len == 1 );
- Re_encode_bit( &encoder->range_encoder,
+ Re_encode_bit( &encoder->renc,
&encoder->bm_match[state][pos_state], !bit );
if( bit ) /* literal byte */
{
@@ -813,23 +812,23 @@ bool LZe_encode_member( struct LZ_encoder * const encoder,
CRC32_update_buf( &encoder->crc, Mf_ptr_to_current_pos( encoder->matchfinder ) - ahead, len );
LZe_mtf_reps( dis, rep_distances );
bit = ( dis < num_rep_distances );
- Re_encode_bit( &encoder->range_encoder, &encoder->bm_rep[state], bit );
+ Re_encode_bit( &encoder->renc, &encoder->bm_rep[state], bit );
if( bit )
{
bit = ( dis == 0 );
- Re_encode_bit( &encoder->range_encoder, &encoder->bm_rep0[state], !bit );
+ Re_encode_bit( &encoder->renc, &encoder->bm_rep0[state], !bit );
if( bit )
- Re_encode_bit( &encoder->range_encoder, &encoder->bm_len[state][pos_state], len > 1 );
+ Re_encode_bit( &encoder->renc, &encoder->bm_len[state][pos_state], len > 1 );
else
{
- Re_encode_bit( &encoder->range_encoder, &encoder->bm_rep1[state], dis > 1 );
+ Re_encode_bit( &encoder->renc, &encoder->bm_rep1[state], dis > 1 );
if( dis > 1 )
- Re_encode_bit( &encoder->range_encoder, &encoder->bm_rep2[state], dis > 2 );
+ Re_encode_bit( &encoder->renc, &encoder->bm_rep2[state], dis > 2 );
}
if( len == 1 ) state = St_set_short_rep( state );
else
{
- Lee_encode( &encoder->rep_len_encoder, &encoder->range_encoder, len, pos_state );
+ Lee_encode( &encoder->rep_len_encoder, &encoder->renc, len, pos_state );
state = St_set_rep( state );
}
}
@@ -841,7 +840,7 @@ bool LZe_encode_member( struct LZ_encoder * const encoder,
}
}
ahead -= len; i += len;
- if( Re_member_position( &encoder->range_encoder ) >= member_size_limit )
+ if( Re_member_position( &encoder->renc ) >= member_size_limit )
{
if( !Mf_dec_pos( encoder->matchfinder, ahead ) ) return false;
LZe_full_flush( encoder, state );
diff --git a/encoder.h b/encoder.h
index a69f552..365d602 100644
--- a/encoder.h
+++ b/encoder.h
@@ -1,4 +1,4 @@
-/* Clzip - Data compressor based on the LZMA algorithm
+/* Clzip - LZMA lossless data compressor
Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
This program is free software: you can redistribute it and/or modify
@@ -493,7 +493,7 @@ struct LZ_encoder
Bit_model bm_align[dis_align_size];
struct Matchfinder * matchfinder;
- struct Range_encoder range_encoder;
+ struct Range_encoder renc;
struct Len_encoder match_len_encoder;
struct Len_encoder rep_len_encoder;
@@ -512,7 +512,7 @@ bool LZe_init( struct LZ_encoder * const encoder,
const File_header header, const int outfd );
static inline void LZe_free( struct LZ_encoder * const encoder )
- { Re_free( &encoder->range_encoder ); }
+ { Re_free( &encoder->renc ); }
static inline unsigned LZe_crc( const struct LZ_encoder * const encoder )
{ return encoder->crc ^ 0xFFFFFFFFU; }
@@ -597,13 +597,13 @@ static inline int LZe_price_matched( const struct LZ_encoder * const encoder,
static inline void LZe_encode_literal( struct LZ_encoder * const encoder,
uint8_t prev_byte, uint8_t symbol )
- { Re_encode_tree( &encoder->range_encoder,
+ { Re_encode_tree( &encoder->renc,
encoder->bm_literal[get_lit_state(prev_byte)], symbol, 8 ); }
static inline void LZe_encode_matched( struct LZ_encoder * const encoder,
uint8_t prev_byte, uint8_t symbol,
uint8_t match_byte )
- { Re_encode_matched( &encoder->range_encoder,
+ { Re_encode_matched( &encoder->renc,
encoder->bm_literal[get_lit_state(prev_byte)],
symbol, match_byte ); }
@@ -612,9 +612,8 @@ static inline void LZe_encode_pair( struct LZ_encoder * const encoder,
const int pos_state )
{
const int dis_slot = get_slot( dis );
- Lee_encode( &encoder->match_len_encoder, &encoder->range_encoder, len, pos_state );
- Re_encode_tree( &encoder->range_encoder,
- encoder->bm_dis_slot[get_dis_state(len)],
+ Lee_encode( &encoder->match_len_encoder, &encoder->renc, len, pos_state );
+ Re_encode_tree( &encoder->renc, encoder->bm_dis_slot[get_dis_state(len)],
dis_slot, dis_slot_bits );
if( dis_slot >= start_dis_model )
@@ -624,14 +623,14 @@ static inline void LZe_encode_pair( struct LZ_encoder * const encoder,
const uint32_t direct_dis = dis - base;
if( dis_slot < end_dis_model )
- Re_encode_tree_reversed( &encoder->range_encoder,
+ Re_encode_tree_reversed( &encoder->renc,
encoder->bm_dis + base - dis_slot - 1,
direct_dis, direct_bits );
else
{
- Re_encode( &encoder->range_encoder, direct_dis >> dis_align_bits,
+ Re_encode( &encoder->renc, direct_dis >> dis_align_bits,
direct_bits - dis_align_bits );
- Re_encode_tree_reversed( &encoder->range_encoder, encoder->bm_align,
+ Re_encode_tree_reversed( &encoder->renc, encoder->bm_align,
direct_dis, dis_align_bits );
--encoder->align_price_count;
}
diff --git a/lzip.h b/lzip.h
index 1996e97..6141d5d 100644
--- a/lzip.h
+++ b/lzip.h
@@ -1,4 +1,4 @@
-/* Clzip - Data compressor based on the LZMA algorithm
+/* Clzip - LZMA lossless data compressor
Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
This program is free software: you can redistribute it and/or modify
@@ -118,12 +118,11 @@ struct Pretty_print
const char * name;
const char * stdin_name;
int longest_name;
- int verbosity;
bool first_post;
};
void Pp_init( struct Pretty_print * const pp, const char * const filenames[],
- const int num_filenames, const int v );
+ const int num_filenames );
static inline void Pp_set_name( struct Pretty_print * const pp,
const char * const filename )
@@ -193,7 +192,7 @@ static inline uint8_t Fh_version( const File_header data )
{ return data[4]; }
static inline bool Fh_verify_version( const File_header data )
- { return ( data[4] <= 1 ); }
+ { return ( data[4] == 1 ); }
static inline unsigned Fh_get_dictionary_size( const File_header data )
{
@@ -230,9 +229,6 @@ typedef uint8_t File_trailer[20];
enum { Ft_size = 20 };
-static inline int Ft_versioned_size( const int version )
- { return ( ( version >= 1 ) ? 20 : 12 ); }
-
static inline unsigned Ft_get_data_crc( const File_trailer data )
{
unsigned tmp = 0;
@@ -281,6 +277,13 @@ int readblock( const int fd, uint8_t * const buf, const int size );
int writeblock( const int fd, const uint8_t * const buf, const int size );
/* defined in main.c */
+extern int verbosity;
void cleanup_and_fail( const int retval );
void show_error( const char * const msg, const int errcode, const bool help );
void internal_error( const char * const msg );
+struct Matchfinder;
+struct stat;
+void show_progress( const unsigned long long partial_size,
+ const struct Matchfinder * const m,
+ struct Pretty_print * const p,
+ const struct stat * const in_statsp );
diff --git a/main.c b/main.c
index 9ca4f90..c1057b5 100644
--- a/main.c
+++ b/main.c
@@ -1,4 +1,4 @@
-/* Clzip - Data compressor based on the LZMA algorithm
+/* Clzip - LZMA lossless data compressor
Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
This program is free software: you can redistribute it and/or modify
@@ -98,7 +98,7 @@ bool delete_output_on_interrupt = false;
static void show_help( void )
{
- printf( "%s - Data compressor based on the LZMA algorithm.\n", Program_name );
+ printf( "%s - LZMA lossless data compressor.\n", Program_name );
printf( "\nUsage: %s [options] [files]\n", invocation_name );
printf( "\nOptions:\n"
" -h, --help display this help and exit\n"
@@ -459,22 +459,23 @@ static int compress( const unsigned long long member_size,
while( true ) /* encode one member per iteration */
{
struct LZ_encoder encoder;
- const unsigned long long size = ( ( volume_size > 0 ) ?
- min( member_size, volume_size - partial_volume_size ) : member_size );
+ const unsigned long long size = ( volume_size > 0 ) ?
+ min( member_size, volume_size - partial_volume_size ) : member_size;
if( !LZe_init( &encoder, &matchfinder, header, outfd ) )
{
show_error( "Not enough memory. Try a smaller dictionary size.", 0, false );
cleanup_and_fail( 1 );
}
+ show_progress( in_size, &matchfinder, pp, in_statsp ); /* init */
if( !LZe_encode_member( &encoder, size ) )
{ Pp_show_msg( pp, "Encoder error" ); retval = 1; break; }
in_size += Mf_data_position( &matchfinder );
- out_size += Re_member_position( &encoder.range_encoder );
+ out_size += Re_member_position( &encoder.renc );
LZe_free( &encoder );
if( Mf_finished( &matchfinder ) ) break;
if( volume_size > 0 )
{
- partial_volume_size += Re_member_position( &encoder.range_encoder );
+ partial_volume_size += Re_member_position( &encoder.renc );
if( partial_volume_size >= volume_size - min_dictionary_size )
{
partial_volume_size = 0;
@@ -604,14 +605,13 @@ static void set_signals( void )
void Pp_init( struct Pretty_print * const pp, const char * const filenames[],
- const int num_filenames, const int v )
+ const int num_filenames )
{
unsigned stdin_name_len;
int i;
pp->name = 0;
pp->stdin_name = "(stdin)";
pp->longest_name = 0;
- pp->verbosity = v;
pp->first_post = false;
stdin_name_len = strlen( pp->stdin_name );
@@ -650,6 +650,34 @@ void internal_error( const char * const msg )
}
+void show_progress( const unsigned long long partial_size,
+ const struct Matchfinder * const m,
+ struct Pretty_print * const p,
+ const struct stat * const in_statsp )
+ {
+ static unsigned long long cfile_size = 0; /* file_size / 100 */
+ static unsigned long long psize = 0;
+ static const struct Matchfinder * mf = 0;
+ static struct Pretty_print * pp = 0;
+
+ if( m ) /* initialize static vars */
+ {
+ psize = partial_size; mf = m; pp = p;
+ cfile_size = ( in_statsp && S_ISREG( in_statsp->st_mode ) ) ?
+ in_statsp->st_size / 100 : 0;
+ return;
+ }
+ if( mf && pp )
+ {
+ const unsigned long long pos = psize + Mf_data_position( mf );
+ if( cfile_size > 0 )
+ fprintf( stderr, "%4llu%%", pos / cfile_size );
+ fprintf( stderr, " %.1f MB\r", pos / 1000000.0 );
+ Pp_reset( pp ); Pp_show_msg( pp, 0 ); /* restore cursor position */
+ }
+ }
+
+
int main( const int argc, const char * const argv[] )
{
/* Mapping from gzip/bzip2 style 1..9 compression modes
@@ -785,7 +813,7 @@ int main( const int argc, const char * const argv[] )
( filenames_given || default_output_filename[0] ) )
set_signals();
- Pp_init( &pp, filenames, num_filenames, verbosity );
+ Pp_init( &pp, filenames, num_filenames );
output_filename = resize_buffer( output_filename, 1 );
for( i = 0; i < num_filenames; ++i )
diff --git a/testsuite/check.sh b/testsuite/check.sh
index d38ebb0..980c3da 100755
--- a/testsuite/check.sh
+++ b/testsuite/check.sh
@@ -1,5 +1,5 @@
#! /bin/sh
-# check script for Clzip - Data compressor based on the LZMA algorithm
+# check script for Clzip - LZMA lossless data compressor
# Copyright (C) 2010, 2011, 2012, 2013 Antonio Diaz Diaz.
#
# This script is free software: you have unlimited permission
@@ -22,27 +22,28 @@ mkdir tmp
cd "${objdir}"/tmp
cat "${testdir}"/test.txt > in || framework_failure
+in_lz="${testdir}"/test.txt.lz
fail=0
printf "testing clzip-%s..." "$2"
"${LZIP}" -cqs-1 in > /dev/null
-if [ $? != 1 ] ; then fail=1 ; printf - ; else printf . ; fi
+if [ $? = 1 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -cqs0 in > /dev/null
-if [ $? != 1 ] ; then fail=1 ; printf - ; else printf . ; fi
+if [ $? = 1 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -cqs4095 in > /dev/null
-if [ $? != 1 ] ; then fail=1 ; printf - ; else printf . ; fi
+if [ $? = 1 ] ; then printf . ; else fail=1 ; printf - ; fi
"${LZIP}" -cqm274 in > /dev/null
-if [ $? != 1 ] ; then fail=1 ; printf - ; else printf . ; fi
+if [ $? = 1 ] ; then printf . ; else fail=1 ; printf - ; fi
-"${LZIP}" -t "${testdir}"/test.txt.lz || fail=1
-"${LZIP}" -cd "${testdir}"/test.txt.lz > copy || fail=1
+"${LZIP}" -t "${in_lz}" || fail=1
+"${LZIP}" -cd "${in_lz}" > copy || fail=1
cmp in copy || fail=1
printf .
-"${LZIP}" -cfq "${testdir}"/test.txt.lz > out
-if [ $? != 1 ] ; then fail=1 ; printf - ; else printf . ; fi
-"${LZIP}" -cF "${testdir}"/test.txt.lz > out || fail=1
+"${LZIP}" -cfq "${in_lz}" > out
+if [ $? = 1 ] ; then printf . ; else fail=1 ; printf - ; fi
+"${LZIP}" -cF "${in_lz}" > out || fail=1
"${LZIP}" -cd out | "${LZIP}" -d > copy || fail=1
cmp in copy || fail=1
printf .
@@ -53,30 +54,30 @@ for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
printf "garbage" >> copy.lz || fail=1
"${LZIP}" -df copy.lz || fail=1
cmp in copy || fail=1
- printf .
done
+printf .
for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
"${LZIP}" -c -$i in > out || fail=1
printf "g" >> out || fail=1
"${LZIP}" -cd out > copy || fail=1
cmp in copy || fail=1
- printf .
done
+printf .
for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
"${LZIP}" -$i < in > out || fail=1
"${LZIP}" -d < out > copy || fail=1
cmp in copy || fail=1
- printf .
done
+printf .
for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
"${LZIP}" -f -$i -o out < in || fail=1
"${LZIP}" -df -o copy < out.lz || fail=1
cmp in copy || fail=1
- printf .
done
+printf .
"${LZIP}" < in > anyothername || fail=1
"${LZIP}" -d anyothername || fail=1