From 50c0d5bace5875eb0a3c4c7aafa564244e46e59c Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Sun, 8 Nov 2015 05:31:16 +0100 Subject: Merging upstream version 1.2. Signed-off-by: Daniel Baumann --- doc/zutils.texi | 745 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 745 insertions(+) create mode 100644 doc/zutils.texi (limited to 'doc/zutils.texi') diff --git a/doc/zutils.texi b/doc/zutils.texi new file mode 100644 index 0000000..09237d2 --- /dev/null +++ b/doc/zutils.texi @@ -0,0 +1,745 @@ +\input texinfo @c -*-texinfo-*- +@c %**start of header +@setfilename zutils.info +@documentencoding ISO-8859-15 +@settitle Zutils Manual +@finalout +@c %**end of header + +@set UPDATED 1 February 2014 +@set VERSION 1.2 + +@dircategory Data Compression +@direntry +* Zutils: (zutils). Utilities dealing with compressed files +@end direntry + + +@ifnothtml +@titlepage +@title Zutils +@subtitle Utilities dealing with compressed files +@subtitle for Zutils version @value{VERSION}, @value{UPDATED} +@author by Antonio Diaz Diaz + +@page +@vskip 0pt plus 1filll +@end titlepage + +@contents +@end ifnothtml + +@node Top +@top + +This manual is for Zutils (version @value{VERSION}, @value{UPDATED}). + +@menu +* Introduction:: Purpose and features of zutils +* Common options:: Common options +* The zutilsrc file:: The zutils configuration file +* Zcat:: Concatenating compressed files +* Zcmp:: Comparing compressed files byte by byte +* Zdiff:: Comparing compressed files line by line +* Zgrep:: Searching inside compressed files +* Ztest:: Testing integrity of compressed files +* Zupdate:: Recompressing files to lzip format +* Problems:: Reporting bugs +* Concept index:: Index of concepts +@end menu + +@sp 1 +Copyright @copyright{} 2009, 2010, 2011, 2012, 2013, 2014 +Antonio Diaz Diaz. + +This manual is free documentation: you have unlimited permission +to copy, distribute and modify it. 
+ + +@node Introduction +@chapter Introduction +@cindex introduction + +Zutils is a collection of utilities able to deal with any combination of +compressed and uncompressed files transparently. If any given file, +including standard input, is compressed, its decompressed content is +used. Compressed files are decompressed on the fly; no temporary files +are created. + +These utilities are not wrapper scripts but safer and more efficient C++ +programs. In particular the @samp{--recursive} option is very efficient +in those utilities supporting it. + +@noindent +The provided utilities are zcat, zcmp, zdiff, zgrep, ztest and zupdate.@* +The supported formats are bzip2, gzip, lzip and xz.@* +The compressor to be used for each format is configurable at runtime. + +Zcat, zcmp, zdiff, and zgrep are improved replacements for the shell +scripts provided with GNU gzip. Ztest is unique to zutils. Zupdate is +similar to gzip's znew. + +NOTE: Bzip2 and lzip provide well-defined values of exit status, which +makes them safe to use with zutils. Gzip and xz may return ambiguous +warning values, making them less reliable back ends for zutils. + +LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never +have been compressed. Decompressed is used to refer to data which has +undergone the process of decompression. + +@sp 1 +Numbers given as arguments to options (positions, sizes) may be followed +by a multiplier and an optional @samp{B} for "byte". 
+ +Table of SI and binary prefixes (unit multipliers): + +@multitable {Prefix} {kilobyte (10^3 = 1000)} {|} {Prefix} {kibibyte (2^10 = 1024)} +@item Prefix @tab Value @tab | @tab Prefix @tab Value +@item k @tab kilobyte (10^3 = 1000) @tab | @tab Ki @tab kibibyte (2^10 = 1024) +@item M @tab megabyte (10^6) @tab | @tab Mi @tab mebibyte (2^20) +@item G @tab gigabyte (10^9) @tab | @tab Gi @tab gibibyte (2^30) +@item T @tab terabyte (10^12) @tab | @tab Ti @tab tebibyte (2^40) +@item P @tab petabyte (10^15) @tab | @tab Pi @tab pebibyte (2^50) +@item E @tab exabyte (10^18) @tab | @tab Ei @tab exbibyte (2^60) +@item Z @tab zettabyte (10^21) @tab | @tab Zi @tab zebibyte (2^70) +@item Y @tab yottabyte (10^24) @tab | @tab Yi @tab yobibyte (2^80) +@end multitable + + +@node Common options +@chapter Common options +@cindex common options + +The following options are available in all the utilities. Rather than +writing identical descriptions for each of the programs, they are +described here. + +@table @samp +@item -h +@itemx --help +Print an informative help message describing the options and exit. Zgrep +only supports the @samp{--help} form of this option. + +@item -V +@itemx --version +Print the version number on the standard output and exit. + +@item -N +@itemx --no-rcfile +Don't read the runtime configuration file @samp{zutilsrc}. + +@item --bz2=@var{command} +@itemx --gz=@var{command} +@itemx --lz=@var{command} +@itemx --xz=@var{command} +Set program (may include arguments) to be used as (de)compressor for the +given format. These options override the values set in @file{zutilsrc}. +The compression program used must meet three requirements: + +@enumerate +@item +When called with the @samp{-d} option, it must read compressed data from +the standard input and produce decompressed data on the standard output. +@item +If the @samp{-q} option is passed to zutils, the compression program +must also accept it. 
+@item +It must return 0 if no errors occurred, and a non-zero value otherwise. +@end enumerate + +@end table + + +@node The zutilsrc file +@chapter The zutilsrc file +@cindex the zutilsrc file + +@file{zutilsrc} is the runtime configuration file for zutils. In it you +may define the compressor name and options to be used for each format. +The @file{zutilsrc} file is optional; you do not need to install it in +order to run zutils. + +The compressors specified on the command line override those specified +in the @file{zutilsrc} file. + +You may copy the system @file{zutilsrc} file +@file{$@{sysconfdir@}/zutilsrc} to @file{$HOME/.zutilsrc} and customize +these options as you like. The file syntax is fairly obvious (and there +are further instructions in it): + +@enumerate +@item +Any line beginning with @samp{#} is a comment line. +@item +Each non-comment line defines the command to be used for the given +format, with the syntax: +@example +<format> = <compressor> [options] +@end example +where <format> is one of @samp{bz2}, @samp{gz}, @samp{lz} or @samp{xz}. +@end enumerate + + +@node Zcat +@chapter Zcat +@cindex zcat + +Zcat copies each given file (@samp{-} means standard input) to standard +output. If any given file is compressed, its decompressed content is +used. If a given file does not exist, and its name does not end with one +of the known extensions, zcat tries the compressed file names +corresponding to the supported formats. + +If no files are specified, data is read from standard input, +decompressed if needed, and sent to standard output. Data read from +standard input must be of the same type; all uncompressed or all in the +same compression format. + +The format for running zcat is: + +@example +zcat [@var{options}] [@var{files}] +@end example + +@noindent +Exit status is 0 if no errors occurred, non-zero otherwise. + +Zcat supports the following options: + +@table @samp +@item -A +@itemx --show-all +Equivalent to @samp{-vET}.
+ +@item -b +@itemx --number-nonblank +Number all nonblank output lines, starting with 1. The line count is +unlimited. + +@item -e +Equivalent to @samp{-vE}. + +@item -E +@itemx --show-ends +Print a @samp{$} after the end of each line. + +@item --format=@var{fmt} +Force the given compression format. Valid values for @var{fmt} are +@samp{bz2}, @samp{gz}, @samp{lz} and @samp{xz}. If this option is used, +the exact file name must be given. Other names won't be tried. + +@item -n +@itemx --number +Number all output lines, starting with 1. The line count is unlimited. + +@item -q +@itemx --quiet +Quiet operation. Suppress all messages. + +@item -r +@itemx --recursive +Operate recursively on directories. + +@item -s +@itemx --squeeze-blank +Replace multiple adjacent blank lines with a single blank line. + +@item -t +Equivalent to @samp{-vT}. + +@item -T +@itemx --show-tabs +Print TAB characters as @samp{^I}. + +@item -v +@itemx --show-nonprinting +Print control characters except for LF (newline) and TAB using @samp{^} +notation and precede characters larger than 127 with @samp{M-} (which +stands for "meta"). + +@item --verbose +Verbose mode. Show error messages. + +@end table + + +@node Zcmp +@chapter Zcmp +@cindex zcmp + +Zcmp compares two files (@samp{-} means standard input), and if they +differ, tells the first byte and line number where they differ. Bytes +and lines are numbered starting with 1. If any given file is compressed, +its decompressed content is used. Compressed files are decompressed on +the fly; no temporary files are created. + +The format for running zcmp is: + +@example +zcmp [@var{options}] @var{file1} [@var{file2}] +@end example + +@noindent +This compares @var{file1} to @var{file2}. If @var{file2} is omitted zcmp +tries the following: + +@enumerate +@item +If @var{file1} is compressed, compares its decompressed contents with +the corresponding uncompressed file (the name of @var{file1} with the +extension removed). 
+@item +If @var{file1} is uncompressed, compares it with the decompressed +contents of @var{file1}.[lz|bz2|gz|xz] (the first one that is found). +@item +If no suitable file is found, compares @var{file1} with data read from +standard input. +@end enumerate + +@noindent +An exit status of 0 means no differences were found, 1 means some +differences were found, and 2 means trouble. + +Zcmp supports the following options: + +@table @samp +@item -b +@itemx --print-bytes +Print the differing bytes. Print control bytes as a @samp{^} followed by +a letter, and precede bytes larger than 127 with @samp{M-} (which stands +for "meta"). + +@item --format=[@var{fmt1}][,@var{fmt2}] +Force the given compression formats. Any of @var{fmt1} or @var{fmt2} may +be omitted and the corresponding format will be automatically detected. +Valid values for @var{fmt} are @samp{bz2}, @samp{gz}, @samp{lz} and +@samp{xz}. If at least one format is specified with this option, the +exact file names of both @var{file1} and @var{file2} must be given. +Other names won't be tried. + +@item -i @var{size} +@itemx --ignore-initial=@var{size} +Ignore any differences in the first @var{size} bytes of the input files. +Treat files with fewer than @var{size} bytes as if they were empty. If +@var{size} is in the form @samp{@var{size1},@var{size2}}, ignore the +first @var{size1} bytes of the first input file and the first +@var{size2} bytes of the second input file. + +@item -l +@itemx -v +@itemx --list +@itemx --verbose +Print the byte numbers (in decimal) and values (in octal) of all +differing bytes. + +@item -n @var{count} +@itemx --bytes=@var{count} +Compare at most @var{count} input bytes. + +@item -q +@itemx -s +@itemx --quiet +@itemx --silent +Do not print anything; only return an exit status indicating whether the +files differ. 
+ +@end table + + +@node Zdiff +@chapter Zdiff +@cindex zdiff + +Zdiff compares two files (@samp{-} means standard input), and if they +differ, shows the differences line by line. If any given file is +compressed, its decompressed content is used. Zdiff is a front end to +the diff program and has the limitation that messages from diff refer to +temporary filenames instead of those specified. + +The format for running zdiff is: + +@example +zdiff [@var{options}] @var{file1} [@var{file2}] +@end example + +@noindent +This compares @var{file1} to @var{file2}. If @var{file2} is omitted +zdiff tries the following: + +@enumerate +@item +If @var{file1} is compressed, compares its decompressed contents with +the corresponding uncompressed file (the name of @var{file1} with the +extension removed). +@item +If @var{file1} is uncompressed, compares it with the decompressed +contents of @var{file1}.[lz|bz2|gz|xz] (the first one that is found). +@item +If no suitable file is found, compares @var{file1} with data read from +standard input. +@end enumerate + +@noindent +An exit status of 0 means no differences were found, 1 means some +differences were found, and 2 means trouble. + +Zdiff supports the following options: + +@table @samp +@item -a +@itemx --text +Treat all files as text. + +@item -b +@itemx --ignore-space-change +Ignore changes in the amount of white space. + +@item -B +@itemx --ignore-blank-lines +Ignore changes whose lines are all blank. + +@item -c +Use the context output format. + +@item -C @var{n} +@itemx --context=@var{n} +Same as -c but use @var{n} lines of context. + +@item -d +@itemx --minimal +Try hard to find a smaller set of changes. + +@item -E +@itemx --ignore-tab-expansion +Ignore changes due to tab expansion. + +@item --format=[@var{fmt1}][,@var{fmt2}] +Force the given compression formats. Any of @var{fmt1} or @var{fmt2} may +be omitted and the corresponding format will be automatically detected.
+Valid values for @var{fmt} are @samp{bz2}, @samp{gz}, @samp{lz} and +@samp{xz}. If at least one format is specified with this option, the +exact file names of both @var{file1} and @var{file2} must be given. +Other names won't be tried. + +@item -i +@itemx --ignore-case +Ignore case differences in file contents. + +@item -p +@itemx --show-c-function +Show which C function each change is in. + +@item -q +@itemx --brief +Output only whether files differ. + +@item -s +@itemx --report-identical-files +Report when two files are identical. + +@item -t +@itemx --expand-tabs +Expand tabs to spaces in output. + +@item -T +@itemx --initial-tab +Make tabs line up by prepending a tab. + +@item -u +Use the unified output format. + +@item -U @var{n} +@itemx --unified=@var{n} +Same as -u but use @var{n} lines of context. + +@item -w +@itemx --ignore-all-space +Ignore all white space. + +@end table + + +@node Zgrep +@chapter Zgrep +@cindex zgrep + +Zgrep is a front end to the grep program that allows transparent search +on any combination of compressed and uncompressed files. If any given +file is compressed, its decompressed content is used. If a given file +does not exist, and its name does not end with one of the known +extensions, zgrep tries the compressed file names corresponding to the +supported formats. + +If no files are specified, data is read from standard input, +decompressed if needed, and fed to grep. Data read from standard input +must be of the same type; all uncompressed or all in the same +compression format. + +The format for running zgrep is: + +@example +zgrep [@var{options}] @var{pattern} [@var{files}] +@end example + +@noindent +An exit status of 0 means at least one match was found, 1 means no +matches were found, and 2 means trouble. + +Zgrep supports the following options: + +@table @samp +@item -a +@itemx --text +Treat all files as text. + +@item -A @var{n} +@itemx --after-context=@var{n} +Print @var{n} lines of trailing context. 
+ +@item -b +@itemx --byte-offset +Print the byte offset of each line. + +@item -B @var{n} +@itemx --before-context=@var{n} +Print @var{n} lines of leading context. + +@item -c +@itemx --count +Only print a count of matching lines per file. + +@item -C @var{n} +@itemx --context=@var{n} +Print @var{n} lines of output context. + +@item -e @var{pattern} +@itemx --regexp=@var{pattern} +Use @var{pattern} as the pattern to match. + +@item -E +@itemx --extended-regexp +Treat @var{pattern} as an extended regular expression. + +@item -f @var{file} +@itemx --file=@var{file} +Obtain patterns from @var{file}, one per line. + +@item -F +@itemx --fixed-strings +Treat @var{pattern} as a set of newline-separated strings. + +@item --format=@var{fmt} +Force the given compression format. Valid values for @var{fmt} are +@samp{bz2}, @samp{gz}, @samp{lz} and @samp{xz}. If this option is used, +the exact file name must be given. Other names won't be tried. + +@item -h +@itemx --no-filename +Suppress the prefixing of filenames on output when multiple files are +searched. + +@item -H +@itemx --with-filename +Print the filename for each match. + +@item -i +@itemx --ignore-case +Ignore case distinctions. + +@item -I +Ignore binary files. + +@item -l +@itemx --files-with-matches +Only print names of files containing at least one match. + +@item -L +@itemx --files-without-match +Only print names of files not containing any matches. + +@item -m @var{n} +@itemx --max-count=@var{n} +Stop after @var{n} matches. + +@item -n +@itemx --line-number +Prefix each matched line with its line number in the input file. + +@item -o +@itemx --only-matching +Show only the part of matching lines that actually matches @var{pattern}. + +@item -q +@itemx --quiet +Suppress all messages. Exit immediately with zero status if any match is +found, even if an error was detected. + +@item -r +@itemx --recursive +Operate recursively on directories. 
+ +@item -s +@itemx --no-messages +Suppress error messages about nonexistent or unreadable files. + +@item -v +@itemx --invert-match +Select non-matching lines. + +@item --verbose +Verbose mode. Show error messages. + +@item -w +@itemx --word-regexp +Match only whole words. + +@item -x +@itemx --line-regexp +Match only whole lines. + +@end table + + +@node Ztest +@chapter Ztest +@cindex ztest + +Ztest verifies the integrity of the specified compressed files. +Uncompressed files are ignored. If no files are specified, the integrity +of compressed data read from standard input is verified. Data read from +standard input must all be in the same compression format. + +Note that some xz files lack integrity information, and therefore can't +be verified as reliably as the other formats can. + +The format for running ztest is: + +@example +ztest [@var{options}] [@var{files}] +@end example + +@noindent +The exit status is 0 if all compressed files verify OK, 1 if +environmental problems occurred (file not found, invalid flags, I/O errors, etc.), +and 2 if any compressed file is corrupt or invalid. + +Ztest supports the following options: + +@table @samp +@item --format=@var{fmt} +Force the given compression format. Valid values for @var{fmt} are +@samp{bz2}, @samp{gz}, @samp{lz} and @samp{xz}. If this option is used, +all files not in the given format will fail. + +@item -q +@itemx --quiet +Quiet operation. Suppress all messages. + +@item -r +@itemx --recursive +Operate recursively on directories. + +@item -v +@itemx --verbose +Verbose mode. Show the verification status for each file processed.@* +Further -v's increase the verbosity level. + +@end table + + +@node Zupdate +@chapter Zupdate +@cindex zupdate + +Zupdate recompresses files from bzip2, gzip, and xz formats to lzip +format. The originals are compared with the new files and then deleted. +Only regular files with standard file name extensions are recompressed; +other files are ignored.
Compressed files are decompressed and then +recompressed on the fly; no temporary files are created. The lzip format +is chosen as destination because it is by far the most appropriate for +long-term data archiving. + +If the lzip compressed version of a file already exists, the file is +skipped unless the @samp{--force} option is given. In this case, if the +comparison with the existing lzip version fails, an error is returned +and the original file is not deleted. The operation of zupdate is meant +to be safe and not produce any data loss. Therefore, existing lzip +compressed files are never overwritten nor deleted. + +The names of the original files must have one of the following +extensions: @samp{.bz2}, @samp{.tbz}, @samp{.tbz2}, @samp{.gz}, +@samp{.tgz}, @samp{.xz}, @samp{.txz}. The files produced have the +extensions @samp{.lz} or @samp{.tar.lz}. + +The format for running zupdate is: + +@example +zupdate [@var{options}] [@var{files}] +@end example + +@noindent +Exit status is 0 if all the compressed files were successfully +recompressed (if needed), compared and deleted. Non-zero otherwise. + +Zupdate supports the following options: + +@table @samp +@item -f +@itemx --force +Do not skip a file for which a lzip compressed version already exists. +@samp{--force} compares the content of the input file with the content +of the existing lzip file and deletes the input file if both contents +are identical. + +@item -k +@itemx --keep +Keep (don't delete) the input file after comparing it with the lzip file. + +@item -l +@itemx --lzip-verbose +Pass a @samp{-v} option to the lzip compressor so that it shows the +compression ratio for each file processed. Using lzip 1.15 and newer, a +second @samp{-l} shows the progress of compression. Use it together with +@samp{-v} to see the name of the file. + +@item -q +@itemx --quiet +Quiet operation. Suppress all messages. + +@item -r +@itemx --recursive +Operate recursively on directories. 
+ +@item -v +@itemx --verbose +Verbose mode. Show the files being processed. A second @samp{-v} also +shows the files being ignored. + +@item -0 .. -9 +Set the compression level of lzip. By default zupdate passes @samp{-9} +to lzip. + +@end table + + +@node Problems +@chapter Reporting bugs +@cindex bugs +@cindex getting help + +There are probably bugs in zutils. There are certainly errors and +omissions in this manual. If you report them, they will get fixed. If +you don't, no one will ever know about them and they will remain unfixed +for all eternity, if not longer. + +If you find a bug in zutils, please send electronic mail to +@email{zutils-bug@@nongnu.org}. Include the version number, which you can +find by running @w{@samp{zupdate --version}}. + + +@node Concept index +@unnumbered Concept index + +@printindex cp + +@bye -- cgit v1.2.3