diff options
Diffstat (limited to 'doc/zutils.texi')
-rw-r--r-- | doc/zutils.texi | 1034 |
1 files changed, 1034 insertions, 0 deletions
diff --git a/doc/zutils.texi b/doc/zutils.texi new file mode 100644 index 0000000..1646e7d --- /dev/null +++ b/doc/zutils.texi @@ -0,0 +1,1034 @@ +\input texinfo @c -*-texinfo-*- +@c %**start of header +@setfilename zutils.info +@documentencoding ISO-8859-15 +@settitle Zutils Manual +@finalout +@c %**end of header + +@set UPDATED 23 January 2024 +@set VERSION 1.13 + +@dircategory Compression +@direntry +* Zutils: (zutils). Utilities dealing with compressed files +@end direntry + + +@ifnothtml +@titlepage +@title Zutils +@subtitle Utilities dealing with compressed files +@subtitle for Zutils version @value{VERSION}, @value{UPDATED} +@author by Antonio Diaz Diaz + +@page +@vskip 0pt plus 1filll +@end titlepage + +@contents +@end ifnothtml + +@ifnottex +@node Top +@top + +This manual is for Zutils (version @value{VERSION}, @value{UPDATED}). + +@menu +* Introduction:: Purpose and features of zutils +* Common options:: Options common to all utilities +* Configuration:: The configuration file zutils.conf +* Zcat:: Concatenating compressed files +* Zcmp:: Comparing compressed files byte by byte +* Zdiff:: Comparing compressed files line by line +* Zgrep:: Searching inside compressed files +* Ztest:: Testing the integrity of compressed files +* Zupdate:: Recompressing files to lzip format +* Problems:: Reporting bugs +* Concept index:: Index of concepts +@end menu + +@sp 1 +Copyright @copyright{} 2009-2024 Antonio Diaz Diaz. + +This manual is free documentation: you have unlimited permission to copy, +distribute, and modify it. +@end ifnottex + + +@node Introduction +@chapter Introduction +@cindex introduction + +@uref{http://www.nongnu.org/zutils/zutils.html,,Zutils} +is a collection of utilities able to process any combination of +compressed and uncompressed files transparently. If any file given, +including standard input, is compressed, its decompressed content is used. +Compressed files are decompressed on the fly; no temporary files are +created. Data format is detected by its identifier string (magic bytes), not +by the file name extension. Empty files are considered uncompressed. + +These utilities are not wrapper scripts but safer and more efficient C++ +programs. In particular the option @option{--recursive} is very efficient in +those utilities supporting it. + +@noindent +The utilities provided are @command{zcat}, @command{zcmp}, @command{zdiff}, +@command{zgrep}, @command{ztest}, and @command{zupdate}.@* +The formats supported are bzip2, gzip, +@uref{http://www.nongnu.org/lzip/lzip.html,,lzip}, xz, and zstd.@* +Zutils uses external compressors. The compressor to be used for each format +is configurable at runtime. + +@command{zcat}, @command{zcmp}, @command{zdiff}, and @command{zgrep} are +improved replacements for the shell scripts provided by GNU gzip. +@command{ztest} is unique to zutils. @command{zupdate} is similar to gzip's +znew. + +@anchor{search-order} +When @command{zcat}, @command{zcmp}, @command{zdiff}, or @command{zgrep} +need to try compressed file names, the search order is: lzip, gzip, bzip2, +zstd, xz. (@var{file}.[lz|gz|bz2|zst|xz]). + +NOTE: Bzip2 and lzip provide well-defined values of exit status, which makes +them safe to use with zutils. Gzip and xz may return ambiguous warning +values, making them less reliable back ends for zutils. Zstd currently does +not even document its exit status in its man page. +@xref{compressor-requirements}. + +FORMAT NOTE 1: The option @option{--format} allows the processing of a subset +of formats in recursive mode and when trying compressed file names. For +example, use the following command to search for the string @samp{foo} in +gzip and lzip files only: +@w{@samp{zgrep foo -r --format=gz,lz somedir somefile.tar}}. + +FORMAT NOTE 2: The standard POSIX compress format (.Z) is obsolete and is +only supported through gzip. For this to work, the gzip program used (for +example GNU gzip) must be able to decompress .Z files. + +LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never have +been compressed. Decompressed is used to refer to data which have undergone +the process of decompression. + + +@node Common options +@chapter Common options +@cindex common options + +The following +@uref{http://www.nongnu.org/arg-parser/manual/arg_parser_manual.html#Argument-syntax,,options}: +are available in all the utilities. Rather than writing identical +descriptions for each of the programs, they are described here. Remember to +prepend @file{./} to any file name beginning with a hyphen, or use @samp{--}. +@ifnothtml +@xref{Argument syntax,,,arg_parser}. +@end ifnothtml + +@table @code +@item -h +@itemx --help +Print an informative help message describing the options and exit. +@command{zgrep} only supports the @option{--help} form of this option. + +@anchor{version} +@item -V +@itemx --version +Print the version number on the standard output and exit. +This version number should be included in all bug reports. +In verbose mode, @command{zdiff} and @command{zgrep} print also the version +of the diff or grep program used respectively. At verbosity level 1 (2 for +@command{zdiff} and @command{zgrep}) or higher, print also the versions of +the compressors used (perhaps limited by option @option{--format}). (The +compressors used must support the option @option{-V} for this to work). + +@item -M @var{format_list} +@itemx --format=@var{format_list} +Process only the formats listed in the comma-separated @var{format_list}. +Valid formats are @samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, @samp{zst}, +and @samp{un} for @samp{uncompressed}, meaning "any file name without a +known extension". This option excludes files based on extension, instead of +format, because it is more efficient. The exclusion only applies to names +generated automatically (for example when adding extensions to a file name +or when operating recursively on directories). Files given in the command +line are always processed. + +Each format in @var{format_list} enables file names with the following +extensions: + +@multitable {bz2} {enables} {any other file name} +@item bz2 @tab enables @tab .bz2 .tbz .tbz2 +@item gz @tab enables @tab .gz .tgz .Z +@item lz @tab enables @tab .lz .tlz +@item xz @tab enables @tab .xz .txz +@item zst @tab enables @tab .zst .tzst +@item un @tab enables @tab any other file name +@end multitable + +@item -N +@itemx --no-rcfile +Don't read the runtime configuration file @file{zutils.conf}. + +@item --bz2=@var{command} +@itemx --gz=@var{command} +@itemx --lz=@var{command} +@itemx --xz=@var{command} +@itemx --zst=@var{command} +Set program to be used as decompressor for the corresponding format. +@var{command} may include arguments. For example +@w{@option{--lz='plzip --threads=2'}}. @command{zupdate} uses @option{--lz} +for compression, not for decompression (@pxref{lz-compressor}). The name of +the program can't begin with @samp{-}. These options override the values set +in @file{zutils.conf}. The compression program used must meet three +requirements: + +@anchor{compressor-requirements} +@enumerate +@item +When called with the option @option{-d} and without file names, it must read +compressed data from the standard input and produce decompressed data on the +standard output. +@item +If the option @option{-q} is passed to zutils, the compression program must +also accept it. +@item +It must return 0 if no errors occurred, and a non-zero value otherwise. +@end enumerate + +@end table + +Numbers given as arguments to options may be expressed in decimal, +hexadecimal, or octal (using the same syntax as integer constants in C++), +and may be followed by a multiplier and an optional @samp{B} for "byte". + +Table of SI and binary prefixes (unit multipliers): + +@multitable {Prefix} {kilobyte (10^3 = 1000)} {|} {Prefix} {kibibyte (2^10 = 1024)} +@item Prefix @tab Value @tab | @tab Prefix @tab Value +@item k @tab kilobyte (10^3 = 1000) @tab | @tab Ki @tab kibibyte (2^10 = 1024) +@item M @tab megabyte (10^6) @tab | @tab Mi @tab mebibyte (2^20) +@item G @tab gigabyte (10^9) @tab | @tab Gi @tab gibibyte (2^30) +@item T @tab terabyte (10^12) @tab | @tab Ti @tab tebibyte (2^40) +@item P @tab petabyte (10^15) @tab | @tab Pi @tab pebibyte (2^50) +@item E @tab exabyte (10^18) @tab | @tab Ei @tab exbibyte (2^60) +@item Z @tab zettabyte (10^21) @tab | @tab Zi @tab zebibyte (2^70) +@item Y @tab yottabyte (10^24) @tab | @tab Yi @tab yobibyte (2^80) +@item R @tab ronnabyte (10^27) @tab | @tab Ri @tab robibyte (2^90) +@item Q @tab quettabyte (10^30) @tab | @tab Qi @tab quebibyte (2^100) +@end multitable + + +@node Configuration +@chapter The configuration file 'zutils.conf' +@cindex zutils.conf + +@file{zutils.conf} is the runtime configuration file for zutils. In it you +may define the compressor name and options to be used for each format. +@file{zutils.conf} is optional; you don't need to install it in order to run +zutils. + +The compressors specified in the command line override those specified +in @file{zutils.conf}. + +You may copy the system @file{zutils.conf} file @file{$@{sysconfdir@}/zutils.conf} +to @file{$XDG_CONFIG_HOME/zutils.conf} and customize these options as you like. +(@env{XDG_CONFIG_HOME} defaults to @file{$HOME/.config}). The file syntax is +fairly obvious (and there are further instructions in it): + +@enumerate +@item +Any line beginning with @samp{#} is a comment line. +@item +Each non-comment line defines the command to be used for the corresponding +format, with the syntax: +@example +<format> = <compressor> [options] +@end example +where <format> is one of @samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, or +@samp{zst}. +@end enumerate + + +@node Zcat +@chapter Zcat +@cindex zcat + +@command{zcat} copies each @var{file} argument to standard output in +sequence. If any file given is compressed, its decompressed content is +copied. If a file given does not exist, and its name does not end with one +of the known extensions, @command{zcat} tries the compressed file names +corresponding to the formats supported until one is found. +@xref{search-order}. If a file fails to decompress, @command{zcat} continues +copying the rest of the files. + +If a file is specified as @samp{-}, data are read from standard input, +decompressed if needed, and sent to standard output. Data read from +standard input must be of the same type; all uncompressed or all in the +same compressed format. + +If no files are specified, recursive searches examine the current working +directory, and nonrecursive searches read standard input. + +The format for running @command{zcat} is: + +@example +zcat [@var{options}] [@var{files}] +@end example + +@noindent +Exit status is 0 if no errors occurred, 1 otherwise. + +@command{zcat} supports the following options: + +@table @code +@item -A +@itemx --show-all +Equivalent to @option{-vET}. + +@item -b +@itemx --number-nonblank +Number all nonblank output lines, starting with 1. The line count is +unlimited. + +@item -e +Equivalent to @option{-vE}. + +@item -E +@itemx --show-ends +Print a @samp{$} after the end of each line. + +@item -n +@itemx --number +Number all output lines, starting with 1. The line count is unlimited. + +@item -O @var{format} +@itemx --force-format=@var{format} +Force the compressed format given. Valid values for @var{format} are +@samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, @samp{zst}, and @samp{un} for +@samp{uncompressed}. If this option is used, the files are passed to the +corresponding decompressor (or transmitted unmodified) without checking +their format, and the exact file name must be given. Other names are not +tried. + +@item -q +@itemx --quiet +Quiet operation. Suppress all messages. + +@item -r +@itemx --recursive +For each directory operand, read and process all files in that directory, +recursively. Follow symbolic links given in the command line, but skip +symbolic links that are encountered recursively. + +@item -R +@itemx --dereference-recursive +For each directory operand, read and process all files in that directory, +recursively, following all symbolic links. + +@item -s +@itemx --squeeze-blank +Replace multiple adjacent blank lines with a single blank line. + +@item -t +Equivalent to @option{-vT}. + +@item -T +@itemx --show-tabs +Print TAB characters as @samp{^I}. + +@item -v +@itemx --show-nonprinting +Print control characters except for LF (newline) and TAB using @samp{^} +notation and precede characters larger than 127 with @samp{M-} (which +stands for "meta"). + +@item --verbose +Verbose mode. Show error messages. Repeating it increases the verbosity +level. @xref{version}. + +@end table + + +@node Zcmp +@chapter Zcmp +@cindex zcmp + +@command{zcmp} compares two files and, if they differ, writes to standard +output the first byte and line number where they differ. Bytes and lines are +numbered starting with 1. A hyphen @samp{-} used as a @var{file} argument +means standard input. If any file given is compressed, its decompressed +content is used. Compressed files are decompressed on the fly; no temporary +files are created. + +The format for running @command{zcmp} is: + +@example +zcmp [@var{options}] @var{file1} [@var{file2}] +@end example + +@noindent +This compares @var{file1} to @var{file2}. The standard input is used only if +@var{file1} or @var{file2} refers to standard input. If @var{file2} is +omitted @command{zcmp} tries to compare @var{file1} with the corresponding +uncompressed file (if @var{file1} is compressed), and then with the +corresponding compressed files of the remaining formats until one is found. +@xref{search-order}. + +@noindent +An exit status of 0 means no differences were found, 1 means some +differences were found, and 2 means trouble. + +@command{zcmp} supports the following options: + +@table @code +@item -b +@itemx --print-bytes +Print the values of the differing bytes (in octal by default) followed by +the bytes themselves in printable form. Print control bytes as a @samp{^} +followed by a letter, and precede bytes larger than 127 with @samp{M-} +(which stands for "meta"). + +@item -H +@itemx --hexadecimal +Print the values of the differing bytes in hexadecimal instead of octal. + +@item -i @var{size} +@itemx --ignore-initial=@var{size} +Ignore any differences in the first @var{size} bytes of the input files. +Treat files with fewer than @var{size} bytes as if they were empty. If +@var{size} is in the form @samp{@var{size1}:@var{size2}}, ignore the +first @var{size1} bytes of the first input file and the first +@var{size2} bytes of the second input file. + +@item -l +@itemx --list +Print the byte numbers (in decimal) and values (in octal by default) of all +differing bytes. Bytes are numbered starting with 1. + +@item -n @var{count} +@itemx --bytes=@var{count} +Compare at most @var{count} input bytes. + +@item -O [@var{format1}][,@var{format2}] +@itemx --force-format=[@var{format1}][,@var{format2}] +Force the compressed formats given. If @var{format1} or @var{format2} is +omitted, the corresponding format is automatically detected. Valid values +for @var{format} are @samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, +@samp{zst}, and @samp{un} for @samp{uncompressed}. If at least one format is +specified with this option, the file is passed to the corresponding +decompressor (or transmitted unmodified) without checking its format, and +the exact file names of both @var{file1} and @var{file2} must be given. +Other names are not tried. + +@item -q +@itemx --quiet +@itemx --silent +Suppress diagnostics written to standard error, even the +@w{@samp{EOF on <name_of_shorter_file>}} diagnostic. Byte differences are +still written to standard output. (@option{-q} produces no output except +byte differences). + +@item -s +@itemx --script +Write nothing to standard output or standard error when files differ, not +even the @w{@samp{EOF on <name_of_shorter_file>}} diagnostic; indicate +differing files through exit status only. Diagnostic messages are still +written to standard error when an error is encountered. (@option{-s} +produces no output except error messages). + +@item -v +@itemx --verbose +Verbose mode. Undoes the effect of @option{--quiet}. Further -v's increase +the verbosity level. @xref{version}. + +@end table + + +@node Zdiff +@chapter Zdiff +@cindex zdiff + +@command{zdiff} compares two files and, if they differ, writes to standard +output the differences line by line. A hyphen @samp{-} used as a @var{file} +argument means standard input. If any file given is compressed, its +decompressed content is used. @command{zdiff} is a front end to the program +diff and has the limitation that messages from diff refer to temporary file +names instead of those specified. + +The format for running @command{zdiff} is: + +@example +zdiff [@var{options}] @var{file1} [@var{file2}] +@end example + +@noindent +This compares @var{file1} to @var{file2}. The standard input is used only if +@var{file1} or @var{file2} refers to standard input. If @var{file2} is +omitted @command{zdiff} tries to compare @var{file1} with the corresponding +uncompressed file (if @var{file1} is compressed), and then with the +corresponding compressed files of the remaining formats until one is found. +@xref{search-order}. + +@noindent +An exit status of 0 means no differences were found, 1 means some +differences were found, and 2 means trouble. + +@command{zdiff} supports the following options (some options only work if +the diff program used supports them): + +@table @code +@item -a +@itemx --text +Treat all files as text. + +@item -b +@itemx --ignore-space-change +Ignore changes in the amount of white space. + +@item -B +@itemx --ignore-blank-lines +Ignore changes whose lines are all blank. + +@item -c +Use the context output format. + +@item -C @var{n} +@itemx --context=@var{n} +Same as -c but use @var{n} lines of context. + +@item -d +@itemx --minimal +Try hard to find a smaller set of changes. + +@item -E +@itemx --ignore-tab-expansion +Ignore changes due to tab expansion. + +@item -i +@itemx --ignore-case +Ignore case differences. Consider uppercase and lowercase letters equivalent. + +@item -O [@var{format1}][,@var{format2}] +@itemx --force-format=[@var{format1}][,@var{format2}] +Force the compressed formats given. If @var{format1} or @var{format2} is +omitted, the corresponding format is automatically detected. Valid values +for @var{format} are @samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, +@samp{zst}, and @samp{un} for @samp{uncompressed}. If at least one format is +specified with this option, the file is passed to the corresponding +decompressor (or transmitted unmodified) without checking its format, and +the exact file names of both @var{file1} and @var{file2} must be given. +Other names are not tried. + +@item -p +@itemx --show-c-function +Show which C function each change is in. + +@item -q +@itemx --brief +Output only whether files differ. + +@item -s +@itemx --report-identical-files +Report when two files are identical. + +@item -t +@itemx --expand-tabs +Expand tabs to spaces in output. + +@item -T +@itemx --initial-tab +Make tabs line up by prepending a tab. + +@item -u +Use the unified output format. + +@item -U @var{n} +@itemx --unified=@var{n} +Same as -u but use @var{n} lines of context. + +@item -v +@itemx --verbose +When specified before @option{--version}, print the version of the diff +program used. Further -v's increase the verbosity level. @xref{version}. + +@item -w +@itemx --ignore-all-space +Ignore all white space. + +@item -W @var{columns} +@itemx --width=@var{columns} +Output at most the specified number of print columns per line in side by +side format. + +@item -y +@itemx --side-by-side +Use the side by side output format. + +@end table + + +@node Zgrep +@chapter Zgrep +@cindex zgrep + +@command{zgrep} is a front end to the program grep that allows transparent +search on any combination of compressed and uncompressed files. If any file +given is compressed, its decompressed content is used. If a file given does +not exist, and its name does not end with one of the known extensions, +@command{zgrep} tries the compressed file names corresponding to the formats +supported until one is found. @xref{search-order}. If a file fails to +decompress, @command{zgrep} continues searching the rest of the files. + +If a file is specified as @samp{-}, data are read from standard input, +decompressed if needed, and fed to grep. Data read from standard input must +be of the same type; all uncompressed or all in the same compressed format. + +If no files are specified, recursive searches examine the current working +directory, and nonrecursive searches read standard input. + +For efficiency reasons, @command{zgrep} does not always read all its input. +For example, the shell command @w{@samp{base64 -d foo | zgrep -q X}} can +cause @command{zgrep} to exit immediately after reading a line containing +@samp{X}, without bothering to read the rest of its input data. This in turn +can cause base64 to exit with a nonzero status because base64 cannot write +to its output pipe after @command{zgrep} exits. + +The format for running @command{zgrep} is: + +@example +zgrep [@var{options}] @var{pattern} [@var{files}] +@end example + +@noindent +An exit status of 0 means at least one match was found, 1 means no +matches were found, and 2 means trouble. + +@command{zgrep} supports the following options (Some options only work if +the grep program used supports them. Options -h, -H, -r, -R, and -Z are +managed by @command{zgrep} and not passed to grep): + +@table @code +@item -a +@itemx --text +Treat all files as text. + +@item -A @var{n} +@itemx --after-context=@var{n} +Print @var{n} lines of trailing context. + +@item -b +@itemx --byte-offset +Print the byte offset of each line. + +@item -B @var{n} +@itemx --before-context=@var{n} +Print @var{n} lines of leading context. + +@item -c +@itemx --count +Only print a count of matching lines per file. + +@item -C @var{n} +@itemx --context=@var{n} +Print @var{n} lines of output context. + +@item --color[=@var{when}] +Show matched strings in color. @var{when} is @samp{never}, @samp{always}, +or @samp{auto}. + +@item -e @var{pattern} +@itemx --regexp=@var{pattern} +Use @var{pattern} as the pattern to match. + +@item -E +@itemx --extended-regexp +Interpret @var{pattern} as an extended regular expression (ERE). + +@item -f @var{file} +@itemx --file=@var{file} +Obtain patterns from @var{file}, one per line.@* +When searching in several files at once, command substitution can be used +with @option{-e} to read @var{file} only once, for example if @var{file} is +not a regular file: +@w{@samp{zgrep -e "$(cat @var{file})" file1.lz file2.gz}} + +@item -F +@itemx --fixed-strings +Interpret @var{pattern} as a set of newline-separated strings. + +@item -G +@itemx --basic-regexp +Interpret @var{pattern} as a basic regular expression (BRE). This is the +default. + +@item -h +@itemx --no-filename +Suppress the prefixing of file names on output when multiple files are +searched. + +@item -H +@itemx --with-filename +Print the file name for each match. + +@item -i +@itemx --ignore-case +Ignore case distinctions. + +@item -I +Ignore binary files. + +@item -l +@itemx --files-with-matches +Only print names of files containing at least one match. Stop reading each +file on the first match. + +@item -L +@itemx --files-without-match +Only print names of files not containing any matches. Stop reading each file +on the first match.@* +Note: option -L fails (prints wrong results, returns wrong status, and even +hangs) when using GNU grep versions 3.2 to 3.4 inclusive because of a wrong +change in the exit status of grep, which was reverted in GNU grep 3.5. + +@item --label=@var{label} +Display input actually coming from standard input as input coming from file +@var{label}. + +@item --line-buffered +Use line buffering on output. This may cause a performance penalty. + +@item -m @var{n} +@itemx --max-count=@var{n} +Stop after @var{n} matches. + +@item -n +@itemx --line-number +Prefix each matched line with its line number in the input file. + +@item -o +@itemx --only-matching +Show only the part of matching lines that actually matches @var{pattern}. + +@item -O @var{format} +@itemx --force-format=@var{format} +Force the compressed format given. Valid values for @var{format} are +@samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, @samp{zst}, and @samp{un} for +@samp{uncompressed}. If this option is used, the files are passed to the +corresponding decompressor (or transmitted unmodified) without checking +their format, and the exact file name must be given. Other names are not +tried. + +@item -P +@itemx --perl-regexp +Interpret @var{pattern} as a Perl-compatible regular expression (PCRE). + +@item -q +@itemx --quiet +@itemx --silent +Suppress all messages. Exit immediately with zero status if any match is +found, even if an error was detected. + +@item -r +@itemx --recursive +For each directory operand, read and process all files in that directory, +recursively. Follow symbolic links given in the command line, but skip +symbolic links that are encountered recursively. + +@item -R +@itemx --dereference-recursive +For each directory operand, read and process all files in that directory, +recursively, following all symbolic links. + +@item -s +@itemx --no-messages +Suppress error messages about nonexistent or unreadable files. + +@item -T +@itemx --initial-tab +Make sure that the first character of actual line content lies on a tab +stop, so that the alignment of tabs looks normal. + +@item -U +@itemx --binary +Use binary I/O on platforms affected by the bug known as "text mode I/O". +(MS-DOS, MS-Windows, OS/2). + +@item -v +@itemx --invert-match +Select non-matching lines. + +@item --verbose +Verbose mode. Show error messages. When specified before @option{--version}, +print the version of the grep program used. Repeating it increases the +verbosity level. @xref{version}. + +@item -w +@itemx --word-regexp +Match only whole words. + +@item -x +@itemx --line-regexp +Match only whole lines. + +@item -Z +@itemx --null +Output a zero byte (the ASCII NUL character) instead of the character that +normally follows a file name. For example, @w{@samp{zgrep -lZ}} outputs a +zero byte after each file name instead of the usual newline. This option +makes the output unambiguous, even in the presence of file names containing +unusual characters like newlines. + +@end table + + +@node Ztest +@chapter Ztest +@cindex ztest + +@command{ztest} checks the integrity of the compressed files specified. It +also warns if an uncompressed file has a compressed file name extension, or +if a compressed file has a wrong compressed extension. Uncompressed files +are otherwise ignored. If a file is specified as @samp{-}, the integrity of +compressed data read from standard input is checked. Data read from +standard input must be all in the same compressed format. If a file fails to +decompress, does not exist, can't be opened, or is a terminal, @command{ztest} +continues testing the rest of the files. A final diagnostic is shown at +verbosity level 1 or higher if any file fails the test when testing multiple +files. + +If no files are specified, recursive searches examine the current working +directory, and nonrecursive searches read standard input. + +Bzip2, gzip, and lzip are the primary formats. Xz and zstd are optional. If +the decompressor for the xz or zstd formats is not found, the corresponding +files are ignored. + +Note that error detection in the xz format is broken. First, some xz files +lack integrity information. Second, not all xz decompressors can +@uref{http://www.nongnu.org/lzip/xz_inadequate.html#fragmented,,check the integrity} +of all xz files. Third, section 2.1.1.2 'Stream Flags' of the +@uref{http://tukaani.org/xz/xz-file-format.txt,,xz format specification} +allows xz decompressors to produce garbage output without issuing any +warning. Therefore, xz files can't always be checked as reliably as files in +the other formats can. +@c We can only hope that xz is soon abandoned. + +The format for running @command{ztest} is: + +@example +ztest [@var{options}] [@var{files}] +@end example + +@noindent +Exit status is 0 if all compressed files check OK, 1 if environmental +problems (file not found, invalid command-line options, I/O errors, etc), +2 if any compressed file is corrupt or invalid, or if any file has an +incorrect file name extension. + +@command{ztest} supports the following options: + +@table @code +@item -O @var{format} +@itemx --force-format=@var{format} +Force the compressed format given. Valid values for @var{format} are +@samp{bz2}, @samp{gz}, @samp{lz}, @samp{xz}, and @samp{zst}. If this option +is used, the files are passed to the corresponding decompressor without +checking their format, and any files in a format that the decompressor can't +understand fail the test. + +@item -q +@itemx --quiet +Quiet operation. Suppress all messages. + +@item -r +@itemx --recursive +For each directory operand, read and process all files in that directory, +recursively. Follow symbolic links given in the command line, but skip +symbolic links that are encountered recursively. + +@item -R +@itemx --dereference-recursive +For each directory operand, read and process all files in that directory, +recursively, following all symbolic links. + +@item -v +@itemx --verbose +Verbose mode. Show the check status for each file processed. Further -v's +increase the verbosity level. @xref{version}. + +@end table + + +@node Zupdate +@chapter Zupdate +@cindex zupdate + +@command{zupdate} recompresses files from bzip2, gzip, xz, and zstd formats +to lzip format. Each original is compared with the new file and then +deleted. Only regular files with standard file name extensions are +recompressed, other files are ignored. Compressed files are decompressed and +then recompressed on the fly; no temporary files are created. If an error +happens while recompressing a file, @command{zupdate} exits immediately +without recompressing the rest of the files. The lzip format is chosen as +destination because it is the most appropriate for long-term archiving. + +If no files are specified, recursive searches examine the current working +directory, and nonrecursive searches do nothing. + +If the lzip-compressed version of a file already exists, the file is skipped +unless the option @option{--force} is given. In this case, if the comparison +with the existing lzip version fails, an error is returned and the original +file is not deleted. The operation of @command{zupdate} is meant to be safe +and not cause any data loss. Therefore, existing lzip-compressed files are +never overwritten nor deleted. + +Combining the options @option{--force} and @option{--keep}, as in +@w{@samp{zupdate -f -k *.gz}}, checks that there are no differences between +each pair of files in a multiformat set of files. + +The names of the original files must have one of the following extensions:@* +@samp{.bz2}, @samp{.gz}, @samp{.xz}, @samp{.zst}, or @samp{.Z}, which are +recompressed to @samp{.lz};@* +@samp{.tbz}, @samp{.tbz2}, @samp{.tgz}, @samp{.txz}, or @samp{.tzst}, which +are recompressed to @samp{.tlz}.@* +Keeping the combined extensions @w{(@samp{.tgz} ---> @samp{.tlz})} may be +useful when recompressing Slackware packages, for example. + +Bzip2, gzip, and lzip are the primary formats. Xz and zstd are optional. If +the decompressor for the xz or zstd formats is not found, the corresponding +files are ignored. + +Recompressing a file is much like copying or moving it. Therefore +@command{zupdate} preserves the access and modification dates, permissions, +and, if you have appropriate privileges, ownership of the file just as +@w{@samp{cp -p}} does. (If the user ID or the group ID can't be duplicated, +the file permission bits S_ISUID and S_ISGID are cleared). + +The format for running @command{zupdate} is: + +@example +zupdate [@var{options}] [@var{files}] +@end example + +@noindent +Exit status is 0 if all the compressed files were successfully recompressed +(if needed), compared, and deleted (if requested). 1 if a non-fatal error +occurred (file not found or not regular, or has invalid format, or can't be +deleted). 2 if a fatal error occurred (invalid command-line options, +compressor can't be run, or comparison fails). + +@command{zupdate} supports the following options: + +@table @code +@item -d @var{dir} +@itemx --destdir=@var{dir} +Write recompressed files to another directory, using @var{dir} as base +directory, instead of writing them in the same directory as the original +files. In recursive mode, this is done by replacing each directory specified +in the command line with @var{dir} to produce the recompressed file names. +For example, @w{@samp{zupdate -r -d @var{dir} ../a}} recompresses a file +named @file{../a/b/c.gz} to @file{@var{dir}/b/c.lz}. Regular files specified +in the command line are recompressed directly into @var{dir}. For example, +@w{@samp{zupdate -d @var{dir} ../a/b/c.gz}} writes the recompressed file to +@file{@var{dir}/c.lz}. + +This option allows recompressing files from a read-only file system to +another place without the need to copy or link them to the destination +directory first. (Remember to use option @option{--keep} when recompressing +read-only files to avoid warnings about files that can't be deleted). + +@item -e +@itemx --expand-extensions +Expand combined file name extensions; recompress @samp{.tbz}, @samp{.tbz2}, +@samp{.tgz}, @samp{.txz}, and @samp{.tzst} to @samp{tar.lz}. + +@item -f +@itemx --force +Don't skip a file for which a lzip-compressed version already exists. +@option{--force} compares the content of the input file with the content of +the existing lzip file and deletes the input file if both contents are +identical. + +@item -i +@itemx --ignore-errors +Ignore non-fatal errors. (See exit status above). + +@item -k +@itemx --keep +Keep (don't delete) the input file after comparing it with the lzip file. +Use it when recompressing files from a read-only file system. (See option +@option{--destdir} above). + +@item -l +@itemx --lzip-verbose +Pass one option @option{-v} to the lzip compressor so that it shows the +compression ratio for each file processed. Using lzip 1.15 or newer, a +second @option{-l} shows the progress of compression. Use it together with +@option{-v} to see the name of the file. + +@item -q +@itemx --quiet +Quiet operation. Suppress all messages. + +@item -r +@itemx --recursive +For each directory operand, read and process all files in that directory, +recursively. Follow symbolic links given in the command line, but skip +symbolic links that are encountered recursively. + +@item -R +@itemx --dereference-recursive +For each directory operand, read and process all files in that directory, +recursively, following all symbolic links. + +@item -v +@itemx --verbose +Verbose mode. Show the files being processed. A second @option{-v} also shows +the files being ignored and increases the verbosity level. @xref{version}. + +@item -0 .. -9 +Set the compression level of lzip. By default @command{zupdate} passes +@option{-9} to lzip. Custom compression options can be passed to lzip with +the option @option{--lz}. For example @w{@option{--lz='lzip -9 -s64MiB'}}. + +@anchor{lz-compressor} +@item --lz=@var{command} +Set compression command. @var{command} may include arguments. For example +@w{@option{--lz='plzip --threads=2'}}. The name of the program can't begin +with @samp{-}. This option overrides the value set in @file{zutils.conf}. +The compression program used does not need to implement decompression +(@pxref{compressor-requirements}), but it must implement at least the +compression level option @option{-9} and the option @w{@option{-o @var{file}}} +to write the compressed output to @var{file}. +@uref{http://www.nongnu.org/lzip/manual/tarlz_manual.html,,tarlz} meets +these requirements, and therefore can be used to recompress POSIX tar +archives by using a command like +@w{@samp{zupdate --lz='tarlz -9 -z --no-solid' archive.tar.gz}}. +@ifnothtml +@xref{Top,tarlz manual,,tarlz}. +@end ifnothtml + +@end table + + +@node Problems +@chapter Reporting bugs +@cindex bugs +@cindex getting help + +There are probably bugs in zutils. There are certainly errors and +omissions in this manual. If you report them, they will get fixed. If +you don't, no one will ever know about them and they will remain unfixed +for all eternity, if not longer. + +If you find a bug in zutils, please send electronic mail to +@email{zutils-bug@@nongnu.org}. Include the version number, which you can +find by running @w{@samp{zupdate --version}}. + + +@node Concept index +@unnumbered Concept index + +@printindex cp + +@bye |