Diffstat (limited to 'doc/zutils.texi')
-rw-r--r-- | doc/zutils.texi | 334 |
1 files changed, 171 insertions, 163 deletions
diff --git a/doc/zutils.texi b/doc/zutils.texi
index 789643a..bb1c3b1 100644
--- a/doc/zutils.texi
+++ b/doc/zutils.texi
@@ -6,8 +6,8 @@
 @finalout
 @c %**end of header

-@set UPDATED 1 January 2019
-@set VERSION 1.8
+@set UPDATED 27 June 2020
+@set VERSION 1.9

 @dircategory Data Compression
 @direntry
@@ -49,53 +49,54 @@ This manual is for Zutils (version @value{VERSION}, @value{UPDATED}).
 @end menu

 @sp 1
-Copyright @copyright{} 2009-2019 Antonio Diaz Diaz.
+Copyright @copyright{} 2009-2020 Antonio Diaz Diaz.

-This manual is free documentation: you have unlimited permission
-to copy, distribute and modify it.
+This manual is free documentation: you have unlimited permission to copy,
+distribute, and modify it.


 @node Introduction
 @chapter Introduction
 @cindex introduction

-Zutils is a collection of utilities able to process any combination of
-compressed and uncompressed files transparently. If any given file,
-including standard input, is compressed, its decompressed content is
-used. Compressed files are decompressed on the fly; no temporary files
-are created.
+@uref{http://www.nongnu.org/zutils/zutils.html,,Zutils}
+is a collection of utilities able to process any combination of
+compressed and uncompressed files transparently. If any file given,
+including standard input, is compressed, its decompressed content is used.
+Compressed files are decompressed on the fly; no temporary files are
+created.

 These utilities are not wrapper scripts but safer and more efficient C++
-programs. In particular the @samp{--recursive} option is very efficient
-in those utilities supporting it.
+programs. In particular the option @samp{--recursive} is very efficient in
+those utilities supporting it.

 @noindent
-The utilities provided are zcat, zcmp, zdiff, zgrep, ztest and zupdate.@*
-The formats supported are bzip2, gzip, lzip and xz.@*
-Zutils uses external compressors. The compressor to be used for each
-format is configurable at runtime.
-
-zcat, zcmp, zdiff, and zgrep are improved replacements for the shell
-scripts provided by GNU gzip. ztest is unique to zutils. zupdate is
-similar to gzip's znew.
-
-NOTE: Bzip2 and lzip provide well-defined values of exit status, which
-makes them safe to use with zutils. Gzip and xz may return ambiguous
-warning values, making them less reliable back ends for zutils.
+The utilities provided are zcat, zcmp, zdiff, zgrep, ztest, and zupdate.@*
+The formats supported are bzip2, gzip, lzip, and xz.@*
+Zutils uses external compressors. The compressor to be used for each format
+is configurable at runtime.
+
+zcat, zcmp, zdiff, and zgrep are improved replacements for the shell scripts
+provided by GNU gzip. ztest is unique to zutils. zupdate is similar to
+gzip's znew.
+
+NOTE: Bzip2 and lzip provide well-defined values of exit status, which makes
+them safe to use with zutils. Gzip and xz may return ambiguous warning
+values, making them less reliable back ends for zutils.
 @xref{compressor-requirements}.

-FORMAT NOTE 1: The @samp{--format} option allows the processing of a
-subset of formats in recursive mode and when trying compressed file
-names: @w{@samp{zgrep foo -r --format=bz2,lz somedir somefile.tar}}.
+FORMAT NOTE 1: The option @samp{--format} allows the processing of a subset
+of formats in recursive mode and when trying compressed file names:
+@w{@samp{zgrep foo -r --format=bz2,lz somedir somefile.tar}}.

-FORMAT NOTE 2: If the @samp{--force-format} option is given, the files
-are passed to the corresponding decompressor without verifying their
-format, allowing for example the processing of compress'd (.Z) files
-with gzip: @w{@samp{zcmp --force-format=gz file.Z file.lz}}.
+FORMAT NOTE 2: If the option @samp{--force-format} is given, the files are
+passed to the corresponding decompressor without verifying their format,
+allowing for example the processing of compress'd (.Z) files with gzip:
+@w{@samp{zcmp --force-format=gz file.Z file.lz}}.

-LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never
-have been compressed. Decompressed is used to refer to data which have
-undergone the process of decompression.
+LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never have
+been compressed. Decompressed is used to refer to data which have undergone
+the process of decompression.

 @sp 1
 Numbers given as arguments to options (positions, sizes) may be followed
@@ -120,9 +121,13 @@ Table of SI and binary prefixes (unit multipliers):
 @chapter Common options
 @cindex common options

-The following options are available in all the utilities. Rather than
-writing identical descriptions for each of the programs, they are
-described here.
+The following
+@uref{http://www.nongnu.org/arg-parser/manual/arg_parser_manual.html#Argument-syntax,,options}:
+are available in all the utilities. Rather than writing identical
+descriptions for each of the programs, they are described here.
+@ifnothtml
+@xref{Argument syntax,,,arg_parser}.
+@end ifnothtml

 @table @code
 @item -h
@@ -139,7 +144,7 @@ This version number should be included in all bug reports.
 @itemx --format=@var{format_list}
 Process only the formats listed in the comma-separated
 @var{format_list}. Valid formats are @samp{bz2}, @samp{gz}, @samp{lz},
-@samp{xz} and @samp{un} for @samp{uncompressed}, meaning "any file name
+@samp{xz}, and @samp{un} for @samp{uncompressed}, meaning "any file name
 without a known extension". This option excludes files based on
 extension, instead of format, because it is more efficient. The
 exclusion only applies to names generated automatically (for example
@@ -165,19 +170,22 @@ Don't read the runtime configuration file @samp{zutilsrc}.
 @itemx --gz=@var{command}
 @itemx --lz=@var{command}
 @itemx --xz=@var{command}
-Set program (may include arguments) to be used as (de)compressor for the
-given format. The name of the program can't begin with @samp{-}. These
-options override the values set in @file{zutilsrc}. The compression
-program used must meet three requirements:
+Set program to be used as (de)compressor for the corresponding format.
+@var{command} may include arguments. For example
+@w{@samp{--lz='plzip --threads=2'}}. The program set with @samp{--lz} is
+used for both compression and decompression. The other three are used only
+for decompression. The name of the program can't begin with @samp{-}. These
+options override the values set in @file{zutilsrc}. The compression program
+used must meet three requirements:

 @anchor{compressor-requirements}
 @enumerate
 @item
-When called with the @samp{-d} option, it must read compressed data from
+When called with the option @samp{-d}, it must read compressed data from
 the standard input and produce decompressed data on the standard output.
 @item
-If the @samp{-q} option is passed to zutils, the compression program
-must also accept it.
+If the option @samp{-q} is passed to zutils, the compression program must
+also accept it.
 @item
 It must return 0 if no errors occurred, and a non-zero value otherwise.
 @end enumerate
@@ -206,12 +214,12 @@ are further instructions in it):
 @item
 Any line beginning with @samp{#} is a comment line.
 @item
-Each non-comment line defines the command to be used for the given
+Each non-comment line defines the command to be used for the corresponding
 format, with the syntax:
 @example
 <format> = <compressor> [options]
 @end example
-where <format> is one of @samp{bz2}, @samp{gz}, @samp{lz} or @samp{xz}.
+where <format> is one of @samp{bz2}, @samp{gz}, @samp{lz}, or @samp{xz}.
 @end enumerate


@@ -219,20 +227,20 @@ where <format> is one of @samp{bz2}, @samp{gz}, @samp{lz} or @samp{xz}.
 @chapter Zcat
 @cindex zcat

-zcat copies each given file to standard output. If any given file is
-compressed, its decompressed content is used. If a given file does not
-exist, and its name does not end with one of the known extensions, zcat
-tries the compressed file names corresponding to the formats supported.
-If a file fails to decompress, zcat continues copying the rest of the
-files.
+zcat copies each @var{file} argument to standard output in sequence. If any
+file given is compressed, its decompressed content is copied. If a file
+given does not exist, and its name does not end with one of the known
+extensions, zcat tries the compressed file names corresponding to the
+formats supported. If a file fails to decompress, zcat continues copying the
+rest of the files.

 If a file is specified as @samp{-}, data are read from standard input,
 decompressed if needed, and sent to standard output. Data read from
 standard input must be of the same type; all uncompressed or all in the
-same compression format.
+same compressed format.

-If no files are specified, recursive searches examine the current
-working directory, and nonrecursive searches read standard input.
+If no files are specified, recursive searches examine the current working
+directory, and nonrecursive searches read standard input.

 The format for running zcat is:

@@ -241,7 +249,7 @@ zcat [@var{options}] [@var{files}]
 @end example

 @noindent
-Exit status is 0 if no errors occurred, non-zero otherwise.
+Exit status is 0 if no errors occurred, 1 otherwise.

 zcat supports the following options:

@@ -268,8 +276,8 @@ Number all output lines, starting with 1. The line count is unlimited.

 @item -O @var{format}
 @itemx --force-format=@var{format}
-Force the given compression format. Valid values for @var{format} are
-@samp{bz2}, @samp{gz}, @samp{lz} and @samp{xz}. If this option is used,
+Force the compressed format given. Valid values for @var{format} are
+@samp{bz2}, @samp{gz}, @samp{lz}, and @samp{xz}. If this option is used,
 the files are passed to the corresponding decompressor without verifying
 their format, and the exact file name must be given. Other names won't
 be tried.
@@ -280,14 +288,14 @@ Quiet operation. Suppress all messages.

 @item -r
 @itemx --recursive
-For each directory operand, read and process all files in that
-directory, recursively. Follow symbolic links in the command line, but
-skip symlinks that are encountered recursively.
+For each directory operand, read and process all files in that directory,
+recursively. Follow symbolic links given in the command line, but skip
+symbolic links that are encountered recursively.

 @item -R
 @itemx --dereference-recursive
-For each directory operand, read and process all files in that
-directory, recursively, following all symbolic links.
+For each directory operand, read and process all files in that directory,
+recursively, following all symbolic links.

 @item -s
 @itemx --squeeze-blank
@@ -316,11 +324,12 @@ Verbose mode. Show error messages.
 @chapter Zcmp
 @cindex zcmp

-zcmp compares two files (@samp{-} means standard input), and if they
-differ, tells the first byte and line number where they differ. Bytes
-and lines are numbered starting with 1. If any given file is compressed,
-its decompressed content is used. Compressed files are decompressed on
-the fly; no temporary files are created.
+zcmp compares two files and, if they differ, writes to standard output the
+first byte and line number where they differ. Bytes and lines are numbered
+starting with 1. A hyphen @samp{-} used as a @var{file} argument means
+standard input. If any file given is compressed, its decompressed content is
+used. Compressed files are decompressed on the fly; no temporary files are
+created.

 The format for running zcmp is:

@@ -329,10 +338,11 @@ zcmp [@var{options}] @var{file1} [@var{file2}]
 @end example

 @noindent
-This compares @var{file1} to @var{file2}. If @var{file2} is omitted zcmp
-tries the following:
+This compares @var{file1} to @var{file2}. The standard input is used only if
+@var{file1} or @var{file2} refers to standard input. If @var{file2} is
+omitted zcmp tries the following:

-@enumerate
+@itemize -
 @item
 If @var{file1} is compressed, compares its decompressed contents with
 the corresponding uncompressed file (the name of @var{file1} with the
@@ -340,10 +350,7 @@ extension removed).
 @item
 If @var{file1} is uncompressed, compares it with the decompressed
 contents of @var{file1}.[lz|bz2|gz|xz] (the first one that is found).
-@item
-If no suitable file is found, compares @var{file1} with data read from
-standard input.
-@end enumerate
+@end itemize

 @noindent
 An exit status of 0 means no differences were found, 1 means some
@@ -379,10 +386,10 @@ Compare at most @var{count} input bytes.

 @item -O [@var{format1}][,@var{format2}]
 @itemx --force-format=[@var{format1}][,@var{format2}]
-Force the given compression formats. Any of @var{format1} or
+Force the compressed formats given. Any of @var{format1} or
 @var{format2} may be omitted and the corresponding format will be
 automatically detected. Valid values for @var{format} are @samp{bz2},
-@samp{gz}, @samp{lz} and @samp{xz}. If at least one format is specified
+@samp{gz}, @samp{lz}, and @samp{xz}. If at least one format is specified
 with this option, the file is passed to the corresponding decompressor
 without verifying its format, and the exact file names of both
 @var{file1} and @var{file2} must be given. Other names won't be tried.
@@ -401,11 +408,12 @@ files differ.
 @chapter Zdiff
 @cindex zdiff

-zdiff compares two files (@samp{-} means standard input), and if they
-differ, shows the differences line by line. If any given file is
-compressed, its decompressed content is used. zdiff is a front end to
-the diff program and has the limitation that messages from diff refer to
-temporary file names instead of those specified.
+zdiff compares two files and, if they differ, writes to standard output the
+differences line by line. A hyphen @samp{-} used as a @var{file} argument
+means standard input. If any file given is compressed, its decompressed
+content is used. zdiff is a front end to the program diff and has the
+limitation that messages from diff refer to temporary file names instead of
+those specified.

 The format for running zdiff is:

@@ -414,10 +422,11 @@ zdiff [@var{options}] @var{file1} [@var{file2}]
 @end example

 @noindent
-This compares @var{file1} to @var{file2}. If @var{file2} is omitted
-zdiff tries the following:
+This compares @var{file1} to @var{file2}. The standard input is used only if
+@var{file1} or @var{file2} refers to standard input. If @var{file2} is
+omitted zdiff tries the following:

-@enumerate
+@itemize -
 @item
 If @var{file1} is compressed, compares its decompressed contents with
 the corresponding uncompressed file (the name of @var{file1} with the
@@ -425,10 +434,7 @@ extension removed).
 @item
 If @var{file1} is uncompressed, compares it with the decompressed
 contents of @var{file1}.[lz|bz2|gz|xz] (the first one that is found).
-@item
-If no suitable file is found, compares @var{file1} with data read from
-standard input.
-@end enumerate
+@end itemize

 @noindent
 An exit status of 0 means no differences were found, 1 means some
@@ -471,10 +477,10 @@ Ignore case differences in file contents.

 @item -O [@var{format1}][,@var{format2}]
 @itemx --force-format=[@var{format1}][,@var{format2}]
-Force the given compression formats. Any of @var{format1} or
+Force the compressed formats given. Any of @var{format1} or
 @var{format2} may be omitted and the corresponding format will be
 automatically detected. Valid values for @var{format} are @samp{bz2},
-@samp{gz}, @samp{lz} and @samp{xz}. If at least one format is specified
+@samp{gz}, @samp{lz}, and @samp{xz}. If at least one format is specified
 with this option, the file is passed to the corresponding decompressor
 without verifying its format, and the exact file names of both
 @var{file1} and @var{file2} must be given. Other names won't be tried.
@@ -517,9 +523,9 @@ Ignore all white space.
 @chapter Zgrep
 @cindex zgrep

-zgrep is a front end to the grep program that allows transparent search
-on any combination of compressed and uncompressed files. If any given
-file is compressed, its decompressed content is used. If a given file
+zgrep is a front end to the program grep that allows transparent search
+on any combination of compressed and uncompressed files. If any file
+given is compressed, its decompressed content is used. If a file given
 does not exist, and its name does not end with one of the known
 extensions, zgrep tries the compressed file names corresponding to the
 formats supported. If a file fails to decompress, zgrep continues
@@ -528,10 +534,10 @@ searching the rest of the files.
 If a file is specified as @samp{-}, data are read from standard input,
 decompressed if needed, and fed to grep. Data read from standard input
 must be of the same type; all uncompressed or all in the same
-compression format.
+compressed format.

-If no files are specified, recursive searches examine the current
-working directory, and nonrecursive searches read standard input.
+If no files are specified, recursive searches examine the current working
+directory, and nonrecursive searches read standard input.

 The format for running zgrep is:

@@ -572,7 +578,7 @@ Only print a count of matching lines per file.
 Print @var{n} lines of output context.

 @item --color[=@var{when}]
-Show matched strings in color. @var{when} is @samp{never}, @samp{always}
+Show matched strings in color. @var{when} is @samp{never}, @samp{always},
 or @samp{auto}.

 @item -e @var{pattern}
@@ -587,9 +593,9 @@ Treat @var{pattern} as an extended regular expression.
 @itemx --file=@var{file}
 Obtain patterns from @var{file}, one per line.@*
 When searching in several files at once, command substitution can be
-used with @code{-e} to read @var{file} only once, for example if
+used with @samp{-e} to read @var{file} only once, for example if
 @var{file} is not a regular file:
-@w{@code{zgrep -e "$(cat @var{file})" file1.lz file2.gz}}
+@w{@samp{zgrep -e "$(cat @var{file})" file1.lz file2.gz}}

 @item -F
 @itemx --fixed-strings
@@ -633,8 +639,8 @@ Show only the part of matching lines that actually matches @var{pattern}.

 @item -O @var{format}
 @itemx --force-format=@var{format}
-Force the given compression format. Valid values for @var{format} are
-@samp{bz2}, @samp{gz}, @samp{lz} and @samp{xz}. If this option is used,
+Force the compressed format given. Valid values for @var{format} are
+@samp{bz2}, @samp{gz}, @samp{lz}, and @samp{xz}. If this option is used,
 the files are passed to the corresponding decompressor without verifying
 their format, and the exact file name must be given. Other names won't
 be tried.
@@ -646,14 +652,14 @@ found, even if an error was detected.

 @item -r
 @itemx --recursive
-For each directory operand, read and process all files in that
-directory, recursively. Follow symbolic links in the command line, but
-skip symlinks that are encountered recursively.
+For each directory operand, read and process all files in that directory,
+recursively. Follow symbolic links given in the command line, but skip
+symbolic links that are encountered recursively.

 @item -R
 @itemx --dereference-recursive
-For each directory operand, read and process all files in that
-directory, recursively, following all symbolic links.
+For each directory operand, read and process all files in that directory,
+recursively, following all symbolic links.

 @item -s
 @itemx --no-messages
@@ -681,15 +687,17 @@ Match only whole lines.
 @chapter Ztest
 @cindex ztest

-ztest verifies the integrity of the specified compressed files.
+ztest verifies the integrity of the compressed files specified.
 Uncompressed files are ignored. If a file is specified as @samp{-}, the
 integrity of compressed data read from standard input is verified. Data
-read from standard input must be all in the same compression format. If
-a file fails to decompress, ztest continues verifying the rest of the
-files.
+read from standard input must be all in the same compressed format. If
+a file fails to decompress, does not exist, can't be opened, or is a
+terminal, ztest continues verifying the rest of the files. A final
+diagnostic is shown at verbosity level 1 or higher if any file fails the
+test when testing multiple files.

-If no files are specified, recursive searches examine the current
-working directory, and nonrecursive searches read standard input.
+If no files are specified, recursive searches examine the current working
+directory, and nonrecursive searches read standard input.

 Note that error detection in the xz format is broken. First, some xz
 files lack integrity information. Second, not all xz decompressors can
@@ -717,13 +725,12 @@ ztest supports the following options:
 @table @code
 @item -O @var{format}
 @itemx --force-format=@var{format}
-Force the given compression format. Valid values for @var{format} are
-@samp{bz2}, @samp{gz}, @samp{lz} and @samp{xz}. If this option is used,
-the files are passed to the corresponding decompressor without verifying
-their format, and any files in a format that the decompressor can't
-understand will fail. For example, @samp{--force-format=gz} can test
-gzipped (.gz) and compress'd (.Z) files if the compressor used is GNU
-gzip.
+Force the compressed format given. Valid values for @var{format} are
+@samp{bz2}, @samp{gz}, @samp{lz}, and @samp{xz}. If this option is used, the
+files are passed to the corresponding decompressor without verifying their
+format, and any files in a format that the decompressor can't understand
+will fail. For example, @samp{--force-format=gz} can test gzipped (.gz) and
+compress'd (.Z) files if the compressor used is GNU gzip.

 @item -q
 @itemx --quiet
@@ -731,14 +738,14 @@ Quiet operation. Suppress all messages.

 @item -r
 @itemx --recursive
-For each directory operand, read and process all files in that
-directory, recursively. Follow symbolic links in the command line, but
-skip symlinks that are encountered recursively.
+For each directory operand, read and process all files in that directory,
+recursively. Follow symbolic links given in the command line, but skip
+symbolic links that are encountered recursively.

 @item -R
 @itemx --dereference-recursive
-For each directory operand, read and process all files in that
-directory, recursively, following all symbolic links.
+For each directory operand, read and process all files in that directory,
+recursively, following all symbolic links.

 @item -v
 @itemx --verbose
@@ -752,40 +759,41 @@ Further -v's increase the verbosity level.
 @chapter Zupdate
 @cindex zupdate

-zupdate recompresses files from bzip2, gzip, and xz formats to lzip
-format. Each original is compared with the new file and then deleted.
-Only regular files with standard file name extensions are recompressed,
-other files are ignored. Compressed files are decompressed and then
-recompressed on the fly; no temporary files are created. If an error
-happens while recompressing a file, zupdate exits immediately without
-recompressing the rest of the files. The lzip format is chosen as
-destination because it is the most appropriate for long-term data
-archiving.
+zupdate recompresses files from bzip2, gzip, and xz formats to lzip format.
+Each original is compared with the new file and then deleted. Only regular
+files with standard file name extensions are recompressed, other files are
+ignored. Compressed files are decompressed and then recompressed on the fly;
+no temporary files are created. If an error happens while recompressing a
+file, zupdate exits immediately without recompressing the rest of the files.
+The lzip format is chosen as destination because it is the most appropriate
+for long-term data archiving.

-If no files are specified, recursive searches examine the current
-working directory, and nonrecursive searches do nothing.
+If no files are specified, recursive searches examine the current working
+directory, and nonrecursive searches do nothing.

 If the lzip compressed version of a file already exists, the file is
-skipped unless the @samp{--force} option is given. In this case, if the
+skipped unless the option @samp{--force} is given. In this case, if the
 comparison with the existing lzip version fails, an error is returned
 and the original file is not deleted. The operation of zupdate is meant
-to be safe and not produce any data loss. Therefore, existing lzip
+to be safe and not cause any data loss. Therefore, existing lzip
 compressed files are never overwritten nor deleted.

-Combining the @samp{--force} and @samp{--keep} options, as in
-@w{@code{zupdate -f -k *.gz}}, verifies that there are no differences
+Combining the options @samp{--force} and @samp{--keep}, as in
+@w{@samp{zupdate -f -k *.gz}}, verifies that there are no differences
 between each pair of files in a multiformat set of files.

-The names of the original files must have one of the following
-extensions: @samp{.bz2}, @samp{.tbz}, @samp{.tbz2}, @samp{.gz},
-@samp{.tgz}, @samp{.xz}, @samp{.txz}. The files produced have the
-extensions @samp{.lz} or @samp{.tar.lz}.
+The names of the original files must have one of the following extensions:@*
+@samp{.bz2}, @samp{.gz}, and @samp{.xz} are recompressed to @samp{.lz}.@*
+@samp{.tbz}, @samp{.tbz2}, @samp{.tgz}, and @samp{.txz} are recompressed to
+@samp{.tlz}.@*
+Keeping the combined extensions (@samp{.tgz} --> @samp{.tlz}) may be useful
+when recompressing Slackware packages, for example.

-Recompressing a file is much like copying or moving it; therefore
-zupdate preserves the access and modification dates, permissions, and,
-when possible, ownership of the file just as @samp{cp -p} does. (If the user
-ID or the group ID can't be duplicated, the file permission bits S_ISUID
-and S_ISGID are cleared).
+Recompressing a file is much like copying or moving it; therefore zupdate
+preserves the access and modification dates, permissions, and, when
+possible, ownership of the file just as @samp{cp -p} does. (If the user ID or
+the group ID can't be duplicated, the file permission bits S_ISUID and
+S_ISGID are cleared).

 The format for running zupdate is:

@@ -794,9 +802,8 @@ zupdate [@var{options}] [@var{files}]
 @end example

 @noindent
-Exit status is 0 if all the compressed files were successfully
-recompressed (if needed), compared and deleted (if requested). Non-zero
-otherwise.
+Exit status is 0 if all the compressed files were successfully recompressed
+(if needed), compared, and deleted (if requested). Non-zero otherwise.

 zupdate supports the following options:

@@ -814,8 +821,8 @@ Keep (don't delete) the input file after comparing it with the lzip file.

 @item -l
 @itemx --lzip-verbose
-Pass a @samp{-v} option to the lzip compressor so that it shows the
-compression ratio for each file processed. Using lzip 1.15 and newer, a
+Pass one option @samp{-v} to the lzip compressor so that it shows the
+compression ratio for each file processed. Using lzip 1.15 or newer, a
 second @samp{-l} shows the progress of compression. Use it together with
 @samp{-v} to see the name of the file.

@@ -825,14 +832,14 @@ Quiet operation. Suppress all messages.

 @item -r
 @itemx --recursive
-For each directory operand, read and process all files in that
-directory, recursively. Follow symbolic links in the command line, but
-skip symlinks that are encountered recursively.
+For each directory operand, read and process all files in that directory,
+recursively. Follow symbolic links given in the command line, but skip
+symbolic links that are encountered recursively.

 @item -R
 @itemx --dereference-recursive
-For each directory operand, read and process all files in that
-directory, recursively, following all symbolic links.
+For each directory operand, read and process all files in that directory,
+recursively, following all symbolic links.

 @item -v
 @itemx --verbose
@@ -840,8 +847,9 @@ Verbose mode. Show the files being processed. A second @samp{-v} also shows
 the files being ignored.

 @item -0 .. -9
-Set the compression level of lzip. By default zupdate passes @samp{-9}
-to lzip.
+Set the compression level of lzip. By default zupdate passes @samp{-9} to
+lzip. Custom compression options can be passed to lzip with the option
+@samp{--lz}. For example @w{@samp{--lz='lzip -9 -s64MiB'}}.

 @end table

@@ -858,7 +866,7 @@ for all eternity, if not longer.

 If you find a bug in zutils, please send electronic mail to
 @email{zutils-bug@@nongnu.org}. Include the version number, which you can
-find by running @w{@samp{zupdate --version}}.
+find by running @w{@samp{zupdate --version}}.


 @node Concept index
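
Note on the zupdate extension change in this patch: the new manual text states that `.bz2`, `.gz`, and `.xz` are recompressed to `.lz`, while the combined extensions `.tbz`, `.tbz2`, `.tgz`, and `.txz` are recompressed to `.tlz` (previously documented as `.tar.lz`). The rule can be sketched as below. This is a minimal illustration based only on the manual wording; the function name `lzip_name` and the two tuples are hypothetical and are not taken from the zupdate source, which may handle details such as letter case differently.

```python
# Illustrative sketch only: models the extension rule described in the
# patched manual text. It is NOT the zupdate implementation.
COMBINED = (".tbz2", ".tbz", ".tgz", ".txz")  # documented as becoming ".tlz"
SIMPLE = (".bz2", ".gz", ".xz")               # documented as becoming ".lz"


def lzip_name(name):
    """Return the expected lzip file name, or None if the extension is not listed."""
    # Check combined extensions first so that ".tgz" maps to ".tlz".
    for ext in COMBINED:
        if name.endswith(ext):
            return name[: -len(ext)] + ".tlz"
    for ext in SIMPLE:
        if name.endswith(ext):
            return name[: -len(ext)] + ".lz"
    # Any other name is ignored, as the manual says for nonstandard extensions.
    return None


if __name__ == "__main__":
    for sample in ("pkg.tgz", "data.tar.gz", "notes.txt"):
        print(sample, "->", lzip_name(sample))
```

Under this reading, a name such as `data.tar.gz` keeps its `.tar` part and becomes `data.tar.lz`, while `pkg.tgz` becomes `pkg.tlz`.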