Diffstat (limited to 'doc/zutils.info')
-rw-r--r-- | doc/zutils.info | 838 |
1 files changed, 838 insertions, 0 deletions
diff --git a/doc/zutils.info b/doc/zutils.info new file mode 100644 index 0000000..854100f --- /dev/null +++ b/doc/zutils.info @@ -0,0 +1,838 @@ +This is zutils.info, produced by makeinfo version 4.13+ from zutils.texi. + +INFO-DIR-SECTION Data Compression +START-INFO-DIR-ENTRY +* Zutils: (zutils). Utilities dealing with compressed files +END-INFO-DIR-ENTRY + + +File: zutils.info, Node: Top, Next: Introduction, Up: (dir) + +Zutils Manual +************* + +This manual is for Zutils (version 1.10, 5 January 2021). + +* Menu: + +* Introduction:: Purpose and features of zutils +* Common options:: Options common to all utilities +* The zutilsrc file:: The zutils configuration file +* Zcat:: Concatenating compressed files +* Zcmp:: Comparing compressed files byte by byte +* Zdiff:: Comparing compressed files line by line +* Zgrep:: Searching inside compressed files +* Ztest:: Testing the integrity of compressed files +* Zupdate:: Recompressing files to lzip format +* Problems:: Reporting bugs +* Concept index:: Index of concepts + + + Copyright (C) 2009-2021 Antonio Diaz Diaz. + + This manual is free documentation: you have unlimited permission to copy, +distribute, and modify it. + + +File: zutils.info, Node: Introduction, Next: Common options, Prev: Top, Up: Top + +1 Introduction +************** + +Zutils is a collection of utilities able to process any combination of +compressed and uncompressed files transparently. If any file given, +including standard input, is compressed, its decompressed content is used. +Compressed files are decompressed on the fly; no temporary files are +created. + + These utilities are not wrapper scripts but safer and more efficient C++ +programs. In particular the option '--recursive' is very efficient in those +utilities supporting it. + +The utilities provided are zcat, zcmp, zdiff, zgrep, ztest, and zupdate. +The formats supported are bzip2, gzip, lzip, and xz. +Zutils uses external compressors. 
The compressor to be used for each format +is configurable at runtime. + + zcat, zcmp, zdiff, and zgrep are improved replacements for the shell +scripts provided by GNU gzip. ztest is unique to zutils. zupdate is similar +to gzip's znew. + + NOTE: Bzip2 and lzip provide well-defined values of exit status, which +makes them safe to use with zutils. Gzip and xz may return ambiguous warning +values, making them less reliable back ends for zutils. *Note +compressor-requirements::. + + FORMAT NOTE 1: The option '--format' allows the processing of a subset +of formats in recursive mode and when trying compressed file names: +'zgrep foo -r --format=bz2,lz somedir somefile.tar'. + + FORMAT NOTE 2: If the option '--force-format' is given, the files are +passed to the corresponding decompressor without verifying their format, +allowing for example the processing of compress'd (.Z) files with gzip: +'zcmp --force-format=gz file.Z file.lz'. + + LANGUAGE NOTE: Uncompressed = not compressed = plain data; it may never +have been compressed. Decompressed is used to refer to data which have +undergone the process of decompression. + + + Numbers given as arguments to options (positions, sizes) may be followed +by a multiplier and an optional 'B' for "byte". + + Table of SI and binary prefixes (unit multipliers): + +Prefix Value | Prefix Value +k kilobyte (10^3 = 1000) | Ki kibibyte (2^10 = 1024) +M megabyte (10^6) | Mi mebibyte (2^20) +G gigabyte (10^9) | Gi gibibyte (2^30) +T terabyte (10^12) | Ti tebibyte (2^40) +P petabyte (10^15) | Pi pebibyte (2^50) +E exabyte (10^18) | Ei exbibyte (2^60) +Z zettabyte (10^21) | Zi zebibyte (2^70) +Y yottabyte (10^24) | Yi yobibyte (2^80) + + +File: zutils.info, Node: Common options, Next: The zutilsrc file, Prev: Introduction, Up: Top + +2 Common options +**************** + +The following options are available in all the utilities. Rather than +writing identical descriptions for each of the programs, they are described +here. 
*Note Argument syntax: (arg_parser)Argument syntax. + +'-h' +'--help' + Print an informative help message describing the options and exit. + zgrep only supports the '--help' form of this option. + +'-V' +'--version' + Print the version number on the standard output and exit. This version + number should be included in all bug reports. + +'-M FORMAT_LIST' +'--format=FORMAT_LIST' + Process only the formats listed in the comma-separated FORMAT_LIST. + Valid formats are 'bz2', 'gz', 'lz', 'xz', and 'un' for + 'uncompressed', meaning "any file name without a known extension". + This option excludes files based on extension, instead of format, + because it is more efficient. The exclusion only applies to names + generated automatically (for example when adding extensions to a file + name or when operating recursively on directories). Files given in the + command line are always processed. + + Each format in FORMAT_LIST enables file names with the following + extensions: + + bz2 enables .bz2 .tbz .tbz2 + gz enables .gz .tgz + lz enables .lz .tlz + xz enables .xz .txz + un enables any other file name + +'-N' +'--no-rcfile' + Don't read the runtime configuration file 'zutilsrc'. + +'--bz2=COMMAND' +'--gz=COMMAND' +'--lz=COMMAND' +'--xz=COMMAND' + Set program to be used as (de)compressor for the corresponding format. + COMMAND may include arguments. For example '--lz='plzip --threads=2''. + The program set with '--lz' is used for both compression and + decompression. The other three are used only for decompression. The + name of the program can't begin with '-'. These options override the + values set in 'zutilsrc'. The compression program used must meet three + requirements: + + 1. When called with the option '-d', it must read compressed data + from the standard input and produce decompressed data on the + standard output. + + 2. If the option '-q' is passed to zutils, the compression program + must also accept it. + + 3. 
It must return 0 if no errors occurred, and a non-zero value + otherwise. + + + +File: zutils.info, Node: The zutilsrc file, Next: Zcat, Prev: Common options, Up: Top + +3 The zutils configuration file 'zutilsrc' +****************************************** + +'zutilsrc' is the runtime configuration file for zutils. In it you may +define the compressor name and options to be used for each format. +'zutilsrc' is optional; you don't need to install it in order to run zutils. + + The compressors specified in the command line override those specified +in 'zutilsrc'. + + You may copy the system 'zutilsrc' file '${sysconfdir}/zutilsrc' to +'$HOME/.zutilsrc' and customize these options as you like. The file syntax +is fairly obvious (and there are further instructions in it): + + 1. Any line beginning with '#' is a comment line. + + 2. Each non-comment line defines the command to be used for the + corresponding format, with the syntax: + <format> = <compressor> [options] + where <format> is one of 'bz2', 'gz', 'lz', or 'xz'. + + +File: zutils.info, Node: Zcat, Next: Zcmp, Prev: The zutilsrc file, Up: Top + +4 Zcat +****** + +zcat copies each FILE argument to standard output in sequence. If any file +given is compressed, its decompressed content is copied. If a file given +does not exist, and its name does not end with one of the known extensions, +zcat tries the compressed file names corresponding to the formats +supported. If a file fails to decompress, zcat continues copying the rest +of the files. + + If a file is specified as '-', data are read from standard input, +decompressed if needed, and sent to standard output. Data read from +standard input must be of the same type; all uncompressed or all in the +same compressed format. + + If no files are specified, recursive searches examine the current working +directory, and nonrecursive searches read standard input. 
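The name lookup that zcat performs for a missing file can be sketched as follows. This is only an illustration of the rule stated in this section, not the code zutils uses: the function name `candidate_names`, the injectable `exists` parameter, and the order in which extensions are tried (borrowed from the 'FILE1.[lz|bz2|gz|xz]' wording used for zcmp) are all assumptions.

```python
import os

# Extensions listed under '--format' in this manual.
KNOWN_EXTENSIONS = (".bz2", ".tbz", ".tbz2", ".gz", ".tgz",
                    ".lz", ".tlz", ".xz", ".txz")

# ASSUMPTION: the try order mirrors 'FILE1.[lz|bz2|gz|xz]' from the zcmp
# section; the real programs may use a different order.
TRY_ORDER = (".lz", ".bz2", ".gz", ".xz")


def candidate_names(name, exists=os.path.exists):
    """Return the file names to try for NAME, in order (hypothetical helper).

    If NAME exists, or already ends with a known extension, only NAME
    itself is tried. Otherwise each compressed extension is appended.
    """
    if exists(name) or name.endswith(KNOWN_EXTENSIONS):
        return [name]
    return [name + ext for ext in TRY_ORDER]
```

Passing a stub for `exists` keeps the sketch independent of the file system, which makes the rule easy to check in isolation.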
+ + The format for running zcat is: + + zcat [OPTIONS] [FILES] + +Exit status is 0 if no errors occurred, 1 otherwise. + + zcat supports the following options: + +'-A' +'--show-all' + Equivalent to '-vET'. + +'-b' +'--number-nonblank' + Number all nonblank output lines, starting with 1. The line count is + unlimited. + +'-e' + Equivalent to '-vE'. + +'-E' +'--show-ends' + Print a '$' after the end of each line. + +'-n' +'--number' + Number all output lines, starting with 1. The line count is unlimited. + +'-O FORMAT' +'--force-format=FORMAT' + Force the compressed format given. Valid values for FORMAT are 'bz2', + 'gz', 'lz', and 'xz'. If this option is used, the files are passed to + the corresponding decompressor without verifying their format, and the + exact file name must be given. Other names won't be tried. + +'-q' +'--quiet' + Quiet operation. Suppress all messages. + +'-r' +'--recursive' + For each directory operand, read and process all files in that + directory, recursively. Follow symbolic links given in the command + line, but skip symbolic links that are encountered recursively. + +'-R' +'--dereference-recursive' + For each directory operand, read and process all files in that + directory, recursively, following all symbolic links. + +'-s' +'--squeeze-blank' + Replace multiple adjacent blank lines with a single blank line. + +'-t' + Equivalent to '-vT'. + +'-T' +'--show-tabs' + Print TAB characters as '^I'. + +'-v' +'--show-nonprinting' + Print control characters except for LF (newline) and TAB using '^' + notation and precede characters larger than 127 with 'M-' (which + stands for "meta"). + +'--verbose' + Verbose mode. Show error messages. + + + +File: zutils.info, Node: Zcmp, Next: Zdiff, Prev: Zcat, Up: Top + +5 Zcmp +****** + +zcmp compares two files and, if they differ, writes to standard output the +first byte and line number where they differ. Bytes and lines are numbered +starting with 1. 
A hyphen '-' used as a FILE argument means standard input. +If any file given is compressed, its decompressed content is used. +Compressed files are decompressed on the fly; no temporary files are +created. + + The format for running zcmp is: + + zcmp [OPTIONS] FILE1 [FILE2] + +This compares FILE1 to FILE2. The standard input is used only if FILE1 or +FILE2 refers to standard input. If FILE2 is omitted zcmp tries the +following: + + - If FILE1 is compressed, compares its decompressed contents with the + corresponding uncompressed file (the name of FILE1 with the extension + removed). + + - If FILE1 is uncompressed, compares it with the decompressed contents + of FILE1.[lz|bz2|gz|xz] (the first one that is found). + +An exit status of 0 means no differences were found, 1 means some +differences were found, and 2 means trouble. + + zcmp supports the following options: + +'-b' +'--print-bytes' + Print the differing bytes. Print control bytes as a '^' followed by a + letter, and precede bytes larger than 127 with 'M-' (which stands for + "meta"). + +'-i SIZE' +'--ignore-initial=SIZE' + Ignore any differences in the first SIZE bytes of the input files. + Treat files with fewer than SIZE bytes as if they were empty. If SIZE + is in the form 'SIZE1:SIZE2', ignore the first SIZE1 bytes of the + first input file and the first SIZE2 bytes of the second input file. + +'-l' +'-v' +'--list' +'--verbose' + Print the byte numbers (in decimal) and values (in octal) of all + differing bytes. + +'-n COUNT' +'--bytes=COUNT' + Compare at most COUNT input bytes. + +'-O [FORMAT1][,FORMAT2]' +'--force-format=[FORMAT1][,FORMAT2]' + Force the compressed formats given. Any of FORMAT1 or FORMAT2 may be + omitted and the corresponding format will be automatically detected. + Valid values for FORMAT are 'bz2', 'gz', 'lz', and 'xz'. 
If at least + one format is specified with this option, the file is passed to the + corresponding decompressor without verifying its format, and the exact + file names of both FILE1 and FILE2 must be given. Other names won't be + tried. + +'-q' +'-s' +'--quiet' +'--silent' + Don't print anything; only return an exit status indicating whether the + files differ. + + + +File: zutils.info, Node: Zdiff, Next: Zgrep, Prev: Zcmp, Up: Top + +6 Zdiff +******* + +zdiff compares two files and, if they differ, writes to standard output the +differences line by line. A hyphen '-' used as a FILE argument means +standard input. If any file given is compressed, its decompressed content +is used. zdiff is a front end to the program diff and has the limitation +that messages from diff refer to temporary file names instead of those +specified. + + The format for running zdiff is: + + zdiff [OPTIONS] FILE1 [FILE2] + +This compares FILE1 to FILE2. The standard input is used only if FILE1 or +FILE2 refers to standard input. If FILE2 is omitted zdiff tries the +following: + + - If FILE1 is compressed, compares its decompressed contents with the + corresponding uncompressed file (the name of FILE1 with the extension + removed). + + - If FILE1 is uncompressed, compares it with the decompressed contents + of FILE1.[lz|bz2|gz|xz] (the first one that is found). + +An exit status of 0 means no differences were found, 1 means some +differences were found, and 2 means trouble. + + zdiff supports the following options (some options only work if the diff +program used supports them): + +'-a' +'--text' + Treat all files as text. + +'-b' +'--ignore-space-change' + Ignore changes in the amount of white space. + +'-B' +'--ignore-blank-lines' + Ignore changes whose lines are all blank. + +'-c' + Use the context output format. + +'-C N' +'--context=N' + Same as -c but use N lines of context. + +'-d' +'--minimal' + Try hard to find a smaller set of changes. 
+ +'-E' +'--ignore-tab-expansion' + Ignore changes due to tab expansion. + +'-i' +'--ignore-case' + Ignore case differences in file contents. + +'-O [FORMAT1][,FORMAT2]' +'--force-format=[FORMAT1][,FORMAT2]' + Force the compressed formats given. Any of FORMAT1 or FORMAT2 may be + omitted and the corresponding format will be automatically detected. + Valid values for FORMAT are 'bz2', 'gz', 'lz', and 'xz'. If at least + one format is specified with this option, the file is passed to the + corresponding decompressor without verifying its format, and the exact + file names of both FILE1 and FILE2 must be given. Other names won't be + tried. + +'-p' +'--show-c-function' + Show which C function each change is in. + +'-q' +'--brief' + Output only whether files differ. + +'-s' +'--report-identical-files' + Report when two files are identical. + +'-t' +'--expand-tabs' + Expand tabs to spaces in output. + +'-T' +'--initial-tab' + Make tabs line up by prepending a tab. + +'-u' + Use the unified output format. + +'-U N' +'--unified=N' + Same as -u but use N lines of context. + +'-w' +'--ignore-all-space' + Ignore all white space. + + + +File: zutils.info, Node: Zgrep, Next: Ztest, Prev: Zdiff, Up: Top + +7 Zgrep +******* + +zgrep is a front end to the program grep that allows transparent search on +any combination of compressed and uncompressed files. If any file given is +compressed, its decompressed content is used. If a file given does not +exist, and its name does not end with one of the known extensions, zgrep +tries the compressed file names corresponding to the formats supported. If +a file fails to decompress, zgrep continues searching the rest of the files. + + If a file is specified as '-', data are read from standard input, +decompressed if needed, and fed to grep. Data read from standard input must +be of the same type; all uncompressed or all in the same compressed format. 
+ + If no files are specified, recursive searches examine the current working +directory, and nonrecursive searches read standard input. + + The format for running zgrep is: + + zgrep [OPTIONS] PATTERN [FILES] + +An exit status of 0 means at least one match was found, 1 means no matches +were found, and 2 means trouble. + + zgrep supports the following options (some options only work if the grep +program used supports them): + +'-a' +'--text' + Treat all files as text. + +'-A N' +'--after-context=N' + Print N lines of trailing context. + +'-b' +'--byte-offset' + Print the byte offset of each line. + +'-B N' +'--before-context=N' + Print N lines of leading context. + +'-c' +'--count' + Only print a count of matching lines per file. + +'-C N' +'--context=N' + Print N lines of output context. + +'--color[=WHEN]' + Show matched strings in color. WHEN is 'never', 'always', or 'auto'. + +'-e PATTERN' +'--regexp=PATTERN' + Use PATTERN as the pattern to match. + +'-E' +'--extended-regexp' + Treat PATTERN as an extended regular expression. + +'-f FILE' +'--file=FILE' + Obtain patterns from FILE, one per line. + When searching in several files at once, command substitution can be + used with '-e' to read FILE only once, for example if FILE is not a + regular file: 'zgrep -e "$(cat FILE)" file1.lz file2.gz' + +'-F' +'--fixed-strings' + Treat PATTERN as a set of newline-separated strings. + +'-h' +'--no-filename' + Suppress the prefixing of file names on output when multiple files are + searched. + +'-H' +'--with-filename' + Print the file name for each match. + +'-i' +'--ignore-case' + Ignore case distinctions. + +'-I' + Ignore binary files. + +'-l' +'--files-with-matches' + Only print names of files containing at least one match. + +'-L' +'--files-without-match' + Only print names of files not containing any matches. 
+ Note: option -L fails (prints wrong results, returns wrong status, and + even hangs) when using GNU grep versions 3.2 to 3.4 inclusive because + of a wrong change in the exit status of grep, which was reverted in + GNU grep 3.5. + +'-m N' +'--max-count=N' + Stop after N matches. + +'-n' +'--line-number' + Prefix each matched line with its line number in the input file. + +'-o' +'--only-matching' + Show only the part of matching lines that actually matches PATTERN. + +'-O FORMAT' +'--force-format=FORMAT' + Force the compressed format given. Valid values for FORMAT are 'bz2', + 'gz', 'lz', and 'xz'. If this option is used, the files are passed to + the corresponding decompressor without verifying their format, and the + exact file name must be given. Other names won't be tried. + +'-q' +'--quiet' + Suppress all messages. Exit immediately with zero status if any match + is found, even if an error was detected. + +'-r' +'--recursive' + For each directory operand, read and process all files in that + directory, recursively. Follow symbolic links given in the command + line, but skip symbolic links that are encountered recursively. + +'-R' +'--dereference-recursive' + For each directory operand, read and process all files in that + directory, recursively, following all symbolic links. + +'-s' +'--no-messages' + Suppress error messages about nonexistent or unreadable files. + +'-v' +'--invert-match' + Select non-matching lines. + +'--verbose' + Verbose mode. Show error messages. + +'-w' +'--word-regexp' + Match only whole words. + +'-x' +'--line-regexp' + Match only whole lines. + + + +File: zutils.info, Node: Ztest, Next: Zupdate, Prev: Zgrep, Up: Top + +8 Ztest +******* + +ztest verifies the integrity of the compressed files specified. +Uncompressed files are ignored. If a file is specified as '-', the +integrity of compressed data read from standard input is verified. Data +read from standard input must be all in the same compressed format. 
If a +file fails to decompress, does not exist, can't be opened, or is a +terminal, ztest continues verifying the rest of the files. A final +diagnostic is shown at verbosity level 1 or higher if any file fails the +test when testing multiple files. + + If no files are specified, recursive searches examine the current working +directory, and nonrecursive searches read standard input. + + Note that error detection in the xz format is broken. First, some xz +files lack integrity information. Second, not all xz decompressors can +verify the integrity of all xz files. Third, section 2.1.1.2 'Stream Flags' +of the xz format specification allows xz decompressors to produce garbage +output without issuing any warning. Therefore, xz files can't always be +verified as reliably as files in the other formats can. + + The format for running ztest is: + + ztest [OPTIONS] [FILES] + +The exit status is 0 if all compressed files verify OK, 1 if environmental +problems (file not found, invalid flags, I/O errors, etc), 2 if any +compressed file is corrupt or invalid. + + ztest supports the following options: + +'-O FORMAT' +'--force-format=FORMAT' + Force the compressed format given. Valid values for FORMAT are 'bz2', + 'gz', 'lz', and 'xz'. If this option is used, the files are passed to + the corresponding decompressor without verifying their format, and any + files in a format that the decompressor can't understand will fail. + For example, '--force-format=gz' can test gzipped (.gz) and compress'd + (.Z) files if the compressor used is GNU gzip. + +'-q' +'--quiet' + Quiet operation. Suppress all messages. + +'-r' +'--recursive' + For each directory operand, read and process all files in that + directory, recursively. Follow symbolic links given in the command + line, but skip symbolic links that are encountered recursively. + +'-R' +'--dereference-recursive' + For each directory operand, read and process all files in that + directory, recursively, following all symbolic links. 
+ +'-v' +'--verbose' + Verbose mode. Show the verify status for each file processed. + Further -v's increase the verbosity level. + + + +File: zutils.info, Node: Zupdate, Next: Problems, Prev: Ztest, Up: Top + +9 Zupdate +********* + +zupdate recompresses files from bzip2, gzip, and xz formats to lzip format. +Each original is compared with the new file and then deleted. Only regular +files with standard file name extensions are recompressed, other files are +ignored. Compressed files are decompressed and then recompressed on the fly; +no temporary files are created. If an error happens while recompressing a +file, zupdate exits immediately without recompressing the rest of the files. +The lzip format is chosen as destination because it is the most appropriate +for long-term data archiving. + + If no files are specified, recursive searches examine the current working +directory, and nonrecursive searches do nothing. + + If the lzip compressed version of a file already exists, the file is +skipped unless the option '--force' is given. In this case, if the +comparison with the existing lzip version fails, an error is returned and +the original file is not deleted. The operation of zupdate is meant to be +safe and not cause any data loss. Therefore, existing lzip compressed files +are never overwritten nor deleted. + + Combining the options '--force' and '--keep', as in +'zupdate -f -k *.gz', verifies that there are no differences between each +pair of files in a multiformat set of files. + + The names of the original files must have one of the following +extensions: +'.bz2', '.gz', or '.xz', which are recompressed to '.lz'; +'.tbz', '.tbz2', '.tgz', or '.txz', which are recompressed to '.tlz'. +Keeping the combined extensions ('.tgz' -> '.tlz') may be useful when +recompressing Slackware packages, for example. 
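The extension mapping described above can be written down as a small lookup table. This is a sketch under stated assumptions, not the zupdate implementation: the names `RECOMPRESS_MAP` and `lzip_name` are hypothetical, and only the extension pairs come from the text.

```python
# Extension pairs taken from the paragraph above.
RECOMPRESS_MAP = {
    ".bz2": ".lz", ".gz": ".lz", ".xz": ".lz",
    ".tbz": ".tlz", ".tbz2": ".tlz", ".tgz": ".tlz", ".txz": ".tlz",
}


def lzip_name(name):
    """Return the lzip file name for NAME, or None if NAME would be ignored.

    Hypothetical helper: it only models the renaming rule, not the
    recompression, comparison, or deletion steps.
    """
    for old_ext, new_ext in RECOMPRESS_MAP.items():
        if name.endswith(old_ext):
            return name[: -len(old_ext)] + new_ext
    return None
```

For example, `lzip_name("pkg.tgz")` gives `"pkg.tlz"`, while a name without one of the listed extensions gives `None`, matching the statement that other files are ignored.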
+ + Recompressing a file is much like copying or moving it; therefore zupdate +preserves the access and modification dates, permissions, and, when +possible, ownership of the file just as 'cp -p' does. (If the user ID or +the group ID can't be duplicated, the file permission bits S_ISUID and +S_ISGID are cleared). + + The format for running zupdate is: + + zupdate [OPTIONS] [FILES] + +Exit status is 0 if all the compressed files were successfully recompressed +(if needed), compared, and deleted (if requested). Non-zero otherwise. + + zupdate supports the following options: + +'-f' +'--force' + Don't skip a file for which a lzip compressed version already exists. + '--force' compares the content of the input file with the content of + the existing lzip file and deletes the input file if both contents are + identical. + +'-k' +'--keep' + Keep (don't delete) the input file after comparing it with the lzip + file. + +'-l' +'--lzip-verbose' + Pass one option '-v' to the lzip compressor so that it shows the + compression ratio for each file processed. Using lzip 1.15 or newer, a + second '-l' shows the progress of compression. Use it together with + '-v' to see the name of the file. + +'-q' +'--quiet' + Quiet operation. Suppress all messages. + +'-r' +'--recursive' + For each directory operand, read and process all files in that + directory, recursively. Follow symbolic links given in the command + line, but skip symbolic links that are encountered recursively. + +'-R' +'--dereference-recursive' + For each directory operand, read and process all files in that + directory, recursively, following all symbolic links. + +'-v' +'--verbose' + Verbose mode. Show the files being processed. A second '-v' also shows + the files being ignored. + +'-0 .. -9' + Set the compression level of lzip. By default zupdate passes '-9' to + lzip. Custom compression options can be passed to lzip with the option + '--lz'. For example '--lz='lzip -9 -s64MiB''. 
+ + + +File: zutils.info, Node: Problems, Next: Concept index, Prev: Zupdate, Up: Top + +10 Reporting bugs +***************** + +There are probably bugs in zutils. There are certainly errors and omissions +in this manual. If you report them, they will get fixed. If you don't, no +one will ever know about them and they will remain unfixed for all +eternity, if not longer. + + If you find a bug in zutils, please send electronic mail to +<zutils-bug@nongnu.org>. Include the version number, which you can find by +running 'zupdate --version'. + + +File: zutils.info, Node: Concept index, Prev: Problems, Up: Top + +Concept index +************* + + +* Menu: + +* bugs: Problems. (line 6) +* common options: Common options. (line 6) +* getting help: Problems. (line 6) +* introduction: Introduction. (line 6) +* zcat: Zcat. (line 6) +* zcmp: Zcmp. (line 6) +* zdiff: Zdiff. (line 6) +* zgrep: Zgrep. (line 6) +* ztest: Ztest. (line 6) +* zupdate: Zupdate. (line 6) +* zutilsrc: The zutilsrc file. (line 6) + + + +Tag Table: +Node: Top222 +Node: Introduction1151 +Node: Common options3776 +Ref: compressor-requirements5847 +Node: The zutilsrc file6219 +Node: Zcat7180 +Node: Zcmp9743 +Node: Zdiff12233 +Node: Zgrep14973 +Node: Ztest19218 +Node: Zupdate21725 +Node: Problems25409 +Node: Concept index25943 + +End Tag Table + + +Local Variables: +coding: iso-8859-15 +End: |