Diffstat (limited to 'src/xz/xz.1')
-rw-r--r-- | src/xz/xz.1 | 469 |
1 files changed, 316 insertions, 153 deletions
diff --git a/src/xz/xz.1 b/src/xz/xz.1 index 73ca6ef..5b880e8 100644 --- a/src/xz/xz.1 +++ b/src/xz/xz.1 @@ -1,12 +1,10 @@ '\" t +.\" SPDX-License-Identifier: 0BSD .\" .\" Authors: Lasse Collin .\" Jia Tan .\" -.\" This file has been put into the public domain. -.\" You can do whatever you want with this file. -.\" -.TH XZ 1 "2023-07-17" "Tukaani" "XZ Utils" +.TH XZ 1 "2024-04-08" "Tukaani" "XZ Utils" . .SH NAME xz, unxz, xzcat, lzma, unlzma, lzcat \- Compress or decompress .xz and .lzma files @@ -801,8 +799,6 @@ in the single-threaded mode. It may vary slightly between .B xz versions. -Memory requirements of some of the future multithreaded modes may -be dramatically higher than that of the single-threaded mode. .IP \(bu 3 DecMem contains the decompressor memory requirements. That is, the compression settings determine @@ -811,6 +807,15 @@ The exact decompressor memory usage is slightly more than the LZMA2 dictionary size, but the values in the table have been rounded up to the next full MiB. .RE +.IP "" +Memory requirements of the multi-threaded mode are +significantly higher than that of the single-threaded mode. +With the default value of +.BR \-\-block\-size , +each thread needs 3*3*DictSize plus CompMem or DecMem. +For example, four threads with preset +.B \-6 +needs 660\(en670\ MiB of memory. .TP .BR \-e ", " \-\-extreme Use a slower variant of the selected compression preset level @@ -902,50 +907,90 @@ Using .I size less than the LZMA2 dictionary size is waste of RAM because then the LZMA2 dictionary buffer will never get fully used. -The sizes of the blocks are stored in the block headers, -which a future version of -.B xz -will use for multi-threaded decompression. +In multi-threaded mode, +the sizes of the blocks are stored in the block headers. +This size information is required for multi-threaded decompression. .IP "" In single-threaded mode no block splitting is done by default. Setting this option doesn't affect memory usage. 
No size information is stored in block headers, thus files created in single-threaded mode won't be identical to files created in multi-threaded mode. -The lack of size information also means that a future version of +The lack of size information also means that .B xz won't be able decompress the files in multi-threaded mode. .TP -.BI \-\-block\-list= sizes +.BI \-\-block\-list= items When compressing to the .B .xz -format, start a new block after +format, start a new block with an optional custom filter chain after the given intervals of uncompressed data. .IP "" -The uncompressed -.I sizes -of the blocks are specified as a comma-separated list. -Omitting a size (two or more consecutive commas) is a shorthand -to use the size of the previous block. +The +.I items +are a comma-separated list. +Each item consists of an optional filter chain number +between 0 and 9 followed by a colon +.RB ( : ) +and a required size of uncompressed data. +Omitting an item (two or more consecutive commas) is a +shorthand to use the size and filters of the previous item. .IP "" If the input file is bigger than the sum of -.IR sizes , -the last value in -.I sizes -is repeated until the end of the file. +the sizes in +.IR items , +the last item is repeated until the end of the file. A special value of .B 0 -may be used as the last value to indicate that +may be used as the last size to indicate that the rest of the file should be encoded as a single block. .IP "" -If one specifies -.I sizes -that exceed the encoder's block size +An alternative filter chain for each block can be +specified in combination with the +.BI \-\-filters1= filters +\&...\& +.BI \-\-filters9= filters +options. +These options define filter chains with an identifier +between 1\(en9. +Filter chain 0 can be used to refer to the default filter chain, +which is the same as not specifying a filter chain. +The filter chain identifier can be used before the uncompressed +size, followed by a colon +.RB ( : ). 
+For example, if one specifies +.B \-\-block\-list=1:2MiB,3:2MiB,2:4MiB,,2MiB,0:4MiB +then blocks will be created using: +.RS +.IP \(bu 3 +The filter chain specified by +.B \-\-filters1 +and 2 MiB input +.IP \(bu 3 +The filter chain specified by +.B \-\-filters3 +and 2 MiB input +.IP \(bu 3 +The filter chain specified by +.B \-\-filters2 +and 4 MiB input +.IP \(bu 3 +The filter chain specified by +.B \-\-filters2 +and 4 MiB input +.IP \(bu 3 +The default filter chain and 2 MiB input +.IP \(bu 3 +The default filter chain and 4 MiB input for every block until +end of input. +.RE +.IP "" +If one specifies a size that exceeds the encoder's block size (either the default value in threaded mode or the value specified with \fB\-\-block\-size=\fIsize\fR), the encoder will create additional blocks while keeping the boundaries specified in -.IR sizes . +.IR items . For example, if one specifies .B \-\-block\-size=10MiB .B \-\-block\-list=5MiB,10MiB,8MiB,12MiB,24MiB @@ -1262,6 +1307,15 @@ meet this condition, but files compressed in single-threaded mode don't even if .BI \-\-block\-size= size has been used. +.IP "" +The default value for +.I threads +is +.BR 0 . +In +.B xz +5.4.x and older the default is +.BR 1 . . .SS "Custom compressor filter chains" A custom filter chain allows specifying @@ -1295,22 +1349,37 @@ in the chain. Depending on the filter, this limitation is either inherent to the filter design or exists to prevent security issues. .PP -A custom filter chain is specified by using one or more -filter options in the order they are wanted in the filter chain. -That is, the order of filter options is significant! +A custom filter chain can be specified in two different ways. +The options +.BI \-\-filters= filters +and +.BI \-\-filters1= filters +\&...\& +.BI \-\-filters9= filters +allow specifying an entire filter chain in one option using the +liblzma filter string syntax. 
+Alternatively, a filter chain can be specified by using one or more +individual filter options in the order they are wanted in the filter chain. +That is, the order of the individual filter options is significant! When decoding raw streams .RB ( \-\-format=raw ), -the filter chain is specified in the same order as +the filter chain must be specified in the same order as it was specified when compressing. -.PP -Filters take filter-specific +Any individual filter or preset options specified before the full +chain option +(\fB\-\-filters=\fIfilters\fR) +will be forgotten. +Individual filters specified after the full chain option will reset the +filter chain. +.PP +Both the full and individual filter options take filter-specific .I options as a comma-separated list. Extra commas in .I options are ignored. -Every option has a default value, so you need to -specify only those you want to change. +Every option has a default value, so +specify those you want to change. .PP To see the whole filter chain and .IR options , @@ -1321,6 +1390,45 @@ use twice). This works also for viewing the filter chain options used by presets. .TP +.BI \-\-filters= filters +Specify the full filter chain or a preset in a single option. +Each filter can be separated by spaces or two dashes +.RB ( \-\- ). +.I filters +may need to be quoted on the shell command line so it is +parsed as a single option. +To denote +.IR options , +use +.B : +or +.BR = . +A preset can be prefixed with a +.B \- +and followed with zero or more flags. +The only supported flag is +.B e +to apply the same options as +.BR \-\-extreme . +.TP +\fB\-\-filters1\fR=\fIfilters\fR ... \fB\-\-filters9\fR=\fIfilters +Specify up to nine additional filter chains that can be used with +.BR \-\-block\-list . +.IP "" +For example, when compressing an archive with executable files +followed by text files, the executable part could use a filter +chain with a BCJ filter and the text part only the LZMA2 filter. 
+.TP +.B \-\-filters-help +Display a help message describing how to specify presets and +custom filter chains in the +.B \-\-filters +and +.BI \-\-filters1= filters +\&...\& +.BI \-\-filters9= filters +options, and exit successfully. +.TP \fB\-\-lzma1\fR[\fB=\fIoptions\fR] .PD 0 .TP @@ -1704,6 +1812,8 @@ and \fB\-\-ia64\fR[\fB=\fIoptions\fR] .TP \fB\-\-sparc\fR[\fB=\fIoptions\fR] +.TP +\fB\-\-riscv\fR[\fB=\fIoptions\fR] .PD Add a branch/call/jump (BCJ) filter to the filter chain. These filters can be used only as a non-last filter @@ -1762,6 +1872,7 @@ ARM64;4;4096-byte alignment is best PowerPC;4;Big endian only IA-64;16;Itanium SPARC;4; +RISC-V;2; .TE .RE .RE @@ -1770,14 +1881,38 @@ Since the BCJ-filtered data is usually compressed with LZMA2, the compression ratio may be improved slightly if the LZMA2 options are set to match the alignment of the selected BCJ filter. -For example, with the IA-64 filter, it's good to set -.B pb=4 -or even +Examples: +.RS +.IP \(bu 3 +IA-64 filter has 16-byte alignment so .B pb=4,lp=4,lc=0 +is good with LZMA2 (2^4=16). -The x86 filter is an exception; -it's usually good to stick to LZMA2's default -four-byte alignment when compressing x86 executables. +.IP \(bu 3 +RISC-V code has 2-byte or 4-byte alignment +depending on whether the file contains +16-bit compressed instructions (the C extension). +When 16-bit instructions are used, +.B pb=2,lp=1,lc=3 +or +.B pb=1,lp=1,lc=3 +is good. +When 16-bit instructions aren't present, +.B pb=2,lp=2,lc=2 +is the best. +.B readelf \-h +can be used to check if "RVC" +appears on the "Flags" line. +.IP \(bu 3 +ARM64 is always 4-byte aligned so +.B pb=2,lp=2,lc=2 +is the best. +.IP \(bu 3 +The x86 filter is an exception. +It's usually good to stick to LZMA2's defaults +.RB ( pb=2,lp=0,lc=3 ) +when compressing x86 executables. +.RE .IP "" All BCJ filters support the same .IR options : @@ -1954,107 +2089,14 @@ easier to parse by other programs. 
Currently .B \-\-robot is supported only together with -.BR \-\-version , +.BR \-\-list , +.BR \-\-filters\-help , .BR \-\-info\-memory , and -.BR \-\-list . +.BR \-\-version . It will be supported for compression and decompression in the future. . -.SS Version -.B "xz \-\-robot \-\-version" -prints the version number of -.B xz -and liblzma in the following format: -.PP -.BI XZ_VERSION= XYYYZZZS -.br -.BI LIBLZMA_VERSION= XYYYZZZS -.TP -.I X -Major version. -.TP -.I YYY -Minor version. -Even numbers are stable. -Odd numbers are alpha or beta versions. -.TP -.I ZZZ -Patch level for stable releases or -just a counter for development releases. -.TP -.I S -Stability. -0 is alpha, 1 is beta, and 2 is stable. -.I S -should be always 2 when -.I YYY -is even. -.PP -.I XYYYZZZS -are the same on both lines if -.B xz -and liblzma are from the same XZ Utils release. -.PP -Examples: 4.999.9beta is -.B 49990091 -and -5.0.0 is -.BR 50000002 . -. -.SS "Memory limit information" -.B "xz \-\-robot \-\-info\-memory" -prints a single line with multiple tab-separated columns: -.IP 1. 4 -Total amount of physical memory (RAM) in bytes. -.IP 2. 4 -Memory usage limit for compression in bytes -.RB ( \-\-memlimit\-compress ). -A special value of -.B 0 -indicates the default setting -which for single-threaded mode is the same as no limit. -.IP 3. 4 -Memory usage limit for decompression in bytes -.RB ( \-\-memlimit\-decompress ). -A special value of -.B 0 -indicates the default setting -which for single-threaded mode is the same as no limit. -.IP 4. 4 -Since -.B xz -5.3.4alpha: -Memory usage for multi-threaded decompression in bytes -.RB ( \-\-memlimit\-mt\-decompress ). -This is never zero because a system-specific default value -shown in the column 5 -is used if no limit has been specified explicitly. -This is also never greater than the value in the column 3 -even if a larger value has been specified with -.BR \-\-memlimit\-mt\-decompress . -.IP 5. 
4 -Since -.B xz -5.3.4alpha: -A system-specific default memory usage limit -that is used to limit the number of threads -when compressing with an automatic number of threads -.RB ( \-\-threads=0 ) -and no memory usage limit has been specified -.RB ( \-\-memlimit\-compress ). -This is also used as the default value for -.BR \-\-memlimit\-mt\-decompress . -.IP 6. 4 -Since -.B xz -5.3.4alpha: -Number of available processor threads. -.PP -In the future, the output of -.B "xz \-\-robot \-\-info\-memory" -may have more columns, but never more than a single line. -. .SS "List mode" .B "xz \-\-robot \-\-list" uses tab-separated output. @@ -2339,6 +2381,127 @@ Future versions may add new line types and new columns can be added to the existing line types, but the existing columns won't be changed. . +.SS "Filters help" +.B "xz \-\-robot \-\-filters-help" +prints the supported filters in the following format: +.PP +\fIfilter\fB:\fIoption\fB=<\fIvalue\fB>,\fIoption\fB=<\fIvalue\fB>\fR... +.TP +.I filter +Name of the filter +.TP +.I option +Name of a filter specific option +.TP +.I value +Numeric +.I value +ranges appear as +\fB<\fImin\fB\-\fImax\fB>\fR. +String +.I value +choices are shown within +.B "< >" +and separated by a +.B | +character. +.PP +Each filter is printed on its own line. +. +.SS "Memory limit information" +.B "xz \-\-robot \-\-info\-memory" +prints a single line with multiple tab-separated columns: +.IP 1. 4 +Total amount of physical memory (RAM) in bytes. +.IP 2. 4 +Memory usage limit for compression in bytes +.RB ( \-\-memlimit\-compress ). +A special value of +.B 0 +indicates the default setting +which for single-threaded mode is the same as no limit. +.IP 3. 4 +Memory usage limit for decompression in bytes +.RB ( \-\-memlimit\-decompress ). +A special value of +.B 0 +indicates the default setting +which for single-threaded mode is the same as no limit. +.IP 4. 
4 +Since +.B xz +5.3.4alpha: +Memory usage for multi-threaded decompression in bytes +.RB ( \-\-memlimit\-mt\-decompress ). +This is never zero because a system-specific default value +shown in the column 5 +is used if no limit has been specified explicitly. +This is also never greater than the value in the column 3 +even if a larger value has been specified with +.BR \-\-memlimit\-mt\-decompress . +.IP 5. 4 +Since +.B xz +5.3.4alpha: +A system-specific default memory usage limit +that is used to limit the number of threads +when compressing with an automatic number of threads +.RB ( \-\-threads=0 ) +and no memory usage limit has been specified +.RB ( \-\-memlimit\-compress ). +This is also used as the default value for +.BR \-\-memlimit\-mt\-decompress . +.IP 6. 4 +Since +.B xz +5.3.4alpha: +Number of available processor threads. +.PP +In the future, the output of +.B "xz \-\-robot \-\-info\-memory" +may have more columns, but never more than a single line. +. +.SS Version +.B "xz \-\-robot \-\-version" +prints the version number of +.B xz +and liblzma in the following format: +.PP +.BI XZ_VERSION= XYYYZZZS +.br +.BI LIBLZMA_VERSION= XYYYZZZS +.TP +.I X +Major version. +.TP +.I YYY +Minor version. +Even numbers are stable. +Odd numbers are alpha or beta versions. +.TP +.I ZZZ +Patch level for stable releases or +just a counter for development releases. +.TP +.I S +Stability. +0 is alpha, 1 is beta, and 2 is stable. +.I S +should be always 2 when +.I YYY +is even. +.PP +.I XYYYZZZS +are the same on both lines if +.B xz +and liblzma are from the same XZ Utils release. +.PP +Examples: 4.999.9beta is +.B 49990091 +and +5.0.0 is +.BR 50000002 . +. 
.SH "EXIT STATUS" .TP .B 0 @@ -2391,7 +2554,7 @@ is run by a script or tool, for example, GNU .RS .PP .nf -.ft CW +.ft CR XZ_OPT=\-2v tar caf foo.tar.xz foo .ft R .fi @@ -2411,7 +2574,7 @@ scripts one may use something like this: .RS .PP .nf -.ft CW +.ft CR XZ_OPT=${XZ_OPT\-"\-7e"} export XZ_OPT .ft R @@ -2669,7 +2832,7 @@ if compression is successful: .RS .PP .nf -.ft CW +.ft CR xz foo .ft R .fi @@ -2685,7 +2848,7 @@ even if decompression is successful: .RS .PP .nf -.ft CW +.ft CR xz \-dk bar.xz .ft R .fi @@ -2703,7 +2866,7 @@ and 5\ MiB, respectively): .RS .PP .nf -.ft CW +.ft CR tar cf \- baz | xz \-4e > baz.tar.xz .ft R .fi @@ -2714,7 +2877,7 @@ to standard output with a single command: .RS .PP .nf -.ft CW +.ft CR xz \-dcf a.txt b.txt.xz c.txt d.txt.lzma > abcd.txt .ft R .fi @@ -2729,7 +2892,7 @@ can be used to parallelize compression of many files: .RS .PP .nf -.ft CW +.ft CR find . \-type f \e! \-name '*.xz' \-print0 \e | xargs \-0r \-P4 \-n16 xz \-T1 .ft R @@ -2769,7 +2932,7 @@ after compressing multiple files: .RS .PP .nf -.ft CW +.ft CR xz \-\-robot \-\-list *.xz | awk '/^totals/{print $5\-$4}' .ft R .fi @@ -2789,7 +2952,7 @@ option: .RS .PP .nf -.ft CW +.ft CR if ! eval "$(xz \-\-robot \-\-version 2> /dev/null)" || [ "$XZ_VERSION" \-lt 50000002 ]; then echo "Your xz is too old." @@ -2805,7 +2968,7 @@ but if a limit has already been set, don't increase it: .RS .PP .nf -.ft CW +.ft CR NEWLIM=$((123 << 20))\ \ # 123 MiB OLDLIM=$(xz \-\-robot \-\-info\-memory | cut \-f3) if [ $OLDLIM \-eq 0 \-o $OLDLIM \-gt $NEWLIM ]; then @@ -2858,7 +3021,7 @@ can be modified to use a bigger dictionary: .RS .PP .nf -.ft CW +.ft CR xz \-\-lzma2=preset=1,dict=32MiB foo.tar .ft R .fi @@ -2886,7 +3049,7 @@ would use: .RS .PP .nf -.ft CW +.ft CR xz \-vv \-\-lzma2=dict=192MiB big_foo.tar .ft R .fi @@ -2916,7 +3079,7 @@ using about 100\ KiB of memory. 
.RS .PP .nf -.ft CW +.ft CR xz \-\-check=crc32 \-\-lzma2=preset=6e,dict=64KiB foo .ft R .fi @@ -2944,7 +3107,7 @@ slightly (like 0.1\ %) smaller file than .RS .PP .nf -.ft CW +.ft CR xz \-\-lzma2=preset=6e,pb=0,lc=4 source_code.tar .ft R .fi @@ -2957,7 +3120,7 @@ using the x86 BCJ filter: .RS .PP .nf -.ft CW +.ft CR xz \-\-x86 \-\-lzma2 libfoo.so .ft R .fi @@ -2992,7 +3155,7 @@ to LZMA2 to accommodate the three-byte alignment: .RS .PP .nf -.ft CW +.ft CR xz \-\-delta=dist=3 \-\-lzma2=pb=0 foo.tiff .ft R .fi
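
The new --block-list=items text in this patch describes how each item carries an optional filter chain number and a required size, and how an omitted item repeats the previous one. The sketch below illustrates that expansion rule as the patch words it. It is not the parser used by xz: the function names are hypothetical, only plain byte counts and the KiB/MiB/GiB suffixes are handled (an assumption made for brevity), and a trailing size of 0 is returned as-is rather than interpreted as "rest of the file".

```python
import re

# Size suffixes handled by this sketch (assumption: the real xz parser
# accepts more spellings than these).
_MULTIPLIERS = {None: 1, "KiB": 1 << 10, "MiB": 1 << 20, "GiB": 1 << 30}


def parse_size(text):
    """Convert a size such as '2MiB' or '4096' to a byte count."""
    match = re.fullmatch(r"(\d+)(KiB|MiB|GiB)?", text)
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    return int(match.group(1)) * _MULTIPLIERS[match.group(2)]


def expand_block_list(spec):
    """Expand a --block-list value into (filter_chain, size_in_bytes) pairs.

    Hypothetical helper following the rules stated in the man page text:
    - an item is [chain:]size, with chain between 0 and 9;
    - an item without a chain number uses the default chain (0);
    - an empty item repeats the size and filters of the previous item.
    """
    blocks = []
    for item in spec.split(","):
        if item == "":
            if not blocks:
                raise ValueError("the first item cannot be omitted")
            blocks.append(blocks[-1])
            continue
        chain_text, colon, size_text = item.rpartition(":")
        chain = int(chain_text) if colon else 0
        if not 0 <= chain <= 9:
            raise ValueError(f"filter chain out of range: {chain}")
        blocks.append((chain, parse_size(size_text)))
    return blocks


if __name__ == "__main__":
    # The example value used in the patch.
    for chain, size in expand_block_list("1:2MiB,3:2MiB,2:4MiB,,2MiB,0:4MiB"):
        print(f"filter chain {chain}, {size >> 20} MiB input")
```

Run on the example value from the patch, this yields six entries in the same order as the bullet list in the man page text, with the fourth entry repeating chain 2 and 4 MiB because its item was omitted.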