Diffstat (limited to 'upstream/archlinux/man1p/pax.1p')
-rw-r--r-- | upstream/archlinux/man1p/pax.1p | 4179 |
1 files changed, 4179 insertions, 0 deletions
diff --git a/upstream/archlinux/man1p/pax.1p b/upstream/archlinux/man1p/pax.1p
new file mode 100644
index 00000000..cd34bb25
--- /dev/null
+++ b/upstream/archlinux/man1p/pax.1p
@@ -0,0 +1,4179 @@
+'\" et
+.TH PAX "1P" 2017 "IEEE/The Open Group" "POSIX Programmer's Manual"
+.\"
+.SH PROLOG
+This manual page is part of the POSIX Programmer's Manual.
+The Linux implementation of this interface may differ (consult
+the corresponding Linux manual page for details of Linux behavior),
+or the interface may not be implemented on Linux.
+.\"
+.SH NAME
+pax
+\(em portable archive interchange
+.SH SYNOPSIS
+.LP
+.nf
+pax \fB[\fR-dv\fB] [\fR-c|-n\fB] [\fR-H|-L\fB] [\fR-o \fIoptions\fB] [\fR-f \fIarchive\fB] [\fR-s \fIreplstr\fB]\fR...
+ \fB[\fIpattern\fR...\fB]\fR
+.P
+pax -r\fB[\fR-c|-n\fB] [\fR-dikuv\fB] [\fR-H|-L\fB] [\fR-f \fIarchive\fB] [\fR-o \fIoptions\fB]\fR... \fB[\fR-p \fIstring\fB]\fR...
+ \fB[\fR-s \fIreplstr\fB]\fR... \fB[\fIpattern\fR...\fB]\fR
+.P
+pax -w \fB[\fR-dituvX\fB] [\fR-H|-L\fB] [\fR-b \fIblocksize\fB] [[\fR-a\fB] [\fR-f \fIarchive\fB]] [\fR-o \fIoptions\fB]\fR...
+ \fB[\fR-s \fIreplstr\fB]\fR... \fB[\fR-x \fIformat\fB] [\fIfile\fR...\fB]\fR
+.P
+pax -r -w \fB[\fR-diklntuvX\fB] [\fR-H|-L\fB] [\fR-o \fIoptions\fB]\fR... \fB[\fR-p \fIstring\fB]\fR...
+ \fB[\fR-s \fIreplstr\fB]\fR... \fB[\fIfile\fR...\fB] \fIdirectory\fR
+.fi
+.SH DESCRIPTION
+The
+.IR pax
+utility shall read, write, and write lists of the members of archive
+files and copy directory hierarchies. A variety of archive formats
+shall be supported; see the
+.BR \-x
+.IR format
+option.
+.P
+The action to be taken depends on the presence of the
+.BR \-r
+and
+.BR \-w
+options. The four combinations of
+.BR \-r
+and
+.BR \-w
+are referred to as the four modes of operation:
+.BR list ,
+.BR read ,
+.BR write ,
+and
+.BR copy
+modes, corresponding respectively to the four forms shown in the
+SYNOPSIS section.
+.IP "\fBlist\fP" 10
+In
+.BR list
+mode (when neither
+.BR \-r
+nor
+.BR \-w
+are specified),
+.IR pax
+shall write the names of the members of the archive file read from the
+standard input, with pathnames matching the specified patterns, to
+standard output. If a named file is of type directory, the file
+hierarchy rooted at that file shall be listed as well.
+.IP "\fBread\fP" 10
+In
+.BR read
+mode (when
+.BR \-r
+is specified, but
+.BR \-w
+is not),
+.IR pax
+shall extract the members of the archive file read from the standard
+input, with pathnames matching the specified patterns. If an extracted
+file is of type directory, the file hierarchy rooted at that file shall
+be extracted as well. The extracted files shall be created performing
+pathname resolution with the directory in which
+.IR pax
+was invoked as the current working directory.
+.RS 10
+.P
+If an attempt is made to extract a directory when the directory
+already exists, this shall not be considered an error. If
+an attempt is made to extract a FIFO when the FIFO already exists,
+this shall not be considered an error.
+.P
+The ownership, access, and modification times, and file mode of the
+restored files are discussed under the
+.BR \-p
+option.
+.RE
+.IP "\fBwrite\fP" 10
+In
+.BR write
+mode (when
+.BR \-w
+is specified, but
+.BR \-r
+is not),
+.IR pax
+shall write the contents of the
+.IR file
+operands to the standard output in an archive format. If no
+.IR file
+operands are specified, a list of files to copy, one per line, shall be
+read from the standard input and each entry in this list shall be
+processed as if it had been a
+.IR file
+operand on the command line. A file of type directory shall include
+all of the files in the file hierarchy rooted at the file.
+.IP "\fBcopy\fP" 10
+In
+.BR copy
+mode (when both
+.BR \-r
+and
+.BR \-w
+are specified),
+.IR pax
+shall copy the
+.IR file
+operands to the destination directory.
+.RS 10
+.P
+If no
+.IR file
+operands are specified, a list of files to copy, one per line, shall be
+read from the standard input. A file of type directory shall include
+all of the files in the file hierarchy rooted at the file.
+.P
+The effect of the
+.BR copy
+shall be as if the copied files were written to a
+.IR pax
+format archive file and then subsequently extracted, except that
+copying of sockets may be supported even if archiving them in write
+mode is not supported, and that there may be hard links between the
+original and the copied files. If the destination directory is a
+subdirectory of one of the files to be copied, the results
+are unspecified. If the destination directory is a file of a
+type not defined by the System Interfaces volume of POSIX.1\(hy2017, the results are implementation-defined;
+otherwise, it shall be an error for the file named by the
+.IR directory
+operand not to exist, not be writable by the user, or not be a file of
+type directory.
+.RE
+.P
+In
+.BR read
+or
+.BR copy
+modes, if intermediate directories are necessary to extract an archive
+member,
+.IR pax
+shall perform actions equivalent to the
+\fImkdir\fR()
+function defined in the System Interfaces volume of POSIX.1\(hy2017, called with the following arguments:
+.IP " *" 4
+The intermediate directory used as the
+.IR path
+argument
+.IP " *" 4
+The value of the bitwise-inclusive OR of S_IRWXU, S_IRWXG, and S_IRWXO
+as the
+.IR mode
+argument
+.P
+If any specified
+.IR pattern
+or
+.IR file
+operands are not matched by at least one file or archive member,
+.IR pax
+shall write a diagnostic message to standard error for each one that
+did not match and exit with a non-zero exit status.
+.P
+The archive formats described in the EXTENDED DESCRIPTION section shall
+be automatically detected on input. The default output archive format
+shall be implementation-defined.
+.P
+A single archive can span multiple files. The
+.IR pax
+utility shall determine, in an implementation-defined manner, what
+file to read or write as the next file.
+.P
+If the selected archive format supports the specification of linked files,
+it shall be an error if these files cannot be linked when the archive
+is extracted. For archive formats that do not store file contents with
+each name that causes a hard link, if the file that contains the data
+is not extracted during this
+.IR pax
+session, either the data shall be restored from the original file, or a
+diagnostic message shall be displayed with the name of a file that can
+be used to extract the data. In traversing directories,
+.IR pax
+shall detect infinite loops; that is, entering a previously visited
+directory that is an ancestor of the last file visited. When it detects
+an infinite loop,
+.IR pax
+shall write a diagnostic message to standard error and shall
+terminate.
+.SH OPTIONS
+The
+.IR pax
+utility shall conform to the Base Definitions volume of POSIX.1\(hy2017,
+.IR "Section 12.2" ", " "Utility Syntax Guidelines",
+except that the order of presentation of the
+.BR \-o ,
+.BR \-p ,
+and
+.BR \-s
+options is significant.
+.P
+The following options shall be supported:
+.IP "\fB\-r\fP" 10
+Read an archive file from standard input.
+.IP "\fB\-w\fP" 10
+Write files to the standard output in the specified archive format.
+.IP "\fB\-a\fP" 10
+Append files to the end of the archive. It is implementation-defined
+which devices on the system support appending. Additional file formats
+unspecified by this volume of POSIX.1\(hy2017 may impose restrictions on appending.
+.IP "\fB\-b\ \fIblocksize\fR" 10
+Block the output at a positive decimal integer number of bytes per
+write to the archive file. Devices and archive formats may impose
+restrictions on blocking. Blocking shall be automatically determined on
+input. Conforming applications shall not specify a
+.IR blocksize
+value larger than 32\|256. Default blocking when creating archives
+depends on the archive format. (See the
+.BR \-x
+option below.)
+.IP "\fB\-c\fP" 10
+Match all file or archive members except those specified by the
+.IR pattern
+or
+.IR file
+operands.
+.IP "\fB\-d\fP" 10
+Cause files of type directory being copied or archived or archive
+members of type directory being extracted or listed to match only the
+file or archive member itself and not the file hierarchy rooted at the
+file.
+.IP "\fB\-f\ \fIarchive\fR" 10
+Specify the pathname of the input or output archive, overriding the
+default standard input (in
+.BR list
+or
+.BR read
+modes) or standard output (\c
+.BR write
+mode).
+.IP "\fB\-H\fP" 10
+If a symbolic link referencing a file of type directory is specified on
+the command line,
+.IR pax
+shall archive the file hierarchy rooted in the file referenced by the
+link, using the name of the link as the root of the file hierarchy.
+Otherwise, if a symbolic link referencing a file of any other file type
+which
+.IR pax
+can normally archive is specified on the command line, then
+.IR pax
+shall archive the file referenced by the link, using the name of the
+link. The default behavior, when neither
+.BR \-H
+or
+.BR \-L
+are specified, shall be to archive the symbolic link itself.
+.IP "\fB\-i\fP" 10
+Interactively rename files or archive members. For each archive member
+matching a
+.IR pattern
+operand or file matching a
+.IR file
+operand, a prompt shall be written to the file
+.BR /dev/tty .
+The prompt shall contain the name of the file or archive member, but
+the format is otherwise unspecified. A line shall then be read from
+.BR /dev/tty .
+If this line is blank, the file or archive member shall be skipped. If
+this line consists of a single period, the file or archive member shall
+be processed with no modification to its name. Otherwise, its name
+shall be replaced with the contents of the line. The
+.IR pax
+utility shall immediately exit with a non-zero exit status if
+end-of-file is encountered when reading a response or if
+.BR /dev/tty
+cannot be opened for reading and writing.
+.RS 10
+.P
+The results of extracting a hard link to a file that has been renamed
+during extraction are unspecified.
+.RE
+.IP "\fB\-k\fP" 10
+Prevent the overwriting of existing files.
+.IP "\fB\-l\fP" 10
+(The letter ell.) In
+.BR copy
+mode, hard links shall be made between the source and destination file
+hierarchies whenever possible. If specified in conjunction with
+.BR \-H
+or
+.BR \-L ,
+when a symbolic link is encountered, the hard link created in the
+destination file hierarchy shall be to the file referenced by the
+symbolic link. If specified when neither
+.BR \-H
+nor
+.BR \-L
+is specified, when a symbolic link is encountered, the implementation
+shall create a hard link to the symbolic link in the source file
+hierarchy or copy the symbolic link to the destination.
+.IP "\fB\-L\fP" 10
+If a symbolic link referencing a file of type directory is specified on
+the command line or encountered during the traversal of a file
+hierarchy,
+.IR pax
+shall archive the file hierarchy rooted in the file referenced by the
+link, using the name of the link as the root of the file hierarchy.
+Otherwise, if a symbolic link referencing a file of any other file type
+which
+.IR pax
+can normally archive is specified on the command line or encountered
+during the traversal of a file hierarchy,
+.IR pax
+shall archive the file referenced by the link, using the name of the
+link. The default behavior, when neither
+.BR \-H
+or
+.BR \-L
+are specified, shall be to archive the symbolic link itself.
+.IP "\fB\-n\fP" 10
+Select the first archive member that matches each
+.IR pattern
+operand. No more than one archive member shall be matched for each
+pattern (although members of type directory shall still match the file
+hierarchy rooted at that file).
+.IP "\fB\-o\ \fIoptions\fR" 10
+Provide information to the implementation to modify the algorithm for
+extracting or writing files. The value of
+.IR options
+shall consist of one or more
+<comma>-separated
+keywords of the form:
+.RS 10
+.sp
+.RS 4
+.nf
+
+\fIkeyword\fB[[\fR:\fB]\fR=\fIvalue\fB][\fR,\fIkeyword\fB[[\fR:\fB]\fR=\fIvalue\fB]\fR, ...\fB]\fR
+.fi
+.P
+.RE
+.P
+Some keywords apply only to certain file formats, as indicated with
+each description. Use of keywords that are inapplicable to the file
+format being processed produces undefined results.
+.P
+Keywords in the
+.IR options
+argument shall be a string that would be a valid portable filename as
+described in the Base Definitions volume of POSIX.1\(hy2017,
+.IR "Section 3.282" ", " "Portable Filename Character Set".
+.TP 10
+.BR Note:
+Keywords are not expected to be filenames, merely to follow the same
+character composition rules as portable filenames.
+.P
+.P
+Keywords can be preceded with white space. The
+.IR value
+field shall consist of zero or more characters; within
+.IR value ,
+the application shall precede any literal
+<comma>
+with a
+<backslash>,
+which shall be ignored, but preserves the
+<comma>
+as part of
+.IR value .
+A
+<comma>
+as the final character, or a
+<comma>
+followed solely by white space as the final characters, in
+.IR options
+shall be ignored. Multiple
+.BR \-o
+options can be specified; if keywords given to these multiple
+.BR \-o
+options conflict, the keywords and values appearing later in command
+line sequence shall take precedence and the earlier shall be silently
+ignored. The following keyword values of
+.IR options
+shall be supported for the file formats as indicated:
+.IP "\fBdelete\fR=\fIpattern\fR" 6
+.br
+(Applicable only to the
+.BR \-x
+.BR pax
+format.) When used in
+.BR write
+or
+.BR copy
+mode,
+.IR pax
+shall omit from extended header records that it produces any keywords
+matching the string pattern. When used in
+.BR read
+or
+.BR list
+mode,
+.IR pax
+shall ignore any keywords matching the string pattern in the extended
+header records. In both cases, matching shall be performed using the
+pattern matching notation described in
+.IR "Section 2.13.1" ", " "Patterns Matching a Single Character"
+and
+.IR "Section 2.13.2" ", " "Patterns Matching Multiple Characters".
+For example:
+.RS 6
+.sp
+.RS 4
+.nf
+
+-o \fBdelete\fR=\fIsecurity\fR.*
+.fi
+.P
+.RE
+.P
+would suppress security-related information. See
+.IR "pax Extended Header"
+for extended header record keyword usage.
+.P
+When multiple
+.BR \-o \c
+.BR delete=pattern
+options are specified, the patterns shall be additive; all keywords
+matching the specified string patterns shall be omitted from extended
+header records that
+.IR pax
+produces.
+.RE
+.IP "\fBexthdr.name\fR=\fIstring\fR" 6
+.br
+(Applicable only to the
+.BR \-x
+.BR pax
+format.) This keyword allows user control over the name that is written
+into the
+.BR ustar
+header blocks for the extended header produced under the circumstances
+described in
+.IR "pax Header Block".
+The name shall be the contents of
+.IR string ,
+after the following character substitutions have been made:
+.TS
+center box tab(!);
+cB | cB
+cB | cB
+lf5 | lw(3.8i).
+\fIstring\fP
+Includes:!Replaced by:
+_
+%d!T{
+The directory name of the file, equivalent to the result of the
+.IR dirname
+utility on the translated pathname.
+T}
+%f!T{
+The filename of the file, equivalent to the result of the
+.IR basename
+utility on the translated pathname.
+T}
+%p!T{
+The process ID of the
+.IR pax
+process.
+T}
+%%!T{
+A
+.BR '%'
+character.
+T}
+.TE
+.RS 6
+.P
+Any other
+.BR '%'
+characters in
+.IR string
+produce undefined results.
+.P
+If no
+.BR \-o
+.BR exthdr.name=string
+is specified,
+.IR pax
+shall use the following default value:
+.sp
+.RS 4
+.nf
+
+%d/PaxHeaders.%p/%f
+.fi
+.P
+.RE
+.RE
+.IP "\fBglobexthdr.name\fR=\fIstring\fR" 6
+.br
+(Applicable only to the
+.BR \-x
+.BR pax
+format.) When used in
+.BR write
+or
+.BR copy
+mode with the appropriate options,
+.IR pax
+shall create global extended header records with
+.BR ustar
+header blocks that will be treated as regular files by previous
+versions of
+.IR pax .
+This keyword allows user control over the name that is written into the
+.BR ustar
+header blocks for global extended header records. The name shall be the
+contents of string, after the following character substitutions have
+been made:
+.TS
+center box tab(!);
+cB | cB
+cB | cB
+lf5 | lw(3.8i).
+\fIstring\fP
+Includes:!Replaced by:
+_
+%n!T{
+An integer that represents the sequence number of the global extended
+header record in the archive, starting at 1.
+T}
+%p!T{
+The process ID of the
+.IR pax
+process.
+T}
+%%!T{
+A
+.BR '%'
+character.
+T}
+.TE
+.RS 6
+.P
+Any other
+.BR '%'
+characters in
+.IR string
+produce undefined results.
+.P
+If no
+.BR \-o
+.BR globexthdr.name=string
+is specified,
+.IR pax
+shall use the following default value:
+.sp
+.RS 4
+.nf
+
+$TMPDIR/GlobalHead.%p.%n
+.fi
+.P
+.RE
+.P
+where $\c
+.IR TMPDIR
+represents the value of the
+.IR TMPDIR
+environment variable. If
+.IR TMPDIR
+is not set,
+.IR pax
+shall use
+.BR /tmp .
+.RE
+.IP "\fBinvalid\fR=\fIaction\fR" 6
+.br
+(Applicable only to the
+.BR \-x
+.BR pax
+format.) This keyword allows user control over the action
+.IR pax
+takes upon encountering values in an extended header record that, in
+.BR read
+or
+.BR copy
+mode, are invalid in the destination hierarchy or, in
+.BR list
+mode, cannot be written in the codeset and current locale of the
+implementation. The following are invalid values that shall be
+recognized by
+.IR pax :
+.RS 6
+.IP -- 4
+In
+.BR read
+or
+.BR copy
+mode, a filename or link name that contains character encodings
+invalid in the destination hierarchy. (For example, the name may
+contain embedded NULs.)
+.IP -- 4
+In
+.BR read
+or
+.BR copy
+mode, a filename or link name that is longer than the maximum allowed
+in the destination hierarchy (for either a pathname component or the
+entire pathname).
+.IP -- 4
+In
+.BR list
+mode, any character string value (filename, link name, user name, and
+so on) that cannot be written in the codeset and current locale of the
+implementation.
+.P
+The following mutually-exclusive values of the
+.IR action
+argument are supported:
+.IP "\fBbinary\fR" 10
+In
+.BR write
+mode,
+.IR pax
+shall generate a
+.BR hdrcharset = BINARY
+extended header record for each file with a filename, link name, group
+name, owner name, or any other field in an extended header record that
+cannot be translated to the UTF\(hy8 codeset, allowing the archive to
+contain the files with unencoded extended header record values. In
+.BR read
+or
+.BR copy
+mode,
+.IR pax
+shall use the values specified in the header without translation,
+regardless of whether this may overwrite an existing file with a valid
+name. In
+.BR list
+mode,
+.IR pax
+shall behave identically to the
+.BR bypass
+action.
+.IP "\fBbypass\fR" 10
+In
+.BR read
+or
+.BR copy
+mode,
+.IR pax
+shall bypass the file, causing no change to the destination hierarchy.
+In
+.BR list
+mode,
+.IR pax
+shall write all requested valid values for the file, but its method for
+writing invalid values is unspecified.
+.IP "\fBrename\fR" 10
+In
+.BR read
+or
+.BR copy
+mode,
+.IR pax
+shall act as if the
+.BR \-i
+option were in effect for each file with invalid filename or link name
+values, allowing the user to provide a replacement name interactively.
+In
+.BR list
+mode,
+.IR pax
+shall behave identically to the
+.BR bypass
+action.
+.IP "\fBUTF\(hy8\fR" 10
+When used in
+.BR read ,
+.BR copy ,
+or
+.BR list
+mode and a filename, link name, owner name, or any other field in an
+extended header record cannot be translated from the
+.BR pax
+UTF\(hy8 codeset format to the codeset and current locale of the
+implementation,
+.IR pax
+shall use the actual UTF\(hy8 encoding for the name. If a
+.BR hdrcharset
+extended header record is in effect for this file, the character set
+specified by that record shall be used instead of UTF\(hy8. If a
+.BR hdrcharset = BINARY
+extended header record is in effect for this file, no translation shall
+be performed.
+.IP "\fBwrite\fR" 10
+In
+.BR read
+or
+.BR copy
+mode,
+.IR pax
+shall write the file, translating the name, regardless of whether this
+may overwrite an existing file with a valid name. In
+.BR list
+mode,
+.IR pax
+shall behave identically to the
+.BR bypass
+action.
+.P
+If no
+.BR \-o
+.BR invalid=option
+is specified,
+.IR pax
+shall act as if
+.BR \-o \c
+.BR invalid=bypass
+were specified. Any overwriting of existing files that may be allowed
+by the
+.BR \-o \c
+.BR invalid=
+actions shall be subject to permission (\c
+.BR \-p )
+and modification time (\c
+.BR \-u )
+restrictions, and shall be suppressed if the
+.BR \-k
+option is also specified.
+.RE
+.IP "\fBlinkdata\fP" 6
+.br
+(Applicable only to the
+.BR \-x
+.BR pax
+format.) In
+.BR write
+mode,
+.IR pax
+shall write the contents of a file to the archive even when that file
+is merely a hard link to a file whose contents have already been
+written to the archive.
+.IP "\fBlistopt\fR=\fIformat\fP" 6
+.br
+This keyword specifies the output format of the table of contents
+produced when the
+.BR \-v
+option is specified in
+.BR list
+mode. See
+.IR "List Mode Format Specifications".
+To avoid ambiguity, the
+.BR listopt=format
+shall be the only or final
+.BR keyword=value
+pair in a
+.BR \-o
+option-argument; all characters in the remainder of the option-argument
+shall be considered part of the format string. When multiple
+.BR \-o \c
+.BR listopt=format
+options are specified, the format strings shall be considered a single,
+concatenated string, evaluated in command line order.
+.IP "\fBtimes\fR" 6
+.br
+(Applicable only to the
+.BR \-x
+.IR pax
+format.) When used in
+.BR write
+or
+.BR copy
+mode,
+.IR pax
+shall include
+.BR atime
+and
+.BR mtime
+extended header records for each file. See
+.IR "pax Extended Header File Times".
+.P
+In addition to these keywords, if the
+.BR \-x
+.IR pax
+format is specified, any of the keywords and values defined in
+.IR "pax Extended Header",
+including implementation extensions, can be used in
+.BR \-o
+option-arguments, in either of two modes:
+.IP "\fBkeyword\fR=\fIvalue\fR" 6
+.br
+When used in
+.BR write
+or
+.BR copy
+mode, these keyword/value pairs shall be included at the beginning of
+the archive as
+.BR typeflag
+.BR g
+global extended header records. When used in
+.BR read
+or
+.BR list
+mode, these keyword/value pairs shall act as if they had been at the
+beginning of the archive as
+.BR typeflag
+.BR g
+global extended header records.
+.IP "\fBkeyword\fR:=\fIvalue\fR" 6
+.br
+When used in
+.BR write
+or
+.BR copy
+mode, these keyword/value pairs shall be included as records at the
+beginning of a
+.BR typeflag
+.BR x
+extended header for each file. (This shall be equivalent to the
+<equals-sign>
+form except that it creates no
+.BR typeflag
+.BR g
+global extended header records.) When used in
+.BR read
+or
+.BR list
+mode, these keyword/value pairs shall act as if they were included as
+records at the end of each extended header; thus, they shall override
+any global or file-specific extended header record keywords of the same
+names. For example, in the command:
+.RS 6
+.sp
+.RS 4
+.nf
+
+pax -r -o "
+gname:=mygroup,
+" <archive
+.fi
+.P
+.RE
+.P
+the group name will be forced to a new value for all files read from
+the archive.
+.RE
+.P
+The precedence of
+.BR \-o
+keywords over various fields in the archive is described in
+.IR "pax Extended Header Keyword Precedence".
+If the
+.BR \-o
+.BR delete =\c
+.IR pattern ,
+.BR \-o
+.BR keyword =\c
+.IR value ,
+or
+.BR \-o
+.BR keyword :=\c
+.IR value
+options are used to override or remove any extended header data needed
+to find files in an archive (e.g.,
+.BR "-o delete=size"
+for a file whose size cannot be represented in a
+.BR ustar
+header or
+.BR "-o size=100"
+for a file whose size is not 100 bytes), the behavior is undefined.
+.RE
+.IP "\fB\-p\ \fIstring\fR" 10
+Specify one or more file characteristic options (privileges). The
+.IR string
+option-argument shall be a string specifying file characteristics to be
+retained or discarded on extraction. The string shall consist of the
+specification characters
+.BR a ,
+.BR e ,
+.BR m ,
+.BR o ,
+and
+.BR p .
+Other implementation-defined characters can be included. Multiple
+characteristics can be concatenated within the same string and multiple
+.BR \-p
+options can be specified. The meaning of the specification characters
+are as follows:
+.RS 10
+.IP "\fRa\fP" 6
+Do not preserve file access times.
+.IP "\fRe\fP" 6
+Preserve the user ID, group ID, file mode bits (see the Base Definitions volume of POSIX.1\(hy2017,
+.IR "Section 3.169" ", " "File Mode Bits"),
+access time, modification time, and any other implementation-defined
+file characteristics.
+.IP "\fRm\fP" 6
+Do not preserve file modification times.
+.IP "\fRo\fP" 6
+Preserve the user ID and group ID.
+.IP "\fRp\fP" 6
+Preserve the file mode bits. Other implementation-defined file mode
+attributes may be preserved.
+.P
+In the preceding list, ``preserve'' indicates that an attribute stored
+in the archive shall be given to the extracted file, subject to the
+permissions of the invoking process. The access and modification times
+of the file shall be preserved unless otherwise specified with the
+.BR \-p
+option or not stored in the archive. All attributes that are not
+preserved shall be determined as part of the normal file creation
+action (see
+.IR "Section 1.1.1.4" ", " "File Read" ", " "Write" ", " "and Creation").
+.P
+If neither the
+.BR e
+nor the
+.BR o
+specification character is specified, or the user ID and group ID are
+not preserved for any reason,
+.IR pax
+shall not set the S_ISUID and S_ISGID bits of the file mode.
+.P
+If the preservation of any of these items fails for any reason,
+.IR pax
+shall write a diagnostic message to standard error. Failure to preserve
+these items shall affect the final exit status, but shall not cause the
+extracted file to be deleted.
+.P
+If file characteristic letters in any of the
+.IR string
+option-arguments are duplicated or conflict with each other, the ones
+given last shall take precedence. For example, if
+.BR \-p
+.BR eme
+is specified, file modification times are preserved.
+.RE
+.IP "\fB\-s\ \fIreplstr\fR" 10
+Modify file or archive member names named by
+.IR pattern
+or
+.IR file
+operands according to the substitution expression
+.IR replstr ,
+using the syntax of the
+.IR ed
+utility. The concepts of ``address'' and ``line'' are meaningless in
+the context of the
+.IR pax
+utility, and shall not be supplied. The format shall be:
+.RS 10
+.sp
+.RS 4
+.nf
+
+-s /\fIold\fR/\fInew\fR/\fB[\fRgp\fB]\fR
+.fi
+.P
+.RE
+.P
+where as in
+.IR ed ,
+.IR old
+is a basic regular expression and
+.IR new
+can contain an
+<ampersand>,
+.BR '\en'
+(where
+.IR n
+is a digit) back-references, or subexpression matching. The
+.IR old
+string shall also be permitted to contain
+<newline>
+characters.
+.P
+Any non-null character can be used as a delimiter (\c
+.BR '/'
+shown here). Multiple
+.BR \-s
+expressions can be specified; the expressions shall be applied in the
+order specified, terminating with the first successful substitution.
+The optional trailing
+.BR 'g'
+is as defined in the
+.IR ed
+utility. The optional trailing
+.BR 'p'
+shall cause successful substitutions to be written to standard error.
+File or archive member names that substitute to the empty string shall
+be ignored when reading and writing archives.
+.RE
+.IP "\fB\-t\fP" 10
+When reading files from the file system, and if the user has the
+permissions required by
+\fIutime\fR()
+to do so, set the access time of each file read to the access time that
+it had before being read by
+.IR pax .
+.IP "\fB\-u\fP" 10
+Ignore files that are older (having a less recent file modification
+time) than a pre-existing file or archive member with the same name.
+In
+.BR read
+mode, an archive member with the same name as a file in the file system
+shall be extracted if the archive member is newer than the file. In
+.BR write
+mode, an archive file member with the same name as a file in the file
+system shall be superseded if the file is newer than the archive
+member. If
+.BR \-a
+is also specified, this is accomplished by appending to the archive;
+otherwise, it is unspecified whether this is accomplished by actual
+replacement in the archive or by appending to the archive. In
+.BR copy
+mode, the file in the destination hierarchy shall be replaced by the
+file in the source hierarchy or by a link to the file in the source
+hierarchy if the file in the source hierarchy is newer.
+.IP "\fB\-v\fP" 10
+In
+.BR list
+mode, produce a verbose table of contents (see the STDOUT section).
+Otherwise, write archive member pathnames to standard error (see the
+STDERR section).
+.IP "\fB\-x\ \fIformat\fR" 10
+Specify the output archive format. The
+.IR pax
+utility shall support the following formats:
+.RS 10
+.IP "\fBcpio\fR" 10
+The
+.BR cpio
+interchange format; see the EXTENDED DESCRIPTION section. The default
+.IR blocksize
+for this format for character special archive files shall be 5\|120.
+Implementations shall support all
+.IR blocksize
+values less than or equal to 32\|256 that are multiples of 512.
+.IP "\fBpax\fR" 10
+The
+.BR pax
+interchange format; see the EXTENDED DESCRIPTION section. The default
+.IR blocksize
+for this format for character special archive files shall be 5\|120.
+Implementations shall support all
+.IR blocksize
+values less than or equal to 32\|256 that are multiples of 512.
+.IP "\fBustar\fR" 10
+The
+.BR tar
+interchange format; see the EXTENDED DESCRIPTION section. The default
+.IR blocksize
+for this format for character special archive files shall be 10\|240.
+Implementations shall support all
+.IR blocksize
+values less than or equal to 32\|256 that are multiples of 512.
+.P
+Implementation-defined formats shall specify a default block size as
+well as any other block sizes supported for character special archive
+files.
+.P
+Any attempt to append to an archive file in a format different from the
+existing archive format shall cause
+.IR pax
+to exit immediately with a non-zero exit status.
+.RE
+.IP "\fB\-X\fP" 10
+When traversing the file hierarchy specified by a pathname,
+.IR pax
+shall not descend into directories that have a different device ID (\c
+.IR st_dev ;
+see the System Interfaces volume of POSIX.1\(hy2017,
+\fIstat\fR()).
+.P
+Specifying more than one of the mutually-exclusive options
+.BR \-H
+and
+.BR \-L
+shall not be considered an error and the last option specified shall
+determine the behavior of the utility.
+.P
+The options that operate on the names of files or archive members (\c
+.BR \-c ,
+.BR \-i ,
+.BR \-n ,
+.BR \-s ,
+.BR \-u ,
+and
+.BR \-v )
+shall interact as follows. In
+.BR read
+mode, the archive members shall be selected based on the user-specified
+.IR pattern
+operands as modified by the
+.BR \-c ,
+.BR \-n ,
+and
+.BR \-u
+options. Then, any
+.BR \-s
+and
+.BR \-i
+options shall modify, in that order, the names of the selected files.
+The
+.BR \-v
+option shall write names resulting from these modifications.
+.P
+In
+.BR write
+mode, the files shall be selected based on the user-specified
+pathnames as modified by the
+.BR \-n
+and
+.BR \-u
+options. Then, any
+.BR \-s
+and
+.BR \-i
+options shall modify, in that order, the names of these selected files.
+The
+.BR \-v
+option shall write names resulting from these modifications.
+.P
+If both the
+.BR \-u
+and
+.BR \-n
+options are specified,
+.IR pax
+shall not consider a file selected unless it is newer than the file to
+which it is compared.
+.SS "List Mode Format Specifications"
+.P
+In
+.BR list
+mode with the
+.BR \-o
+.BR listopt=format
+option, the
+.IR format
+argument shall be applied for each selected file. The
+.IR pax
+utility shall append a
+<newline>
+to the
+.BR listopt
+output for each selected file. The
+.IR format
+argument shall be used as the
+.IR format
+string described in the Base Definitions volume of POSIX.1\(hy2017,
+.IR "Chapter 5" ", " "File Format Notation",
+with the exceptions 1. through 6. defined in the EXTENDED DESCRIPTION
+section of
+.IR printf ,
+plus the following exceptions:
+.IP 7. 6
+The sequence (\c
+.IR keyword )
+can occur before a format conversion specifier. The conversion
+argument is defined by the value of
+.IR keyword .
+The implementation shall support the following keywords:
+.RS 6
+.IP -- 4
+Any of the Field Name entries in
+.IR "Table 4-14, ustar Header Block"
+and
+.IR "Table 4-16, Octet-Oriented cpio Archive Entry".
+The implementation may support the
+.IR cpio
+keywords without the leading
+.BR c_
+in addition to the form required by
+.IR "Table 4-16, Octet-Oriented cpio Archive Entry".
+.IP -- 4 +Any keyword defined for the extended header in +.IR "pax Extended Header". +.IP -- 4 +Any keyword provided as an implementation-defined extension within +the extended header defined in +.IR "pax Extended Header". +.P +For example, the sequence +.BR \(dq%(charset)s\(dq +is the string value of the name of the character set in the extended +header. +.P +The result of the keyword conversion argument shall be the value from +the applicable header field or extended header, without any trailing +NULs. +.P +All keyword values used as conversion arguments shall be translated +from the UTF\(hy8 encoding (or alternative encoding specified by any +.BR hdrcharset +extended header record) to the character set appropriate for the local +file system, user database, and so on, as applicable. +.RE +.IP 8. 6 +An additional conversion specifier character, +.BR T , +shall be used to specify time formats. The +.BR T +conversion specifier character can be preceded by the sequence (\c +.IR keyword= \c +.IR subformat ), +where +.IR subformat +is a date format as defined by +.IR date +operands. The default +.IR keyword +shall be +.BR mtime +and the default subformat shall be: +.RS 6 +.sp +.RS 4 +.nf + +%b %e %H:%M %Y +.fi +.P +.RE +.RE +.IP 9. 6 +An additional conversion specifier character, +.BR M , +shall be used to specify the file mode string as defined in +.IR ls +Standard Output. If (\c +.IR keyword ) +is omitted, the +.BR mode +keyword shall be used. For example, +.BR %.1M +writes the single character corresponding to the <\fIentry\ type\fP> +field of the +.IR ls +.BR \-l +command. +.IP 10. 6 +An additional conversion specifier character, +.BR D , +shall be used to specify the device for block or special files, if +applicable, in an implementation-defined format. If not applicable, +and (\c +.IR keyword ) +is specified, then this conversion shall be equivalent to +\fR%(\fIkeyword\fR)u\fR. 
If not applicable, and (\c +.IR keyword ) +is omitted, then this conversion shall be equivalent to +<space>. +.IP 11. 6 +An additional conversion specifier character, +.BR F , +shall be used to specify a pathname. The +.BR F +conversion character can be preceded by a sequence of +<comma>-separated +keywords: +.RS 6 +.sp +.RS 4 +.nf + +(\fIkeyword\fB[\fR,\fIkeyword\fB]\fR ... ) +.fi +.P +.RE +.P +The values for all the keywords that are non-null shall be concatenated +together, each separated by a +.BR '/' . +The default shall be (\c +.BR path ) +if the keyword +.BR path +is defined; otherwise, the default shall be (\c +.BR prefix ,\c +.BR name ). +.RE +.IP 12. 6 +An additional conversion specifier character, +.BR L , +shall be used to specify a symbolic link expansion. If the current +file is a symbolic link, then +.BR %L +shall expand to: +.RS 6 +.sp +.RS 4 +.nf + +"%s -> %s", <\fIvalue of keyword\fR>, <\fIcontents of link\fR> +.fi +.P +.RE +.P +Otherwise, the +.BR %L +conversion specification shall be the equivalent of +.BR %F . +.RE +.SH OPERANDS +The following operands shall be supported: +.IP "\fIdirectory\fR" 10 +The destination directory pathname for +.BR copy +mode. +.IP "\fIfile\fR" 10 +A pathname of a file to be copied or archived. +.IP "\fIpattern\fR" 10 +A pattern matching one or more pathnames of archive members. A pattern +must be given in the name-generating notation of the pattern matching +notation in +.IR "Section 2.13" ", " "Pattern Matching Notation", +including the filename expansion rules in +.IR "Section 2.13.3" ", " "Patterns Used for Filename Expansion". +The default, if no +.IR pattern +is specified, is to select all members in the archive. +.SH STDIN +In +.BR write +mode, the standard input shall be used only if no +.IR file +operands are specified. It shall be a file containing a list of +pathnames, each terminated by a +<newline> +character. 
+.P +In +.BR list +and +.BR read +modes, if +.BR \-f +is not specified, the standard input shall be an archive file. +.P +Otherwise, the standard input shall not be used. +.SH "INPUT FILES" +The input file named by the +.IR archive +option-argument, or standard input when the archive is read from there, +shall be a file formatted according to one of the specifications in the +EXTENDED DESCRIPTION section or some other implementation-defined +format. +.P +The file +.BR /dev/tty +shall be used to write prompts and read responses. +.SH "ENVIRONMENT VARIABLES" +The following environment variables shall affect the execution of +.IR pax : +.IP "\fILANG\fP" 10 +Provide a default value for the internationalization variables that are +unset or null. (See the Base Definitions volume of POSIX.1\(hy2017, +.IR "Section 8.2" ", " "Internationalization Variables" +for the precedence of internationalization variables used to determine the +values of locale categories.) +.IP "\fILC_ALL\fP" 10 +If set to a non-empty string value, override the values of all the +other internationalization variables. +.IP "\fILC_COLLATE\fP" 10 +.br +Determine the locale for the behavior of ranges, equivalence classes, +and multi-character collating elements used in the pattern matching +expressions for the +.IR pattern +operand, the basic regular expression for the +.BR \-s +option, and the extended regular expression defined for the +.BR yesexpr +locale keyword in the +.IR LC_MESSAGES +category. +.IP "\fILC_CTYPE\fP" 10 +Determine the locale for the interpretation of sequences of bytes of +text data as characters (for example, single-byte as opposed to +multi-byte characters in arguments and input files), the behavior of +character classes used in the extended regular expression defined for +the +.BR yesexpr +locale keyword in the +.IR LC_MESSAGES +category, and pattern matching.
+.IP "\fILC_MESSAGES\fP" 10 +.br +Determine the locale used to process affirmative responses, and the +locale used to affect the format and contents of diagnostic messages +and prompts written to standard error. +.IP "\fILC_TIME\fP" 10 +Determine the format and contents of date and time strings when the +.BR \-v +option is specified. +.IP "\fINLSPATH\fP" 10 +Determine the location of message catalogs for the processing of +.IR LC_MESSAGES . +.IP "\fITMPDIR\fP" 10 +Determine the pathname that provides part of the default global +extended header record file, as described for the +.BR \-o +.BR globexthdr= +keyword in the OPTIONS section. +.IP "\fITZ\fP" 10 +Determine the timezone used to calculate date and time strings when the +.BR \-v +option is specified. If +.IR TZ +is unset or null, an unspecified default timezone shall be used. +.SH "ASYNCHRONOUS EVENTS" +Default. +.SH STDOUT +In +.BR write +mode, if +.BR \-f +is not specified, the standard output shall be the archive formatted +according to one of the specifications in the EXTENDED DESCRIPTION +section, or some other implementation-defined format (see +.BR \-x +.IR format ). +.P +In +.BR list +mode, when the +.BR \-o \c +.BR listopt =\c +.IR format +has been specified, the selected archive members shall be written to +standard output using the format described under +.IR "List Mode Format Specifications". +In +.BR list +mode without the +.BR \-o \c +.BR listopt =\c +.IR format +option, the table of contents of the selected archive members shall +be written to standard output using the following format: +.sp +.RS 4 +.nf + +"%s\en", <\fIpathname\fR> +.fi +.P +.RE +.P +If the +.BR \-v +option is specified in +.BR list +mode, the table of contents of the selected archive members shall be +written to standard output using the following formats. 
+.P +For pathnames representing hard links to previous members of the +archive: +.sp +.RS 4 +.nf + +"%s == %s\en", <\fIls\fR -l \fIlisting\fR>, <\fIlinkname\fR> +.fi +.P +.RE +.P +For all other pathnames: +.sp +.RS 4 +.nf + +"%s\en", <\fIls\fR -l \fIlisting\fR> +.fi +.P +.RE +.P +where <\fIls\ \fR\-l\ \fIlisting\fR> shall be the format specified by +the +.IR ls +utility with the +.BR \-l +option. When writing pathnames in this format, it is unspecified what +is written for fields for which the underlying archive format does not +have the correct information, although the correct number of +<blank>-separated +fields shall be written. +.P +In +.BR list +mode, standard output shall not be buffered more than a pathname +(plus any associated information and a +<newline> +terminator) at a time. +.SH STDERR +If +.BR \-v +is specified in +.BR read , +.BR write , +or +.BR copy +modes, +.IR pax +shall write the pathnames it processes to the standard error output +using the following format: +.sp +.RS 4 +.nf + +"%s\en", <\fIpathname\fR> +.fi +.P +.RE +.P +These pathnames shall be written as soon as processing is begun on the +file or archive member, and shall be flushed to standard error. The +trailing +<newline>, +which shall not be buffered, is written when the file has been read or +written. +.P +If the +.BR \-s +option is specified, and the replacement string has a trailing +.BR 'p' , +substitutions shall be written to standard error in the following +format: +.sp +.RS 4 +.nf + +"%s >> %s\en", <\fIoriginal pathname\fR>, <\fInew pathname\fR> +.fi +.P +.RE +.P +In all operating modes of +.IR pax , +optional messages of unspecified format concerning the input archive +format and volume number, the number of files, blocks, volumes, and +media parts as well as other diagnostic messages may be written to +standard error. +.P +In all formats, for both standard output and standard error, it is +unspecified how non-printable characters in pathnames or link names +are written. 
+.P +When using the +.BR \-x \c +.BR pax +archive format, if a filename, link name, group name, owner name, or +any other field in an extended header record cannot be translated +between the codeset in use for that extended header record and the +character set of the current locale, +.IR pax +shall write a diagnostic message to standard error, shall process the +file as described for the +.BR \-o +.BR invalid= +option, and then shall continue processing with the next file. +.SH "OUTPUT FILES" +In +.BR read +mode, the extracted output files shall be of the archived file type. +In +.BR copy +mode, the copied output files shall be the type of the file being +copied. In either mode, existing files in the destination hierarchy +shall be overwritten only when all permission (\c +.BR \-p ), +modification time (\c +.BR \-u ), +and invalid-value (\c +.BR \-o \c +.BR invalid= ) +tests allow it. +.P +In +.BR write +mode, the output file named by the +.BR \-f +option-argument shall be a file formatted according to one of the +specifications in the EXTENDED DESCRIPTION section, or some other +implementation-defined format. +.SH "EXTENDED DESCRIPTION" +.SS "pax Interchange Format" +.P +A +.IR pax +archive tape or file produced in the +.BR \-x \c +.BR pax +format shall contain a series of blocks. The physical layout of the +archive shall be identical to the +.BR ustar +format described in +.IR "ustar Interchange Format". +Each file archived shall be represented by the following sequence: +.IP " *" 4 +An optional header block with extended header records. This header +block is of the form described in +.IR "pax Header Block", +with a +.IR typeflag +value of +.BR x +or +.BR g . +The extended header records, described in +.IR "pax Extended Header", +shall be included as the data for this header block. +.IP " *" 4 +A header block that describes the file. Any fields in the preceding +optional extended header shall override the associated fields in +this header block for this file. 
+.IP " *" 4 +Zero or more blocks that contain the contents of the file. +.P +At the end of the archive file there shall be two 512-byte blocks +filled with binary zeros, interpreted as an end-of-archive indicator. +.P +A schematic of an example archive with global extended header records +and two actual files is shown in +.IR "Figure 4-1, pax Format Archive Example". +In the example, the second file in the archive has no extended header +preceding it, presumably because it has no need for extended +attributes. +.sp +.ce 1 +\fBFigure 4-1: pax Format Archive Example\fR +.SS "pax Header Block" +.P +The +.BR pax +header block shall be identical to the +.BR ustar +header block described in +.IR "ustar Interchange Format", +except that two additional +.IR typeflag +values are defined: +.IP "\fRx\fP" 6 +Represents extended header records for the following file in the +archive (which shall have its own +.BR ustar +header block). The format of these extended header records shall be as +described in +.IR "pax Extended Header". +.IP "\fRg\fR" 6 +Represents global extended header records for the following files in +the archive. The format of these extended header records shall be as +described in +.IR "pax Extended Header". +Each value shall affect all subsequent files that do not override that +value in their own extended header record and until another global +extended header record is reached that provides another value for the +same field. The +.IR typeflag +.BR g +global headers should not be used with interchange media that could +suffer partial data loss in transporting the archive. +.P +For both of these types, the +.IR size +field shall be the size of the extended header records in octets. The +other fields in the header block are not meaningful to this version of +the +.IR pax +utility. 
However, if this archive is read by a +.IR pax +utility conforming to the ISO\ POSIX\(hy2:\|1993 standard, the header block fields are used to +create a regular file that contains the extended header records as +data. Therefore, header block field values should be selected to +provide reasonable file access to this regular file. +.P +A further difference from the +.BR ustar +header block is that data blocks for files of +.IR typeflag +1 (the digit one) (hard link) may be included, which means that the +size field may be greater than zero. Archives created by +.IR pax +.BR \-o +.BR linkdata +shall include these data blocks with the hard links. +.SS "pax Extended Header" +.P +A +.BR pax +extended header contains values that are inappropriate for the +.BR ustar +header block because of limitations in that format: fields requiring a +character encoding other than that described in the ISO/IEC\ 646:\|1991 standard, fields +representing file attributes not described in the +.BR ustar +header, and fields whose format or length do not fit the requirements +of the +.BR ustar +header. The values in an extended header add attributes to the +following file (or files; see the description of the +.IR typeflag +.BR g +header block) or override values in the following header block(s), as +indicated in the following list of keywords. +.P +An extended header shall consist of one or more records, each +constructed as follows: +.sp +.RS 4 +.nf + +"%d %s=%s\en", <\fIlength\fR>, <\fIkeyword\fR>, <\fIvalue\fR> +.fi +.P +.RE +.P +The extended header records shall be encoded according to the ISO/IEC\ 10646\(hy1:\|2000 standard +UTF\(hy8 encoding. The <\fIlength\fP> field, +<blank>, +<equals-sign>, +and +<newline> +shown shall be limited to the portable character set, as encoded in +UTF\(hy8. The <\fIkeyword\fP> fields can be any UTF\(hy8 characters. +The <\fIlength\fP> field shall be the decimal length of the extended +header record in octets, including the trailing +<newline>. 
+If there is a +.BR hdrcharset +extended header in effect for a file, the +.IR value +field for any +.BR gname , +.BR linkpath , +.BR path , +and +.BR uname +extended header records shall be encoded using the character set +specified by the +.BR hdrcharset +extended header record; otherwise, the +.IR value +field shall be encoded using UTF\(hy8. The +.IR value +field for all other keywords specified by POSIX.1\(hy2008 shall be +encoded using UTF\(hy8. +.P +The <\fIkeyword\fP> field shall be one of the entries from the +following list or a keyword provided as an implementation extension. +Keywords consisting entirely of lowercase letters, digits, and periods +are reserved for future standardization. A keyword shall not include an +<equals-sign>. +(In the following list, the notations ``file(s)'' or ``block(s)'' is used +to acknowledge that a keyword affects the following single file after a +.IR typeflag +.BR x +extended header, but possibly multiple files after +.IR typeflag +.BR g . +Any requirements in the list for +.IR pax +to include a record when in +.BR write +or +.BR copy +mode shall apply only when such a record has not already been provided +through the use of the +.BR \-o +option. When used in +.BR copy +mode, +.IR pax +shall behave as if an archive had been created with applicable extended +header records and then extracted.) +.IP "\fBatime\fP" 10 +The file access time for the following file(s), equivalent to the value +of the +.IR st_atime +member of the +.BR stat +structure for a file, as described by the +\fIstat\fR() +function. The access time shall be restored if the process has +appropriate privileges required to do so. The format of the +<\fIvalue\fP> shall be as described in +.IR "pax Extended Header File Times". +.IP "\fBcharset\fP" 10 +The name of the character set used to encode the data in the following +file(s). 
The entries in the following table are defined to refer to +known standards; additional names may be agreed on between the +originator and recipient. +.TS +center box tab(!); +cB | cB +lf5 | l. +<value>!Formal Standard +_ +ISO-IR 646 1990!ISO/IEC 646:\|1990 +ISO-IR 8859 1 1998!ISO/IEC 8859\(hy1:\|1998 +ISO-IR 8859 2 1999!ISO/IEC 8859\(hy2:\|1999 +ISO-IR 8859 3 1999!ISO/IEC 8859\(hy3:\|1999 +ISO-IR 8859 4 1998!ISO/IEC 8859\(hy4:\|1998 +ISO-IR 8859 5 1999!ISO/IEC 8859\(hy5:\|1999 +ISO-IR 8859 6 1999!ISO/IEC 8859\(hy6:\|1999 +ISO-IR 8859 7 1987!ISO/IEC 8859\(hy7:\|1987 +ISO-IR 8859 8 1999!ISO/IEC 8859\(hy8:\|1999 +ISO-IR 8859 9 1999!ISO/IEC 8859\(hy9:\|1999 +ISO-IR 8859 10 1998!ISO/IEC 8859\(hy10:\|1998 +ISO-IR 8859 13 1998!ISO/IEC 8859\(hy13:\|1998 +ISO-IR 8859 14 1998!ISO/IEC 8859\(hy14:\|1998 +ISO-IR 8859 15 1999!ISO/IEC 8859\(hy15:\|1999 +ISO-IR 10646 2000!ISO/IEC 10646:\|2000 +ISO-IR 10646 2000 UTF-8!ISO/IEC 10646, UTF-8 encoding +BINARY!None. +.TE +.RS 10 +.P +The encoding is included in an extended header for information only; +when +.IR pax +is used as described in POSIX.1\(hy2008, it shall not translate the file data +into any other encoding. The +.BR BINARY +entry indicates unencoded binary data. +.P +When used in +.BR write +or +.BR copy +mode, it is implementation-defined whether +.IR pax +includes a +.BR charset +extended header record for a file. +.RE +.IP "\fBcomment\fP" 10 +A series of characters used as a comment. All characters in the +<\fIvalue\fP> field shall be ignored by +.IR pax . +.IP "\fBgid\fP" 10 +The group ID of the group that owns the file, expressed as a decimal +number using digits from the ISO/IEC\ 646:\|1991 standard. This record shall override the +.IR gid +field in the following header block(s). When used in +.BR write +or +.BR copy +mode, +.IR pax +shall include a +.IR gid +extended header record for each file whose group ID is greater than +2\|097\|151 (octal 7\|777\|777). 
+.IP "\fBgname\fP" 10 +The group of the file(s), formatted as a group name in the group +database. This record shall override the +.IR gid +and +.IR gname +fields in the following header block(s), and any +.IR gid +extended header record. When used in +.BR read , +.BR copy , +or +.BR list +mode, +.IR pax +shall translate the name from the encoding in the header record to +the character set appropriate for the group database on the +receiving system. If any of the characters cannot be +translated, and if neither the +.BR \-o \c +.BR invalid=UTF\(hy8 +option nor the +.BR \-o \c +.BR invalid=binary +option is specified, the results are implementation-defined. +When used in +.BR write +or +.BR copy +mode, +.IR pax +shall include a +.BR gname +extended header record for each file whose group name cannot be +represented entirely with the letters and digits of the portable +character set. +.IP "\fBhdrcharset\fR" 10 +The name of the character set used to encode the value field of the +.BR gname , +.BR linkpath , +.BR path , +and +.BR uname +.IR pax +extended header records. The entries in the following table are defined +to refer to known standards; additional names may be agreed between the +originator and the recipient. +.br +.TS +center box tab(!); +cB | cB +lf5 | l. +<value>!Formal Standard +_ +ISO-IR 10646 2000 UTF-8!ISO/IEC 10646, UTF-8 encoding +BINARY!None. +.TE +.RS 10 +.P +If no +.BR hdrcharset +extended header record is specified, the default character set used to +encode all values in extended header records shall be the ISO/IEC\ 10646\(hy1:\|2000 standard +UTF\(hy8 encoding. +.P +The +.BR BINARY +entry indicates that all values recorded in extended headers for +affected files are unencoded binary data from the underlying system. +.RE +.IP "\fBlinkpath\fP" 10 +The pathname of a link being created to another file, of any type, +previously archived. This record shall override the +.IR linkname +field in the following +.BR ustar +header block(s). 
The following +.BR ustar +header block shall determine the type of link created. If +.IR typeflag +of the following header block is 1, it shall be a hard link. If +.IR typeflag +is 2, it shall be a symbolic link and the +.BR linkpath +value shall be the contents of the symbolic link. The +.IR pax +utility shall translate the name of the link (contents of the symbolic +link) from the encoding in the header to the character set appropriate +for the local file system. When used in +.BR write +or +.BR copy +mode, +.IR pax +shall include a +.BR linkpath +extended header record for each link whose pathname cannot be +represented entirely with the members of the portable character set +other than NUL. +.IP "\fBmtime\fP" 10 +The file modification time of the following file(s), equivalent to the +value of the +.IR st_mtime +member of the +.BR stat +structure for a file, as described in the +\fIstat\fR() +function. This record shall override the +.IR mtime +field in the following header block(s). The modification time shall be +restored if the process has appropriate privileges required to do +so. The format of the <\fIvalue\fP> shall be as described in +.IR "pax Extended Header File Times". +.IP "\fBpath\fP" 10 +The pathname of the following file(s). This record shall override the +.IR name +and +.IR prefix +fields in the following header block(s). The +.IR pax +utility shall translate the pathname of the file from the encoding in +the header to the character set appropriate for the local file system. +.RS 10 +.P +When used in +.BR write +or +.BR copy +mode, +.IR pax +shall include a +.IR path +extended header record for each file whose pathname cannot be +represented entirely with the members of the portable character set +other than NUL. +.RE +.IP "\fBrealtime.\fIany\fR" 10 +The keywords prefixed by ``realtime.'' are reserved for future +standardization. +.IP "\fBsecurity.\fIany\fR" 10 +The keywords prefixed by ``security.'' are reserved for future +standardization. 
+.IP "\fBsize\fP" 10 +The size of the file in octets, expressed as a decimal number using +digits from the ISO/IEC\ 646:\|1991 standard. This record shall override the +.IR size +field in the following header block(s). When used in +.BR write +or +.BR copy +mode, +.IR pax +shall include a +.IR size +extended header record for each file with a size value greater than +8\|589\|934\|591 (octal 77\|777\|777\|777). +.IP "\fBuid\fP" 10 +The user ID of the file owner, expressed as a decimal number using +digits from the ISO/IEC\ 646:\|1991 standard. This record shall override the +.IR uid +field in the following header block(s). When used in +.BR write +or +.BR copy +mode, +.IR pax +shall include a +.IR uid +extended header record for each file whose owner ID is greater than +2\|097\|151 (octal 7\|777\|777). +.IP "\fBuname\fP" 10 +The owner of the following file(s), formatted as a user name in the +user database. This record shall override the +.IR uid +and +.IR uname +fields in the following header block(s), and any +.IR uid +extended header record. When used in +.BR read , +.BR copy , +or +.BR list +mode, +.IR pax +shall translate the name from the encoding in the header record to the +character set appropriate for the user database on the receiving +system. If any of the characters cannot be translated, and if neither +the +.BR \-o \c +.BR invalid=UTF\(hy8 +option nor the +.BR \-o \c +.BR invalid=binary +option is specified, the results are implementation-defined. +When used in +.BR write +or +.BR copy +mode, +.IR pax +shall include a +.BR uname +extended header record for each file whose user name cannot be +represented entirely with the letters and digits of the portable +character set. +.P +If the <\fIvalue\fP> field is zero length, it shall delete any header +block field, previously entered extended header value, or global +extended header value of the same name. 
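The following Python sketch is illustrative only and is not part of the standard. It shows one way to build and parse extended header records of the form "%d %s=%s\n", where the decimal length counts the whole record, including its own digits and the trailing <newline>. The function names make_record and parse_records are hypothetical. The sketch assumes that no hdrcharset record is in effect, so every value is treated as UTF-8, and it reports a zero-length value as an empty string, leaving the deletion of the corresponding attribute to the caller.

```python
def make_record(keyword, value):
    """Build one extended header record; the length field counts itself."""
    body = (" " + keyword + "=" + value + "\n").encode("utf-8")
    length = len(body)
    # The length includes its own decimal digits, so grow it until it is stable.
    while True:
        total = len(str(length)) + len(body)
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body


def parse_records(data):
    """Parse concatenated records into a dict; later records win on conflict."""
    attrs = {}
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])
        end = pos + length
        if end > len(data) or data[end - 1:end] != b"\n":
            raise ValueError("malformed extended header record")
        # A keyword cannot contain an equals sign, so split at the first one.
        keyword, _, value = data[space + 1:end - 1].partition(b"=")
        # An empty value means "delete this attribute"; kept as "" here.
        attrs[keyword.decode("utf-8")] = value.decode("utf-8")
        pos = end
    return attrs
```

For instance, make_record("path", "x") yields the 9-octet record "9 path=x" followed by a <newline>. Splitting on the record length rather than on <newline> matters because a value may itself contain <newline> characters.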
+.P +If a keyword in an extended header record (or in a +.BR \-o +option-argument) overrides or deletes a corresponding field in the +.BR ustar +header block, +.IR pax +shall ignore the contents of that header block field. +.P +Unlike the +.BR ustar +header block fields, NULs shall not delimit <\fIvalue\fP>s; all +characters within the <\fIvalue\fP> field shall be considered data for +the field. None of the length limitations of the +.BR ustar +header block fields in +.IR "Table 4-14, ustar Header Block" +shall apply to the extended header records. +.SS "pax Extended Header Keyword Precedence" +.P +This section describes the precedence in which the various header +records and fields and command line options are selected to apply to a +file in the archive. When +.IR pax +is used in +.BR read +or +.BR list +modes, it shall determine a file attribute in the following sequence: +.IP " 1." 4 +If +.BR \-o \c +.BR delete=keyword-prefix +is used, the affected attributes shall be determined from step 7., if +applicable, or ignored otherwise. +.IP " 2." 4 +If +.BR \-o \c +.IR keyword := +is used, the affected attributes shall be ignored. +.IP " 3." 4 +If +.BR \-o \c +.BR keyword:=value +is used, the affected attribute shall be assigned the value. +.IP " 4." 4 +If there is a +.IR typeflag +.BR x +extended header record, the affected attribute shall be assigned the +<\fIvalue\fP>. When extended header records conflict, the last one +given in the header shall take precedence. +.IP " 5." 4 +If +.BR \-o \c +.BR keyword=value +is used, the affected attribute shall be assigned the value. +.IP " 6." 4 +If there is a +.IR typeflag +.BR g +global extended header record, the affected attribute shall be assigned +the <\fIvalue\fP>. When global extended header records conflict, the +last one given in the global header shall take precedence. +.IP " 7." 4 +Otherwise, the attribute shall be determined from the +.BR ustar +header block. 
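The seven steps above can be sketched as a single lookup function. This Python example is a simplified illustration, not part of the standard, and every name in it (resolve_attribute and its parameters) is hypothetical. It assumes that each source of values has already been parsed into a dictionary keyed by keyword, that the ustar header fields are keyed by the same keyword names, and that each delete entry is a literal keyword prefix. An ignored or deleted attribute is returned as None.

```python
def resolve_attribute(keyword, ustar, x_records, g_records,
                      delete_prefixes=(), override_opts=None, default_opts=None):
    """Return the effective value of one attribute in read or list mode."""
    override_opts = override_opts or {}  # from -o keyword:=value ("" for keyword:=)
    default_opts = default_opts or {}    # from -o keyword=value

    def value_or_deleted(value):
        # Assumption: a zero-length value deletes the attribute.
        return value if value != "" else None

    # Step 1: -o delete= falls through to the ustar header block (step 7).
    if any(keyword.startswith(prefix) for prefix in delete_prefixes):
        return ustar.get(keyword)
    # Steps 2 and 3: -o keyword:= ignores the attribute, keyword:=value sets it.
    if keyword in override_opts:
        return value_or_deleted(override_opts[keyword])
    # Step 4: typeflag x extended header record for this file.
    if keyword in x_records:
        return value_or_deleted(x_records[keyword])
    # Step 5: -o keyword=value.
    if keyword in default_opts:
        return value_or_deleted(default_opts[keyword])
    # Step 6: typeflag g global extended header record.
    if keyword in g_records:
        return value_or_deleted(g_records[keyword])
    # Step 7: the ustar header block field, if any.
    return ustar.get(keyword)
```

The order of the checks mirrors the order of the steps, so the first source that supplies the keyword decides the result.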
+.SS "pax Extended Header File Times" +.P +The +.IR pax +utility shall write an +.BR mtime +record for each file in +.BR write +or +.BR copy +modes if the file's modification time cannot be represented exactly in +the +.BR ustar +header logical record described in +.IR "ustar Interchange Format". +This can occur if the time is out of +.BR ustar +range, or if the file system of the underlying implementation supports +non-integer time granularities and the time is not an integer. All of +these time records shall be formatted as a decimal representation of +the time in seconds since the Epoch. If a +<period> +(\c +.BR '.' ) +decimal point character is present, the digits to the right of the +point shall represent the units of a subsecond timing granularity, +where the first digit is tenths of a second and each subsequent digit +is a tenth of the previous digit. In +.BR read +or +.BR copy +mode, the +.IR pax +utility shall truncate the time of a file to the greatest value that is +not greater than the input header file time. In +.BR write +or +.BR copy +mode, the +.IR pax +utility shall output a time exactly if it can be represented exactly as +a decimal number, and otherwise shall generate only enough digits so +that the same time shall be recovered if the file is extracted on a +system whose underlying implementation supports the same time +granularity. +.SS "ustar Interchange Format" +.P +A +.BR ustar +archive tape or file shall contain a series of logical records. Each +logical record shall be a fixed-size logical record of 512 octets (see +below). Although this format may be thought of as being stored on +9-track industry-standard 12.7 mm (0.5 in) magnetic tape, other types of +transportable media are not excluded. Each file archived shall be +represented by a header logical record that describes the file, +followed by zero or more logical records that give the contents of the +file. 
At the end of the archive file there shall be two 512-octet +logical records filled with binary zeros, interpreted as an +end-of-archive indicator. +.P +The logical records may be grouped for physical I/O operations, as +described under the +.BR \-b \c +.IR blocksize +and +.BR \-x +.BR ustar +options. Each group of logical records may be written with a single +operation equivalent to the +\fIwrite\fR() +function. On magnetic tape, the result of this write shall be a single +tape physical block. The last physical block shall always be the full +size, so logical records after the two zero logical records may contain +undefined data. +.P +The header logical record shall be structured as shown in the following +table. All lengths and offsets are in decimal. +.br +.sp +.ce 1 +\fBTable 4-14: ustar Header Block\fR +.TS +center box tab(@); +cB | cB | cB +lI | n | n. +Field Name@Octet Offset@Length (in Octets) +_ +name@0@100 +mode@100@8 +uid@108@8 +gid@116@8 +size@124@12 +mtime@136@12 +chksum@148@8 +typeflag@156@1 +linkname@157@100 +magic@257@6 +version@263@2 +uname@265@32 +gname@297@32 +devmajor@329@8 +devminor@337@8 +prefix@345@155 +.TE +.P +All characters in the header logical record shall be represented in the +coded character set of the ISO/IEC\ 646:\|1991 standard. For maximum portability between +implementations, names should be selected from characters represented +by the portable filename character set as octets with the most +significant bit zero. If an implementation supports the use of +characters outside of +<slash> +and the portable filename character set in names for files, users, and +groups, one or more implementation-defined encodings of these characters +shall be provided for interchange purposes. +.P +However, the +.IR pax +utility shall never create filenames on the local system that cannot +be accessed via the procedures described in POSIX.1\(hy2008. 
If a filename is +found on the medium that would create an invalid filename, it is +implementation-defined whether the data from the file is stored on the +file hierarchy and under what name it is stored. The +.IR pax +utility may choose to ignore these files as long as it produces an +error indicating that the file is being ignored. +.P +Each field within the header logical record is contiguous; that is, +there is no padding used. Each character on the archive medium shall be +stored contiguously. +.P +The fields +.IR magic , +.IR uname , +and +.IR gname +are character strings each terminated by a NUL character. The fields +.IR name , +.IR linkname , +and +.IR prefix +are NUL-terminated character strings except when all characters in the +array contain non-NUL characters including the last character. The +.IR version +field is two octets containing the characters +.BR \(dq00\(dq +(zero-zero). The +.IR typeflag +contains a single character. All other fields are leading zero-filled +octal numbers using digits from the ISO/IEC\ 646:\|1991 standard IRV. Each numeric field is +terminated by one or more +<space> +or NUL characters. +.P +The +.IR name +and the +.IR prefix +fields shall produce the pathname of the file. A new pathname shall +be formed, if +.IR prefix +is not an empty string (its first character is not NUL), by +concatenating +.IR prefix +(up to the first NUL character), a +<slash> +character, and +.IR name ; +otherwise, +.IR name +is used alone. In either case, +.IR name +is terminated at the first NUL character. If +.IR prefix +begins with a NUL character, it shall be ignored. In this manner, +pathnames of at most 256 characters can be supported. If a pathname +does not fit in the space provided, +.IR pax +shall notify the user of the error, and shall not store any part of the +file\(emheader or data\(emon the medium. +.P +The +.IR linkname +field, described below, shall not use the +.IR prefix +to produce a pathname. 
As such, a +.IR linkname +is limited to 100 characters. If the name does not fit in the space +provided, +.IR pax +shall notify the user of the error, and shall not attempt to store the +link on the medium. +.P +The +.IR mode +field provides 12 bits encoded in the ISO/IEC\ 646:\|1991 standard octal digit representation. +The encoded bits shall represent the following values: +.br +.sp +.ce 1 +\fBTable: ustar \fImode\fP Field\fR +.TS +tab(!) center box; +cB | cB | cB +n | l | l. +Bit Value!POSIX.1\(hy2008 Bit!Description +_ +04\|000!S_ISUID!Set UID on execution. +02\|000!S_ISGID!Set GID on execution. +01\|000!<reserved>!Reserved for future standardization. +00\|400!S_IRUSR!Read permission for file owner class. +00\|200!S_IWUSR!Write permission for file owner class. +00\|100!S_IXUSR!Execute/search permission for file owner class. +00\|040!S_IRGRP!Read permission for file group class. +00\|020!S_IWGRP!Write permission for file group class. +00\|010!S_IXGRP!Execute/search permission for file group class. +00\|004!S_IROTH!Read permission for file other class. +00\|002!S_IWOTH!Write permission for file other class. +00\|001!S_IXOTH!Execute/search permission for file other class. +.TE +.P +When appropriate privileges are required to set one of these mode bits, +and the user restoring the files from the archive does not have +appropriate privileges, the mode bits for which the user does not have +appropriate privileges shall be ignored. Some of the mode bits in the +archive format are not mentioned elsewhere in this volume of POSIX.1\(hy2017. If the +implementation does not support those bits, they may be ignored. +.P +The +.IR uid +and +.IR gid +fields are the user and group ID of the owner and group of the file, +respectively. +.P +The +.IR size +field is the size of the file in octets. If the +.IR typeflag +field is set to specify a file to be of type 1 (a link) or 2 (a +symbolic link), the +.IR size +field shall be specified as zero. 
If the +.IR typeflag +field is set to specify a file of type 5 (directory), the +.IR size +field shall be interpreted as described under the definition of that +record type. No data logical records are stored for types 1, 2, or 5. +If the +.IR typeflag +field is set to 3 (character special file), 4 (block special file), or +6 (FIFO), the meaning of the +.IR size +field is unspecified by this volume of POSIX.1\(hy2017, and no data logical records shall be +stored on the medium. Additionally, for type 6, the +.IR size +field shall be ignored when reading. If the +.IR typeflag +field is set to any other value, the number of logical records written +following the header shall be (\c +.IR size +511)/512, +ignoring any fraction in the result of the division. +.P +The +.IR mtime +field shall be the modification time of the file at the time it was +archived. It is the ISO/IEC\ 646:\|1991 standard representation of the octal value of the +modification time obtained from the +\fIstat\fR() +function. +.P +The +.IR chksum +field shall be the ISO/IEC\ 646:\|1991 standard IRV representation of the octal value of the +simple sum of all octets in the header logical record. Each octet in +the header shall be treated as an unsigned value. These values shall be +added to an unsigned integer, initialized to zero, the precision of +which is not less than 17 bits. When calculating the checksum, the +.IR chksum +field is treated as if it were all +<space> +characters. +.P +The +.IR typeflag +field specifies the type of file archived. If a particular +implementation does not recognize the type, or the user does not have +appropriate privileges to create that type, the file shall be extracted +as if it were a regular file if the file type is defined to have a +meaning for the +.IR size +field that could cause data logical records to be written on the medium +(see the previous description for +.IR size ). 
+If conversion to a regular file occurs, the +.IR pax +utility shall produce an error indicating that the conversion took +place. All of the +.IR typeflag +fields shall be coded in the ISO/IEC\ 646:\|1991 standard IRV: +.IP "\fR0\fR" 8 +Represents a regular file. For backwards-compatibility, a +.IR typeflag +value of binary zero (\c +.BR '\e0' ) +should be recognized as meaning a regular file when extracting files +from the archive. Archives written with this version of the archive +file format create regular files with a +.IR typeflag +value of the ISO/IEC\ 646:\|1991 standard IRV +.BR '0' . +.IP "\fR1\fR" 8 +Represents a file linked to another file, of any type, previously +archived. Such files are identified by having the same device +and file serial numbers, and pathnames that refer to different +directory entries. All such files shall be archived as linked files. +The linked-to name is specified in the +.IR linkname +field with a NUL-character terminator if it is less than 100 octets in +length. +.IP "\fR2\fR" 8 +Represents a symbolic link. The contents of the symbolic link shall be +stored in the +.IR linkname +field. +.IP "\fR3,4\fR" 8 +Represent character special files and block special files respectively. +In this case the +.IR devmajor +and +.IR devminor +fields shall contain information defining the device, the format of +which is unspecified by this volume of POSIX.1\(hy2017. Implementations may map the device +specifications to their own local specification or may ignore the +entry. +.IP "\fR5\fR" 8 +Specifies a directory or subdirectory. On systems where disk allocation +is performed on a directory basis, the +.IR size +field shall contain the maximum number of octets (which may be rounded +to the nearest disk block allocation unit) that the directory may hold. +A +.IR size +field of zero indicates no such limiting. Systems that do not support +limiting in this manner should ignore the +.IR size +field. +.IP "\fR6\fR" 8 +Specifies a FIFO special file. 
Note that the archiving of a FIFO file +archives the existence of this file and not its contents. +.IP "\fR7\fR" 8 +Reserved to represent a file to which an implementation has associated +some high-performance attribute. Implementations without such +extensions should treat this file as a regular file (type 0). +.IP "\fRA\(hyZ\fR" 8 +The letters +.BR 'A' +to +.BR 'Z' , +inclusive, are reserved for custom implementations. All other values +are reserved for future versions of this standard. +.P +It is unspecified whether files with pathnames that refer to the same +directory entry are archived as linked files or as separate files. If +they are archived as linked files, this means that attempting to +extract both pathnames from the resulting archive will always cause an +error (unless the +.BR \-u +option is used) because the link cannot be created. +.P +It is unspecified whether files with the same device and file serial +numbers being appended to an archive are treated as linked files to +members that were in the archive before the append. +.P +Attempts to archive a socket shall produce a diagnostic message when +.BR ustar +interchange format is used, but may be allowed when +.BR pax +interchange format is used. Handling of other file types is +implementation-defined. +.P +The +.IR magic +field is the specification that this archive was output in this archive +format. If this field contains +.BR ustar +(the five characters from the ISO/IEC\ 646:\|1991 standard IRV shown followed by NUL), the +.IR uname +and +.IR gname +fields shall contain the ISO/IEC\ 646:\|1991 standard IRV representation of the owner and +group of the file, respectively (truncated to fit, if necessary). When +the file is restored by a privileged, protection-preserving version of +the utility, the user and group databases shall be scanned for these +names. 
If found, the user and group IDs contained within these files +shall be used rather than the values contained within the +.IR uid +and +.IR gid +fields. +.SS "cpio Interchange Format" +.P +The octet-oriented +.BR cpio +archive format shall be a series of entries, each comprising a header +that describes the file, the name of the file, and then the contents of +the file. +.P +An archive may be recorded as a series of fixed-size blocks of octets. +This blocking shall be used only to make physical I/O more efficient. +The last group of blocks shall always be at the full size. +.P +For the octet-oriented +.BR cpio +archive format, the individual entry information shall be in the order +indicated and described by the following table; see also the +.IR <cpio.h> +header. +.br +.sp +.ce 1 +\fBTable 4-16: Octet-Oriented cpio Archive Entry\fR +.TS +center box tab(!); +cB | cB | cB +lI | n | l. +Header Field Name!Length (in Octets)!Interpreted as +_ +c_magic!6!Octal number +c_dev!6!Octal number +c_ino!6!Octal number +c_mode!6!Octal number +c_uid!6!Octal number +c_gid!6!Octal number +c_nlink!6!Octal number +c_rdev!6!Octal number +c_mtime!11!Octal number +c_namesize!6!Octal number +c_filesize!11!Octal number +_ +.T& +cB | cB | cB +lI lI l. +Filename Field Name!Length!Interpreted as +_ +c_name!c_namesize!Pathname string +_ +.T& +cB | cB | cB +lI lI l. +File Data Field Name!Length!Interpreted as +_ +c_filedata!c_filesize!Data +.TE +.SS "cpio Header" +.P +For each file in the archive, a header as defined previously shall be +written. The information in the header fields is written as streams of +the ISO/IEC\ 646:\|1991 standard characters interpreted as octal numbers. The octal numbers +shall be extended to the necessary length by appending the ISO/IEC\ 646:\|1991 standard IRV +zeros at the most-significant-digit end of the number; the result is +written to the most-significant digit of the stream of octets first. 
+The fields shall be interpreted as follows: +.IP "\fIc_magic\fR" 10 +Identify the archive as being a transportable archive by containing the +identifying value +.BR \(dq070707\(dq . +.IP "\fIc_dev\fR,\ \fIc_ino\fR" 10 +Contains values that uniquely identify the file within the archive +(that is, no files contain the same pair of +.IR c_dev +and +.IR c_ino +values unless they are links to the same file). The values shall be +determined in an unspecified manner. +.IP "\fIc_mode\fR" 10 +Contains the file type and access permissions as defined in the +following table. +.br +.sp +.ce 1 +\fBTable 4-17: Values for cpio c_mode Field\fR +.TS +center box tab(@); +cB | cB | cB +l | n | l. +File Permissions Name@Value@Indicates +_ +C_IRUSR@000\|400@Read by owner +C_IWUSR@000\|200@Write by owner +C_IXUSR@000\|100@Execute by owner +C_IRGRP@000\|040@Read by group +C_IWGRP@000\|020@Write by group +C_IXGRP@000\|010@Execute by group +C_IROTH@000\|004@Read by others +C_IWOTH@000\|002@Write by others +C_IXOTH@000\|001@Execute by others +C_ISUID@004\|000@Set \fIuid\fP +C_ISGID@002\|000@Set \fIgid\fP +C_ISVTX@001\|000@Reserved +_ +.T& +cB | cB | cB +l | n | l. +File Type Name@Value@Indicates +_ +C_ISDIR@040\|000@Directory +C_ISFIFO@010\|000@FIFO +C_ISREG@0100\|000@Regular file +C_ISLNK@0120\|000@Symbolic link +.RS 10 +.P +C_ISBLK@060\|000@Block special file +C_ISCHR@020\|000@Character special file +C_ISSOCK@0140\|000@Socket +.P +C_ISCTG@0110\|000@Reserved +.TE +.P +Directories, FIFOs, symbolic links, and regular files shall be +supported on a system conforming to this volume of POSIX.1\(hy2017; additional values defined +previously are reserved for compatibility with existing systems. +Additional file types may be supported; however, such files should not +be written to archives intended to be transported to other systems. +.RE +.IP "\fIc_uid\fR" 10 +Contains the user ID of the owner. +.IP "\fIc_gid\fR" 10 +Contains the group ID of the group. 
+.IP "\fIc_nlink\fR" 10 +Contains a number greater than or equal to the number of links in the +archive referencing the file. If the +.BR \-a +option is used to append to a +.IR cpio +archive, then the +.IR pax +utility need not account for the files in the existing part of the +archive when calculating the +.IR c_nlink +values for the appended part of the archive, and need not alter the +.IR c_nlink +values in the existing part of the archive if additional files with the +same +.IR c_dev +and +.IR c_ino +values are appended to the archive. +.IP "\fIc_rdev\fR" 10 +Contains implementation-defined information for character or block +special files. +.IP "\fIc_mtime\fR" 10 +Contains the latest time of modification of the file at the time the +archive was created. +.IP "\fIc_namesize\fR" 10 +Contains the length of the pathname, including the terminating NUL +character. +.IP "\fIc_filesize\fR" 10 +Contains the length in octets of the data section following the +header structure. +.SS "cpio Filename" +.P +The +.IR c_name +field shall contain the pathname of the file. The length of this field +in octets is the value of +.IR c_namesize . +.P +If a filename is found on the medium that would create an invalid +pathname, it is implementation-defined whether the data from the file +is stored on the file hierarchy and under what name it is stored. +.P +All characters shall be represented in the ISO/IEC\ 646:\|1991 standard IRV. For maximum +portability between implementations, names should be selected from +characters represented by the portable filename character set as +octets with the most significant bit zero. If an implementation +supports the use of characters outside the portable filename character +set in names for files, users, and groups, one or more +implementation-defined encodings of these characters shall be provided +for interchange purposes. 
However, the +.IR pax +utility shall never create filenames on the local system that cannot +be accessed via the procedures described previously in this volume of POSIX.1\(hy2017. If a +filename is found on the medium that would create an invalid filename, +it is implementation-defined whether the data from the file is stored on +the local file system and under what name it is stored. The +.IR pax +utility may choose to ignore these files as long as it produces an +error indicating that the file is being ignored. +.SS "cpio File Data" +.P +Following +.IR c_name , +there shall be +.IR c_filesize +octets of data. Interpretation of such data occurs in a manner +dependent on the file. For regular files, the data shall consist +of the contents of the file. For symbolic links, the data shall +consist of the contents of the symbolic link. If +.IR c_filesize +is zero, no data shall be contained in +.IR c_filedata . +.P +When restoring from an archive: +.IP " *" 4 +If the user does not have appropriate privileges to create a file of +the specified type, +.IR pax +shall ignore the entry and write an error message to standard error. +.IP " *" 4 +Only regular files and symbolic links have data to be restored. Presuming +a regular file meets any selection criteria that might be imposed on +the format-reading utility by the user, such data shall be restored. +.IP " *" 4 +If a user does not have appropriate privileges to set a particular mode +flag, the flag shall be ignored. Some of the mode flags in the archive +format are not mentioned elsewhere in this volume of POSIX.1\(hy2017. If the implementation does +not support those flags, they may be ignored. +.SS "cpio Special Entries" +.P +FIFO special files, directories, and the trailer shall be recorded with +.IR c_filesize +equal to zero. Symbolic links shall be recorded with +.IR c_filesize +equal to the length of the contents of the symbolic link. 
+For other special files, +.IR c_filesize +is unspecified by this volume of POSIX.1\(hy2017. The header for the next file entry in the +archive shall be written directly after the last octet of the file +entry preceding it. A header denoting the filename +.BR TRAILER!!! +shall indicate the end of the archive; the contents of octets in the +last block of the archive following such a header are undefined. +.SH "EXIT STATUS" +The following exit values shall be returned: +.IP "\00" 6 +All files were processed successfully. +.IP >0 6 +An error occurred. +.SH "CONSEQUENCES OF ERRORS" +If +.IR pax +cannot create a file or a link when reading an archive or cannot find a +file when writing an archive, or cannot preserve the user ID, group ID, +or file mode when the +.BR \-p +option is specified, a diagnostic message shall be written to standard +error and a non-zero exit status shall be returned, but processing +shall continue. In the case where +.IR pax +cannot create a link to a file, +.IR pax +shall not, by default, create a second copy of the file. +.P +If the extraction of a file from an archive is prematurely terminated +by a signal or error, +.IR pax +may have only partially extracted the file or (if the +.BR \-n +option was not specified) may have extracted a file of the same name as +that specified by the user, but which is not the file the user wanted. +Additionally, the file modes of extracted directories may have +additional bits from the S_IRWXU mask set as well as incorrect +modification and access times. +.LP +.IR "The following sections are informative." +.SH "APPLICATION USAGE" +Caution is advised when using the +.BR \-a +option to append to a +.IR cpio +format archive. If any of the files being appended happen to be given +the same +.IR c_dev +and +.IR c_ino +values as a file in the existing part of the archive, then they may be +treated as links to that file on extraction. 
Thus, it is risky to use +.BR \-a +with +.IR cpio +format except when it is done on the same system that the original +archive was created on, and with the same +.IR pax +utility, and in the knowledge that there has been little or no file +system activity since the original archive was created that could lead +to any of the files appended being given the same +.IR c_dev +and +.IR c_ino +values as an unrelated file in the existing part of the archive. Also, +when (intentionally) appending additional links to a file in the +existing part of the archive, the +.IR c_nlink +values in the modified archive can be smaller than the number of links +to the file in the archive, which may mean that the links are not +preserved on extraction. +.P +The +.BR \-p +(privileges) option was invented to reconcile differences between +historical +.IR tar +and +.IR cpio +implementations. In particular, the two utilities use +.BR \-m +in diametrically opposed ways. The +.BR \-p +option also provides a consistent means of extending the ways in which +future file attributes can be addressed, such as for enhanced security +systems or high-performance files. Although it may seem complex, there +are really two modes that are most commonly used: +.IP "\fB\-p\ e\fR" 8 +``Preserve everything''. This would be used by the historical +superuser, someone with all appropriate privileges, to preserve all +aspects of the files as they are recorded in the archive. The +.BR e +flag is the sum of +.BR o +and +.BR p , +and other implementation-defined attributes. +.IP "\fB\-p\ p\fR" 8 +``Preserve'' the file mode bits. This would be used by the user with +regular privileges who wished to preserve aspects of the file other +than the ownership. The file times are preserved by default, but two +other flags are offered to disable these and use the time of +extraction. +.P +The one pathname per line format of standard input precludes +pathnames containing +<newline> +characters. 
Although such pathnames violate the portable filename +guidelines, they may exist and their presence may inhibit usage of +.IR pax +within shell scripts. This problem is inherited from historical archive +programs. The problem can be avoided by listing filename arguments on +the command line instead of on standard input. +.P +It is almost certain that appropriate privileges are required for +.IR pax +to accomplish parts of this volume of POSIX.1\(hy2017. Specifically, creating files of type +block special or character special, restoring file access times unless +the files are owned by the user (the +.BR \-t +option), or preserving file owner, group, and mode (the +.BR \-p +option) all probably require appropriate privileges. +.P +In +.BR read +mode, implementations are permitted to overwrite files when the archive +has multiple members with the same name. This may fail if permissions +on the first version of the file do not permit it to be overwritten. +.P +The +.BR cpio +and +.BR ustar +formats can only support files up to 8\|589\|934\|592 bytes +(8 \(** 2^30) in size. +.P +When archives containing binary header information are listed, the +filenames printed may cause strange behavior on some terminals. +.P +When all of the following are true: +.IP " 1." 4 +A file of type directory is being placed into an archive. +.IP " 2." 4 +The +.BR ustar +archive format is being used. +.IP " 3." 4 +The pathname of the directory is less than or equal to 155 bytes long +(it will fit in the +.IR prefix +field in the +.BR ustar +header block). +.IP " 4." 4 +The last component of the pathname of the directory is longer than 100 +bytes (it will not fit in the +.IR name +field in the +.BR ustar +header block). +.P +some implementations of the +.IR pax +utility will place the entire directory pathname in the +.IR prefix +field, set the +.IR name +field to an empty string, and place the directory in the archive. 
+Other implementations of the +.IR pax +utility will give an error under these conditions because the +.IR name +field is not large enough to hold the last component of the directory name. +This standard allows either behavior. However, when extracting a directory +from a +.BR ustar +format archive, this standard requires that all implementations be able +to extract a directory even if the +.IR name +field contains an empty string as long as the +.IR prefix +field does not also contain an empty string. +.SH EXAMPLES +The following command: +.sp +.RS 4 +.nf + +pax -w -f /dev/rmt/1m . +.fi +.P +.RE +.P +copies the contents of the current directory to tape drive 1, medium +density (assuming historical System V device naming procedures\(emthe +historical BSD device name would be +.BR /dev/rmt9 ). +.P +The following commands: +.sp +.RS 4 +.nf + +mkdir \fInewdir\fR +pax -rw \fIolddir newdir\fR +.fi +.P +.RE +.P +copy the +.IR olddir +directory hierarchy to +.IR newdir . +.sp +.RS 4 +.nf + +pax -r -s \(aq,\(ha//*usr//*,,\(aq -f a.pax +.fi +.P +.RE +.P +reads the archive +.BR a.pax , +with all files rooted in +.BR /usr +in the archive extracted relative to the current directory. +.P +Using the option: +.sp +.RS 4 +.nf + +-o listopt="%M %(atime)T %(size)D %(name)s" +.fi +.P +.RE +.P +overrides the default output description in Standard Output and instead +writes: +.sp +.RS 4 +.nf + +-rw-rw--- Jan 12 15:53 2003 1492 /usr/foo/bar +.fi +.P +.RE +.P +Using the options: +.sp +.RS 4 +.nf + +-o listopt=\(aq%L\et%(size)D\en%.7\(aq \e +-o listopt=\(aq(name)s\en%(atime)T\en%T\(aq +.fi +.P +.RE +.P +overrides the default output description in Standard Output and instead +writes: +.sp +.RS 4 +.nf + +/usr/foo/bar -> /tmp 1492 +/usr/fo +Jan 12 15:53 1991 +Jan 31 15:53 2003 +.fi +.P +.RE +.SH RATIONALE +The +.IR pax +utility was new for the ISO\ POSIX\(hy2:\|1993 standard. It represents a peaceful +compromise between advocates of the historical +.IR tar +and +.IR cpio +utilities. 
+.P +A fundamental difference between +.IR cpio +and +.IR tar +was in the way directories were treated. The +.IR cpio +utility did not treat directories differently from other files, and to +select a directory and its contents required that each file in the +hierarchy be explicitly specified. For +.IR tar , +a directory matched every file in the file hierarchy it rooted. +.P +The +.IR pax +utility offers both interfaces; by default, directories map into the +file hierarchy they root. The +.BR \-d +option causes +.IR pax +to skip any file not explicitly referenced, as +.IR cpio +historically did. The +.IR tar +.BR \- \c +.IR style +behavior was chosen as the default because it was believed that this +was the more common usage and because +.IR tar +is the more commonly available interface, as it was historically +provided on both System V and BSD implementations. +.P +The data interchange format specification in this volume of POSIX.1\(hy2017 requires that +processes with ``appropriate privileges'' shall always restore the +ownership and permissions of extracted files exactly as archived. If +viewed from the historic equivalence between superuser and +``appropriate privileges'', there are two problems +with this requirement. First, users running as superusers may +unknowingly set dangerous permissions on extracted files. Second, it is +needlessly limiting, in that superusers cannot extract files and own +them as superuser unless the archive was created by the superuser. (It +should be noted that restoration of ownerships and permissions for the +superuser, by default, is historical practice in +.IR cpio , +but not in +.IR tar .) +In order to avoid these two problems, the +.IR pax +specification has an additional ``privilege'' mechanism, the +.BR \-p +option. 
Only a +.IR pax +invocation with the privileges needed, and which has the +.BR \-p +option set using the +.BR e +specification character, has appropriate privileges to restore +full ownership and permission information. +.P +Note also that this volume of POSIX.1\(hy2017 requires that the file ownership and access +permissions shall be set, on extraction, in the same fashion as the +\fIcreat\fR() +function when provided with the mode stored in the archive. This means +that the file creation mask of the user is applied to the file +permissions. +.P +Users should note that directories may be created by +.IR pax +while extracting files with permissions that are different from those +that existed at the time the archive was created. When extracting +sensitive information into a directory hierarchy that no longer exists, +users are encouraged to set their file creation mask appropriately to +protect these files during extraction. +.P +The table of contents output is written to standard output to +facilitate pipeline processing. +.P +An early proposal had hard links displaying for all pathnames. This +was removed because it complicates the output of the case where +.BR \-v +is not specified and does not match historical +.IR cpio +usage. The hard-link information is available in the +.BR \-v +display. +.P +The description of the +.BR \-l +option allows implementations to make hard links to symbolic links. +Earlier versions of this standard did not specify any way to create a +hard link to a symbolic link, but many implementations provided this +capability as an extension. If there are hard links to symbolic links +when an archive is created, the implementation is required to archive +the hard link in the archive (unless +.BR \-H +or +.BR \-L +is specified). When in +.BR read +mode and in +.BR copy +mode, implementations supporting hard links to symbolic links should +use them when appropriate. 
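As an illustration of the remark above that the file creation mask is applied to the archived mode on extraction (in the same fashion as creat()), the following Python sketch computes the permission bits a newly extracted file would receive. The function name and the sample values are hypothetical and chosen only for illustration. The sketch models the masking arithmetic alone, not ownership or the effect of the -p option.

```python
def mode_after_extraction(archived_mode: int, creation_mask: int) -> int:
    """Return the permission bits left after applying a file creation mask.

    Models only the creat()-style rule: bits set in the mask are cleared
    from the mode stored in the archive.
    """
    permission_bits = archived_mode & 0o777   # keep only rwx bits for user/group/other
    return permission_bits & ~creation_mask & 0o777


# Hypothetical example: a member archived as rw-rw-r-- extracted under umask 027.
print(oct(mode_after_extraction(0o664, 0o027)))
```

Under these assumptions a member archived with mode 664 is created with mode 640 when the mask is 027, which is why a restrictive mask protects sensitive files during extraction.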
+.P +The archive formats inherited from the POSIX.1\(hy1990 standard have certain restrictions +that have been brought along from historical usage. For example, there +are restrictions on the length of pathnames stored in the archive. +When +.IR pax +is used in +.BR copy (\c +.BR \-rw ) +mode (copying directory hierarchies), the ability to use extensions +from the +.BR \-x \c +.BR pax +format overcomes these restrictions. +.P +The default +.IR blocksize +value of 5\|120 bytes for +.IR cpio +was selected because it is one of the standard block-size values for +.IR cpio , +set when the +.BR \-B +option is specified. (The other default block-size value for +.IR cpio +is 512 bytes, and this was considered to be too small.) The default +block value of 10\|240 bytes for +.IR tar +was selected because that is the standard block-size value for BSD +.IR tar . +The maximum block size of 32\|256 bytes (2\s-3\u15\d\s+3\-512 bytes) +is the largest multiple of 512 bytes that fits into a signed 16-bit +tape controller transfer register. There are known limitations in some +historical systems that would prevent larger blocks from being +accepted. Historical values were chosen to improve compatibility with +historical scripts using +.IR dd +or similar utilities to manipulate archives. Also, default block sizes +for any file type other than character special file have been deleted +from this volume of POSIX.1\(hy2017 as unimportant and not likely to affect the structure of the +resulting archive. +.P +Implementations are permitted to modify the block-size value based on +the archive format or the device to which the archive is being +written. This is to provide implementations with the opportunity to +take advantage of special types of devices, and it should not be used +without a great deal of consideration as it almost certainly decreases +archive portability. 
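The block-size figures quoted above can be checked with a little arithmetic. The Python sketch below is only a worked check, not part of any pax implementation. The helper name records_per_block is hypothetical. The constants are taken from the text: 512-byte logical records, defaults of 5120 and 10240 bytes, and a signed 16-bit transfer register.

```python
LOGICAL_RECORD = 512        # bytes in one logical record
INT16_MAX = 2**15 - 1       # largest value a signed 16-bit register can hold

# Largest multiple of 512 that does not exceed the signed 16-bit maximum.
max_blocksize = (INT16_MAX // LOGICAL_RECORD) * LOGICAL_RECORD


def records_per_block(blocksize: int) -> int:
    """Return the number of 512-byte logical records in one physical block."""
    if blocksize <= 0 or blocksize % LOGICAL_RECORD != 0:
        raise ValueError("blocksize must be a positive multiple of 512")
    return blocksize // LOGICAL_RECORD


print(max_blocksize)               # maximum block size in bytes
print(records_per_block(5120))     # cpio default
print(records_per_block(10240))    # tar default
```

The computed maximum equals 2^15 - 512, matching the 32256-byte limit stated in the rationale. The two defaults correspond to 10 and 20 logical records per block.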
+.P +The intended use of the +.BR \-n +option was to permit extraction of one or more files from the archive +without processing the entire archive. This was viewed by the standard +developers as offering significant performance advantages over +historical implementations. The +.BR \-n +option in early proposals had three effects; the first was to cause +special characters in patterns to not be treated specially. The second +was to cause only the first file that matched a pattern to be +extracted. The third was to cause +.IR pax +to write a diagnostic message to standard error when no file was found +matching a specified pattern. Only the second behavior is retained by +this volume of POSIX.1\(hy2017, for many reasons. First, it is in general not acceptable for a +single option to have multiple effects. Second, the ability to make +pattern matching characters act as normal characters +is useful for parts of +.IR pax +other than file extraction. Third, a finer degree of control over the +special characters is useful because users may wish to normalize only a +single special character in a single filename. Fourth, given a more +general escape mechanism, the previous behavior of the +.BR \-n +option can be easily obtained using the +.BR \-s +option or a +.IR sed +script. Finally, writing a diagnostic message when a pattern specified +by the user is unmatched by any file is useful behavior in all cases. +.P +In this version, the +.BR \-n +was removed from the +.BR copy +mode synopsis of +.IR pax ; +it is inapplicable because there are no pattern operands specified in +this mode. +.P +There is another method than +.IR pax +for copying subtrees in POSIX.1\(hy2008 described as part of the +.IR cp +utility. Both methods are historical practice: +.IR cp +provides a simpler, more intuitive interface, while +.IR pax +offers a finer granularity of control. 
Each provides additional +functionality to the other; in particular, +.IR pax +maintains the hard-link structure of the hierarchy while +.IR cp +does not. It is the intention of the standard developers that the +results be similar (using appropriate option combinations in both +utilities). The results are not required to be identical; there seemed +insufficient gain to applications to balance the difficulty of +implementations having to guarantee that the results would be exactly +identical. +.P +A single archive may span more than one file. It is suggested that +implementations provide informative messages to the user on standard +error whenever the archive file is changed. +.P +The +.BR \-d +option (do not create intermediate directories not listed in the +archive) found in early proposals was originally provided as a +complement to the historic +.BR \-d +option of +.IR cpio . +It has been deleted. +.P +The +.BR \-s +option in early proposals specified a subset of the substitution +command from the +.IR ed +utility. As there was no reason for only a subset to be supported, the +.BR \-s +option is now compatible with the current +.IR ed +specification. Since the delimiter can be any non-null character, the +following usage with single +<space> +characters is valid: +.sp +.RS 4 +.nf + +pax -s " foo bar " ... +.fi +.P +.RE +.P +The +.BR \-t +description is worded so as to note that this may cause the access time +update caused by some other activity (which occurs while the file is +being read) to be overwritten. +.P +The default behavior of +.IR pax +with regard to file modification times is the same as historical +implementations of +.IR tar . +It is not the historical behavior of +.IR cpio . +.P +Because the +.BR \-i +option uses +.BR /dev/tty , +utilities without a controlling terminal are not able to use this +option. 
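Returning to the -s delimiter rule described earlier in this part of the rationale: because the first character of the replacement string acts as the delimiter, a string such as " foo bar " splits cleanly into an old pattern and a new string. The Python sketch below illustrates only that splitting step. The function name is hypothetical. It assumes the delimiter is not escaped inside the pattern or replacement, and it makes no attempt to interpret the ed-style regular expression itself.

```python
from typing import Tuple


def split_replstr(replstr: str) -> Tuple[str, str, str]:
    """Split '<d>old<d>new<d>[flags]' where <d> is the first character.

    Simplifying assumption: the delimiter never appears, escaped or not,
    inside the 'old' or 'new' parts.
    """
    if not replstr or replstr[0] == "\0":
        raise ValueError("delimiter must be a non-null character")
    delimiter = replstr[0]
    parts = replstr[1:].split(delimiter)
    if len(parts) != 3:
        raise ValueError("expected exactly three delimiters")
    old, new, flags = parts
    return old, new, flags


print(split_replstr(" foo bar "))         # space used as the delimiter
print(split_replstr(",^//*usr//*,,"))     # comma used as the delimiter
```

The second call uses the replacement string from the EXAMPLES section, where the new string is empty so that a leading /usr is stripped from member names.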
+.P +The +.BR \-y +option, found in early proposals, has been deleted because a line +containing a single +<period> +for the +.BR \-i +option has equivalent functionality. The special lines for the +.BR \-i +option (a single +<period> +and the empty line) are historical practice in +.IR cpio . +.P +In early drafts, a +.BR \-e \c +.IR charmap +option was included to increase portability of files between systems +using different coded character sets. This option was omitted because +it was apparent that consensus could not be formed for it. In this +version, the use of UTF\(hy8 should be an adequate substitute. +.P +The ISO\ POSIX\(hy2:\|1993 standard and ISO\ POSIX\(hy1 standard requirements for +.IR pax , +however, made it very difficult to create a single archive containing +files created using extended characters provided by different locales. +This version adds the +.BR hdrcharset +keyword to make it possible to archive files in these cases without +dropping files due to translation errors. +.P +Translating filenames and other attributes from a locale's encoding to +UTF\(hy8 and then back again can lose information, as the resulting +filename might not be byte-for-byte equivalent to the original. To +avoid this problem, users can specify the +.BR \-o +.BR hdrcharset=binary +option, which will cause the resulting archive to use binary +format for all names and attributes. Such archives are not portable +among hosts that use different native encodings (e.g., EBCDIC +\fIversus\fR ASCII-based encodings), but they will allow interchange +among the vast majority of POSIX file systems in practical use. Also, +the +.BR \-o +.BR hdrcharset=binary +option will cause +.IR pax +in +.BR copy +mode to behave more like other standard utilities such as +.IR cp . 
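One way the round trip described above can lose information is when a filename contains bytes that are not valid in the locale's encoding. The sketch below models that case only; `roundtrip` is a hypothetical helper, and the use of the `replace` error handler is an assumption made for illustration, not the behavior required of pax (which has its own `-o invalid=` actions).

```python
def roundtrip(raw_name: bytes, locale_encoding: str) -> bytes:
    """Translate a filename to UTF-8 for the archive and back on extraction."""
    # Assumption: undecodable bytes are replaced rather than rejected.
    text = raw_name.decode(locale_encoding, errors="replace")
    archived = text.encode("utf-8")         # what a UTF-8 header would hold
    restored = archived.decode("utf-8").encode(locale_encoding, errors="replace")
    return restored


if __name__ == "__main__":
    name = b"caf\xe9"                       # valid ISO 8859-1, invalid UTF-8
    print(roundtrip(name, "latin-1") == name)   # name survives the round trip
    print(roundtrip(name, "utf-8") == name)     # name is altered
```

Storing the name bytes untouched, as `hdrcharset=binary` does, sidesteps the translation step entirely, at the cost of portability between hosts with different native encodings.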
+.P +If the values specified by the +.BR \-o +.BR exthdr.name=value , +.BR \-o +.BR globexthdr.name=value , +or by +.BR $TMPDIR +(if +.BR \-o +.BR globexthdr.name +is not specified) require a character encoding other than that +described in the ISO/IEC\ 646:\|1991 standard, a +.BR path +extended header record will have to be created for the file. If a +.BR hdrcharset +extended header record is active for such headers, it will determine +the codeset used for the value field in these extended +.BR path +header records. These +.BR path +extended header records always need to be created when writing an +archive even if +.BR hdrcharset=binary +has been specified and would contain the same (binary) data that +appears in the +.BR ustar +header record prefix and +.IR name +fields. (In other words, an extended header +.BR path +record is always required to be generated if the +.IR prefix +or +.IR name +fields contain non-ASCII characters even when +.BR hdrcharset=binary +is also in effect for that file.) +.P +The +.BR \-k +option was added to address international concerns about the dangers +involved in the character set transformations of +.BR \-e +(if the target character set were different from the source, the +filenames might be transformed into names matching existing files) and +also was made more general to protect files transferred between file +systems with different +{NAME_MAX} +values (truncating a filename on a smaller system might also +inadvertently overwrite existing files). As stated, it prevents any +overwriting, even if the target file is older than the source. This +version adds more granularity of options to solve this problem by +introducing the +.BR \-o \c +.BR invalid=option \c +\(emspecifically the +.BR UTF\(hy8 +and +.BR binary +actions. (Note that an existing file is still subject to overwriting in +this case. The +.BR \-k +option closes that loophole.) 
+.P +Some of the file characteristics referenced in this volume of POSIX.1\(hy2017 might not be +supported by some archive formats. For example, neither the +.BR tar +nor +.BR cpio +formats contain the file access time. For this reason, the +.BR e +specification character has been provided, intended to cause all file +characteristics specified in the archive to be retained. +.P +It is required that extracted directories, by default, have their +access and modification times and permissions set to the values +specified in the archive. This has obvious problems in that the +directories are almost certainly modified after being extracted and +that directory permissions may not permit file creation. One possible +solution is to create directories with the mode specified in the +archive, as modified by the +.IR umask +of the user, with sufficient permissions to allow file creation. After +all files have been extracted, +.IR pax +would then reset the access and modification times and permissions as +necessary. +.P +The list-mode formatting description borrows heavily from the one +defined by the +.IR printf +utility. However, since there is no separate operand list to get +conversion arguments, the format was extended to allow specifying the +name of the conversion argument as part of the conversion +specification. +.P +The +.BR T +conversion specifier allows time fields to be displayed in any of +the date formats. Unlike the +.IR ls +utility, +.IR pax +does not adjust the format when the date is less than six months in the +past. This makes parsing the output more predictable. +.P +The +.BR D +conversion specifier handles the ability to display the major/minor +or file size, as with +.IR ls , +by using \fR%\-8(\fIsize\fR)D\fR. +.P +The +.BR L +conversion specifier handles the +.IR ls +display for symbolic links. +.P +Conversion specifiers were added to generate existing known types used +for +.IR ls . 
+.SS "pax Interchange Format" +.P +The new POSIX data interchange format was developed primarily to +satisfy international concerns that the +.BR ustar +and +.BR cpio +formats did not provide for file, user, and group names encoded in +characters outside a subset of the ISO/IEC\ 646:\|1991 standard. The standard developers +realized that this new POSIX data interchange format should be very +extensible because there were other requirements they foresaw in the +near future: +.IP " *" 4 +Support international character encodings and locale information +.IP " *" 4 +Support security information (ACLs, and so on) +.IP " *" 4 +Support future file types, such as realtime or contiguous files +.IP " *" 4 +Include data areas for implementation use +.IP " *" 4 +Support systems with words larger than 32 bits and timers with +subsecond granularity +.P +The following were not goals for this format because these are better +handled by separate utilities or are inappropriate for a portable +format: +.IP " *" 4 +Encryption +.IP " *" 4 +Compression +.IP " *" 4 +Data translation between locales and codesets +.IP " *" 4 +.IR inode +storage +.P +The format chosen to support the goals is an extension of the +.BR ustar +format. Of the two formats previously available, only the +.BR ustar +format was selected for extensions because: +.IP " *" 4 +It was easier to extend in an upwards-compatible way. It offered version +flags and header block type fields with room for future +standardization. The +.BR cpio +format, while possessing a more flexible file naming methodology, could +not be extended without breaking some theoretical implementation +or using a dummy filename that could be a legitimate filename. +.IP " *" 4 +Industry experience since the original ``\c +.IR tar +wars'' fought in developing the ISO\ POSIX\(hy1 standard has clearly been in favor of the +.BR ustar +format, which is generally the default output format selected for +.IR pax +implementations on new systems. 
+.P +The new format was designed with one additional goal in mind: +reasonable behavior when an older +.IR tar +or +.IR pax +utility happened to read an archive. Since the POSIX.1\(hy1990 standard mandated that a +``format-reading utility'' had to treat unrecognized +.IR typeflag +values as regular files, this allowed the format to include all the +extended information in a pseudo-regular file that preceded each real +file. An option is given that allows the archive creator to set up +reasonable names for these files on the older systems. Also, the +normative text suggests that reasonable file access values be used for +this +.BR ustar +header block. Making these header files inaccessible for convenient +reading and deleting would not be reasonable. File permissions of 600 +or 700 are suggested. +.P +The +.BR ustar +.IR typeflag +field was used to accommodate the additional functionality of the new +format rather than magic or version because the POSIX.1\(hy1990 standard (and, by +reference, the previous version of +.IR pax ), +mandated the behavior of the format-reading utility when it encountered +an unknown +.IR typeflag , +but was silent about the other two fields. +.P +Early proposals for the first version of this standard contained a proposed +archive format that was based on compatibility with the standard for +tape files (ISO\ 1001, similar to the format used historically on many +mainframes and minicomputers). This format was overly complex and required +considerable overhead in volume and header records. Furthermore, the +standard developers felt that it would not be acceptable to the community +of POSIX developers, so it was later changed to be a format more closely +related to historical practice on POSIX systems. +.P +The prefix and name split of pathnames in +.BR ustar +was replaced by the single path extended header record for simplicity. +.P +The concept of a global extended header (\c +.IR typeflag \c +.BR g ) +was controversial. 
If this were applied to an archive being recorded on +magnetic tape, a few unreadable blocks at the beginning of the tape +could be a serious problem; a utility attempting to extract as many +files as possible from a damaged archive could lose a large percentage +of file header information in this case. However, if the archive were +on a reliable medium, such as a CD\(hyROM, the global extended header +offers considerable potential size reductions by eliminating redundant +information. Thus, the text warns against using the global method for +unreliable media and provides a method for implanting global +information in the extended header for each file, rather than in the +.IR typeflag +.BR g +records. +.P +No facility for data translation or filtering on a per-file basis is +included because the standard developers could not invent an interface +that would allow this in an efficient manner. If a filter, such as +encryption or compression, is to be applied to all the files, it is +more efficient to apply the filter to the entire archive as a single +file. The standard developers considered interfaces that would invoke a +shell script for each file going into or out of the archive, but the +system overhead in this approach was considered to be too high. +.P +One such approach would be to have +.BR filter= +records that give a pathname for an executable. When the program is +invoked, the file and archive would be open for standard input/output +and all the header fields would be available as environment variables +or command-line arguments. The standard developers did discuss such +schemes, but they were omitted from POSIX.1\(hy2008 due to concerns about +excessive overhead. Also, the program itself would need to be in the +archive if it were to be used portably. +.P +There is currently no portable means of identifying the character +set(s) used for a file in the file system. 
Therefore, +.IR pax +has not been given a mechanism to generate charset records +automatically. The only portable means of doing this is for the user to +write the archive using the +.BR \-o \c +.BR charset=string +command line option. This assumes that all of the files in the archive +use the same encoding. The ``implementation-defined'' text is +included to allow for a system that can identify the encodings used for +each of its files. +.P +The table of standards that accompanies the charset record description +is acknowledged to be very limited. Only a limited number of character +set standards is reasonable for maximal interchange. Any character set +is, of course, possible by prior agreement. It was suggested that +EBCDIC be listed, but it was omitted because it is not defined by a +formal standard. Formal standards, and then only those with reasonably +large followings, can be included here, simply as a matter of +practicality. The <\fIvalue\fP>s represent names of officially +registered character sets in the format required by the ISO\ 2375:\|1985 standard. +.P +The normal +<comma> +or +<blank>-separated +list rules are not followed in the case of keyword options to allow +ease of argument parsing for +.IR getopts . +.P +Further information on character encodings is in +.IR "pax Archive Character Set Encoding/Decoding". +.P +The standard developers have reserved keyword name space for vendor +extensions. It is suggested that the format to be used is: +.sp +.RS 4 +.nf + +\fIVENDOR.keyword\fR +.fi +.P +.RE +.P +where +.IR VENDOR +is the name of the vendor or organization in all uppercase letters. It +is further suggested that the keyword following the +<period> +be named differently than any of the standard keywords so that it could +be used for future standardization, if appropriate, by omitting the +.IR VENDOR +prefix. 
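A vendor keyword of this form would be carried in an extended header record like any other keyword. The following is a minimal sketch, assuming the `<length> <keyword>=<value>` newline-terminated record layout of the pax interchange format, in which the decimal length counts the whole record including its own digits. The helper names and the `ACME.colour` keyword are hypothetical.

```python
def exthdr_record(keyword: str, value: str) -> bytes:
    """Build one newline-terminated '<length> <keyword>=<value>' record."""
    body = f" {keyword}={value}\n".encode("utf-8")
    length = len(body)
    # The length field counts its own decimal digits, so iterate to a fixed point.
    while len(str(length)) + len(body) != length:
        length = len(str(length)) + len(body)
    return str(length).encode("ascii") + body


def iter_records(data: bytes):
    """Step through concatenated records using only the length fields."""
    pos = 0
    while pos < len(data):
        sp = data.index(b" ", pos)
        length = int(data[pos:sp])
        record = data[sp + 1 : pos + length - 1]    # drop the trailing newline
        keyword, _, value = record.partition(b"=")
        yield keyword.decode("utf-8"), value
        pos += length                               # skip even unknown keywords


if __name__ == "__main__":
    block = exthdr_record("mtime", "1350244992.023960108")
    block += exthdr_record("ACME.colour", "red")    # hypothetical vendor keyword
    for keyword, value in iter_records(block):
        print(keyword, value)
```

Because the reader advances by the length field alone, it can step past a vendor keyword it does not recognize without interpreting the value.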
+.P +The <\fIlength\fP> field in the extended header record was included to +make it simpler to step through the records, even if a record contains +an unknown format (to a particular +.IR pax ) +with complex interactions of special characters. It also provides a +minor integrity checkpoint within the records to aid a program +attempting to recover files from a damaged archive. +.P +There are no extended header versions of the +.IR devmajor +and +.IR devminor +fields because the unspecified format +.BR ustar +header field should be sufficient. If they are not, vendor-specific +extended keywords (such as +.IR VENDOR.devmajor ) +should be used. +.P +Device and +.IR i -number +labeling of files was not adopted from +.IR cpio ; +files are interchanged strictly on a symbolic name basis, as in +.BR ustar . +.P +Just as with the +.BR ustar +format descriptions, the new format makes no special arrangements for +multi-volume archives. Each of the +.IR pax +archive types is assumed to be inside a single POSIX file and splitting +that file over multiple volumes (diskettes, tape cartridges, and so +on), processing their labels, and mounting each in the proper sequence +are considered to be implementation details that cannot be described +portably. +.P +The +.BR pax +format is intended for interchange, not only for backup on a single +(family of) systems. It is not as densely packed as might be possible +for backup: +.IP " *" 4 +It contains information as coded characters that could be coded in +binary. +.IP " *" 4 +It identifies extended records with name fields that could be omitted +in favor of a fixed-field layout. +.IP " *" 4 +It translates names into a portable character set and identifies +locale-related information, both of which are probably unnecessary for +backup. +.P +The requirements on restoring from an archive are slightly different +from the historical wording, allowing for non-monolithic privilege to +bring forward as much as possible. 
In particular, attributes such as +``high performance file'' might be broadly but not universally granted +while set-user-ID or +\fIchown\fR() +might be much more restricted. There is no implication in POSIX.1\(hy2008 that +the security information be honored after it is restored to the file +hierarchy, in spite of what might be improperly inferred by the silence +on that topic. That is a topic for another standard. +.P +Links are recorded in the fashion described here because a link can be +to any file type. It is desirable in general to be able to restore part +of an archive selectively and restore all of those files completely. If +the data is not associated with each link, it is not possible to do +this. However, the data associated with a file can be large, and when +selective restoration is not needed, this can be a significant burden. +The archive is structured so that files that have no associated data +can always be restored by the name of any link, and +the user may choose whether data is recorded with each instance of a +file that contains data. The format permits mixing of both types of +links in a single archive; this can be done for special needs, and +.IR pax +is expected to interpret such archives on input properly, despite the +fact that there is no +.IR pax +option that would force this mixed case on output. (When +.BR \-o +.BR linkdata +is used, the output must contain the duplicate data, but the +implementation is free to include it or omit it when +.BR \-o +.BR linkdata +is not used.) +.P +The time values are included as extended header records for those +implementations needing more than the eleven octal digits allowed by +the +.BR ustar +format. Portable file timestamps cannot be negative. If +.IR pax +encounters a file with a negative timestamp in +.BR copy +or +.BR write +mode, it can reject the file, substitute a non-negative timestamp, or +generate a non-portable timestamp with a leading +.BR '\-' .
+Even though some implementations can support finer file-time +granularities than seconds, the normative text requires support only +for seconds since the Epoch because the ISO\ POSIX\(hy1 standard states them that way. The +.BR ustar +format includes only +.IR mtime ; +the new format adds +.IR atime +and +.IR ctime +for symmetry. The +.IR atime +access time restored to the file system will be affected by the +.BR \-p +.BR a +and +.BR \-p +.BR e +options. The +.IR ctime +creation time (actually +.IR inode +modification time) is described with appropriate privileges so that +it can be ignored when writing to the file system. POSIX does not +provide a portable means to change file creation time. Nothing is +intended to prevent a non-portable implementation of +.IR pax +from restoring the value. +.P +The +.IR gid , +.IR size , +and +.IR uid +extended header records were included to allow expansion beyond the +sizes specified in the regular +.IR tar +header. New file system architectures are emerging that will exhaust +the 12-digit size field. There are probably not many systems requiring +more than 8 digits for user and group IDs, but the extended header +values were included for completeness, allowing overrides for all of +the decimal values in the +.IR tar +header. +.P +The standard developers intended to describe the effective results of +.IR pax +with regard to file ownerships and permissions; implementations are not +restricted in timing or sequencing the restoration of such, provided +the results are as specified. +.P +Much of the text describing the extended headers refers to use in ``\c +.BR write +or +.BR copy +modes''. The +.BR copy +mode references are due to the normative text: ``The effect of the +copy shall be as if the copied files were written to an archive file +and then subsequently extracted .\|.\|.''. 
There is certainly no way to +test whether +.IR pax +is actually generating the extended headers in +.BR copy +mode, but the effects must be as if it had. +.SS "pax Archive Character Set Encoding/Decoding" +.P +There is a need to exchange archives of files between systems of +different native codesets. Filenames, group names, and user names must +be preserved to the fullest extent possible when an archive is read on +the receiving platform. Translation of the contents of files is not +within the scope of the +.IR pax +utility. +.P +There will also be the need to represent characters that are not +available on the receiving platform. These unsupported characters +cannot be automatically folded to the local set of characters due to +the chance of collisions. This could result in overwriting previous +extracted files from the archive or pre-existing files on the system. +.P +For these reasons, the codeset used to represent characters within the +extended header records of the +.IR pax +archive must be sufficiently rich to handle all commonly used character +sets. The fields requiring translation include, at a minimum, +filenames, user names, group names, and link pathnames. Implementations +may wish to have localized extended keywords that use non-portable +characters. +.P +The standard developers considered the following options: +.IP " *" 4 +The archive creator specifies the well-defined name of the source +codeset. The receiver must then recognize the codeset name and perform +the appropriate translations to the destination codeset. +.IP " *" 4 +The archive creator includes within the archive the character mapping +table for the source codeset used to encode extended header records. +The receiver must then read the character mapping table and perform the +appropriate translations to the destination codeset. +.IP " *" 4 +The archive creator translates the extended header records in the +source codeset into a canonical form. 
The receiver must then perform +the appropriate translations to the destination codeset. +.P +The approach that incorporates the name of the source codeset poses the +problem of codeset name registration, and makes the archive useless to +.IR pax +archive decoders that do not recognize that codeset. +.P +Because parts of an archive may be corrupted, the standard developers +felt that including the character map of the source codeset was too +fragile. The loss of this one key component could result in making the +entire archive useless. (The difference between this and the global +extended header decision was that the latter has a +workaround\(emduplicating extended header records on unreliable +media\(embut this would be too burdensome for large character set +maps.) +.P +Both of the above approaches also put an undue burden on the +.IR pax +archive receiver to handle the cross-product of all source and +destination codesets. +.P +To simplify the translation from the source codeset to the canonical +form and from the canonical form to the destination codeset, the +standard developers decided that the internal representation should be +a stateless encoding. A stateless encoding is one where each codepoint +has the same meaning, without regard to the decoder being in a specific +state. An example of a stateful encoding would be the Japanese +Shift-JIS; an example of a stateless encoding would be the ISO/IEC\ 646:\|1991 standard +(equivalent to 7-bit ASCII). +.P +For these reasons, the standard developers decided to adopt a canonical +format for the representation of file information strings. The obvious, +well-endorsed candidate is the ISO/IEC\ 10646\(hy1:\|2000 standard (based in part on Unicode), which +can be used to represent the characters of virtually all standardized +character sets. The standard developers initially agreed upon using +UCS2 (16-bit Unicode) as the internal representation. 
This repertoire +of characters provides a sufficiently rich set to represent all +commonly-used codesets. +.P +However, the standard developers found that the 16-bit Unicode +representation had some problems. It forced the issue of standardizing +byte ordering. The 2-byte length of each character made the extended +header records twice as long for the case of strings coded entirely +from historical 7-bit ASCII. For these reasons, the standard developers +chose the UTF\(hy8 defined in the ISO/IEC\ 10646\(hy1:\|2000 standard. This multi-byte representation +encodes UCS2 or UCS4 characters reliably and deterministically, +eliminating the need for a canonical byte ordering. In addition, NUL +octets and other characters possibly confusing to POSIX file systems do +not appear, except to represent themselves. It was realized that +certain national codesets take up more space after the encoding, due to +their placement within the UCS range; it was felt that the usefulness +of the encoding of the names outweighs the disadvantage of size +increase for file, user, and group names. +.P +The encoding of UTF\(hy8 is as follows: +.sp +.RS 4 +.nf + +UCS4 Hex Encoding UTF-8 Binary Encoding +.P +00000000-0000007F 0xxxxxxx +00000080-000007FF 110xxxxx 10xxxxxx +00000800-0000FFFF 1110xxxx 10xxxxxx 10xxxxxx +00010000-001FFFFF 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx +00200000-03FFFFFF 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx +04000000-7FFFFFFF 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx +.fi +.P +.RE +.P +where each +.BR 'x' +represents a bit value from the character being translated. +.SS "ustar Interchange Format" +.P +The description of the +.BR ustar +format reflects numerous enhancements over pre-1988 versions of the +historical +.IR tar +utility. The goal of these changes was not only to provide the +functional enhancements desired, but also to retain compatibility +between new and old versions. This compatibility has been retained. 
+Archives written using the old archive format are compatible with the +new format. +.P +Implementors should be aware that the previous file format did not +include a mechanism to archive directory type files. For this reason, +the convention of using a filename ending with +<slash> +was adopted to specify a directory on the archive. +.P +The total size of the +.IR name +and +.IR prefix +fields has been set to meet the minimum requirements for +{PATH_MAX}. +If a pathname will fit within the +.IR name +field, it is recommended that the pathname be stored there without the +use of the +.IR prefix +field. Although the name field is known to be too small to contain +{PATH_MAX} +characters, the value was not changed in this version of the archive +file format to retain backwards-compatibility, and instead the prefix +was introduced. Also, because of the earlier version of the format, +there is no way to remove the restriction on the +.IR linkname +field being limited in size to just that of the +.IR name +field. +.P +The +.IR size +field is required to be meaningful in all implementation extensions, +although it could be zero. This is required so that the data blocks can +always be properly counted. +.P +It is suggested that, if device special files that cannot be represented +in the standard format need to be archived, one of the +extension types (\c +.BR A \(hy\c +.BR Z ) +be used, and that the additional information for the special file be +represented as data and be reflected in the +.IR size +field. +.P +Attempting to restore a special file type, where it is converted to +ordinary data and conflicts with an existing filename, need not be +specially detected by the utility. If run as an ordinary user, +.IR pax +should not be able to overwrite the entries in, for example, +.BR /dev +in any case (whether the file is converted to another type or not). If +run as a privileged user, it should be able to do so, and it would be +considered a bug if it did not.
The same is true of ordinary data files +and similarly named special files; it is impossible to anticipate the +needs of the user (who could really intend to overwrite the file), so +the behavior should be predictable (and thus regular) and rely on the +protection system as required. +.P +The value 7 in the +.IR typeflag +field is intended to define how contiguous files can be stored in a +.BR ustar +archive. POSIX.1\(hy2008 does not require the contiguous file extension, but does +define a standard way of archiving such files so that all conforming +systems can interpret these file types in a meaningful and consistent +manner. On a system that does not support extended file types, the +.IR pax +utility should do the best it can with the file and go on to the next. +.P +The file protection modes are those conventionally used by the +.IR ls +utility. This is extended beyond the usage in the ISO\ POSIX\(hy2 standard to support the +``shared text'' or ``sticky'' bit. It is intended that the conformance +document should not document anything beyond the existence of and +support of such a mode. Further extensions are expected to these bits, +particularly with overloading the set-user-ID and set-group-ID flags. +.SS "cpio Interchange Format" +.P +The reference to appropriate privileges in the +.BR cpio +format refers to an error on standard output; the +.BR ustar +format does not make comparable statements. +.P +The model for this format was the historical System V +.IR cpio \c +.BR \-c +data interchange format. This model documents the portable version of +the +.BR cpio +format and not the binary version. It has the flexibility to transfer +data of any type described within POSIX.1\(hy2008, yet is extensible to transfer +data types specific to extensions beyond POSIX.1\(hy2008 (for example, contiguous +files). Because it describes existing practice, there is no question of +maintaining upwards-compatibility. 
+.SS "cpio Header" +.P +There has been some concern that the size of the +.IR c_ino +field of the header is too small to handle those systems that have very +large +.IR inode +numbers. However, the +.IR c_ino +field in the header is used strictly as a hard-link resolution +mechanism for archives. It is not necessarily the same value as the +.IR inode +number of the file in the location from which that file is extracted. +.P +The name +.IR c_magic +is based on historical usage. +.SS "cpio Filename" +.P +For most historical implementations of the +.IR cpio +utility, +{PATH_MAX} +octets can be used to describe the pathname without the addition of +any other header fields (the NUL character would be included in this +count). +{PATH_MAX} +is the minimum value for pathname size, documented as 256 bytes. +However, an implementation may use +.IR c_namesize +to determine the exact length of the pathname. With the current +description of the +.IR <cpio.h> +header, this pathname size can be as large as a number that is +described in six octal digits. +.P +Two values are documented under the +.IR c_mode +field values to provide for extensibility for known file types: +.IP "\fB0110\ 000\fP" 10 +Reserved for contiguous files. The implementation may treat the rest of +the information for this archive like a regular file. If this file type +is undefined, the implementation may create the file as a regular +file. +.P +This provides for extensibility of the +.BR cpio +format while allowing for the ability to read old archives. Files of an +unknown type may be read as ``regular files'' on some implementations. +On a system that does not support extended file types, the +.IR pax +utility should do the best it can with the file and go on to the next. +.SH "FUTURE DIRECTIONS" +None. 
+.SH "SEE ALSO" +.IR "Chapter 2" ", " "Shell Command Language", +.IR "\fIcp\fR\^", +.IR "\fIed\fR\^", +.IR "\fIgetopts\fR\^", +.IR "\fIls\fR\^", +.IR "\fIprintf\fR\^" +.P +The Base Definitions volume of POSIX.1\(hy2017, +.IR "Section 3.169" ", " "File Mode Bits", +.IR "Chapter 5" ", " "File Format Notation", +.IR "Chapter 8" ", " "Environment Variables", +.IR "Section 12.2" ", " "Utility Syntax Guidelines", +.IR "\fB<cpio.h>\fP", +.IR "\fB<tar.h>\fP" +.P +The System Interfaces volume of POSIX.1\(hy2017, +.IR "\fIchown\fR\^(\|)", +.IR "\fIcreat\fR\^(\|)", +.IR "\fIfstatat\fR\^(\|)", +.IR "\fImkdir\fR\^(\|)", +.IR "\fImkfifo\fR\^(\|)", +.IR "\fIutime\fR\^(\|)", +.IR "\fIwrite\fR\^(\|)" +.\" +.SH COPYRIGHT +Portions of this text are reprinted and reproduced in electronic form +from IEEE Std 1003.1-2017, Standard for Information Technology +-- Portable Operating System Interface (POSIX), The Open Group Base +Specifications Issue 7, 2018 Edition, +Copyright (C) 2018 by the Institute of +Electrical and Electronics Engineers, Inc and The Open Group. +In the event of any discrepancy between this version and the original IEEE and +The Open Group Standard, the original IEEE and The Open Group Standard +is the referee document. The original Standard can be obtained online at +http://www.opengroup.org/unix/online.html . +.PP +Any typographical or formatting errors that appear +in this page are most likely +to have been introduced during the conversion of the source files to +man page format. To report such errors, see +https://www.kernel.org/doc/man-pages/reporting_bugs.html .