author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-15 19:43:11 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-15 19:43:11 +0000 |
commit | fc22b3d6507c6745911b9dfcc68f1e665ae13dbc (patch) | |
tree | ce1e3bce06471410239a6f41282e328770aa404a /upstream/archlinux/man1p/tr.1p | |
parent | Initial commit. (diff) | |
Adding upstream version 4.22.0. (upstream/4.22.0)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'upstream/archlinux/man1p/tr.1p')
-rw-r--r-- | upstream/archlinux/man1p/tr.1p | 695 |
1 files changed, 695 insertions, 0 deletions
diff --git a/upstream/archlinux/man1p/tr.1p b/upstream/archlinux/man1p/tr.1p new file mode 100644 index 00000000..253e3cde --- /dev/null +++ b/upstream/archlinux/man1p/tr.1p @@ -0,0 +1,695 @@ +'\" et +.TH TR "1P" 2017 "IEEE/The Open Group" "POSIX Programmer's Manual" +.\" +.SH PROLOG +This manual page is part of the POSIX Programmer's Manual. +The Linux implementation of this interface may differ (consult +the corresponding Linux manual page for details of Linux behavior), +or the interface may not be implemented on Linux. +.\" +.SH NAME +tr +\(em translate characters +.SH SYNOPSIS +.LP +.nf +tr \fB[\fR-c|-C\fB] [\fR-s\fB] \fIstring1 string2\fR +.P +tr -s \fB[\fR-c|-C\fB] \fIstring1\fR +.P +tr -d \fB[\fR-c|-C\fB] \fIstring1\fR +.P +tr -ds \fB[\fR-c|-C\fB] \fIstring1 string2\fR +.fi +.SH DESCRIPTION +The +.IR tr +utility shall copy the standard input to the standard output with +substitution or deletion of selected characters. The options specified +and the +.IR string1 +and +.IR string2 +operands shall control translations that occur while copying characters +and single-character collating elements. +.SH OPTIONS +The +.IR tr +utility shall conform to the Base Definitions volume of POSIX.1\(hy2017, +.IR "Section 12.2" ", " "Utility Syntax Guidelines". +.P +The following options shall be supported: +.IP "\fB\-c\fP" 10 +Complement the set of values specified by +.IR string1 . +See the EXTENDED DESCRIPTION section. +.IP "\fB\-C\fP" 10 +Complement the set of characters specified by +.IR string1 . +See the EXTENDED DESCRIPTION section. +.IP "\fB\-d\fP" 10 +Delete all occurrences of input characters that are specified by +.IR string1 . +.IP "\fB\-s\fP" 10 +Replace instances of repeated characters with a single character, as +described in the EXTENDED DESCRIPTION section. +.SH OPERANDS +The following operands shall be supported: +.IP "\fIstring1\fR,\ \fIstring2\fR" 10 +.br +Translation control strings. 
Each string shall represent a set of +characters to be converted into an array of characters used for the +translation. For a detailed description of how the strings are +interpreted, see the EXTENDED DESCRIPTION section. +.SH STDIN +The standard input can be any type of file. +.SH "INPUT FILES" +None. +.SH "ENVIRONMENT VARIABLES" +The following environment variables shall affect the execution of +.IR tr : +.IP "\fILANG\fP" 10 +Provide a default value for the internationalization variables that are +unset or null. (See the Base Definitions volume of POSIX.1\(hy2017, +.IR "Section 8.2" ", " "Internationalization Variables" +for the precedence of internationalization variables used to determine +the values of locale categories.) +.IP "\fILC_ALL\fP" 10 +If set to a non-empty string value, override the values of all the +other internationalization variables. +.IP "\fILC_COLLATE\fP" 10 +.br +Determine the locale for the behavior of range expressions and +equivalence classes. +.IP "\fILC_CTYPE\fP" 10 +Determine the locale for the interpretation of sequences of bytes of +text data as characters (for example, single-byte as opposed to +multi-byte characters in arguments) and the behavior of character +classes. +.IP "\fILC_MESSAGES\fP" 10 +.br +Determine the locale that should be used to affect the format and +contents of diagnostic messages written to standard error. +.IP "\fINLSPATH\fP" 10 +Determine the location of message catalogs for the processing of +.IR LC_MESSAGES . +.SH "ASYNCHRONOUS EVENTS" +Default. +.SH STDOUT +The +.IR tr +output shall be identical to the input, with the exception of the +specified transformations. +.SH STDERR +The standard error shall be used only for diagnostic messages. +.SH "OUTPUT FILES" +None. +.SH "EXTENDED DESCRIPTION" +The operands +.IR string1 +and +.IR string2 +(if specified) define two arrays of characters. The constructs in the +following list can be used to specify characters or single-character +collating elements. 
If any of the constructs result in multi-character +collating elements, +.IR tr +shall exclude, without a diagnostic, those multi-character elements +from the resulting array. +.IP "\fIcharacter\fR" 10 +Any character not described by one of the conventions below shall +represent itself. +.IP "\e\fIoctal\fR" 10 +Octal sequences can be used to represent characters with specific coded +values. An octal sequence shall consist of a +<backslash> +followed by the longest sequence of one, two, or three-octal-digit +characters (01234567). The sequence shall cause the value whose encoding +is represented by the one, two, or three-digit octal integer to be placed +into the array. Multi-byte characters require multiple, concatenated +escape sequences of this type, including the leading +<backslash> +for each byte. +.IP "\e\fIcharacter\fR" 10 +The +<backslash>-escape +sequences in the Base Definitions volume of POSIX.1\(hy2017, +.IR "Table 5-1" ", " "Escape Sequences and Associated Actions" +(\c +.BR '\e\e' , +.BR '\ea' , +.BR '\eb' , +.BR '\ef' , +.BR '\en' , +.BR '\er' , +.BR '\et' , +.BR '\ev' ) +shall be supported. The results of using any other character, other +than an octal digit, following the +<backslash> +are unspecified. Also, if there is no character following the +<backslash>, +the results are unspecified. +.IP "\fIc\fR\-\fIc\fR" 10 +In the POSIX locale, this construct shall represent the range of +collating elements between the range endpoints (as long as neither +endpoint is an octal sequence of the form \e\fIoctal\fP), inclusive, as +defined by the collation sequence. The characters or collating elements +in the range shall be placed in the array in ascending collation +sequence. If the second endpoint precedes the starting endpoint in the +collation sequence, it is unspecified whether the range of collating +elements is empty, or this construct is treated as invalid. In locales +other than the POSIX locale, this construct has unspecified behavior. 
+.RS 10 +.P +If either or both of the range endpoints are octal sequences of the +form \e\fIoctal\fP, this shall represent the range of specific coded +values between the two range endpoints, inclusive. +.RE +.IP "[:\fIclass\fR:]" 10 +Represents all characters belonging to the defined character class, as +defined by the current setting of the +.IR LC_CTYPE +locale category. The following character class names shall be accepted +when specified in +.IR string1 : +.TS +tab(@); +lB lB lB lB lB lB. +alnum@blank@digit@lower@punct@upper +alpha@cntrl@graph@print@space@xdigit +.TE +.RS 10 +.P +In addition, character class expressions of the form [:\c +.IR name :] +shall be recognized in those locales where the +.IR name +keyword has been given a +.BR charclass +definition in the +.IR LC_CTYPE +category. +.P +When both the +.BR \-d +and +.BR \-s +options are specified, any of the character class names shall be +accepted in +.IR string2 . +Otherwise, only character class names +.BR lower +or +.BR upper +are valid in +.IR string2 +and then only if the corresponding character class (\c +.BR upper +and +.BR lower , +respectively) is specified in the same relative position in +.IR string1 . +Such a specification shall be interpreted as a request for case +conversion. When [:\c +.IR lower :] +appears in +.IR string1 +and [:\c +.IR upper :] +appears in +.IR string2 , +the arrays shall contain the characters from the +.BR toupper +mapping in the +.IR LC_CTYPE +category of the current locale. When [:\c +.IR upper :] +appears in +.IR string1 +and [:\c +.IR lower :] +appears in +.IR string2 , +the arrays shall contain the characters from the +.BR tolower +mapping in the +.IR LC_CTYPE +category of the current locale. The first character from each mapping +pair shall be in the array for +.IR string1 +and the second character from each mapping pair shall be in the array +for +.IR string2 +in the same relative position. 
+.P +Except for case conversion, the characters specified by a character +class expression shall be placed in the array in an unspecified order. +.P +If the name specified for +.IR class +does not define a valid character class in the current locale, the +behavior is undefined. +.RE +.IP "[=\fIequiv\fR=]" 10 +Represents all characters or collating elements belonging to the same +equivalence class as +.IR equiv , +as defined by the current setting of the +.IR LC_COLLATE +locale category. An equivalence class expression shall be allowed only +in +.IR string1 , +or in +.IR string2 +when it is being used by the combined +.BR \-d +and +.BR \-s +options. The characters belonging to the equivalence class shall be +placed in the array in an unspecified order. +.IP "[\fIx\fR*\fIn\fR]" 10 +Represents +.IR n +repeated occurrences of the character +.IR x . +Because this expression is used to map multiple characters to one, it +is only valid when it occurs in +.IR string2 . +If +.IR n +is omitted or is zero, it shall be interpreted as large enough to +extend the +.IR string2 -based +sequence to the length of the +.IR string1 -based +sequence. If +.IR n +has a leading zero, it shall be interpreted as an octal value. +Otherwise, it shall be interpreted as a decimal value. +.P +When the +.BR \-d +option is not specified: +.IP " *" 4 +If +.IR string2 +is present, each input character found in the array specified by +.IR string1 +shall be replaced by the character in the same relative position in the +array specified by +.IR string2 . +If the array specified by +.IR string2 +is shorter that the one specified by +.IR string1 , +or if a character occurs more than once in +.IR string1 , +the results are unspecified. 
+.IP " *" 4 +If the +.BR \-C +option is specified, the complements of the characters specified by +.IR string1 +(the set of all characters in the current character set, as defined by +the current setting of +.IR LC_CTYPE , +except for those actually specified in the +.IR string1 +operand) shall be placed in the array in ascending collation sequence, +as defined by the current setting of +.IR LC_COLLATE . +.IP " *" 4 +If the +.BR \-c +option is specified, the complement of the values specified by +.IR string1 +shall be placed in the array in ascending order by binary value. +.IP " *" 4 +Because the order in which characters specified by character class +expressions or equivalence class expressions is undefined, such +expressions should only be used if the intent is to map several +characters into one. An exception is case conversion, as described +previously. +.P +When the +.BR \-d +option is specified: +.IP " *" 4 +Input characters found in the array specified by +.IR string1 +shall be deleted. +.IP " *" 4 +When the +.BR \-C +option is specified with +.BR \-d , +all characters except those specified by +.IR string1 +shall be deleted. The contents of +.IR string2 +are ignored, unless the +.BR \-s +option is also specified. +.IP " *" 4 +When the +.BR \-c +option is specified with +.BR \-d , +all values except those specified by +.IR string1 +shall be deleted. The contents of +.IR string2 +shall be ignored, unless the +.BR \-s +option is also specified. +.IP " *" 4 +The same string cannot be used for both the +.BR \-d +and the +.BR \-s +option; when both options are specified, both +.IR string1 +(used for deletion) and +.IR string2 +(used for squeezing) shall be required. +.P +When the +.BR \-s +option is specified, after any deletions or translations have taken +place, repeated sequences of the same character shall be replaced by +one occurrence of the same character, if the character is found in the +array specified by the last operand. 
If the last operand contains a +character class, such as the following example: +.sp +.RS 4 +.nf + +tr -s \(aq[:space:]\(aq +.fi +.P +.RE +.P +the last operand's array shall contain all of the characters in that +character class. However, in a case conversion, as described +previously, such as: +.sp +.RS 4 +.nf + +tr -s \(aq[:upper:]\(aq \(aq[:lower:]\(aq +.fi +.P +.RE +.P +the last operand's array shall contain only those characters defined as +the second characters in each of the +.BR toupper +or +.BR tolower +character pairs, as appropriate. +.P +An empty string used for +.IR string1 +or +.IR string2 +produces undefined results. +.SH "EXIT STATUS" +The following exit values shall be returned: +.IP "\00" 6 +All input was processed successfully. +.IP >0 6 +An error occurred. +.SH "CONSEQUENCES OF ERRORS" +Default. +.LP +.IR "The following sections are informative." +.SH "APPLICATION USAGE" +If necessary, +.IR string1 +and +.IR string2 +can be quoted to avoid pattern matching by the shell. +.P +If an ordinary digit (representing itself) is to follow an octal +sequence, the octal sequence must use the full three digits to avoid +ambiguity. +.P +When +.IR string2 +is shorter than +.IR string1 , +a difference results between historical System\ V and BSD systems. A +BSD system pads +.IR string2 +with the last character found in +.IR string2 . +Thus, it is possible to do the following: +.sp +.RS 4 +.nf + +tr 0123456789 d +.fi +.P +.RE +.P +which would translate all digits to the letter +.BR 'd' . +Since this area is specifically unspecified in this volume of POSIX.1\(hy2017, both the BSD and +System\ V behaviors are allowed, but a conforming application cannot rely +on the BSD behavior. It would have to code the example in the +following way: +.sp +.RS 4 +.nf + +tr 0123456789 \(aq[d*]\(aq +.fi +.P +.RE +.P +It should be noted that, despite similarities in appearance, the string +operands used by +.IR tr +are not regular expressions. 
+.P +Unlike some historical implementations, this definition of the +.IR tr +utility correctly processes NUL characters in its input stream. NUL +characters can be stripped by using: +.sp +.RS 4 +.nf + +tr -d \(aq\e000\(aq +.fi +.P +.RE +.SH EXAMPLES +.IP " 1." 4 +The following example creates a list of all words in +.BR file1 +one per line in +.BR file2 , +where a word is taken to be a maximal string of letters. +.RS 4 +.sp +.RS 4 +.nf + +tr -cs "[:alpha:]" "[\en*]" <file1 >file2 +.fi +.P +.RE +.RE +.IP " 2." 4 +The next example translates all lowercase characters in +.BR file1 +to uppercase and writes the results to standard output. +.RS 4 +.sp +.RS 4 +.nf + +tr "[:lower:]" "[:upper:]" <file1 +.fi +.P +.RE +.RE +.IP " 3." 4 +This example uses an equivalence class to identify accented variants of +the base character +.BR 'e' +in +.BR file1 , +which are stripped of diacritical marks and written to +.BR file2 . +.RS 4 +.sp +.RS 4 +.nf + +tr "[=e=]" "[e*]" <file1 >file2 +.fi +.P +.RE +.RE +.SH RATIONALE +In some early proposals, an explicit option +.BR \-n +was added to disable the historical behavior of stripping NUL +characters from the input. It was considered that automatically +stripping NUL characters from the input was not correct functionality. +However, the removal of +.BR \-n +in a later proposal does not remove the requirement that +.IR tr +correctly process NUL characters in its input stream. NUL characters +can be stripped by using +.IR tr +.BR \-d +\&\(aq\e000\(aq. +.P +Historical implementations of +.IR tr +differ widely in syntax and behavior. For example, the BSD version has +not needed the bracket characters for the repetition sequence. The +.IR tr +utility syntax is based more closely on the System V and XPG3 model +while attempting to accommodate historical BSD implementations. 
In the +case of the short +.IR string2 +padding, the decision was to unspecify the behavior and preserve System +V and XPG3 scripts, which might find difficulty with the BSD method. +The assumption was made that BSD users of +.IR tr +have to make accommodations to meet the syntax defined here. Since it +is possible to use the repetition sequence to duplicate the desired +behavior, whereas there is no simple way to achieve the System V +method, this was the correct, if not desirable, approach. +.P +The use of octal values to specify control characters, while having +historical precedents, is not portable. The introduction of escape +sequences for control characters should provide the necessary +portability. It is recognized that this may cause some historical +scripts to break. +.P +An early proposal included support for multi-character collating elements. +It was pointed out that, while +.IR tr +does employ some syntactical elements from REs, the aim of +.IR tr +is quite different; ranges, for example, do not have a similar meaning +(``any of the chars in the range matches'', \fIversus\fP ``translate +each character in the range to the output counterpart''). As a result, +the previously included support for multi-character collating elements +has been removed. What remains are ranges in current collation order +(to support, for example, accented characters), character classes, and +equivalence classes. +.P +In XPG3 the [:\c +.IR class :] +and [=\c +.IR equiv =] +conventions are shown with double brackets, as in RE syntax. However, +.IR tr +does not implement RE principles; it just borrows part of the syntax. +Consequently, [:\c +.IR class :] +and [=\c +.IR equiv =] +should be regarded as syntactical elements on a par with [\c +.IR x *\c +.IR n ], +which is not an RE bracket expression. 
+.P +The standard developers will consider changes to +.IR tr +that allow it to translate characters between different character +encodings, or they will consider providing a new utility to accomplish +this. +.P +On historical System V systems, a range expression requires enclosing +square-brackets, such as: +.sp +.RS 4 +.nf + +tr \(aq[a-z]\(aq \(aq[A-Z]\(aq +.fi +.P +.RE +.P +However, BSD-based systems did not require the brackets, and this +convention is used here to avoid breaking large numbers of BSD scripts: +.sp +.RS 4 +.nf + +tr a-z A-Z +.fi +.P +.RE +.P +The preceding System V script will continue to work because the +brackets, treated as regular characters, are translated to themselves. +However, any System V script that relied on +.BR \(dqa\(hyz\(dq +representing the three characters +.BR 'a' , +.BR '\-' , +and +.BR 'z' +have to be rewritten as +.BR \(dqaz-\(dq . +.P +The ISO\ POSIX\(hy2:\|1993 standard had a +.BR \-c +option that behaved similarly to the +.BR \-C +option, but did not supply functionality equivalent to the +.BR \-c +option specified in POSIX.1\(hy2008. +.P +The earlier version also said that octal sequences referred to +collating elements and could be placed adjacent to each other to +specify multi-byte characters. However, it was noted that this caused +ambiguities because +.IR tr +would not be able to tell whether adjacent octal sequences were +intending to specify multi-byte characters or multiple single byte +characters. POSIX.1\(hy2008 specifies that octal sequences always refer to single +byte binary values when used to specify an endpoint of a range of +collating elements. +.P +Earlier versions of this standard allowed for implementations with +bytes other than eight bits, but this has been modified in this +version. +.SH "FUTURE DIRECTIONS" +None. 
+.SH "SEE ALSO" +.IR "\fIsed\fR\^" +.P +The Base Definitions volume of POSIX.1\(hy2017, +.IR "Table 5-1" ", " "Escape Sequences and Associated Actions", +.IR "Chapter 8" ", " "Environment Variables", +.IR "Section 12.2" ", " "Utility Syntax Guidelines" +.\" +.SH COPYRIGHT +Portions of this text are reprinted and reproduced in electronic form +from IEEE Std 1003.1-2017, Standard for Information Technology +-- Portable Operating System Interface (POSIX), The Open Group Base +Specifications Issue 7, 2018 Edition, +Copyright (C) 2018 by the Institute of +Electrical and Electronics Engineers, Inc and The Open Group. +In the event of any discrepancy between this version and the original IEEE and +The Open Group Standard, the original IEEE and The Open Group Standard +is the referee document. The original Standard can be obtained online at +http://www.opengroup.org/unix/online.html . +.PP +Any typographical or formatting errors that appear +in this page are most likely +to have been introduced during the conversion of the source files to +man page format. To report such errors, see +https://www.kernel.org/doc/man-pages/reporting_bugs.html . |
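The portable idioms described in the APPLICATION USAGE and EXAMPLES sections of the added page can be tried with a short script. This is only a sketch: the function names `digits_to_d`, `words_per_line`, and `strip_nul` are hypothetical wrappers introduced here for illustration, and the sample inputs are made up. Only the `tr` invocations themselves come from the man page.

```shell
#!/bin/sh
# Hypothetical helper names; only the tr commands are taken from the page.

# Map every digit to 'd'. The [d*] form pads string2 to the length of
# string1, so this does not depend on BSD-style padding of a short string2.
digits_to_d() {
    tr 0123456789 '[d*]'
}

# Emit one word per line: every non-letter (-c complements string1) is
# translated to a newline, and -s squeezes runs of newlines into one.
words_per_line() {
    tr -cs '[:alpha:]' '[\n*]'
}

# Remove NUL bytes from the input stream.
strip_nul() {
    tr -d '\000'
}

printf 'room 101\n' | digits_to_d
printf 'two  words\n' | words_per_line
```

Quoting the operands keeps the shell from treating the brackets as a pattern, which is the precaution the APPLICATION USAGE section recommends.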