From 3d08cd331c1adcf0d917392f7e527b3f00511748 Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Fri, 24 May 2024 06:52:22 +0200 Subject: Merging upstream version 6.8. Signed-off-by: Daniel Baumann --- man/man7/glob.7 | 205 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 205 insertions(+) create mode 100644 man/man7/glob.7 (limited to 'man/man7/glob.7') diff --git a/man/man7/glob.7 b/man/man7/glob.7 new file mode 100644 index 0000000..a7546b4 --- /dev/null +++ b/man/man7/glob.7 @@ -0,0 +1,205 @@ +.\" Copyright (c) 1998 Andries Brouwer +.\" +.\" SPDX-License-Identifier: GPL-2.0-or-later +.\" +.\" 2003-08-24 fix for / by John Kristoff + joey +.\" +.TH glob 7 2024-05-02 "Linux man-pages (unreleased)" +.SH NAME +glob \- globbing pathnames +.SH DESCRIPTION +Long ago, in UNIX\ V6, there was a program +.I /etc/glob +that would expand wildcard patterns. +Soon afterward this became a shell built-in. +.P +These days there is also a library routine +.BR glob (3) +that will perform this function for a user program. +.P +The rules are as follows (POSIX.2, 3.13). +.SS Wildcard matching +A string is a wildcard pattern if it contains one of the +characters \[aq]?\[aq], \[aq]*\[aq], or \[aq][\[aq]. +Globbing is the operation +that expands a wildcard pattern into the list of pathnames +matching the pattern. +Matching is defined by: +.P +A \[aq]?\[aq] (not between brackets) matches any single character. +.P +A \[aq]*\[aq] (not between brackets) matches any string, +including the empty string. +.P +.B "Character classes" +.P +An expression "\fI[...]\fP" where the first character after the +leading \[aq][\[aq] is not an \[aq]!\[aq] matches a single character, +namely any of the characters enclosed by the brackets. +The string enclosed by the brackets cannot be empty; +therefore \[aq]]\[aq] can be allowed between the brackets, provided +that it is the first character. +(Thus, "\fI[][!]\fP" matches the +three characters \[aq][\[aq], \[aq]]\[aq], and \[aq]!\[aq].) +.P +.B Ranges +.P +There is one special convention: +two characters separated by \[aq]\-\[aq] denote a range. +(Thus, +"\fI[A\-Fa\-f0\-9]\fP" is equivalent to "\fI[ABCDEFabcdef0123456789]\fP".) +One may include \[aq]\-\[aq] in its literal meaning +by making it the first or last character between the brackets. +(Thus, +"\fI[]\-]\fP" matches just the two characters \[aq]]\[aq] and \[aq]\-\[aq], +and "\fI[\-\-0]\fP" matches the +three characters \[aq]\-\[aq], \[aq].\[aq], and \[aq]0\[aq], +since \[aq]/\[aq] cannot be matched.) +.P +.B Complementation +.P +An expression "\fI[!...]\fP" matches a single character, namely +any character that is not matched by the expression obtained +by removing the first \[aq]!\[aq] from it. +(Thus, "\fI[!]a\-]\fP" matches any +single character except \[aq]]\[aq], \[aq]a\[aq], and \[aq]\-\[aq].) +.P +One can remove the special meaning of \[aq]?\[aq], \[aq]*\[aq], and \[aq][\[aq] +by preceding them by a backslash, +or, +in case this is part of a shell command line, +enclosing them in quotes. +Between brackets these characters stand for themselves. +Thus, "\fI[[?*\e]\fP" matches the +four characters \[aq][\[aq], \[aq]?\[aq], \[aq]*\[aq], and \[aq]\e\[aq]. +.SS Pathnames +Globbing is applied on each of the components of a pathname +separately. +A \[aq]/\[aq] in a pathname cannot be matched by a \[aq]?\[aq] or \[aq]*\[aq] +wildcard, or by a range like "\fI[.\-0]\fP". +A range containing an explicit \[aq]/\[aq] character is syntactically incorrect. +(POSIX requires that syntactically incorrect patterns are left unchanged.) +.P +If a filename starts with a \[aq].\[aq], +this character must be matched explicitly. +(Thus, \fIrm\ *\fP will not remove .profile, and \fItar\ c\ *\fP will not +archive all your files; \fItar\ c\ .\fP is better.) +.SS Empty lists +The nice and simple rule given above: "expand a wildcard pattern +into the list of matching pathnames" was the original UNIX +definition. +It allowed one to have patterns that expand into +an empty list, as in +.P +.nf + xv \-wait 0 *.gif *.jpg +.fi +.P +where perhaps no *.gif files are present (and this is not +an error). +However, POSIX requires that a wildcard pattern is left +unchanged when it is syntactically incorrect, or the list of +matching pathnames is empty. +With +.I bash +one can force the classical behavior using this command: +.P +.in +4n +.EX +shopt \-s nullglob +.EE +.in +.\" In Bash v1, by setting allow_null_glob_expansion=true +.P +(Similar problems occur elsewhere. +For example, where old scripts have +.P +.in +4n +.EX +rm \`find . \-name "*\[ti]"\` +.EE +.in +.P +new scripts require +.P +.in +4n +.EX +rm \-f nosuchfile \`find . \-name "*\[ti]"\` +.EE +.in +.P +to avoid error messages from +.I rm +called with an empty argument list.) +.SH NOTES +.SS Regular expressions +Note that wildcard patterns are not regular expressions, +although they are a bit similar. +First of all, they match +filenames, rather than text, and secondly, the conventions +are not the same: for example, in a regular expression \[aq]*\[aq] means zero or +more copies of the preceding thing. +.P +Now that regular expressions have bracket expressions where +the negation is indicated by a \[aq]\[ha]\[aq], POSIX has declared the +effect of a wildcard pattern "\fI[\[ha]...]\fP" to be undefined. +.SS Character classes and internationalization +Of course ranges were originally meant to be ASCII ranges, +so that "\fI[\ \-%]\fP" stands for "\fI[\ !"#$%]\fP" and "\fI[a\-z]\fP" stands +for "any lowercase letter". +Some UNIX implementations generalized this so that a range X\-Y +stands for the set of characters with code between the codes for +X and for Y. +However, this requires the user to know the +character coding in use on the local system, and moreover, is +not convenient if the collating sequence for the local alphabet +differs from the ordering of the character codes. +Therefore, POSIX extended the bracket notation greatly, +both for wildcard patterns and for regular expressions. +In the above we saw three types of items that can occur in a bracket +expression: namely (i) the negation, (ii) explicit single characters, +and (iii) ranges. +POSIX specifies ranges in an internationally +more useful way and adds three more types: +.P +(iii) Ranges X\-Y comprise all characters that fall between X +and Y (inclusive) in the current collating sequence as defined +by the +.B LC_COLLATE +category in the current locale. +.P +(iv) Named character classes, like +.P +.nf +[:alnum:] [:alpha:] [:blank:] [:cntrl:] +[:digit:] [:graph:] [:lower:] [:print:] +[:punct:] [:space:] [:upper:] [:xdigit:] +.fi +.P +so that one can say "\fI[[:lower:]]\fP" instead of "\fI[a\-z]\fP", and have +things work in Denmark, too, where there are three letters past \[aq]z\[aq] +in the alphabet. +These character classes are defined by the +.B LC_CTYPE +category +in the current locale. +.P +(v) Collating symbols, like "\fI[.ch.]\fP" or "\fI[.a-acute.]\fP", +where the string between "\fI[.\fP" and "\fI.]\fP" is a collating +element defined for the current locale. +Note that this may +be a multicharacter element. +.P +(vi) Equivalence class expressions, like "\fI[=a=]\fP", +where the string between "\fI[=\fP" and "\fI=]\fP" is any collating +element from its equivalence class, as defined for the +current locale. +For example, "\fI[[=a=]]\fP" might be equivalent +to "\fI[a\('a\(`a\(:a\(^a]\fP", that is, +to "\fI[a[.a-acute.][.a-grave.][.a-umlaut.][.a-circumflex.]]\fP". +.SH SEE ALSO +.BR sh (1), +.BR fnmatch (3), +.BR glob (3), +.BR locale (7), +.BR regex (7) -- cgit v1.2.3