author Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-15 19:43:11 +0000
committer Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-15 19:43:11 +0000
commit fc22b3d6507c6745911b9dfcc68f1e665ae13dbc
tree ce1e3bce06471410239a6f41282e328770aa404a /upstream/archlinux/man1/perlrequick.1perl
parent Initial commit.
Adding upstream version 4.22.0. (upstream/4.22.0)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'upstream/archlinux/man1/perlrequick.1perl')
-rw-r--r-- upstream/archlinux/man1/perlrequick.1perl 651
1 files changed, 651 insertions, 0 deletions
diff --git a/upstream/archlinux/man1/perlrequick.1perl b/upstream/archlinux/man1/perlrequick.1perl
new file mode 100644
index 00000000..ebb81c45
--- /dev/null
+++ b/upstream/archlinux/man1/perlrequick.1perl
@@ -0,0 +1,651 @@
+.\" -*- mode: troff; coding: utf-8 -*-
+.\" Automatically generated by Pod::Man 5.01 (Pod::Simple 3.43)
+.\"
+.\" Standard preamble:
+.\" ========================================================================
+.de Sp \" Vertical space (when we can't use .PP)
+.if t .sp .5v
+.if n .sp
+..
+.de Vb \" Begin verbatim text
+.ft CW
+.nf
+.ne \\$1
+..
+.de Ve \" End verbatim text
+.ft R
+.fi
+..
+.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
+.ie n \{\
+. ds C` ""
+. ds C' ""
+'br\}
+.el\{\
+. ds C`
+. ds C'
+'br\}
+.\"
+.\" Escape single quotes in literal strings from groff's Unicode transform.
+.ie \n(.g .ds Aq \(aq
+.el .ds Aq '
+.\"
+.\" If the F register is >0, we'll generate index entries on stderr for
+.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
+.\" entries marked with X<> in POD. Of course, you'll have to process the
+.\" output yourself in some meaningful fashion.
+.\"
+.\" Avoid warning from groff about undefined register 'F'.
+.de IX
+..
+.nr rF 0
+.if \n(.g .if rF .nr rF 1
+.if (\n(rF:(\n(.g==0)) \{\
+. if \nF \{\
+. de IX
+. tm Index:\\$1\t\\n%\t"\\$2"
+..
+. if !\nF==2 \{\
+. nr % 0
+. nr F 2
+. \}
+. \}
+.\}
+.rr rF
+.\" ========================================================================
+.\"
+.IX Title "PERLREQUICK 1perl"
+.TH PERLREQUICK 1perl 2024-02-11 "perl v5.38.2" "Perl Programmers Reference Guide"
+.\" For nroff, turn off justification. Always turn off hyphenation; it makes
+.\" way too many mistakes in technical documents.
+.if n .ad l
+.nh
+.SH NAME
+perlrequick \- Perl regular expressions quick start
+.SH DESCRIPTION
+.IX Header "DESCRIPTION"
+This page covers the very basics of understanding, creating and
+using regular expressions ('regexes') in Perl.
+.SH "The Guide"
+.IX Header "The Guide"
+This page assumes you already know a few things, such as what a "pattern" is
+and the basic syntax for using one. If you don't, see perlretut.
+.SS "Simple word matching"
+.IX Subsection "Simple word matching"
+The simplest regex is simply a word, or more generally, a string of
+characters. A regex consisting of a word matches any string that
+contains that word:
+.PP
+.Vb 1
+\& "Hello World" =~ /World/; # matches
+.Ve
+.PP
+In this statement, \f(CW\*(C`World\*(C'\fR is a regex and the \f(CW\*(C`//\*(C'\fR enclosing
+\&\f(CW\*(C`/World/\*(C'\fR tells Perl to search a string for a match. The operator
+\&\f(CW\*(C`=~\*(C'\fR associates the string with the regex match and produces a true
+value if the regex matched, or false if the regex did not match. In
+our case, \f(CW\*(C`World\*(C'\fR matches the second word in \f(CW"Hello World"\fR, so the
+expression is true. This idea has several variations.
+.PP
+Expressions like this are useful in conditionals:
+.PP
+.Vb 1
+\& print "It matches\en" if "Hello World" =~ /World/;
+.Ve
+.PP
+The sense of the match can be reversed by using the \f(CW\*(C`!~\*(C'\fR operator:
+.PP
+.Vb 1
+\& print "It doesn\*(Aqt match\en" if "Hello World" !~ /World/;
+.Ve
+.PP
+The literal string in the regex can be replaced by a variable:
+.PP
+.Vb 2
+\& $greeting = "World";
+\& print "It matches\en" if "Hello World" =~ /$greeting/;
+.Ve
+.PP
+If you're matching against \f(CW$_\fR, the \f(CW\*(C`$_ =~\*(C'\fR part can be omitted:
+.PP
+.Vb 2
+\& $_ = "Hello World";
+\& print "It matches\en" if /World/;
+.Ve
+.PP
+Finally, the \f(CW\*(C`//\*(C'\fR default delimiters for a match can be changed to
+arbitrary delimiters by putting an \f(CW\*(Aqm\*(Aq\fR out front:
+.PP
+.Vb 4
+\& "Hello World" =~ m!World!; # matches, delimited by \*(Aq!\*(Aq
+\& "Hello World" =~ m{World}; # matches, note the matching \*(Aq{}\*(Aq
+\& "/usr/bin/perl" =~ m"/perl"; # matches after \*(Aq/usr/bin\*(Aq,
+\& # \*(Aq/\*(Aq becomes an ordinary char
+.Ve
+.PP
+Regexes must match a part of the string \fIexactly\fR in order for the
+statement to be true:
+.PP
+.Vb 3
+\& "Hello World" =~ /world/; # doesn\*(Aqt match, case sensitive
+\& "Hello World" =~ /o W/; # matches, \*(Aq \*(Aq is an ordinary char
+\& "Hello World" =~ /World /; # doesn\*(Aqt match, no \*(Aq \*(Aq at end
+.Ve
+.PP
+Perl will always match at the earliest possible point in the string:
+.PP
+.Vb 2
+\& "Hello World" =~ /o/; # matches \*(Aqo\*(Aq in \*(AqHello\*(Aq
+\& "That hat is red" =~ /hat/; # matches \*(Aqhat\*(Aq in \*(AqThat\*(Aq
+.Ve
+.PP
+Not all characters can be used 'as is' in a match. Some characters,
+called \fBmetacharacters\fR, are considered special, and reserved for use
+in regex notation. The metacharacters are
+.PP
+.Vb 1
+\& {}[]()^$.|*+?\e
+.Ve
+.PP
+A metacharacter can be matched literally by putting a backslash before
+it:
+.PP
+.Vb 4
+\& "2+2=4" =~ /2+2/; # doesn\*(Aqt match, + is a metacharacter
+\& "2+2=4" =~ /2\e+2/; # matches, \e+ is treated like an ordinary +
+\& \*(AqC:\eWIN32\*(Aq =~ /C:\e\eWIN/; # matches
+\& "/usr/bin/perl" =~ /\e/usr\e/bin\e/perl/; # matches
+.Ve
+.PP
+In the last regex, the forward slash \f(CW\*(Aq/\*(Aq\fR is also backslashed,
+because it is used to delimit the regex.
+.PP
+Most of the metacharacters aren't always special, and other characters
+(such as the ones delimiting the pattern) become special under various
+circumstances. This can be confusing and lead to unexpected results.
+\&\f(CW\*(C`use\ re\ \*(Aqstrict\*(Aq\*(C'\fR can notify you of potential
+pitfalls.
+.PP
+Non-printable ASCII characters are represented by \fBescape sequences\fR.
+Common examples are \f(CW\*(C`\et\*(C'\fR for a tab, \f(CW\*(C`\en\*(C'\fR for a newline, and \f(CW\*(C`\er\*(C'\fR
+for a carriage return. Arbitrary bytes are represented by octal
+escape sequences, e.g., \f(CW\*(C`\e033\*(C'\fR, or hexadecimal escape sequences,
+e.g., \f(CW\*(C`\ex1B\*(C'\fR:
+.PP
+.Vb 3
+\& "1000\et2000" =~ m(0\et2) # matches
+\& "cat" =~ /\e143\ex61\ex74/ # matches in ASCII, but
+\& # a weird way to spell cat
+.Ve
+.PP
+Regexes are treated mostly as double-quoted strings, so variable
+substitution works:
+.PP
+.Vb 3
+\& $foo = \*(Aqhouse\*(Aq;
+\& \*(Aqcathouse\*(Aq =~ /cat$foo/; # matches
+\& \*(Aqhousecat\*(Aq =~ /${foo}cat/; # matches
+.Ve
+.PP
+With all of the regexes above, if the regex matched anywhere in the
+string, it was considered a match. To specify \fIwhere\fR it should
+match, we would use the \fBanchor\fR metacharacters \f(CW\*(C`^\*(C'\fR and \f(CW\*(C`$\*(C'\fR. The
+anchor \f(CW\*(C`^\*(C'\fR means match at the beginning of the string and the anchor
+\&\f(CW\*(C`$\*(C'\fR means match at the end of the string, or before a newline at the
+end of the string. Some examples:
+.PP
+.Vb 5
+\& "housekeeper" =~ /keeper/; # matches
+\& "housekeeper" =~ /^keeper/; # doesn\*(Aqt match
+\& "housekeeper" =~ /keeper$/; # matches
+\& "housekeeper\en" =~ /keeper$/; # matches
+\& "housekeeper" =~ /^housekeeper$/; # matches
+.Ve
+.SS "Using character classes"
+.IX Subsection "Using character classes"
+A \fBcharacter class\fR allows a set of possible characters, rather than
+just a single character, to match at a particular point in a regex.
+There are a number of different types of character classes, but usually
+when people use this term, they are referring to the type described in
+this section, which are technically called "Bracketed character
+classes", because they are denoted by brackets \f(CW\*(C`[...]\*(C'\fR, with the set of
+characters to be possibly matched inside. But we'll drop the "bracketed"
+below to correspond with common usage. Here are some examples of
+(bracketed) character classes:
+.PP
+.Vb 3
+\& /cat/; # matches \*(Aqcat\*(Aq
+\& /[bcr]at/; # matches \*(Aqbat\*(Aq, \*(Aqcat\*(Aq, or \*(Aqrat\*(Aq
+\& "abc" =~ /[cab]/; # matches \*(Aqa\*(Aq
+.Ve
+.PP
+In the last statement, even though \f(CW\*(Aqc\*(Aq\fR is the first character in
+the class, the earliest point at which the regex can match is \f(CW\*(Aqa\*(Aq\fR.
+.PP
+.Vb 3
+\& /[yY][eE][sS]/; # match \*(Aqyes\*(Aq in a case\-insensitive way
+\& # \*(Aqyes\*(Aq, \*(AqYes\*(Aq, \*(AqYES\*(Aq, etc.
+\& /yes/i; # also match \*(Aqyes\*(Aq in a case\-insensitive way
+.Ve
+.PP
+The last example shows a match with an \f(CW\*(Aqi\*(Aq\fR \fBmodifier\fR, which makes
+the match case-insensitive.
+.PP
+Character classes also have ordinary and special characters, but the
+sets of ordinary and special characters inside a character class are
+different than those outside a character class. The special
+characters for a character class are \f(CW\*(C`\-]\e^$\*(C'\fR and are matched using an
+escape:
+.PP
+.Vb 5
+\& /[\e]c]def/; # matches \*(Aq]def\*(Aq or \*(Aqcdef\*(Aq
+\& $x = \*(Aqbcr\*(Aq;
+\& /[$x]at/; # matches \*(Aqbat\*(Aq, \*(Aqcat\*(Aq, or \*(Aqrat\*(Aq
+\& /[\e$x]at/; # matches \*(Aq$at\*(Aq or \*(Aqxat\*(Aq
+\& /[\e\e$x]at/; # matches \*(Aq\eat\*(Aq, \*(Aqbat\*(Aq, \*(Aqcat\*(Aq, or \*(Aqrat\*(Aq
+.Ve
+.PP
+The special character \f(CW\*(Aq\-\*(Aq\fR acts as a range operator within character
+classes, so that the unwieldy \f(CW\*(C`[0123456789]\*(C'\fR and \f(CW\*(C`[abc...xyz]\*(C'\fR
+become the svelte \f(CW\*(C`[0\-9]\*(C'\fR and \f(CW\*(C`[a\-z]\*(C'\fR:
+.PP
+.Vb 2
+\& /item[0\-9]/; # matches \*(Aqitem0\*(Aq or ... or \*(Aqitem9\*(Aq
+\& /[0\-9a\-fA\-F]/; # matches a hexadecimal digit
+.Ve
+.PP
+If \f(CW\*(Aq\-\*(Aq\fR is the first or last character in a character class, it is
+treated as an ordinary character.
+.PP
+The special character \f(CW\*(C`^\*(C'\fR in the first position of a character class
+denotes a \fBnegated character class\fR, which matches any character but
+those in the brackets. Both \f(CW\*(C`[...]\*(C'\fR and \f(CW\*(C`[^...]\*(C'\fR must match a
+character, or the match fails. Then
+.PP
+.Vb 4
+\& /[^a]at/; # doesn\*(Aqt match \*(Aqaat\*(Aq or \*(Aqat\*(Aq, but matches
+\& # all others: \*(Aqbat\*(Aq, \*(Aqcat\*(Aq, \*(Aq0at\*(Aq, \*(Aq%at\*(Aq, etc.
+\& /[^0\-9]/; # matches a non\-numeric character
+\& /[a^]at/; # matches \*(Aqaat\*(Aq or \*(Aq^at\*(Aq; here \*(Aq^\*(Aq is ordinary
+.Ve
+.PP
+Perl has several abbreviations for common character classes. (These
+definitions are those that Perl uses in ASCII-safe mode with the \f(CW\*(C`/a\*(C'\fR modifier.
+Otherwise they could match many more non-ASCII Unicode characters as
+well. See "Backslash sequences" in perlrecharclass for details.)
+.IP \(bu 4
+\&\ed is a digit and represents
+.Sp
+.Vb 1
+\& [0\-9]
+.Ve
+.IP \(bu 4
+\&\es is a whitespace character and represents
+.Sp
+.Vb 1
+\& [\e \et\er\en\ef]
+.Ve
+.IP \(bu 4
+\&\ew is a word character (alphanumeric or _) and represents
+.Sp
+.Vb 1
+\& [0\-9a\-zA\-Z_]
+.Ve
+.IP \(bu 4
+\&\eD is a negated \ed; it represents any character but a digit
+.Sp
+.Vb 1
+\& [^0\-9]
+.Ve
+.IP \(bu 4
+\&\eS is a negated \es; it represents any non-whitespace character
+.Sp
+.Vb 1
+\& [^\es]
+.Ve
+.IP \(bu 4
+\&\eW is a negated \ew; it represents any non-word character
+.Sp
+.Vb 1
+\& [^\ew]
+.Ve
+.IP \(bu 4
+The period '.' matches any character but "\en"
+.PP
+The \f(CW\*(C`\ed\es\ew\eD\eS\eW\*(C'\fR abbreviations can be used both inside and outside
+of character classes. Here are some in use:
+.PP
+.Vb 7
+\& /\ed\ed:\ed\ed:\ed\ed/; # matches a hh:mm:ss time format
+\& /[\ed\es]/; # matches any digit or whitespace character
+\& /\ew\eW\ew/; # matches a word char, followed by a
+\& # non\-word char, followed by a word char
+\& /..rt/; # matches any two chars, followed by \*(Aqrt\*(Aq
+\& /end\e./; # matches \*(Aqend.\*(Aq
+\& /end[.]/; # same thing, matches \*(Aqend.\*(Aq
+.Ve
+.PP
+The \fBword\ anchor\fR\ \f(CW\*(C`\eb\*(C'\fR matches a boundary between a word
+character and a non-word character \f(CW\*(C`\ew\eW\*(C'\fR or \f(CW\*(C`\eW\ew\*(C'\fR:
+.PP
+.Vb 4
+\& $x = "Housecat catenates house and cat";
+\& $x =~ /\ebcat/; # matches cat in \*(Aqcatenates\*(Aq
+\& $x =~ /cat\eb/; # matches cat in \*(AqHousecat\*(Aq
+\& $x =~ /\ebcat\eb/; # matches \*(Aqcat\*(Aq at end of string
+.Ve
+.PP
+In the last example, the end of the string is considered a word
+boundary.
+.PP
+For natural language processing (so that, for example, apostrophes are
+included in words), use instead \f(CW\*(C`\eb{wb}\*(C'\fR
+.PP
+.Vb 1
+\& "don\*(Aqt" =~ / .+? \eb{wb} /x; # matches the whole string
+.Ve
+.SS "Matching this or that"
+.IX Subsection "Matching this or that"
+We can match different character strings with the \fBalternation\fR
+metacharacter \f(CW\*(Aq|\*(Aq\fR. To match \f(CW\*(C`dog\*(C'\fR or \f(CW\*(C`cat\*(C'\fR, we form the regex
+\&\f(CW\*(C`dog|cat\*(C'\fR. As before, Perl will try to match the regex at the
+earliest possible point in the string. At each character position,
+Perl will first try to match the first alternative, \f(CW\*(C`dog\*(C'\fR. If
+\&\f(CW\*(C`dog\*(C'\fR doesn't match, Perl will then try the next alternative, \f(CW\*(C`cat\*(C'\fR.
+If \f(CW\*(C`cat\*(C'\fR doesn't match either, then the match fails and Perl moves to
+the next position in the string. Some examples:
+.PP
+.Vb 2
+\& "cats and dogs" =~ /cat|dog|bird/; # matches "cat"
+\& "cats and dogs" =~ /dog|cat|bird/; # matches "cat"
+.Ve
+.PP
+Even though \f(CW\*(C`dog\*(C'\fR is the first alternative in the second regex,
+\&\f(CW\*(C`cat\*(C'\fR is able to match earlier in the string.
+.PP
+.Vb 2
+\& "cats" =~ /c|ca|cat|cats/; # matches "c"
+\& "cats" =~ /cats|cat|ca|c/; # matches "cats"
+.Ve
+.PP
+At a given character position, the first alternative that allows the
+regex match to succeed will be the one that matches. Here, all the
+alternatives match at the first string position, so the first matches.
+.SS "Grouping things and hierarchical matching"
+.IX Subsection "Grouping things and hierarchical matching"
+The \fBgrouping\fR metacharacters \f(CW\*(C`()\*(C'\fR allow a part of a regex to be
+treated as a single unit. Parts of a regex are grouped by enclosing
+them in parentheses. The regex \f(CWhouse(cat|keeper)\fR means match
+\&\f(CW\*(C`house\*(C'\fR followed by either \f(CW\*(C`cat\*(C'\fR or \f(CW\*(C`keeper\*(C'\fR. Some more examples
+are
+.PP
+.Vb 2
+\& /(a|b)b/; # matches \*(Aqab\*(Aq or \*(Aqbb\*(Aq
+\& /(^a|b)c/; # matches \*(Aqac\*(Aq at start of string or \*(Aqbc\*(Aq anywhere
+\&
+\& /house(cat|)/; # matches either \*(Aqhousecat\*(Aq or \*(Aqhouse\*(Aq
+\& /house(cat(s|)|)/; # matches either \*(Aqhousecats\*(Aq or \*(Aqhousecat\*(Aq or
+\& # \*(Aqhouse\*(Aq. Note groups can be nested.
+\&
+\& "20" =~ /(19|20|)\ed\ed/; # matches the null alternative \*(Aq()\ed\ed\*(Aq,
+\& # because \*(Aq20\ed\ed\*(Aq can\*(Aqt match
+.Ve
+.SS "Extracting matches"
+.IX Subsection "Extracting matches"
+The grouping metacharacters \f(CW\*(C`()\*(C'\fR also allow the extraction of the
+parts of a string that matched. For each grouping, the part that
+matched inside goes into the special variables \f(CW$1\fR, \f(CW$2\fR, etc.
+They can be used just as ordinary variables:
+.PP
+.Vb 5
+\& # extract hours, minutes, seconds
+\& $time =~ /(\ed\ed):(\ed\ed):(\ed\ed)/; # match hh:mm:ss format
+\& $hours = $1;
+\& $minutes = $2;
+\& $seconds = $3;
+.Ve
+.PP
+In list context, a match \f(CW\*(C`/regex/\*(C'\fR with groupings will return the
+list of matched values \f(CW\*(C`($1,$2,...)\*(C'\fR. So we could rewrite it as
+.PP
+.Vb 1
+\& ($hours, $minutes, $seconds) = ($time =~ /(\ed\ed):(\ed\ed):(\ed\ed)/);
+.Ve
+.PP
+If the groupings in a regex are nested, \f(CW$1\fR gets the group with the
+leftmost opening parenthesis, \f(CW$2\fR the next opening parenthesis,
+etc. For example, here is a complex regex and the matching variables
+indicated below it:
+.PP
+.Vb 2
+\& /(ab(cd|ef)((gi)|j))/;
+\&  1  2      34
+.Ve
+.PP
+Associated with the matching variables \f(CW$1\fR, \f(CW$2\fR, ... are
+the \fBbackreferences\fR \f(CW\*(C`\eg1\*(C'\fR, \f(CW\*(C`\eg2\*(C'\fR, ... Backreferences are
+matching variables that can be used \fIinside\fR a regex:
+.PP
+.Vb 1
+\& /(\ew\ew\ew)\es\eg1/; # find sequences like \*(Aqthe the\*(Aq in string
+.Ve
+.PP
+\&\f(CW$1\fR, \f(CW$2\fR, ... should only be used outside of a regex, and \f(CW\*(C`\eg1\*(C'\fR,
+\&\f(CW\*(C`\eg2\*(C'\fR, ... only inside a regex.
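+.PP
+As a minimal sketch of a backreference in use (the sample string is
+made up for illustration), the following reports a doubled word:
+.PP
+.Vb 4
+\& $x = "Paris in the the spring";
+\& if ($x =~ /\eb(\ew+)\es+\eg1\eb/) { # \eg1 must repeat what group 1 matched
+\&     print "Doubled word: $1\en"; # prints "Doubled word: the"
+\& }
+.Ve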
+.SS "Matching repetitions"
+.IX Subsection "Matching repetitions"
+The \fBquantifier\fR metacharacters \f(CW\*(C`?\*(C'\fR, \f(CW\*(C`*\*(C'\fR, \f(CW\*(C`+\*(C'\fR, and \f(CW\*(C`{}\*(C'\fR allow us
+to determine the number of repeats of a portion of a regex we
+consider to be a match. Quantifiers are put immediately after the
+character, character class, or grouping that we want to specify. They
+have the following meanings:
+.IP \(bu 4
+\&\f(CW\*(C`a?\*(C'\fR = match 'a' 1 or 0 times
+.IP \(bu 4
+\&\f(CW\*(C`a*\*(C'\fR = match 'a' 0 or more times, i.e., any number of times
+.IP \(bu 4
+\&\f(CW\*(C`a+\*(C'\fR = match 'a' 1 or more times, i.e., at least once
+.IP \(bu 4
+\&\f(CW\*(C`a{n,m}\*(C'\fR = match at least \f(CW\*(C`n\*(C'\fR times, but not more than \f(CW\*(C`m\*(C'\fR
+times.
+.IP \(bu 4
+\&\f(CW\*(C`a{n,}\*(C'\fR = match at least \f(CW\*(C`n\*(C'\fR times
+.IP \(bu 4
+\&\f(CW\*(C`a{,n}\*(C'\fR = match \f(CW\*(C`n\*(C'\fR times or fewer
+.IP \(bu 4
+\&\f(CW\*(C`a{n}\*(C'\fR = match exactly \f(CW\*(C`n\*(C'\fR times
+.PP
+Here are some examples:
+.PP
+.Vb 6
+\& /[a\-z]+\es+\ed*/; # match a lowercase word, at least some space, and
+\& # any number of digits
+\& /(\ew+)\es+\eg1/; # match doubled words of arbitrary length
+\& $year =~ /^\ed{2,4}$/; # make sure year is at least 2 but not more
+\& # than 4 digits
+\& $year =~ /^\ed{ 4 }$|^\ed{2}$/; # better match; throw out 3 digit dates
+.Ve
+.PP
+These quantifiers will try to match as much of the string as possible,
+while still allowing the regex to match. So we have
+.PP
+.Vb 5
+\& $x = \*(Aqthe cat in the hat\*(Aq;
+\& $x =~ /^(.*)(at)(.*)$/; # matches,
+\& # $1 = \*(Aqthe cat in the h\*(Aq
+\& # $2 = \*(Aqat\*(Aq
+\& # $3 = \*(Aq\*(Aq (0 matches)
+.Ve
+.PP
+The first quantifier \f(CW\*(C`.*\*(C'\fR grabs as much of the string as possible
+while still having the regex match. The second quantifier \f(CW\*(C`.*\*(C'\fR has
+no string left to it, so it matches 0 times.
+.SS "More matching"
+.IX Subsection "More matching"
+There are a few more things you might want to know about matching
+operators.
+The global modifier \f(CW\*(C`/g\*(C'\fR allows the matching operator to match
+within a string as many times as possible. In scalar context,
+successive matches against a string will have \f(CW\*(C`/g\*(C'\fR jump from match
+to match, keeping track of position in the string as it goes along.
+You can get or set the position with the \f(CWpos()\fR function.
+For example,
+.PP
+.Vb 4
+\& $x = "cat dog house"; # 3 words
+\& while ($x =~ /(\ew+)/g) {
+\& print "Word is $1, ends at position ", pos $x, "\en";
+\& }
+.Ve
+.PP
+prints
+.PP
+.Vb 3
+\& Word is cat, ends at position 3
+\& Word is dog, ends at position 7
+\& Word is house, ends at position 13
+.Ve
+.PP
+A failed match or changing the target string resets the position. If
+you don't want the position reset after failure to match, add the
+\&\f(CW\*(C`/c\*(C'\fR modifier, as in \f(CW\*(C`/regex/gc\*(C'\fR.
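+.PP
+As a minimal sketch of \f(CW\*(C`/c\*(C'\fR at work (the string is made up for
+illustration):
+.PP
+.Vb 4
+\& $x = "cat dog";
+\& $x =~ /cat/g; # matches; pos($x) is now 3
+\& $x =~ /bird/gc; # fails, but /c leaves pos($x) at 3
+\& $x =~ /(\ew+)/g; # resumes from position 3; $1 is \*(Aqdog\*(Aq
+.Ve
+.PP
+Without \f(CW\*(C`/c\*(C'\fR on the second match, the failure would reset the
+position and the third match would find \f(CW\*(Aqcat\*(Aq\fR again.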
+.PP
+In list context, \f(CW\*(C`/g\*(C'\fR returns a list of matched groupings, or if
+there are no groupings, a list of matches to the whole regex. So
+.PP
+.Vb 4
+\& @words = ($x =~ /(\ew+)/g); # matches,
+\& # $words[0] = \*(Aqcat\*(Aq
+\& # $words[1] = \*(Aqdog\*(Aq
+\& # $words[2] = \*(Aqhouse\*(Aq
+.Ve
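+.PP
+When the regex has no groupings, the list holds the whole matches
+instead. A small illustration, using an invented string and array name:
+.PP
+.Vb 4
+\& $y = "a1b22c333";
+\& @nums = ($y =~ /\ed+/g); # $nums[0] = \*(Aq1\*(Aq
+\& # $nums[1] = \*(Aq22\*(Aq
+\& # $nums[2] = \*(Aq333\*(Aq
+.Ve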
+.SS "Search and replace"
+.IX Subsection "Search and replace"
+Search and replace is performed using \f(CW\*(C`s/regex/replacement/modifiers\*(C'\fR.
+The \f(CW\*(C`replacement\*(C'\fR is a Perl double-quoted string that replaces in the
+string whatever is matched with the \f(CW\*(C`regex\*(C'\fR. The operator \f(CW\*(C`=~\*(C'\fR is
+also used here to associate a string with \f(CW\*(C`s///\*(C'\fR. If matching
+against \f(CW$_\fR, the \f(CW\*(C`$_\ =~\*(C'\fR can be dropped. If there is a match,
+\&\f(CW\*(C`s///\*(C'\fR returns the number of substitutions made; otherwise it returns
+false. Here are a few examples:
+.PP
+.Vb 5
+\& $x = "Time to feed the cat!";
+\& $x =~ s/cat/hacker/; # $x contains "Time to feed the hacker!"
+\& $y = "\*(Aqquoted words\*(Aq";
+\& $y =~ s/^\*(Aq(.*)\*(Aq$/$1/; # strip single quotes,
+\& # $y contains "quoted words"
+.Ve
+.PP
+With the \f(CW\*(C`s///\*(C'\fR operator, the matched variables \f(CW$1\fR, \f(CW$2\fR, etc.
+are immediately available for use in the replacement expression. With
+the global modifier, \f(CW\*(C`s///g\*(C'\fR will search and replace all occurrences
+of the regex in the string:
+.PP
+.Vb 4
+\& $x = "I batted 4 for 4";
+\& $x =~ s/4/four/; # $x contains "I batted four for 4"
+\& $x = "I batted 4 for 4";
+\& $x =~ s/4/four/g; # $x contains "I batted four for four"
+.Ve
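+.PP
+The return value of \f(CW\*(C`s///\*(C'\fR can be captured as well. A brief
+sketch (the variable \f(CW$n\fR is introduced only for this illustration):
+.PP
+.Vb 3
+\& $x = "I batted 4 for 4";
+\& $n = ($x =~ s/4/four/g); # $n is 2, $x contains "I batted four for four"
+\& $n = ($x =~ s/5/five/g); # no match: $n is false, $x is unchanged
+.Ve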
+.PP
+The non-destructive modifier \f(CW\*(C`s///r\*(C'\fR causes the result of the substitution
+to be returned instead of modifying \f(CW$_\fR (or whatever variable the
+substitution was bound to with \f(CW\*(C`=~\*(C'\fR):
+.PP
+.Vb 3
+\& $x = "I like dogs.";
+\& $y = $x =~ s/dogs/cats/r;
+\& print "$x $y\en"; # prints "I like dogs. I like cats."
+\&
+\& $x = "Cats are great.";
+\& print $x =~ s/Cats/Dogs/r =~ s/Dogs/Frogs/r =~
+\& s/Frogs/Hedgehogs/r, "\en";
+\& # prints "Hedgehogs are great."
+\&
+\& @foo = map { s/[a\-z]/X/r } qw(a b c 1 2 3);
+\& # @foo is now qw(X X X 1 2 3)
+.Ve
+.PP
+The evaluation modifier \f(CW\*(C`s///e\*(C'\fR wraps an \f(CW\*(C`eval{...}\*(C'\fR around the
+replacement string and the evaluated result is substituted for the
+matched substring. Some examples:
+.PP
+.Vb 3
+\& # reverse all the words in a string
+\& $x = "the cat in the hat";
+\& $x =~ s/(\ew+)/reverse $1/ge; # $x contains "eht tac ni eht tah"
+\&
+\& # convert percentage to decimal
+\& $x = "A 39% hit rate";
+\& $x =~ s!(\ed+)%!$1/100!e; # $x contains "A 0.39 hit rate"
+.Ve
+.PP
+The last example shows that \f(CW\*(C`s///\*(C'\fR can use other delimiters, such as
+\&\f(CW\*(C`s!!!\*(C'\fR and \f(CW\*(C`s{}{}\*(C'\fR, and even \f(CW\*(C`s{}//\*(C'\fR. If single quotes are used
+\&\f(CW\*(C`s\*(Aq\*(Aq\*(Aq\*(C'\fR, then the regex and replacement are treated as single-quoted
+strings.
+.SS "The split operator"
+.IX Subsection "The split operator"
+\&\f(CW\*(C`split /regex/, string\*(C'\fR splits \f(CW\*(C`string\*(C'\fR into a list of substrings
+and returns that list. The regex describes the separator sequence
+at which \f(CW\*(C`string\*(C'\fR is split. For example, to split a
+string into words, use
+.PP
+.Vb 4
+\& $x = "Calvin and Hobbes";
+\& @word = split /\es+/, $x; # $word[0] = \*(AqCalvin\*(Aq
+\& # $word[1] = \*(Aqand\*(Aq
+\& # $word[2] = \*(AqHobbes\*(Aq
+.Ve
+.PP
+To extract a comma-delimited list of numbers, use
+.PP
+.Vb 4
+\& $x = "1.618,2.718, 3.142";
+\& @const = split /,\es*/, $x; # $const[0] = \*(Aq1.618\*(Aq
+\& # $const[1] = \*(Aq2.718\*(Aq
+\& # $const[2] = \*(Aq3.142\*(Aq
+.Ve
+.PP
+If the empty regex \f(CW\*(C`//\*(C'\fR is used, the string is split into individual
+characters. If the regex has groupings, then the list produced contains
+the matched substrings from the groupings as well:
+.PP
+.Vb 6
+\& $x = "/usr/bin";
+\& @parts = split m!(/)!, $x; # $parts[0] = \*(Aq\*(Aq
+\& # $parts[1] = \*(Aq/\*(Aq
+\& # $parts[2] = \*(Aqusr\*(Aq
+\& # $parts[3] = \*(Aq/\*(Aq
+\& # $parts[4] = \*(Aqbin\*(Aq
+.Ve
+.PP
+Since the first character of \f(CW$x\fR matched the regex, \f(CW\*(C`split\*(C'\fR prepended
+an empty initial element to the list.
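+.PP
+The empty\-regex case mentioned above can be sketched as follows (the
+string and array name are chosen only for illustration):
+.PP
+.Vb 4
+\& $x = "abc";
+\& @chars = split //, $x; # $chars[0] = \*(Aqa\*(Aq
+\& # $chars[1] = \*(Aqb\*(Aq
+\& # $chars[2] = \*(Aqc\*(Aq
+.Ve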
+.ie n .SS """use re \*(Aqstrict\*(Aq"""
+.el .SS "\f(CWuse re \*(Aqstrict\*(Aq\fP"
+.IX Subsection "use re strict"
+New in v5.22, this applies stricter rules than otherwise when compiling
+regular expression patterns. It can find things that, while legal, may
+not be what you intended.
+.PP
+See 'strict' in re.
+.SH BUGS
+.IX Header "BUGS"
+None.
+.SH "SEE ALSO"
+.IX Header "SEE ALSO"
+This is just a quick start guide. For a more in-depth tutorial on
+regexes, see perlretut and for the reference page, see perlre.
+.SH "AUTHOR AND COPYRIGHT"
+.IX Header "AUTHOR AND COPYRIGHT"
+Copyright (c) 2000 Mark Kvale
+All rights reserved.
+.PP
+This document may be distributed under the same terms as Perl itself.
+.SS Acknowledgments
+.IX Subsection "Acknowledgments"
+The author would like to thank Mark-Jason Dominus, Tom Christiansen,
+Ilya Zakharevich, Brad Hughes, and Mike Giroux for all their helpful
+comments.