1 files changed, 3711 insertions, 0 deletions
diff --git a/upstream/archlinux/man1/perlre.1perl b/upstream/archlinux/man1/perlre.1perl
new file mode 100644
index 00000000..5380b09b
--- /dev/null
+++ b/upstream/archlinux/man1/perlre.1perl
@@ -0,0 +1,3711 @@
+.\" -*- mode: troff; coding: utf-8 -*-
+.\" Automatically generated by Pod::Man 5.01 (Pod::Simple 3.43)
+.\"
+.\" Standard preamble:
+.\" ========================================================================
+.de Sp \" Vertical space (when we can't use .PP)
+.if t .sp .5v
+.if n .sp
+..
+.de Vb \" Begin verbatim text
+.ft CW
+.nf
+.ne \\$1
+..
+.de Ve \" End verbatim text
+.ft R
+.fi
+..
+.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
+.ie n \{\
+.    ds C` ""
+.    ds C' ""
+'br\}
+.el\{\
+.    ds C`
+.    ds C'
+'br\}
+.\"
+.\" Escape single quotes in literal strings from groff's Unicode transform.
+.ie \n(.g .ds Aq \(aq
+.el       .ds Aq '
+.\"
+.\" If the F register is >0, we'll generate index entries on stderr for
+.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
+.\" entries marked with X<> in POD.  Of course, you'll have to process the
+.\" output yourself in some meaningful fashion.
+.\"
+.\" Avoid warning from groff about undefined register 'F'.
+.de IX
+..
+.nr rF 0
+.if \n(.g .if rF .nr rF 1
+.if (\n(rF:(\n(.g==0)) \{\
+.    if \nF \{\
+.        de IX
+.        tm Index:\\$1\t\\n%\t"\\$2"
+..
+.        if !\nF==2 \{\
+.            nr % 0
+.            nr F 2
+.        \}
+.    \}
+.\}
+.rr rF
+.\" ========================================================================
+.\"
+.IX Title "PERLRE 1perl"
+.TH PERLRE 1perl 2024-02-11 "perl v5.38.2" "Perl Programmers Reference Guide"
+.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
+.\" way too many mistakes in technical documents.
+.if n .ad l
+.nh
+.SH NAME
+perlre \- Perl regular expressions
+.IX Xref "regular expression regex regexp"
+.SH DESCRIPTION
+.IX Header "DESCRIPTION"
+This page describes the syntax of regular expressions in Perl.
+.PP
+If you haven't used regular expressions before, a tutorial introduction
+is available in perlretut.  If you know just a little about them,
+a quick-start introduction is available in perlrequick.
+.PP
+Except for "The Basics" section, this page assumes you are familiar
+with regular expression basics, such as what a "pattern" is, what it
+looks like, and how it is basically used.  For a reference on how they
+are used, plus various examples of the same, see discussions of \f(CW\*(C`m//\*(C'\fR,
+\&\f(CW\*(C`s///\*(C'\fR, \f(CW\*(C`qr//\*(C'\fR and \f(CW"??"\fR in "Regexp Quote-Like Operators" in perlop.
+.PP
+New in v5.22, \f(CW\*(C`use re \*(Aqstrict\*(Aq\*(C'\fR applies stricter
+rules than otherwise when compiling regular expression patterns.  It can
+find things that, while legal, may not be what you intended.
+.SS "The Basics"
+.IX Xref "regular expression, version 8 regex, version 8 regexp, version 8"
+.IX Subsection "The Basics"
+Regular expressions are strings with the very particular syntax and
+meaning described in this document and auxiliary documents referred to
+by this one.  The strings are called "patterns".  Patterns are used to
+determine if some other string, called the "target", has (or doesn't
+have) the characteristics specified by the pattern.  We call this
+"matching" the target string against the pattern.  Usually the match is
+done by having the target be the first operand, and the pattern be the
+second operand, of one of the two binary operators \f(CW\*(C`=~\*(C'\fR and \f(CW\*(C`!~\*(C'\fR,
+listed in "Binding Operators" in perlop; and the pattern will have been
+converted from an ordinary string by one of the operators in
+"Regexp Quote-Like Operators" in perlop, like so:
+.PP
+.Vb 1
+\& $foo =~ m/abc/
+.Ve
+.PP
+This evaluates to true if and only if the string in the variable \f(CW$foo\fR
+contains somewhere in it, the sequence of characters "a", "b", then "c".
+(The \f(CW\*(C`=~ m\*(C'\fR, or match operator, is described in
+"m/PATTERN/msixpodualngc" in perlop.)
+.PP
+Patterns that aren't already stored in some variable must be delimited,
+at both ends, by delimiter characters.  These are often, as in the
+example above, forward slashes, and the typical way a pattern is written
+in documentation is with those slashes.  In most cases, the delimiter
+is the same character, fore and aft, but there are a few cases where a
+character looks like it has a mirror-image mate, where the opening
+version is the beginning delimiter, and the closing one is the ending
+delimiter, like
+.PP
+.Vb 1
+\& $foo =~ m<abc>
+.Ve
+.PP
+Most times, the pattern is evaluated in double-quotish context, but it
+is possible to choose delimiters to force single-quotish, like
+.PP
+.Vb 1
+\& $foo =~ m\*(Aqabc\*(Aq
+.Ve
+.PP
+If the pattern contains its delimiter within it, that delimiter must be
+escaped.  Prefixing it with a backslash (\fIe.g.\fR, \f(CW"/foo\e/bar/"\fR)
+serves this purpose.
+.PP
+Any single character in a pattern matches that same character in the
+target string, unless the character is a \fImetacharacter\fR with a special
+meaning described in this document.  A sequence of non-metacharacters
+matches the same sequence in the target string, as we saw above with
+\&\f(CW\*(C`m/abc/\*(C'\fR.
+.PP
+Only a few characters (all of them being ASCII punctuation characters)
+are metacharacters.  The most commonly used one is a dot \f(CW"."\fR, which
+normally matches almost any character (including a dot itself).
+.PP
+You can cause characters that normally function as metacharacters to be
+interpreted literally by prefixing them with a \f(CW"\e"\fR, just like the
+pattern's delimiter must be escaped if it also occurs within the
+pattern.  Thus, \f(CW"\e."\fR matches just a literal dot, \f(CW"."\fR instead of
+its normal meaning.  This means that the backslash is also a
+metacharacter, so \f(CW"\e\e"\fR matches a single \f(CW"\e"\fR.  And a sequence that
+contains an escaped metacharacter matches the same sequence (but without
+the escape) in the target string.  So, the pattern \f(CW\*(C`/blur\e\efl/\*(C'\fR would
+match any target string that contains the sequence \f(CW"blur\efl"\fR.
+.PP
+The metacharacter \f(CW"|"\fR is used to match one thing or another.  Thus
+.PP
+.Vb 1
+\& $foo =~ m/this|that/
+.Ve
+.PP
+is TRUE if and only if \f(CW$foo\fR contains either the sequence \f(CW"this"\fR or
+the sequence \f(CW"that"\fR.  Like all metacharacters, prefixing the \f(CW"|"\fR
+with a backslash makes it match the plain punctuation character; in its
+case, the VERTICAL LINE.
+.PP
+.Vb 1
+\& $foo =~ m/this\e|that/
+.Ve
+.PP
+is TRUE if and only if \f(CW$foo\fR contains the sequence \f(CW"this|that"\fR.
+.PP
+You aren't limited to just a single \f(CW"|"\fR.
+.PP
+.Vb 1
+\& $foo =~ m/fee|fie|foe|fum/
+.Ve
+.PP
+is TRUE if and only if \f(CW$foo\fR contains any of those 4 sequences from
+the children's story "Jack and the Beanstalk".
+.PP
+As you can see, the \f(CW"|"\fR binds less tightly than a sequence of
+ordinary characters.  We can override this by using the grouping
+metacharacters, the parentheses \f(CW"("\fR and \f(CW")"\fR.
+.PP
+.Vb 1
+\& $foo =~ m/th(is|at) thing/
+.Ve
+.PP
+is TRUE if and only if \f(CW$foo\fR contains either the sequence \f(CW"this\ thing"\fR or the sequence \f(CW"that\ thing"\fR.  The portions of the string
+that match the portions of the pattern enclosed in parentheses are
+normally made available separately for use later in the pattern,
+substitution, or program.  This is called "capturing", and it can get
+complicated.  See "Capture groups".
+.PP
+The first alternative includes everything from the last pattern
+delimiter (\f(CW"("\fR, \f(CW"(?:"\fR (described later), \fIetc\fR. or the beginning
+of the pattern) up to the first \f(CW"|"\fR, and the last alternative
+contains everything from the last \f(CW"|"\fR to the next closing pattern
+delimiter.  That's why it's common practice to include alternatives in
+parentheses: to minimize confusion about where they start and end.
+.PP
+Alternatives are tried from left to right, so the first
+alternative for which the entire expression matches is the one that
+is chosen. This means that alternatives are not necessarily greedy. For
+example: when matching \f(CW\*(C`foo|foot\*(C'\fR against \f(CW"barefoot"\fR, only the \f(CW"foo"\fR
+part will match, as that is the first alternative tried, and it successfully
+matches the target string. (This might not seem important, but it is
+important when you are capturing matched text using parentheses.)
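+.PP
+As a brief sketch of why this matters when capturing, swapping the
+two alternatives changes what ends up in \f(CW$1\fR:
+.PP
+.Vb 2
+\& "barefoot" =~ /(foo|foot)/;   # $1 is "foo"
+\& "barefoot" =~ /(foot|foo)/;   # $1 is "foot"
+.Ve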
+.PP
+Besides taking away the special meaning of a metacharacter, a prefixed
+backslash changes some letter and digit characters away from matching
+just themselves to instead have special meaning.  These are called
+"escape sequences", and all such are described in perlrebackslash.  A
+backslash sequence (of a letter or digit) that doesn't currently have
+special meaning to Perl will raise a warning if warnings are enabled,
+as those are reserved for potential future use.
+.PP
+One such sequence is \f(CW\*(C`\eb\*(C'\fR, which matches a boundary of some sort.
+\&\f(CW\*(C`\eb{wb}\*(C'\fR and a few others give specialized types of boundaries.
+(They are all described in detail starting at
+"\eb{}, \eb, \eB{}, \eB" in perlrebackslash.)  Note that these don't match
+characters, but the zero-width spaces between characters.  They are an
+example of a zero-width assertion.  Consider again,
+.PP
+.Vb 1
+\& $foo =~ m/fee|fie|foe|fum/
+.Ve
+.PP
+It evaluates to TRUE if, besides those 4 words, any of the sequences
+"feed", "field", "Defoe", "fume", and many others are in \f(CW$foo\fR.  By
+judicious use of \f(CW\*(C`\eb\*(C'\fR (or, better, \f(CW\*(C`\eb{wb}\*(C'\fR, which is designed to
+handle natural language), we can make sure that only the Giant's
+words are matched:
+.PP
+.Vb 2
+\& $foo =~ m/\eb(fee|fie|foe|fum)\eb/
+\& $foo =~ m/\eb{wb}(fee|fie|foe|fum)\eb{wb}/
+.Ve
+.PP
+The final example shows that the characters \f(CW"{"\fR and \f(CW"}"\fR are
+metacharacters.
+.PP
+Another use for escape sequences is to specify characters that cannot
+(or which you prefer not to) be written literally.  These are described
+in detail in "Character Escapes" in perlrebackslash, but the next three
+paragraphs briefly describe some of them.
+.PP
+Various control characters can be written in C language style: \f(CW"\en"\fR
+matches a newline, \f(CW"\et"\fR a tab, \f(CW"\er"\fR a carriage return, \f(CW"\ef"\fR a
+form feed, \fIetc\fR.
+.PP
+More generally, \f(CW\*(C`\e\fR\f(CInnn\fR\f(CW\*(C'\fR, where \fInnn\fR is a string of three octal
+digits, matches the character whose native code point is \fInnn\fR.  You
+can easily run into trouble if you don't have exactly three digits.  So
+always use three, or since Perl 5.14, you can use \f(CW\*(C`\eo{...}\*(C'\fR to specify
+any number of octal digits.
+.PP
+Similarly, \f(CW\*(C`\ex\fR\f(CInn\fR\f(CW\*(C'\fR, where \fInn\fR are hexadecimal digits, matches the
+character whose native ordinal is \fInn\fR.  Again, not using exactly two
+digits is a recipe for disaster, but you can use \f(CW\*(C`\ex{...}\*(C'\fR to specify
+any number of hex digits.
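+.PP
+As a minimal sketch, assuming an ASCII platform (where "A" has code
+point 65, which is 101 in octal and 41 in hexadecimal), each of the
+following is true:
+.PP
+.Vb 4
+\& "A" =~ /\e101/;      # exactly three octal digits
+\& "A" =~ /\eo{101}/;   # braced octal, available since Perl 5.14
+\& "A" =~ /\ex41/;      # exactly two hex digits
+\& "A" =~ /\ex{41}/;    # braced hex
+.Ve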
+.PP
+Besides being a metacharacter, the \f(CW"."\fR is an example of a "character
+class", something that can match any single character of a given set of
+them.  In its case, the set is just about all possible characters.  Perl
+predefines several character classes besides the \f(CW"."\fR; there is a
+separate reference page about just these, perlrecharclass.
+.PP
+You can define your own custom character classes, by putting into your
+pattern in the appropriate place(s), a list of all the characters you
+want in the set.  You do this by enclosing the list within \f(CW\*(C`[]\*(C'\fR bracket
+characters.  These are called "bracketed character classes" when we are
+being precise, but often the word "bracketed" is dropped.  (Dropping it
+usually doesn't cause confusion.)  This means that the \f(CW"["\fR character
+is another metacharacter.  It doesn't match anything just by itself; it
+is used only to tell Perl that what follows it is a bracketed character
+class.  If you want to match a literal left square bracket, you must
+escape it, like \f(CW"\e["\fR.  The matching \f(CW"]"\fR is also a metacharacter;
+again it doesn't match anything by itself, but just marks the end of
+your custom class to Perl.  It is an example of a "sometimes
+metacharacter".  It isn't a metacharacter if there is no corresponding
+\&\f(CW"["\fR, and matches its literal self:
+.PP
+.Vb 1
+\& print "]" =~ /]/;  # prints 1
+.Ve
+.PP
+The list of characters within the character class gives the set of
+characters matched by the class.  \f(CW"[abc]"\fR matches a single "a" or "b"
+or "c".  But if the first character after the \f(CW"["\fR is \f(CW"^"\fR, the
+class instead matches any character not in the list.  Within a list, the
+\&\f(CW"\-"\fR character specifies a range of characters, so that \f(CW\*(C`a\-z\*(C'\fR
+represents all characters between "a" and "z", inclusive.  If you want
+either \f(CW"\-"\fR or \f(CW"]"\fR itself to be a member of a class, put it at the
+start of the list (possibly after a \f(CW"^"\fR), or escape it with a
+backslash.  \f(CW"\-"\fR is also taken literally when it is at the end of the
+list, just before the closing \f(CW"]"\fR.  (The following all specify the
+same class of three characters: \f(CW\*(C`[\-az]\*(C'\fR, \f(CW\*(C`[az\-]\*(C'\fR, and \f(CW\*(C`[a\e\-z]\*(C'\fR.  All
+are different from \f(CW\*(C`[a\-z]\*(C'\fR, which specifies a class containing
+twenty-six characters, even on EBCDIC-based character sets.)
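+.PP
+A small sketch of the difference between a range and a literal
+\f(CW"\-"\fR (the target strings are chosen only for illustration):
+.PP
+.Vb 3
+\& "b" =~ /[a\-z]/;    # true: "b" lies in the range "a" through "z"
+\& "b" =~ /[az\-]/;    # false: the class holds only "a", "z", and "\-"
+\& "\-" =~ /[az\-]/;    # true: the trailing "\-" is literal
+.Ve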
+.PP
+There is lots more to bracketed character classes; full details are in
+"Bracketed Character Classes" in perlrecharclass.
+.PP
+\fIMetacharacters\fR
+.IX Xref "metacharacter \\ ^ . $ | ( () [ []"
+.IX Subsection "Metacharacters"
+.PP
+"The Basics" introduced some of the metacharacters.  This section
+gives them all.  Most of them have the same meaning as in the \fIegrep\fR
+command.
+.PP
+Only the \f(CW"\e"\fR is always a metacharacter.  The others are metacharacters
+just sometimes.  The following table lists all of them, summarizes
+their use, and gives the contexts where they are metacharacters.
+Outside those contexts or if prefixed by a \f(CW"\e"\fR, they match their
+corresponding punctuation character.  In some cases, their meaning
+varies depending on various pattern modifiers that alter the default
+behaviors.  See "Modifiers".
+.PP
+.Vb 10
+\&            PURPOSE                                  WHERE
+\& \e   Escape the next character                    Always, except when
+\&                                                  escaped by another \e
+\& ^   Match the beginning of the string            Not in []
+\&       (or line, if /m is used)
+\& ^   Complement the [] class                      At the beginning of []
+\& .   Match any single character except newline    Not in []
+\&       (under /s, includes newline)
+\& $   Match the end of the string                  Not in [], but can
+\&       (or before newline at the end of the       mean interpolate a
+\&       string; or before any newline if /m is     scalar
+\&       used)
+\& |   Alternation                                  Not in []
+\& ()  Grouping                                     Not in []
+\& [   Start Bracketed Character class              Not in []
+\& ]   End Bracketed Character class                Only in [], and
+\&                                                    not first
+\& *   Matches the preceding element 0 or more      Not in []
+\&       times
+\& +   Matches the preceding element 1 or more      Not in []
+\&       times
+\& ?   Matches the preceding element 0 or 1         Not in []
+\&       times
+\& {   Starts a sequence that gives number(s)       Not in []
+\&       of times the preceding element can be
+\&       matched
+\& {   when following certain escape sequences
+\&       starts a modifier to the meaning of the
+\&       sequence
+\& }   End sequence started by {
+\& \-   Indicates a range                            Only in [] interior
+\& #   Beginning of comment, extends to line end    Only with /x modifier
+.Ve
+.PP
+Notice that most of the metacharacters lose their special meaning when
+they occur in a bracketed character class, except \f(CW"^"\fR has a different
+meaning when it is at the beginning of such a class.  And \f(CW"\-"\fR and \f(CW"]"\fR
+are metacharacters only at restricted positions within bracketed
+character classes; while \f(CW"}"\fR is a metacharacter only when closing a
+special construct started by \f(CW"{"\fR.
+.PP
+In double-quotish context, as is usually the case, you need to be
+careful about \f(CW"$"\fR and the non-metacharacter \f(CW"@"\fR.  Those could
+interpolate variables, which may or may not be what you intended.
+.PP
+These rules were designed for compactness of expression, rather than
+legibility and maintainability.  The "/x and /xx" pattern
+modifiers allow you to insert white space to improve readability.  And
+use of \f(CW\*(C`re\ \*(Aqstrict\*(Aq\*(C'\fR adds extra checking to
+catch some typos that might silently compile into something unintended.
+.PP
+By default, the \f(CW"^"\fR character is guaranteed to match only the
+beginning of the string, the \f(CW"$"\fR character only the end (or before the
+newline at the end), and Perl does certain optimizations with the
+assumption that the string contains only one line.  Embedded newlines
+will not be matched by \f(CW"^"\fR or \f(CW"$"\fR.  You may, however, wish to treat a
+string as a multi-line buffer, such that the \f(CW"^"\fR will match after any
+newline within the string (except if the newline is the last character in
+the string), and \f(CW"$"\fR will match before any newline.  At the
+cost of a little more overhead, you can do this by using the
+\&\f(CW"/m"\fR modifier on the pattern match operator.  (Older programs
+did this by setting \f(CW$*\fR, but this option was removed in perl 5.10.)
+.IX Xref "^ $ m"
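+.PP
+As a rough sketch of the difference (the variable names are
+illustrative only):
+.PP
+.Vb 3
+\& my $text  = "one\entwo\en";
+\& my @plain = $text =~ /^(\ew+)$/g;    # empty list
+\& my @multi = $text =~ /^(\ew+)$/mg;   # ("one", "two")
+.Ve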
+.PP
+To simplify multi-line substitutions, the \f(CW"."\fR character never matches a
+newline unless you use the \f(CW\*(C`/s\*(C'\fR modifier, which in effect tells
+Perl to pretend the string is a single line\-\-even if it isn't.
+.IX Xref ". s"
+.SS Modifiers
+.IX Subsection "Modifiers"
+\fIOverview\fR
+.IX Subsection "Overview"
+.PP
+The default behavior for matching can be changed, using various
+modifiers.  Modifiers that relate to the interpretation of the pattern
+are listed just below.  Modifiers that alter the way a pattern is used
+by Perl are detailed in "Regexp Quote-Like Operators" in perlop and
+"Gory details of parsing quoted constructs" in perlop.  Modifiers can be added
+dynamically; see "Extended Patterns" below.
+.ie n .IP "\fR\fB""m""\fR\fB\fR" 4
+.el .IP \fR\f(CBm\fR\fB\fR 4
+.IX Xref " m regex, multiline regexp, multiline regular expression, multiline"
+.IX Item "m"
+Treat the string being matched against as multiple lines.  That is, change \f(CW"^"\fR and \f(CW"$"\fR from matching
+the start of the string's first line and the end of its last line to
+matching the start and end of each line within the string.
+.ie n .IP "\fR\fB""s""\fR\fB\fR" 4
+.el .IP \fR\f(CBs\fR\fB\fR 4
+.IX Xref " s regex, single-line regexp, single-line regular expression, single-line"
+.IX Item "s"
+Treat the string as single line.  That is, change \f(CW"."\fR to match any character
+whatsoever, even a newline, which normally it would not match.
+.Sp
+Used together, as \f(CW\*(C`/ms\*(C'\fR, they let the \f(CW"."\fR match any character whatsoever,
+while still allowing \f(CW"^"\fR and \f(CW"$"\fR to match, respectively, just after
+and just before newlines within the string.
+.ie n .IP "\fR\fB""i""\fR\fB\fR" 4
+.el .IP \fR\f(CBi\fR\fB\fR 4
+.IX Xref " i regex, case-insensitive regexp, case-insensitive regular expression, case-insensitive"
+.IX Item "i"
+Do case-insensitive pattern matching.  For example, "A" will match "a"
+under \f(CW\*(C`/i\*(C'\fR.
+.Sp
+If locale matching rules are in effect, the case map is taken from the
+current locale for code points less than 256, and from Unicode rules for
+larger code points.  However, matches that would cross the Unicode
+rules/non\-Unicode rules boundary (ords 255/256) will not succeed, unless
+the locale is a UTF\-8 one.  See perllocale.
+.Sp
+There are a number of Unicode characters that match a sequence of
+multiple characters under \f(CW\*(C`/i\*(C'\fR.  For example,
+\&\f(CW\*(C`LATIN SMALL LIGATURE FI\*(C'\fR should match the sequence \f(CW\*(C`fi\*(C'\fR.  Perl is not
+currently able to do this when the multiple characters are in the pattern and
+are split between groupings, or when one or more are quantified.  Thus
+.Sp
+.Vb 3
+\& "\eN{LATIN SMALL LIGATURE FI}" =~ /fi/i;          # Matches
+\& "\eN{LATIN SMALL LIGATURE FI}" =~ /[fi][fi]/i;    # Doesn\*(Aqt match!
+\& "\eN{LATIN SMALL LIGATURE FI}" =~ /fi*/i;         # Doesn\*(Aqt match!
+\&
+\& # The below doesn\*(Aqt match, and it isn\*(Aqt clear what $1 and $2 would
+\& # be even if it did!!
+\& "\eN{LATIN SMALL LIGATURE FI}" =~ /(f)(i)/i;      # Doesn\*(Aqt match!
+.Ve
+.Sp
+Perl doesn't match multiple characters in a bracketed
+character class unless the character that maps to them is explicitly
+mentioned, and it doesn't match them at all if the character class is
+inverted, which otherwise could be highly confusing.  See
+"Bracketed Character Classes" in perlrecharclass, and
+"Negation" in perlrecharclass.
+.ie n .IP "\fR\fB""x""\fR\fB\fR and \fB\fR\fB""xx""\fR\fB\fR" 4
+.el .IP "\fR\f(CBx\fR\fB\fR and \fB\fR\f(CBxx\fR\fB\fR" 4
+.IX Xref " x"
+.IX Item "x and xx"
+Extend your pattern's legibility by permitting whitespace and comments.
+Details in "/x and /xx".
+.ie n .IP "\fR\fB""p""\fR\fB\fR" 4
+.el .IP \fR\f(CBp\fR\fB\fR 4
+.IX Xref " p regex, preserve regexp, preserve"
+.IX Item "p"
+Preserve the string matched such that \f(CW\*(C`${^PREMATCH}\*(C'\fR, \f(CW\*(C`${^MATCH}\*(C'\fR, and
+\&\f(CW\*(C`${^POSTMATCH}\*(C'\fR are available for use after matching.
+.Sp
+In Perl 5.20 and higher this is ignored. Due to a new copy-on-write
+mechanism, \f(CW\*(C`${^PREMATCH}\*(C'\fR, \f(CW\*(C`${^MATCH}\*(C'\fR, and \f(CW\*(C`${^POSTMATCH}\*(C'\fR will be available
+after the match regardless of the modifier.
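+.Sp
+A minimal sketch of the three variables (the target string is
+illustrative only):
+.Sp
+.Vb 3
+\&  if ("hello world" =~ /o w/p) {
+\&      print "${^PREMATCH}|${^MATCH}|${^POSTMATCH}\en";  # hell|o w|orld
+\&  }
+.Ve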
+.ie n .IP "\fR\fB""a""\fR\fB\fR, \fB\fR\fB""d""\fR\fB\fR, \fB\fR\fB""l""\fR\fB\fR, and \fB\fR\fB""u""\fR\fB\fR" 4
+.el .IP "\fR\f(CBa\fR\fB\fR, \fB\fR\f(CBd\fR\fB\fR, \fB\fR\f(CBl\fR\fB\fR, and \fB\fR\f(CBu\fR\fB\fR" 4
+.IX Xref " a d l u"
+.IX Item "a, d, l, and u"
+These modifiers, all new in 5.14, affect which character-set rules
+(Unicode, \fIetc\fR.) are used, as described below in
+"Character set modifiers".
+.ie n .IP "\fR\fB""n""\fR\fB\fR" 4
+.el .IP \fR\f(CBn\fR\fB\fR 4
+.IX Xref " n regex, non-capture regexp, non-capture regular expression, non-capture"
+.IX Item "n"
+Prevent the grouping metacharacters \f(CW\*(C`()\*(C'\fR from capturing. This modifier,
+new in 5.22, will stop \f(CW$1\fR, \f(CW$2\fR, \fIetc\fR... from being filled in.
+.Sp
+.Vb 2
+\&  "hello" =~ /(hi|hello)/;   # $1 is "hello"
+\&  "hello" =~ /(hi|hello)/n;  # $1 is undef
+.Ve
+.Sp
+This is equivalent to putting \f(CW\*(C`?:\*(C'\fR at the beginning of every capturing group:
+.Sp
+.Vb 1
+\&  "hello" =~ /(?:hi|hello)/; # $1 is undef
+.Ve
+.Sp
+\&\f(CW\*(C`/n\*(C'\fR can be negated on a per-group basis. Alternatively, named captures
+may still be used.
+.Sp
+.Vb 3
+\&  "hello" =~ /(?\-n:(hi|hello))/n;   # $1 is "hello"
+\&  "hello" =~ /(?<greet>hi|hello)/n; # $1 is "hello", $+{greet} is
+\&                                    # "hello"
+.Ve
+.IP "Other Modifiers" 4
+.IX Item "Other Modifiers"
+There are a number of flags that can be found at the end of regular
+expression constructs that are \fInot\fR generic regular expression flags, but
+apply to the operation being performed, like matching or substitution (\f(CW\*(C`m//\*(C'\fR
+or \f(CW\*(C`s///\*(C'\fR respectively).
+.Sp
+Flags described further in
+"Using regular expressions in Perl" in perlretut are:
+.Sp
+.Vb 2
+\&  c  \- keep the current position during repeated matching
+\&  g  \- globally match the pattern repeatedly in the string
+.Ve
+.Sp
+Substitution-specific modifiers described in
+"s/PATTERN/REPLACEMENT/msixpodualngcer" in perlop are:
+.Sp
+.Vb 4
+\&  e  \- evaluate the right\-hand side as an expression
+\&  ee \- evaluate the right side as a string then eval the result
+\&  o  \- pretend to optimize your code, but actually introduce bugs
+\&  r  \- perform non\-destructive substitution and return the new value
+.Ve
+.PP
+Regular expression modifiers are usually written in documentation
+as \fIe.g.\fR, "the \f(CW\*(C`/x\*(C'\fR modifier", even though the delimiter
+in question might not really be a slash.  The modifiers \f(CW\*(C`/imnsxadlup\*(C'\fR
+may also be embedded within the regular expression itself using
+the \f(CW\*(C`(?...)\*(C'\fR construct, see "Extended Patterns" below.
+.PP
+\fIDetails on some modifiers\fR
+.IX Subsection "Details on some modifiers"
+.PP
+Some of the modifiers require more explanation than given in the
+"Overview" above.
+.PP
+\f(CW\*(C`/x\*(C'\fR and  \f(CW\*(C`/xx\*(C'\fR
+.IX Subsection "/x and /xx"
+.PP
+A single \f(CW\*(C`/x\*(C'\fR tells
+the regular expression parser to ignore most whitespace that is neither
+backslashed nor within a bracketed character class, nor within the characters
+of a multi-character metapattern like \f(CW\*(C`(?i: ... )\*(C'\fR.  You can use this to
+break up your regular expression into more readable parts.
+Also, the \f(CW"#"\fR character is treated as a metacharacter introducing a
+comment that runs up to the pattern's closing delimiter, or to the end
+of the current line if the pattern extends onto the next line.  Hence,
+this is very much like an ordinary Perl code comment.  (You can include
+the closing delimiter within the comment only if you precede it with a
+backslash, so be careful!)
+.PP
+Use of \f(CW\*(C`/x\*(C'\fR means that if you want real
+whitespace or \f(CW"#"\fR characters in the pattern (outside a bracketed character
+class, which is unaffected by \f(CW\*(C`/x\*(C'\fR), then you'll either have to
+escape them (using backslashes or \f(CW\*(C`\eQ...\eE\*(C'\fR) or encode them using octal,
+hex, or \f(CW\*(C`\eN{}\*(C'\fR or \f(CW\*(C`\ep{name=...}\*(C'\fR escapes.
+It is ineffective to try to continue a comment onto the next line by
+escaping the \f(CW\*(C`\en\*(C'\fR with a backslash or \f(CW\*(C`\eQ\*(C'\fR.
+.PP
+You can use "(?#text)" to create a comment that ends earlier than the
+end of the current line, but \f(CW\*(C`text\*(C'\fR also can't contain the closing
+delimiter unless escaped with a backslash.
+.PP
+A common pitfall is to forget that \f(CW"#"\fR characters (outside a
+bracketed character class) begin a comment under \f(CW\*(C`/x\*(C'\fR and are not
+matched literally.  Just keep that in mind when trying to puzzle out why
+a particular \f(CW\*(C`/x\*(C'\fR pattern isn't working as expected.
+Inside a bracketed character class, \f(CW"#"\fR retains its non-special,
+literal meaning.
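+.PP
+A short sketch of this pitfall (target strings chosen only for
+illustration):
+.PP
+.Vb 3
+\& "ac"  =~ /a#b/x;     # true: "#b" is a comment, so the pattern is just "a"
+\& "ac"  =~ /a\e#b/x;    # false: the escaped "#" must match literally
+\& "a#b" =~ /a[#]b/x;   # true: "#" inside a bracketed class is literal
+.Ve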
+.PP
+Starting in Perl v5.26, if the modifier has a second \f(CW"x"\fR within it,
+the effect of a single \f(CW\*(C`/x\*(C'\fR is increased.  The only difference is that
+inside bracketed character classes, non-escaped (by a backslash) SPACE
+and TAB characters are not added to the class, and hence can be inserted
+to make the classes more readable:
+.PP
+.Vb 2
+\&    / [d\-e g\-i 3\-7]/xx
+\&    /[ ! @ " # $ % ^ & * () = ? <> \*(Aq ]/xx
+.Ve
+.PP
+may be easier to grasp than the squashed equivalents
+.PP
+.Vb 2
+\&    /[d\-eg\-i3\-7]/
+\&    /[!@"#$%^&*()=?<>\*(Aq]/
+.Ve
+.PP
+Note that this unfortunately doesn't mean that your bracketed classes
+can contain comments or extend over multiple lines.  A \f(CW\*(C`#\*(C'\fR inside a
+character class is still just a literal \f(CW\*(C`#\*(C'\fR, and doesn't introduce a
+comment.  And, unless the closing bracket is on the same line as the
+opening one, the newline character (and everything on the next line(s)
+until terminated by a \f(CW\*(C`]\*(C'\fR) will be part of the class, just as if you'd
+written \f(CW\*(C`\en\*(C'\fR.
+.PP
+Taken together, these features go a long way towards
+making Perl's regular expressions more readable.  Here's an example:
+.PP
+.Vb 6
+\&    # Delete (most) C comments.
+\&    $program =~ s {
+\&        /\e*     # Match the opening delimiter.
+\&        .*?     # Match a minimal number of characters.
+\&        \e*/     # Match the closing delimiter.
+\&    } []gsx;
+.Ve
+.PP
+Note that anything inside
+a \f(CW\*(C`\eQ...\eE\*(C'\fR stays unaffected by \f(CW\*(C`/x\*(C'\fR.  And note that \f(CW\*(C`/x\*(C'\fR doesn't affect
+space interpretation within a single multi-character construct.  For
+example \f(CW\*(C`(?:...)\*(C'\fR can't have a space between the \f(CW"("\fR,
+\&\f(CW"?"\fR, and \f(CW":"\fR.  Within any delimiters for such a construct, allowed
+spaces are not affected by \f(CW\*(C`/x\*(C'\fR, and depend on the construct.  For
+example, all constructs using curly braces as delimiters, such as
+\&\f(CW\*(C`\ex{...}\*(C'\fR can have blanks within but adjacent to the braces, but not
+elsewhere, and no non-blank space characters.  Unicode properties are an
+exception: they follow Unicode rules, for which see
+"Properties accessible through \ep{} and \eP{}" in perluniprops.
+.IX Xref " x"
+.PP
+The set of characters that are deemed whitespace are those that Unicode
+calls "Pattern White Space", namely:
+.PP
+.Vb 11
+\& U+0009 CHARACTER TABULATION
+\& U+000A LINE FEED
+\& U+000B LINE TABULATION
+\& U+000C FORM FEED
+\& U+000D CARRIAGE RETURN
+\& U+0020 SPACE
+\& U+0085 NEXT LINE
+\& U+200E LEFT\-TO\-RIGHT MARK
+\& U+200F RIGHT\-TO\-LEFT MARK
+\& U+2028 LINE SEPARATOR
+\& U+2029 PARAGRAPH SEPARATOR
+.Ve
+.PP
+Character set modifiers
+.IX Subsection "Character set modifiers"
+.PP
+\&\f(CW\*(C`/d\*(C'\fR, \f(CW\*(C`/u\*(C'\fR, \f(CW\*(C`/a\*(C'\fR, and \f(CW\*(C`/l\*(C'\fR, available starting in 5.14, are called
+the character set modifiers; they affect the character set rules
+used for the regular expression.
+.PP
+The \f(CW\*(C`/d\*(C'\fR, \f(CW\*(C`/u\*(C'\fR, and \f(CW\*(C`/l\*(C'\fR modifiers are not likely to be of much use
+to you, and so you need not worry about them very much.  They exist for
+Perl's internal use, so that complex regular expression data structures
+can be automatically serialized and later exactly reconstituted,
+including all their nuances.  But, since Perl can't keep a secret, and
+there may be rare instances where they are useful, they are documented
+here.
+.PP
+The \f(CW\*(C`/a\*(C'\fR modifier, on the other hand, may be useful.  Its purpose is to
+allow code that is to work mostly on ASCII data to not have to concern
+itself with Unicode.
+.PP
+Briefly, \f(CW\*(C`/l\*(C'\fR sets the character set to that of whatever \fBL\fRocale is in
+effect at the time of the execution of the pattern match.
+.PP
+\&\f(CW\*(C`/u\*(C'\fR sets the character set to \fBU\fRnicode.
+.PP
+\&\f(CW\*(C`/a\*(C'\fR also sets the character set to Unicode, BUT adds several
+restrictions for \fBA\fRSCII-safe matching.
+.PP
+\&\f(CW\*(C`/d\*(C'\fR is the old, problematic, pre\-5.14 \fBD\fRefault character set
+behavior.  Its only use is to force that old behavior.
+.PP
+At any given time, exactly one of these modifiers is in effect.  Their
+existence allows Perl to keep the originally compiled behavior of a
+regular expression, regardless of what rules are in effect when it is
+actually executed.  And if it is interpolated into a larger regex, the
+original's rules continue to apply to it, and don't affect the other
+parts.
+.PP
+The \f(CW\*(C`/l\*(C'\fR and \f(CW\*(C`/u\*(C'\fR modifiers are automatically selected for
+regular expressions compiled within the scope of various pragmas,
+and we recommend that in general, you use those pragmas instead of
+specifying these modifiers explicitly.  For one thing, the modifiers
+affect only pattern matching, and do not extend to even any replacement
+done, whereas using the pragmas gives consistent results for all
+appropriate operations within their scopes.  For example,
+.PP
+.Vb 1
+\& s/foo/\eUbar/il
+.Ve
+.PP
+will match "foo" using the locale's rules for case-insensitive matching,
+but the \f(CW\*(C`/l\*(C'\fR does not affect how the \f(CW\*(C`\eU\*(C'\fR operates.  Most likely you
+want both of them to use locale rules.  To do this, instead compile the
+regular expression within the scope of \f(CW\*(C`use locale\*(C'\fR.  This both
+implicitly adds the \f(CW\*(C`/l\*(C'\fR, and applies locale rules to the \f(CW\*(C`\eU\*(C'\fR.   The
+lesson is to \f(CW\*(C`use locale\*(C'\fR, and not \f(CW\*(C`/l\*(C'\fR explicitly.
+.PP
+Similarly, it would be better to use \f(CW\*(C`use feature \*(Aqunicode_strings\*(Aq\*(C'\fR
+instead of,
+.PP
+.Vb 1
+\& s/foo/\eLbar/iu
+.Ve
+.PP
+to get Unicode rules, as the \f(CW\*(C`\eL\*(C'\fR in the former (but not necessarily
+the latter) would also use Unicode rules.
+.PP
+More detail on each of the modifiers follows.  Most likely you don't
+need to know this detail for \f(CW\*(C`/l\*(C'\fR, \f(CW\*(C`/u\*(C'\fR, and \f(CW\*(C`/d\*(C'\fR, and can skip ahead
+to /a.
+.PP
+/l
+.IX Subsection "/l"
+.PP
+means to use the current locale's rules (see perllocale) when pattern
+matching.  For example, \f(CW\*(C`\ew\*(C'\fR will match the "word" characters of that
+locale, and \f(CW"/i"\fR case-insensitive matching will match according to
+the locale's case folding rules.  The locale used will be the one in
+effect at the time of execution of the pattern match.  This may not be
+the same as the compilation-time locale, and can differ from one match
+to another if there is an intervening call of the
+\&\fBsetlocale()\fR function.
+.PP
+Prior to v5.20, Perl did not support multi-byte locales.  Starting then,
+UTF\-8 locales are supported.  No other multi-byte locales are ever
+likely to be supported.  However, in all locales, one can have code
+points above 255 and these will always be treated as Unicode no matter
+what locale is in effect.
+.PP
+Under Unicode rules, there are a few case-insensitive matches that cross
+the 255/256 boundary.  Except for UTF\-8 locales in Perls v5.20 and
+later, these are disallowed under \f(CW\*(C`/l\*(C'\fR.  For example, 0xFF (on ASCII
+platforms) does not caselessly match the character at 0x178, \f(CW\*(C`LATIN
+CAPITAL LETTER Y WITH DIAERESIS\*(C'\fR, because 0xFF may not be \f(CW\*(C`LATIN SMALL
+LETTER Y WITH DIAERESIS\*(C'\fR in the current locale, and Perl has no way of
+knowing if that character even exists in the locale, much less what code
+point it is.
+.PP
+In a UTF\-8 locale in v5.20 and later, the only visible difference
+between locale and non-locale in regular expressions should be tainting,
+if your perl supports taint checking (see perlsec).
+.PP
+This modifier may be specified to be the default by \f(CW\*(C`use locale\*(C'\fR, but
+see "Which character set modifier is in effect?".
+.IX Xref " l"
+.PP
+/u
+.IX Subsection "/u"
+.PP
+means to use Unicode rules when pattern matching.  On ASCII platforms,
+this means that the code points between 128 and 255 take on their
+Latin\-1 (ISO\-8859\-1) meanings (which are the same as Unicode's).
+(Otherwise Perl considers their meanings to be undefined.)  Thus,
+under this modifier, the ASCII platform effectively becomes a Unicode
+platform; and hence, for example, \f(CW\*(C`\ew\*(C'\fR will match any of the more than
+100_000 word characters in Unicode.
+.PP
+Unlike most locales, which are specific to a language and country pair,
+Unicode classifies all the characters that are letters \fIsomewhere\fR in
+the world as
+\&\f(CW\*(C`\ew\*(C'\fR.  For example, your locale might not think that \f(CW\*(C`LATIN SMALL
+LETTER ETH\*(C'\fR is a letter (unless you happen to speak Icelandic), but
+Unicode does.  Similarly, all the characters that are decimal digits
+somewhere in the world will match \f(CW\*(C`\ed\*(C'\fR; this is hundreds, not 10,
+possible matches.  And some of those digits look like some of the 10
+ASCII digits, but mean a different number, so a human could easily think
+a number is a different quantity than it really is.  For example,
+\&\f(CW\*(C`BENGALI DIGIT FOUR\*(C'\fR (U+09EA) looks very much like an
+\&\f(CW\*(C`ASCII DIGIT EIGHT\*(C'\fR (U+0038), and \f(CW\*(C`LEPCHA DIGIT SIX\*(C'\fR (U+1C46) looks
+very much like an \f(CW\*(C`ASCII DIGIT FIVE\*(C'\fR (U+0035).  And \f(CW\*(C`\ed+\*(C'\fR may match
+strings of digits that are a mixture from different writing systems,
+creating a security issue.  A fraudulent website, for example, could
+display the price of something using U+1C46, and it would appear to the
+user that something cost 500 units, but it really costs 600.  A browser
+that enforced script runs ("Script Runs") would prevent that
+fraudulent display.  "\fBnum()\fR" in Unicode::UCD can also be used to sort this
+out.  Or the \f(CW\*(C`/a\*(C'\fR modifier can be used to force \f(CW\*(C`\ed\*(C'\fR to match just the
+ASCII 0 through 9.
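+.PP
+As a minimal sketch of the difference (the variable names are purely
+illustrative), the following checks \f(CW\*(C`BENGALI DIGIT FOUR\*(C'\fR against \f(CW\*(C`\ed\*(C'\fR
+under \f(CW\*(C`/u\*(C'\fR and under \f(CW\*(C`/a\*(C'\fR:
+.PP
+.Vb 2
+\& use strict;
+\& use warnings;
+\&
+\& my $four    = "\ex{09EA}";                # BENGALI DIGIT FOUR
+\& my $under_u = $four =~ /^\ed$/u ? 1 : 0;  # Unicode rules: a digit
+\& my $under_a = $four =~ /^\ed$/a ? 1 : 0;  # ASCII\-restricted: not a digit
+\& print "/u: $under_u, /a: $under_a\en";    # prints "/u: 1, /a: 0"
+.Ve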
+.PP
+Also, under this modifier, case-insensitive matching works on the full
+set of Unicode
+characters.  The \f(CW\*(C`KELVIN SIGN\*(C'\fR, for example, matches the letters "k" and
+"K"; and \f(CW\*(C`LATIN SMALL LIGATURE FF\*(C'\fR matches the sequence "ff", which,
+if you're not prepared, might make it look like a hexadecimal constant,
+presenting another potential security issue.  See
+<https://unicode.org/reports/tr36> for a detailed discussion of Unicode
+security issues.
+.PP
+This modifier may be specified to be the default by \f(CW\*(C`use feature
+\&\*(Aqunicode_strings\*(Aq\*(C'\fR, \f(CW\*(C`use locale \*(Aq:not_characters\*(Aq\*(C'\fR, or
+\&\f(CW\*(C`use v5.12\*(C'\fR (or higher),
+but see "Which character set modifier is in effect?".
+.IX Xref " u"
+.PP
+/d
+.IX Subsection "/d"
+.PP
+\&\fBIMPORTANT:\fR Because of the unpredictable behaviors this
+modifier causes, only use it to maintain weird backward compatibilities.
+Use the
+\&\f(CW\*(C`unicode_strings\*(C'\fR
+feature
+in new code to avoid inadvertently enabling this modifier by default.
+.PP
+What does this modifier do? It "Depends"!
+.PP
+This modifier means to use platform-native matching rules
+except when there is cause to use Unicode rules instead, as follows:
+.IP 1. 4
+the target string's UTF8 flag
+(see below) is set; or
+.IP 2. 4
+the pattern's UTF8 flag
+(see below) is set; or
+.IP 3. 4
+the pattern explicitly mentions a code point that is above 255 (say by
+\&\f(CW\*(C`\ex{100}\*(C'\fR); or
+.IP 4. 4
+the pattern uses a Unicode name (\f(CW\*(C`\eN{...}\*(C'\fR);  or
+.IP 5. 4
+the pattern uses a Unicode property (\f(CW\*(C`\ep{...}\*(C'\fR or \f(CW\*(C`\eP{...}\*(C'\fR); or
+.IP 6. 4
+the pattern uses a Unicode break (\f(CW\*(C`\eb{...}\*(C'\fR or \f(CW\*(C`\eB{...}\*(C'\fR); or
+.IP 7. 4
+the pattern uses \f(CW"(?[ ])"\fR
+.IP 8. 4
+the pattern uses \f(CW\*(C`(*script_run: ...)\*(C'\fR
+.PP
+Regarding the "UTF8 flag" references above: normally Perl applications
+shouldn't think about that flag. It's part of Perl's internals,
+so it can change whenever Perl wants. \f(CW\*(C`/d\*(C'\fR may thus cause unpredictable
+results. See "The "Unicode Bug"" in perlunicode. This bug
+has become rather infamous, leading to yet other (without swearing) names
+for this modifier like "Dicey" and "Dodgy".
+.PP
+Here are some examples of how that works on an ASCII platform:
+.PP
+.Vb 3
+\& $str =  "\exDF";        #
+\& utf8::downgrade($str); # $str is not UTF8\-flagged.
+\& $str =~ /^\ew/;         # No match, since no UTF8 flag.
+\&
+\& $str .= "\ex{0e0b}";    # Now $str is UTF8\-flagged.
+\& $str =~ /^\ew/;         # Match! $str is now UTF8\-flagged.
+\& chop $str;
+\& $str =~ /^\ew/;         # Still a match! $str retains its UTF8 flag.
+.Ve
+.PP
+Under Perl's default configuration this modifier is automatically
+selected by default when none of the others are, so yet another name
+for it (unfortunately) is "Default".
+.PP
+Whenever you can, use the
+\&\f(CW\*(C`unicode_strings\*(C'\fR
+feature to cause \f(CW\*(C`/u\*(C'\fR to be the default instead.
+.IX Xref " u"
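+.PP
+The effect can be sketched as follows, assuming an ASCII platform and no
+locale pragma in scope (the variable names are illustrative only).  The
+same non-UTF8 string is matched first under the default \f(CW\*(C`/d\*(C'\fR, and then
+inside a block that enables the feature:
+.PP
+.Vb 2
+\& use strict;
+\& use warnings;
+\&
+\& my $str = "\exDF";                      # LATIN SMALL LETTER SHARP S
+\& utf8::downgrade($str);                 # make sure the UTF8 flag is off
+\&
+\& my $under_d = $str =~ /^\ew/ ? 1 : 0;   # /d: native rules, no match
+\& my $under_u;
+\& {
+\&     use feature \*(Aqunicode_strings\*(Aq;  # patterns compiled here get /u
+\&     $under_u = $str =~ /^\ew/ ? 1 : 0;  # Unicode rules: match
+\& }
+\& print "/d: $under_d, /u: $under_u\en";  # prints "/d: 0, /u: 1"
+.Ve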
+.PP
+/a (and /aa)
+.IX Subsection "/a (and /aa)"
+.PP
+This modifier stands for ASCII-restrict (or ASCII-safe).  This modifier
+may be doubled-up to increase its effect.
+.PP
+When it appears singly, it causes the sequences \f(CW\*(C`\ed\*(C'\fR, \f(CW\*(C`\es\*(C'\fR, \f(CW\*(C`\ew\*(C'\fR, and
+the Posix character classes to match only in the ASCII range.  They thus
+revert to their pre\-5.6, pre-Unicode meanings.  Under \f(CW\*(C`/a\*(C'\fR,  \f(CW\*(C`\ed\*(C'\fR
+always means precisely the digits \f(CW"0"\fR to \f(CW"9"\fR; \f(CW\*(C`\es\*(C'\fR means the five
+characters \f(CW\*(C`[ \ef\en\er\et]\*(C'\fR, and starting in Perl v5.18, the vertical tab;
+\&\f(CW\*(C`\ew\*(C'\fR means the 63 characters
+\&\f(CW\*(C`[A\-Za\-z0\-9_]\*(C'\fR; and likewise, all the Posix classes such as
+\&\f(CW\*(C`[[:print:]]\*(C'\fR match only the appropriate ASCII-range characters.
+.PP
+This modifier is useful for people who only incidentally use Unicode,
+and who do not wish to be burdened with its complexities and security
+concerns.
+.PP
+With \f(CW\*(C`/a\*(C'\fR, one can write \f(CW\*(C`\ed\*(C'\fR with confidence that it will only match
+ASCII characters, and should the need arise to match beyond ASCII, you
+can instead use \f(CW\*(C`\ep{Digit}\*(C'\fR (or \f(CW\*(C`\ep{Word}\*(C'\fR for \f(CW\*(C`\ew\*(C'\fR).  There are
+similar \f(CW\*(C`\ep{...}\*(C'\fR constructs that can match beyond ASCII both white
+space (see "Whitespace" in perlrecharclass), and Posix classes (see
+"POSIX Character Classes" in perlrecharclass).  Thus, this modifier
+doesn't mean you can't use Unicode, it means that to get Unicode
+matching you must explicitly use a construct (\f(CW\*(C`\ep{}\*(C'\fR, \f(CW\*(C`\eP{}\*(C'\fR) that
+signals Unicode.
+.PP
+As you would expect, this modifier causes, for example, \f(CW\*(C`\eD\*(C'\fR to mean
+the same thing as \f(CW\*(C`[^0\-9]\*(C'\fR; in fact, all non-ASCII characters match
+\&\f(CW\*(C`\eD\*(C'\fR, \f(CW\*(C`\eS\*(C'\fR, and \f(CW\*(C`\eW\*(C'\fR.  \f(CW\*(C`\eb\*(C'\fR still means to match at the boundary
+between \f(CW\*(C`\ew\*(C'\fR and \f(CW\*(C`\eW\*(C'\fR, using the \f(CW\*(C`/a\*(C'\fR definitions of them (similarly
+for \f(CW\*(C`\eB\*(C'\fR).
+.PP
+Otherwise, \f(CW\*(C`/a\*(C'\fR behaves like the \f(CW\*(C`/u\*(C'\fR modifier, in that
+case-insensitive matching uses Unicode rules; for example, "k" will
+match the Unicode \f(CW\*(C`\eN{KELVIN SIGN}\*(C'\fR under \f(CW\*(C`/i\*(C'\fR matching, and code
+points in the Latin1 range above ASCII follow Unicode rules for
+case-insensitive matching.
+.PP
+To forbid ASCII/non\-ASCII matches (like "k" with \f(CW\*(C`\eN{KELVIN SIGN}\*(C'\fR),
+specify the \f(CW"a"\fR twice, for example \f(CW\*(C`/aai\*(C'\fR or \f(CW\*(C`/aia\*(C'\fR.  (The first
+occurrence of \f(CW"a"\fR restricts the \f(CW\*(C`\ed\*(C'\fR, \fIetc\fR., and the second occurrence
+adds the \f(CW\*(C`/i\*(C'\fR restrictions.)  But, note that code points outside the
+ASCII range will use Unicode rules for \f(CW\*(C`/i\*(C'\fR matching, so the modifier
+doesn't really restrict things to just ASCII; it just forbids the
+intermixing of ASCII and non-ASCII.
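+.PP
+A small sketch may make the distinction concrete (the variable names are
+invented for illustration).  It matches \f(CW\*(C`KELVIN SIGN\*(C'\fR against "k" with one
+and with two \f(CW"a"\fR modifiers, and then checks a pair of non-ASCII
+characters under \f(CW\*(C`/aai\*(C'\fR:
+.PP
+.Vb 2
+\& use strict;
+\& use warnings;
+\&
+\& my $kelvin = "\ex{212A}";                          # KELVIN SIGN
+\& my $single = $kelvin =~ /^k$/ai  ? 1 : 0;          # 1: Unicode folding
+\& my $double = $kelvin =~ /^k$/aai ? 1 : 0;          # 0: ASCII/non\-ASCII mix
+\& # GREEK CAPITAL LETTER SIGMA against GREEK SMALL LETTER SIGMA:
+\& my $greek  = "\ex{3A3}" =~ /^\ex{3C3}$/aai ? 1 : 0;  # 1: both are non\-ASCII
+\& print "$single $double $greek\en";                 # prints "1 0 1"
+.Ve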
+.PP
+To summarize, this modifier provides protection for applications that
+don't wish to be exposed to all of Unicode.  Specifying it twice
+gives added protection.
+.PP
+This modifier may be specified to be the default by \f(CW\*(C`use re \*(Aq/a\*(Aq\*(C'\fR
+or \f(CW\*(C`use re \*(Aq/aa\*(Aq\*(C'\fR.  If you do so, you may actually have occasion to use
+the \f(CW\*(C`/u\*(C'\fR modifier explicitly if there are a few regular expressions
+where you do want full Unicode rules (but even here, it's best if
+everything were under feature \f(CW"unicode_strings"\fR, along with the
+\&\f(CW\*(C`use re \*(Aq/aa\*(Aq\*(C'\fR).  Also see "Which character set modifier is in
+effect?".
+.IX Xref " a aa"
+.PP
+Which character set modifier is in effect?
+.IX Subsection "Which character set modifier is in effect?"
+.PP
+Which of these modifiers is in effect at any given point in a regular
+expression depends on a fairly complex set of interactions.  These have
+been designed so that in general you don't have to worry about it, but
+this section gives the gory details.  As
+explained below in "Extended Patterns" it is possible to explicitly
+specify modifiers that apply only to portions of a regular expression.
+The innermost always has priority over any outer ones, and one applying
+to the whole expression has priority over any of the default settings that are
+described in the remainder of this section.
+.PP
+The \f(CW\*(C`use re \*(Aq/foo\*(Aq\*(C'\fR pragma can be used to set
+default modifiers (including these) for regular expressions compiled
+within its scope.  This pragma has precedence over the other pragmas
+listed below that also change the defaults.
+.PP
+Otherwise, \f(CW\*(C`use locale\*(C'\fR sets the default modifier to \f(CW\*(C`/l\*(C'\fR;
+and \f(CW\*(C`use feature \*(Aqunicode_strings\*(Aq\*(C'\fR, or
+\&\f(CW\*(C`use v5.12\*(C'\fR (or higher) set the default to
+\&\f(CW\*(C`/u\*(C'\fR when not in the same scope as either \f(CW\*(C`use locale\*(C'\fR
+or \f(CW\*(C`use bytes\*(C'\fR.
+(\f(CW\*(C`use locale \*(Aq:not_characters\*(Aq\*(C'\fR also
+sets the default to \f(CW\*(C`/u\*(C'\fR, overriding any plain \f(CW\*(C`use locale\*(C'\fR.)
+Unlike the mechanisms mentioned above, these
+affect operations besides regular expressions pattern matching, and so
+give more consistent results with other operators, including using
+\&\f(CW\*(C`\eU\*(C'\fR, \f(CW\*(C`\el\*(C'\fR, \fIetc\fR. in substitution replacements.
+.PP
+If none of the above apply, for backwards compatibility reasons, the
+\&\f(CW\*(C`/d\*(C'\fR modifier is the one in effect by default.  As this can lead to
+unexpected results, it is best to specify which other rule set should be
+used.
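+.PP
+One way to see which modifier was chosen is to stringify a compiled
+pattern, since a \f(CW\*(C`qr//\*(C'\fR object stringifies with its modifiers
+embedded.  The following is only a sketch; it assumes a script with no
+other pragmas in effect, and the array name is made up:
+.PP
+.Vb 2
+\& use strict;
+\& use warnings;
+\&
+\& my @seen;
+\& push @seen, "" . qr/\ew/;       # (?^:\ew)    /d, the fallback
+\& {
+\&     use feature \*(Aqunicode_strings\*(Aq;
+\&     push @seen, "" . qr/\ew/;   # (?^u:\ew)
+\&     use re \*(Aq/aa\*(Aq;              # "use re" outranks the feature
+\&     push @seen, "" . qr/\ew/;   # (?^aa:\ew)
+\&     push @seen, "" . qr/\ew/u;  # (?^u:\ew)   an explicit modifier wins
+\& }
+\& print "$_\en" for @seen;
+.Ve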
+.PP
+Character set modifier behavior prior to Perl 5.14
+.IX Subsection "Character set modifier behavior prior to Perl 5.14"
+.PP
+Prior to 5.14, there were no explicit modifiers, but \f(CW\*(C`/l\*(C'\fR was implied
+for regexes compiled within the scope of \f(CW\*(C`use locale\*(C'\fR, and \f(CW\*(C`/d\*(C'\fR was
+implied otherwise.  However, interpolating a regex into a larger regex
+would ignore the original compilation in favor of whatever was in effect
+at the time of the second compilation.  There were a number of
+inconsistencies (bugs) with the \f(CW\*(C`/d\*(C'\fR modifier, where Unicode rules
+would be used when inappropriate, and vice versa.  \f(CW\*(C`\ep{}\*(C'\fR did not imply
+Unicode rules, and neither did all occurrences of \f(CW\*(C`\eN{}\*(C'\fR, until 5.12.
+.SS "Regular Expressions"
+.IX Subsection "Regular Expressions"
+\fIQuantifiers\fR
+.IX Subsection "Quantifiers"
+.PP
+Quantifiers are used when a particular portion of a pattern needs to
+match a certain number (or numbers) of times.  If there isn't a
+quantifier the number of times to match is exactly one.  The following
+standard quantifiers are recognized:
+.IX Xref "metacharacter quantifier * + ? {n} {n,} {n,m}"
+.PP
+.Vb 7
+\&    *           Match 0 or more times
+\&    +           Match 1 or more times
+\&    ?           Match 1 or 0 times
+\&    {n}         Match exactly n times
+\&    {n,}        Match at least n times
+\&    {,n}        Match at most n times
+\&    {n,m}       Match at least n but not more than m times
+.Ve
+.PP
+(If a non-escaped curly bracket occurs in a context other than one of
+the quantifiers listed above, where it does not form part of a
+backslashed sequence like \f(CW\*(C`\ex{...}\*(C'\fR, it is either a fatal syntax error,
+or treated as a regular character, generally with a deprecation warning
+raised.  To escape it, you can precede it with a backslash (\f(CW"\e{"\fR) or
+enclose it within square brackets (\f(CW"[{]"\fR).
+This change will allow for future syntax extensions (like making the
+lower bound of a quantifier optional), and better error checking of
+quantifiers).
+.PP
+The \f(CW"*"\fR quantifier is equivalent to \f(CW\*(C`{0,}\*(C'\fR, the \f(CW"+"\fR
+quantifier to \f(CW\*(C`{1,}\*(C'\fR, and the \f(CW"?"\fR quantifier to \f(CW\*(C`{0,1}\*(C'\fR.  \fIn\fR and \fIm\fR are limited
+to non-negative integral values less than a preset limit defined when perl is built.
+This is usually 65534 on the most common platforms.  The actual limit can
+be seen in the error message generated by code such as this:
+.PP
+.Vb 1
+\&    $_ **= $_ , / {$_} / for 2 .. 42;
+.Ve
+.PP
+By default, a quantified subpattern is "greedy", that is, it will match as
+many times as possible (given a particular starting location) while still
+allowing the rest of the pattern to match.  If you want it to match the
+minimum number of times possible, follow the quantifier with a \f(CW"?"\fR.  Note
+that the meanings don't change, just the "greediness":
+.IX Xref "metacharacter greedy greediness ? *? +? ?? {n}? {n,}? {,n}? {n,m}?"
+.PP
+.Vb 7
+\&    *?        Match 0 or more times, not greedily
+\&    +?        Match 1 or more times, not greedily
+\&    ??        Match 0 or 1 time, not greedily
+\&    {n}?      Match exactly n times, not greedily (redundant)
+\&    {n,}?     Match at least n times, not greedily
+\&    {,n}?     Match at most n times, not greedily
+\&    {n,m}?    Match at least n but not more than m times, not greedily
+.Ve
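+.PP
+To illustrate, here is a short sketch (the sample string and variable
+names are arbitrary) that captures with a greedy and then a non-greedy
+quantifier from the same starting point:
+.PP
+.Vb 2
+\& use strict;
+\& use warnings;
+\&
+\& my $html     = "<b>bold</b>";
+\& my ($greedy) = $html =~ /<(.+)>/;   # "b>bold</b": as much as possible
+\& my ($lazy)   = $html =~ /<(.+?)>/;  # "b": as little as possible
+\& print "$greedy | $lazy\en";         # prints "b>bold</b | b"
+.Ve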
+.PP
+Normally when a quantified subpattern does not allow the rest of the
+overall pattern to match, Perl will backtrack. However, this behaviour is
+sometimes undesirable. Thus Perl provides the "possessive" quantifier form
+as well.
+.PP
+.Vb 7
+\& *+     Match 0 or more times and give nothing back
+\& ++     Match 1 or more times and give nothing back
+\& ?+     Match 0 or 1 time and give nothing back
+\& {n}+   Match exactly n times and give nothing back (redundant)
+\& {n,}+  Match at least n times and give nothing back
+\& {,n}+  Match at most n times and give nothing back
+\& {n,m}+ Match at least n but not more than m times and give nothing back
+.Ve
+.PP
+For instance,
+.PP
+.Vb 1
+\&   \*(Aqaaaa\*(Aq =~ /a++a/
+.Ve
+.PP
+will never match, as the \f(CW\*(C`a++\*(C'\fR will gobble up all the \f(CW"a"\fR's in the
+string and won't leave any for the remaining part of the pattern. This
+feature can be extremely useful to give perl hints about where it
+shouldn't backtrack. For instance, the typical "match a double-quoted
+string" problem can be most efficiently performed when written as:
+.PP
+.Vb 1
+\&   /"(?:[^"\e\e]++|\e\e.)*+"/
+.Ve
+.PP
+as we know that if the final quote does not match, backtracking will not
+help. See the independent subexpression
+\&\f(CW"(?>\fR\f(CIpattern\fR\f(CW)"\fR for more details;
+possessive quantifiers are just syntactic sugar for that construct. For
+instance the above example could also be written as follows:
+.PP
+.Vb 1
+\&   /"(?>(?:(?>[^"\e\e]+)|\e\e.)*)"/
+.Ve
+.PP
+Note that the possessive quantifier modifier cannot be combined
+with the non-greedy modifier. This is because it would make no sense.
+Consider the following equivalency table:
+.PP
+.Vb 5
+\&    Illegal         Legal
+\&    \-\-\-\-\-\-\-\-\-\-\-\-    \-\-\-\-\-\-
+\&    X??+            X{0}
+\&    X+?+            X{1}
+\&    X{min,max}?+    X{min}
+.Ve
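+.PP
+The behaviour described above can be tried out with a sketch like the
+following.  The sample text and variable names are illustrative, and
+the quoted-string pattern is the one shown earlier with a capture group
+wrapped around it:
+.PP
+.Vb 2
+\& use strict;
+\& use warnings;
+\&
+\& my $plain      = "aaaa" =~ /a+a/  ? 1 : 0;  # 1: a+ gives one "a" back
+\& my $possessive = "aaaa" =~ /a++a/ ? 1 : 0;  # 0: a++ gives nothing back
+\&
+\& my $text     = \*(Aqsay "hi \e"there\e"" now\*(Aq;
+\& my ($quoted) = $text =~ /("(?:[^"\e\e]++|\e\e.)*+")/;
+\& print "$plain $possessive $quoted\en";  # prints: 1 0 "hi \e"there\e""
+.Ve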
+.PP
+\fIEscape sequences\fR
+.IX Subsection "Escape sequences"
+.PP
+Because patterns are processed as double-quoted strings, the following
+also work:
+.PP
+.Vb 10
+\& \et          tab                   (HT, TAB)
+\& \en          newline               (LF, NL)
+\& \er          return                (CR)
+\& \ef          form feed             (FF)
+\& \ea          alarm (bell)          (BEL)
+\& \ee          escape (think troff)  (ESC)
+\& \ecK         control char          (example: VT)
+\& \ex{}, \ex00  character whose ordinal is the given hexadecimal number
+\& \eN{name}    named Unicode character or character sequence
+\& \eN{U+263D}  Unicode character     (example: FIRST QUARTER MOON)
+\& \eo{}, \e000  character whose ordinal is the given octal number
+\& \el          lowercase next char (think vi)
+\& \eu          uppercase next char (think vi)
+\& \eL          lowercase until \eE (think vi)
+\& \eU          uppercase until \eE (think vi)
+\& \eQ          quote (disable) pattern metacharacters until \eE
+\& \eE          end either case modification or quoted section (think vi)
+.Ve
+.PP
+Details are in "Quote and Quote-like Operators" in perlop.
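+.PP
+As a brief sketch of \f(CW\*(C`\eQ\*(C'\fR and \f(CW\*(C`\eE\*(C'\fR in use (the variable names and
+sample strings are invented), the following interpolates the same
+string into a pattern with and without quoting:
+.PP
+.Vb 2
+\& use strict;
+\& use warnings;
+\&
+\& my $literal = "a.b*";
+\& my $raw     = "axbbb" =~ /^$literal$/       ? 1 : 0;  # 1: "." and "*" active
+\& my $quoted  = "axbbb" =~ /^\eQ$literal\eE$/ ? 1 : 0;  # 0: taken literally
+\& my $exact   = "a.b*"  =~ /^\eQ$literal\eE$/ ? 1 : 0;  # 1
+\& print "$raw $quoted $exact\en";                      # prints "1 0 1"
+.Ve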
+.PP
+\fICharacter Classes and other Special Escapes\fR
+.IX Subsection "Character Classes and other Special Escapes"
+.PP
+In addition, Perl defines the following:
+.IX Xref "\\g \\k \\K backreference"
+.PP
+.Vb 10
+\& Sequence   Note    Description
+\&  [...]     [1]  Match a character according to the rules of the
+\&                   bracketed character class defined by the "...".
+\&                   Example: [a\-z] matches "a" or "b" or "c" ... or "z"
+\&  [[:...:]] [2]  Match a character according to the rules of the POSIX
+\&                   character class "..." within the outer bracketed
+\&                   character class.  Example: [[:upper:]] matches any
+\&                   uppercase character.
+\&  (?[...])  [8]  Extended bracketed character class
+\&  \ew        [3]  Match a "word" character (alphanumeric plus "_", plus
+\&                   other connector punctuation chars plus Unicode
+\&                   marks)
+\&  \eW        [3]  Match a non\-"word" character
+\&  \es        [3]  Match a whitespace character
+\&  \eS        [3]  Match a non\-whitespace character
+\&  \ed        [3]  Match a decimal digit character
+\&  \eD        [3]  Match a non\-digit character
+\&  \epP       [3]  Match P, named property.  Use \ep{Prop} for longer names
+\&  \ePP       [3]  Match non\-P
+\&  \eX        [4]  Match Unicode "eXtended grapheme cluster"
+\&  \e1        [5]  Backreference to a specific capture group or buffer.
+\&                   \*(Aq1\*(Aq may actually be any positive integer.
+\&  \eg1       [5]  Backreference to a specific or previous group,
+\&  \eg{\-1}    [5]  The number may be negative indicating a relative
+\&                   previous group and may optionally be wrapped in
+\&                   curly brackets for safer parsing.
+\&  \eg{name}  [5]  Named backreference
+\&  \ek<name>  [5]  Named backreference
+\&  \ek\*(Aqname\*(Aq  [5]  Named backreference
+\&  \ek{name}  [5]  Named backreference
+\&  \eK        [6]  Keep the stuff left of the \eK, don\*(Aqt include it in $&
+\&  \eN        [7]  Any character but \en.  Not affected by /s modifier
+\&  \ev        [3]  Vertical whitespace
+\&  \eV        [3]  Not vertical whitespace
+\&  \eh        [3]  Horizontal whitespace
+\&  \eH        [3]  Not horizontal whitespace
+\&  \eR        [4]  Linebreak
+.Ve
+.IP [1] 4
+.IX Item "[1]"
+See "Bracketed Character Classes" in perlrecharclass for details.
+.IP [2] 4
+.IX Item "[2]"
+See "POSIX Character Classes" in perlrecharclass for details.
+.IP [3] 4
+.IX Item "[3]"
+See "Unicode Character Properties" in perlunicode for details.
+.IP [4] 4
+.IX Item "[4]"
+See "Misc" in perlrebackslash for details.
+.IP [5] 4
+.IX Item "[5]"
+See "Capture groups" below for details.
+.IP [6] 4
+.IX Item "[6]"
+See "Extended Patterns" below for details.
+.IP [7] 4
+.IX Item "[7]"
+Note that \f(CW\*(C`\eN\*(C'\fR has two meanings.  When of the form \f(CW\*(C`\eN{\fR\f(CINAME\fR\f(CW}\*(C'\fR, it
+matches the character or character sequence whose name is \fINAME\fR; and
+similarly
+when of the form \f(CW\*(C`\eN{U+\fR\f(CIhex\fR\f(CW}\*(C'\fR, it matches the character whose Unicode
+code point is \fIhex\fR.  Otherwise it matches any character but \f(CW\*(C`\en\*(C'\fR.
+.IP [8] 4
+.IX Item "[8]"
+See "Extended Bracketed Character Classes" in perlrecharclass for details.
+.PP
+\fIAssertions\fR
+.IX Subsection "Assertions"
+.PP
+Besides \f(CW"^"\fR and \f(CW"$"\fR, Perl defines the following
+zero-width assertions:
+.IX Xref "zero-width assertion assertion regex, zero-width assertion regexp, zero-width assertion regular expression, zero-width assertion \\b \\B \\A \\Z \\z \\G"
+.PP
+.Vb 9
+\& \eb{}   Match at Unicode boundary of specified type
+\& \eB{}   Match where corresponding \eb{} doesn\*(Aqt match
+\& \eb     Match a \ew\eW or \eW\ew boundary
+\& \eB     Match except at a \ew\eW or \eW\ew boundary
+\& \eA     Match only at beginning of string
+\& \eZ     Match only at end of string, or before newline at the end
+\& \ez     Match only at end of string
+\& \eG     Match only at pos() (e.g. at the end\-of\-match position
+\&        of prior m//g)
+.Ve
+.PP
+A Unicode boundary (\f(CW\*(C`\eb{}\*(C'\fR), available starting in v5.22, is a spot
+between two characters, or before the first character in the string, or
+after the final character in the string where certain criteria defined
+by Unicode are met.  See "\eb{}, \eb, \eB{}, \eB" in perlrebackslash for
+details.
+.PP
+A word boundary (\f(CW\*(C`\eb\*(C'\fR) is a spot between two characters
+that has a \f(CW\*(C`\ew\*(C'\fR on one side of it and a \f(CW\*(C`\eW\*(C'\fR on the other side
+of it (in either order), counting the imaginary characters off the
+beginning and end of the string as matching a \f(CW\*(C`\eW\*(C'\fR.  (Within
+character classes \f(CW\*(C`\eb\*(C'\fR represents backspace rather than a word
+boundary, just as it normally does in any double-quoted string.)
+The \f(CW\*(C`\eA\*(C'\fR and \f(CW\*(C`\eZ\*(C'\fR are just like \f(CW"^"\fR and \f(CW"$"\fR, except that they
+won't match multiple times when the \f(CW\*(C`/m\*(C'\fR modifier is used, while
+\&\f(CW"^"\fR and \f(CW"$"\fR will match at every internal line boundary.  To match
+the actual end of the string and not ignore an optional trailing
+newline, use \f(CW\*(C`\ez\*(C'\fR.
+.IX Xref "\\b \\A \\Z \\z m"
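+.PP
+The differences between these anchors can be seen in a sketch such as
+this one (the sample string and variable names are arbitrary):
+.PP
+.Vb 2
+\& use strict;
+\& use warnings;
+\&
+\& my $text    = "foo\enbar\en";
+\& my @lines   = $text =~ /^(\ew+)$/mg;       # ("foo", "bar")
+\& my $a_multi = $text =~ /\eAbar/m ? 1 : 0;  # 0: \eA ignores /m
+\& my $caret_m = $text =~ /^bar/m   ? 1 : 0;  # 1: ^ matches after a newline
+\& my $big_z   = $text =~ /bar\eZ/  ? 1 : 0;  # 1: trailing newline allowed
+\& my $small_z = $text =~ /bar\ez/  ? 1 : 0;  # 0: must be the very end
+\& print "@lines $a_multi $caret_m $big_z $small_z\en";
+.Ve
+.PP
+This should print \f(CW\*(C`foo bar 0 1 1 0\*(C'\fR.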
+.PP
+The \f(CW\*(C`\eG\*(C'\fR assertion can be used to chain global matches (using
+\&\f(CW\*(C`m//g\*(C'\fR), as described in "Regexp Quote-Like Operators" in perlop.
+It is also useful when writing \f(CW\*(C`lex\*(C'\fR\-like scanners, when you have
+several patterns that you want to match against consecutive substrings
+of your string; see the previous reference.  The actual location
+where \f(CW\*(C`\eG\*(C'\fR will match can also be influenced by using \f(CWpos()\fR as
+an lvalue: see "pos" in perlfunc. Note that the rule for zero-length
+matches (see "Repeated Patterns Matching a Zero-length Substring")
+is modified somewhat, in that contents to the left of \f(CW\*(C`\eG\*(C'\fR are
+not counted when determining the length of the match. Thus the following
+will not match forever:
+.IX Xref "\\G"
+.PP
+.Vb 5
+\&     my $string = \*(AqABC\*(Aq;
+\&     pos($string) = 1;
+\&     while ($string =~ /(.\eG)/g) {
+\&         print $1;
+\&     }
+.Ve
+.PP
+It will print 'A' and then terminate, as it considers the match to
+be zero-width, and thus will not match at the same position twice in a
+row.
+.PP
+It is worth noting that \f(CW\*(C`\eG\*(C'\fR improperly used can result in an infinite
+loop. Take care when using patterns that include \f(CW\*(C`\eG\*(C'\fR in an alternation.
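+.PP
+A \f(CW\*(C`lex\*(C'\fR\-like scanner of the kind mentioned above might be sketched as
+follows.  The token names and the tiny input language are invented for
+illustration; the \f(CW\*(C`/c\*(C'\fR modifier (see perlop) keeps \f(CWpos()\fR from being
+reset when one of the alternatives fails to match:
+.PP
+.Vb 2
+\& use strict;
+\& use warnings;
+\&
+\& my $src = "x = 42";
+\& my @tokens;
+\& for ($src) {                  # alias $_ so pos() persists between matches
+\&     while (1) {
+\&         next if /\eG\es+/gc;    # skip blanks
+\&         if (/\eG([A\-Za\-z_]\ew*)/gc) { push @tokens, "IDENT($1)"; next }
+\&         if (/\eG(\ed+)/gc)         { push @tokens, "NUM($1)";   next }
+\&         if (/\eG(=)/gc)           { push @tokens, "OP($1)";    next }
+\&         last;                 # nothing matched at pos(): stop
+\&     }
+\& }
+\& print "@tokens\en";           # prints "IDENT(x) OP(=) NUM(42)"
+.Ve
+.PP
+Every successful branch consumes at least one character, so the loop
+cannot get stuck at the same position.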
+.PP
+Note also that \f(CW\*(C`s///\*(C'\fR will refuse to overwrite part of a substitution
+that has already been replaced; so for example this will stop after the
+first iteration, rather than iterating its way backwards through the
+string:
+.PP
+.Vb 4
+\&    $_ = "123456789";
+\&    pos = 6;
+\&    s/.(?=.\eG)/X/g;
+\&    print;      # prints 1234X6789, not XXXXX6789
+.Ve
+.PP
+\fICapture groups\fR
+.IX Subsection "Capture groups"
+.PP
+The grouping construct \f(CW\*(C`( ... )\*(C'\fR creates capture groups (also referred to as
+capture buffers). To refer to the current contents of a group later on, within
+the same pattern, use \f(CW\*(C`\eg1\*(C'\fR (or \f(CW\*(C`\eg{1}\*(C'\fR) for the first, \f(CW\*(C`\eg2\*(C'\fR (or \f(CW\*(C`\eg{2}\*(C'\fR)
+for the second, and so on.
+This is called a \fIbackreference\fR.
+There is no limit to the number of captured substrings that you may use.
+Groups are numbered with the leftmost open parenthesis being number 1, \fIetc\fR.  If
+a group did not match, the associated backreference won't match either. (This
+can happen if the group is optional, or in a different branch of an
+alternation.)
+You can omit the \f(CW"g"\fR, and write \f(CW"\e1"\fR, \fIetc\fR, but there are some issues with
+this form, described below.
+.IX Xref "regex, capture buffer regexp, capture buffer regex, capture group regexp, capture group regular expression, capture buffer backreference regular expression, capture group backreference \\g{1} \\g{-1} \\g{name} relative backreference named backreference named capture buffer regular expression, named capture buffer named capture group regular expression, named capture group %+ $+{name} \\k<name>"
+.PP
+You can also refer to capture groups relatively, by using a negative number, so
+that \f(CW\*(C`\eg\-1\*(C'\fR and \f(CW\*(C`\eg{\-1}\*(C'\fR both refer to the immediately preceding capture
+group, and \f(CW\*(C`\eg\-2\*(C'\fR and \f(CW\*(C`\eg{\-2}\*(C'\fR both refer to the group before it.  For
+example:
+.PP
+.Vb 8
+\&        /
+\&         (Y)            # group 1
+\&         (              # group 2
+\&            (X)         # group 3
+\&            \eg{\-1}      # backref to group 3
+\&            \eg{\-3}      # backref to group 1
+\&         )
+\&        /x
+.Ve
+.PP
+would match the same as \f(CW\*(C`/(Y) ( (X) \eg3 \eg1 )/x\*(C'\fR.  This allows you to
+interpolate regexes into larger regexes and not have to worry about the
+capture groups being renumbered.
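+.PP
+For instance, in the following sketch (the variable names are
+illustrative only) a "doubled character" pattern is written once with a
+relative and once with an absolute backreference, and each is then
+interpolated after another capture group:
+.PP
+.Vb 2
+\& use strict;
+\& use warnings;
+\&
+\& my $relative = qr/(\ew)\eg{\-1}/;  # refers to the group just before it
+\& my $absolute = qr/(\ew)\eg{1}/;   # refers to group 1 of the whole regex
+\&
+\& # Once interpolated, (\ew) is group 2 because (\es) is group 1.
+\& my $rel = "a bb" =~ /(\es)$relative/ ? 1 : 0;  # 1: still means (\ew)
+\& my $abs = "a bb" =~ /(\es)$absolute/ ? 1 : 0;  # 0: now means (\es)
+\& print "$rel $abs\en";                         # prints "1 0"
+.Ve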
+.PP
+You can dispense with numbers altogether and create named capture groups.
+The notation is \f(CW\*(C`(?<\fR\f(CIname\fR\f(CW>...)\*(C'\fR to declare and \f(CW\*(C`\eg{\fR\f(CIname\fR\f(CW}\*(C'\fR to
+reference.  (To be compatible with .NET regular expressions, \f(CW\*(C`\eg{\fR\f(CIname\fR\f(CW}\*(C'\fR may
+also be written as \f(CW\*(C`\ek{\fR\f(CIname\fR\f(CW}\*(C'\fR, \f(CW\*(C`\ek<\fR\f(CIname\fR\f(CW>\*(C'\fR or \f(CW\*(C`\ek\*(Aq\fR\f(CIname\fR\f(CW\*(Aq\*(C'\fR.)
+\&\fIname\fR must not begin with a number, nor contain hyphens.
+When different groups within the same pattern have the same name, any reference
+to that name assumes the leftmost defined group.  Named groups count in
+absolute and relative numbering, and so can also be referred to by those
+numbers.
+(It's possible to do things with named capture groups that would otherwise
+require \f(CW\*(C`(??{})\*(C'\fR.)
+.PP
+Capture group contents are dynamically scoped and available to you outside the
+pattern until the end of the enclosing block or until the next successful
+match in the same scope, whichever comes first.
+See "Compound Statements" in perlsyn and
+"Scoping Rules of Regex Variables" in perlvar for more details.
+.PP
+You can access the contents of a capture group by absolute number (using
+\&\f(CW"$1"\fR instead of \f(CW"\eg1"\fR, \fIetc\fR); or by name via the \f(CW\*(C`%+\*(C'\fR hash,
+using \f(CW"$+{\fR\f(CIname\fR\f(CW}"\fR.
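+.PP
+As a short sketch (the group names and the date string are arbitrary),
+the same captures can be read both by name and by number once the match
+has succeeded:
+.PP
+.Vb 2
+\& use strict;
+\& use warnings;
+\&
+\& my ($year, $month, $day, $first);
+\& if ("2024\-02\-11" =~ /(?<year>\ed{4})\-(?<month>\ed\ed)\-(?<day>\ed\ed)/) {
+\&     ($year, $month, $day) = ($+{year}, $+{month}, $+{day});  # by name
+\&     $first = $1;                       # named groups are numbered too
+\& }
+\& print "$year/$month/$day $first\en";   # prints "2024/02/11 2024"
+.Ve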
+.PP
+Braces are required in referring to named capture groups, but are optional for
+absolute or relative numbered ones.  Braces are safer when creating a regex by
+concatenating smaller strings.  For example if you have \f(CW\*(C`qr/$a$b/\*(C'\fR, and \f(CW$a\fR
+contained \f(CW"\eg1"\fR, and \f(CW$b\fR contained \f(CW"37"\fR, you would get \f(CW\*(C`/\eg137/\*(C'\fR which
+is probably not what you intended.
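+.PP
+The hazard can be sketched like this (all variable names are made up
+for the example).  The unbraced form turns into \f(CW\*(C`\eg137\*(C'\fR, which is
+expected to fail to compile because no such group exists, while the
+braced form keeps the reference separate from the digits that follow:
+.PP
+.Vb 2
+\& use strict;
+\& use warnings;
+\&
+\& my $tail   = "37";
+\& my $bare   = \*(Aq(.)\eg1\*(Aq;    # becomes (.)\eg137 once $tail is appended
+\& my $braced = \*(Aq(.)\eg{1}\*(Aq;  # becomes (.)\eg{1}37
+\&
+\& my $re        = eval { qr/$bare$tail/ };  # expected to die: no group 137
+\& my $bare_ok   = defined $re ? 1 : 0;
+\& my $braced_ok = "aa37" =~ /^$braced$tail\ez/ ? 1 : 0;
+\& print "$bare_ok $braced_ok\en";
+.Ve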
+.PP
+If you use braces, you may also optionally add any number of blank
+(space or tab) characters within but adjacent to the braces, like
+\&\f(CW\*(C`\eg{\ \-1\ }\*(C'\fR, or \f(CW\*(C`\ek{\ \fR\f(CIname\fR\f(CW\ }\*(C'\fR.
+.PP
+The \f(CW\*(C`\eg\*(C'\fR and \f(CW\*(C`\ek\*(C'\fR notations were introduced in Perl 5.10.0.  Prior to that
+there were neither named nor relative numbered capture groups.  Absolute numbered
+groups were referred to using \f(CW\*(C`\e1\*(C'\fR,
+\&\f(CW\*(C`\e2\*(C'\fR, \fIetc\fR., and this notation is still
+accepted (and likely always will be).  But it leads to some ambiguities if
+there are more than 9 capture groups, as \f(CW\*(C`\e10\*(C'\fR could mean either the tenth
+capture group, or the character whose ordinal in octal is 010 (a backspace in
+ASCII).  Perl resolves this ambiguity by interpreting \f(CW\*(C`\e10\*(C'\fR as a backreference
+only if at least 10 left parentheses have opened before it.  Likewise \f(CW\*(C`\e11\*(C'\fR is
+a backreference only if at least 11 left parentheses have opened before it.
+And so on.  \f(CW\*(C`\e1\*(C'\fR through \f(CW\*(C`\e9\*(C'\fR are always interpreted as backreferences.
+There are several examples below that illustrate these perils.  You can avoid
+the ambiguity by always using \f(CW\*(C`\eg{}\*(C'\fR or \f(CW\*(C`\eg\*(C'\fR if you mean capturing groups;
+and for octal constants always using \f(CW\*(C`\eo{}\*(C'\fR, or for \f(CW\*(C`\e077\*(C'\fR and below, using 3
+digits padded with leading zeros, since a leading zero implies an octal
+constant.
+.PP
+The \f(CW\*(C`\e\fR\f(CIdigit\fR\f(CW\*(C'\fR notation also works in certain circumstances outside
+the pattern.  See "Warning on \e1 Instead of \f(CW$1\fR" below for details.
+.PP
+Examples:
+.PP
+.Vb 1
+\&    s/^([^ ]*) *([^ ]*)/$2 $1/;     # swap first two words
+\&
+\&    /(.)\eg1/                        # find first doubled char
+\&         and print "\*(Aq$1\*(Aq is the first doubled character\en";
+\&
+\&    /(?<char>.)\ek<char>/            # ... a different way
+\&         and print "\*(Aq$+{char}\*(Aq is the first doubled character\en";
+\&
+\&    /(?\*(Aqchar\*(Aq.)\eg1/                 # ... mix and match
+\&         and print "\*(Aq$1\*(Aq is the first doubled character\en";
+\&
+\&    if (/Time: (..):(..):(..)/) {   # parse out values
+\&        $hours = $1;
+\&        $minutes = $2;
+\&        $seconds = $3;
+\&    }
+\&
+\&    /(.)(.)(.)(.)(.)(.)(.)(.)(.)\eg10/   # \eg10 is a backreference
+\&    /(.)(.)(.)(.)(.)(.)(.)(.)(.)\e10/    # \e10 is octal
+\&    /((.)(.)(.)(.)(.)(.)(.)(.)(.))\e10/  # \e10 is a backreference
+\&    /((.)(.)(.)(.)(.)(.)(.)(.)(.))\e010/ # \e010 is octal
+\&
+\&    $a = \*(Aq(.)\e1\*(Aq;        # Creates problems when concatenated.
+\&    $b = \*(Aq(.)\eg{1}\*(Aq;     # Avoids the problems.
+\&    "aa" =~ /${a}/;      # True
+\&    "aa" =~ /${b}/;      # True
+\&    "aa0" =~ /${a}0/;    # False!
+\&    "aa0" =~ /${b}0/;    # True
+\&    "aa\ex08" =~ /${a}0/;  # True!
+\&    "aa\ex08" =~ /${b}0/;  # False
+.Ve
+.PP
+Several special variables also refer back to portions of the previous
+match.  \f(CW$+\fR returns whatever the last bracket match matched.
+\&\f(CW$&\fR returns the entire matched string.  (At one point \f(CW$0\fR did
+also, but now it returns the name of the program.)  \f(CW\*(C`$\`\*(C'\fR returns
+everything before the matched string.  \f(CW\*(C`$\*(Aq\*(C'\fR returns everything
+after the matched string. And \f(CW$^N\fR contains whatever was matched by
+the most-recently closed group (submatch). \f(CW$^N\fR can be used in
+extended patterns (see below), for example to assign a submatch to a
+variable.
+.IX Xref "$+ $^N $& $` $'"
+.PP
+These special variables, like the \f(CW\*(C`%+\*(C'\fR hash and the numbered match variables
+(\f(CW$1\fR, \f(CW$2\fR, \f(CW$3\fR, \fIetc\fR.), are dynamically scoped
+until the end of the enclosing block or until the next successful
+match, whichever comes first.  (See "Compound Statements" in perlsyn.)
+.IX Xref "$+ $^N $& $` $' $1 $2 $3 $4 $5 $6 $7 $8 $9 @{^CAPTURE}"
+.PP
+The \f(CW\*(C`@{^CAPTURE}\*(C'\fR array may be used to access ALL of the capture buffers
+as an array without needing to know how many there are. For instance
+.PP
+.Vb 1
+\&    $string=~/$pattern/ and @captured = @{^CAPTURE};
+.Ve
+.PP
+will place a copy of each capture variable, \f(CW$1\fR, \f(CW$2\fR etc, into the
+\&\f(CW@captured\fR array.
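+.PP
+As a minimal sketch of the above (the date string and variable names are
+only illustrative, and a perl recent enough to provide \f(CW\*(C`@{^CAPTURE}\*(C'\fR is
+assumed), the following copies all three groups in one step:
+.PP
+.Vb 5
+\&    my $string = "2024\-02\-11";
+\&    if ($string =~ /(\ed+)\-(\ed+)\-(\ed+)/) {
+\&        my @captured = @{^CAPTURE};     # same as ($1, $2, $3)
+\&        print scalar(@captured), " groups: @captured\en";
+\&    }
+.Ve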
+.PP
+Be aware that when interpolating a subscript of the \f(CW\*(C`@{^CAPTURE}\*(C'\fR
+array you must use demarcated curly brace notation:
+.PP
+.Vb 1
+\&    print "@{^CAPTURE[0]}";
+.Ve
+.PP
+See "Demarcated variable names using braces" in perldata for more on
+this notation.
+.PP
+\&\fBNOTE\fR: Failed matches in Perl do not reset the match variables,
+which makes it easier to write code that tests for a series of more
+specific cases and remembers the best match.
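+.PP
+A short sketch of this behaviour (the strings are made up for
+illustration):
+.PP
+.Vb 3
+\&    "abc" =~ /(b)/;     # succeeds; $1 is now "b"
+\&    "abc" =~ /(x)/;     # fails; $1 is left untouched
+\&    print "$1\en";       # still prints "b"
+.Ve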
+.PP
+\&\fBWARNING\fR: If your code is to run on Perl 5.16 or earlier,
+beware that once Perl sees that you need one of \f(CW$&\fR, \f(CW\*(C`$\`\*(C'\fR, or
+\&\f(CW\*(C`$\*(Aq\*(C'\fR anywhere in the program, it has to provide them for every
+pattern match.  This may substantially slow your program.
+.PP
+Perl uses the same mechanism to produce \f(CW$1\fR, \f(CW$2\fR, \fIetc\fR, so you also
+pay a price for each pattern that contains capturing parentheses.
+(To avoid this cost while retaining the grouping behaviour, use the
+extended regular expression \f(CW\*(C`(?: ... )\*(C'\fR instead.)  But if you never
+use \f(CW$&\fR, \f(CW\*(C`$\`\*(C'\fR or \f(CW\*(C`$\*(Aq\*(C'\fR, then patterns \fIwithout\fR capturing
+parentheses will not be penalized.  So avoid \f(CW$&\fR, \f(CW\*(C`$\*(Aq\*(C'\fR, and \f(CW\*(C`$\`\*(C'\fR
+if you can, but if you can't (and some algorithms really appreciate
+them), once you've used them once, use them at will, because you've
+already paid the price.
+.IX Xref "$& $` $'"
+.PP
+Perl 5.16 introduced a slightly more efficient mechanism that notes
+separately whether each of \f(CW\*(C`$\`\*(C'\fR, \f(CW$&\fR, and \f(CW\*(C`$\*(Aq\*(C'\fR has been seen, and
+thus may only need to copy part of the string.  Perl 5.20 introduced a
+much more efficient copy-on-write mechanism which eliminates any slowdown.
+.PP
+As another workaround for this problem, Perl 5.10.0 introduced \f(CW\*(C`${^PREMATCH}\*(C'\fR,
+\&\f(CW\*(C`${^MATCH}\*(C'\fR and \f(CW\*(C`${^POSTMATCH}\*(C'\fR, which are equivalent to \f(CW\*(C`$\`\*(C'\fR, \f(CW$&\fR
+and \f(CW\*(C`$\*(Aq\*(C'\fR, \fBexcept\fR that they are only guaranteed to be defined after a
+successful match that was executed with the \f(CW\*(C`/p\*(C'\fR (preserve) modifier.
+The use of these variables incurs no global performance penalty, unlike
+their punctuation character equivalents; the trade-off is that you
+have to tell perl when you want to use them.  As of Perl 5.20, these three
+variables are equivalent to \f(CW\*(C`$\`\*(C'\fR, \f(CW$&\fR and \f(CW\*(C`$\*(Aq\*(C'\fR, and \f(CW\*(C`/p\*(C'\fR is ignored.
+.IX Xref " p p modifier"
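+.PP
+For illustration only, a small sketch of the \f(CW\*(C`/p\*(C'\fR variables (the sample
+string is arbitrary):
+.PP
+.Vb 4
+\&    if ("hello world" =~ /o w/p) {
+\&        # prints "[hell][o w][orld]"
+\&        print "[${^PREMATCH}][${^MATCH}][${^POSTMATCH}]\en";
+\&    }
+.Ve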
+.SS "Quoting metacharacters"
+.IX Subsection "Quoting metacharacters"
+Backslashed metacharacters in Perl are alphanumeric, such as \f(CW\*(C`\eb\*(C'\fR,
+\&\f(CW\*(C`\ew\*(C'\fR, \f(CW\*(C`\en\*(C'\fR.  Unlike some other regular expression languages, there
+are no backslashed symbols that aren't alphanumeric.  So anything
+that looks like \f(CW\*(C`\e\e\*(C'\fR, \f(CW\*(C`\e(\*(C'\fR, \f(CW\*(C`\e)\*(C'\fR, \f(CW\*(C`\e[\*(C'\fR, \f(CW\*(C`\e]\*(C'\fR, \f(CW\*(C`\e{\*(C'\fR, or \f(CW\*(C`\e}\*(C'\fR is
+always
+interpreted as a literal character, not a metacharacter.  This was
+once used in a common idiom to disable or quote the special meanings
+of regular expression metacharacters in a string that you want to
+use for a pattern. Simply quote all non\-"word" characters:
+.PP
+.Vb 1
+\&    $pattern =~ s/(\eW)/\e\e$1/g;
+.Ve
+.PP
+(If \f(CW\*(C`use locale\*(C'\fR is set, then this depends on the current locale.)
+Today it is more common to use the \f(CWquotemeta()\fR
+function or the \f(CW\*(C`\eQ\*(C'\fR metaquoting escape sequence to disable all
+metacharacters' special meanings like this:
+.PP
+.Vb 1
+\&    /$unquoted\eQ$quoted\eE$unquoted/
+.Ve
+.PP
+Beware that if you put literal backslashes (those not inside
+interpolated variables) between \f(CW\*(C`\eQ\*(C'\fR and \f(CW\*(C`\eE\*(C'\fR, double-quotish
+backslash interpolation may lead to confusing results.  If you
+\&\fIneed\fR to use literal backslashes within \f(CW\*(C`\eQ...\eE\*(C'\fR,
+consult "Gory details of parsing quoted constructs" in perlop.
+.PP
+\&\f(CWquotemeta()\fR and \f(CW\*(C`\eQ\*(C'\fR are fully described in "quotemeta" in perlfunc.
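+.PP
+As a hedged illustration (the variable name and test strings below are
+invented for this sketch), compare an interpolated string with and without
+quoting:
+.PP
+.Vb 7
+\&    my $literal = "a.b*c";
+\&    print quotemeta($literal), "\en";      # prints a\e.b\e*c
+\&
+\&    # Unquoted, "." and "*" are metacharacters, so this matches:
+\&    print "raw\en"    if "aXbbc" =~ /^$literal$/;
+\&    # Quoted, only the exact text matches, so this prints nothing:
+\&    print "quoted\en" if "aXbbc" =~ /^\eQ$literal\eE$/;
+.Ve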
+.SS "Extended Patterns"
+.IX Subsection "Extended Patterns"
+Perl also defines a consistent extension syntax for features not
+found in standard tools like \fBawk\fR and
+\&\fBlex\fR.  The syntax for most of these is a
+pair of parentheses with a question mark as the first thing within
+the parentheses.  The character after the question mark indicates
+the extension.
+.PP
+A question mark was chosen for this and for the minimal-matching
+construct because 1) question marks are rare in older regular
+expressions, and 2) whenever you see one, you should stop and
+"question" exactly what is going on.  That's psychology....
+.ie n .IP """(?#\fItext\fR)""" 4
+.el .IP \f(CW(?#\fR\f(CItext\fR\f(CW)\fR 4
+.IX Xref "(?#)"
+.IX Item "(?#text)"
+A comment.  The \fItext\fR is ignored.
+Note that Perl closes
+the comment as soon as it sees a \f(CW")"\fR, so there is no way to put a literal
+\&\f(CW")"\fR in the comment.  The pattern's closing delimiter must be escaped by
+a backslash if it appears in the comment.
+.Sp
+See "/x" for another way to have comments in patterns.
+.Sp
+Note that a comment can go just about anywhere, except in the middle of
+an escape sequence.   Examples:
+.Sp
+.Vb 1
+\& qr/foo(?#comment)bar/  # Matches \*(Aqfoobar\*(Aq
+\&
+\& # The pattern below matches \*(Aqabcd\*(Aq, \*(Aqabccd\*(Aq, or \*(Aqabcccd\*(Aq
+\& qr/abc(?#comment between literal and its quantifier){1,3}d/
+\&
+\& # The pattern below generates a syntax error, because the \*(Aq\ep\*(Aq must
+\& # be followed immediately by a \*(Aq{\*(Aq.
+\& qr/\ep(?#comment between \ep and its property name){Any}/
+\&
+\& # The pattern below generates a syntax error, because the initial
+\& # \*(Aq\e(\*(Aq is a literal opening parenthesis, and so there is nothing
+\& # for the closing \*(Aq)\*(Aq to match
+\& qr/\e(?#the backslash means this isn\*(Aqt a comment)p{Any}/
+\&
+\& # Comments can be used to fold long patterns into multiple lines
+\& qr/First part of a long regex(?#
+\&   )remaining part/
+.Ve
+.ie n .IP """(?adlupimnsx\-imnsx)""" 4
+.el .IP \f(CW(?adlupimnsx\-imnsx)\fR 4
+.IX Item "(?adlupimnsx-imnsx)"
+.PD 0
+.ie n .IP """(?^alupimnsx)""" 4
+.el .IP \f(CW(?^alupimnsx)\fR 4
+.IX Xref "(?) (?^)"
+.IX Item "(?^alupimnsx)"
+.PD
+Zero or more embedded pattern-match modifiers, to be turned on (or
+turned off if preceded by \f(CW"\-"\fR) for the remainder of the pattern or
+the remainder of the enclosing pattern group (if any).
+.Sp
+This is particularly useful for dynamically-generated patterns,
+such as those read in from a
+configuration file, taken from an argument, or specified in a table
+somewhere.  Consider the case where some patterns want to be
+case-sensitive and some do not:  The case-insensitive ones merely need to
+include \f(CW\*(C`(?i)\*(C'\fR at the front of the pattern.  For example:
+.Sp
+.Vb 2
+\&    $pattern = "foobar";
+\&    if ( /$pattern/i ) { }
+\&
+\&    # more flexible:
+\&
+\&    $pattern = "(?i)foobar";
+\&    if ( /$pattern/ ) { }
+.Ve
+.Sp
+These modifiers are restored at the end of the enclosing group. For example,
+.Sp
+.Vb 1
+\&    ( (?i) blah ) \es+ \eg1
+.Ve
+.Sp
+will match \f(CW\*(C`blah\*(C'\fR in any case, some spaces, and an exact (\fIincluding the case\fR!)
+repetition of the previous word, assuming the \f(CW\*(C`/x\*(C'\fR modifier, and no \f(CW\*(C`/i\*(C'\fR
+modifier outside this group.
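+.Sp
+A small sketch of that scoping (the test strings are arbitrary):
+.Sp
+.Vb 5
+\&    for my $s ("BLAH BLAH", "Blah blah") {
+\&        my $ok = $s =~ /^ ( (?i) blah ) \es+ \eg1 $/x;
+\&        print "$s: ", ($ok ? "match" : "no match"), "\en";
+\&    }
+\&    # prints "BLAH BLAH: match", then "Blah blah: no match"
+.Ve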
+.Sp
+These modifiers do not carry over into named subpatterns called in the
+enclosing group. In other words, a pattern such as \f(CW\*(C`((?i)(?&\fR\f(CINAME\fR\f(CW))\*(C'\fR does not
+change the case-sensitivity of the \fINAME\fR pattern.
+.Sp
+A modifier is overridden by later occurrences of this construct in the
+same scope containing the same modifier, so that
+.Sp
+.Vb 1
+\&    /((?im)foo(?\-m)bar)/
+.Ve
+.Sp
+matches all of \f(CW\*(C`foobar\*(C'\fR case insensitively, but uses \f(CW\*(C`/m\*(C'\fR rules for
+only the \f(CW\*(C`foo\*(C'\fR portion.  The \f(CW"a"\fR flag overrides \f(CW\*(C`aa\*(C'\fR as well;
+likewise \f(CW\*(C`aa\*(C'\fR overrides \f(CW"a"\fR.  The same goes for \f(CW"x"\fR and \f(CW\*(C`xx\*(C'\fR.
+Hence, in
+.Sp
+.Vb 1
+\&    /(?\-x)foo/xx
+.Ve
+.Sp
+both \f(CW\*(C`/x\*(C'\fR and \f(CW\*(C`/xx\*(C'\fR are turned off during matching \f(CW\*(C`foo\*(C'\fR.  And in
+.Sp
+.Vb 1
+\&    /(?x)foo/x
+.Ve
+.Sp
+\&\f(CW\*(C`/x\*(C'\fR but NOT \f(CW\*(C`/xx\*(C'\fR is turned on for matching \f(CW\*(C`foo\*(C'\fR.  (One might
+mistakenly think that since the inner \f(CW\*(C`(?x)\*(C'\fR is already in the scope of
+\&\f(CW\*(C`/x\*(C'\fR, that the result would effectively be the sum of them, yielding
+\&\f(CW\*(C`/xx\*(C'\fR.  It doesn't work that way.)  Similarly, doing something like
+\&\f(CW\*(C`(?xx\-x)foo\*(C'\fR turns off all \f(CW"x"\fR behavior for matching \f(CW\*(C`foo\*(C'\fR; it is not
+that you subtract 1 \f(CW"x"\fR from 2 to get 1 \f(CW"x"\fR remaining.
+.Sp
+Any of these modifiers can be set to apply globally to all regular
+expressions compiled within the scope of a \f(CW\*(C`use re\*(C'\fR.  See
+"'/flags' mode" in re.
+.Sp
+Starting in Perl 5.14, a \f(CW"^"\fR (caret or circumflex accent) immediately
+after the \f(CW"?"\fR is a shorthand equivalent to \f(CW\*(C`d\-imnsx\*(C'\fR.  Flags (except
+\&\f(CW"d"\fR) may follow the caret to override it.
+But a minus sign is not legal with it.
+.Sp
+Note that the \f(CW"a"\fR, \f(CW"d"\fR, \f(CW"l"\fR, \f(CW"p"\fR, and \f(CW"u"\fR modifiers are special in
+that they can only be enabled, not disabled, and the \f(CW"a"\fR, \f(CW"d"\fR, \f(CW"l"\fR, and
+\&\f(CW"u"\fR modifiers are mutually exclusive: specifying one de-specifies the
+others, and a maximum of one (or two \f(CW"a"\fR's) may appear in the
+construct.  Thus, for
+example, \f(CW\*(C`(?\-p)\*(C'\fR will warn when compiled under \f(CW\*(C`use warnings\*(C'\fR;
+\&\f(CW\*(C`(?\-d:...)\*(C'\fR and \f(CW\*(C`(?dl:...)\*(C'\fR are fatal errors.
+.Sp
+Note also that the \f(CW"p"\fR modifier is special in that its presence
+anywhere in a pattern has a global effect.
+.Sp
+Having zero modifiers makes this a no-op (so why did you specify it,
+unless it's generated code), and starting in v5.30, warns under \f(CW\*(C`use
+re \*(Aqstrict\*(Aq\*(C'\fR.
+.ie n .IP """(?:\fIpattern\fR)""" 4
+.el .IP \f(CW(?:\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Xref "(?:)"
+.IX Item "(?:pattern)"
+.PD 0
+.ie n .IP """(?adluimnsx\-imnsx:\fIpattern\fR)""" 4
+.el .IP \f(CW(?adluimnsx\-imnsx:\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Item "(?adluimnsx-imnsx:pattern)"
+.ie n .IP """(?^aluimnsx:\fIpattern\fR)""" 4
+.el .IP \f(CW(?^aluimnsx:\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Xref "(?^:)"
+.IX Item "(?^aluimnsx:pattern)"
+.PD
+This is for clustering, not capturing; it groups subexpressions like
+\&\f(CW"()"\fR, but doesn't make backreferences as \f(CW"()"\fR does.  So
+.Sp
+.Vb 1
+\&    @fields = split(/\eb(?:a|b|c)\eb/)
+.Ve
+.Sp
+matches the same field delimiters as
+.Sp
+.Vb 1
+\&    @fields = split(/\eb(a|b|c)\eb/)
+.Ve
+.Sp
+but doesn't spit out the delimiters themselves as extra fields (even though
+that's the behaviour of "split" in perlfunc when its pattern contains capturing
+groups).  It's also cheaper not to capture
+characters if you don't need to.
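+.Sp
+For example, in this sketch (the input string is made up):
+.Sp
+.Vb 2
+\&    my @plain = split /(?:,)/, "a,b,c";   # ("a", "b", "c")
+\&    my @kept  = split /(,)/,   "a,b,c";   # ("a", ",", "b", ",", "c")
+.Ve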
+.Sp
+Any letters between \f(CW"?"\fR and \f(CW":"\fR act as flags modifiers as with
+\&\f(CW\*(C`(?adluimnsx\-imnsx)\*(C'\fR.  For example,
+.Sp
+.Vb 1
+\&    /(?s\-i:more.*than).*million/i
+.Ve
+.Sp
+is equivalent to the more verbose
+.Sp
+.Vb 1
+\&    /(?:(?s\-i)more.*than).*million/i
+.Ve
+.Sp
+Note that any \f(CW\*(C`()\*(C'\fR constructs enclosed within this one will still
+capture unless the \f(CW\*(C`/n\*(C'\fR modifier is in effect.
+.Sp
+Like the "(?adlupimnsx\-imnsx)" construct, \f(CW\*(C`aa\*(C'\fR and \f(CW"a"\fR override each
+other, as do \f(CW\*(C`xx\*(C'\fR and \f(CW"x"\fR.  They are not additive.  So, doing
+something like \f(CW\*(C`(?xx\-x:foo)\*(C'\fR turns off all \f(CW"x"\fR behavior for matching
+\&\f(CW\*(C`foo\*(C'\fR.
+.Sp
+Starting in Perl 5.14, a \f(CW"^"\fR (caret or circumflex accent) immediately
+after the \f(CW"?"\fR is a shorthand equivalent to \f(CW\*(C`d\-imnsx\*(C'\fR.  Any positive
+flags (except \f(CW"d"\fR) may follow the caret, so
+.Sp
+.Vb 1
+\&    (?^x:foo)
+.Ve
+.Sp
+is equivalent to
+.Sp
+.Vb 1
+\&    (?x\-imns:foo)
+.Ve
+.Sp
+The caret tells Perl that this cluster doesn't inherit the flags of any
+surrounding pattern, but uses the system defaults (\f(CW\*(C`d\-imnsx\*(C'\fR),
+modified by any flags specified.
+.Sp
+The caret allows for simpler stringification of compiled regular
+expressions.  These look like
+.Sp
+.Vb 1
+\&    (?^:pattern)
+.Ve
+.Sp
+with any non-default flags appearing between the caret and the colon.
+A test that looks at such stringification thus doesn't need to have the
+system default flags hard-coded in it, just the caret.  If new flags are
+added to Perl, the meaning of the caret's expansion will change to include
+the default for those flags, so the test will still work, unchanged.
+.Sp
+Specifying a negative flag after the caret is an error, as the flag is
+redundant.
+.Sp
+Mnemonic for \f(CW\*(C`(?^...)\*(C'\fR:  A fresh beginning since the usual use of a caret is
+to match at the beginning.
+.ie n .IP """(?|\fIpattern\fR)""" 4
+.el .IP \f(CW(?|\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Xref "(?|) Branch reset"
+.IX Item "(?|pattern)"
+This is the "branch reset" pattern, which has the special property
+that the capture groups are numbered from the same starting point
+in each alternation branch. It is available starting from perl 5.10.0.
+.Sp
+Capture groups are numbered from left to right, but inside this
+construct the numbering is restarted for each branch.
+.Sp
+The numbering within each branch will be as normal, and any groups
+following this construct will be numbered as though the construct
+contained only one branch, that being the one with the most capture
+groups in it.
+.Sp
+This construct is useful when you want to capture one of a
+number of alternative matches.
+.Sp
+Consider the following pattern.  The numbers underneath show in
+which group the captured content will be stored.
+.Sp
+.Vb 3
+\&    # before  \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-branch\-reset\-\-\-\-\-\-\-\-\-\-\- after
+\&    / ( a )  (?| x ( y ) z | (p (q) r) | (t) u (v) ) ( z ) /x
+\&    # 1            2         2  3        2     3     4
+.Ve
+.Sp
+Be careful when using the branch reset pattern in combination with
+named captures. Named captures are implemented as being aliases to
+numbered groups holding the captures, and that interferes with the
+implementation of the branch reset pattern. If you are using named
+captures in a branch reset pattern, it's best to use the same names,
+in the same order, in each of the alternations:
+.Sp
+.Vb 2
+\&   /(?|  (?<a> x ) (?<b> y )
+\&      |  (?<a> z ) (?<b> w )) /x
+.Ve
+.Sp
+Not doing so may lead to surprises:
+.Sp
+.Vb 3
+\&  "12" =~ /(?| (?<a> \ed+ ) | (?<b> \eD+))/x;
+\&  say $+{a};    # Prints \*(Aq12\*(Aq
+\&  say $+{b};    # *Also* prints \*(Aq12\*(Aq.
+.Ve
+.Sp
+The problem here is that both the group named \f(CW\*(C`a\*(C'\fR and the group
+named \f(CW\*(C`b\*(C'\fR are aliases for the group belonging to \f(CW$1\fR.
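+.Sp
+As an illustrative sketch (the two date layouts are assumptions made up for
+this example), branch reset lets either alternative leave the year in
+\&\f(CW$1\fR:
+.Sp
+.Vb 6
+\&    for my $date ("2024\-02\-11", "11/02/2024") {
+\&        if ($date =~ m{ (?| (\ed{4})\-\ed\ed\-\ed\ed
+\&                          | \ed\ed/\ed\ed/(\ed{4}) ) }x) {
+\&            print "year: $1\en";         # prints "year: 2024" both times
+\&        }
+\&    }
+.Ve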
+.IP "Lookaround Assertions" 4
+.IX Xref "look-around assertion lookaround assertion look-around lookaround"
+.IX Item "Lookaround Assertions"
+Lookaround assertions are zero-width patterns which match a specific
+pattern without including it in \f(CW$&\fR. Positive assertions match when
+their subpattern matches, negative assertions match when their subpattern
+fails. Lookbehind matches text up to the current match position;
+lookahead matches text following the current match position.
+.RS 4
+.ie n .IP """(?=\fIpattern\fR)""" 4
+.el .IP \f(CW(?=\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Item "(?=pattern)"
+.PD 0
+.ie n .IP """(*pla:\fIpattern\fR)""" 4
+.el .IP \f(CW(*pla:\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Item "(*pla:pattern)"
+.ie n .IP """(*positive_lookahead:\fIpattern\fR)""" 4
+.el .IP \f(CW(*positive_lookahead:\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Xref "(?=) (*pla (*positive_lookahead look-ahead, positive lookahead, positive"
+.IX Item "(*positive_lookahead:pattern)"
+.PD
+A zero-width positive lookahead assertion.  For example, \f(CW\*(C`/\ew+(?=\et)/\*(C'\fR
+matches a word followed by a tab, without including the tab in \f(CW$&\fR.
+.ie n .IP """(?!\fIpattern\fR)""" 4
+.el .IP \f(CW(?!\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Item "(?!pattern)"
+.PD 0
+.ie n .IP """(*nla:\fIpattern\fR)""" 4
+.el .IP \f(CW(*nla:\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Item "(*nla:pattern)"
+.ie n .IP """(*negative_lookahead:\fIpattern\fR)""" 4
+.el .IP \f(CW(*negative_lookahead:\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Xref "(?!) (*nla (*negative_lookahead look-ahead, negative lookahead, negative"
+.IX Item "(*negative_lookahead:pattern)"
+.PD
+A zero-width negative lookahead assertion.  For example \f(CW\*(C`/foo(?!bar)/\*(C'\fR
+matches any occurrence of "foo" that isn't followed by "bar".  Note
+however that lookahead and lookbehind are NOT the same thing.  You cannot
+use this for lookbehind.
+.Sp
+If you are looking for a "bar" that isn't preceded by a "foo", \f(CW\*(C`/(?!foo)bar/\*(C'\fR
+will not do what you want.  That's because the \f(CW\*(C`(?!foo)\*(C'\fR is just saying that
+the next thing cannot be "foo"\-\-and it's not, it's a "bar", so "foobar" will
+match.  Use lookbehind instead (see below).
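+.Sp
+The difference can be seen in a two-line sketch:
+.Sp
+.Vb 2
+\&    print "lookahead matched\en"  if "foobar" =~ /(?!foo)bar/;   # printed
+\&    print "lookbehind matched\en" if "foobar" =~ /(?<!foo)bar/;  # not printed
+.Ve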
+.ie n .IP """(?<=\fIpattern\fR)""" 4
+.el .IP \f(CW(?<=\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Item "(?<=pattern)"
+.PD 0
+.ie n .IP """\eK""" 4
+.el .IP \f(CW\eK\fR 4
+.IX Item "K"
+.ie n .IP """(*plb:\fIpattern\fR)""" 4
+.el .IP \f(CW(*plb:\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Item "(*plb:pattern)"
+.ie n .IP """(*positive_lookbehind:\fIpattern\fR)""" 4
+.el .IP \f(CW(*positive_lookbehind:\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Xref "(?<=) (*plb (*positive_lookbehind look-behind, positive lookbehind, positive \\K"
+.IX Item "(*positive_lookbehind:pattern)"
+.PD
+A zero-width positive lookbehind assertion.  For example, \f(CW\*(C`/(?<=\et)\ew+/\*(C'\fR
+matches a word that follows a tab, without including the tab in \f(CW$&\fR.
+.Sp
+Prior to Perl 5.30, it worked only for fixed-width lookbehind, but
+starting in that release, it can handle variable lengths from 1 to 255
+characters as an experimental feature.  The feature is enabled
+automatically if you use a variable length positive lookbehind assertion.
+.Sp
+In Perl 5.35.10 the scope of the experimental nature of this construct
+has been reduced, and experimental warnings will only be produced when
+the construct contains capturing parentheses. The warnings will be
+raised at pattern compilation time, unless turned off, in the
+\&\f(CW\*(C`experimental::vlb\*(C'\fR category.  This is to warn you that the exact
+contents of capturing buffers in a variable length positive lookbehind
+are not well defined and are subject to change in a future release of perl.
+.Sp
+Currently if you use capture buffers inside of a positive variable length
+lookbehind the result will be the longest and thus leftmost match possible.
+This means that
+.Sp
+.Vb 4
+\&    "aax" =~ /(?=x)(?<=(a|aa))/
+\&    "aax" =~ /(?=x)(?<=(aa|a))/
+\&    "aax" =~ /(?=x)(?<=(a{1,2}?))/
+\&    "aax" =~ /(?=x)(?<=(a{1,2}))/
+.Ve
+.Sp
+will all result in \f(CW$1\fR containing \f(CW"aa"\fR. It is possible in a future
+release of perl we will change this behavior.
+.Sp
+There is a special form of this construct, called \f(CW\*(C`\eK\*(C'\fR
+(available since Perl 5.10.0), which causes the
+regex engine to "keep" everything it had matched prior to the \f(CW\*(C`\eK\*(C'\fR and
+not include it in \f(CW$&\fR. This effectively provides non-experimental
+variable-length lookbehind of any length.
+.Sp
+And, there is a technique that can be used to handle variable length
+lookbehinds on earlier releases, and longer than 255 characters.  It is
+described in
+<http://www.drregex.com/2019/02/variable\-length\-lookbehinds\-actually.html>.
+.Sp
+Note that under \f(CW\*(C`/i\*(C'\fR, a few single characters match two or three other
+characters.  This makes them variable length, and the 255 length applies
+to the maximum number of characters in the match.  For
+example \f(CW\*(C`qr/\eN{LATIN SMALL LETTER SHARP S}/i\*(C'\fR matches the sequence
+\&\f(CW"ss"\fR.  Your lookbehind assertion could contain 127 Sharp S
+characters under \f(CW\*(C`/i\*(C'\fR, but adding a 128th would generate a compilation
+error, as that could match 256 \f(CW"s"\fR characters in a row.
+.Sp
+The use of \f(CW\*(C`\eK\*(C'\fR inside of another lookaround assertion
+is allowed, but the behaviour is currently not well defined.
+.Sp
+For various reasons \f(CW\*(C`\eK\*(C'\fR may be significantly more efficient than the
+equivalent \f(CW\*(C`(?<=...)\*(C'\fR construct, and it is especially useful in
+situations where you want to efficiently remove something following
+something else in a string. For instance
+.Sp
+.Vb 1
+\&  s/(foo)bar/$1/g;
+.Ve
+.Sp
+can be rewritten as the much more efficient
+.Sp
+.Vb 1
+\&  s/foo\eKbar//g;
+.Ve
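+.Sp
+A brief sketch of the effect on \f(CW$&\fR and on a substitution (the sample
+strings are arbitrary):
+.Sp
+.Vb 2
+\&    "foobar" =~ /foo\eKbar/ and print "$&\en";       # prints "bar"
+\&    (my $t = "foobar foobaz") =~ s/foo\eKbar//g;    # $t is "foo foobaz"
+.Ve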
+.Sp
+Use of the non-greedy modifier \f(CW"?"\fR may not give you the expected
+results if it is within a capturing group within the construct.
+.ie n .IP """(?<!\fIpattern\fR)""" 4
+.el .IP \f(CW(?<!\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Item "(?<!pattern)"
+.PD 0
+.ie n .IP """(*nlb:\fIpattern\fR)""" 4
+.el .IP \f(CW(*nlb:\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Item "(*nlb:pattern)"
+.ie n .IP """(*negative_lookbehind:\fIpattern\fR)""" 4
+.el .IP \f(CW(*negative_lookbehind:\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Xref "(?<!) (*nlb (*negative_lookbehind look-behind, negative lookbehind, negative"
+.IX Item "(*negative_lookbehind:pattern)"
+.PD
+A zero-width negative lookbehind assertion.  For example \f(CW\*(C`/(?<!bar)foo/\*(C'\fR
+matches any occurrence of "foo" that does not follow "bar".
+.Sp
+Prior to Perl 5.30, it worked only for fixed-width lookbehind, but
+starting in that release, it can handle variable lengths from 1 to 255
+characters as an experimental feature.  The feature is enabled
+automatically if you use a variable length negative lookbehind assertion.
+.Sp
+In Perl 5.35.10 the scope of the experimental nature of this construct
+has been reduced, and experimental warnings will only be produced when
+the construct contains capturing parentheses. The warnings will be
+raised at pattern compilation time, unless turned off, in the
+\&\f(CW\*(C`experimental::vlb\*(C'\fR category.  This is to warn you that the exact
+contents of capturing buffers in a variable length negative lookbehind
+are not well defined and are subject to change in a future release of perl.
+.Sp
+Currently if you use capture buffers inside of a negative variable length
+lookbehind the result may not be what you expect, for instance:
+.Sp
+.Vb 1
+\&    say "axfoo"=~/(?=foo)(?<!(a|ax)(?{ say $1 }))/ ? "y" : "n";
+.Ve
+.Sp
+will output the following:
+.Sp
+.Vb 2
+\&    a
+\&    n
+.Ve
+.Sp
+which does not make sense as this should print out "ax" as the "a" does
+not line up at the correct place. Another example would be:
+.Sp
+.Vb 1
+\&    say "yes: \*(Aq$1\-$2\*(Aq" if "aayfoo"=~/(?=foo)(?<!(a|aa)(a|aa)x)/;
+.Ve
+.Sp
+will output the following:
+.Sp
+.Vb 1
+\&    yes: \*(Aqaa\-a\*(Aq
+.Ve
+.Sp
+It is possible in a future release of perl we will change this behavior
+so both of these examples produce more reasonable output.
+.Sp
+Note that we are confident that the construct will match and reject
+patterns appropriately; the undefined behavior strictly relates to the
+value of the capture buffer during or after matching.
+.Sp
+There is a technique that can be used to handle variable length
+lookbehind on earlier releases, and longer than 255 characters.  It is
+described in
+<http://www.drregex.com/2019/02/variable\-length\-lookbehinds\-actually.html>.
+.Sp
+Note that under \f(CW\*(C`/i\*(C'\fR, a few single characters match two or three other
+characters.  This makes them variable length, and the 255 length applies
+to the maximum number of characters in the match.  For
+example \f(CW\*(C`qr/\eN{LATIN SMALL LETTER SHARP S}/i\*(C'\fR matches the sequence
+\&\f(CW"ss"\fR.  Your lookbehind assertion could contain 127 Sharp S
+characters under \f(CW\*(C`/i\*(C'\fR, but adding a 128th would generate a compilation
+error, as that could match 256 \f(CW"s"\fR characters in a row.
+.Sp
+Use of the non-greedy modifier \f(CW"?"\fR may not give you the expected
+results if it is within a capturing group within the construct.
+.RE
+.RS 4
+.RE
+.ie n .IP """(?<\fINAME\fR>\fIpattern\fR)""" 4
+.el .IP \f(CW(?<\fR\f(CINAME\fR\f(CW>\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Item "(?<NAME>pattern)"
+.PD 0
+.ie n .IP """(?\*(Aq\fINAME\fR\*(Aq\fIpattern\fR)""" 4
+.el .IP \f(CW(?\*(Aq\fR\f(CINAME\fR\f(CW\*(Aq\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Xref "(?<NAME>) (?'NAME') named capture capture"
+.IX Item "(?NAMEpattern)"
+.PD
+A named capture group. Identical in every respect to normal capturing
+parentheses \f(CW\*(C`()\*(C'\fR but for the additional fact that the group
+can be referred to by name in various regular expression
+constructs (like \f(CW\*(C`\eg{\fR\f(CINAME\fR\f(CW}\*(C'\fR) and can be accessed by name
+after a successful match via \f(CW\*(C`%+\*(C'\fR or \f(CW\*(C`%\-\*(C'\fR. See perlvar
+for more details on the \f(CW\*(C`%+\*(C'\fR and \f(CW\*(C`%\-\*(C'\fR hashes.
+.Sp
+If multiple distinct capture groups have the same name, then
+\&\f(CW$+{\fR\f(CINAME\fR\f(CW}\fR will refer to the leftmost defined group in the match.
+.Sp
+The forms \f(CW\*(C`(?\*(Aq\fR\f(CINAME\fR\f(CW\*(Aq\fR\f(CIpattern\fR\f(CW)\*(C'\fR and \f(CW\*(C`(?<\fR\f(CINAME\fR\f(CW>\fR\f(CIpattern\fR\f(CW)\*(C'\fR
+are equivalent.
+.Sp
+\&\fBNOTE:\fR While the notation of this construct is the same as the similar
+function in .NET regexes, the behavior is not. In Perl the groups are
+numbered sequentially regardless of being named or not. Thus in the
+pattern
+.Sp
+.Vb 1
+\&  /(x)(?<foo>y)(z)/
+.Ve
+.Sp
+\&\f(CW$+{foo}\fR will be the same as \f(CW$2\fR, and \f(CW$3\fR will contain 'z' instead of
+the opposite which is what a .NET regex hacker might expect.
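+.Sp
+A minimal check of this numbering (the input string is chosen only to fit
+the pattern above):
+.Sp
+.Vb 3
+\&    if ("xyz" =~ /(x)(?<foo>y)(z)/) {
+\&        print "$1 $2 $3 $+{foo}\en";     # prints "x y z y"
+\&    }
+.Ve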
+.Sp
+Currently \fINAME\fR is restricted to simple identifiers only.
+In other words, it must match \f(CW\*(C`/^[_A\-Za\-z][_A\-Za\-z0\-9]*\ez/\*(C'\fR or
+its Unicode extension (see utf8),
+though it isn't extended by the locale (see perllocale).
+.Sp
+\&\fBNOTE:\fR In order to make things easier for programmers with experience
+with the Python or PCRE regex engines, the pattern \f(CW\*(C`(?P<\fR\f(CINAME\fR\f(CW>\fR\f(CIpattern\fR\f(CW)\*(C'\fR
+may be used instead of \f(CW\*(C`(?<\fR\f(CINAME\fR\f(CW>\fR\f(CIpattern\fR\f(CW)\*(C'\fR; however this form does not
+support the use of single quotes as a delimiter for the name.
+.ie n .IP """\ek<\fINAME\fR>""" 4
+.el .IP \f(CW\ek<\fR\f(CINAME\fR\f(CW>\fR 4
+.IX Item "k<NAME>"
+.PD 0
+.ie n .IP """\ek\*(Aq\fINAME\fR\*(Aq""" 4
+.el .IP \f(CW\ek\*(Aq\fR\f(CINAME\fR\f(CW\*(Aq\fR 4
+.IX Item "kNAME"
+.ie n .IP """\ek{\fINAME\fR}""" 4
+.el .IP \f(CW\ek{\fR\f(CINAME\fR\f(CW}\fR 4
+.IX Item "k{NAME}"
+.PD
+Named backreference. Similar to numeric backreferences, except that
+the group is designated by name and not number. If multiple groups
+have the same name then it refers to the leftmost defined group in
+the current match.
+.Sp
+It is an error to refer to a name not defined by a \f(CW\*(C`(?<\fR\f(CINAME\fR\f(CW>)\*(C'\fR
+earlier in the pattern.
+.Sp
+All three forms are equivalent, although with \f(CW\*(C`\ek{ \fR\f(CINAME\fR\f(CW }\*(C'\fR,
+you may optionally have blanks within but adjacent to the braces, as
+shown.
+.Sp
+\&\fBNOTE:\fR In order to make things easier for programmers with experience
+with the Python or PCRE regex engines, the pattern \f(CW\*(C`(?P=\fR\f(CINAME\fR\f(CW)\*(C'\fR
+may be used instead of \f(CW\*(C`\ek<\fR\f(CINAME\fR\f(CW>\*(C'\fR.
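+.Sp
+A sketch using the Python-style spellings (the group name
+\&\f(CW\*(C`pair\*(C'\fR is invented for the example):
+.Sp
+.Vb 2
+\&    print "doubled: $+{pair}\en"
+\&        if "abab" =~ /(?P<pair>ab)(?P=pair)/;    # prints "doubled: ab"
+.Ve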
+.ie n .IP """(?{ \fIcode\fR })""" 4
+.el .IP "\f(CW(?{ \fR\f(CIcode\fR\f(CW })\fR" 4
+.IX Xref "(?{}) regex, code in regexp, code in regular expression, code in"
+.IX Item "(?{ code })"
+\&\fBWARNING\fR: Using this feature safely requires that you understand its
+limitations.  Code executed that has side effects may not perform identically
+from version to version due to the effect of future optimisations in the regex
+engine.  For more information on this, see "Embedded Code Execution
+Frequency".
+.Sp
+This zero-width assertion executes any embedded Perl code.  It always
+succeeds, and its return value is set as \f(CW$^R\fR.
+.Sp
+In literal patterns, the code is parsed at the same time as the
+surrounding code. While within the pattern, control is passed temporarily
+back to the perl parser, until the logically-balancing closing brace is
+encountered. This is similar to the way that an array index expression in
+a literal string is handled, for example
+.Sp
+.Vb 1
+\&    "abc$array[ 1 + f(\*(Aq[\*(Aq) + g()]def"
+.Ve
+.Sp
+In particular, braces do not need to be balanced:
+.Sp
+.Vb 1
+\&    s/abc(?{ f(\*(Aq{\*(Aq); })/def/
+.Ve
+.Sp
+Even in a pattern that is interpolated and compiled at run-time, literal
+code blocks will be compiled once, at perl compile time; the following
+prints "ABCD":
+.Sp
+.Vb 5
+\&    print "D";
+\&    my $qr = qr/(?{ BEGIN { print "A" } })/;
+\&    my $foo = "foo";
+\&    /$foo$qr(?{ BEGIN { print "B" } })/;
+\&    BEGIN { print "C" }
+.Ve
+.Sp
+In patterns where the text of the code is derived from run-time
+information rather than appearing literally in a source code /pattern/,
+the code is compiled at the same time that the pattern is compiled, and
+for reasons of security, \f(CW\*(C`use re \*(Aqeval\*(Aq\*(C'\fR must be in scope. This is to
+stop user-supplied patterns containing code snippets from being
+executable.
+.Sp
+In situations where you need to enable this with \f(CW\*(C`use re \*(Aqeval\*(Aq\*(C'\fR, you should
+also have taint checking enabled, if your perl supports it.
+Better yet, use the carefully constrained evaluation within a Safe compartment.
+See perlsec for details about both these mechanisms.
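+.Sp
+A minimal sketch of the run-time case, assuming a package variable
+\&\f(CW$hits\fR and a pattern string \f(CW$pat\fR, both invented for this example:
+.Sp
+.Vb 7
+\&    our $hits = 0;
+\&    my $pat = \*(Aqa(?{ $hits++ })\*(Aq;   # pattern text only known at run time
+\&    {
+\&        use re \*(Aqeval\*(Aq;              # lexically scoped permission
+\&        "banana" =~ /$pat/;         # the code block may now be compiled
+\&    }
+\&    print "hits: $hits\en";
+.Ve
+.Sp
+Without the pragma in scope, the match is expected to die instead of
+running the embedded code.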
+.Sp
+From the viewpoint of parsing, lexical variable scope and closures,
+.Sp
+.Vb 1
+\&    /AAA(?{ BBB })CCC/
+.Ve
+.Sp
+behaves approximately like
+.Sp
+.Vb 1
+\&    /AAA/ && do { BBB } && /CCC/
+.Ve
+.Sp
+Similarly,
+.Sp
+.Vb 1
+\&    qr/AAA(?{ BBB })CCC/
+.Ve
+.Sp
+behaves approximately like
+.Sp
+.Vb 1
+\&    sub { /AAA/ && do { BBB } && /CCC/ }
+.Ve
+.Sp
+In particular:
+.Sp
+.Vb 3
+\&    { my $i = 1; $r = qr/(?{ print $i })/ }
+\&    my $i = 2;
+\&    /$r/; # prints "1"
+.Ve
+.Sp
+Inside a \f(CW\*(C`(?{...})\*(C'\fR block, \f(CW$_\fR refers to the string the regular
+expression is matching against. You can also use \f(CWpos()\fR to know what is
+the current position of matching within this string.
+.Sp
+The code block introduces a new scope from the perspective of lexical
+variable declarations, but \fBnot\fR from the perspective of \f(CW\*(C`local\*(C'\fR and
+similar localizing behaviours. So later code blocks within the same
+pattern will still see the values which were localized in earlier blocks.
+These accumulated localizations are undone either at the end of a
+successful match, or if the assertion is backtracked (compare
+"Backtracking"). For example,
+.Sp
+.Vb 10
+\&  $_ = \*(Aqa\*(Aq x 8;
+\&  m<
+\&     (?{ $cnt = 0 })               # Initialize $cnt.
+\&     (
+\&       a
+\&       (?{
+\&           local $cnt = $cnt + 1;  # Update $cnt,
+\&                                   # backtracking\-safe.
+\&       })
+\&     )*
+\&     aaaa
+\&     (?{ $res = $cnt })            # On success copy to
+\&                                   # non\-localized location.
+\&   >x;
+.Ve
+.Sp
+will initially increment \f(CW$cnt\fR up to 8; then during backtracking, its
+value will be unwound back to 4, which is the value assigned to \f(CW$res\fR.
+At the end of the regex execution, \f(CW$cnt\fR will be wound back to its initial
+value of 0.
+.Sp
+This assertion may be used as the condition in a
+.Sp
+.Vb 1
+\&    (?(condition)yes\-pattern|no\-pattern)
+.Ve
+.Sp
+switch.  If \fInot\fR used in this way, the result of evaluation of \fIcode\fR
+is put into the special variable \f(CW$^R\fR.  This happens immediately, so
+\&\f(CW$^R\fR can be used from other \f(CW\*(C`(?{ \fR\f(CIcode\fR\f(CW })\*(C'\fR assertions inside the same
+regular expression.
+.Sp
+The assignment to \f(CW$^R\fR above is properly localized, so the old
+value of \f(CW$^R\fR is restored if the assertion is backtracked; compare
+"Backtracking".
+.Sp
+Note that the special variable \f(CW$^N\fR is particularly useful with code
+blocks to capture the results of submatches in variables without having to
+keep track of the number of nested parentheses. For example:
+.Sp
+.Vb 3
+\&  $_ = "The brown fox jumps over the lazy dog";
+\&  /the (\eS+)(?{ $color = $^N }) (\eS+)(?{ $animal = $^N })/i;
+\&  print "color = $color, animal = $animal\en";
+.Ve
+.Sp
+The use of this construct disables some optimisations globally in the
+pattern, and the pattern may execute much slower as a consequence.
+Use a \f(CW\*(C`*\*(C'\fR instead of the \f(CW\*(C`?\*(C'\fR block to create an optimistic form of
+this construct. \f(CW\*(C`(*{ ... })\*(C'\fR should not disable any optimisations.
+.ie n .IP """(*{ \fIcode\fR })""" 4
+.el .IP "\f(CW(*{ \fR\f(CIcode\fR\f(CW })\fR" 4
+.IX Xref "(*{}) regex, optimistic code"
+.IX Item "(*{ code })"
+This is \fIexactly\fR the same as \f(CW\*(C`(?{ \fR\f(CIcode\fR\f(CW })\*(C'\fR with the exception
+that it does not disable \fBany\fR optimisations at all in the regex engine.
+How often it is executed may vary from perl release to perl release.
+In a failing match it may not even be executed at all.
+.ie n .IP """(??{ \fIcode\fR })""" 4
+.el .IP "\f(CW(??{ \fR\f(CIcode\fR\f(CW })\fR" 4
+.IX Xref "(??{}) regex, postponed regexp, postponed regular expression, postponed"
+.IX Item "(??{ code })"
+\&\fBWARNING\fR: Using this feature safely requires that you understand its
+limitations.  Code executed that has side effects may not perform
+identically from version to version due to the effect of future
+optimisations in the regex engine.  For more information on this, see
+"Embedded Code Execution Frequency".
+.Sp
+This is a "postponed" regular subexpression.  It behaves in \fIexactly\fR the
+same way as a \f(CW\*(C`(?{ \fR\f(CIcode\fR\f(CW })\*(C'\fR code block as described above, except that
+its return value, rather than being assigned to \f(CW$^R\fR, is treated as a
+pattern, compiled if it's a string (or used as-is if it's a qr// object),
+then matched as if it were inserted instead of this construct.
+.Sp
+During the matching of this sub-pattern, it has its own set of
+captures which are valid during the sub-match, but are discarded once
+control returns to the main pattern. For example, the following matches,
+with the inner pattern capturing "B" and matching "BB", while the outer
+pattern captures "A":
+.Sp
+.Vb 3
+\&    my $inner = \*(Aq(.)\e1\*(Aq;
+\&    "ABBA" =~ /^(.)(??{ $inner })\e1/;
+\&    print $1; # prints "A";
+.Ve
+.Sp
+Note that this means that there is no way for the inner pattern to refer
+to a capture group defined outside.  (The code block itself can use \f(CW$1\fR,
+\&\fIetc\fR., to refer to the enclosing pattern's capture groups.)  Thus, although
+.Sp
+.Vb 1
+\&    (\*(Aqa\*(Aq x 100)=~/(??{\*(Aq(.)\*(Aq x 100})/
+.Ve
+.Sp
+\&\fIwill\fR match, it will \fInot\fR set \f(CW$1\fR on exit.
+.Sp
+The following pattern matches a parenthesized group:
+.Sp
+.Vb 9
+\& $re = qr{
+\&            \e(
+\&            (?:
+\&               (?> [^()]+ )  # Non\-parens without backtracking
+\&             |
+\&               (??{ $re })   # Group with matching parens
+\&            )*
+\&            \e)
+\&         }x;
+.Ve
+.Sp
+See also
+\&\f(CW\*(C`(?\fR\f(CIPARNO\fR\f(CW)\*(C'\fR
+for a different, more efficient way to accomplish
+the same task.
+.Sp
+Executing a postponed regular expression too many times without
+consuming any input string will also result in a fatal error.  The depth
+at which that happens is compiled into perl, so it can be changed with a
+custom build.
+.Sp
+The use of this construct disables some optimisations globally in the pattern,
+and the pattern may execute much slower as a consequence.
+.ie n .IP """(?\fIPARNO\fR)"" ""(?\-\fIPARNO\fR)"" ""(?+\fIPARNO\fR)"" ""(?R)"" ""(?0)""" 4
+.el .IP "\f(CW(?\fR\f(CIPARNO\fR\f(CW)\fR \f(CW(?\-\fR\f(CIPARNO\fR\f(CW)\fR \f(CW(?+\fR\f(CIPARNO\fR\f(CW)\fR \f(CW(?R)\fR \f(CW(?0)\fR" 4
+.IX Xref "(?PARNO) (?1) (?R) (?0) (?-1) (?+1) (?-PARNO) (?+PARNO) regex, recursive regexp, recursive regular expression, recursive regex, relative recursion GOSUB GOSTART"
+.IX Item "(?PARNO) (?-PARNO) (?+PARNO) (?R) (?0)"
+Recursive subpattern. Treat the contents of a given capture buffer in the
+current pattern as an independent subpattern and attempt to match it at
+the current position in the string. Information about capture state from
+the caller for things like backreferences is available to the subpattern,
+but capture buffers set by the subpattern are not visible to the caller.
+.Sp
+Similar to \f(CW\*(C`(??{ \fR\f(CIcode\fR\f(CW })\*(C'\fR except that it does not involve executing any
+code or potentially compiling a returned pattern string; instead it treats
+the part of the current pattern contained within a specified capture group
+as an independent pattern that must match at the current position. The
+treatment of capture buffers also differs: unlike \f(CW\*(C`(??{ \fR\f(CIcode\fR\f(CW })\*(C'\fR,
+recursive patterns have access to their caller's match state, so one can
+use backreferences safely.
+.Sp
+\&\fIPARNO\fR is a sequence of digits (not starting with 0) whose value reflects
+the paren-number of the capture group to recurse to. \f(CW\*(C`(?R)\*(C'\fR recurses to
+the beginning of the whole pattern. \f(CW\*(C`(?0)\*(C'\fR is an alternate syntax for
+\&\f(CW\*(C`(?R)\*(C'\fR. If \fIPARNO\fR is preceded by a plus or minus sign then it is assumed
+to be relative, with negative numbers indicating preceding capture groups
+and positive ones following. Thus \f(CW\*(C`(?\-1)\*(C'\fR refers to the most recently
+declared group, and \f(CW\*(C`(?+1)\*(C'\fR indicates the next group to be declared.
+Note that the counting for relative recursion differs from that of
+relative backreferences, in that with recursion unclosed groups \fBare\fR
+included.
+.Sp
+The following pattern matches a function \f(CWfoo()\fR which may contain
+balanced parentheses as the argument.
+.Sp
+.Vb 10
+\&  $re = qr{ (                   # paren group 1 (full function)
+\&              foo
+\&              (                 # paren group 2 (parens)
+\&                \e(
+\&                  (             # paren group 3 (contents of parens)
+\&                  (?:
+\&                   (?> [^()]+ ) # Non\-parens without backtracking
+\&                  |
+\&                   (?2)         # Recurse to start of paren group 2
+\&                  )*
+\&                  )
+\&                \e)
+\&              )
+\&            )
+\&          }x;
+.Ve
+.Sp
+If the pattern was used as follows
+.Sp
+.Vb 4
+\&    \*(Aqfoo(bar(baz)+baz(bop))\*(Aq=~/$re/
+\&        and print "\e$1 = $1\en",
+\&                  "\e$2 = $2\en",
+\&                  "\e$3 = $3\en";
+.Ve
+.Sp
+the output produced should be the following:
+.Sp
+.Vb 3
+\&    $1 = foo(bar(baz)+baz(bop))
+\&    $2 = (bar(baz)+baz(bop))
+\&    $3 = bar(baz)+baz(bop)
+.Ve
+.Sp
+If there is no corresponding capture group defined, then it is a
+fatal error.  Recursing deeply without consuming any input string will
+also result in a fatal error.  The depth at which that happens is
+compiled into perl, so it can be changed with a custom build.
+.Sp
+The following shows how using negative indexing can make it
+easier to embed recursive patterns inside of a \f(CW\*(C`qr//\*(C'\fR construct
+for later use:
+.Sp
+.Vb 4
+\&    my $parens = qr/(\e((?:[^()]++|(?\-1))*+\e))/;
+\&    if (/foo $parens \es+ \e+ \es+ bar $parens/x) {
+\&       # do something here...
+\&    }
+.Ve
+.Sp
+\&\fBNote\fR that this pattern does not behave the same way as the equivalent
+PCRE or Python construct of the same form. In Perl you can backtrack into
+a recursed group, whereas in PCRE and Python the recursed-into group is
+treated as atomic. Also, modifiers are resolved at compile time, so constructs
+like \f(CW\*(C`(?i:(?1))\*(C'\fR or \f(CW\*(C`(?:(?i)(?1))\*(C'\fR do not affect how the sub-pattern will
+be processed.
+.ie n .IP """(?&\fINAME\fR)""" 4
+.el .IP \f(CW(?&\fR\f(CINAME\fR\f(CW)\fR 4
+.IX Xref "(?&NAME)"
+.IX Item "(?&NAME)"
+Recurse to a named subpattern. Identical to \f(CW\*(C`(?\fR\f(CIPARNO\fR\f(CW)\*(C'\fR except that the
+parenthesis to recurse to is determined by name. If multiple parentheses have
+the same name, then it recurses to the leftmost.
+.Sp
+It is an error to refer to a name that is not declared somewhere in the
+pattern.
+.Sp
+\&\fBNOTE:\fR In order to make things easier for programmers with experience
+with the Python or PCRE regex engines, the pattern \f(CW\*(C`(?P>\fR\f(CINAME\fR\f(CW)\*(C'\fR
+may be used instead of \f(CW\*(C`(?&\fR\f(CINAME\fR\f(CW)\*(C'\fR.
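+.Sp
+A brief sketch, using a group name (\f(CW\*(C`paren\*(C'\fR) chosen purely for
+illustration, of recursing by name to match balanced parentheses:
+.Sp
+.Vb 4
+\&    my $balanced = qr/(?<paren>\e((?:[^()]++|(?&paren))*+\e))/;
+\&    if ( "f(a(b)c)" =~ $balanced ) {
+\&        print "matched: $+{paren}\en";   # the outermost balanced group
+\&    }
+.Ve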
+.ie n .IP """(?(\fIcondition\fR)\fIyes\-pattern\fR|\fIno\-pattern\fR)""" 4
+.el .IP \f(CW(?(\fR\f(CIcondition\fR\f(CW)\fR\f(CIyes\-pattern\fR\f(CW|\fR\f(CIno\-pattern\fR\f(CW)\fR 4
+.IX Xref "(?()"
+.IX Item "(?(condition)yes-pattern|no-pattern)"
+.PD 0
+.ie n .IP """(?(\fIcondition\fR)\fIyes\-pattern\fR)""" 4
+.el .IP \f(CW(?(\fR\f(CIcondition\fR\f(CW)\fR\f(CIyes\-pattern\fR\f(CW)\fR 4
+.IX Item "(?(condition)yes-pattern)"
+.PD
+Conditional expression. Matches \fIyes-pattern\fR if \fIcondition\fR yields
+a true value, matches \fIno-pattern\fR otherwise. A missing pattern always
+matches.
+.Sp
+\&\f(CW\*(C`(\fR\f(CIcondition\fR\f(CW)\*(C'\fR should be one of:
+.RS 4
+.IP "an integer in parentheses" 4
+.IX Item "an integer in parentheses"
+(which is valid if the corresponding pair of parentheses
+matched);
+.IP "a lookahead/lookbehind/evaluate zero-width assertion;" 4
+.IX Item "a lookahead/lookbehind/evaluate zero-width assertion;"
+.PD 0
+.IP "a name in angle brackets or single quotes" 4
+.IX Item "a name in angle brackets or single quotes"
+.PD
+(which is valid if a group with the given name matched);
+.ie n .IP "the special symbol ""(R)""" 4
+.el .IP "the special symbol \f(CW(R)\fR" 4
+.IX Item "the special symbol (R)"
+(true when evaluated inside of recursion or eval).  Additionally, the
+\&\f(CW"R"\fR may be followed by a number (in which case it is true when
+evaluated while recursing inside of the corresponding group), or by
+\&\f(CW\*(C`&\fR\f(CINAME\fR\f(CW\*(C'\fR, in which case it is true only when evaluated during
+recursion in the named group.
+.RE
+.RS 4
+.Sp
+Here's a summary of the possible predicates:
+.ie n .IP """(1)"" ""(2)"" ..." 4
+.el .IP "\f(CW(1)\fR \f(CW(2)\fR ..." 4
+.IX Item "(1) (2) ..."
+Checks if the numbered capturing group has matched something.
+Full syntax: \f(CW\*(C`(?(1)then|else)\*(C'\fR
+.ie n .IP """(<\fINAME\fR>)"" ""(\*(Aq\fINAME\fR\*(Aq)""" 4
+.el .IP "\f(CW(<\fR\f(CINAME\fR\f(CW>)\fR \f(CW(\*(Aq\fR\f(CINAME\fR\f(CW\*(Aq)\fR" 4
+.IX Item "(<NAME>) (NAME)"
+Checks if a group with the given name has matched something.
+Full syntax: \f(CW\*(C`(?(<name>)then|else)\*(C'\fR
+.ie n .IP """(?=...)"" ""(?!...)"" ""(?<=...)"" ""(?<!...)""" 4
+.el .IP "\f(CW(?=...)\fR \f(CW(?!...)\fR \f(CW(?<=...)\fR \f(CW(?<!...)\fR" 4
+.IX Item "(?=...) (?!...) (?<=...) (?<!...)"
+Checks whether the pattern matches (or does not match, for the \f(CW"!"\fR
+variants).
+Full syntax: \f(CW\*(C`(?(?=\fR\f(CIlookahead\fR\f(CW)\fR\f(CIthen\fR\f(CW|\fR\f(CIelse\fR\f(CW)\*(C'\fR
+.ie n .IP """(?{ \fICODE\fR })""" 4
+.el .IP "\f(CW(?{ \fR\f(CICODE\fR\f(CW })\fR" 4
+.IX Item "(?{ CODE })"
+Treats the return value of the code block as the condition.
+Full syntax: \f(CW\*(C`(?(?{ \fR\f(CICODE\fR\f(CW })\fR\f(CIthen\fR\f(CW|\fR\f(CIelse\fR\f(CW)\*(C'\fR
+.Sp
+Note that use of this construct may globally affect the performance
+of the pattern. Consider using \f(CW\*(C`(*{ \fR\f(CICODE\fR\f(CW })\*(C'\fR instead.
+.ie n .IP """(*{ \fICODE\fR })""" 4
+.el .IP "\f(CW(*{ \fR\f(CICODE\fR\f(CW })\fR" 4
+.IX Item "(*{ CODE })"
+Treats the return value of the code block as the condition.
+Full syntax: \f(CW\*(C`(?(*{ \fR\f(CICODE\fR\f(CW })\fR\f(CIthen\fR\f(CW|\fR\f(CIelse\fR\f(CW)\*(C'\fR
+.ie n .IP """(R)""" 4
+.el .IP \f(CW(R)\fR 4
+.IX Item "(R)"
+Checks if the expression has been evaluated inside of recursion.
+Full syntax: \f(CW\*(C`(?(R)\fR\f(CIthen\fR\f(CW|\fR\f(CIelse\fR\f(CW)\*(C'\fR
+.ie n .IP """(R1)"" ""(R2)"" ..." 4
+.el .IP "\f(CW(R1)\fR \f(CW(R2)\fR ..." 4
+.IX Item "(R1) (R2) ..."
+Checks if the expression has been evaluated while executing directly
+inside of the n\-th capture group. This check is the regex equivalent of
+.Sp
+.Vb 1
+\&  if ((caller(0))[3] eq \*(Aqsubname\*(Aq) { ... }
+.Ve
+.Sp
+In other words, it does not check the full recursion stack.
+.Sp
+Full syntax: \f(CW\*(C`(?(R1)\fR\f(CIthen\fR\f(CW|\fR\f(CIelse\fR\f(CW)\*(C'\fR
+.ie n .IP """(R&\fINAME\fR)""" 4
+.el .IP \f(CW(R&\fR\f(CINAME\fR\f(CW)\fR 4
+.IX Item "(R&NAME)"
+Similar to \f(CW\*(C`(R1)\*(C'\fR, this predicate checks to see if we're executing
+directly inside of the leftmost group with a given name (this is the same
+logic used by \f(CW\*(C`(?&\fR\f(CINAME\fR\f(CW)\*(C'\fR to disambiguate). It does not check the full
+stack, but only the name of the innermost active recursion.
+Full syntax: \f(CW\*(C`(?(R&\fR\f(CIname\fR\f(CW)\fR\f(CIthen\fR\f(CW|\fR\f(CIelse\fR\f(CW)\*(C'\fR
+.ie n .IP """(DEFINE)""" 4
+.el .IP \f(CW(DEFINE)\fR 4
+.IX Item "(DEFINE)"
+In this case, the yes-pattern is never directly executed, and no
+no-pattern is allowed. Similar in spirit to \f(CW\*(C`(?{0})\*(C'\fR but more efficient.
+See below for details.
+Full syntax: \f(CW\*(C`(?(DEFINE)\fR\f(CIdefinitions\fR\f(CW...)\*(C'\fR
+.RE
+.RS 4
+.Sp
+For example:
+.Sp
+.Vb 4
+\&    m{ ( \e( )?
+\&       [^()]+
+\&       (?(1) \e) )
+\&     }x
+.Ve
+.Sp
+matches a chunk of non-parentheses, possibly included in parentheses
+themselves.
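+.Sp
+To try this out, the sketch below anchors the same pattern at both ends
+(the anchors and the test strings are additions made for this example):
+.Sp
+.Vb 4
+\&    for my $s ( "(abc)", "abc", "(abc" ) {
+\&        my $ok = $s =~ m{ ^ ( \e( )? [^()]+ (?(1) \e) ) $ }x;
+\&        print "$s: ", ( $ok ? "matches" : "fails" ), "\en";
+\&    }
+.Ve
+.Sp
+The first two strings should match.  The third should fail: once group 1
+has matched, the conditional requires a closing parenthesis, and with the
+anchor in place skipping group 1 leaves the opening parenthesis unmatched.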
+.Sp
+A special form is the \f(CW\*(C`(DEFINE)\*(C'\fR predicate, which never executes its
+yes-pattern directly, and does not allow a no-pattern. This allows one to
+define subpatterns which will be executed only by the recursion mechanism.
+This way, you can define a set of regular expression rules that can be
+bundled into any pattern you choose.
+.Sp
+It is recommended that for this usage you put the DEFINE block at the
+end of the pattern, and that you name any subpatterns defined within it.
+.Sp
+Also, it's worth noting that patterns defined this way probably will
+not be as efficient, as the optimizer is not very clever about
+handling them.
+.Sp
+An example of how this might be used is as follows:
+.Sp
+.Vb 5
+\&  /(?<NAME>(?&NAME_PAT))(?<ADDR>(?&ADDRESS_PAT))
+\&   (?(DEFINE)
+\&     (?<NAME_PAT>....)
+\&     (?<ADDRESS_PAT>....)
+\&   )/x
+.Ve
+.Sp
+Note that capture groups matched inside of recursion are not accessible
+after the recursion returns, so the extra layer of capturing groups is
+necessary. Thus \f(CW$+{NAME_PAT}\fR would not be defined even though
+\&\f(CW$+{NAME}\fR would be.
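+.Sp
+A small, self-contained sketch of the same layout follows; the rule names
+\&\f(CW\*(C`IDENT\*(C'\fR and \f(CW\*(C`NUMBER\*(C'\fR, the group names and the input string are all
+made up for illustration:
+.Sp
+.Vb 10
+\&    my $re = qr/
+\&        ^ (?<key> (?&IDENT) ) = (?<val> (?&NUMBER) ) $
+\&        (?(DEFINE)
+\&            (?<IDENT>  [A\-Za\-z_]\ew* )
+\&            (?<NUMBER> \ed+ )
+\&        )
+\&    /x;
+\&    if ( "width=42" =~ $re ) {
+\&        print "$+{key}=$+{val}\en";   # set by the wrapper groups only
+\&    }
+.Ve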
+.Sp
+Finally, keep in mind that subpatterns created inside a DEFINE block
+count towards the absolute and relative number of captures, so this:
+.Sp
+.Vb 5
+\&    my @captures = "a" =~ /(.)                  # First capture
+\&                           (?(DEFINE)
+\&                               (?<EXAMPLE> 1 )  # Second capture
+\&                           )/x;
+\&    say scalar @captures;
+.Ve
+.Sp
+will output 2, not 1. This is particularly important if you intend to
+compile the definitions with the \f(CW\*(C`qr//\*(C'\fR operator, and later
+interpolate them in another pattern.
+.RE
+.ie n .IP """(?>\fIpattern\fR)""" 4
+.el .IP \f(CW(?>\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Item "(?>pattern)"
+.PD 0
+.ie n .IP """(*atomic:\fIpattern\fR)""" 4
+.el .IP \f(CW(*atomic:\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Xref "(?>pattern) (*atomic backtrack backtracking atomic possessive"
+.IX Item "(*atomic:pattern)"
+.PD
+An "independent" subexpression, one which matches the substring
+that a standalone \fIpattern\fR would match if anchored at the given
+position, and it matches \fInothing other than this substring\fR.  This
+construct is useful for optimizations of what would otherwise be
+"eternal" matches, because it will not backtrack (see "Backtracking").
+It may also be useful in places where the "grab all you can, and do not
+give anything back" semantic is desirable.
+.Sp
+For example: \f(CW\*(C`^(?>a*)ab\*(C'\fR will never match, since \f(CW\*(C`(?>a*)\*(C'\fR
+(anchored at the beginning of string, as above) will match \fIall\fR
+characters \f(CW"a"\fR at the beginning of string, leaving no \f(CW"a"\fR for
+\&\f(CW\*(C`ab\*(C'\fR to match.  In contrast, \f(CW\*(C`a*ab\*(C'\fR will match the same as \f(CW\*(C`a+b\*(C'\fR,
+since the match of the subgroup \f(CW\*(C`a*\*(C'\fR is influenced by the following
+group \f(CW\*(C`ab\*(C'\fR (see "Backtracking").  In particular, \f(CW\*(C`a*\*(C'\fR inside
+\&\f(CW\*(C`a*ab\*(C'\fR will match fewer characters than a standalone \f(CW\*(C`a*\*(C'\fR, since
+this makes the tail match.
+.Sp
+\&\f(CW\*(C`(?>\fR\f(CIpattern\fR\f(CW)\*(C'\fR does not disable backtracking altogether once it has
+matched. It is still possible to backtrack past the construct, but not
+into it. So \f(CW\*(C`((?>a*)|(?>b*))ar\*(C'\fR will still match "bar".
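+.Sp
+The difference can be checked directly.  In this sketch the subject strings
+are chosen only for demonstration:
+.Sp
+.Vb 3
+\&    print "plain:  ", ( "aaab" =~ /^a*ab/            ? "yes" : "no" ), "\en";
+\&    print "atomic: ", ( "aaab" =~ /^(?>a*)ab/        ? "yes" : "no" ), "\en";
+\&    print "past:   ", ( "bar"  =~ /((?>a*)|(?>b*))ar/ ? "yes" : "no" ), "\en";
+.Ve
+.Sp
+This is expected to report "yes", "no" and "yes" respectively.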
+.Sp
+An effect similar to \f(CW\*(C`(?>\fR\f(CIpattern\fR\f(CW)\*(C'\fR may be achieved by writing
+\&\f(CW\*(C`(?=(\fR\f(CIpattern\fR\f(CW))\eg{\-1}\*(C'\fR.  This matches the same substring as a standalone
+\&\fIpattern\fR, and the following \f(CW\*(C`\eg{\-1}\*(C'\fR eats the matched string; it therefore
+makes a zero-length assertion into an analogue of \f(CW\*(C`(?>...)\*(C'\fR.
+(The difference between these two constructs is that the second one
+uses a capturing group, thus shifting ordinals of backreferences
+in the rest of a regular expression.)
+.Sp
+Consider this pattern:
+.Sp
+.Vb 8
+\&    m{ \e(
+\&          (
+\&            [^()]+           # x+
+\&          |
+\&            \e( [^()]* \e)
+\&          )+
+\&       \e)
+\&     }x
+.Ve
+.Sp
+That will efficiently match a nonempty group with matching parentheses
+two levels deep or less.  However, if there is no such group, it
+will take virtually forever on a long string.  That's because there
+are so many different ways to split a long string into several
+substrings.  This is what \f(CW\*(C`(.+)+\*(C'\fR is doing, and \f(CW\*(C`(.+)+\*(C'\fR is similar
+to a subpattern of the above pattern.  Consider how the pattern
+above detects no-match on \f(CW\*(C`((()aaaaaaaaaaaaaaaaaa\*(C'\fR in several
+seconds, but that each extra letter doubles this time.  This
+exponential performance will make it appear that your program has
+hung.  However, a tiny change to this pattern
+.Sp
+.Vb 8
+\&    m{ \e(
+\&          (
+\&            (?> [^()]+ )        # change x+ above to (?> x+ )
+\&          |
+\&            \e( [^()]* \e)
+\&          )+
+\&       \e)
+\&     }x
+.Ve
+.Sp
+which uses \f(CW\*(C`(?>...)\*(C'\fR matches exactly when the one above does (verifying
+this yourself would be a productive exercise), but finishes in a quarter of
+the time when used on a similar string with 1000000 \f(CW"a"\fRs.  Be aware,
+however, that, when this construct is followed by a
+quantifier, it currently triggers a warning message under
+the \f(CW\*(C`use warnings\*(C'\fR pragma or \fB\-w\fR switch saying it
+\&\f(CW"matches null string many times in regex"\fR.
+.Sp
+On simple groups, such as the pattern \f(CW\*(C`(?> [^()]+ )\*(C'\fR, a comparable
+effect may be achieved by negative lookahead, as in \f(CW\*(C`[^()]+ (?! [^()] )\*(C'\fR.
+This was only 4 times slower on a string with 1000000 \f(CW"a"\fRs.
+.Sp
+The "grab all you can, and do not give anything back" semantic is desirable
+in many situations where at first sight a simple \f(CW\*(C`()*\*(C'\fR looks like
+the correct solution.  Suppose we parse text with comments being delimited
+by \f(CW"#"\fR followed by some optional (horizontal) whitespace.  Contrary to
+its appearance, \f(CW\*(C`#[ \et]*\*(C'\fR \fIis not\fR the correct subexpression to match
+the comment delimiter, because it may "give up" some whitespace if
+the remainder of the pattern can be made to match that way.  The correct
+answer is either one of these:
+.Sp
+.Vb 2
+\&    (?>#[ \et]*)
+\&    #[ \et]*(?![ \et])
+.Ve
+.Sp
+For example, to grab non-empty comments into \f(CW$1\fR, one should use either
+one of these:
+.Sp
+.Vb 2
+\&    / (?> \e# [ \et]* ) (        .+ ) /x;
+\&    /     \e# [ \et]*   ( [^ \et] .* ) /x;
+.Ve
+.Sp
+Which one you pick depends on which of these expressions better reflects
+the above specification of comments.
+.Sp
+In some literature this construct is called "atomic matching" or
+"possessive matching".
+.Sp
+Possessive quantifiers are equivalent to putting the item they are applied
+to inside of one of these constructs. The following equivalences apply:
+.Sp
+.Vb 6
+\&    Quantifier Form     Bracketing Form
+\&    \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-     \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
+\&    PAT*+               (?>PAT*)
+\&    PAT++               (?>PAT+)
+\&    PAT?+               (?>PAT?)
+\&    PAT{min,max}+       (?>PAT{min,max})
+.Ve
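+.Sp
+As a quick check of the \f(CW\*(C`PAT++\*(C'\fR row (a sketch; the subject string is
+arbitrary), neither form gives back what it has taken, whereas the plain
+greedy quantifier does:
+.Sp
+.Vb 3
+\&    print "greedy:     ", ( "aaa" =~ /a+a/     ? 1 : 0 ), "\en";   # 1
+\&    print "possessive: ", ( "aaa" =~ /a++a/    ? 1 : 0 ), "\en";   # 0
+\&    print "bracketed:  ", ( "aaa" =~ /(?>a+)a/ ? 1 : 0 ), "\en";   # 0
+.Ve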
+.Sp
+Nested \f(CW\*(C`(?>...)\*(C'\fR constructs are not no-ops, even if at first glance
+they might seem to be.  This is because the nested \f(CW\*(C`(?>...)\*(C'\fR can
+restrict internal backtracking that otherwise might occur.  For example,
+.Sp
+.Vb 1
+\& "abc" =~ /(?>a[bc]*c)/
+.Ve
+.Sp
+matches, but
+.Sp
+.Vb 1
+\& "abc" =~ /(?>a(?>[bc]*)c)/
+.Ve
+.Sp
+does not.
+.ie n .IP """(?[ ])""" 4
+.el .IP "\f(CW(?[ ])\fR" 4
+.IX Item "(?[ ])"
+See "Extended Bracketed Character Classes" in perlrecharclass.
+.SS Backtracking
+.IX Xref "backtrack backtracking"
+.IX Subsection "Backtracking"
+NOTE: This section presents an abstract approximation of regular
+expression behavior.  For a more rigorous (and complicated) view of
+the rules involved in selecting a match among possible alternatives,
+see "Combining RE Pieces".
+.PP
+A fundamental feature of regular expression matching involves the
+notion called \fIbacktracking\fR, which is currently used (when needed)
+by all non-possessive regular expression quantifiers, namely \f(CW"*"\fR,
+\&\f(CW\*(C`*?\*(C'\fR, \f(CW"+"\fR, \f(CW\*(C`+?\*(C'\fR, \f(CW\*(C`{n,m}\*(C'\fR, and \f(CW\*(C`{n,m}?\*(C'\fR.  Backtracking is often
+optimized internally, but the general principle outlined here is valid.
+.PP
+For a regular expression to match, the \fIentire\fR regular expression must
+match, not just part of it.  So if the beginning of a pattern containing a
+quantifier succeeds in a way that causes later parts in the pattern to
+fail, the matching engine backs up and recalculates the beginning
+part\-\-that's why it's called backtracking.
+.PP
+Here is an example of backtracking:  Let's say you want to find the
+word following "foo" in the string "Food is on the foo table.":
+.PP
+.Vb 4
+\&    $_ = "Food is on the foo table.";
+\&    if ( /\eb(foo)\es+(\ew+)/i ) {
+\&        print "$2 follows $1.\en";
+\&    }
+.Ve
+.PP
+When the match runs, the first part of the regular expression (\f(CW\*(C`\eb(foo)\*(C'\fR)
+finds a possible match right at the beginning of the string, and loads up
+\&\f(CW$1\fR with "Foo".  However, as soon as the matching engine sees that there's
+no whitespace following the "Foo" that it had saved in \f(CW$1\fR, it realizes its
+mistake and starts over again one character after where it had the
+tentative match.  This time it goes all the way until the next occurrence
+of "foo". The complete regular expression matches this time, and you get
+the expected output of "table follows foo."
+.PP
+Sometimes minimal matching can help a lot.  Imagine you'd like to match
+everything between "foo" and "bar".  Initially, you write something
+like this:
+.PP
+.Vb 4
+\&    $_ =  "The food is under the bar in the barn.";
+\&    if ( /foo(.*)bar/ ) {
+\&        print "got <$1>\en";
+\&    }
+.Ve
+.PP
+Which perhaps unexpectedly yields:
+.PP
+.Vb 1
+\&  got <d is under the bar in the >
+.Ve
+.PP
+That's because \f(CW\*(C`.*\*(C'\fR was greedy, so you get everything between the
+\&\fIfirst\fR "foo" and the \fIlast\fR "bar".  Here it's more effective
+to use minimal matching to make sure you get the text between a "foo"
+and the first "bar" thereafter.
+.PP
+.Vb 2
+\&    if ( /foo(.*?)bar/ ) { print "got <$1>\en" }
+\&  got <d is under the >
+.Ve
+.PP
+Here's another example. Let's say you'd like to match a number at the end
+of a string, and you also want to keep the preceding part of the match.
+So you write this:
+.PP
+.Vb 4
+\&    $_ = "I have 2 numbers: 53147";
+\&    if ( /(.*)(\ed*)/ ) {                                # Wrong!
+\&        print "Beginning is <$1>, number is <$2>.\en";
+\&    }
+.Ve
+.PP
+That won't work at all, because \f(CW\*(C`.*\*(C'\fR was greedy and gobbled up the
+whole string. As \f(CW\*(C`\ed*\*(C'\fR can match on an empty string the complete
+regular expression matched successfully.
+.PP
+.Vb 1
+\&    Beginning is <I have 2 numbers: 53147>, number is <>.
+.Ve
+.PP
+Here are some variants, most of which don't work:
+.PP
+.Vb 11
+\&    $_ = "I have 2 numbers: 53147";
+\&    @pats = qw{
+\&        (.*)(\ed*)
+\&        (.*)(\ed+)
+\&        (.*?)(\ed*)
+\&        (.*?)(\ed+)
+\&        (.*)(\ed+)$
+\&        (.*?)(\ed+)$
+\&        (.*)\eb(\ed+)$
+\&        (.*\eD)(\ed+)$
+\&    };
+\&
+\&    for $pat (@pats) {
+\&        printf "%\-12s ", $pat;
+\&        if ( /$pat/ ) {
+\&            print "<$1> <$2>\en";
+\&        } else {
+\&            print "FAIL\en";
+\&        }
+\&    }
+.Ve
+.PP
+That will print out:
+.PP
+.Vb 8
+\&    (.*)(\ed*)    <I have 2 numbers: 53147> <>
+\&    (.*)(\ed+)    <I have 2 numbers: 5314> <7>
+\&    (.*?)(\ed*)   <> <>
+\&    (.*?)(\ed+)   <I have > <2>
+\&    (.*)(\ed+)$   <I have 2 numbers: 5314> <7>
+\&    (.*?)(\ed+)$  <I have 2 numbers: > <53147>
+\&    (.*)\eb(\ed+)$ <I have 2 numbers: > <53147>
+\&    (.*\eD)(\ed+)$ <I have 2 numbers: > <53147>
+.Ve
+.PP
+As you see, this can be a bit tricky.  It's important to realize that a
+regular expression is merely a set of assertions that gives a definition
+of success.  There may be 0, 1, or several different ways that the
+definition might succeed against a particular string.  And if there are
+multiple ways it might succeed, you need to understand backtracking to
+know which variety of success you will achieve.
+.PP
+When using lookahead assertions and negations, this can all get even
+trickier.  Imagine you'd like to find a sequence of non-digits not
+followed by "123".  You might try to write that as
+.PP
+.Vb 4
+\&    $_ = "ABC123";
+\&    if ( /^\eD*(?!123)/ ) {                # Wrong!
+\&        print "Yup, no 123 in $_\en";
+\&    }
+.Ve
+.PP
+But that isn't going to match; at least, not the way you're hoping.  It
+claims that there is no 123 in the string.  Here's a clearer picture of
+why that pattern matches, contrary to popular expectations:
+.PP
+.Vb 2
+\&    $x = \*(AqABC123\*(Aq;
+\&    $y = \*(AqABC445\*(Aq;
+\&
+\&    print "1: got $1\en" if $x =~ /^(ABC)(?!123)/;
+\&    print "2: got $1\en" if $y =~ /^(ABC)(?!123)/;
+\&
+\&    print "3: got $1\en" if $x =~ /^(\eD*)(?!123)/;
+\&    print "4: got $1\en" if $y =~ /^(\eD*)(?!123)/;
+.Ve
+.PP
+This prints
+.PP
+.Vb 3
+\&    2: got ABC
+\&    3: got AB
+\&    4: got ABC
+.Ve
+.PP
+You might have expected test 3 to fail because it seems to be a more
+general purpose version of test 1.  The important difference between
+them is that test 3 contains a quantifier (\f(CW\*(C`\eD*\*(C'\fR) and so can use
+backtracking, whereas test 1 will not.  What's happening is
+that you've asked "Is it true that at the start of \f(CW$x\fR, following 0 or more
+non-digits, you have something that's not 123?"  If the pattern matcher had
+let \f(CW\*(C`\eD*\*(C'\fR expand to "ABC", this would have caused the whole pattern to
+fail.
+.PP
+The search engine will initially match \f(CW\*(C`\eD*\*(C'\fR with "ABC".  Then it will
+try to match \f(CW\*(C`(?!123)\*(C'\fR with "123", which fails.  But because
+a quantifier (\f(CW\*(C`\eD*\*(C'\fR) has been used in the regular expression, the
+search engine can backtrack and retry the match differently
+in the hope of matching the complete regular expression.
+.PP
+The pattern really, \fIreally\fR wants to succeed, so it uses the
+standard pattern back-off-and-retry and lets \f(CW\*(C`\eD*\*(C'\fR expand to just "AB" this
+time.  Now there's indeed something following "AB" that is not
+"123".  It's "C123", which suffices.
+.PP
+We can deal with this by using both an assertion and a negation.
+We'll say that the first part in \f(CW$1\fR must be followed both by a digit
+and by something that's not "123".  Remember that the lookaheads
+are zero-width expressions\-\-they only look, but don't consume any
+of the string in their match.  So rewriting this way produces what
+you'd expect; that is, case 5 will fail, but case 6 succeeds:
+.PP
+.Vb 2
+\&    print "5: got $1\en" if $x =~ /^(\eD*)(?=\ed)(?!123)/;
+\&    print "6: got $1\en" if $y =~ /^(\eD*)(?=\ed)(?!123)/;
+\&
+\&    6: got ABC
+.Ve
+.PP
+In other words, the two zero-width assertions next to each other work as though
+they're ANDed together, just as you'd use any built-in assertions:  \f(CW\*(C`/^$/\*(C'\fR
+matches only if you're at the beginning of the line AND the end of the
+line simultaneously.  The deeper underlying truth is that juxtaposition in
+regular expressions always means AND, except when you write an explicit OR
+using the vertical bar.  \f(CW\*(C`/ab/\*(C'\fR means match "a" AND (then) match "b",
+although the attempted matches are made at different positions because "a"
+is not a zero-width assertion, but a one-width assertion.
+.PP
+\&\fBWARNING\fR: Particularly complicated regular expressions can take
+exponential time to solve because of the immense number of possible
+ways they can use backtracking to try for a match.  For example, without
+internal optimizations done by the regular expression engine, this will
+take a painfully long time to run:
+.PP
+.Vb 1
+\&    \*(Aqaaaaaaaaaaaa\*(Aq =~ /((a{0,5}){0,5})*[c]/
+.Ve
+.PP
+And if you used \f(CW"*"\fR's in the internal groups instead of limiting them
+to 0 through 5 matches, then it would take forever\-\-or until you ran
+out of stack space.  Moreover, these internal optimizations are not
+always applicable.  For example, if you put \f(CW\*(C`{0,5}\*(C'\fR instead of \f(CW"*"\fR
+on the external group, no current optimization is applicable, and the
+match takes a long time to finish.
+.PP
+A powerful tool for optimizing such beasts is what is known as an
+"independent group",
+which does not backtrack (see \f(CW"(?>pattern)"\fR).  Note also that
+zero-length lookahead/lookbehind assertions will not backtrack to make
+the tail match, since they are in "logical" context: only
+whether they match is considered relevant.  For an example
+where side-effects of lookahead \fImight\fR have influenced the
+following match, see \f(CW"(?>pattern)"\fR.
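+.PP
+As a minimal sketch of how an independent group discards backtracking
+positions (the sample string is only an illustration), compare an
+ordinary greedy quantifier with its atomic counterpart:
+.PP
+.Vb 5
+\&    my $s = \*(Aqaaab\*(Aq;
+\&    # a+ gives back one "a" so that the trailing "ab" can match
+\&    print "plain:  ", ($s =~ /a+ab/     ? "match" : "no match"), "\en";
+\&    # (?>a+) keeps every "a" it took, so "ab" can never follow it
+\&    print "atomic: ", ($s =~ /(?>a+)ab/ ? "match" : "no match"), "\en";
+.Ve
+.PP
+The first test should report a match and the second should not.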
+.SS "Script Runs"
+.IX Xref "(*script_run:...) (sr:...) (*atomic_script_run:...) (asr:...)"
+.IX Subsection "Script Runs"
+A script run is basically a sequence of characters, all from the same
+Unicode script (see "Scripts" in perlunicode), such as Latin or Greek.  In
+most places a single word would never be written in multiple scripts,
+unless it is a spoofing attack.  An infamous example is
+.PP
+.Vb 1
+\& paypal.com
+.Ve
+.PP
+Those letters could all be Latin (as in the example just above), or they
+could be all Cyrillic (except for the dot), or they could be a mixture
+of the two.  In the case of an internet address the \f(CW\*(C`.com\*(C'\fR would be in
+Latin, and any Cyrillic ones would cause it to be a mixture, not a
+script run.  Someone clicking on such a link would not be directed to
+the real Paypal website, but to a look-alike site crafted by an attacker
+to gather sensitive information from the person.
+.PP
+Starting in Perl 5.28, it is now easy to detect strings that aren't
+script runs.  Simply enclose just about any pattern like either of
+these:
+.PP
+.Vb 2
+\& (*script_run:pattern)
+\& (*sr:pattern)
+.Ve
+.PP
+What happens is that after \fIpattern\fR succeeds in matching, it is
+subjected to the additional criterion that every character in it must be
+from the same script (see exceptions below).  If this isn't true,
+backtracking occurs until something all in the same script is found that
+matches, or all possibilities are exhausted.  This can cause a lot of
+backtracking, but generally, only malicious input will result in this,
+though the slow down could cause a denial of service attack.  If your
+needs permit, it is best to make the pattern atomic to cut down on the
+amount of backtracking.  This is so likely to be what you want, that
+instead of writing this:
+.PP
+.Vb 1
+\& (*script_run:(?>pattern))
+.Ve
+.PP
+you can write either of these:
+.PP
+.Vb 2
+\& (*atomic_script_run:pattern)
+\& (*asr:pattern)
+.Ve
+.PP
+(See \f(CW"(?>\fR\f(CIpattern\fR\f(CW)"\fR.)
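+.PP
+A minimal sketch of such a check, assuming a Perl recent enough to
+support the construct; the variable names and sample strings are
+illustrative only:
+.PP
+.Vb 8
+\&    my $latin = "paypal";
+\&    my $mixed = "p\ex{430}yp\ex{430}l";  # U+0430 CYRILLIC SMALL LETTER A
+\&
+\&    for my $s ($latin, $mixed) {
+\&        # \ew+ must cover the whole string and stay within one script
+\&        my $verdict = ($s =~ /^(*sr:\ew+)$/) ? "script run" : "mixed scripts";
+\&        print "$verdict\en";
+\&    }
+.Ve
+.PP
+The all-Latin string should be accepted, while the string that mixes in
+Cyrillic letters should be rejected.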
+.PP
+In Taiwan, Japan, and Korea, it is common for text to have a mixture of
+characters from their native scripts and base Chinese.  Perl follows
+Unicode's UTS 39 (<https://unicode.org/reports/tr39/>) Unicode Security
+Mechanisms in allowing such mixtures.  For example, the Japanese scripts
+Katakana and Hiragana are commonly mixed together in practice, along
+with some Chinese characters, and hence are treated as being in a single
+script run by Perl.
+.PP
+The rules used for matching decimal digits are slightly stricter.  Many
+scripts have their own sets of digits equivalent to the Western \f(CW0\fR
+through \f(CW9\fR ones.  A few, such as Arabic, have more than one set.  For
+a string to be considered a script run, all digits in it must come from
+the same set of ten, as determined by the first digit encountered.
+As an example,
+.PP
+.Vb 1
+\& qr/(*script_run: \ed+ \eb )/x
+.Ve
+.PP
+guarantees that the digits matched will all be from the same set of 10.
+You won't get a look-alike digit from a different script that has a
+different value than what it appears to be.
+.PP
+Unicode has three pseudo scripts that are handled specially.
+.PP
+"Unknown" is applied to code points whose meaning has yet to be
+determined.  Perl currently will match as a script run, any single
+character string consisting of one of these code points.  But any string
+longer than one code point containing one of these will not be
+considered a script run.
+.PP
+"Inherited" is applied to characters that modify another, such as an
+accent of some type.  These are considered to be in the script of the
+master character, and so never cause a script run to not match.
+.PP
+The other one is "Common".  This consists of mostly punctuation, emoji,
+characters used in mathematics and music, the ASCII digits \f(CW0\fR
+through \f(CW9\fR, and full-width forms of these digits.  These characters
+can appear intermixed in text in many of the world's scripts.  These
+also don't cause a script run to not match.  But like other scripts, all
+digits in a run must come from the same set of 10.
+.PP
+This construct is non-capturing.  You can add parentheses to \fIpattern\fR
+to capture, if desired.  You will have to do this if you plan to use
+"(*ACCEPT) (*ACCEPT:arg)" and not have it bypass the script run
+checking.
+.PP
+The \f(CW\*(C`Script_Extensions\*(C'\fR property as modified by UTS 39
+(<https://unicode.org/reports/tr39/>) is used as the basis for this
+feature.
+.PP
+To summarize,
+.IP \(bu 4
+All length 0 or length 1 sequences are script runs.
+.IP \(bu 4
+A longer sequence is a script run if and only if \fBall\fR of the following
+conditions are met:
+.Sp
+
+.RS 4
+.IP 1. 4
+No code point in the sequence has the \f(CW\*(C`Script_Extensions\*(C'\fR property of
+\&\f(CW\*(C`Unknown\*(C'\fR.
+.Sp
+This currently means that all code points in the sequence have been
+assigned by Unicode to be characters that aren't private use nor
+surrogate code points.
+.IP 2. 4
+All characters in the sequence come from the Common script and/or the
+Inherited script and/or a single other script.
+.Sp
+The script of a character is determined by the \f(CW\*(C`Script_Extensions\*(C'\fR
+property as modified by UTS 39 (<https://unicode.org/reports/tr39/>), as
+described above.
+.IP 3. 4
+All decimal digits in the sequence come from the same block of 10
+consecutive digits.
+.RE
+.RS 4
+.RE
+.SS "Special Backtracking Control Verbs"
+.IX Subsection "Special Backtracking Control Verbs"
+These special patterns are generally of the form \f(CW\*(C`(*\fR\f(CIVERB\fR\f(CW:\fR\f(CIarg\fR\f(CW)\*(C'\fR. Unless
+otherwise stated the \fIarg\fR argument is optional; in some cases, it is
+mandatory.
+.PP
+Any pattern containing a special backtracking verb that allows an argument
+has the special behaviour that when executed it sets the current package's
+\&\f(CW$REGERROR\fR and \f(CW$REGMARK\fR variables. When doing so the following
+rules apply:
+.PP
+On failure, the \f(CW$REGERROR\fR variable will be set to the \fIarg\fR value of the
+verb pattern, if the verb was involved in the failure of the match. If the
+\&\fIarg\fR part of the pattern was omitted, then \f(CW$REGERROR\fR will be set to the
+name of the last \f(CW\*(C`(*MARK:\fR\f(CINAME\fR\f(CW)\*(C'\fR pattern executed, or to TRUE if there was
+none. Also, the \f(CW$REGMARK\fR variable will be set to FALSE.
+.PP
+On a successful match, the \f(CW$REGERROR\fR variable will be set to FALSE, and
+the \f(CW$REGMARK\fR variable will be set to the name of the last
+\&\f(CW\*(C`(*MARK:\fR\f(CINAME\fR\f(CW)\*(C'\fR pattern executed.  See the explanation for the
+\&\f(CW\*(C`(*MARK:\fR\f(CINAME\fR\f(CW)\*(C'\fR verb below for more details.
+.PP
+\&\fBNOTE:\fR \f(CW$REGERROR\fR and \f(CW$REGMARK\fR are not magic variables like \f(CW$1\fR
+and most other regex-related variables. They are not local to a scope, nor
+readonly, but instead are volatile package variables similar to \f(CW$AUTOLOAD\fR.
+They are set in the package containing the code that \fIexecuted\fR the regex
+(rather than the one that compiled it, where those differ).  If necessary, you
+can use \f(CW\*(C`local\*(C'\fR to localize changes to these variables to a specific scope
+before executing a regex.
+.PP
+If a pattern does not contain a special backtracking verb that allows an
+argument, then \f(CW$REGERROR\fR and \f(CW$REGMARK\fR are not touched at all.
+.IP Verbs 3
+.IX Item "Verbs"
+.RS 3
+.PD 0
+.ie n .IP """(*PRUNE)"" ""(*PRUNE:\fINAME\fR)""" 4
+.el .IP "\f(CW(*PRUNE)\fR \f(CW(*PRUNE:\fR\f(CINAME\fR\f(CW)\fR" 4
+.IX Xref "(*PRUNE) (*PRUNE:NAME)"
+.IX Item "(*PRUNE) (*PRUNE:NAME)"
+.PD
+This zero-width pattern prunes the backtracking tree at the current point
+when backtracked into on failure. Consider the pattern \f(CW\*(C`/\fR\f(CIA\fR\f(CW (*PRUNE) \fR\f(CIB\fR\f(CW/\*(C'\fR,
+where \fIA\fR and \fIB\fR are complex patterns. Until the \f(CW\*(C`(*PRUNE)\*(C'\fR verb is reached,
+\&\fIA\fR may backtrack as necessary to match. Once it is reached, matching
+continues in \fIB\fR, which may also backtrack as necessary; however, should \fIB\fR
+not match, then no further backtracking will take place, and the pattern
+will fail outright at the current starting position.
+.Sp
+The following example counts all the possible matching strings in a
+pattern (without actually matching any of them).
+.Sp
+.Vb 2
+\&    \*(Aqaaab\*(Aq =~ /a+b?(?{print "$&\en"; $count++})(*FAIL)/;
+\&    print "Count=$count\en";
+.Ve
+.Sp
+which produces:
+.Sp
+.Vb 10
+\&    aaab
+\&    aaa
+\&    aa
+\&    a
+\&    aab
+\&    aa
+\&    a
+\&    ab
+\&    a
+\&    Count=9
+.Ve
+.Sp
+If we add a \f(CW\*(C`(*PRUNE)\*(C'\fR before the count like the following
+.Sp
+.Vb 2
+\&    \*(Aqaaab\*(Aq =~ /a+b?(*PRUNE)(?{print "$&\en"; $count++})(*FAIL)/;
+\&    print "Count=$count\en";
+.Ve
+.Sp
+we prevent backtracking and find the count of the longest matching string
+at each matching starting point like so:
+.Sp
+.Vb 4
+\&    aaab
+\&    aab
+\&    ab
+\&    Count=3
+.Ve
+.Sp
+Any number of \f(CW\*(C`(*PRUNE)\*(C'\fR assertions may be used in a pattern.
+.Sp
+See also \f(CW"(?>\fR\f(CIpattern\fR\f(CW)"\fR and possessive quantifiers for
+other ways to
+control backtracking. In some cases, the use of \f(CW\*(C`(*PRUNE)\*(C'\fR can be
+replaced with a \f(CW\*(C`(?>pattern)\*(C'\fR with no functional difference; however,
+\&\f(CW\*(C`(*PRUNE)\*(C'\fR can be used to handle cases that cannot be expressed using a
+\&\f(CW\*(C`(?>pattern)\*(C'\fR alone.
+.ie n .IP """(*SKIP)"" ""(*SKIP:\fINAME\fR)""" 4
+.el .IP "\f(CW(*SKIP)\fR \f(CW(*SKIP:\fR\f(CINAME\fR\f(CW)\fR" 4
+.IX Xref "(*SKIP)"
+.IX Item "(*SKIP) (*SKIP:NAME)"
+This zero-width pattern is similar to \f(CW\*(C`(*PRUNE)\*(C'\fR, except that on
+failure it also signifies that whatever text was matched leading up
+to the \f(CW\*(C`(*SKIP)\*(C'\fR pattern being executed cannot be part of \fIany\fR match
+of this pattern. This effectively means that the regex engine "skips" forward
+to this position on failure and tries to match again (assuming that
+there is sufficient room to match).
+.Sp
+The name of the \f(CW\*(C`(*SKIP:\fR\f(CINAME\fR\f(CW)\*(C'\fR pattern has special significance. If a
+\&\f(CW\*(C`(*MARK:\fR\f(CINAME\fR\f(CW)\*(C'\fR was encountered while matching, then it is that position
+which is used as the "skip point". If no \f(CW\*(C`(*MARK)\*(C'\fR of that name was
+encountered, then the \f(CW\*(C`(*SKIP)\*(C'\fR operator has no effect. When used
+without a name the "skip point" is where the match point was when
+executing the \f(CW\*(C`(*SKIP)\*(C'\fR pattern.
+.Sp
+Compare the following to the examples in \f(CW\*(C`(*PRUNE)\*(C'\fR; note the string
+is twice as long:
+.Sp
+.Vb 2
+\& \*(Aqaaabaaab\*(Aq =~ /a+b?(*SKIP)(?{print "$&\en"; $count++})(*FAIL)/;
+\& print "Count=$count\en";
+.Ve
+.Sp
+outputs
+.Sp
+.Vb 3
+\&    aaab
+\&    aaab
+\&    Count=2
+.Ve
+.Sp
+Once the 'aaab' at the start of the string has matched, and the \f(CW\*(C`(*SKIP)\*(C'\fR
+executed, the next starting point will be where the cursor was when the
+\&\f(CW\*(C`(*SKIP)\*(C'\fR was executed.
+.ie n .IP """(*MARK:\fINAME\fR)"" ""(*:\fINAME\fR)""" 4
+.el .IP "\f(CW(*MARK:\fR\f(CINAME\fR\f(CW)\fR \f(CW(*:\fR\f(CINAME\fR\f(CW)\fR" 4
+.IX Xref "(*MARK) (*MARK:NAME) (*:NAME)"
+.IX Item "(*MARK:NAME) (*:NAME)"
+This zero-width pattern can be used to mark the point reached in a string
+when a certain part of the pattern has been successfully matched. This
+mark may be given a name. A later \f(CW\*(C`(*SKIP)\*(C'\fR pattern will then skip
+forward to that point if backtracked into on failure. Any number of
+\&\f(CW\*(C`(*MARK)\*(C'\fR patterns are allowed, and the \fINAME\fR portion may be duplicated.
+.Sp
+In addition to interacting with the \f(CW\*(C`(*SKIP)\*(C'\fR pattern, \f(CW\*(C`(*MARK:\fR\f(CINAME\fR\f(CW)\*(C'\fR
+can be used to "label" a pattern branch, so that after matching, the
+program can determine which branches of the pattern were involved in the
+match.
+.Sp
+When a match is successful, the \f(CW$REGMARK\fR variable will be set to the
+name of the most recently executed \f(CW\*(C`(*MARK:\fR\f(CINAME\fR\f(CW)\*(C'\fR that was involved
+in the match.
+.Sp
+This can be used to determine which branch of a pattern was matched
+without using a separate capture group for each branch, which in turn
+can result in a performance improvement, as perl cannot optimize
+\&\f(CW\*(C`/(?:(x)|(y)|(z))/\*(C'\fR as efficiently as something like
+\&\f(CW\*(C`/(?:x(*MARK:x)|y(*MARK:y)|z(*MARK:z))/\*(C'\fR.
+.Sp
+When a match has failed, and unless another verb has been involved in
+failing the match and has provided its own name to use, the \f(CW$REGERROR\fR
+variable will be set to the name of the most recently executed
+\&\f(CW\*(C`(*MARK:\fR\f(CINAME\fR\f(CW)\*(C'\fR.
+.Sp
+See "(*SKIP)" for more details.
+.Sp
+As a shortcut \f(CW\*(C`(*MARK:\fR\f(CINAME\fR\f(CW)\*(C'\fR can be written \f(CW\*(C`(*:\fR\f(CINAME\fR\f(CW)\*(C'\fR.
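+.Sp
+As a minimal sketch of branch labelling (the mark names are arbitrary
+and chosen only for this illustration):
+.Sp
+.Vb 7
+\&    our $REGMARK;   # the package variable set by the regex engine
+\&
+\&    for my $word (qw(x y z)) {
+\&        if ($word =~ /(?:x(*MARK:first)|y(*MARK:second)|z(*MARK:third))/) {
+\&            print "$word matched branch $REGMARK\en";
+\&        }
+\&    }
+.Ve
+.Sp
+Each iteration should report the name attached to the branch that matched.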
+.ie n .IP """(*THEN)"" ""(*THEN:\fINAME\fR)""" 4
+.el .IP "\f(CW(*THEN)\fR \f(CW(*THEN:\fR\f(CINAME\fR\f(CW)\fR" 4
+.IX Item "(*THEN) (*THEN:NAME)"
+This is similar to the "cut group" operator \f(CW\*(C`::\*(C'\fR from Raku.  Like
+\&\f(CW\*(C`(*PRUNE)\*(C'\fR, this verb always matches, and when backtracked into on
+failure, it causes the regex engine to try the next alternation in the
+innermost enclosing group (capturing or otherwise) that has alternations.
+The two branches of a \f(CW\*(C`(?(\fR\f(CIcondition\fR\f(CW)\fR\f(CIyes\-pattern\fR\f(CW|\fR\f(CIno\-pattern\fR\f(CW)\*(C'\fR do not
+count as an alternation, as far as \f(CW\*(C`(*THEN)\*(C'\fR is concerned.
+.Sp
+Its name comes from the observation that this operation combined with the
+alternation operator (\f(CW"|"\fR) can be used to create what is essentially a
+pattern-based if/then/else block:
+.Sp
+.Vb 1
+\&  ( COND (*THEN) FOO | COND2 (*THEN) BAR | COND3 (*THEN) BAZ )
+.Ve
+.Sp
+Note that if this operator is used and NOT inside of an alternation then
+it acts exactly like the \f(CW\*(C`(*PRUNE)\*(C'\fR operator.
+.Sp
+.Vb 1
+\&  / A (*PRUNE) B /
+.Ve
+.Sp
+is the same as
+.Sp
+.Vb 1
+\&  / A (*THEN) B /
+.Ve
+.Sp
+but
+.Sp
+.Vb 1
+\&  / ( A (*THEN) B | C ) /
+.Ve
+.Sp
+is not the same as
+.Sp
+.Vb 1
+\&  / ( A (*PRUNE) B | C ) /
+.Ve
+.Sp
+as after matching the \fIA\fR but failing on the \fIB\fR the \f(CW\*(C`(*THEN)\*(C'\fR verb will
+backtrack and try \fIC\fR; but the \f(CW\*(C`(*PRUNE)\*(C'\fR verb will simply fail.
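+.Sp
+A concrete sketch of that difference, using literal stand-ins for
+\fIA\fR, \fIB\fR and \fIC\fR (the sample string is illustrative only):
+.Sp
+.Vb 5
+\&    my $s = \*(Aqac\*(Aq;
+\&    # "a" matches, "b" fails, (*THEN) moves on to the "ac" branch
+\&    print "THEN:  ", ($s =~ /^(?:a(*THEN)b|ac)$/  ? "match" : "no match"), "\en";
+\&    # "a" matches, "b" fails, (*PRUNE) gives up at this start position
+\&    print "PRUNE: ", ($s =~ /^(?:a(*PRUNE)b|ac)$/ ? "match" : "no match"), "\en";
+.Ve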
+.ie n .IP """(*COMMIT)"" ""(*COMMIT:\fIarg\fR)""" 4
+.el .IP "\f(CW(*COMMIT)\fR \f(CW(*COMMIT:\fR\f(CIarg\fR\f(CW)\fR" 4
+.IX Xref "(*COMMIT)"
+.IX Item "(*COMMIT) (*COMMIT:arg)"
+This is the Raku "commit pattern" \f(CW\*(C`<commit>\*(C'\fR or \f(CW\*(C`:::\*(C'\fR. It's a
+zero-width pattern similar to \f(CW\*(C`(*SKIP)\*(C'\fR, except that when backtracked
+into on failure it causes the match to fail outright. No further attempts
+to find a valid match by advancing the start pointer will occur.
+For example,
+.Sp
+.Vb 2
+\& \*(Aqaaabaaab\*(Aq =~ /a+b?(*COMMIT)(?{print "$&\en"; $count++})(*FAIL)/;
+\& print "Count=$count\en";
+.Ve
+.Sp
+outputs
+.Sp
+.Vb 2
+\&    aaab
+\&    Count=1
+.Ve
+.Sp
+In other words, once the \f(CW\*(C`(*COMMIT)\*(C'\fR has been entered, and if the pattern
+does not match, the regex engine will not try any further matching on the
+rest of the string.
+.ie n .IP """(*FAIL)"" ""(*F)"" ""(*FAIL:\fIarg\fR)""" 4
+.el .IP "\f(CW(*FAIL)\fR \f(CW(*F)\fR \f(CW(*FAIL:\fR\f(CIarg\fR\f(CW)\fR" 4
+.IX Xref "(*FAIL) (*F)"
+.IX Item "(*FAIL) (*F) (*FAIL:arg)"
+This pattern matches nothing and always fails. It can be used to force the
+engine to backtrack. It is equivalent to \f(CW\*(C`(?!)\*(C'\fR, but easier to read. In
+fact, \f(CW\*(C`(?!)\*(C'\fR gets optimised into \f(CW\*(C`(*FAIL)\*(C'\fR internally. You can provide
+an argument so that if the match fails because of this \f(CW\*(C`FAIL\*(C'\fR directive
+the argument can be obtained from \f(CW$REGERROR\fR.
+.Sp
+It is probably useful only when combined with \f(CW\*(C`(?{})\*(C'\fR or \f(CW\*(C`(??{})\*(C'\fR.
+.ie n .IP """(*ACCEPT)"" ""(*ACCEPT:\fIarg\fR)""" 4
+.el .IP "\f(CW(*ACCEPT)\fR \f(CW(*ACCEPT:\fR\f(CIarg\fR\f(CW)\fR" 4
+.IX Xref "(*ACCEPT)"
+.IX Item "(*ACCEPT) (*ACCEPT:arg)"
+This pattern matches nothing and causes the end of successful matching at
+the point at which the \f(CW\*(C`(*ACCEPT)\*(C'\fR pattern was encountered, regardless of
+whether there is actually more to match in the string. When inside of a
+nested pattern, such as recursion, or in a subpattern dynamically generated
+via \f(CW\*(C`(??{})\*(C'\fR, only the innermost pattern is ended immediately.
+.Sp
+If the \f(CW\*(C`(*ACCEPT)\*(C'\fR is inside of capturing groups then the groups are
+marked as ended at the point at which the \f(CW\*(C`(*ACCEPT)\*(C'\fR was encountered.
+For instance:
+.Sp
+.Vb 1
+\&  \*(AqAB\*(Aq =~ /(A (A|B(*ACCEPT)|C) D)(E)/x;
+.Ve
+.Sp
+will match; \f(CW$1\fR will be \f(CW\*(C`AB\*(C'\fR, \f(CW$2\fR will be \f(CW"B"\fR, and \f(CW$3\fR will not
+be set. If another branch in the inner parentheses was matched, such as in the
+string 'ACDE', then the \f(CW"D"\fR and \f(CW"E"\fR would have to be matched as well.
+.Sp
+You can provide an argument, which will be available in the variable
+\&\f(CW$REGMARK\fR after the match completes.
+.RE
+.RS 3
+.RE
+.ie n .SS "Warning on ""\e1"" Instead of $1"
+.el .SS "Warning on \f(CW\e1\fP Instead of \f(CW$1\fP"
+.IX Subsection "Warning on 1 Instead of $1"
+Some people get too used to writing things like:
+.PP
+.Vb 1
+\&    $pattern =~ s/(\eW)/\e\e\e1/g;
+.Ve
+.PP
+This is grandfathered (for \e1 to \e9) for the RHS of a substitute to avoid
+shocking the
+\&\fBsed\fR addicts, but it's a dirty habit to get into.  That's because in
+PerlThink, the righthand side of an \f(CW\*(C`s///\*(C'\fR is a double-quoted string.  \f(CW\*(C`\e1\*(C'\fR in
+the usual double-quoted string means a control-A.  The customary Unix
+meaning of \f(CW\*(C`\e1\*(C'\fR is kludged in for \f(CW\*(C`s///\*(C'\fR.  However, if you get into the habit
+of doing that, you get yourself into trouble if you then add an \f(CW\*(C`/e\*(C'\fR
+modifier.
+.PP
+.Vb 1
+\&    s/(\ed+)/ \e1 + 1 /eg;            # causes warning under \-w
+.Ve
+.PP
+Or if you try to do
+.PP
+.Vb 1
+\&    s/(\ed+)/\e1000/;
+.Ve
+.PP
+You can't disambiguate that by saying \f(CW\*(C`\e{1}000\*(C'\fR, whereas you can fix it with
+\&\f(CW\*(C`${1}000\*(C'\fR.  The operation of interpolation should not be confused
+with the operation of matching a backreference.  Certainly they mean two
+different things on the \fIleft\fR side of the \f(CW\*(C`s///\*(C'\fR.
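+.PP
+As a short sketch of the recommended forms (the sample data is
+illustrative only):
+.PP
+.Vb 7
+\&    my $s = \*(Aq7 apples\*(Aq;
+\&
+\&    (my $t = $s) =~ s/(\ed+)/${1}000/;    # braces end the variable name
+\&    print "$t\en";                        # 7000 apples
+\&
+\&    (my $u = $s) =~ s/(\ed+)/ $1 + 1 /e;  # $1 is a real variable under /e
+\&    print "$u\en";                        # 8 apples
+.Ve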
+.SS "Repeated Patterns Matching a Zero-length Substring"
+.IX Subsection "Repeated Patterns Matching a Zero-length Substring"
+\&\fBWARNING\fR: Difficult material (and prose) ahead.  This section needs a rewrite.
+.PP
+Regular expressions provide a terse and powerful programming language.  As
+with most other power tools, power comes together with the ability
+to wreak havoc.
+.PP
+A common abuse of this power stems from the ability to make infinite
+loops using regular expressions, with something as innocuous as:
+.PP
+.Vb 1
+\&    \*(Aqfoo\*(Aq =~ m{ ( o? )* }x;
+.Ve
+.PP
+The \f(CW\*(C`o?\*(C'\fR matches at the beginning of "\f(CW\*(C`foo\*(C'\fR", and since the position
+in the string is not moved by the match, \f(CW\*(C`o?\*(C'\fR would match again and again
+because of the \f(CW"*"\fR quantifier.  Another common way to create a similar cycle
+is with the looping modifier \f(CW\*(C`/g\*(C'\fR:
+.PP
+.Vb 1
+\&    @matches = ( \*(Aqfoo\*(Aq =~ m{ o? }xg );
+.Ve
+.PP
+or
+.PP
+.Vb 1
+\&    print "match: <$&>\en" while \*(Aqfoo\*(Aq =~ m{ o? }xg;
+.Ve
+.PP
+or the loop implied by \f(CWsplit()\fR.
+.PP
+However, long experience has shown that many programming tasks may
+be significantly simplified by using repeated subexpressions that
+may match zero-length substrings.  Here's a simple example:
+.PP
+.Vb 2
+\&    @chars = split //, $string;           # // is not magic in split
+\&    ($whitewashed = $string) =~ s/()/ /g; # parens avoid magic s// /
+.Ve
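+.PP
+For instance, assuming \f(CW$string\fR holds \f(CW"abc"\fR, a quick sketch of what
+these two statements produce:
+.PP
+.Vb 6
+\&    my $string = \*(Aqabc\*(Aq;
+\&    my @chars  = split //, $string;
+\&    print join(\*(Aq,\*(Aq, @chars), "\en";     # a,b,c
+\&
+\&    (my $whitewashed = $string) =~ s/()/ /g;
+\&    print "[$whitewashed]\en";           # [ a b c ]
+.Ve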
+.PP
+Thus Perl allows such constructs, by \fIforcefully breaking
+the infinite loop\fR.  The rules for this are different for lower-level
+loops given by the greedy quantifiers \f(CW\*(C`*+{}\*(C'\fR, and for higher-level
+ones like the \f(CW\*(C`/g\*(C'\fR modifier or \f(CWsplit()\fR operator.
+.PP
+The lower-level loops are \fIinterrupted\fR (that is, the loop is
+broken) when Perl detects that a repeated expression matched a
+zero-length substring.   Thus
+.PP
+.Vb 1
+\&   m{ (?: NON_ZERO_LENGTH | ZERO_LENGTH )* }x;
+.Ve
+.PP
+is made equivalent to
+.PP
+.Vb 1
+\&   m{ (?: NON_ZERO_LENGTH )* (?: ZERO_LENGTH )? }x;
+.Ve
+.PP
+For example, this program
+.PP
+.Vb 12
+\&   #!perl \-l
+\&   "aaaaab" =~ /
+\&     (?:
+\&        a                 # non\-zero
+\&        |                 # or
+\&       (?{print "hello"}) # print hello whenever this
+\&                          #    branch is tried
+\&       (?=(b))            # zero\-width assertion
+\&     )*  # any number of times
+\&    /x;
+\&   print $&;
+\&   print $1;
+.Ve
+.PP
+prints
+.PP
+.Vb 3
+\&   hello
+\&   aaaaa
+\&   b
+.Ve
+.PP
+Notice that "hello" is only printed once, as when Perl sees that the sixth
+iteration of the outermost \f(CW\*(C`(?:)*\*(C'\fR matches a zero-length string, it stops
+the \f(CW"*"\fR.
+.PP
+The higher-level loops preserve an additional state between iterations:
+whether the last match was zero-length.  To break the loop, the following
+match after a zero-length match is prohibited from having a length of zero.
+This prohibition interacts with backtracking (see "Backtracking"),
+and so the \fIsecond best\fR match is chosen if the \fIbest\fR match is of
+zero length.
+.PP
+For example:
+.PP
+.Vb 2
+\&    $_ = \*(Aqbar\*(Aq;
+\&    s/\ew??/<$&>/g;
+.Ve
+.PP
+results in \f(CW\*(C`<><b><><a><><r><>\*(C'\fR.  At each position of the string the best
+match given by non-greedy \f(CW\*(C`??\*(C'\fR is the zero-length match, and the \fIsecond
+best\fR match is what is matched by \f(CW\*(C`\ew\*(C'\fR.  Thus zero-length matches
+alternate with one-character-long matches.
+.PP
+Similarly, for repeated \f(CW\*(C`m/()/g\*(C'\fR the second-best match is the match at the
+position one notch further in the string.
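+.PP
+A small sketch that records where each of those empty matches lands
+(the sample string is illustrative):
+.PP
+.Vb 4
+\&    my $s = \*(Aqabc\*(Aq;
+\&    my @where;
+\&    push @where, pos($s) while $s =~ m/()/g;
+\&    print "@where\en";                   # 0 1 2 3
+.Ve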
+.PP
+The additional state of being \fImatched with zero-length\fR is associated with
+the matched string, and is reset by each assignment to \f(CWpos()\fR.
+Zero-length matches at the end of the previous match are ignored
+during \f(CW\*(C`split\*(C'\fR.
+.SS "Combining RE Pieces"
+.IX Subsection "Combining RE Pieces"
+Each of the elementary pieces of regular expressions which were described
+before (such as \f(CW\*(C`ab\*(C'\fR or \f(CW\*(C`\eZ\*(C'\fR) could match at most one substring
+at the given position of the input string.  However, in a typical regular
+expression these elementary pieces are combined into more complicated
+patterns using combining operators \f(CW\*(C`ST\*(C'\fR, \f(CW\*(C`S|T\*(C'\fR, \f(CW\*(C`S*\*(C'\fR \fIetc\fR.
+(in these examples \f(CW"S"\fR and \f(CW"T"\fR are regular subexpressions).
+.PP
+Such combinations can include alternatives, leading to a problem of choice:
+if we match a regular expression \f(CW\*(C`a|ab\*(C'\fR against \f(CW"abc"\fR, will it match
+substring \f(CW"a"\fR or \f(CW"ab"\fR?  One way to describe which substring is
+actually matched is the concept of backtracking (see "Backtracking").
+However, this description is too low-level and makes you think
+in terms of a particular implementation.
+.PP
+Another description starts with notions of "better"/"worse".  All the
+substrings which may be matched by the given regular expression can be
+sorted from the "best" match to the "worst" match, and it is the "best"
+match which is chosen.  This substitutes the question of "what is chosen?"
+by the question of "which matches are better, and which are worse?".
+.PP
+Again, for elementary pieces there is no such question, since at most
+one match at a given position is possible.  This section describes the
+notion of better/worse for combining operators.  In the description
+below \f(CW"S"\fR and \f(CW"T"\fR are regular subexpressions.
+.ie n .IP """ST""" 4
+.el .IP \f(CWST\fR 4
+.IX Item "ST"
+Consider two possible matches, \f(CW\*(C`AB\*(C'\fR and \f(CW\*(C`A\*(AqB\*(Aq\*(C'\fR, where \f(CW"A"\fR and \f(CW\*(C`A\*(Aq\*(C'\fR are
+substrings which can be matched by \f(CW"S"\fR, and \f(CW"B"\fR and \f(CW\*(C`B\*(Aq\*(C'\fR are substrings
+which can be matched by \f(CW"T"\fR.
+.Sp
+If \f(CW"A"\fR is a better match for \f(CW"S"\fR than \f(CW\*(C`A\*(Aq\*(C'\fR, \f(CW\*(C`AB\*(C'\fR is a better
+match than \f(CW\*(C`A\*(AqB\*(Aq\*(C'\fR.
+.Sp
+If \f(CW"A"\fR and \f(CW\*(C`A\*(Aq\*(C'\fR coincide: \f(CW\*(C`AB\*(C'\fR is a better match than \f(CW\*(C`AB\*(Aq\*(C'\fR if
+\&\f(CW"B"\fR is a better match for \f(CW"T"\fR than \f(CW\*(C`B\*(Aq\*(C'\fR.
+.ie n .IP """S|T""" 4
+.el .IP \f(CWS|T\fR 4
+.IX Item "S|T"
+When \f(CW"S"\fR can match, it is a better match than when only \f(CW"T"\fR can match.
+.Sp
+The ordering of two matches for \f(CW"S"\fR is the same as it is for \f(CW"S"\fR alone; the
+same holds for two matches for \f(CW"T"\fR.
+.ie n .IP """S{REPEAT_COUNT}""" 4
+.el .IP \f(CWS{REPEAT_COUNT}\fR 4
+.IX Item "S{REPEAT_COUNT}"
+Matches as \f(CW\*(C`SSS...S\*(C'\fR (repeated as many times as necessary).
+.ie n .IP """S{min,max}""" 4
+.el .IP \f(CWS{min,max}\fR 4
+.IX Item "S{min,max}"
+Matches as \f(CW\*(C`S{max}|S{max\-1}|...|S{min+1}|S{min}\*(C'\fR.
+.ie n .IP """S{min,max}?""" 4
+.el .IP \f(CWS{min,max}?\fR 4
+.IX Item "S{min,max}?"
+Matches as \f(CW\*(C`S{min}|S{min+1}|...|S{max\-1}|S{max}\*(C'\fR.
+.ie n .IP """S?"", ""S*"", ""S+""" 4
+.el .IP "\f(CWS?\fR, \f(CWS*\fR, \f(CWS+\fR" 4
+.IX Item "S?, S*, S+"
+Same as \f(CW\*(C`S{0,1}\*(C'\fR, \f(CW\*(C`S{0,BIG_NUMBER}\*(C'\fR, \f(CW\*(C`S{1,BIG_NUMBER}\*(C'\fR respectively.
+.ie n .IP """S??"", ""S*?"", ""S+?""" 4
+.el .IP "\f(CWS??\fR, \f(CWS*?\fR, \f(CWS+?\fR" 4
+.IX Item "S??, S*?, S+?"
+Same as \f(CW\*(C`S{0,1}?\*(C'\fR, \f(CW\*(C`S{0,BIG_NUMBER}?\*(C'\fR, \f(CW\*(C`S{1,BIG_NUMBER}?\*(C'\fR respectively.
+.ie n .IP """(?>S)""" 4
+.el .IP \f(CW(?>S)\fR 4
+.IX Item "(?>S)"
+Matches the best match for \f(CW"S"\fR and only that.
+.ie n .IP """(?=S)"", ""(?<=S)""" 4
+.el .IP "\f(CW(?=S)\fR, \f(CW(?<=S)\fR" 4
+.IX Item "(?=S), (?<=S)"
+Only the best match for \f(CW"S"\fR is considered.  (This is important only if
+\&\f(CW"S"\fR has capturing parentheses, and backreferences are used somewhere
+else in the whole regular expression.)
+.ie n .IP """(?!S)"", ""(?<!S)""" 4
+.el .IP "\f(CW(?!S)\fR, \f(CW(?<!S)\fR" 4
+.IX Item "(?!S), (?<!S)"
+For this grouping operator there is no need to describe the ordering, since
+only whether or not \f(CW"S"\fR can match is important.
+.ie n .IP """(??{ \fIEXPR\fR })"", ""(?\fIPARNO\fR)""" 4
+.el .IP "\f(CW(??{ \fR\f(CIEXPR\fR\f(CW })\fR, \f(CW(?\fR\f(CIPARNO\fR\f(CW)\fR" 4
+.IX Item "(??{ EXPR }), (?PARNO)"
+The ordering is the same as for the regular expression which is
+the result of \fIEXPR\fR, or the pattern contained by capture group \fIPARNO\fR.
+.ie n .IP """(?(\fIcondition\fR)\fIyes\-pattern\fR|\fIno\-pattern\fR)""" 4
+.el .IP \f(CW(?(\fR\f(CIcondition\fR\f(CW)\fR\f(CIyes\-pattern\fR\f(CW|\fR\f(CIno\-pattern\fR\f(CW)\fR 4
+.IX Item "(?(condition)yes-pattern|no-pattern)"
+Recall that which of \fIyes-pattern\fR or \fIno-pattern\fR actually matches is
+already determined.  The ordering of the matches is the same as for the
+chosen subexpression.
+.PP
+The above recipes describe the ordering of matches \fIat a given position\fR.
+One more rule is needed to understand how a match is determined for the
+whole regular expression: a match at an earlier position is always better
+than a match at a later position.
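+.PP
+The following sketch illustrates a few of these rules; the helper
+\f(CW\*(C`show\*(C'\fR is hypothetical and exists only for this example:
+.PP
+.Vb 12
+\&    # Return the text matched by $re in $s (undef if there is no match).
+\&    sub show {
+\&        my ($s, $re) = @_;
+\&        my ($m) = $s =~ /($re)/;
+\&        return $m;
+\&    }
+\&
+\&    print show(\*(Aqabc\*(Aq, qr/a|ab/),    "\en";  # a   (S beats T in S|T)
+\&    print show(\*(Aqabc\*(Aq, qr/ab|a/),    "\en";  # ab
+\&    print show(\*(Aqaaa\*(Aq, qr/a{1,2}/),  "\en";  # aa  (S{max} tried first)
+\&    print show(\*(Aqaaa\*(Aq, qr/a{1,2}?/), "\en";  # a   (S{min} tried first)
+\&    print show(\*(Aqxab\*(Aq, qr/b|ab/),    "\en";  # ab  (earlier position wins)
+.Ve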
+.SS "Creating Custom RE Engines"
+.IX Subsection "Creating Custom RE Engines"
+As of Perl 5.10.0, one can create custom regular expression engines.  This
+is not for the faint of heart, as they have to plug in at the C level.  See
+perlreapi for more details.
+.PP
+As an alternative, overloaded constants (see overload) provide a simple
+way to extend the functionality of the RE engine, by substituting one
+pattern for another.
+.PP
+Suppose that we want to enable a new RE escape-sequence \f(CW\*(C`\eY|\*(C'\fR which
+matches at a boundary between whitespace characters and non-whitespace
+characters.  Note that \f(CW\*(C`(?=\eS)(?<!\eS)|(?!\eS)(?<=\eS)\*(C'\fR matches exactly
+at these positions, so we want each \f(CW\*(C`\eY|\*(C'\fR to stand in for the
+more complicated version.  We can create a module \f(CW\*(C`customre\*(C'\fR to do
+this:
+.PP
+.Vb 2
+\&    package customre;
+\&    use overload;
+\&
+\&    sub import {
+\&      shift;
+\&      die "No argument to customre::import allowed" if @_;
+\&      overload::constant \*(Aqqr\*(Aq => \e&convert;
+\&    }
+\&
+\&    sub invalid { die "/$_[0]/: invalid escape \*(Aq\e\e$_[1]\*(Aq"}
+\&
+\&    # We must also take care of not escaping the legitimate \e\eY|
+\&    # sequence, hence the presence of \*(Aq\e\e\*(Aq in the conversion rules.
+\&    my %rules = ( \*(Aq\e\e\*(Aq => \*(Aq\e\e\e\e\*(Aq,
+\&                  \*(AqY|\*(Aq => qr/(?=\eS)(?<!\eS)|(?!\eS)(?<=\eS)/ );
+\&    sub convert {
+\&      my $re = shift;
+\&      $re =~ s{
+\&                \e\e ( \e\e | Y . )
+\&              }
+\&              { $rules{$1} or invalid($re,$1) }sgex;
+\&      return $re;
+\&    }
+.Ve
+.PP
+Now \f(CW\*(C`use customre\*(C'\fR enables the new escape in constant regular
+expressions, \fIi.e.\fR, those without any runtime variable interpolations.
+As documented in overload, this conversion will work only over
+literal parts of regular expressions.  For \f(CW\*(C`\eY|$re\eY|\*(C'\fR the variable
+part of this regular expression needs to be converted explicitly
+(but only if the special meaning of \f(CW\*(C`\eY|\*(C'\fR should be enabled inside \f(CW$re\fR):
+.PP
+.Vb 5
+\&    use customre;
+\&    $re = <>;
+\&    chomp $re;
+\&    $re = customre::convert $re;
+\&    /\eY|$re\eY|/;
+.Ve
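+.PP
+The claim that the lookaround pattern matches exactly at the boundaries
+between whitespace and non-whitespace can be checked without the module.
+The following is only a sketch: the sample string and the names \f(CW$s\fR and
+\&\f(CW@at\fR are assumptions introduced for illustration.  It collects every
+offset at which the pattern matches:
+.PP
+.Vb 5
+\&    my $s = "ab cd";
+\&    my @at;
+\&    # Each match is empty, so pos() reports the boundary offset itself.
+\&    push @at, pos($s) while $s =~ /(?=\eS)(?<!\eS)|(?!\eS)(?<=\eS)/g;
+\&    print "@at\en";    # prints "0 2 3 5"
+.Ve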
+.SS "Embedded Code Execution Frequency"
+.IX Subsection "Embedded Code Execution Frequency"
+The exact rules for how often \f(CW\*(C`(?{})\*(C'\fR and \f(CW\*(C`(??{})\*(C'\fR are executed in a pattern
+are unspecified, and this is even more true of \f(CW\*(C`(*{})\*(C'\fR.
+In the case of a successful match you can assume that they DWIM and
+will be executed in left to right order the appropriate number of times in the
+accepting path of the pattern, as would any other meta-pattern.  How
+non\-accepting pathways and match failures affect the number of times a code
+block is executed is deliberately left unspecified: it may vary depending on
+which optimizations can be applied to the pattern, and it is likely to change
+from version to version.
+.PP
+For instance in
+.PP
+.Vb 1
+\&  "aaabcdeeeee"=~/a(?{print "a"})b(?{print "b"})cde/;
+.Ve
+.PP
+the exact number of times "a" or "b" is printed is unspecified when the
+match fails, but you may assume that each will be printed at least once during
+a successful match.  Additionally, you may assume that if "b" is printed,
+it will be preceded by at least one "a".
+.PP
+In the case of branching constructs like the following:
+.PP
+.Vb 1
+\&  /a(b|(?{ print "a" }))c(?{ print "c" })/;
+.Ve
+.PP
+you can assume that the input "ac" will output "ac", and that "abc"
+will output only "c".
+.PP
+When embedded code is quantified, successful matches will call the
+code once for each matched iteration of the quantifier.  For
+example:
+.PP
+.Vb 1
+\&  "good" =~ /g(?:o(?{print "o"}))*d/;
+.Ve
+.PP
+will output "o" twice.
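+.PP
+To inspect these guarantees without relying on printed output, the code
+blocks can append to a variable instead.  The following is a sketch only: the
+package variable \f(CW$log\fR is a hypothetical name introduced for
+illustration, and only successful matches are exercised, since the behaviour
+on failure is unspecified:
+.PP
+.Vb 11
+\&    our $log = "";
+\&    "ac" =~ /a(b|(?{ $log .= "a" }))c(?{ $log .= "c" })/ or die "no match";
+\&    print "$log\en";    # prints "ac"
+\&
+\&    $log = "";
+\&    "abc" =~ /a(b|(?{ $log .= "a" }))c(?{ $log .= "c" })/ or die "no match";
+\&    print "$log\en";    # prints "c"
+\&
+\&    $log = "";
+\&    "good" =~ /g(?:o(?{ $log .= "o" }))*d/ or die "no match";
+\&    print "$log\en";    # prints "oo"
+.Ve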
+.PP
+For historical and consistency reasons the use of normal code blocks
+anywhere in a pattern will disable certain optimisations. As of 5.37.7
+you can use an "optimistic" codeblock, \f(CW\*(C`(*{ ... })\*(C'\fR as a replacement
+for \f(CW\*(C`(?{ ... })\*(C'\fR, if you do \fInot\fR wish to disable these optimisations.
+This may result in the code block being called less often than it would
+have been had it not been optimistic.
+.SS "PCRE/Python Support"
+.IX Subsection "PCRE/Python Support"
+As of Perl 5.10.0, Perl supports several Python/PCRE\-specific extensions
+to the regex syntax. While Perl programmers are encouraged to use the
+Perl-specific syntax, the following are also accepted:
+.ie n .IP """(?P<\fINAME\fR>\fIpattern\fR)""" 4
+.el .IP \f(CW(?P<\fR\f(CINAME\fR\f(CW>\fR\f(CIpattern\fR\f(CW)\fR 4
+.IX Item "(?P<NAME>pattern)"
+Define a named capture group. Equivalent to \f(CW\*(C`(?<\fR\f(CINAME\fR\f(CW>\fR\f(CIpattern\fR\f(CW)\*(C'\fR.
+.ie n .IP """(?P=\fINAME\fR)""" 4
+.el .IP \f(CW(?P=\fR\f(CINAME\fR\f(CW)\fR 4
+.IX Item "(?P=NAME)"
+Backreference to a named capture group. Equivalent to \f(CW\*(C`\eg{\fR\f(CINAME\fR\f(CW}\*(C'\fR.
+.ie n .IP """(?P>\fINAME\fR)""" 4
+.el .IP \f(CW(?P>\fR\f(CINAME\fR\f(CW)\fR 4
+.IX Item "(?P>NAME)"
+Subroutine call to a named capture group. Equivalent to \f(CW\*(C`(?&\fR\f(CINAME\fR\f(CW)\*(C'\fR.
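+.PP
+As a brief illustration of the three forms, the following sketch may help.
+The group names \f(CW\*(C`year\*(C'\fR and \f(CW\*(C`paren\*(C'\fR and the sample strings are
+assumptions chosen for this example:
+.PP
+.Vb 7
+\&    # (?P<year>...) defines the group; (?P=year) must match the same text again.
+\&    "2024 2024" =~ /(?P<year>\ed{4}) (?P=year)/ or die "no match";
+\&    print "$+{year}\en";     # prints "2024"
+\&
+\&    # (?P>paren) recurses into the group named paren to match nested parentheses.
+\&    "f(a(b)c)" =~ /(?P<paren>\e((?:[^()]++|(?P>paren))*\e))/ or die "no match";
+\&    print "$+{paren}\en";    # prints "(a(b)c)"
+.Ve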
+.SH BUGS
+.IX Header "BUGS"
+There are a number of issues with regard to case-insensitive matching
+in Unicode rules.  See \f(CW"i"\fR under "Modifiers" above.
+.PP
+This document varies from difficult to understand to completely
+and utterly opaque.  The wandering prose riddled with jargon is
+hard to fathom in several places.
+.PP
+This document needs a rewrite that separates the tutorial content
+from the reference content.
+.SH "SEE ALSO"
+.IX Header "SEE ALSO"
+The syntax of patterns used in Perl pattern matching evolved from those
+supplied in the Bell Labs Research Unix 8th Edition (Version 8) regex
+routines.  (The code is actually derived (distantly) from Henry
+Spencer's freely redistributable reimplementation of those V8 routines.)
+.PP
+perlrequick.
+.PP
+perlretut.
+.PP
+"Regexp Quote-Like Operators" in perlop.
+.PP
+"Gory details of parsing quoted constructs" in perlop.
+.PP
+perlfaq6.
+.PP
+"pos" in perlfunc.
+.PP
+perllocale.
+.PP
+perlebcdic.
+.PP
+\&\fIMastering Regular Expressions\fR by Jeffrey Friedl, published
+by O'Reilly and Associates.