1 files changed, 1248 insertions, 0 deletions
diff --git a/doc/src/sgml/html/functions-matching.html b/doc/src/sgml/html/functions-matching.html
new file mode 100644
index 0000000..2d0c524
--- /dev/null
+++ b/doc/src/sgml/html/functions-matching.html
@@ -0,0 +1,1248 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>9.7. Pattern Matching</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets V1.79.1" /><link rel="prev" href="functions-bitstring.html" title="9.6. Bit String Functions and Operators" /><link rel="next" href="functions-formatting.html" title="9.8. Data Type Formatting Functions" /></head><body id="docContent" class="container-fluid col-10"><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">9.7. Pattern Matching</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="functions-bitstring.html" title="9.6. Bit String Functions and Operators">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="functions.html" title="Chapter 9. Functions and Operators">Up</a></td><th width="60%" align="center">Chapter 9. Functions and Operators</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 13.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="functions-formatting.html" title="9.8. Data Type Formatting Functions">Next</a></td></tr></table><hr></hr></div><div class="sect1" id="FUNCTIONS-MATCHING"><div class="titlepage"><div><div><h2 class="title" style="clear: both">9.7. Pattern Matching</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="functions-matching.html#FUNCTIONS-LIKE">9.7.1. <code class="function">LIKE</code></a></span></dt><dt><span class="sect2"><a href="functions-matching.html#FUNCTIONS-SIMILARTO-REGEXP">9.7.2. <code class="function">SIMILAR TO</code> Regular Expressions</a></span></dt><dt><span class="sect2"><a href="functions-matching.html#FUNCTIONS-POSIX-REGEXP">9.7.3. <acronym class="acronym">POSIX</acronym> Regular Expressions</a></span></dt></dl></div><a id="id-1.5.8.13.2" class="indexterm"></a><p>
+    There are three separate approaches to pattern matching provided
+    by <span class="productname">PostgreSQL</span>: the traditional
+    <acronym class="acronym">SQL</acronym> <code class="function">LIKE</code> operator, the
+    more recent <code class="function">SIMILAR TO</code> operator (added in
+    SQL:1999), and <acronym class="acronym">POSIX</acronym>-style regular
+    expressions.  Aside from the basic <span class="quote">“<span class="quote">does this string match
+    this pattern?</span>”</span> operators, functions are available to extract
+    or replace matching substrings and to split a string at matching
+    locations.
+   </p><div class="tip"><h3 class="title">Tip</h3><p>
+     If you have pattern matching needs that go beyond this,
+     consider writing a user-defined function in Perl or Tcl.
+    </p></div><div class="caution"><h3 class="title">Caution</h3><p>
+     While most regular-expression searches can be executed very quickly,
+     regular expressions can be contrived that take arbitrary amounts of
+     time and memory to process.  Be wary of accepting regular-expression
+     search patterns from hostile sources.  If you must do so, it is
+     advisable to impose a statement timeout.
+    </p><p>
+     Searches using <code class="function">SIMILAR TO</code> patterns have the same
+     security hazards, since <code class="function">SIMILAR TO</code> provides many
+     of the same capabilities as <acronym class="acronym">POSIX</acronym>-style regular
+     expressions.
+    </p><p>
+     <code class="function">LIKE</code> searches, being much simpler than the other
+     two options, are safer to use with possibly-hostile pattern sources.
+    </p></div><p>
+    The pattern matching operators of all three kinds do not support
+    nondeterministic collations.  If required, apply a different collation to
+    the expression to work around this limitation.
+   </p><div class="sect2" id="FUNCTIONS-LIKE"><div class="titlepage"><div><div><h3 class="title">9.7.1. <code class="function">LIKE</code></h3></div></div></div><a id="id-1.5.8.13.7.2" class="indexterm"></a><pre class="synopsis">
+<em class="replaceable"><code>string</code></em> LIKE <em class="replaceable"><code>pattern</code></em> [<span class="optional">ESCAPE <em class="replaceable"><code>escape-character</code></em></span>]
+<em class="replaceable"><code>string</code></em> NOT LIKE <em class="replaceable"><code>pattern</code></em> [<span class="optional">ESCAPE <em class="replaceable"><code>escape-character</code></em></span>]
+</pre><p>
+     The <code class="function">LIKE</code> expression returns true if the
+     <em class="replaceable"><code>string</code></em> matches the supplied
+     <em class="replaceable"><code>pattern</code></em>.  (As
+     expected, the <code class="function">NOT LIKE</code> expression returns
+     false if <code class="function">LIKE</code> returns true, and vice versa.
+     An equivalent expression is
+     <code class="literal">NOT (<em class="replaceable"><code>string</code></em> LIKE
+      <em class="replaceable"><code>pattern</code></em>)</code>.)
+    </p><p>
+     If <em class="replaceable"><code>pattern</code></em> does not contain percent
+     signs or underscores, then the pattern only represents the string
+     itself; in that case <code class="function">LIKE</code> acts like the
+     equals operator.  An underscore (<code class="literal">_</code>) in
+     <em class="replaceable"><code>pattern</code></em> stands for (matches) any single
+     character; a percent sign (<code class="literal">%</code>) matches any sequence
+     of zero or more characters.
+    </p><p>
+    Some examples:
+</p><pre class="programlisting">
+'abc' LIKE 'abc'    <em class="lineannotation"><span class="lineannotation">true</span></em>
+'abc' LIKE 'a%'     <em class="lineannotation"><span class="lineannotation">true</span></em>
+'abc' LIKE '_b_'    <em class="lineannotation"><span class="lineannotation">true</span></em>
+'abc' LIKE 'c'      <em class="lineannotation"><span class="lineannotation">false</span></em>
+</pre><p>
+   </p><p>
+    <code class="function">LIKE</code> pattern matching always covers the entire
+    string.  Therefore, if it's desired to match a sequence anywhere within
+    a string, the pattern must start and end with a percent sign.
+   </p><p>
+    To match a literal underscore or percent sign without matching
+    other characters, the respective character in
+    <em class="replaceable"><code>pattern</code></em> must be
+    preceded by the escape character.  The default escape
+    character is the backslash but a different one can be selected by
+    using the <code class="literal">ESCAPE</code> clause.  To match the escape
+    character itself, write two escape characters.
+   </p><div class="note"><h3 class="title">Note</h3><p>
+     If you have <a class="xref" href="runtime-config-compatible.html#GUC-STANDARD-CONFORMING-STRINGS">standard_conforming_strings</a> turned off,
+     any backslashes you write in literal string constants will need to be
+     doubled.  See <a class="xref" href="sql-syntax-lexical.html#SQL-SYNTAX-STRINGS" title="4.1.2.1. String Constants">Section 4.1.2.1</a> for more information.
+    </p></div><p>
+    It's also possible to select no escape character by writing
+    <code class="literal">ESCAPE ''</code>.  This effectively disables the
+    escape mechanism, which makes it impossible to turn off the
+    special meaning of underscore and percent signs in the pattern.
+   </p><p>
+    According to the SQL standard, omitting <code class="literal">ESCAPE</code>
+    means there is no escape character (rather than defaulting to a
+    backslash), and a zero-length <code class="literal">ESCAPE</code> value is
+    disallowed.  <span class="productname">PostgreSQL</span>'s behavior in
+    this regard is therefore slightly nonstandard.
+   </p><p>
+    The key word <code class="token">ILIKE</code> can be used instead of
+    <code class="token">LIKE</code> to make the match case-insensitive according
+    to the active locale.  This is not in the <acronym class="acronym">SQL</acronym> standard but is a
+    <span class="productname">PostgreSQL</span> extension.
+   </p><p>
+    The operator <code class="literal">~~</code> is equivalent to
+    <code class="function">LIKE</code>, and <code class="literal">~~*</code> corresponds to
+    <code class="function">ILIKE</code>.  There are also
+    <code class="literal">!~~</code> and <code class="literal">!~~*</code> operators that
+    represent <code class="function">NOT LIKE</code> and <code class="function">NOT
+    ILIKE</code>, respectively.  All of these operators are
+    <span class="productname">PostgreSQL</span>-specific.  You may see these
+    operator names in <code class="command">EXPLAIN</code> output and similar
+    places, since the parser actually translates <code class="function">LIKE</code>
+    et al. to these operators.
+   </p><p>
+    The phrases <code class="function">LIKE</code>, <code class="function">ILIKE</code>,
+    <code class="function">NOT LIKE</code>, and <code class="function">NOT ILIKE</code> are
+    generally treated as operators
+    in <span class="productname">PostgreSQL</span> syntax; for example they can
+    be used in <em class="replaceable"><code>expression</code></em>
+    <em class="replaceable"><code>operator</code></em> ANY
+    (<em class="replaceable"><code>subquery</code></em>) constructs, although
+    an <code class="literal">ESCAPE</code> clause cannot be included there.  In some
+    obscure cases it may be necessary to use the underlying operator names
+    instead.
+   </p><p>
+    Also see the prefix operator <code class="literal">^@</code> and corresponding
+    <code class="function">starts_with</code> function, which are useful in cases
+    where simply matching the beginning of a string is needed.
+   </p></div><div class="sect2" id="FUNCTIONS-SIMILARTO-REGEXP"><div class="titlepage"><div><div><h3 class="title">9.7.2. <code class="function">SIMILAR TO</code> Regular Expressions</h3></div></div></div><a id="id-1.5.8.13.8.2" class="indexterm"></a><a id="id-1.5.8.13.8.3" class="indexterm"></a><a id="id-1.5.8.13.8.4" class="indexterm"></a><pre class="synopsis">
+<em class="replaceable"><code>string</code></em> SIMILAR TO <em class="replaceable"><code>pattern</code></em> [<span class="optional">ESCAPE <em class="replaceable"><code>escape-character</code></em></span>]
+<em class="replaceable"><code>string</code></em> NOT SIMILAR TO <em class="replaceable"><code>pattern</code></em> [<span class="optional">ESCAPE <em class="replaceable"><code>escape-character</code></em></span>]
+</pre><p>
+    The <code class="function">SIMILAR TO</code> operator returns true or
+    false depending on whether its pattern matches the given string.
+    It is similar to <code class="function">LIKE</code>, except that it
+    interprets the pattern using the SQL standard's definition of a
+    regular expression.  SQL regular expressions are a curious cross
+    between <code class="function">LIKE</code> notation and common (POSIX) regular
+    expression notation.
+   </p><p>
+    Like <code class="function">LIKE</code>, the <code class="function">SIMILAR TO</code>
+    operator succeeds only if its pattern matches the entire string;
+    this is unlike common regular expression behavior where the pattern
+    can match any part of the string.
+    Also like
+    <code class="function">LIKE</code>, <code class="function">SIMILAR TO</code> uses
+    <code class="literal">_</code> and <code class="literal">%</code> as wildcard characters denoting
+    any single character and any string, respectively (these are
+    comparable to <code class="literal">.</code> and <code class="literal">.*</code> in POSIX regular
+    expressions).
+   </p><p>
+    In addition to these facilities borrowed from <code class="function">LIKE</code>,
+    <code class="function">SIMILAR TO</code> supports these pattern-matching
+    metacharacters borrowed from POSIX regular expressions:
+
+   </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
+      <code class="literal">|</code> denotes alternation (either of two alternatives).
+     </p></li><li class="listitem"><p>
+      <code class="literal">*</code> denotes repetition of the previous item zero
+      or more times.
+     </p></li><li class="listitem"><p>
+      <code class="literal">+</code> denotes repetition of the previous item one
+      or more times.
+     </p></li><li class="listitem"><p>
+      <code class="literal">?</code> denotes repetition of the previous item zero
+      or one time.
+     </p></li><li class="listitem"><p>
+      <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">}</code> denotes repetition
+      of the previous item exactly <em class="replaceable"><code>m</code></em> times.
+     </p></li><li class="listitem"><p>
+      <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">,}</code> denotes repetition
+      of the previous item <em class="replaceable"><code>m</code></em> or more times.
+     </p></li><li class="listitem"><p>
+      <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">,</code><em class="replaceable"><code>n</code></em><code class="literal">}</code>
+      denotes repetition of the previous item at least <em class="replaceable"><code>m</code></em> and
+      not more than <em class="replaceable"><code>n</code></em> times.
+     </p></li><li class="listitem"><p>
+      Parentheses <code class="literal">()</code> can be used to group items into
+      a single logical item.
+     </p></li><li class="listitem"><p>
+      A bracket expression <code class="literal">[...]</code> specifies a character
+      class, just as in POSIX regular expressions.
+     </p></li></ul></div><p>
+
+    Notice that the period (<code class="literal">.</code>) is not a metacharacter
+    for <code class="function">SIMILAR TO</code>.
+   </p><p>
+    As with <code class="function">LIKE</code>, a backslash disables the special
+    meaning of any of these metacharacters.  A different escape character
+    can be specified with <code class="literal">ESCAPE</code>, or the escape
+    capability can be disabled by writing <code class="literal">ESCAPE ''</code>.
+   </p><p>
+    According to the SQL standard, omitting <code class="literal">ESCAPE</code>
+    means there is no escape character (rather than defaulting to a
+    backslash), and a zero-length <code class="literal">ESCAPE</code> value is
+    disallowed.  <span class="productname">PostgreSQL</span>'s behavior in
+    this regard is therefore slightly nonstandard.
+   </p><p>
+    Another nonstandard extension is that following the escape character
+    with a letter or digit provides access to the escape sequences
+    defined for POSIX regular expressions; see
+    <a class="xref" href="functions-matching.html#POSIX-CHARACTER-ENTRY-ESCAPES-TABLE" title="Table 9.20. Regular Expression Character-Entry Escapes">Table 9.20</a>,
+    <a class="xref" href="functions-matching.html#POSIX-CLASS-SHORTHAND-ESCAPES-TABLE" title="Table 9.21. Regular Expression Class-Shorthand Escapes">Table 9.21</a>, and
+    <a class="xref" href="functions-matching.html#POSIX-CONSTRAINT-ESCAPES-TABLE" title="Table 9.22. Regular Expression Constraint Escapes">Table 9.22</a> below.
+   </p><p>
+    Some examples:
+</p><pre class="programlisting">
+'abc' SIMILAR TO 'abc'          <em class="lineannotation"><span class="lineannotation">true</span></em>
+'abc' SIMILAR TO 'a'            <em class="lineannotation"><span class="lineannotation">false</span></em>
+'abc' SIMILAR TO '%(b|d)%'      <em class="lineannotation"><span class="lineannotation">true</span></em>
+'abc' SIMILAR TO '(b|c)%'       <em class="lineannotation"><span class="lineannotation">false</span></em>
+'-abc-' SIMILAR TO '%\mabc\M%'  <em class="lineannotation"><span class="lineannotation">true</span></em>
+'xabcy' SIMILAR TO '%\mabc\M%'  <em class="lineannotation"><span class="lineannotation">false</span></em>
+</pre><p>
+   </p><p>
+    The <code class="function">substring</code> function with three parameters
+    provides extraction of a substring that matches an SQL
+    regular expression pattern.  The function can be written according
+    to SQL99 syntax:
+</p><pre class="synopsis">
+substring(<em class="replaceable"><code>string</code></em> from <em class="replaceable"><code>pattern</code></em> for <em class="replaceable"><code>escape-character</code></em>)
+</pre><p>
+    or as a plain three-argument function:
+</p><pre class="synopsis">
+substring(<em class="replaceable"><code>string</code></em>, <em class="replaceable"><code>pattern</code></em>, <em class="replaceable"><code>escape-character</code></em>)
+</pre><p>
+    As with <code class="literal">SIMILAR TO</code>, the
+    specified pattern must match the entire data string, or else the
+    function fails and returns null.  To indicate the part of the
+    pattern for which the matching data sub-string is of interest,
+    the pattern should contain
+    two occurrences of the escape character followed by a double quote
+    (<code class="literal">"</code>). 
+    The text matching the portion of the pattern
+    between these separators is returned when the match is successful.
+   </p><p>
+    The escape-double-quote separators actually
+    divide <code class="function">substring</code>'s pattern into three independent
+    regular expressions; for example, a vertical bar (<code class="literal">|</code>)
+    in any of the three sections affects only that section.  Also, the first
+    and third of these regular expressions are defined to match the smallest
+    possible amount of text, not the largest, when there is any ambiguity
+    about how much of the data string matches which pattern.  (In POSIX
+    parlance, the first and third regular expressions are forced to be
+    non-greedy.)
+   </p><p>
+    As an extension to the SQL standard, <span class="productname">PostgreSQL</span>
+    allows there to be just one escape-double-quote separator, in which case
+    the third regular expression is taken as empty; or no separators, in which
+    case the first and third regular expressions are taken as empty.
+   </p><p>
+    Some examples, with <code class="literal">#"</code> delimiting the return string:
+</p><pre class="programlisting">
+substring('foobar' from '%#"o_b#"%' for '#')   <em class="lineannotation"><span class="lineannotation">oob</span></em>
+substring('foobar' from '#"o_b#"%' for '#')    <em class="lineannotation"><span class="lineannotation">NULL</span></em>
+</pre><p>
+   </p></div><div class="sect2" id="FUNCTIONS-POSIX-REGEXP"><div class="titlepage"><div><div><h3 class="title">9.7.3. <acronym class="acronym">POSIX</acronym> Regular Expressions</h3></div></div></div><a id="id-1.5.8.13.9.2" class="indexterm"></a><a id="id-1.5.8.13.9.3" class="indexterm"></a><a id="id-1.5.8.13.9.4" class="indexterm"></a><a id="id-1.5.8.13.9.5" class="indexterm"></a><a id="id-1.5.8.13.9.6" class="indexterm"></a><a id="id-1.5.8.13.9.7" class="indexterm"></a><a id="id-1.5.8.13.9.8" class="indexterm"></a><p>
+    <a class="xref" href="functions-matching.html#FUNCTIONS-POSIX-TABLE" title="Table 9.16. Regular Expression Match Operators">Table 9.16</a> lists the available
+    operators for pattern matching using POSIX regular expressions.
+   </p><div class="table" id="FUNCTIONS-POSIX-TABLE"><p class="title"><strong>Table 9.16. Regular Expression Match Operators</strong></p><div class="table-contents"><table class="table" summary="Regular Expression Match Operators" border="1"><colgroup><col /></colgroup><thead><tr><th class="func_table_entry"><p class="func_signature">
+        Operator
+       </p>
+       <p>
+        Description
+       </p>
+       <p>
+        Example(s)
+       </p></th></tr></thead><tbody><tr><td class="func_table_entry"><p class="func_signature">
+        <code class="type">text</code> <code class="literal">~</code> <code class="type">text</code>
+        → <code class="returnvalue">boolean</code>
+       </p>
+       <p>
+        String matches regular expression, case sensitively
+       </p>
+       <p>
+        <code class="literal">'thomas' ~ 't.*ma'</code>
+        → <code class="returnvalue">t</code>
+       </p></td></tr><tr><td class="func_table_entry"><p class="func_signature">
+        <code class="type">text</code> <code class="literal">~*</code> <code class="type">text</code>
+        → <code class="returnvalue">boolean</code>
+       </p>
+       <p>
+        String matches regular expression, case insensitively
+       </p>
+       <p>
+        <code class="literal">'thomas' ~* 'T.*ma'</code>
+        → <code class="returnvalue">t</code>
+       </p></td></tr><tr><td class="func_table_entry"><p class="func_signature">
+        <code class="type">text</code> <code class="literal">!~</code> <code class="type">text</code>
+        → <code class="returnvalue">boolean</code>
+       </p>
+       <p>
+        String does not match regular expression, case sensitively
+       </p>
+       <p>
+        <code class="literal">'thomas' !~ 't.*max'</code>
+        → <code class="returnvalue">t</code>
+       </p></td></tr><tr><td class="func_table_entry"><p class="func_signature">
+        <code class="type">text</code> <code class="literal">!~*</code> <code class="type">text</code>
+        → <code class="returnvalue">boolean</code>
+       </p>
+       <p>
+        String does not match regular expression, case insensitively
+       </p>
+       <p>
+        <code class="literal">'thomas' !~* 'T.*ma'</code>
+        → <code class="returnvalue">f</code>
+       </p></td></tr></tbody></table></div></div><br class="table-break" /><p>
+     <acronym class="acronym">POSIX</acronym> regular expressions provide a more
+     powerful means for pattern matching than the <code class="function">LIKE</code> and
+     <code class="function">SIMILAR TO</code> operators.
+     Many Unix tools such as <code class="command">egrep</code>,
+     <code class="command">sed</code>, or <code class="command">awk</code> use a pattern
+     matching language that is similar to the one described here.
+    </p><p>
+     A regular expression is a character sequence that is an
+     abbreviated definition of a set of strings (a <em class="firstterm">regular
+     set</em>).  A string is said to match a regular expression
+     if it is a member of the regular set described by the regular
+     expression.  As with <code class="function">LIKE</code>, pattern characters
+     match string characters exactly unless they are special characters
+     in the regular expression language — but regular expressions use
+     different special characters than <code class="function">LIKE</code> does.
+     Unlike <code class="function">LIKE</code> patterns, a
+     regular expression is allowed to match anywhere within a string, unless
+     the regular expression is explicitly anchored to the beginning or
+     end of the string.
+    </p><p>
+     Some examples:
+</p><pre class="programlisting">
+'abcd' ~ 'bc'     <em class="lineannotation"><span class="lineannotation">true</span></em>
+'abcd' ~ 'a.c'    <em class="lineannotation"><span class="lineannotation">true — dot matches any character</span></em>
+'abcd' ~ 'a.*d'   <em class="lineannotation"><span class="lineannotation">true — <code class="literal">*</code> repeats the preceding pattern item</span></em>
+'abcd' ~ '(b|x)'  <em class="lineannotation"><span class="lineannotation">true — <code class="literal">|</code> means OR, parentheses group</span></em>
+'abcd' ~ '^a'     <em class="lineannotation"><span class="lineannotation">true — <code class="literal">^</code> anchors to start of string</span></em>
+'abcd' ~ '^(b|c)' <em class="lineannotation"><span class="lineannotation">false — would match except for anchoring</span></em>
+</pre><p>
+    </p><p>
+     The <acronym class="acronym">POSIX</acronym> pattern language is described in much
+     greater detail below.
+    </p><p>
+     The <code class="function">substring</code> function with two parameters,
+     <code class="function">substring(<em class="replaceable"><code>string</code></em> from
+     <em class="replaceable"><code>pattern</code></em>)</code>, provides extraction of a
+     substring
+     that matches a POSIX regular expression pattern.  It returns null if
+     there is no match, otherwise the first portion of the text that matched the
+     pattern.  But if the pattern contains any parentheses, the portion
+     of the text that matched the first parenthesized subexpression (the
+     one whose left parenthesis comes first) is
+     returned.  You can put parentheses around the whole expression
+     if you want to use parentheses within it without triggering this
+     exception.  If you need parentheses in the pattern before the
+     subexpression you want to extract, see the non-capturing parentheses
+     described below.
+    </p><p>
+    Some examples:
+</p><pre class="programlisting">
+substring('foobar' from 'o.b')     <em class="lineannotation"><span class="lineannotation">oob</span></em>
+substring('foobar' from 'o(.)b')   <em class="lineannotation"><span class="lineannotation">o</span></em>
+</pre><p>
+   </p><p>
+     The <code class="function">regexp_replace</code> function provides substitution of
+     new text for substrings that match POSIX regular expression patterns.
+     It has the syntax
+     <code class="function">regexp_replace</code>(<em class="replaceable"><code>source</code></em>,
+     <em class="replaceable"><code>pattern</code></em>, <em class="replaceable"><code>replacement</code></em>
+     [<span class="optional">, <em class="replaceable"><code>flags</code></em> </span>]).
+     The <em class="replaceable"><code>source</code></em> string is returned unchanged if
+     there is no match to the <em class="replaceable"><code>pattern</code></em>.  If there is a
+     match, the <em class="replaceable"><code>source</code></em> string is returned with the
+     <em class="replaceable"><code>replacement</code></em> string substituted for the matching
+     substring.  The <em class="replaceable"><code>replacement</code></em> string can contain
+     <code class="literal">\</code><em class="replaceable"><code>n</code></em>, where <em class="replaceable"><code>n</code></em> is 1
+     through 9, to indicate that the source substring matching the
+     <em class="replaceable"><code>n</code></em>'th parenthesized subexpression of the pattern should be
+     inserted, and it can contain <code class="literal">\&amp;</code> to indicate that the
+     substring matching the entire pattern should be inserted.  Write
+     <code class="literal">\\</code> if you need to put a literal backslash in the replacement
+     text.
+     The <em class="replaceable"><code>flags</code></em> parameter is an optional text
+     string containing zero or more single-letter flags that change the
+     function's behavior.  Flag <code class="literal">i</code> specifies case-insensitive
+     matching, while flag <code class="literal">g</code> specifies replacement of each matching
+     substring rather than only the first one.  Supported flags (though
+     not <code class="literal">g</code>) are
+     described in <a class="xref" href="functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE" title="Table 9.24. ARE Embedded-Option Letters">Table 9.24</a>.
+    </p><p>
+    Some examples:
+</p><pre class="programlisting">
+regexp_replace('foobarbaz', 'b..', 'X')
+                                   <em class="lineannotation"><span class="lineannotation">fooXbaz</span></em>
+regexp_replace('foobarbaz', 'b..', 'X', 'g')
+                                   <em class="lineannotation"><span class="lineannotation">fooXX</span></em>
+regexp_replace('foobarbaz', 'b(..)', 'X\1Y', 'g')
+                                   <em class="lineannotation"><span class="lineannotation">fooXarYXazY</span></em>
+</pre><p>
+   </p><p>
+     The <code class="function">regexp_match</code> function returns a text array of
+     captured substring(s) resulting from the first match of a POSIX
+     regular expression pattern to a string.  It has the syntax
+     <code class="function">regexp_match</code>(<em class="replaceable"><code>string</code></em>,
+     <em class="replaceable"><code>pattern</code></em> [<span class="optional">, <em class="replaceable"><code>flags</code></em> </span>]).
+     If there is no match, the result is <code class="literal">NULL</code>.
+     If a match is found, and the <em class="replaceable"><code>pattern</code></em> contains no
+     parenthesized subexpressions, then the result is a single-element text
+     array containing the substring matching the whole pattern.
+     If a match is found, and the <em class="replaceable"><code>pattern</code></em> contains
+     parenthesized subexpressions, then the result is a text array
+     whose <em class="replaceable"><code>n</code></em>'th element is the substring matching
+     the <em class="replaceable"><code>n</code></em>'th parenthesized subexpression of
+     the <em class="replaceable"><code>pattern</code></em> (not counting <span class="quote">“<span class="quote">non-capturing</span>”</span>
+     parentheses; see below for details).
+     The <em class="replaceable"><code>flags</code></em> parameter is an optional text string
+     containing zero or more single-letter flags that change the function's
+     behavior.  Supported flags are described
+     in <a class="xref" href="functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE" title="Table 9.24. ARE Embedded-Option Letters">Table 9.24</a>.
+    </p><p>
+    Some examples:
+</p><pre class="programlisting">
+SELECT regexp_match('foobarbequebaz', 'bar.*que');
+ regexp_match
+--------------
+ {barbeque}
+(1 row)
+
+SELECT regexp_match('foobarbequebaz', '(bar)(beque)');
+ regexp_match
+--------------
+ {bar,beque}
+(1 row)
+</pre><p>
+    In the common case where you just want the whole matching substring
+    or <code class="literal">NULL</code> for no match, write something like
+</p><pre class="programlisting">
+SELECT (regexp_match('foobarbequebaz', 'bar.*que'))[1];
+ regexp_match
+--------------
+ barbeque
+(1 row)
+</pre><p>
+   </p><p>
+     The <code class="function">regexp_matches</code> function returns a set of text arrays
+     of captured substring(s) resulting from matching a POSIX regular
+     expression pattern to a string.  It has the same syntax as
+     <code class="function">regexp_match</code>.
+     This function returns no rows if there is no match, one row if there is
+     a match and the <code class="literal">g</code> flag is not given, or <em class="replaceable"><code>N</code></em>
+     rows if there are <em class="replaceable"><code>N</code></em> matches and the <code class="literal">g</code> flag
+     is given.  Each returned row is a text array containing the whole
+     matched substring or the substrings matching parenthesized
+     subexpressions of the <em class="replaceable"><code>pattern</code></em>, just as described above
+     for <code class="function">regexp_match</code>.
+     <code class="function">regexp_matches</code> accepts all the flags shown
+     in <a class="xref" href="functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE" title="Table 9.24. ARE Embedded-Option Letters">Table 9.24</a>, plus
+     the <code class="literal">g</code> flag which commands it to return all matches, not
+     just the first one.
+    </p><p>
+    Some examples:
+</p><pre class="programlisting">
+SELECT regexp_matches('foo', 'not there');
+ regexp_matches
+----------------
+(0 rows)
+
+SELECT regexp_matches('foobarbequebazilbarfbonk', '(b[^b]+)(b[^b]+)', 'g');
+ regexp_matches
+----------------
+ {bar,beque}
+ {bazil,barf}
+(2 rows)
+</pre><p>
+   </p><div class="tip"><h3 class="title">Tip</h3><p>
+     In most cases <code class="function">regexp_matches()</code> should be used with
+     the <code class="literal">g</code> flag, since if you only want the first match, it's
+     easier and more efficient to use <code class="function">regexp_match()</code>.
+     However, <code class="function">regexp_match()</code> only exists
+     in <span class="productname">PostgreSQL</span> version 10 and up.  When working in older
+     versions, a common trick is to place a <code class="function">regexp_matches()</code>
+     call in a sub-select, for example:
+</p><pre class="programlisting">
+SELECT col1, (SELECT regexp_matches(col2, '(bar)(beque)')) FROM tab;
+</pre><p>
+     This produces a text array if there's a match, or <code class="literal">NULL</code> if
+     not, the same as <code class="function">regexp_match()</code> would do.  Without the
+     sub-select, this query would produce no output at all for table rows
+     without a match, which is typically not the desired behavior.
+    </p></div><p>
+     The <code class="function">regexp_split_to_table</code> function splits a string using a POSIX
+     regular expression pattern as a delimiter.  It has the syntax
+     <code class="function">regexp_split_to_table</code>(<em class="replaceable"><code>string</code></em>, <em class="replaceable"><code>pattern</code></em>
+     [<span class="optional">, <em class="replaceable"><code>flags</code></em> </span>]).
+     If there is no match to the <em class="replaceable"><code>pattern</code></em>, the function returns the
+     <em class="replaceable"><code>string</code></em>.  If there is at least one match, for each match it returns
+     the text from the end of the last match (or the beginning of the string)
+     to the beginning of the match.  When there are no more matches, it
+     returns the text from the end of the last match to the end of the string.
+     The <em class="replaceable"><code>flags</code></em> parameter is an optional text string containing
+     zero or more single-letter flags that change the function's behavior.
+     <code class="function">regexp_split_to_table</code> supports the flags described in
+     <a class="xref" href="functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE" title="Table 9.24. ARE Embedded-Option Letters">Table 9.24</a>.
+    </p><p>
+     The <code class="function">regexp_split_to_array</code> function behaves the same as
+     <code class="function">regexp_split_to_table</code>, except that <code class="function">regexp_split_to_array</code>
+     returns its result as an array of <code class="type">text</code>.  It has the syntax
+     <code class="function">regexp_split_to_array</code>(<em class="replaceable"><code>string</code></em>, <em class="replaceable"><code>pattern</code></em>
+     [<span class="optional">, <em class="replaceable"><code>flags</code></em> </span>]).
+     The parameters are the same as for <code class="function">regexp_split_to_table</code>.
+    </p><p>
+    Some examples:
+</p><pre class="programlisting">
+
+SELECT foo FROM regexp_split_to_table('the quick brown fox jumps over the lazy dog', '\s+') AS foo;
+  foo   
+-------
+ the    
+ quick  
+ brown  
+ fox    
+ jumps 
+ over   
+ the    
+ lazy   
+ dog    
+(9 rows)
+
+SELECT regexp_split_to_array('the quick brown fox jumps over the lazy dog', '\s+');
+              regexp_split_to_array             
+-----------------------------------------------
+ {the,quick,brown,fox,jumps,over,the,lazy,dog}
+(1 row)
+
+SELECT foo FROM regexp_split_to_table('the quick brown fox', '\s*') AS foo;
+ foo 
+-----
+ t         
+ h         
+ e         
+ q         
+ u         
+ i         
+ c         
+ k         
+ b         
+ r         
+ o         
+ w         
+ n         
+ f         
+ o         
+ x         
+(16 rows)
+</pre><p>
+   </p><p>
+    As the last example demonstrates, the regexp split functions ignore
+    zero-length matches that occur at the start or end of the string
+    or immediately after a previous match.  This is contrary to the strict
+    definition of regexp matching that is implemented by
+    <code class="function">regexp_match</code> and
+    <code class="function">regexp_matches</code>, but is usually the most convenient behavior
+    in practice.  Other software systems such as Perl use similar definitions.
+   </p><div class="sect3" id="POSIX-SYNTAX-DETAILS"><div class="titlepage"><div><div><h4 class="title">9.7.3.1. Regular Expression Details</h4></div></div></div><p>
+    <span class="productname">PostgreSQL</span>'s regular expressions are implemented
+    using a software package written by Henry Spencer.  Much of
+    the description of regular expressions below is copied verbatim from his
+    manual.
+   </p><p>
+    Regular expressions (<acronym class="acronym">RE</acronym>s), as defined in
+    <acronym class="acronym">POSIX</acronym> 1003.2, come in two forms:
+    <em class="firstterm">extended</em> <acronym class="acronym">RE</acronym>s or <acronym class="acronym">ERE</acronym>s
+    (roughly those of <code class="command">egrep</code>), and
+    <em class="firstterm">basic</em> <acronym class="acronym">RE</acronym>s or <acronym class="acronym">BRE</acronym>s
+    (roughly those of <code class="command">ed</code>).
+    <span class="productname">PostgreSQL</span> supports both forms, and
+    also implements some extensions
+    that are not in the POSIX standard, but have become widely used
+    due to their availability in programming languages such as Perl and Tcl.
+    <acronym class="acronym">RE</acronym>s using these non-POSIX extensions are called
+    <em class="firstterm">advanced</em> <acronym class="acronym">RE</acronym>s or <acronym class="acronym">ARE</acronym>s
+    in this documentation.  AREs are almost an exact superset of EREs,
+    but BREs have several notational incompatibilities (as well as being
+    much more limited).
+    We first describe the ARE and ERE forms, noting features that apply
+    only to AREs, and then describe how BREs differ.
+   </p><div class="note"><h3 class="title">Note</h3><p>
+     <span class="productname">PostgreSQL</span> always initially presumes that a regular
+     expression follows the ARE rules.  However, the more limited ERE or
+     BRE rules can be chosen by prepending an <em class="firstterm">embedded option</em>
+     to the RE pattern, as described in <a class="xref" href="functions-matching.html#POSIX-METASYNTAX" title="9.7.3.4. Regular Expression Metasyntax">Section 9.7.3.4</a>.
+     This can be useful for compatibility with applications that expect
+     exactly the <acronym class="acronym">POSIX</acronym> 1003.2 rules.
+    </p></div><p>
+    A regular expression is defined as one or more
+    <em class="firstterm">branches</em>, separated by
+    <code class="literal">|</code>.  It matches anything that matches one of the
+    branches.
+   </p><p>
+    A branch is zero or more <em class="firstterm">quantified atoms</em> or
+    <em class="firstterm">constraints</em>, concatenated.
+    It matches a match for the first, followed by a match for the second, etc;
+    an empty branch matches the empty string.
+   </p><p>
+    A quantified atom is an <em class="firstterm">atom</em> possibly followed
+    by a single <em class="firstterm">quantifier</em>.
+    Without a quantifier, it matches a match for the atom.
+    With a quantifier, it can match some number of matches of the atom.
+    An <em class="firstterm">atom</em> can be any of the possibilities
+    shown in <a class="xref" href="functions-matching.html#POSIX-ATOMS-TABLE" title="Table 9.17. Regular Expression Atoms">Table 9.17</a>.
+    The possible quantifiers and their meanings are shown in
+    <a class="xref" href="functions-matching.html#POSIX-QUANTIFIERS-TABLE" title="Table 9.18. Regular Expression Quantifiers">Table 9.18</a>.
+   </p><p>
+    A <em class="firstterm">constraint</em> matches an empty string, but matches only when
+    specific conditions are met.  A constraint can be used where an atom
+    could be used, except it cannot be followed by a quantifier.
+    The simple constraints are shown in
+    <a class="xref" href="functions-matching.html#POSIX-CONSTRAINTS-TABLE" title="Table 9.19. Regular Expression Constraints">Table 9.19</a>;
+    some more constraints are described later.
+   </p><div class="table" id="POSIX-ATOMS-TABLE"><p class="title"><strong>Table 9.17. Regular Expression Atoms</strong></p><div class="table-contents"><table class="table" summary="Regular Expression Atoms" border="1"><colgroup><col /><col /></colgroup><thead><tr><th>Atom</th><th>Description</th></tr></thead><tbody><tr><td> <code class="literal">(</code><em class="replaceable"><code>re</code></em><code class="literal">)</code> </td><td> (where <em class="replaceable"><code>re</code></em> is any regular expression)
+       matches a match for
+       <em class="replaceable"><code>re</code></em>, with the match noted for possible reporting </td></tr><tr><td> <code class="literal">(?:</code><em class="replaceable"><code>re</code></em><code class="literal">)</code> </td><td> as above, but the match is not noted for reporting
+       (a <span class="quote">“<span class="quote">non-capturing</span>”</span> set of parentheses)
+       (AREs only) </td></tr><tr><td> <code class="literal">.</code> </td><td> matches any single character </td></tr><tr><td> <code class="literal">[</code><em class="replaceable"><code>chars</code></em><code class="literal">]</code> </td><td> a <em class="firstterm">bracket expression</em>,
+       matching any one of the <em class="replaceable"><code>chars</code></em> (see
+       <a class="xref" href="functions-matching.html#POSIX-BRACKET-EXPRESSIONS" title="9.7.3.2. Bracket Expressions">Section 9.7.3.2</a> for more detail) </td></tr><tr><td> <code class="literal">\</code><em class="replaceable"><code>k</code></em> </td><td> (where <em class="replaceable"><code>k</code></em> is a non-alphanumeric character)
+       matches that character taken as an ordinary character,
+       e.g., <code class="literal">\\</code> matches a backslash character </td></tr><tr><td> <code class="literal">\</code><em class="replaceable"><code>c</code></em> </td><td> where <em class="replaceable"><code>c</code></em> is alphanumeric
+       (possibly followed by other characters)
+       is an <em class="firstterm">escape</em>, see <a class="xref" href="functions-matching.html#POSIX-ESCAPE-SEQUENCES" title="9.7.3.3. Regular Expression Escapes">Section 9.7.3.3</a>
+       (AREs only; in EREs and BREs, this matches <em class="replaceable"><code>c</code></em>) </td></tr><tr><td> <code class="literal">{</code> </td><td> when followed by a character other than a digit,
+       matches the left-brace character <code class="literal">{</code>;
+       when followed by a digit, it is the beginning of a
+       <em class="replaceable"><code>bound</code></em> (see below) </td></tr><tr><td> <em class="replaceable"><code>x</code></em> </td><td> where <em class="replaceable"><code>x</code></em> is a single character with no other
+       significance, matches that character </td></tr></tbody></table></div></div><br class="table-break" /><p>
+    An RE cannot end with a backslash (<code class="literal">\</code>).
+   </p><div class="note"><h3 class="title">Note</h3><p>
+     If you have <a class="xref" href="runtime-config-compatible.html#GUC-STANDARD-CONFORMING-STRINGS">standard_conforming_strings</a> turned off,
+     any backslashes you write in literal string constants will need to be
+     doubled.  See <a class="xref" href="sql-syntax-lexical.html#SQL-SYNTAX-STRINGS" title="4.1.2.1. String Constants">Section 4.1.2.1</a> for more information.
+    </p></div><div class="table" id="POSIX-QUANTIFIERS-TABLE"><p class="title"><strong>Table 9.18. Regular Expression Quantifiers</strong></p><div class="table-contents"><table class="table" summary="Regular Expression Quantifiers" border="1"><colgroup><col /><col /></colgroup><thead><tr><th>Quantifier</th><th>Matches</th></tr></thead><tbody><tr><td> <code class="literal">*</code> </td><td> a sequence of 0 or more matches of the atom </td></tr><tr><td> <code class="literal">+</code> </td><td> a sequence of 1 or more matches of the atom </td></tr><tr><td> <code class="literal">?</code> </td><td> a sequence of 0 or 1 matches of the atom </td></tr><tr><td> <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">}</code> </td><td> a sequence of exactly <em class="replaceable"><code>m</code></em> matches of the atom </td></tr><tr><td> <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">,}</code> </td><td> a sequence of <em class="replaceable"><code>m</code></em> or more matches of the atom </td></tr><tr><td>
+       <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">,</code><em class="replaceable"><code>n</code></em><code class="literal">}</code> </td><td> a sequence of <em class="replaceable"><code>m</code></em> through <em class="replaceable"><code>n</code></em>
+       (inclusive) matches of the atom; <em class="replaceable"><code>m</code></em> cannot exceed
+       <em class="replaceable"><code>n</code></em> </td></tr><tr><td> <code class="literal">*?</code> </td><td> non-greedy version of <code class="literal">*</code> </td></tr><tr><td> <code class="literal">+?</code> </td><td> non-greedy version of <code class="literal">+</code> </td></tr><tr><td> <code class="literal">??</code> </td><td> non-greedy version of <code class="literal">?</code> </td></tr><tr><td> <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">}?</code> </td><td> non-greedy version of <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">}</code> </td></tr><tr><td> <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">,}?</code> </td><td> non-greedy version of <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">,}</code> </td></tr><tr><td>
+       <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">,</code><em class="replaceable"><code>n</code></em><code class="literal">}?</code> </td><td> non-greedy version of <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">,</code><em class="replaceable"><code>n</code></em><code class="literal">}</code> </td></tr></tbody></table></div></div><br class="table-break" /><p>
+    The forms using <code class="literal">{</code><em class="replaceable"><code>...</code></em><code class="literal">}</code>
+    are known as <em class="firstterm">bounds</em>.
+    The numbers <em class="replaceable"><code>m</code></em> and <em class="replaceable"><code>n</code></em> within a bound are
+    unsigned decimal integers with permissible values from 0 to 255 inclusive.
+   </p><p>
+     <em class="firstterm">Non-greedy</em> quantifiers (available in AREs only) match the
+     same possibilities as their corresponding normal (<em class="firstterm">greedy</em>)
+     counterparts, but prefer the smallest number rather than the largest
+     number of matches.
+     See <a class="xref" href="functions-matching.html#POSIX-MATCHING-RULES" title="9.7.3.5. Regular Expression Matching Rules">Section 9.7.3.5</a> for more detail.
+   </p><div class="note"><h3 class="title">Note</h3><p>
+     A quantifier cannot immediately follow another quantifier, e.g.,
+     <code class="literal">**</code> is invalid.
+     A quantifier cannot
+     begin an expression or subexpression or follow
+     <code class="literal">^</code> or <code class="literal">|</code>.
+    </p></div><div class="table" id="POSIX-CONSTRAINTS-TABLE"><p class="title"><strong>Table 9.19. Regular Expression Constraints</strong></p><div class="table-contents"><table class="table" summary="Regular Expression Constraints" border="1"><colgroup><col /><col /></colgroup><thead><tr><th>Constraint</th><th>Description</th></tr></thead><tbody><tr><td> <code class="literal">^</code> </td><td> matches at the beginning of the string </td></tr><tr><td> <code class="literal">$</code> </td><td> matches at the end of the string </td></tr><tr><td> <code class="literal">(?=</code><em class="replaceable"><code>re</code></em><code class="literal">)</code> </td><td> <em class="firstterm">positive lookahead</em> matches at any point
+       where a substring matching <em class="replaceable"><code>re</code></em> begins
+       (AREs only) </td></tr><tr><td> <code class="literal">(?!</code><em class="replaceable"><code>re</code></em><code class="literal">)</code> </td><td> <em class="firstterm">negative lookahead</em> matches at any point
+       where no substring matching <em class="replaceable"><code>re</code></em> begins
+       (AREs only) </td></tr><tr><td> <code class="literal">(?&lt;=</code><em class="replaceable"><code>re</code></em><code class="literal">)</code> </td><td> <em class="firstterm">positive lookbehind</em> matches at any point
+       where a substring matching <em class="replaceable"><code>re</code></em> ends
+       (AREs only) </td></tr><tr><td> <code class="literal">(?&lt;!</code><em class="replaceable"><code>re</code></em><code class="literal">)</code> </td><td> <em class="firstterm">negative lookbehind</em> matches at any point
+       where no substring matching <em class="replaceable"><code>re</code></em> ends
+       (AREs only) </td></tr></tbody></table></div></div><br class="table-break" /><p>
+    Lookahead and lookbehind constraints cannot contain <em class="firstterm">back
+    references</em> (see <a class="xref" href="functions-matching.html#POSIX-ESCAPE-SEQUENCES" title="9.7.3.3. Regular Expression Escapes">Section 9.7.3.3</a>),
+    and all parentheses within them are considered non-capturing.
+   </p></div><div class="sect3" id="POSIX-BRACKET-EXPRESSIONS"><div class="titlepage"><div><div><h4 class="title">9.7.3.2. Bracket Expressions</h4></div></div></div><p>
+    A <em class="firstterm">bracket expression</em> is a list of
+    characters enclosed in <code class="literal">[]</code>.  It normally matches
+    any single character from the list (but see below).  If the list
+    begins with <code class="literal">^</code>, it matches any single character
+    <span class="emphasis"><em>not</em></span> from the rest of the list.
+    If two characters
+    in the list are separated by <code class="literal">-</code>, this is
+    shorthand for the full range of characters between those two
+    (inclusive) in the collating sequence,
+    e.g., <code class="literal">[0-9]</code> in <acronym class="acronym">ASCII</acronym> matches
+    any decimal digit.  It is illegal for two ranges to share an
+    endpoint, e.g.,  <code class="literal">a-c-e</code>.  Ranges are very
+    collating-sequence-dependent, so portable programs should avoid
+    relying on them.
+   </p><p>
+    To include a literal <code class="literal">]</code> in the list, make it the
+    first character (after <code class="literal">^</code>, if that is used).  To
+    include a literal <code class="literal">-</code>, make it the first or last
+    character, or the second endpoint of a range.  To use a literal
+    <code class="literal">-</code> as the first endpoint of a range, enclose it
+    in <code class="literal">[.</code> and <code class="literal">.]</code> to make it a
+    collating element (see below).  With the exception of these characters,
+    some combinations using <code class="literal">[</code>
+    (see next paragraphs), and escapes (AREs only), all other special
+    characters lose their special significance within a bracket expression.
+    In particular, <code class="literal">\</code> is not special when following
+    ERE or BRE rules, though it is special (as introducing an escape)
+    in AREs.
+   </p><p>
+    Within a bracket expression, a collating element (a character, a
+    multiple-character sequence that collates as if it were a single
+    character, or a collating-sequence name for either) enclosed in
+    <code class="literal">[.</code> and <code class="literal">.]</code> stands for the
+    sequence of characters of that collating element.  The sequence is
+    treated as a single element of the bracket expression's list.  This
+    allows a bracket
+    expression containing a multiple-character collating element to
+    match more than one character, e.g., if the collating sequence
+    includes a <code class="literal">ch</code> collating element, then the RE
+    <code class="literal">[[.ch.]]*c</code> matches the first five characters of
+    <code class="literal">chchcc</code>.
+   </p><div class="note"><h3 class="title">Note</h3><p>
+     <span class="productname">PostgreSQL</span> currently does not support multi-character collating
+     elements. This information describes possible future behavior.
+    </p></div><p>
+    Within a bracket expression, a collating element enclosed in
+    <code class="literal">[=</code> and <code class="literal">=]</code> is an <em class="firstterm">equivalence
+    class</em>, standing for the sequences of characters of all collating
+    elements equivalent to that one, including itself.  (If there are
+    no other equivalent collating elements, the treatment is as if the
+    enclosing delimiters were <code class="literal">[.</code> and
+    <code class="literal">.]</code>.)  For example, if <code class="literal">o</code> and
+    <code class="literal">^</code> are the members of an equivalence class, then
+    <code class="literal">[[=o=]]</code>, <code class="literal">[[=^=]]</code>, and
+    <code class="literal">[o^]</code> are all synonymous.  An equivalence class
+    cannot be an endpoint of a range.
+   </p><p>
+    Within a bracket expression, the name of a character class
+    enclosed in <code class="literal">[:</code> and <code class="literal">:]</code> stands
+    for the list of all characters belonging to that class.  A character
+    class cannot be used as an endpoint of a range.
+    The <acronym class="acronym">POSIX</acronym> standard defines these character class
+    names:
+    <code class="literal">alnum</code> (letters and numeric digits),
+    <code class="literal">alpha</code> (letters),
+    <code class="literal">blank</code> (space and tab),
+    <code class="literal">cntrl</code> (control characters),
+    <code class="literal">digit</code> (numeric digits),
+    <code class="literal">graph</code> (printable characters except space),
+    <code class="literal">lower</code> (lower-case letters),
+    <code class="literal">print</code> (printable characters including space),
+    <code class="literal">punct</code> (punctuation),
+    <code class="literal">space</code> (any white space),
+    <code class="literal">upper</code> (upper-case letters),
+    and <code class="literal">xdigit</code> (hexadecimal digits).
+    The behavior of these standard character classes is generally
+    consistent across platforms for characters in the 7-bit ASCII set.
+    Whether a given non-ASCII character is considered to belong to one
+    of these classes depends on the <em class="firstterm">collation</em>
+    that is used for the regular-expression function or operator
+    (see <a class="xref" href="collation.html" title="23.2. Collation Support">Section 23.2</a>), or by default on the
+    database's <code class="envar">LC_CTYPE</code> locale setting (see
+    <a class="xref" href="locale.html" title="23.1. Locale Support">Section 23.1</a>).  The classification of non-ASCII
+    characters can vary across platforms even in similarly-named
+    locales.  (But the <code class="literal">C</code> locale never considers any
+    non-ASCII characters to belong to any of these classes.)
+    In addition to these standard character
+    classes, <span class="productname">PostgreSQL</span> defines
+    the <code class="literal">ascii</code> character class, which contains exactly
+    the 7-bit ASCII set.
+   </p><p>
+    There are two special cases of bracket expressions:  the bracket
+    expressions <code class="literal">[[:&lt;:]]</code> and
+    <code class="literal">[[:&gt;:]]</code> are constraints,
+    matching empty strings at the beginning
+    and end of a word respectively.  A word is defined as a sequence
+    of word characters that is neither preceded nor followed by word
+    characters.  A word character is an <code class="literal">alnum</code> character (as
+    defined by the <acronym class="acronym">POSIX</acronym> character class described above)
+    or an underscore.  This is an extension, compatible with but not
+    specified by <acronym class="acronym">POSIX</acronym> 1003.2, and should be used with
+    caution in software intended to be portable to other systems.
+    The constraint escapes described below are usually preferable; they
+    are no more standard, but are easier to type.
+   </p></div><div class="sect3" id="POSIX-ESCAPE-SEQUENCES"><div class="titlepage"><div><div><h4 class="title">9.7.3.3. Regular Expression Escapes</h4></div></div></div><p>
+    <em class="firstterm">Escapes</em> are special sequences beginning with <code class="literal">\</code>
+    followed by an alphanumeric character. Escapes come in several varieties:
+    character entry, class shorthands, constraint escapes, and back references.
+    A <code class="literal">\</code> followed by an alphanumeric character but not constituting
+    a valid escape is illegal in AREs.
+    In EREs, there are no escapes: outside a bracket expression,
+    a <code class="literal">\</code> followed by an alphanumeric character merely stands for
+    that character as an ordinary character, and inside a bracket expression,
+    <code class="literal">\</code> is an ordinary character.
+    (The latter is the one actual incompatibility between EREs and AREs.)
+   </p><p>
+    <em class="firstterm">Character-entry escapes</em> exist to make it easier to specify
+    non-printing and other inconvenient characters in REs.  They are
+    shown in <a class="xref" href="functions-matching.html#POSIX-CHARACTER-ENTRY-ESCAPES-TABLE" title="Table 9.20. Regular Expression Character-Entry Escapes">Table 9.20</a>.
+   </p><p>
+    <em class="firstterm">Class-shorthand escapes</em> provide shorthands for certain
+    commonly-used character classes.  They are
+    shown in <a class="xref" href="functions-matching.html#POSIX-CLASS-SHORTHAND-ESCAPES-TABLE" title="Table 9.21. Regular Expression Class-Shorthand Escapes">Table 9.21</a>.
+   </p><p>
+    A <em class="firstterm">constraint escape</em> is a constraint,
+    matching the empty string if specific conditions are met,
+    written as an escape.  They are
+    shown in <a class="xref" href="functions-matching.html#POSIX-CONSTRAINT-ESCAPES-TABLE" title="Table 9.22. Regular Expression Constraint Escapes">Table 9.22</a>.
+   </p><p>
+    A <em class="firstterm">back reference</em> (<code class="literal">\</code><em class="replaceable"><code>n</code></em>) matches the
+    same string matched by the previous parenthesized subexpression specified
+    by the number <em class="replaceable"><code>n</code></em>
+    (see <a class="xref" href="functions-matching.html#POSIX-CONSTRAINT-BACKREF-TABLE" title="Table 9.23. Regular Expression Back References">Table 9.23</a>).  For example,
+    <code class="literal">([bc])\1</code> matches <code class="literal">bb</code> or <code class="literal">cc</code>
+    but not <code class="literal">bc</code> or <code class="literal">cb</code>.
+    The subexpression must entirely precede the back reference in the RE.
+    Subexpressions are numbered in the order of their leading parentheses.
+    Non-capturing parentheses do not define subexpressions.
+   </p><div class="table" id="POSIX-CHARACTER-ENTRY-ESCAPES-TABLE"><p class="title"><strong>Table 9.20. Regular Expression Character-Entry Escapes</strong></p><div class="table-contents"><table class="table" summary="Regular Expression Character-Entry Escapes" border="1"><colgroup><col /><col /></colgroup><thead><tr><th>Escape</th><th>Description</th></tr></thead><tbody><tr><td> <code class="literal">\a</code> </td><td> alert (bell) character, as in C </td></tr><tr><td> <code class="literal">\b</code> </td><td> backspace, as in C </td></tr><tr><td> <code class="literal">\B</code> </td><td> synonym for backslash (<code class="literal">\</code>) to help reduce the need for backslash
+       doubling </td></tr><tr><td> <code class="literal">\c</code><em class="replaceable"><code>X</code></em> </td><td> (where <em class="replaceable"><code>X</code></em> is any character) the character whose
+       low-order 5 bits are the same as those of
+       <em class="replaceable"><code>X</code></em>, and whose other bits are all zero </td></tr><tr><td> <code class="literal">\e</code> </td><td> the character whose collating-sequence name
+       is <code class="literal">ESC</code>,
+       or failing that, the character with octal value <code class="literal">033</code> </td></tr><tr><td> <code class="literal">\f</code> </td><td> form feed, as in C </td></tr><tr><td> <code class="literal">\n</code> </td><td> newline, as in C </td></tr><tr><td> <code class="literal">\r</code> </td><td> carriage return, as in C </td></tr><tr><td> <code class="literal">\t</code> </td><td> horizontal tab, as in C </td></tr><tr><td> <code class="literal">\u</code><em class="replaceable"><code>wxyz</code></em> </td><td> (where <em class="replaceable"><code>wxyz</code></em> is exactly four hexadecimal digits)
+       the character whose hexadecimal value is
+       <code class="literal">0x</code><em class="replaceable"><code>wxyz</code></em>
+       </td></tr><tr><td> <code class="literal">\U</code><em class="replaceable"><code>stuvwxyz</code></em> </td><td> (where <em class="replaceable"><code>stuvwxyz</code></em> is exactly eight hexadecimal
+       digits)
+       the character whose hexadecimal value is
+       <code class="literal">0x</code><em class="replaceable"><code>stuvwxyz</code></em>
+       </td></tr><tr><td> <code class="literal">\v</code> </td><td> vertical tab, as in C </td></tr><tr><td> <code class="literal">\x</code><em class="replaceable"><code>hhh</code></em> </td><td> (where <em class="replaceable"><code>hhh</code></em> is any sequence of hexadecimal
+       digits)
+       the character whose hexadecimal value is
+       <code class="literal">0x</code><em class="replaceable"><code>hhh</code></em>
+       (a single character no matter how many hexadecimal digits are used)
+       </td></tr><tr><td> <code class="literal">\0</code> </td><td> the character whose value is <code class="literal">0</code> (the null byte)</td></tr><tr><td> <code class="literal">\</code><em class="replaceable"><code>xy</code></em> </td><td> (where <em class="replaceable"><code>xy</code></em> is exactly two octal digits,
+       and is not a <em class="firstterm">back reference</em>)
+       the character whose octal value is
+       <code class="literal">0</code><em class="replaceable"><code>xy</code></em> </td></tr><tr><td> <code class="literal">\</code><em class="replaceable"><code>xyz</code></em> </td><td> (where <em class="replaceable"><code>xyz</code></em> is exactly three octal digits,
+       and is not a <em class="firstterm">back reference</em>)
+       the character whose octal value is
+       <code class="literal">0</code><em class="replaceable"><code>xyz</code></em> </td></tr></tbody></table></div></div><br class="table-break" /><p>
+    Hexadecimal digits are <code class="literal">0</code>-<code class="literal">9</code>,
+    <code class="literal">a</code>-<code class="literal">f</code>, and <code class="literal">A</code>-<code class="literal">F</code>.
+    Octal digits are <code class="literal">0</code>-<code class="literal">7</code>.
+   </p><p>
+    Numeric character-entry escapes specifying values outside the ASCII range
+    (0–127) have meanings dependent on the database encoding.  When the
+    encoding is UTF-8, escape values are equivalent to Unicode code points,
+    for example <code class="literal">\u1234</code> means the character <code class="literal">U+1234</code>.
+    For other multibyte encodings, character-entry escapes usually just
+    specify the concatenation of the byte values for the character.  If the
+    escape value does not correspond to any legal character in the database
+    encoding, no error will be raised, but it will never match any data.
+   </p><p>
+    The character-entry escapes are always taken as ordinary characters.
+    For example, <code class="literal">\135</code> is <code class="literal">]</code> in ASCII, but
+    <code class="literal">\135</code> does not terminate a bracket expression.
+   </p><div class="table" id="POSIX-CLASS-SHORTHAND-ESCAPES-TABLE"><p class="title"><strong>Table 9.21. Regular Expression Class-Shorthand Escapes</strong></p><div class="table-contents"><table class="table" summary="Regular Expression Class-Shorthand Escapes" border="1"><colgroup><col /><col /></colgroup><thead><tr><th>Escape</th><th>Description</th></tr></thead><tbody><tr><td> <code class="literal">\d</code> </td><td> <code class="literal">[[:digit:]]</code> </td></tr><tr><td> <code class="literal">\s</code> </td><td> <code class="literal">[[:space:]]</code> </td></tr><tr><td> <code class="literal">\w</code> </td><td> <code class="literal">[[:alnum:]_]</code>
+       (note underscore is included) </td></tr><tr><td> <code class="literal">\D</code> </td><td> <code class="literal">[^[:digit:]]</code> </td></tr><tr><td> <code class="literal">\S</code> </td><td> <code class="literal">[^[:space:]]</code> </td></tr><tr><td> <code class="literal">\W</code> </td><td> <code class="literal">[^[:alnum:]_]</code>
+       (note underscore is included) </td></tr></tbody></table></div></div><br class="table-break" /><p>
+    Within bracket expressions, <code class="literal">\d</code>, <code class="literal">\s</code>,
+    and <code class="literal">\w</code> lose their outer brackets,
+    and <code class="literal">\D</code>, <code class="literal">\S</code>, and <code class="literal">\W</code> are illegal.
+    (So, for example, <code class="literal">[a-c\d]</code> is equivalent to
+    <code class="literal">[a-c[:digit:]]</code>.
+    Also, <code class="literal">[a-c\D]</code>, which is equivalent to
+    <code class="literal">[a-c^[:digit:]]</code>, is illegal.)
+   </p><div class="table" id="POSIX-CONSTRAINT-ESCAPES-TABLE"><p class="title"><strong>Table 9.22. Regular Expression Constraint Escapes</strong></p><div class="table-contents"><table class="table" summary="Regular Expression Constraint Escapes" border="1"><colgroup><col /><col /></colgroup><thead><tr><th>Escape</th><th>Description</th></tr></thead><tbody><tr><td> <code class="literal">\A</code> </td><td> matches only at the beginning of the string
+       (see <a class="xref" href="functions-matching.html#POSIX-MATCHING-RULES" title="9.7.3.5. Regular Expression Matching Rules">Section 9.7.3.5</a> for how this differs from
+       <code class="literal">^</code>) </td></tr><tr><td> <code class="literal">\m</code> </td><td> matches only at the beginning of a word </td></tr><tr><td> <code class="literal">\M</code> </td><td> matches only at the end of a word </td></tr><tr><td> <code class="literal">\y</code> </td><td> matches only at the beginning or end of a word </td></tr><tr><td> <code class="literal">\Y</code> </td><td> matches only at a point that is not the beginning or end of a
+       word </td></tr><tr><td> <code class="literal">\Z</code> </td><td> matches only at the end of the string
+       (see <a class="xref" href="functions-matching.html#POSIX-MATCHING-RULES" title="9.7.3.5. Regular Expression Matching Rules">Section 9.7.3.5</a> for how this differs from
+       <code class="literal">$</code>) </td></tr></tbody></table></div></div><br class="table-break" /><p>
+    A word is defined as in the specification of
+    <code class="literal">[[:&lt;:]]</code> and <code class="literal">[[:&gt;:]]</code> above.
+    Constraint escapes are illegal within bracket expressions.
+   </p><div class="table" id="POSIX-CONSTRAINT-BACKREF-TABLE"><p class="title"><strong>Table 9.23. Regular Expression Back References</strong></p><div class="table-contents"><table class="table" summary="Regular Expression Back References" border="1"><colgroup><col /><col /></colgroup><thead><tr><th>Escape</th><th>Description</th></tr></thead><tbody><tr><td> <code class="literal">\</code><em class="replaceable"><code>m</code></em> </td><td> (where <em class="replaceable"><code>m</code></em> is a nonzero digit)
+       a back reference to the <em class="replaceable"><code>m</code></em>'th subexpression </td></tr><tr><td> <code class="literal">\</code><em class="replaceable"><code>mnn</code></em> </td><td> (where <em class="replaceable"><code>m</code></em> is a nonzero digit, and
+       <em class="replaceable"><code>nn</code></em> is some more digits, and the decimal value
+       <em class="replaceable"><code>mnn</code></em> is not greater than the number of closing capturing
+       parentheses seen so far)
+       a back reference to the <em class="replaceable"><code>mnn</code></em>'th subexpression </td></tr></tbody></table></div></div><br class="table-break" /><div class="note"><h3 class="title">Note</h3><p>
+     There is an inherent ambiguity between octal character-entry
+     escapes and back references, which is resolved by the following heuristics,
+     as hinted at above.
+     A leading zero always indicates an octal escape.
+     A single non-zero digit, not followed by another digit,
+     is always taken as a back reference.
+     A multi-digit sequence not starting with a zero is taken as a back
+     reference if it comes after a suitable subexpression
+     (i.e., the number is in the legal range for a back reference),
+     and otherwise is taken as octal.
+    </p></div></div><div class="sect3" id="POSIX-METASYNTAX"><div class="titlepage"><div><div><h4 class="title">9.7.3.4. Regular Expression Metasyntax</h4></div></div></div><p>
+    In addition to the main syntax described above, there are some special
+    forms and miscellaneous syntactic facilities available.
+   </p><p>
+    An RE can begin with one of two special <em class="firstterm">director</em> prefixes.
+    If an RE begins with <code class="literal">***:</code>,
+    the rest of the RE is taken as an ARE.  (This normally has no effect in
+    <span class="productname">PostgreSQL</span>, since REs are assumed to be AREs;
+    but it does have an effect if ERE or BRE mode had been specified by
+    the <em class="replaceable"><code>flags</code></em> parameter to a regex function.)
+    If an RE begins with <code class="literal">***=</code>,
+    the rest of the RE is taken to be a literal string,
+    with all characters considered ordinary characters.
+   </p><p>
+    An ARE can begin with <em class="firstterm">embedded options</em>:
+    a sequence <code class="literal">(?</code><em class="replaceable"><code>xyz</code></em><code class="literal">)</code>
+    (where <em class="replaceable"><code>xyz</code></em> is one or more alphabetic characters)
+    specifies options affecting the rest of the RE.
+    These options override any previously determined options —
+    in particular, they can override the case-sensitivity behavior implied by
+    a regex operator, or the <em class="replaceable"><code>flags</code></em> parameter to a regex
+    function.
+    The available option letters are
+    shown in <a class="xref" href="functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE" title="Table 9.24. ARE Embedded-Option Letters">Table 9.24</a>.
+    Note that these same option letters are used in the <em class="replaceable"><code>flags</code></em>
+    parameters of regex functions.
+   </p><div class="table" id="POSIX-EMBEDDED-OPTIONS-TABLE"><p class="title"><strong>Table 9.24. ARE Embedded-Option Letters</strong></p><div class="table-contents"><table class="table" summary="ARE Embedded-Option Letters" border="1"><colgroup><col /><col /></colgroup><thead><tr><th>Option</th><th>Description</th></tr></thead><tbody><tr><td> <code class="literal">b</code> </td><td> rest of RE is a BRE </td></tr><tr><td> <code class="literal">c</code> </td><td> case-sensitive matching (overrides operator type) </td></tr><tr><td> <code class="literal">e</code> </td><td> rest of RE is an ERE </td></tr><tr><td> <code class="literal">i</code> </td><td> case-insensitive matching (see
+       <a class="xref" href="functions-matching.html#POSIX-MATCHING-RULES" title="9.7.3.5. Regular Expression Matching Rules">Section 9.7.3.5</a>) (overrides operator type) </td></tr><tr><td> <code class="literal">m</code> </td><td> historical synonym for <code class="literal">n</code> </td></tr><tr><td> <code class="literal">n</code> </td><td> newline-sensitive matching (see
+       <a class="xref" href="functions-matching.html#POSIX-MATCHING-RULES" title="9.7.3.5. Regular Expression Matching Rules">Section 9.7.3.5</a>) </td></tr><tr><td> <code class="literal">p</code> </td><td> partial newline-sensitive matching (see
+       <a class="xref" href="functions-matching.html#POSIX-MATCHING-RULES" title="9.7.3.5. Regular Expression Matching Rules">Section 9.7.3.5</a>) </td></tr><tr><td> <code class="literal">q</code> </td><td> rest of RE is a literal (<span class="quote">“<span class="quote">quoted</span>”</span>) string, all ordinary
+       characters </td></tr><tr><td> <code class="literal">s</code> </td><td> non-newline-sensitive matching (default) </td></tr><tr><td> <code class="literal">t</code> </td><td> tight syntax (default; see below) </td></tr><tr><td> <code class="literal">w</code> </td><td> inverse partial newline-sensitive (<span class="quote">“<span class="quote">weird</span>”</span>) matching
+       (see <a class="xref" href="functions-matching.html#POSIX-MATCHING-RULES" title="9.7.3.5. Regular Expression Matching Rules">Section 9.7.3.5</a>) </td></tr><tr><td> <code class="literal">x</code> </td><td> expanded syntax (see below) </td></tr></tbody></table></div></div><br class="table-break" /><p>
+    Embedded options take effect at the <code class="literal">)</code> terminating the sequence.
+    They can appear only at the start of an ARE (after the
+    <code class="literal">***:</code> director if any).
+   </p><p>
+    In addition to the usual (<em class="firstterm">tight</em>) RE syntax, in which all
+    characters are significant, there is an <em class="firstterm">expanded</em> syntax,
+    available by specifying the embedded <code class="literal">x</code> option.
+    In the expanded syntax,
+    white-space characters in the RE are ignored, as are
+    all characters between a <code class="literal">#</code>
+    and the following newline (or the end of the RE).  This
+    permits paragraphing and commenting a complex RE.
+    There are three exceptions to that basic rule:
+
+    </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
+       a white-space character or <code class="literal">#</code> preceded by <code class="literal">\</code> is
+       retained
+      </p></li><li class="listitem"><p>
+       white space or <code class="literal">#</code> within a bracket expression is retained
+      </p></li><li class="listitem"><p>
+       white space and comments cannot appear within multi-character symbols,
+       such as <code class="literal">(?:</code>
+      </p></li></ul></div><p>
+
+    For this purpose, white-space characters are blank, tab, newline, and
+    any character that belongs to the <em class="replaceable"><code>space</code></em> character class.
+   </p><p>
+    Finally, in an ARE, outside bracket expressions, the sequence
+    <code class="literal">(?#</code><em class="replaceable"><code>ttt</code></em><code class="literal">)</code>
+    (where <em class="replaceable"><code>ttt</code></em> is any text not containing a <code class="literal">)</code>)
+    is a comment, completely ignored.
+    Again, this is not allowed between the characters of
+    multi-character symbols, like <code class="literal">(?:</code>.
+    Such comments are more a historical artifact than a useful facility,
+    and their use is deprecated; use the expanded syntax instead.
+   </p><p>
+    <span class="emphasis"><em>None</em></span> of these metasyntax extensions is available if
+    an initial <code class="literal">***=</code> director
+    has specified that the user's input be treated as a literal string
+    rather than as an RE.
+   </p></div><div class="sect3" id="POSIX-MATCHING-RULES"><div class="titlepage"><div><div><h4 class="title">9.7.3.5. Regular Expression Matching Rules</h4></div></div></div><p>
+    In the event that an RE could match more than one substring of a given
+    string, the RE matches the one starting earliest in the string.
+    If the RE could match more than one substring starting at that point,
+    either the longest possible match or the shortest possible match will
+    be taken, depending on whether the RE is <em class="firstterm">greedy</em> or
+    <em class="firstterm">non-greedy</em>.
+   </p><p>
+    Whether an RE is greedy or not is determined by the following rules:
+    </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
+       Most atoms, and all constraints, have no greediness attribute (because
+       they cannot match variable amounts of text anyway).
+      </p></li><li class="listitem"><p>
+       Adding parentheses around an RE does not change its greediness.
+      </p></li><li class="listitem"><p>
+       A quantified atom with a fixed-repetition quantifier
+       (<code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">}</code>
+       or
+       <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">}?</code>)
+       has the same greediness (possibly none) as the atom itself.
+      </p></li><li class="listitem"><p>
+       A quantified atom with other normal quantifiers (including
+       <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">,</code><em class="replaceable"><code>n</code></em><code class="literal">}</code>
+       with <em class="replaceable"><code>m</code></em> equal to <em class="replaceable"><code>n</code></em>)
+       is greedy (prefers longest match).
+      </p></li><li class="listitem"><p>
+       A quantified atom with a non-greedy quantifier (including
+       <code class="literal">{</code><em class="replaceable"><code>m</code></em><code class="literal">,</code><em class="replaceable"><code>n</code></em><code class="literal">}?</code>
+       with <em class="replaceable"><code>m</code></em> equal to <em class="replaceable"><code>n</code></em>)
+       is non-greedy (prefers shortest match).
+      </p></li><li class="listitem"><p>
+       A branch — that is, an RE that has no top-level
+       <code class="literal">|</code> operator — has the same greediness as the first
+       quantified atom in it that has a greediness attribute.
+      </p></li><li class="listitem"><p>
+       An RE consisting of two or more branches connected by the
+       <code class="literal">|</code> operator is always greedy.
+      </p></li></ul></div><p>
+   </p><p>
+    The above rules associate greediness attributes not only with individual
+    quantified atoms, but with branches and entire REs that contain quantified
+    atoms.  What that means is that the matching is done in such a way that
+    the branch, or whole RE, matches the longest or shortest possible
+    substring <span class="emphasis"><em>as a whole</em></span>.  Once the length of the entire match
+    is determined, the part of it that matches any particular subexpression
+    is determined on the basis of the greediness attribute of that
+    subexpression, with subexpressions starting earlier in the RE taking
+    priority over ones starting later.
+   </p><p>
+    An example of what this means:
+</p><pre class="screen">
+SELECT SUBSTRING('XY1234Z', 'Y*([0-9]{1,3})');
+<em class="lineannotation"><span class="lineannotation">Result: </span></em><code class="computeroutput">123</code>
+SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
+<em class="lineannotation"><span class="lineannotation">Result: </span></em><code class="computeroutput">1</code>
+</pre><p>
+    In the first case, the RE as a whole is greedy because <code class="literal">Y*</code>
+    is greedy.  It can match beginning at the <code class="literal">Y</code>, and it matches
+    the longest possible string starting there, i.e., <code class="literal">Y123</code>.
+    The output is the parenthesized part of that, or <code class="literal">123</code>.
+    In the second case, the RE as a whole is non-greedy because <code class="literal">Y*?</code>
+    is non-greedy.  It can match beginning at the <code class="literal">Y</code>, and it matches
+    the shortest possible string starting there, i.e., <code class="literal">Y1</code>.
+    The subexpression <code class="literal">[0-9]{1,3}</code> is greedy but it cannot change
+    the decision as to the overall match length; so it is forced to match
+    just <code class="literal">1</code>.
+   </p><p>
+    In short, when an RE contains both greedy and non-greedy subexpressions,
+    the total match length is either as long as possible or as short as
+    possible, according to the attribute assigned to the whole RE.  The
+    attributes assigned to the subexpressions only affect how much of that
+    match they are allowed to <span class="quote">“<span class="quote">eat</span>”</span> relative to each other.
+   </p><p>
+    The quantifiers <code class="literal">{1,1}</code> and <code class="literal">{1,1}?</code>
+    can be used to force greediness or non-greediness, respectively,
+    on a subexpression or a whole RE.
+    This is useful when you need the whole RE to have a greediness attribute
+    different from what's deduced from its elements.  As an example,
+    suppose that we are trying to separate a string containing some digits
+    into the digits and the parts before and after them.  We might try to
+    do that like this:
+</p><pre class="screen">
+SELECT regexp_match('abc01234xyz', '(.*)(\d+)(.*)');
+<em class="lineannotation"><span class="lineannotation">Result: </span></em><code class="computeroutput">{abc0123,4,xyz}</code>
+</pre><p>
+    That didn't work: the first <code class="literal">.*</code> is greedy so
+    it <span class="quote">“<span class="quote">eats</span>”</span> as much as it can, leaving the <code class="literal">\d+</code> to
+    match at the last possible place, the last digit.  We might try to fix
+    that by making it non-greedy:
+</p><pre class="screen">
+SELECT regexp_match('abc01234xyz', '(.*?)(\d+)(.*)');
+<em class="lineannotation"><span class="lineannotation">Result: </span></em><code class="computeroutput">{abc,0,""}</code>
+</pre><p>
+    That didn't work either, because now the RE as a whole is non-greedy
+    and so it ends the overall match as soon as possible.  We can get what
+    we want by forcing the RE as a whole to be greedy:
+</p><pre class="screen">
+SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
+<em class="lineannotation"><span class="lineannotation">Result: </span></em><code class="computeroutput">{abc,01234,xyz}</code>
+</pre><p>
+    Controlling the RE's overall greediness separately from its components'
+    greediness allows great flexibility in handling variable-length patterns.
+   </p><p>
+    When deciding what is a longer or shorter match,
+    match lengths are measured in characters, not collating elements.
+    An empty string is considered longer than no match at all.
+    For example:
+    <code class="literal">bb*</code>
+    matches the three middle characters of <code class="literal">abbbc</code>;
+    <code class="literal">(week|wee)(night|knights)</code>
+    matches all ten characters of <code class="literal">weeknights</code>;
+    when <code class="literal">(.*).*</code>
+    is matched against <code class="literal">abc</code> the parenthesized subexpression
+    matches all three characters; and when
+    <code class="literal">(a*)*</code> is matched against <code class="literal">bc</code>
+    both the whole RE and the parenthesized
+    subexpression match an empty string.
+   </p><p>
+    If case-independent matching is specified,
+    the effect is much as if all case distinctions had vanished from the
+    alphabet.
+    When an alphabetic that exists in multiple cases appears as an
+    ordinary character outside a bracket expression, it is effectively
+    transformed into a bracket expression containing both cases,
+    e.g., <code class="literal">x</code> becomes <code class="literal">[xX]</code>.
+    When it appears inside a bracket expression, all case counterparts
+    of it are added to the bracket expression, e.g.,
+    <code class="literal">[x]</code> becomes <code class="literal">[xX]</code>
+    and <code class="literal">[^x]</code> becomes <code class="literal">[^xX]</code>.
+   </p><p>
+    If newline-sensitive matching is specified, <code class="literal">.</code>
+    and bracket expressions using <code class="literal">^</code>
+    will never match the newline character
+    (so that matches will never cross newlines unless the RE
+    explicitly arranges it)
+    and <code class="literal">^</code> and <code class="literal">$</code>
+    will match the empty string after and before a newline
+    respectively, in addition to matching at beginning and end of string
+    respectively.
+    But the ARE escapes <code class="literal">\A</code> and <code class="literal">\Z</code>
+    continue to match beginning or end of string <span class="emphasis"><em>only</em></span>.
+   </p><p>
+    If partial newline-sensitive matching is specified,
+    this affects <code class="literal">.</code> and bracket expressions
+    as with newline-sensitive matching, but not <code class="literal">^</code>
+    and <code class="literal">$</code>.
+   </p><p>
+    If inverse partial newline-sensitive matching is specified,
+    this affects <code class="literal">^</code> and <code class="literal">$</code>
+    as with newline-sensitive matching, but not <code class="literal">.</code>
+    and bracket expressions.
+    This isn't very useful but is provided for symmetry.
+   </p></div><div class="sect3" id="POSIX-LIMITS-COMPATIBILITY"><div class="titlepage"><div><div><h4 class="title">9.7.3.6. Limits and Compatibility</h4></div></div></div><p>
+    No particular limit is imposed on the length of REs in this
+    implementation.  However,
+    programs intended to be highly portable should not employ REs longer
+    than 256 bytes,
+    as a POSIX-compliant implementation can refuse to accept such REs.
+   </p><p>
+    The only feature of AREs that is actually incompatible with
+    POSIX EREs is that <code class="literal">\</code> does not lose its special
+    significance inside bracket expressions.
+    All other ARE features use syntax which is illegal or has
+    undefined or unspecified effects in POSIX EREs;
+    the <code class="literal">***</code> syntax of directors likewise is outside the POSIX
+    syntax for both BREs and EREs.
+   </p><p>
+    Many of the ARE extensions are borrowed from Perl, but some have
+    been changed to clean them up, and a few Perl extensions are not present.
+    Incompatibilities of note include <code class="literal">\b</code>, <code class="literal">\B</code>,
+    the lack of special treatment for a trailing newline,
+    the addition of complemented bracket expressions to the things
+    affected by newline-sensitive matching,
+    the restrictions on parentheses and back references in lookahead/lookbehind
+    constraints, and the longest/shortest-match (rather than first-match)
+    matching semantics.
+   </p><p>
+    Two significant incompatibilities exist between AREs and the ERE syntax
+    recognized by pre-7.4 releases of <span class="productname">PostgreSQL</span>:
+
+    </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
+       In AREs, <code class="literal">\</code> followed by an alphanumeric character is either
+       an escape or an error, while in previous releases, it was just another
+       way of writing the alphanumeric.
+       This should not be much of a problem because there was no reason to
+       write such a sequence in earlier releases.
+      </p></li><li class="listitem"><p>
+       In AREs, <code class="literal">\</code> remains a special character within
+       <code class="literal">[]</code>, so a literal <code class="literal">\</code> within a bracket
+       expression must be written <code class="literal">\\</code>.
+      </p></li></ul></div><p>
+   </p></div><div class="sect3" id="POSIX-BASIC-REGEXES"><div class="titlepage"><div><div><h4 class="title">9.7.3.7. Basic Regular Expressions</h4></div></div></div><p>
+    BREs differ from EREs in several respects.
+    In BREs, <code class="literal">|</code>, <code class="literal">+</code>, and <code class="literal">?</code>
+    are ordinary characters and there is no equivalent
+    for their functionality.
+    The delimiters for bounds are
+    <code class="literal">\{</code> and <code class="literal">\}</code>,
+    with <code class="literal">{</code> and <code class="literal">}</code>
+    by themselves ordinary characters.
+    The parentheses for nested subexpressions are
+    <code class="literal">\(</code> and <code class="literal">\)</code>,
+    with <code class="literal">(</code> and <code class="literal">)</code> by themselves ordinary characters.
+    <code class="literal">^</code> is an ordinary character except at the beginning of the
+    RE or the beginning of a parenthesized subexpression,
+    <code class="literal">$</code> is an ordinary character except at the end of the
+    RE or the end of a parenthesized subexpression,
+    and <code class="literal">*</code> is an ordinary character if it appears at the beginning
+    of the RE or the beginning of a parenthesized subexpression
+    (after a possible leading <code class="literal">^</code>).
+    Finally, single-digit back references are available, and
+    <code class="literal">\&lt;</code> and <code class="literal">\&gt;</code>
+    are synonyms for
+    <code class="literal">[[:&lt;:]]</code> and <code class="literal">[[:&gt;:]]</code>
+    respectively; no other escapes are available in BREs.
+   </p></div><div class="sect3" id="POSIX-VS-XQUERY"><div class="titlepage"><div><div><h4 class="title">9.7.3.8. Differences from XQuery (<code class="literal">LIKE_REGEX</code>)</h4></div></div></div><a id="id-1.5.8.13.9.35.2" class="indexterm"></a><a id="id-1.5.8.13.9.35.3" class="indexterm"></a><p>
+     Since SQL:2008, the SQL standard includes
+     a <code class="literal">LIKE_REGEX</code> operator that performs pattern
+     matching according to the XQuery regular expression
+     standard.  <span class="productname">PostgreSQL</span> does not yet
+     implement this operator, but you can get very similar behavior using
+     the <code class="function">regexp_match()</code> function, since XQuery
+     regular expressions are quite close to the ARE syntax described above.
+    </p><p>
+     Notable differences between the existing POSIX-based
+     regular-expression feature and XQuery regular expressions include:
+
+     </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
+        XQuery character class subtraction is not supported.  An example of
+        this feature is using the following to match only English
+        consonants: <code class="literal">[a-z-[aeiou]]</code>.
+       </p></li><li class="listitem"><p>
+        XQuery character class shorthands <code class="literal">\c</code>,
+        <code class="literal">\C</code>, <code class="literal">\i</code>,
+        and <code class="literal">\I</code> are not supported.
+       </p></li><li class="listitem"><p>
+        XQuery character class elements
+        using <code class="literal">\p{UnicodeProperty}</code> or the
+        inverse <code class="literal">\P{UnicodeProperty}</code> are not supported.
+       </p></li><li class="listitem"><p>
+        POSIX interprets character classes such as <code class="literal">\w</code>
+        (see <a class="xref" href="functions-matching.html#POSIX-CLASS-SHORTHAND-ESCAPES-TABLE" title="Table 9.21. Regular Expression Class-Shorthand Escapes">Table 9.21</a>)
+        according to the prevailing locale (which you can control by
+        attaching a <code class="literal">COLLATE</code> clause to the operator or
+        function).  XQuery specifies these classes by reference to Unicode
+        character properties, so equivalent behavior is obtained only with
+        a locale that follows the Unicode rules.
+       </p></li><li class="listitem"><p>
+        The SQL standard (not XQuery itself) attempts to cater for more
+        variants of <span class="quote">“<span class="quote">newline</span>”</span> than POSIX does.  The
+        newline-sensitive matching options described above consider only
+        ASCII NL (<code class="literal">\n</code>) to be a newline, but SQL would have
+        us treat CR (<code class="literal">\r</code>), CRLF (<code class="literal">\r\n</code>)
+        (a Windows-style newline), and some Unicode-only characters like
+        LINE SEPARATOR (U+2028) as newlines as well.
+        Notably, <code class="literal">.</code> and <code class="literal">\s</code> should
+        count <code class="literal">\r\n</code> as one character not two according to
+        SQL.
+       </p></li><li class="listitem"><p>
+        Of the character-entry escapes described in
+        <a class="xref" href="functions-matching.html#POSIX-CHARACTER-ENTRY-ESCAPES-TABLE" title="Table 9.20. Regular Expression Character-Entry Escapes">Table 9.20</a>,
+        XQuery supports only <code class="literal">\n</code>, <code class="literal">\r</code>,
+        and <code class="literal">\t</code>.
+       </p></li><li class="listitem"><p>
+        XQuery does not support
+        the <code class="literal">[:<em class="replaceable"><code>name</code></em>:]</code> syntax
+        for character classes within bracket expressions.
+       </p></li><li class="listitem"><p>
+        XQuery does not have lookahead or lookbehind constraints,
+        nor any of the constraint escapes described in
+        <a class="xref" href="functions-matching.html#POSIX-CONSTRAINT-ESCAPES-TABLE" title="Table 9.22. Regular Expression Constraint Escapes">Table 9.22</a>.
+       </p></li><li class="listitem"><p>
+        The metasyntax forms described in <a class="xref" href="functions-matching.html#POSIX-METASYNTAX" title="9.7.3.4. Regular Expression Metasyntax">Section 9.7.3.4</a>
+        do not exist in XQuery.
+       </p></li><li class="listitem"><p>
+        The regular expression flag letters defined by XQuery are
+        related to but not the same as the option letters for POSIX
+        (<a class="xref" href="functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE" title="Table 9.24. ARE Embedded-Option Letters">Table 9.24</a>).  While the
+        <code class="literal">i</code> and <code class="literal">q</code> options behave the
+        same, others do not:
+        </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: circle; "><li class="listitem"><p>
+           XQuery's <code class="literal">s</code> (allow dot to match newline)
+           and <code class="literal">m</code> (allow <code class="literal">^</code>
+           and <code class="literal">$</code> to match at newlines) flags provide
+           access to the same behaviors as
+           POSIX's <code class="literal">n</code>, <code class="literal">p</code>
+           and <code class="literal">w</code> flags, but they
+           do <span class="emphasis"><em>not</em></span> match the behavior of
+           POSIX's <code class="literal">s</code> and <code class="literal">m</code> flags.
+           Note in particular that dot-matches-newline is the default
+           behavior in POSIX but not XQuery.
+          </p></li><li class="listitem"><p>
+           XQuery's <code class="literal">x</code> (ignore whitespace in pattern) flag
+           is noticeably different from POSIX's expanded-mode flag.
+           POSIX's <code class="literal">x</code> flag also
+           allows <code class="literal">#</code> to begin a comment in the pattern,
+           and POSIX will not ignore a whitespace character after a
+           backslash.
+          </p></li></ul></div><p>
+       </p></li></ul></div><p>
+    </p></div></div></div><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navfooter"><hr></hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="functions-bitstring.html" title="9.6. Bit String Functions and Operators">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="functions.html" title="Chapter 9. Functions and Operators">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="functions-formatting.html" title="9.8. Data Type Formatting Functions">Next</a></td></tr><tr><td width="40%" align="left" valign="top">9.6. Bit String Functions and Operators </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 13.4 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 9.8. Data Type Formatting Functions</td></tr></table></div></body></html>
+\ No newline at end of file