1 files changed, 196 insertions, 0 deletions
diff --git a/doc/src/sgml/html/datatype-textsearch.html b/doc/src/sgml/html/datatype-textsearch.html
new file mode 100644
index 0000000..2f4e4a9
--- /dev/null
+++ b/doc/src/sgml/html/datatype-textsearch.html
@@ -0,0 +1,196 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>8.11. Text Search Types</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="datatype-bit.html" title="8.10. Bit String Types" /><link rel="next" href="datatype-uuid.html" title="8.12. UUID Type" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">8.11. Text Search Types</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="datatype-bit.html" title="8.10. Bit String Types">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="datatype.html" title="Chapter 8. Data Types">Up</a></td><th width="60%" align="center">Chapter 8. Data Types</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 15.5 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="datatype-uuid.html" title="8.12. UUID Type">Next</a></td></tr></table><hr /></div><div class="sect1" id="DATATYPE-TEXTSEARCH"><div class="titlepage"><div><div><h2 class="title" style="clear: both">8.11. Text Search Types</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="datatype-textsearch.html#DATATYPE-TSVECTOR">8.11.1. <code class="type">tsvector</code></a></span></dt><dt><span class="sect2"><a href="datatype-textsearch.html#DATATYPE-TSQUERY">8.11.2. <code class="type">tsquery</code></a></span></dt></dl></div><a id="id-1.5.7.19.2" class="indexterm"></a><a id="id-1.5.7.19.3" class="indexterm"></a><p>
+    <span class="productname">PostgreSQL</span> provides two data types that
+    are designed to support full text search, which is the activity of
+    searching through a collection of natural-language <em class="firstterm">documents</em>
+    to locate those that best match a <em class="firstterm">query</em>.
+    The <code class="type">tsvector</code> type represents a document in a form optimized
+    for text search; the <code class="type">tsquery</code> type similarly represents
+    a text query.
+    <a class="xref" href="textsearch.html" title="Chapter 12. Full Text Search">Chapter 12</a> provides a detailed explanation of this
+    facility, and <a class="xref" href="functions-textsearch.html" title="9.13. Text Search Functions and Operators">Section 9.13</a> summarizes the
+    related functions and operators.
+   </p><div class="sect2" id="DATATYPE-TSVECTOR"><div class="titlepage"><div><div><h3 class="title">8.11.1. <code class="type">tsvector</code></h3></div></div></div><a id="id-1.5.7.19.5.2" class="indexterm"></a><p>
+     A <code class="type">tsvector</code> value is a sorted list of distinct
+     <em class="firstterm">lexemes</em>, which are words that have been
+     <em class="firstterm">normalized</em> to merge different variants of the same word
+     (see <a class="xref" href="textsearch.html" title="Chapter 12. Full Text Search">Chapter 12</a> for details).  Sorting and
+     duplicate-elimination are done automatically during input, as shown in
+     this example:
+
+</p><pre class="programlisting">
+SELECT 'a fat cat sat on a mat and ate a fat rat'::tsvector;
+                      tsvector
+----------------------------------------------------
+ 'a' 'and' 'ate' 'cat' 'fat' 'mat' 'on' 'rat' 'sat'
+</pre><p>
+
+     To represent
+     lexemes containing whitespace or punctuation, surround them with quotes:
+
+</p><pre class="programlisting">
+SELECT $$the lexeme '    ' contains spaces$$::tsvector;
+                 tsvector
+-------------------------------------------
+ '    ' 'contains' 'lexeme' 'spaces' 'the'
+</pre><p>
+
+     (We use dollar-quoted string literals in this example and the next one
+     so that the quote marks within the literals do not themselves have to
+     be doubled.)  Embedded quotes and backslashes must be doubled:
+
+</p><pre class="programlisting">
+SELECT $$the lexeme 'Joe''s' contains a quote$$::tsvector;
+                    tsvector
+------------------------------------------------
+ 'Joe''s' 'a' 'contains' 'lexeme' 'quote' 'the'
+</pre><p>
+
+     Optionally, integer <em class="firstterm">positions</em>
+     can be attached to lexemes:
+
+</p><pre class="programlisting">
+SELECT 'a:1 fat:2 cat:3 sat:4 on:5 a:6 mat:7 and:8 ate:9 a:10 fat:11 rat:12'::tsvector;
+                                  tsvector
+-------------------------------------------------------------------------------
+ 'a':1,6,10 'and':8 'ate':9 'cat':3 'fat':2,11 'mat':7 'on':5 'rat':12 'sat':4
+</pre><p>
+
+     A position normally indicates the source word's location in the
+     document.  Positional information can be used for
+     <em class="firstterm">proximity ranking</em>.  Position values can
+     range from 1 to 16383; larger numbers are silently set to 16383.
+     Duplicate positions for the same lexeme are discarded.
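+    </p><p>
+     As a minimal sketch of the last two rules (the lexeme and position
+     values are chosen only for illustration), an out-of-range position
+     should be clamped and a repeated position should be kept only once:
+
+</p><pre class="programlisting">
+-- 20000 exceeds the maximum position, so it is stored as 16383
+SELECT 'cat:20000'::tsvector;
+
+-- position 3 is given twice for the same lexeme; only one copy is kept
+SELECT 'cat:3 cat:3 cat:5'::tsvector;
+</pre><p>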
+    </p><p>
+     Lexemes that have positions can further be labeled with a
+     <em class="firstterm">weight</em>, which can be <code class="literal">A</code>,
+     <code class="literal">B</code>, <code class="literal">C</code>, or <code class="literal">D</code>.
+     <code class="literal">D</code> is the default and hence is not shown on output:
+
+</p><pre class="programlisting">
+SELECT 'a:1A fat:2B,4C cat:5D'::tsvector;
+          tsvector
+----------------------------
+ 'a':1A 'cat':5 'fat':2B,4C
+</pre><p>
+
+     Weights are typically used to reflect document structure, for example
+     by marking title words differently from body words.  Text search
+     ranking functions can assign different priorities to the different
+     weight markers.
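+    </p><p>
+     Weights do not have to be typed into the literal by hand.  As a
+     sketch, assuming the <code class="function">setweight</code> function
+     listed in <a class="xref" href="functions-textsearch.html" title="9.13. Text Search Functions and Operators">Section 9.13</a>,
+     every position in an existing value can be labeled in one step (the
+     input value here is illustrative only):
+
+</p><pre class="programlisting">
+-- label both positions with weight A, e.g. for words taken from a title;
+-- the result is expected to read 'fat':2A 'rat':3A
+SELECT setweight('fat:2 rat:3'::tsvector, 'A');
+</pre><p>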
+    </p><p>
+     It is important to understand that the
+     <code class="type">tsvector</code> type itself does not perform any word
+     normalization; it assumes the words it is given are normalized
+     appropriately for the application.  For example,
+
+</p><pre class="programlisting">
+SELECT 'The Fat Rats'::tsvector;
+      tsvector
+--------------------
+ 'Fat' 'Rats' 'The'
+</pre><p>
+
+     For most English-text-searching applications the above words would
+     be considered non-normalized, but <code class="type">tsvector</code> doesn't care.
+     Raw document text should usually be passed through
+     <code class="function">to_tsvector</code> to normalize the words appropriately
+     for searching:
+
+</p><pre class="programlisting">
+SELECT to_tsvector('english', 'The Fat Rats');
+   to_tsvector
+-----------------
+ 'fat':2 'rat':3
+</pre><p>
+
+     Again, see <a class="xref" href="textsearch.html" title="Chapter 12. Full Text Search">Chapter 12</a> for more detail.
+    </p></div><div class="sect2" id="DATATYPE-TSQUERY"><div class="titlepage"><div><div><h3 class="title">8.11.2. <code class="type">tsquery</code></h3></div></div></div><a id="id-1.5.7.19.6.2" class="indexterm"></a><p>
+     A <code class="type">tsquery</code> value stores lexemes that are to be
+     searched for, and can combine them using the Boolean operators
+     <code class="literal">&amp;</code> (AND), <code class="literal">|</code> (OR), and
+     <code class="literal">!</code> (NOT), as well as the phrase search operator
+     <code class="literal">&lt;-&gt;</code> (FOLLOWED BY).  There is also a variant
+     <code class="literal">&lt;<em class="replaceable"><code>N</code></em>&gt;</code> of the FOLLOWED BY
+     operator, where <em class="replaceable"><code>N</code></em> is an integer constant that
+     specifies the distance between the two lexemes being searched
+     for.  <code class="literal">&lt;-&gt;</code> is equivalent to <code class="literal">&lt;1&gt;</code>.
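+    </p><p>
+     The distance semantics can be sketched with the
+     <code class="literal">@@</code> match operator (see
+     <a class="xref" href="functions-textsearch.html" title="9.13. Text Search Functions and Operators">Section 9.13</a>);
+     the sample values below are illustrative only:
+
+</p><pre class="programlisting">
+-- adjacent lexemes (positions 1 and 2) satisfy &lt;-&gt;: expected t
+SELECT 'fat:1 cat:2'::tsvector @@ 'fat &lt;-&gt; cat'::tsquery;
+
+-- lexemes two positions apart do not satisfy &lt;-&gt;: expected f
+SELECT 'fat:1 big:2 cat:3'::tsvector @@ 'fat &lt;-&gt; cat'::tsquery;
+
+-- but they do satisfy &lt;2&gt;: expected t
+SELECT 'fat:1 big:2 cat:3'::tsvector @@ 'fat &lt;2&gt; cat'::tsquery;
+</pre><p>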
+    </p><p>
+     Parentheses can be used to enforce grouping of these operators.
+     In the absence of parentheses, <code class="literal">!</code> (NOT) binds most tightly,
+     <code class="literal">&lt;-&gt;</code> (FOLLOWED BY) next most tightly, then
+     <code class="literal">&amp;</code> (AND), with <code class="literal">|</code> (OR) binding
+     the least tightly.
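+    </p><p>
+     As a small sketch of how precedence changes the result (the
+     one-lexeme document is illustrative only), compare an unparenthesized
+     query with an explicitly grouped one:
+
+</p><pre class="programlisting">
+-- parsed as fat | (rat &amp; cat), so a document containing only fat matches: expected t
+SELECT 'fat'::tsvector @@ 'fat | rat &amp; cat'::tsquery;
+
+-- with explicit grouping, cat is required as well: expected f
+SELECT 'fat'::tsvector @@ '(fat | rat) &amp; cat'::tsquery;
+</pre><p>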
+    </p><p>
+     Here are some examples:
+
+</p><pre class="programlisting">
+SELECT 'fat &amp; rat'::tsquery;
+    tsquery
+---------------
+ 'fat' &amp; 'rat'
+
+SELECT 'fat &amp; (rat | cat)'::tsquery;
+          tsquery
+---------------------------
+ 'fat' &amp; ( 'rat' | 'cat' )
+
+SELECT 'fat &amp; rat &amp; ! cat'::tsquery;
+        tsquery
+------------------------
+ 'fat' &amp; 'rat' &amp; !'cat'
+</pre><p>
+    </p><p>
+     Optionally, lexemes in a <code class="type">tsquery</code> can be labeled with
+     one or more weight letters, which restricts them to match only
+     <code class="type">tsvector</code> lexemes with one of those weights:
+
+</p><pre class="programlisting">
+SELECT 'fat:ab &amp; cat'::tsquery;
+    tsquery
+------------------
+ 'fat':AB &amp; 'cat'
+</pre><p>
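+     As a sketch of the restriction in action (the sample values are
+     illustrative only), the same query matches or fails depending on the
+     weight attached to <code class="literal">fat</code> in the
+     <code class="type">tsvector</code>:
+
+</p><pre class="programlisting">
+-- fat carries weight A, one of the permitted weights A and B: expected t
+SELECT 'fat:1A cat:2'::tsvector @@ 'fat:ab &amp; cat'::tsquery;
+
+-- fat carries only weight C, which is not permitted: expected f
+SELECT 'fat:1C cat:2'::tsvector @@ 'fat:ab &amp; cat'::tsquery;
+</pre><p>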
+    </p><p>
+     Also, lexemes in a <code class="type">tsquery</code> can be labeled with <code class="literal">*</code>
+     to specify prefix matching:
+</p><pre class="programlisting">
+SELECT 'super:*'::tsquery;
+  tsquery
+-----------
+ 'super':*
+</pre><p>
+     This query will match any word in a <code class="type">tsvector</code> that begins
+     with <span class="quote">“<span class="quote">super</span>”</span>.
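+    </p><p>
+     For instance, as a minimal sketch with illustrative values:
+
+</p><pre class="programlisting">
+-- supernova begins with super: expected t
+SELECT 'supernova:1'::tsvector @@ 'super:*'::tsquery;
+
+-- sup is shorter than the prefix and does not begin with it: expected f
+SELECT 'sup:1'::tsvector @@ 'super:*'::tsquery;
+</pre><p>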
+    </p><p>
+     Quoting rules for lexemes are the same as described previously for
+     lexemes in <code class="type">tsvector</code>; and, as with <code class="type">tsvector</code>,
+     any required normalization of words must be done before converting
+     to the <code class="type">tsquery</code> type.  The <code class="function">to_tsquery</code>
+     function is convenient for performing such normalization:
+
+</p><pre class="programlisting">
+SELECT to_tsquery('Fat:ab &amp; Cats');
+    to_tsquery
+------------------
+ 'fat':AB &amp; 'cat'
+</pre><p>
+
+     Note that <code class="function">to_tsquery</code> normalizes a prefix term in the same way
+     as any other word before the prefix match is applied, which means this comparison returns true:
+
+</p><pre class="programlisting">
+SELECT to_tsvector( 'postgraduate' ) @@ to_tsquery( 'postgres:*' );
+ ?column?
+----------
+ t
+</pre><p>
+     because <code class="literal">postgres</code> gets stemmed to <code class="literal">postgr</code>:
+</p><pre class="programlisting">
+SELECT to_tsvector( 'postgraduate' ), to_tsquery( 'postgres:*' );
+  to_tsvector  | to_tsquery
+---------------+------------
+ 'postgradu':1 | 'postgr':*
+</pre><p>
+     which will match the stemmed form of <code class="literal">postgraduate</code>.
+    </p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="datatype-bit.html" title="8.10. Bit String Types">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="datatype.html" title="Chapter 8. Data Types">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="datatype-uuid.html" title="8.12. UUID Type">Next</a></td></tr><tr><td width="40%" align="left" valign="top">8.10. Bit String Types </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 15.5 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 8.12. <acronym class="acronym">UUID</acronym> Type</td></tr></table></div></body></html>
+\ No newline at end of file