summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/html/textsearch-tables.html
diff options
context:
space:
mode:
Diffstat (limited to 'doc/src/sgml/html/textsearch-tables.html')
-rw-r--r--doc/src/sgml/html/textsearch-tables.html139
1 files changed, 139 insertions, 0 deletions
diff --git a/doc/src/sgml/html/textsearch-tables.html b/doc/src/sgml/html/textsearch-tables.html
new file mode 100644
index 0000000..6ce2688
--- /dev/null
+++ b/doc/src/sgml/html/textsearch-tables.html
@@ -0,0 +1,139 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>12.2. Tables and Indexes</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets V1.79.1" /><link rel="prev" href="textsearch-intro.html" title="12.1. Introduction" /><link rel="next" href="textsearch-controls.html" title="12.3. Controlling Text Search" /></head><body id="docContent" class="container-fluid col-10"><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">12.2. Tables and Indexes</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="textsearch-intro.html" title="12.1. Introduction">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="textsearch.html" title="Chapter 12. Full Text Search">Up</a></td><th width="60%" align="center">Chapter 12. Full Text Search</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 13.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="textsearch-controls.html" title="12.3. Controlling Text Search">Next</a></td></tr></table><hr></hr></div><div class="sect1" id="TEXTSEARCH-TABLES"><div class="titlepage"><div><div><h2 class="title" style="clear: both">12.2. Tables and Indexes</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="textsearch-tables.html#TEXTSEARCH-TABLES-SEARCH">12.2.1. Searching a Table</a></span></dt><dt><span class="sect2"><a href="textsearch-tables.html#TEXTSEARCH-TABLES-INDEX">12.2.2. Creating Indexes</a></span></dt></dl></div><p>
+ The examples in the previous section illustrated full text matching using
+ simple constant strings. This section shows how to search table data,
+ optionally using indexes.
+ </p><div class="sect2" id="TEXTSEARCH-TABLES-SEARCH"><div class="titlepage"><div><div><h3 class="title">12.2.1. Searching a Table</h3></div></div></div><p>
+ It is possible to do a full text search without an index. A simple query
+ to print the <code class="structname">title</code> of each row that contains the word
+ <code class="literal">friend</code> in its <code class="structfield">body</code> field is:
+
+</p><pre class="programlisting">
+SELECT title
+FROM pgweb
+WHERE to_tsvector('english', body) @@ to_tsquery('english', 'friend');
+</pre><p>
+
+ This will also find related words such as <code class="literal">friends</code>
+ and <code class="literal">friendly</code>, since all these are reduced to the same
+ normalized lexeme.
+ </p><p>
+ The query above specifies that the <code class="literal">english</code> configuration
+ is to be used to parse and normalize the strings. Alternatively we
+ could omit the configuration parameters:
+
+</p><pre class="programlisting">
+SELECT title
+FROM pgweb
+WHERE to_tsvector(body) @@ to_tsquery('friend');
+</pre><p>
+
+ This query will use the configuration set by <a class="xref" href="runtime-config-client.html#GUC-DEFAULT-TEXT-SEARCH-CONFIG">default_text_search_config</a>.
+ </p><p>
+ A more complex example is to
+ select the ten most recent documents that contain <code class="literal">create</code> and
+ <code class="literal">table</code> in the <code class="structname">title</code> or <code class="structname">body</code>:
+
+</p><pre class="programlisting">
+SELECT title
+FROM pgweb
+WHERE to_tsvector(title || ' ' || body) @@ to_tsquery('create &amp; table')
+ORDER BY last_mod_date DESC
+LIMIT 10;
+</pre><p>
+
+ For clarity we omitted the <code class="function">coalesce</code> function calls
+ which would be needed to find rows that contain <code class="literal">NULL</code>
+ in one of the two fields.
+ </p><p>
+ Although these queries will work without an index, most applications
+ will find this approach too slow, except perhaps for occasional ad-hoc
+ searches. Practical use of text searching usually requires creating
+ an index.
+ </p></div><div class="sect2" id="TEXTSEARCH-TABLES-INDEX"><div class="titlepage"><div><div><h3 class="title">12.2.2. Creating Indexes</h3></div></div></div><p>
+ We can create a <acronym class="acronym">GIN</acronym> index (<a class="xref" href="textsearch-indexes.html" title="12.9. GIN and GiST Index Types">Section 12.9</a>) to speed up text searches:
+
+</p><pre class="programlisting">
+CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector('english', body));
+</pre><p>
+
+ Notice that the 2-argument version of <code class="function">to_tsvector</code> is
+ used. Only text search functions that specify a configuration name can
+ be used in expression indexes (<a class="xref" href="indexes-expressional.html" title="11.7. Indexes on Expressions">Section 11.7</a>).
+ This is because the index contents must be unaffected by <a class="xref" href="runtime-config-client.html#GUC-DEFAULT-TEXT-SEARCH-CONFIG">default_text_search_config</a>. If they were affected, the
+ index contents might be inconsistent because different entries could
+ contain <code class="type">tsvector</code>s that were created with different text search
+ configurations, and there would be no way to guess which was which. It
+ would be impossible to dump and restore such an index correctly.
+ </p><p>
+ Because the two-argument version of <code class="function">to_tsvector</code> was
+ used in the index above, only a query reference that uses the 2-argument
+ version of <code class="function">to_tsvector</code> with the same configuration
+ name will use that index. That is, <code class="literal">WHERE
+ to_tsvector('english', body) @@ 'a &amp; b'</code> can use the index,
+ but <code class="literal">WHERE to_tsvector(body) @@ 'a &amp; b'</code> cannot.
+ This ensures that an index will be used only with the same configuration
+ used to create the index entries.
+ </p><p>
+ It is possible to set up more complex expression indexes wherein the
+ configuration name is specified by another column, e.g.:
+
+</p><pre class="programlisting">
+CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector(config_name, body));
+</pre><p>
+
+ where <code class="literal">config_name</code> is a column in the <code class="literal">pgweb</code>
+ table. This allows mixed configurations in the same index while
+ recording which configuration was used for each index entry. This
+ would be useful, for example, if the document collection contained
+ documents in different languages. Again,
+ queries that are meant to use the index must be phrased to match, e.g.,
+ <code class="literal">WHERE to_tsvector(config_name, body) @@ 'a &amp; b'</code>.
+ </p><p>
+ Indexes can even concatenate columns:
+
+</p><pre class="programlisting">
+CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector('english', title || ' ' || body));
+</pre><p>
+ </p><p>
+ Another approach is to create a separate <code class="type">tsvector</code> column
+ to hold the output of <code class="function">to_tsvector</code>. To keep this
+ column automatically up to date with its source data, use a stored
+ generated column. This example is a
+ concatenation of <code class="literal">title</code> and <code class="literal">body</code>,
+ using <code class="function">coalesce</code> to ensure that one field will still be
+ indexed when the other is <code class="literal">NULL</code>:
+
+</p><pre class="programlisting">
+ALTER TABLE pgweb
+ ADD COLUMN textsearchable_index_col tsvector
+ GENERATED ALWAYS AS (to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))) STORED;
+</pre><p>
+
+ Then we create a <acronym class="acronym">GIN</acronym> index to speed up the search:
+
+</p><pre class="programlisting">
+CREATE INDEX textsearch_idx ON pgweb USING GIN (textsearchable_index_col);
+</pre><p>
+
+ Now we are ready to perform a fast full text search:
+
+</p><pre class="programlisting">
+SELECT title
+FROM pgweb
+WHERE textsearchable_index_col @@ to_tsquery('create &amp; table')
+ORDER BY last_mod_date DESC
+LIMIT 10;
+</pre><p>
+ </p><p>
+ One advantage of the separate-column approach over an expression index
+ is that it is not necessary to explicitly specify the text search
+ configuration in queries in order to make use of the index. As shown
+ in the example above, the query can depend on
+ <code class="varname">default_text_search_config</code>. Another advantage is that
+ searches will be faster, since it will not be necessary to redo the
+ <code class="function">to_tsvector</code> calls to verify index matches. (This is more
+ important when using a GiST index than a GIN index; see <a class="xref" href="textsearch-indexes.html" title="12.9. GIN and GiST Index Types">Section 12.9</a>.) The expression-index approach is
+ simpler to set up, however, and it requires less disk space since the
+ <code class="type">tsvector</code> representation is not stored explicitly.
+ </p></div></div><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navfooter"><hr></hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="textsearch-intro.html" title="12.1. Introduction">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="textsearch.html" title="Chapter 12. Full Text Search">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="textsearch-controls.html" title="12.3. Controlling Text Search">Next</a></td></tr><tr><td width="40%" align="left" valign="top">12.1. Introduction </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 13.4 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 12.3. Controlling Text Search</td></tr></table></div></body></html> \ No newline at end of file