summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/dict-xsyn.sgml
diff options
context:
space:
mode:
authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-16 19:46:48 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-16 19:46:48 +0000
commit311bcfc6b3acdd6fd152798c7f287ddf74fa2a98 (patch)
tree0ec307299b1dada3701e42f4ca6eda57d708261e /doc/src/sgml/dict-xsyn.sgml
parentInitial commit. (diff)
downloadpostgresql-15-311bcfc6b3acdd6fd152798c7f287ddf74fa2a98.tar.xz
postgresql-15-311bcfc6b3acdd6fd152798c7f287ddf74fa2a98.zip
Adding upstream version 15.4.upstream/15.4upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/src/sgml/dict-xsyn.sgml')
-rw-r--r--doc/src/sgml/dict-xsyn.sgml149
1 files changed, 149 insertions, 0 deletions
diff --git a/doc/src/sgml/dict-xsyn.sgml b/doc/src/sgml/dict-xsyn.sgml
new file mode 100644
index 0000000..256aff7
--- /dev/null
+++ b/doc/src/sgml/dict-xsyn.sgml
@@ -0,0 +1,149 @@
+<!-- doc/src/sgml/dict-xsyn.sgml -->
+
+<sect1 id="dict-xsyn" xreflabel="dict_xsyn">
+ <title>dict_xsyn</title>
+
+ <indexterm zone="dict-xsyn">
+ <primary>dict_xsyn</primary>
+ </indexterm>
+
+ <para>
+ <filename>dict_xsyn</filename> (Extended Synonym Dictionary) is an example of an
+ add-on dictionary template for full-text search. This dictionary type
+ replaces words with groups of their synonyms, and so makes it possible to
+ search for a word using any of its synonyms.
+ </para>
+
+ <sect2>
+ <title>Configuration</title>
+
+ <para>
+ A <literal>dict_xsyn</literal> dictionary accepts the following options:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <literal>matchorig</literal> controls whether the original word is accepted by
+ the dictionary. Default is <literal>true</literal>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>matchsynonyms</literal> controls whether the synonyms are
+ accepted by the dictionary. Default is <literal>false</literal>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>keeporig</literal> controls whether the original word is included in
+ the dictionary's output. Default is <literal>true</literal>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>keepsynonyms</literal> controls whether the synonyms are included in
+ the dictionary's output. Default is <literal>true</literal>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>rules</literal> is the base name of the file containing the list of
+ synonyms. This file must be stored in
+ <filename>$SHAREDIR/tsearch_data/</filename> (where <literal>$SHAREDIR</literal> means
+ the <productname>PostgreSQL</productname> installation's shared-data directory).
+ Its name must end in <literal>.rules</literal> (which is not to be included in
+ the <literal>rules</literal> parameter).
+ </para>
+ </listitem>
+ </itemizedlist>
+ <para>
+ The rules file has the following format:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Each line represents a group of synonyms for a single word, which is
+ given first on the line. Synonyms are separated by whitespace, thus:
+<programlisting>
+word syn1 syn2 syn3
+</programlisting>
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The sharp (<literal>#</literal>) sign is a comment delimiter. It may appear at
+ any position in a line. The rest of the line will be skipped.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ Look at <filename>xsyn_sample.rules</filename>, which is installed in
+ <filename>$SHAREDIR/tsearch_data/</filename>, for an example.
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Usage</title>
+
+ <para>
+ Installing the <literal>dict_xsyn</literal> extension creates a text search
+ template <literal>xsyn_template</literal> and a dictionary <literal>xsyn</literal>
+ based on it, with default parameters. You can alter the
+ parameters, for example
+
+<programlisting>
+mydb# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=false);
+ALTER TEXT SEARCH DICTIONARY
+</programlisting>
+
+ or create new dictionaries based on the template.
+ </para>
+
+ <para>
+ To test the dictionary, you can try
+
+<programlisting>
+mydb=# SELECT ts_lexize('xsyn', 'word');
+ ts_lexize
+-----------------------
+ {syn1,syn2,syn3}
+
+mydb# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=true);
+ALTER TEXT SEARCH DICTIONARY
+
+mydb=# SELECT ts_lexize('xsyn', 'word');
+ ts_lexize
+-----------------------
+ {word,syn1,syn2,syn3}
+
+mydb# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=false, MATCHSYNONYMS=true);
+ALTER TEXT SEARCH DICTIONARY
+
+mydb=# SELECT ts_lexize('xsyn', 'syn1');
+ ts_lexize
+-----------------------
+ {syn1,syn2,syn3}
+
+mydb# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=true, MATCHORIG=false, KEEPSYNONYMS=false);
+ALTER TEXT SEARCH DICTIONARY
+
+mydb=# SELECT ts_lexize('xsyn', 'syn1');
+ ts_lexize
+-----------------------
+ {word}
+</programlisting>
+
+ Real-world usage will involve including it in a text search
+ configuration as described in <xref linkend="textsearch"/>.
+ That might look like this:
+
+<programlisting>
+ALTER TEXT SEARCH CONFIGURATION english
+ ALTER MAPPING FOR word, asciiword WITH xsyn, english_stem;
+</programlisting>
+
+ </para>
+ </sect2>
+
+</sect1>