<!-- doc/src/sgml/dict-xsyn.sgml -->

<sect1 id="dict-xsyn" xreflabel="dict_xsyn">
 <title>dict_xsyn</title>

 <indexterm zone="dict-xsyn">
  <primary>dict_xsyn</primary>
 </indexterm>

 <para>
  <filename>dict_xsyn</filename> (Extended Synonym Dictionary) is an example of an
  add-on dictionary template for full-text search.  This dictionary type
  replaces words with groups of their synonyms, and so makes it possible to
  search for a word using any of its synonyms.
 </para>

 <sect2>
  <title>Configuration</title>

  <para>
   A <literal>dict_xsyn</literal> dictionary accepts the following options:
  </para>
  <itemizedlist>
   <listitem>
    <para>
     <literal>matchorig</literal> controls whether the original word is accepted by
     the dictionary. Default is <literal>true</literal>.
    </para>
   </listitem>
   <listitem>
    <para>
     <literal>matchsynonyms</literal> controls whether the synonyms are
     accepted by the dictionary. Default is <literal>false</literal>.
    </para>
   </listitem>
   <listitem>
    <para>
     <literal>keeporig</literal> controls whether the original word is included in
     the dictionary's output. Default is <literal>true</literal>.
    </para>
   </listitem>
   <listitem>
    <para>
     <literal>keepsynonyms</literal> controls whether the synonyms are included in
     the dictionary's output. Default is <literal>true</literal>.
    </para>
   </listitem>
   <listitem>
    <para>
     <literal>rules</literal> is the base name of the file containing the list of
     synonyms.  This file must be stored in
     <filename>$SHAREDIR/tsearch_data/</filename> (where <literal>$SHAREDIR</literal> means
     the <productname>PostgreSQL</productname> installation's shared-data directory).
     Its name must end in <literal>.rules</literal> (which is not to be included in
     the <literal>rules</literal> parameter).
    </para>
   </listitem>
  </itemizedlist>
  <para>
   The rules file has the following format:
  </para>
  <itemizedlist>
   <listitem>
    <para>
     Each line represents a group of synonyms for a single word, which is
     given first on the line. Synonyms are separated by whitespace, thus:
<programlisting>
word syn1 syn2 syn3
</programlisting>
    </para>
   </listitem>
   <listitem>
    <para>
     The sharp (<literal>#</literal>) sign is a comment delimiter. It may appear at
     any position in a line.  The rest of the line will be skipped.
    </para>
   </listitem>
  </itemizedlist>

  <para>
   For an example, see <filename>xsyn_sample.rules</filename>, which is
   installed in <filename>$SHAREDIR/tsearch_data/</filename>.
  </para>
 </sect2>

 <sect2>
  <title>Usage</title>

  <para>
   Installing the <literal>dict_xsyn</literal> extension creates a text search
   template <literal>xsyn_template</literal> and a dictionary <literal>xsyn</literal>
   based on it, with default parameters.  You can alter the
   parameters, for example

<programlisting>
mydb=# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=false);
ALTER TEXT SEARCH DICTIONARY
</programlisting>

   or create new dictionaries based on the template.
  </para>
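
  <para>
   As a minimal sketch of the second approach, the following creates a
   separate dictionary from the template instead of altering
   <literal>xsyn</literal>.  The dictionary name <literal>my_xsyn</literal>
   and the rules file <filename>my_rules.rules</filename> are hypothetical;
   the file is assumed to already exist in
   <filename>$SHAREDIR/tsearch_data/</filename>.

<programlisting>
-- Install the extension if it is not already present.
CREATE EXTENSION IF NOT EXISTS dict_xsyn;

-- Hypothetical dictionary that accepts synonyms as well as original words.
CREATE TEXT SEARCH DICTIONARY my_xsyn (
    TEMPLATE = xsyn_template,
    RULES = 'my_rules',
    MATCHSYNONYMS = true
);
</programlisting>

   Options that are not specified keep their default values.
  </para>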

  <para>
   To test the dictionary, you can try:

<programlisting>
mydb=# SELECT ts_lexize('xsyn', 'word');
      ts_lexize
-----------------------
 {syn1,syn2,syn3}

mydb=# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=true);
ALTER TEXT SEARCH DICTIONARY

mydb=# SELECT ts_lexize('xsyn', 'word');
      ts_lexize
-----------------------
 {word,syn1,syn2,syn3}

mydb=# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=false, MATCHSYNONYMS=true);
ALTER TEXT SEARCH DICTIONARY

mydb=# SELECT ts_lexize('xsyn', 'syn1');
      ts_lexize
-----------------------
 {syn1,syn2,syn3}

mydb=# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=true, MATCHORIG=false, KEEPSYNONYMS=false);
ALTER TEXT SEARCH DICTIONARY

mydb=# SELECT ts_lexize('xsyn', 'syn1');
      ts_lexize
-----------------------
 {word}
</programlisting>

   Real-world usage will involve including it in a text search
   configuration as described in <xref linkend="textsearch"/>.
   That might look like this:

<programlisting>
ALTER TEXT SEARCH CONFIGURATION english
    ALTER MAPPING FOR word, asciiword WITH xsyn, english_stem;
</programlisting>

  </para>
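
  <para>
   To experiment without modifying the built-in <literal>english</literal>
   configuration, one option is to work on a copy.  The following is a
   sketch only; the configuration name <literal>my_english</literal> is
   hypothetical, and the <literal>xsyn</literal> dictionary is assumed to
   be installed and configured as shown earlier.

<programlisting>
-- Copy the built-in configuration so the original stays untouched.
CREATE TEXT SEARCH CONFIGURATION my_english (COPY = english);

-- Consult xsyn first, then fall back to the English stemmer.
ALTER TEXT SEARCH CONFIGURATION my_english
    ALTER MAPPING FOR word, asciiword WITH xsyn, english_stem;

-- Show which dictionary recognized each token and the lexemes it produced.
SELECT alias, token, dictionary, lexemes
  FROM ts_debug('my_english', 'word');
</programlisting>
  </para>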
 </sect2>

</sect1>