Diffstat (limited to 'doc/src/sgml/dict-int.sgml')
-rw-r--r-- | doc/src/sgml/dict-int.sgml | 101 |
1 files changed, 101 insertions, 0 deletions
diff --git a/doc/src/sgml/dict-int.sgml b/doc/src/sgml/dict-int.sgml
new file mode 100644
index 0000000..8dd07b9
--- /dev/null
+++ b/doc/src/sgml/dict-int.sgml
@@ -0,0 +1,101 @@
+<!-- doc/src/sgml/dict-int.sgml -->
+
+<sect1 id="dict-int" xreflabel="dict_int">
+ <title>dict_int &mdash;
+   example full-text search dictionary for integers</title>
+
+ <indexterm zone="dict-int">
+  <primary>dict_int</primary>
+ </indexterm>
+
+ <para>
+  <filename>dict_int</filename> is an example of an add-on dictionary template
+  for full-text search. The motivation for this example dictionary is to
+  control the indexing of integers (signed and unsigned), allowing such
+  numbers to be indexed while preventing excessive growth in the number of
+  unique words, which greatly affects the performance of searching.
+ </para>
+
+ <para>
+  This module is considered <quote>trusted</quote>, that is, it can be
+  installed by non-superusers who have <literal>CREATE</literal> privilege
+  on the current database.
+ </para>
+
+ <sect2 id="dict-int-config">
+  <title>Configuration</title>
+
+  <para>
+   The dictionary accepts three options:
+  </para>
+
+  <itemizedlist>
+   <listitem>
+    <para>
+     The <literal>maxlen</literal> parameter specifies the maximum number of
+     digits allowed in an integer word. The default value is 6.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     The <literal>rejectlong</literal> parameter specifies whether an overlength
+     integer should be truncated or ignored. If <literal>rejectlong</literal> is
+     <literal>false</literal> (the default), the dictionary returns the first
+     <literal>maxlen</literal> digits of the integer. If <literal>rejectlong</literal> is
+     <literal>true</literal>, the dictionary treats an overlength integer as a stop
+     word, so that it will not be indexed. Note that this also means that
+     such an integer cannot be searched for.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     The <literal>absval</literal> parameter specifies whether leading
+     <quote><literal>+</literal></quote> or <quote><literal>-</literal></quote>
+     signs should be removed from integer words. The default
+     is <literal>false</literal>. When <literal>true</literal>, the sign is
+     removed before <literal>maxlen</literal> is applied.
+    </para>
+   </listitem>
+  </itemizedlist>
+ </sect2>
+
+ <sect2 id="dict-int-usage">
+  <title>Usage</title>
+
+  <para>
+   Installing the <literal>dict_int</literal> extension creates a text search
+   template <literal>intdict_template</literal> and a dictionary <literal>intdict</literal>
+   based on it, with the default parameters. You can alter the
+   parameters, for example
+
+<programlisting>
+mydb# ALTER TEXT SEARCH DICTIONARY intdict (MAXLEN = 4, REJECTLONG = true);
+ALTER TEXT SEARCH DICTIONARY
+</programlisting>
+
+   or create new dictionaries based on the template.
+  </para>
+
+  <para>
+   To test the dictionary, you can try
+
+<programlisting>
+mydb# select ts_lexize('intdict', '12345678');
+ ts_lexize
+-----------
+ {123456}
+</programlisting>
+
+   but real-world usage will involve including it in a text search
+   configuration as described in <xref linkend="textsearch"/>.
+   That might look like this:
+
+<programlisting>
+ALTER TEXT SEARCH CONFIGURATION english
+    ALTER MAPPING FOR int, uint WITH intdict;
+</programlisting>
+
+  </para>
+ </sect2>
+
+</sect1>
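
The way the three options in the Configuration section interact can be modelled in a few lines of Python. This is only an illustrative sketch of the documented behaviour, not the extension's C implementation. The function name `lexize_int` is hypothetical, `None` stands in for "treated as a stop word", and the sketch assumes (beyond what the patch text states explicitly) that a sign kept under `absval = false` counts toward `maxlen`.

```python
from typing import Optional


def lexize_int(token: str, maxlen: int = 6, rejectlong: bool = False,
               absval: bool = False) -> Optional[str]:
    """Hypothetical model of the dict_int options.

    Returns the lexeme that would be indexed, or None when the token
    is treated as a stop word (rejectlong = true and token too long).
    """
    # absval: drop one leading sign before maxlen is applied.
    if absval and token[:1] in ("+", "-"):
        token = token[1:]

    if len(token) > maxlen:
        if rejectlong:
            # Overlength integers are ignored, so they are neither
            # indexed nor searchable.
            return None
        # Default behaviour: keep only the first maxlen characters.
        # Assumption: a retained sign occupies one of those positions.
        return token[:maxlen]

    return token


print(lexize_int("12345678"))                             # 123456
print(lexize_int("12345678", maxlen=4, rejectlong=True))  # None
print(lexize_int("-1234", absval=True))                   # 1234
```

The first call mirrors the `ts_lexize('intdict', '12345678')` example in the patch, which yields `{123456}` with the default parameters.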