summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/html/gin-tips.html
blob: 8d57cec5202a57b7962a3c65b802238d6fa61126 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>70.5. GIN Tips and Tricks</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="gin-implementation.html" title="70.4. Implementation" /><link rel="next" href="gin-limit.html" title="70.6. Limitations" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">70.5. GIN Tips and Tricks</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="gin-implementation.html" title="70.4. Implementation">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="gin.html" title="Chapter 70. GIN Indexes">Up</a></td><th width="60%" align="center">Chapter 70. GIN Indexes</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 15.7 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="gin-limit.html" title="70.6. Limitations">Next</a></td></tr></table><hr /></div><div class="sect1" id="GIN-TIPS"><div class="titlepage"><div><div><h2 class="title" style="clear: both">70.5. GIN Tips and Tricks</h2></div></div></div><div class="variablelist"><dl class="variablelist"><dt><span class="term">Create vs. insert</span></dt><dd><p>
     Insertion into a <acronym class="acronym">GIN</acronym> index can be slow
     due to the likelihood of many keys being inserted for each item.
     So, for bulk insertions into a table it is advisable to drop the GIN
     index and recreate it after finishing bulk insertion.
    </p><p>
     When <code class="literal">fastupdate</code> is enabled for <acronym class="acronym">GIN</acronym>
     (see <a class="xref" href="gin-implementation.html#GIN-FAST-UPDATE" title="70.4.1. GIN Fast Update Technique">Section 70.4.1</a> for details), the penalty is
     less than when it is not.  But for very large updates it may still be
     best to drop and recreate the index.
    </p></dd><dt><span class="term"><a class="xref" href="runtime-config-resource.html#GUC-MAINTENANCE-WORK-MEM">maintenance_work_mem</a></span></dt><dd><p>
     Build time for a <acronym class="acronym">GIN</acronym> index is very sensitive to
     the <code class="varname">maintenance_work_mem</code> setting; it doesn't pay to
     skimp on work memory during index creation.
    </p></dd><dt><span class="term"><a class="xref" href="runtime-config-client.html#GUC-GIN-PENDING-LIST-LIMIT">gin_pending_list_limit</a></span></dt><dd><p>
     During a series of insertions into an existing <acronym class="acronym">GIN</acronym>
     index that has <code class="literal">fastupdate</code> enabled, the system will clean up
     the pending-entry list whenever the list grows larger than
     <code class="varname">gin_pending_list_limit</code>. To avoid fluctuations in observed
     response time, it's desirable to have pending-list cleanup occur in the
     background (i.e., via autovacuum).  Foreground cleanup operations
     can be avoided by increasing <code class="varname">gin_pending_list_limit</code>
     or making autovacuum more aggressive.
     However, enlarging the threshold of the cleanup operation means that
     if a foreground cleanup does occur, it will take even longer.
    </p><p>
     <code class="varname">gin_pending_list_limit</code> can be overridden for individual
     GIN indexes by changing storage parameters, which allows each
     GIN index to have its own cleanup threshold.
     For example, it's possible to increase the threshold only for the GIN
     index which can be updated heavily, and decrease it otherwise.
    </p></dd><dt><span class="term"><a class="xref" href="runtime-config-client.html#GUC-GIN-FUZZY-SEARCH-LIMIT">gin_fuzzy_search_limit</a></span></dt><dd><p>
     The primary goal of developing <acronym class="acronym">GIN</acronym> indexes was
     to create support for highly scalable full-text search in
     <span class="productname">PostgreSQL</span>, and there are often situations when
     a full-text search returns a very large set of results.  Moreover, this
     often happens when the query contains very frequent words, so that the
     large result set is not even useful.  Since reading many
     tuples from the disk and sorting them could take a lot of time, this is
     unacceptable for production.  (Note that the index search itself is very
     fast.)
    </p><p>
     To facilitate controlled execution of such queries,
     <acronym class="acronym">GIN</acronym> has a configurable soft upper limit on the
     number of rows returned: the
     <code class="varname">gin_fuzzy_search_limit</code> configuration parameter.
     It is set to 0 (meaning no limit) by default.
     If a non-zero limit is set, then the returned set is a subset of
     the whole result set, chosen at random.
    </p><p>
     <span class="quote"><span class="quote">Soft</span></span> means that the actual number of returned results
     could differ somewhat from the specified limit, depending on the query
     and the quality of the system's random number generator.
    </p><p>
     From experience, values in the thousands (e.g., 5000 — 20000)
     work well.
    </p></dd></dl></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="gin-implementation.html" title="70.4. Implementation">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="gin.html" title="Chapter 70. GIN Indexes">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="gin-limit.html" title="70.6. Limitations">Next</a></td></tr><tr><td width="40%" align="left" valign="top">70.4. Implementation </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 15.7 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 70.6. Limitations</td></tr></table></div></body></html>