Diffstat (limited to 'doc/wiki/Plugins.FTS.txt')
-rw-r--r-- doc/wiki/Plugins.FTS.txt 102
1 files changed, 102 insertions, 0 deletions
diff --git a/doc/wiki/Plugins.FTS.txt b/doc/wiki/Plugins.FTS.txt
new file mode 100644
index 0000000..e960b4c
--- /dev/null
+++ b/doc/wiki/Plugins.FTS.txt
@@ -0,0 +1,102 @@
+Full text search indexing
+=========================
+
+The following FTS indexers (in preferred order) are supported:
+
+ * <Solr> [Plugins.FTS.Solr.txt] communicates with Lucene's Solr server
+ [http://lucene.apache.org/solr/].
+ * <Lucene> [Plugins.FTS.Lucene.txt] uses Lucene's C++ library. (Requires
+ v2.1+)
+ * <fts-dovecot> [Plugins.FTS.Dovecot.txt] is Dovecot Pro's new search index,
+ and is not available without commercial agreement. (Requires v2.2+)
+ * <Squat> [Plugins.FTS.Squat.txt] is Dovecot's own search index. (Obsolete in
+ v2.1+)
+ * fts-xapian [https://github.com/grosjo/fts-xapian] is a Xapian
+ [https://xapian.org] based plugin maintained by '<jom AT NOSPAM grosjo DOT
+ net>'. (Requires v2.3+)
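+
+The backend-specific pages linked above describe the full setup. As a minimal
+sketch of how one of these indexers might be enabled, assuming the Solr backend
+(the host name 'solr.example.org' is a hypothetical placeholder, and the exact
+settings may differ between versions):
+
+---%<-------------------------------------------------------------------------
+# Load the generic FTS plugin plus the chosen backend plugin
+mail_plugins = $mail_plugins fts fts_solr
+
+plugin {
+  # Select the backend (assumption: Solr)
+  fts = solr
+  # Placeholder URL, replace with the real Solr server
+  fts_solr = url=http://solr.example.org:8983/solr/
+}
+---%<-------------------------------------------------------------------------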
+
+Indexing
+--------
+
+By default the FTS indexes are updated *only* while searching, so neither the
+<LDA.txt> nor an IMAP APPEND command updates the indexes immediately. This
+means that if a user has received a lot of mail since the last indexing (i.e.
+the last search operation), it may take a while to index all the mails before
+replying to the search command. Dovecot sends periodic "* OK Indexed n% of the
+mailbox" updates, which webmail implementations can catch to implement a
+progress bar.
+
+In v2.2.9+ the indexing can be done automatically with the 'fts_autoindex=yes'
+setting (see below).
+
+Indexing can be run manually (e.g. from a cron job) or by an LDA script with:
+
+ * v2.1+: 'doveadm index -u user@domain -q INBOX'
+ * v2.0: 'printf "a select INBOX\nb search text xyzzy\nc logout\n" |
+ /usr/local/libexec/dovecot/imap -u user@domain'
+
+In both commands, replace INBOX with whichever mailbox needs to be indexed.
+
+Indexing Attachments (v2.1+)
+----------------------------
+
+Attachments can be indexed either via a script that translates the attachment
+to UTF-8 plaintext or via an Apache Tika server.
+
+ * 'fts_decoder = <service>': Decode attachments to plaintext using this
+ service and index the resulting plaintext. See the 'decode2text.sh' script
+ included in Dovecot for how to use this. (v2.1+)
+ * 'fts_tika = http://tikahost:9998/tika/': An Apache Tika server must be
+ running at this URL (e.g. started with 'java -jar
+ tika-server/target/tika-server-1.5.jar'). (v2.2.13+)
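+
+As a rough sketch of how these options could look in a configuration: the
+service name 'decode2text' and the script path below are assumptions that
+depend on the installation, and the two settings are shown as alternatives
+with the Tika one commented out.
+
+---%<-------------------------------------------------------------------------
+plugin {
+  # Alternative 1: decode attachments through a script service (v2.1+)
+  fts_decoder = decode2text
+  # Alternative 2: send attachments to an Apache Tika server (v2.2.13+)
+  #fts_tika = http://tikahost:9998/tika/
+}
+
+# Service that runs the decoder script (path is an assumption)
+service decode2text {
+  executable = script /usr/local/libexec/dovecot/decode2text.sh
+  user = dovecot
+  unix_listener decode2text {
+    mode = 0666
+  }
+}
+---%<-------------------------------------------------------------------------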
+
+Rescan (v2.1+)
+--------------
+
+Since v2.1 Dovecot keeps track of indexed messages in the dovecot.index files.
+If this record gets out of sync with the actual FTS indexes (either too many or
+too few mails), you'll need to do a rescan:
+
+---%<-------------------------------------------------------------------------
+doveadm fts rescan -u user@domain
+---%<-------------------------------------------------------------------------
+
+Other Settings
+--------------
+
+All the FTS settings go inside the 'plugin {}' section of 90-plugin.conf.
+
+ * 'fts_autoindex=yes': Index new messages immediately after they've been
+ saved/copied. (v2.2.9+)
+ * 'fts_autoindex_exclude=pattern1', 'fts_autoindex_exclude2=pattern2', ...:
+ Exclude given mailboxes, one pattern per setting. Supports "*" and "?"
+ wildcards. If a name starts with '\', it's treated as a case-insensitive
+ special-use flag. (v2.2.25+)
+ * Example:
+
+ ---%<-------------------------------------------------------------------
+ plugin {
+ fts_autoindex_exclude = \Junk
+ fts_autoindex_exclude2 = \Trash
+ fts_autoindex_exclude3 = DUMPSTER
+ }
+ ---%<-------------------------------------------------------------------
+
+ * 'fts_autoindex_max_recent_msgs=n': Skip autoindexing the mailbox if it has
+ more than n \Recent messages (implying that the mailbox is never actually
+ being accessed). (v2.2.9+)
+ * 'fts_enforced':
+ * no (default): All body searches will index all missing mails in FTS.
+ Header searches will use FTS if the mails are indexed, otherwise fall back
+ to parsing the headers (usually from dovecot.index.cache). If the FTS
+ search fails, fall back to reading and parsing all mails.
+ * yes: All header and body searches will index all missing mails in FTS. If
+ the FTS search fails, an error is returned to the client.
+ * 'fts_index_timeout': When SEARCH notices that the index isn't up to date, it
+ tells the indexer to index the mails and waits until it is finished. This
+ setting adds a maximum timeout to this wait. If the timeout is reached, the
+ SEARCH fails with: 'NO [INUSE] Timeout while waiting for indexing to finish'
+ (v2.1+)
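+
+To show how the settings above fit together, here is a sketch of a combined
+'plugin {}' section. The values (999 messages, 60 seconds) are illustrative
+assumptions rather than recommended defaults:
+
+---%<-------------------------------------------------------------------------
+plugin {
+  # Index new mails as soon as they are saved/copied (v2.2.9+)
+  fts_autoindex = yes
+  # Skip autoindexing mailboxes that nobody appears to be reading
+  fts_autoindex_max_recent_msgs = 999
+  # Return an error to the client instead of reading and parsing all mails
+  fts_enforced = yes
+  # Give up waiting for the indexer after this long (illustrative value)
+  fts_index_timeout = 60s
+}
+---%<-------------------------------------------------------------------------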
+
+(This file was created from the wiki on 2019-06-19 12:42)