Lucene Full Text Search Indexing
================================
*NOTE*: Although the fts-lucene plugin works, it uses the CLucene library, which
is very old and has some bugs. It's a much better idea to use <fts-solr>
[Plugins.FTS.Solr.txt] instead, which has many more features and is more
stable.
Requires Dovecot v2.1+ to work properly. The CLucene version must be v2.3 (not
v0.9). Dovecot builds only a single Lucene index for all mailboxes. The Lucene
indexes are stored in the 'lucene-indexes/' directory under the mail root index
directory (e.g. '~/Maildir/lucene-indexes/').
Compilation
-----------
If you compile Dovecot yourself, you must add the following switches to your
configure command for the plugin to be built:
---%<-------------------------------------------------------------------------
--with-lucene --with-stemmer
---%<-------------------------------------------------------------------------
The second switch is only required if you have compiled libstemmer yourself or
if it's included in the CLucene you are using.
Configuration
-------------
In 10-mail.conf (note: keep any existing plugins in the string):
---%<-------------------------------------------------------------------------
mail_plugins = $mail_plugins fts fts_lucene
---%<-------------------------------------------------------------------------
In 90-plugins.conf:
---%<-------------------------------------------------------------------------
plugin {
  fts = lucene
  # Lucene-specific settings, good ones are:
  fts_lucene = whitespace_chars=@.
}
---%<-------------------------------------------------------------------------
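To get a feel for what 'whitespace_chars=@.' in the block above does, the
translation of the listed characters to whitespace can be imitated with 'tr'.
This is only an illustration of the effect on a single address. It is not how
Dovecot or CLucene tokenizes mail internally, and 'split_words' is a
hypothetical helper name made up for this sketch.

```shell
#!/bin/sh
# Illustration only: imitates the described effect of whitespace_chars=@.
# It does not call Dovecot or CLucene.

# split_words is a hypothetical helper, not part of Dovecot.
split_words() {
    # Translate '@' and '.' to spaces, so an address becomes separate words.
    printf '%s\n' "$1" | tr '@.' '  '
}

split_words 'first.last@example.org'
# prints: first last example org
```

With the characters translated, each part of the address becomes a word of
its own, which is why the parts can then be searched for separately.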
The fts-lucene settings include:
* whitespace_chars=<chars>: List of characters that are translated to
  whitespace. You may want to use "@." so that e.g. 'first.last@example.org'
  isn't treated as a single word; instead you can search separately for
  "first", "last" and "example".
* default_language=<lang>: Default stemming language to use for mails. The
  default is English. Requires that Dovecot is built with libstemmer, which
  also limits the languages that are supported.
* textcat_conf=<path> textcat_dir=<path>: If specified, enable guessing the
  stemming language for emails and search keywords. This is somewhat
  problematic in practice: the language guessed while indexing may differ
  from the one guessed while searching, and then even exact words may not be
  found because they stem differently.
* no_snowball: Support normalization of indexed words even without stemming
  and libstemmer (Snowball). (v2.2.3+)
* mime_parts: Index each MIME part separately and include the MIME part number
  in the "part" field. In future versions this will allow showing which
  attachment matched the search result. (v2.2.13+)
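As a rough sketch of how more than one of these settings might be given at
once: the fragment below assumes that 'fts_lucene' takes a space-separated
list of settings, which this page does not state explicitly. The particular
combination is only an illustration, not a recommendation.

```
plugin {
  fts = lucene
  # Assumption: multiple settings are separated by spaces.
  # mime_parts needs v2.2.13+.
  fts_lucene = whitespace_chars=@. mime_parts
}
```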
Libraries
---------
* CLucene [http://sourceforge.net/projects/clucene/files/]: Get v2.3.3.4 (not
  v0.9)
* libstemmer [http://snowball.tartarus.org/download.php]: Builds libstemmer.o,
  which you can rename to libstemmer.a
* textcat [http://textcat.sourceforge.net/]
(This file was created from the wiki on 2019-06-19 12:42)