diff options
Diffstat (limited to 'doc/guide/admin/tuning.sdf')
-rw-r--r-- | doc/guide/admin/tuning.sdf | 206 |
1 files changed, 206 insertions, 0 deletions
diff --git a/doc/guide/admin/tuning.sdf b/doc/guide/admin/tuning.sdf new file mode 100644 index 0000000..f00984d --- /dev/null +++ b/doc/guide/admin/tuning.sdf @@ -0,0 +1,206 @@ +# $OpenLDAP$ +# Copyright 1999-2022 The OpenLDAP Foundation, All Rights Reserved. +# COPYING RESTRICTIONS APPLY, see COPYRIGHT. + +H1: Tuning + +This is perhaps one of the most important chapters in the guide, because if +you have not tuned {{slapd}}(8) correctly or grasped how to design your +directory and environment, you can expect very poor performance. + +Reading, understanding and experimenting using the instructions and information +in the following sections, will enable you to fully understand how to tailor +your directory server to your specific requirements. + +It should be noted that the following information has been collected over time +from our community based FAQ. So obviously the benefit of this real world experience +and advice should be of great value to the reader. + + +H2: Performance Factors + +Various factors can play a part in how your directory performs on your chosen +hardware and environment. We will attempt to discuss these here. + + +H3: Memory + +Scale your cache to use available memory and increase system memory if you can. + + +H3: Disks + +Use fast filesystems, and conduct your own testing to see which filesystem +types perform best with your workload. (On our own Linux testing, EXT2 and JFS +tend to provide better write performance than everything else, including +newer filesystems like EXT4, BTRFS, etc.) + +Use fast subsystems. Put each database on separate disks. + +H3: Network Topology + +http://www.openldap.org/faq/data/cache/363.html + +Drawing here. + + +H3: Directory Layout Design + +Reference to other sections and good/bad drawing here. + + +H3: Expected Usage + +Discussion. + + +H2: Indexes + +H3: Understanding how a search works + +If you're searching on a filter that has been indexed, then the search reads +the index and pulls exactly the entries that are referenced by the index. +If the filter term has not been indexed, then the search must read every single + entry in the target scope and test to see if each entry matches the filter. +Obviously indexing can save a lot of work when it's used correctly. + +In back-mdb, indexes can only track a certain number of entries per key (by +default that number is 2^16 = 65536). If more entries' values hash to this +key, some/all of them will have to be represented by a range of candidates, +making the index less useful over time as deletions cannot usually be tracked +accurately. + +H3: What to index + +As a general rule, to make any use of indexes, you must set up an equality +index on objectClass: + +> index objectClass eq + +Then you should create indices to match the actual filter terms used in +search queries. + +> index cn,sn,givenname,mail eq + +Each attribute index can be tuned further by selecting the set of index types to generate. For example, substring and approximate search for organizations (o) may make little sense (and isn't like done very often). And searching for {{userPassword}} likely makes no sense what so ever. + +General rule: don't go overboard with indexes. Unused indexes must be maintained and hence can only slow things down. + +See {{slapd.conf}}(5) and {{slapdindex}}(8) for more information + + +H3: Presence indexing + +If your client application uses presence filters and if the +target attribute exists on the majority of entries in your target scope, then +all of those entries are going to be read anyway, because they are valid +members of the result set. In a subtree where 100% of the +entries are going to contain the same attributes, the presence index does +absolutely NOTHING to benefit the search, because 100% of the entries match +that presence filter. As an example, setting a presence index on objectClass +provides no benefit since it is present on every entry. + +So the resource cost of generating the index is a +complete waste of CPU time, disk, and memory. Don't do it unless you know +that it will be used, and that the attribute in question occurs very +infrequently in the target data. + +Almost no applications use presence filters in their search queries. Presence +indexing is pointless when the target attribute exists on the majority of +entries in the database. In most LDAP deployments, presence indexing should +not be done, it's just wasted overhead. + +See the {{Logging}} section below on what to watch out for if you have a frequently searched +for attribute that is unindexed. + +H3: Equality indexing + +Similarly to presence indexes, equality indexes are most useful if the +values searched for are uncommon. Most OpenLDAP indexes work by hashing +the normalised value and using the hash as the key. Hashing behaviour +depends on the matching rule syntax, some matching rules also implement +indexers that help speed up inequality (lower than, ...) queries. + +Check the documentation and other parts of this guide if some indexes are +mandatory - e.g. to enable replication, it is expected you index certain +operational attributes, likewise if you rely on filters in ACL processing. + +Approximate indexes are usually identical to equality indexes unless +a matching rule explicitly implements it. As of OpenLDAP 2.5, only +directoryStringApproxMatch and IA5StringApproxMatch matchers +and indexers are implemented, currently using soundex or metaphone, with +metaphone being the default. + +H3: Substring indexing + +Substring indexes work on splitting the value into short chunks and then +indexing those in a similar way to how equality index does. The storage +space needed to store all of this data is analogous to the amount of data +being indexed, which makes the indexes extremely heavy-handed in most +scenarios. + + +H2: Logging + +H3: What log level to use + +The default of {{loglevel stats}} (256) is really the best bet. There's a corollary to +this when problems *do* arise, don't try to trace them using syslog. +Use the debug flag instead, and capture slapd's stderr output. syslog is too +slow for debug tracing, and it's inherently lossy - it will throw away messages when it +can't keep up. See {{slapd.conf}}(5) or {{slapd-config}}(5) for more information on +how to configure the loglevel. + +Contrary to popular belief, {{loglevel 0}} is not ideal for production as you +won't be able to track when problems first arise. + +H3: What to watch out for + +The most common message you'll see that you should pay attention to is: + +> "<= mdb_equality_candidates: (foo) index_param failed (18)" + +That means that some application tried to use an equality filter ({{foo=<somevalue>}}) +and attribute {{foo}} does not have an equality index. If you see a lot of these +messages, you should add the index. If you see one every month or so, it may +be acceptable to ignore it. + +The default syslog level is stats (256) which logs the basic parameters of each +request; it usually produces 1-3 lines of output. On Solaris and systems that +only provide synchronous syslog, you may want to turn it off completely, but +usually you want to leave it enabled so that you'll be able to see index +messages whenever they arise. On Linux you can configure syslogd to run +asynchronously, in which case the performance hit for moderate syslog traffic +pretty much disappears. + +H3: Improving throughput + +You can improve logging performance on some systems by configuring syslog not +to sync the file system with every write ({{man syslogd/syslog.conf}}). In Linux, +you can prepend the log file name with a "-" in {{syslog.conf}}. For example, +if you are using the default LOCAL4 logging you could try: + +> # LDAP logs +> LOCAL4.* -/var/log/ldap + +For syslog-ng, add or modify the following line in {{syslog-ng.conf}}: + +> options { sync(n); }; + +where n is the number of lines which will be buffered before a write. + + +H2: {{slapd}}(8) Threads + +{{slapd}}(8) can process requests via a configurable number of threads, which +in turn affects the in/out rate of connections. + +This value should generally be a function of the number of "real" cores on +the system, for example on a server with 2 CPUs with one core each, set this +to 8, or 4 threads per real core. This is a "read" maximized value. The more +threads that are configured per core, the slower {{slapd}}(8) responds for +"read" operations. On the flip side, it appears to handle write operations +faster in a heavy write/low read scenario. + +The upper bound for good read performance appears to be 16 threads (which +also happens to be the default setting). |