summaryrefslogtreecommitdiffstats
path: root/debian/README.DB_CONFIG
diff options
context:
space:
mode:
Diffstat (limited to 'debian/README.DB_CONFIG')
-rw-r--r--debian/README.DB_CONFIG187
1 files changed, 187 insertions, 0 deletions
diff --git a/debian/README.DB_CONFIG b/debian/README.DB_CONFIG
new file mode 100644
index 0000000..f8ee5f1
--- /dev/null
+++ b/debian/README.DB_CONFIG
@@ -0,0 +1,187 @@
+For good performance using the BDB backend, a good DB_CONFIG file in the
+database directory (usually /var/lib/ldap) is crucial. The following two
+articles should help you to determine a good configuration for your
+requirements. A standard DB_CONFIG is installed but it may not be adequate
+for your system.
+
+The current version of OpenLDAP supports putting DB_CONFIG parameters into
+slapd.conf instead by prefixing those options with dbconfig. See the
+slapd-bdb(5) man page for more information. If there is no DB_CONFIG file
+when slapd starts and there are dbconfig lines in slapd.conf, slapd will
+write out a DB_CONFIG file with those settings before initializing the
+database.
+
+With the current version of OpenLDAP, any changes to DB_CONFIG will take
+effect automatically after restarting slapd. Running db_recover is no
+longer required.
+
+ -- Torsten Landschoff <torsten@debian.org> Sun, 29 May 2005 18:08:10 +0200
+ Russ Allbery <rra@debian.org> Fri, 01 Jun 2007 23:57:33 -0700
+
+How do I configure the BDB backend?
+-----------------------------------
+(Taken from http://www.openldap.org/faq/data/cache/893.html, author unknown)
+
+The BDB backend ("back-bdb") uses a lot of special features of Sleepycat's
+Berkeley DB library, and there are a lot of details that must be set correctly
+to get the best results from it. Even though the LDBM backend ("back-ldbm") can
+use the BerkeleyDB library, the BDB and LDBM backends have some very important
+differences, as already noted in (Xref) What are the different backends? What
+are their differences?.
+
+Because back-bdb is transaction-based and uses write-ahead logging to ensure
+data consistency, it has much heavier I/O demands than back-ldbm. Also, the
+transaction log files accumulate as data is written to the directory, and these
+log files must be cleaned out periodically. Otherwise the log files will
+consume enormous amounts of disk space. The cleanup procedures are described in
+(Xref) How to maintain Berkeley DB (logs etc.) ?.
+
+The information needed to fully understand things and to properly configure
+back-bdb is divided among the slapd-bdb(5) manual page and the SleepyCat
+BerkeleyDB documentation (http://www.sleepycat.com/docs/).
+
+You should read the entire slapd-bdb(5) manpage before proceeding. The only
+mandatory keyword is "directory" for setting the location of the database
+files. The other keywords control tradeoffs between data reliability,
+performance, and memory use. To ensure that committed transactions actually get
+flushed to disk, you should use the "checkpoint" keyword, otherwise your data
+is vulnerable to loss due to system failures. See the SleepyCat documentation
+for more information about checkpoints. (In fact, you should read all of
+chapter 9 "Berkeley DB Transactional Data Store Applications" in the SleepyCat
+reference manual. At least, read sections 1-3 and 13-24.)
+
+The "dbnosync" keyword is provided for compatibility with back-ldbm; the
+preferred method of setting this is to use the BDB DB_CONFIG file option
+set_flags DB_TXN_NOSYNC. The "lockdetect" keyword is also deprecated, you
+should instead use the BDB DB_CONFIG file set_lk_detect keyword. (It's safe to
+leave this at the default setting.)
+
+A number of important items must be configured in the BDB DB_CONFIG file and
+not in slapd.conf. You should, at least, read about these items:
+
+set_cachesize
+ The BDB library maintains its own cache separate from the back-bdb entry
+ cache. You must set this cache to a size appropriate for your database and
+ physical memory size. Note that this is a persistent setting - after you
+ set it the first time, further changes will be ignored until you recreate
+ the environment using db_recover.
+set_lg_dir
+ Set the directory for storing transaction logs. For best performance,
+ the transaction logs must be located on a different physical disk from
+ the database files.
+set_lg_bsize
+ Set the buffer size for the transaction log. Larger is better, but it
+ doesn't have much effect unless you're also using the DB_TXN_NOSYNC
+ option. With a default log file size of 10MB I usually set this to 2MB.
+ The default is only 32K, which is too small for back-bdb.
+
+On a very busy system you might see error messages talking about running out of
+locks, lockers, or lock objects. Usually the default values are plenty, and in
+older versions of the BDB library the errors were more likely due to library
+bugs than actual system load. However, it is possible that you have actually
+run out of lock resources due to heavy system usage. If this happens, you
+should read about the set_lk_max_lockers, set_lk_max_locks, and
+set_lk_max_objects keywords.
+
+How do I determine the proper BDB/HDB database cache size?
+----------------------------------------------------------
+(Taken from http://www.openldap.org/faq/data/cache/1075.html, written by
+hyc@openldap.org, Kurt@OpenLDAP.org)
+
+Not having a proper database cache size will cause performance issues. (Note:
+in older versions of Berkeley DB, an improper database case size could also
+cause the server to hang.)
+
+These issues are not an indication of corruption occurring in the database. It
+is merely the fact that the cache is thrashing itself that causes
+performance/response time to slowdown. If you take the time to read and
+understand the Berkeley DB documentation, measure the library performance using
+db_stat, and tune your settings, you will avoid these problems.
+
+It is not absolutely necessary to configure a BerkeleyDB cache equal in size to
+your entire database. All that you need is a cache that's large enough for your
+"working set." That means, large enough to hold all of the most frequently
+accessed data, plus a few less-frequently accessed items.
+
+You should really read the BDB documentation referenced above, but let me spell
+out what that really means here, in detail. The discussion here is focused on
+back-bdb and back-hdb, but most of it also applies to back-ldbm when using
+BerkeleyDB as the underlying database engine.
+
+Start with the most obvious - the back-bdb database lives in two main files,
+dn2id.bdb and id2entry.bdb. These are B-tree databases. We have never
+documented the back-bdb internal layout before, because it didn't seem like
+something anyone should have to worry about, nor was it necessarily cast in
+stone. But here's how it works today, in OpenLDAP 2.1 and 2.2. (All of the
+database files in back-ldbm are B-trees by default.)
+
+A B-tree is a balanced tree; it stores data in its leaf nodes and bookkeeping
+data in its interior nodes. (If you don't know what tree data structures look
+like in general, Google for some references, because that's getting far too
+elementary for the purposes of this discussion.)
+
+For decent performance, you need enough cache memory to contain all the nodes
+along the path from the root of the tree down to the particular data item
+you're accessing. That's enough cache for a single search. For the general
+case, you want enough cache to contain all the internal nodes in the database.
+"db_stat -d" will tell you how many internal pages are present in a database.
+You should check this number for both dn2id and id2entry.
+
+Also note that id2entry always uses 16KB per "page", while dn2id uses whatever
+the underlying filesystem uses, typically 4 or 8KB. To avoid thrashing the
+cache and triggering these infinite hang bugs in BDB 4.1.25, your cache must be
+at least as large as the number of internal pages in both the dn2id and
+id2entry databases, plus some extra space to accomodate the actual leaf data
+pages.
+
+For example, in my OpenLDAP 2.2 test database, I have an input LDIF file that's
+about 360MB. With the back-hdb backend this creates a dn2id.bdb that's 68MB,
+and an id2entry that's 800MB. db_stat tells me that dn2id uses 4KB pages, has
+433 internal pages, and 6378 leaf pages. The id2entry uses 16KB pages, has 52
+internal pages, and 45912 leaf pages. In order to efficiently retrieve any
+single entry in this database, the cache should be at least
+
+(433+1) * 4KB + (52+1) * 16KB in size: 1736KB + 848KB =~ 2.5MB.
+
+This doesn't take into account other library overhead, so this is even lower
+than the barest minimum. The default cache size, when nothing is configured, is
+only 256KB. If you tried to do much of anything with this database and only
+default settings, BDB 4.1.25 would lock up in an infinite loop.
+
+This 2.5MB number also doesn't take indexing into account. Each indexed
+attribute uses another database file of its own, using a Hash structure.
+(Again, in back-ldbm, the indexes also use B-trees by default, so this part of
+the discussion doesn't apply unless back-ldbm was explicitly compiled to use
+Hashes instead. Also, in OpenLDAP 2.2 onward, all of the indexes use B-trees,
+there are no more Hash database files. So just use the B-tree information above
+and ignore this Hash discussion.)
+
+Unlike the B-trees, where you only need to touch one data page to find an entry
+of interest, doing an index lookup generally touches multiple keys, and the
+point of a hash structure is that the keys are evenly distributed across the
+data space. That means there's no convenient compact subset of the database
+that you can keep in the cache to insure quick operation, you can pretty much
+expect references to be scattered across the whole thing. My strategy here
+would be to provide enough cache for at least 50% of all of the hash data.
+(Number of hash buckets + number of overflow pages + number of duplicate pages)
+* page size / 2.
+
+The objectClass index for my example database is 5.9MB and uses 3 hash buckets
+and 656 duplicate pages. So ( 3 + 656 ) * 4KB / 2 =~ 1.3MB.
+
+With only this index enabled, I'd figure at least a 4MB cache for this backend.
+(Of course you're using a single cache shared among all of the database files,
+so the cache pages will most likely get used for something other than what you
+accounted for, but this gives you a fighting chance.)
+
+With this 4MB cache I can slapcat this entire database on my 1.3GHz PIII in 1
+minute, 40 seconds. With the cache doubled to 8MB, it still takes the same
+1:40s. Once you've got enough cache to fit the B-tree internal pages,
+increasing it further won't have any effect until the cache really is large
+enough to hold 100% of the data pages. I don't have enough free RAM to hold all
+the 800MB id2entry data, so 4MB is good enough.
+
+With back-bdb and back-hdb you can use "db_stat -m" to check how well the
+database cache is performing. Unfortunately you can't do this with back-ldbm,
+as the statistics are not accessible when slapd is running, nor are they saved
+anywhere when slapd is stopped. (Yet another reason not to use back-ldbm.)