Diffstat (limited to 'ctdb/doc/ctdb-tunables.7.xml')
-rw-r--r-- | ctdb/doc/ctdb-tunables.7.xml | 783
1 file changed, 783 insertions, 0 deletions
diff --git a/ctdb/doc/ctdb-tunables.7.xml b/ctdb/doc/ctdb-tunables.7.xml
new file mode 100644
index 0000000..766213e
--- /dev/null
+++ b/ctdb/doc/ctdb-tunables.7.xml
@@ -0,0 +1,783 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE refentry
+          PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+          "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
+
+<refentry id="ctdb-tunables.7">
+
+  <refmeta>
+    <refentrytitle>ctdb-tunables</refentrytitle>
+    <manvolnum>7</manvolnum>
+    <refmiscinfo class="source">ctdb</refmiscinfo>
+    <refmiscinfo class="manual">CTDB - clustered TDB database</refmiscinfo>
+  </refmeta>
+
+  <refnamediv>
+    <refname>ctdb-tunables</refname>
+    <refpurpose>CTDB tunable configuration variables</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>DESCRIPTION</title>
+
+    <para>
+      CTDB's behaviour can be configured by setting run-time tunable
+      variables.  This page lists and describes all tunables.  See the
+      <citerefentry><refentrytitle>ctdb</refentrytitle>
+      <manvolnum>1</manvolnum></citerefentry>
+      <command>listvars</command>, <command>setvar</command> and
+      <command>getvar</command> commands for more details.
+    </para>
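+
+    <para>
+      For example, the following commands (the values shown are
+      illustrative) query and then change a tunable at run-time:
+
+      <screen format="linespecific">
+ctdb getvar MonitorInterval
+ctdb setvar MonitorInterval 20
+      </screen>
+    </para>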
+
+    <para>
+      Unless otherwise stated, tunables should be set to the same
+      value on all nodes.  Setting tunables to different values across
+      nodes may produce unexpected results.  Future releases may set
+      (some or most) tunables globally across the cluster but doing so
+      is currently a manual process.
+    </para>
+
+    <para>
+      Tunables can be set at startup from the
+      <filename>/usr/local/etc/ctdb/ctdb.tunables</filename>
+      configuration file.
+
+      <literallayout>
+<replaceable>TUNABLE</replaceable>=<replaceable>VALUE</replaceable>
+      </literallayout>
+
+      Comment lines beginning with '#' are permitted.  Whitespace may
+      be used for formatting/alignment.  VALUE must be a non-negative
+      integer and must be the last thing on a line (i.e. no trailing
+      garbage, trailing comments are not permitted).
+    </para>
+
+    <para>
+      For example:
+
+      <screen format="linespecific">
+MonitorInterval=20
+      </screen>
+    </para>
+
+    <para>
+      The available tunable variables are listed alphabetically below.
+    </para>
+
+    <refsect2>
+      <title>AllowClientDBAttach</title>
+      <para>Default: 1</para>
+      <para>
+        When set to 0, clients are not allowed to attach to any databases.
+        This can be used to temporarily block any new processes from
+        attaching to and accessing the databases.  This is mainly used
+        for detaching a volatile database using 'ctdb detach'.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>AllowMixedVersions</title>
+      <para>Default: 0</para>
+      <para>
+        CTDB will not allow incompatible versions to co-exist in
+        a cluster.  If a version mismatch is found, then the losing
+        CTDB will shut down.  To disable the incompatible version
+        check, set this tunable to 1.
+      </para>
+      <para>
+        For version checking, CTDB uses the major and minor version.
+        For example, CTDB 4.6.1 and CTDB 4.6.2 are matching versions;
+        CTDB 4.5.x and CTDB 4.6.y do not match.
+      </para>
+      <para>
+        CTDB with version check support will lose to CTDB without
+        version check support.  Between two different CTDB versions
+        with version check support, the one that has been running for
+        less time will lose.  If the running time for both CTDB
+        versions with version check support is equal (to seconds),
+        then the older version will lose.  The losing CTDB daemon will
+        shut down.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>AllowUnhealthyDBRead</title>
+      <para>Default: 0</para>
+      <para>
+        When set to 1, ctdb allows database traverses to read unhealthy
+        databases.  By default, ctdb does not allow reading records from
+        unhealthy databases.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>ControlTimeout</title>
+      <para>Default: 60</para>
+      <para>
+        This is the default timeout used when sending a control
+        message to either the local or a remote ctdb daemon.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>DatabaseHashSize</title>
+      <para>Default: 100001</para>
+      <para>
+        Number of hash chains for the local store of the tdbs that
+        ctdb manages.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>DatabaseMaxDead</title>
+      <para>Default: 5</para>
+      <para>
+        Maximum number of dead records per hash chain for the tdb
+        databases managed by ctdb.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>DBRecordCountWarn</title>
+      <para>Default: 100000</para>
+      <para>
+        When set to non-zero, ctdb will log a warning during recovery if
+        a database has more than this many records.  This will produce a
+        warning if a database grows uncontrollably with orphaned records.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>DBRecordSizeWarn</title>
+      <para>Default: 10000000</para>
+      <para>
+        When set to non-zero, ctdb will log a warning during recovery
+        if a single record is bigger than this size.  This will produce
+        a warning if a database record grows uncontrollably.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>DBSizeWarn</title>
+      <para>Default: 1000000000</para>
+      <para>
+        When set to non-zero, ctdb will log a warning during recovery if
+        a database is bigger than this size.  This will produce a warning
+        if a database grows uncontrollably.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>DeferredAttachTO</title>
+      <para>Default: 120</para>
+      <para>
+        When databases are frozen we do not allow clients to attach to
+        the databases.  Instead of returning an error immediately to the
+        client, the attach request from the client is deferred until
+        the database becomes available again, at which point we respond
+        to the client.
+      </para>
+      <para>
+        This timeout controls how long we will defer the request from the
+        client before timing it out and returning an error to the client.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>ElectionTimeout</title>
+      <para>Default: 3</para>
+      <para>
+        The number of seconds to wait for the election of the recovery
+        master to complete.  If the election is not completed during this
+        interval, then that round of election fails and ctdb starts a
+        new election.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>EnableBans</title>
+      <para>Default: 1</para>
+      <para>
+        This parameter allows ctdb to ban a node if the node is
+        misbehaving.
+      </para>
+      <para>
+        When set to 0, this disables banning completely in the cluster
+        and thus nodes cannot get banned, even if they break.  Don't
+        set to 0 unless you know what you are doing.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>EventScriptTimeout</title>
+      <para>Default: 30</para>
+      <para>
+        Maximum time in seconds to allow an event to run before timing
+        out.  This is the total time for all enabled scripts that are
+        run for an event, not just a single event script.
+      </para>
+      <para>
+        Note that timeouts are ignored for some events ("takeip",
+        "releaseip", "startrecovery", "recovered") and converted to
+        success.  The logic here is that the callers of these events
+        implement their own additional timeout.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>FetchCollapse</title>
+      <para>Default: 1</para>
+      <para>
+        This parameter is used to avoid multiple migration requests for
+        the same record from a single node.  All the record requests for
+        the same record are queued up and processed when the record is
+        migrated to the current node.
+      </para>
+      <para>
+        When many clients across many nodes try to access the same record
+        at the same time this can lead to a fetch storm where the record
+        becomes very active and bounces between nodes very fast.  This
+        leads to high CPU utilization of the ctdbd daemon, trying to
+        bounce that record around very fast, and poor performance.
+        Enabling this parameter can improve performance and reduce CPU
+        utilization for such workloads.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>HopcountMakeSticky</title>
+      <para>Default: 50</para>
+      <para>
+        For database(s) marked STICKY (using 'ctdb setdbsticky'),
+        any record that is migrating so fast that its hopcount
+        exceeds this limit is marked as a STICKY record for
+        <varname>StickyDuration</varname> seconds.  This means that
+        after each migration the sticky record will be kept on the node
+        for <varname>StickyPindown</varname> milliseconds and prevented
+        from being migrated off the node.
+      </para>
+      <para>
+        This will improve performance for certain workloads, such as
+        locking.tdb if many clients are opening/closing the same file
+        concurrently.
+      </para>
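+      <para>
+        For example, the following commands (the threshold value is
+        illustrative) lower the hopcount threshold and mark locking.tdb
+        as STICKY:
+
+        <screen format="linespecific">
+ctdb setvar HopcountMakeSticky 20
+ctdb setdbsticky locking.tdb
+        </screen>
+      </para>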
+    </refsect2>
+
+    <refsect2>
+      <title>IPAllocAlgorithm</title>
+      <para>Default: 2</para>
+      <para>
+        Selects the algorithm that CTDB should use when doing public
+        IP address allocation.  Meaningful values are:
+      </para>
+      <variablelist>
+        <varlistentry>
+          <term>0</term>
+          <listitem>
+            <para>
+              Deterministic IP address allocation.
+            </para>
+            <para>
+              This is a simple and fast option.  However, it can cause
+              unnecessary address movement during fail-over because
+              each address has a "home" node.  Works badly when some
+              nodes do not have any addresses defined.  Should be used
+              with care when addresses are defined across multiple
+              networks.
+            </para>
+            <para>
+              You can override the automatic "home" node allocation by
+              creating a file "home_nodes" next to the
+              "public_addresses" file.  As an example, the following
+              "home_nodes" file assigns the address 192.168.1.1 to
+              node 0 and 192.168.1.2 to node 2:
+            </para>
+            <screen format="linespecific">
+192.168.1.1 0
+192.168.1.2 2
+            </screen>
+          </listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>1</term>
+          <listitem>
+            <para>
+              Non-deterministic IP address allocation.
+            </para>
+            <para>
+              This is a relatively fast option that attempts to
+              minimise unnecessary address movements.  Addresses do
+              not have a "home" node.  Rebalancing is limited but it
+              is usually adequate.  Works badly when addresses are
+              defined across multiple networks.
+            </para>
+          </listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>2</term>
+          <listitem>
+            <para>
+              LCP2 IP address allocation.
+            </para>
+            <para>
+              Uses a heuristic to assign addresses defined across
+              multiple networks, usually balancing addresses on each
+              network evenly across nodes.  Addresses do not have a
+              "home" node.  Minimises unnecessary address movements.
+              The algorithm is complex, so is slower than other
+              choices for a large number of addresses.  However, it
+              can calculate an optimal assignment of 900 addresses in
+              under 10 seconds on modern hardware.
+            </para>
+          </listitem>
+        </varlistentry>
+      </variablelist>
+      <para>
+        If the specified value is not one of these, then the default
+        will be used.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>KeepaliveInterval</title>
+      <para>Default: 5</para>
+      <para>
+        How often in seconds should the nodes send keep-alive packets to
+        each other.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>KeepaliveLimit</title>
+      <para>Default: 5</para>
+      <para>
+        After how many keepalive intervals without any traffic should
+        a node wait until marking the peer as DISCONNECTED.
+      </para>
+      <para>
+        If a node has hung, it can take
+        <varname>KeepaliveInterval</varname> *
+        (<varname>KeepaliveLimit</varname> + 1) seconds before
+        ctdb determines that the node is DISCONNECTED and performs
+        a recovery.  This limit should not be set too high, so that
+        failures are detected early and application timeouts (e.g. SMB1)
+        do not kick in before the failover is completed.
+      </para>
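+      <para>
+        For example, with the default values this amounts to
+        5 * (5 + 1) = 30 seconds before a hung peer is marked as
+        DISCONNECTED.
+      </para>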
+    </refsect2>
+
+    <refsect2>
+      <title>LockProcessesPerDB</title>
+      <para>Default: 200</para>
+      <para>
+        This is the maximum number of lock helper processes ctdb will
+        create for obtaining record locks.  When ctdb cannot get a record
+        lock without blocking, it creates a helper process that waits
+        for the lock to be obtained.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>LogLatencyMs</title>
+      <para>Default: 0</para>
+      <para>
+        When set to non-zero, ctdb will log if certain operations
+        take longer than this value, in milliseconds, to complete.
+        These operations include "process a record request from client",
+        "take a record or database lock", "update a persistent database
+        record" and "vacuum a database".
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>MaxQueueDropMsg</title>
+      <para>Default: 1000000</para>
+      <para>
+        This is the maximum number of messages to be queued up for
+        a client before ctdb will treat the client as hung and will
+        terminate the client connection.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>MonitorInterval</title>
+      <para>Default: 15</para>
+      <para>
+        How often, in seconds, should ctdb run the 'monitor' event to
+        check a node's health.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>MonitorTimeoutCount</title>
+      <para>Default: 20</para>
+      <para>
+        How many 'monitor' events in a row need to time out before a node
+        is flagged as UNHEALTHY.  This setting is useful if scripts cannot
+        be written so that they do not hang for benign reasons.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>NoIPFailback</title>
+      <para>Default: 0</para>
+      <para>
+        When set to 1, ctdb will not perform failback of IP addresses
+        when a node becomes healthy.  When a node becomes UNHEALTHY,
+        ctdb WILL perform failover of public IP addresses, but when the
+        node becomes HEALTHY again, ctdb will not fail the addresses back.
+      </para>
+      <para>
+        Use with caution!  Normally when a node becomes available to the
+        cluster ctdb will try to reassign public IP addresses onto the
+        new node as a way to distribute the workload evenly across the
+        cluster nodes.  Ctdb tries to make sure that all running nodes
+        host approximately the same number of public addresses.
+      </para>
+      <para>
+        When you enable this tunable, ctdb will no longer attempt to
+        rebalance the cluster by failing IP addresses back to the new
+        nodes.  An unbalanced cluster will therefore remain unbalanced
+        until there is manual intervention from the administrator.  When
+        this parameter is set, you can manually fail public IP addresses
+        over to the new node(s) using the 'ctdb moveip' command.
+      </para>
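+      <para>
+        For example, the following command (the address and node number
+        are illustrative) moves a public IP address onto node 1:
+
+        <screen format="linespecific">
+ctdb moveip 10.1.1.1 1
+        </screen>
+      </para>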
+    </refsect2>
+
+    <refsect2>
+      <title>NoIPTakeover</title>
+      <para>Default: 0</para>
+      <para>
+        When set to 1, ctdb will not allow IP addresses to be failed
+        over to other nodes.  Any IP addresses already hosted on
+        healthy nodes will remain.  Any IP addresses hosted on
+        unhealthy nodes will be released by those nodes and will
+        become un-hosted.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>PullDBPreallocation</title>
+      <para>Default: 10*1024*1024</para>
+      <para>
+        This is the size of a record buffer to pre-allocate for sending
+        a reply to the PULLDB control.  Usually the record buffer starts
+        with the size of the first record and gets reallocated every time
+        a new record is added to the record buffer.  For a large number
+        of records, growing the record buffer one record at a time can
+        be very inefficient.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>QueueBufferSize</title>
+      <para>Default: 1024</para>
+      <para>
+        This is the maximum amount of data (in bytes) ctdb will read
+        from a socket at a time.
+      </para>
+      <para>
+        For a busy setup, if ctdb is not able to process the TCP sockets
+        fast enough (large amount of data in Recv-Q for tcp sockets),
+        then this tunable value should be increased.  However, large
+        values can keep ctdb busy processing packets and prevent ctdb
+        from handling other events.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RecBufferSizeLimit</title>
+      <para>Default: 1000000</para>
+      <para>
+        This is the limit on the size of the record buffer to be sent
+        in various controls.  This limit is used by new controls used
+        for recovery and controls used in vacuuming.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RecdFailCount</title>
+      <para>Default: 10</para>
+      <para>
+        If the recovery daemon has failed to ping the main daemon for
+        this many consecutive intervals, the main daemon will consider
+        the recovery daemon as hung and will try to restart it to recover.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RecdPingTimeout</title>
+      <para>Default: 60</para>
+      <para>
+        If the main daemon has not heard a "ping" from the recovery daemon
+        for this many seconds, the main daemon will log a message that
+        the recovery daemon is potentially hung.  This also increments a
+        counter which is checked against <varname>RecdFailCount</varname>
+        for detection of a hung recovery daemon.
+      </para>
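+      <para>
+        For example, with the default values a hung recovery daemon
+        will be restarted after roughly
+        <varname>RecdPingTimeout</varname> *
+        <varname>RecdFailCount</varname> = 60 * 10 = 600 seconds.
+      </para>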
+    </refsect2>
+
+    <refsect2>
+      <title>RecLockLatencyMs</title>
+      <para>Default: 1000</para>
+      <para>
+        When using a reclock file for split brain prevention, if set
+        to non-zero this tunable will make the recovery daemon log a
+        message if the fcntl() call to lock/testlock the recovery file
+        takes longer than this number of milliseconds.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RecoverInterval</title>
+      <para>Default: 1</para>
+      <para>
+        How frequently in seconds should the recovery daemon perform the
+        consistency checks to determine if it should perform a recovery.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RecoverTimeout</title>
+      <para>Default: 120</para>
+      <para>
+        This is the default setting for timeouts for controls when sent
+        from the recovery daemon.  We allow longer control timeouts from
+        the recovery daemon than from normal use since the recovery
+        daemon often uses controls that can take a lot longer than normal
+        controls.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RecoveryBanPeriod</title>
+      <para>Default: 300</para>
+      <para>
+        The duration in seconds for which a node is banned if the node
+        fails during recovery.  After this time has elapsed the node will
+        automatically get unbanned and will attempt to rejoin the cluster.
+      </para>
+      <para>
+        A node usually gets banned due to real problems with the node.
+        Don't set this value too small.  Otherwise, a problematic node
+        will try to re-join the cluster too soon, causing unnecessary
+        recoveries.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RecoveryDropAllIPs</title>
+      <para>Default: 120</para>
+      <para>
+        If a node is stuck in recovery, or stopped, or banned, for this
+        many seconds, then ctdb will release all public addresses on
+        that node.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RecoveryGracePeriod</title>
+      <para>Default: 120</para>
+      <para>
+        During recoveries, if a node has not caused recovery failures
+        during the last grace period (in seconds), any record of the
+        recovery failures that the node previously caused is forgiven.
+        This resets the ban counter back to zero for that node.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RepackLimit</title>
+      <para>Default: 10000</para>
+      <para>
+        During vacuuming, if the number of freelist records is more than
+        <varname>RepackLimit</varname>, then the database is repacked
+        to get rid of the freelist records and avoid fragmentation.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RerecoveryTimeout</title>
+      <para>Default: 10</para>
+      <para>
+        Once a recovery has completed, no additional recoveries are
+        permitted until this timeout in seconds has expired.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>SeqnumInterval</title>
+      <para>Default: 1000</para>
+      <para>
+        Some databases have seqnum tracking enabled, so that samba will
+        be able to detect asynchronously when there have been updates
+        to the database.  Every time a database is updated its sequence
+        number is increased.
+      </para>
+      <para>
+        This tunable is used to specify in milliseconds how frequently
+        ctdb will send out updates to remote nodes to inform them that
+        the sequence number has increased.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>StatHistoryInterval</title>
+      <para>Default: 1</para>
+      <para>
+        Granularity of the statistics collected in the statistics
+        history.  This is reported by the 'ctdb stats' command.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>StickyDuration</title>
+      <para>Default: 600</para>
+      <para>
+        Once a record has been marked STICKY, this is the duration in
+        seconds for which the record will be flagged as a STICKY record.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>StickyPindown</title>
+      <para>Default: 200</para>
+      <para>
+        Once a STICKY record has been migrated onto a node, it will be
+        pinned down on that node for this number of milliseconds.  Any
+        request from other nodes to migrate the record off the node will
+        be deferred.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>TakeoverTimeout</title>
+      <para>Default: 9</para>
+      <para>
+        This is the duration in seconds within which ctdb tries to
+        complete IP failover.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>TickleUpdateInterval</title>
+      <para>Default: 20</para>
+      <para>
+        Every <varname>TickleUpdateInterval</varname> seconds, ctdb
+        synchronizes the client connection information across nodes.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>TraverseTimeout</title>
+      <para>Default: 20</para>
+      <para>
+        This is the duration in seconds for which a database traverse
+        is allowed to run.  If the traverse does not complete during
+        this interval, ctdb will abort the traverse.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>VacuumFastPathCount</title>
+      <para>Default: 60</para>
+      <para>
+        During a vacuuming run, ctdb usually processes only the records
+        marked for deletion; this is called fast path vacuuming.  After
+        finishing <varname>VacuumFastPathCount</varname> fast path
+        vacuuming runs, ctdb will trigger a scan of the complete database
+        for any empty records that need to be deleted.
+      </para>
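+      <para>
+        For example, assuming vacuuming runs at the default
+        <varname>VacuumInterval</varname> of 10 seconds, a full
+        database scan is triggered roughly every 60 * 10 = 600 seconds.
+      </para>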
+    </refsect2>
+
+    <refsect2>
+      <title>VacuumInterval</title>
+      <para>Default: 10</para>
+      <para>
+        Periodic interval in seconds when vacuuming is triggered for
+        volatile databases.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>VacuumMaxRunTime</title>
+      <para>Default: 120</para>
+      <para>
+        The maximum time in seconds for which the vacuuming process is
+        allowed to run.  If the vacuuming process takes longer than this
+        value, then it is terminated.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>VerboseMemoryNames</title>
+      <para>Default: 0</para>
+      <para>
+        When set to non-zero, ctdb assigns verbose names for some of
+        the talloc allocated memory objects.  These names are visible
+        in the talloc memory report generated by 'ctdb dumpmemory'.
+      </para>
+    </refsect2>
+
+  </refsect1>
+
+  <refsect1>
+    <title>FILES</title>
+
+    <simplelist>
+      <member><filename>/usr/local/etc/ctdb/ctdb.tunables</filename></member>
+    </simplelist>
+  </refsect1>
+
+  <refsect1>
+    <title>SEE ALSO</title>
+    <para>
+      <citerefentry><refentrytitle>ctdb</refentrytitle>
+      <manvolnum>1</manvolnum></citerefentry>,
+
+      <citerefentry><refentrytitle>ctdbd</refentrytitle>
+      <manvolnum>1</manvolnum></citerefentry>,
+
+      <citerefentry><refentrytitle>ctdb.conf</refentrytitle>
+      <manvolnum>5</manvolnum></citerefentry>,
+
+      <citerefentry><refentrytitle>ctdb</refentrytitle>
+      <manvolnum>7</manvolnum></citerefentry>,
+
+      <ulink url="http://ctdb.samba.org/"/>
+    </para>
+  </refsect1>
+
+  <refentryinfo>
+    <author>
+      <contrib>
+        This documentation was written by
+        Ronnie Sahlberg,
+        Amitay Isaacs,
+        Martin Schwenke
+      </contrib>
+    </author>
+
+    <copyright>
+      <year>2007</year>
+      <holder>Andrew Tridgell</holder>
+      <holder>Ronnie Sahlberg</holder>
+    </copyright>
+    <legalnotice>
+      <para>
+        This program is free software; you can redistribute it and/or
+        modify it under the terms of the GNU General Public License as
+        published by the Free Software Foundation; either version 3 of
+        the License, or (at your option) any later version.
+      </para>
+      <para>
+        This program is distributed in the hope that it will be
+        useful, but WITHOUT ANY WARRANTY; without even the implied
+        warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
+        PURPOSE.  See the GNU General Public License for more details.
+      </para>
+      <para>
+        You should have received a copy of the GNU General Public
+        License along with this program; if not, see
+        <ulink url="http://www.gnu.org/licenses"/>.
+      </para>
+    </legalnotice>
+  </refentryinfo>
+
+</refentry>