summaryrefslogtreecommitdiffstats
path: root/ctdb/doc/ctdb.7.xml
diff options
context:
space:
mode:
Diffstat (limited to '')
-rw-r--r--ctdb/doc/ctdb.7.xml1182
1 files changed, 1182 insertions, 0 deletions
diff --git a/ctdb/doc/ctdb.7.xml b/ctdb/doc/ctdb.7.xml
new file mode 100644
index 0000000..351f3e8
--- /dev/null
+++ b/ctdb/doc/ctdb.7.xml
@@ -0,0 +1,1182 @@
+<?xml version="1.0" encoding="iso-8859-1"?>
+<!DOCTYPE refentry
+ PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+ "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
+<refentry id="ctdb.7">
+
+<refmeta>
+ <refentrytitle>ctdb</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo class="source">ctdb</refmiscinfo>
+ <refmiscinfo class="manual">CTDB - clustered TDB database</refmiscinfo>
+</refmeta>
+
+
+<refnamediv>
+ <refname>ctdb</refname>
+ <refpurpose>Clustered TDB</refpurpose>
+</refnamediv>
+
+<refsect1>
+ <title>DESCRIPTION</title>
+
+ <para>
+ CTDB is a clustered database component in clustered Samba that
+ provides a high-availability load-sharing CIFS server cluster.
+ </para>
+
+ <para>
+ The main functions of CTDB are:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ Provide a clustered version of the TDB database with automatic
+ rebuild/recovery of the databases upon node failures.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Monitor nodes in the cluster and services running on each node.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Manage a pool of public IP addresses that are used to provide
+ services to clients. Alternatively, CTDB can be used with
+ LVS.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ Combined with a cluster filesystem CTDB provides a full
+ high-availablity (HA) environment for services such as clustered
+ Samba, NFS and other services.
+ </para>
+
+ <para>
+ In addition to the CTDB manual pages there is much more
+ information available at
+ <ulink url="https://wiki.samba.org/index.php/CTDB_and_Clustered_Samba"/>.
+ </para>
+</refsect1>
+
+<refsect1>
+ <title>ANATOMY OF A CTDB CLUSTER</title>
+
+ <para>
+ A CTDB cluster is a collection of nodes with 2 or more network
+ interfaces. All nodes provide network (usually file/NAS) services
+ to clients. Data served by file services is stored on shared
+ storage (usually a cluster filesystem) that is accessible by all
+ nodes.
+ </para>
+ <para>
+ CTDB provides an "all active" cluster, where services are load
+ balanced across all nodes.
+ </para>
+</refsect1>
+
+ <refsect1>
+ <title>Cluster leader</title>
+
+ <para>
+ CTDB uses a <emphasis>cluster leader and follower</emphasis>
+ model of cluster management. All nodes in a cluster elect one
+ node to be the leader. The leader node coordinates privileged
+ operations such as database recovery and IP address failover.
+ </para>
+
+ <para>
+ CTDB previously referred to the leader as the <emphasis>recovery
+ master</emphasis> or <emphasis>recmaster</emphasis>. References
+ to these terms may still be found in documentation and code.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Cluster Lock</title>
+
+ <para>
+ CTDB uses a cluster lock to assert its privileged role in the
+ cluster. This node takes the cluster lock when it becomes
+ leader and holds the lock until it is no longer leader. The
+ <emphasis>cluster lock</emphasis> helps CTDB to avoid a
+ <emphasis>split brain</emphasis>, where a cluster becomes
+ partitioned and each partition attempts to operate
+ independently. Issues that can result from a split brain
+ include file data corruption, because file locking metadata may
+ not be tracked correctly.
+ </para>
+
+ <para>
+ CTDB previously referred to the cluster lock as the
+ <emphasis>recovery lock</emphasis>. The abbreviation
+ <emphasis>reclock</emphasis> is still used - just "clock" would
+ be confusing.
+ </para>
+
+ <para>
+ <emphasis>CTDB is unable configure a default cluster
+ lock</emphasis>, because this would depend on factors such as
+ cluster filesystem mountpoints. However, <emphasis>running CTDB
+ without a cluster lock is not recommended</emphasis> as there
+ will be no split brain protection.
+ </para>
+
+ <para>
+ When a cluster lock is configured it is used as the election
+ mechanism. Nodes race to take the cluster lock and the winner
+ is the cluster leader. This avoids problems when a node wins an
+ election but is unable to take the lock - this can occur if a
+ cluster becomes partitioned (for example, due to a communication
+ failure) and a different leader is elected by the nodes in each
+ partition, or if the cluster filesystem has a high failover
+ latency.
+ </para>
+
+ <para>
+ By default, the cluster lock is implemented using a file
+ (specified by <parameter>cluster lock</parameter> in the
+ <literal>[cluster]</literal> section of
+ <citerefentry><refentrytitle>ctdb.conf</refentrytitle>
+ <manvolnum>5</manvolnum></citerefentry>) residing in shared
+ storage (usually) on a cluster filesystem. To support a
+ cluster lock the cluster filesystem must support lock
+ coherence. See
+ <citerefentry><refentrytitle>ping_pong</refentrytitle>
+ <manvolnum>1</manvolnum></citerefentry> for more details.
+ </para>
+
+ <para>
+ The cluster lock can also be implemented using an arbitrary
+ cluster mutex helper (or call-out). This is indicated by using
+ an exclamation point ('!') as the first character of the
+ <parameter>cluster lock</parameter> parameter. For example, a
+ value of <command>!/usr/local/bin/myhelper cluster</command>
+ would run the given helper with the specified arguments. The
+ helper will continue to run as long as it holds its mutex. See
+ <filename>ctdb/doc/cluster_mutex_helper.txt</filename> in the
+ source tree, and related code, for clues about writing helpers.
+ </para>
+
+ <para>
+ When a file is specified for the <parameter>cluster
+ lock</parameter> parameter (i.e. no leading '!') the file lock
+ is implemented by a default helper
+ (<command>/usr/local/libexec/ctdb/ctdb_mutex_fcntl_helper</command>).
+ This helper has arguments as follows:
+
+ <!-- cmdsynopsis would not require long line but does not work :-( -->
+ <synopsis>
+<command>ctdb_mutex_fcntl_helper</command> <parameter>FILE</parameter> <optional><parameter>RECHECK-INTERVAL</parameter></optional>
+ </synopsis>
+
+ <command>ctdb_mutex_fcntl_helper</command> will take a lock on
+ FILE and then check every RECHECK-INTERVAL seconds to ensure
+ that FILE still exists and that its inode number is unchanged
+ from when the lock was taken. The default value for
+ RECHECK-INTERVAL is 5.
+ </para>
+
+ <para>
+ CTDB does sanity checks to ensure that the cluster lock is held
+ as expected.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Private vs Public addresses</title>
+
+ <para>
+ Each node in a CTDB cluster has multiple IP addresses assigned
+ to it:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ A single private IP address that is used for communication
+ between nodes.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ One or more public IP addresses that are used to provide
+ NAS or other services.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <refsect2>
+ <title>Private address</title>
+
+ <para>
+ Each node is configured with a unique, permanently assigned
+ private address. This address is configured by the operating
+ system. This address uniquely identifies a physical node in
+ the cluster and is the address that CTDB daemons will use to
+ communicate with the CTDB daemons on other nodes.
+ </para>
+
+ <para>
+ Private addresses are listed in the file
+ <filename>/usr/local/etc/ctdb/nodes</filename>). This file
+ contains the list of private addresses for all nodes in the
+ cluster, one per line. This file must be the same on all nodes
+ in the cluster.
+ </para>
+
+ <para>
+ Some users like to put this configuration file in their
+ cluster filesystem. A symbolic link should be used in this
+ case.
+ </para>
+
+ <para>
+ Private addresses should not be used by clients to connect to
+ services provided by the cluster.
+ </para>
+ <para>
+ It is strongly recommended that the private addresses are
+ configured on a private network that is separate from client
+ networks. This is because the CTDB protocol is both
+ unauthenticated and unencrypted. If clients share the private
+ network then steps need to be taken to stop injection of
+ packets to relevant ports on the private addresses. It is
+ also likely that CTDB protocol traffic between nodes could
+ leak sensitive information if it can be intercepted.
+ </para>
+
+ <para>
+ Example <filename>/usr/local/etc/ctdb/nodes</filename> for a four node
+ cluster:
+ </para>
+ <screen format="linespecific">
+192.168.1.1
+192.168.1.2
+192.168.1.3
+192.168.1.4
+ </screen>
+ </refsect2>
+
+ <refsect2>
+ <title>Public addresses</title>
+
+ <para>
+ Public addresses are used to provide services to clients.
+ Public addresses are not configured at the operating system
+ level and are not permanently associated with a particular
+ node. Instead, they are managed by CTDB and are assigned to
+ interfaces on physical nodes at runtime.
+ </para>
+ <para>
+ The CTDB cluster will assign/reassign these public addresses
+ across the available healthy nodes in the cluster. When one
+ node fails, its public addresses will be taken over by one or
+ more other nodes in the cluster. This ensures that services
+ provided by all public addresses are always available to
+ clients, as long as there are nodes available capable of
+ hosting this address.
+ </para>
+
+ <para>
+ The public address configuration is stored in
+ <filename>/usr/local/etc/ctdb/public_addresses</filename> on
+ each node. This file contains a list of the public addresses
+ that the node is capable of hosting, one per line. Each entry
+ also contains the netmask and the interface to which the
+ address should be assigned. If this file is missing then no
+ public addresses are configured.
+ </para>
+
+ <para>
+ Some users who have the same public addresses on all nodes
+ like to put this configuration file in their cluster
+ filesystem. A symbolic link should be used in this case.
+ </para>
+
+ <para>
+ Example <filename>/usr/local/etc/ctdb/public_addresses</filename> for a
+ node that can host 4 public addresses, on 2 different
+ interfaces:
+ </para>
+ <screen format="linespecific">
+10.1.1.1/24 eth1
+10.1.1.2/24 eth1
+10.1.2.1/24 eth2
+10.1.2.2/24 eth2
+ </screen>
+
+ <para>
+ In many cases the public addresses file will be the same on
+ all nodes. However, it is possible to use different public
+ address configurations on different nodes.
+ </para>
+
+ <para>
+ Example: 4 nodes partitioned into two subgroups:
+ </para>
+ <screen format="linespecific">
+Node 0:/usr/local/etc/ctdb/public_addresses
+ 10.1.1.1/24 eth1
+ 10.1.1.2/24 eth1
+
+Node 1:/usr/local/etc/ctdb/public_addresses
+ 10.1.1.1/24 eth1
+ 10.1.1.2/24 eth1
+
+Node 2:/usr/local/etc/ctdb/public_addresses
+ 10.1.2.1/24 eth2
+ 10.1.2.2/24 eth2
+
+Node 3:/usr/local/etc/ctdb/public_addresses
+ 10.1.2.1/24 eth2
+ 10.1.2.2/24 eth2
+ </screen>
+ <para>
+ In this example nodes 0 and 1 host two public addresses on the
+ 10.1.1.x network while nodes 2 and 3 host two public addresses
+ for the 10.1.2.x network.
+ </para>
+ <para>
+ Public address 10.1.1.1 can be hosted by either of nodes 0 or
+ 1 and will be available to clients as long as at least one of
+ these two nodes are available.
+ </para>
+ <para>
+ If both nodes 0 and 1 become unavailable then public address
+ 10.1.1.1 also becomes unavailable. 10.1.1.1 can not be failed
+ over to nodes 2 or 3 since these nodes do not have this public
+ address configured.
+ </para>
+ <para>
+ The <command>ctdb ip</command> command can be used to view the
+ current assignment of public addresses to physical nodes.
+ </para>
+ </refsect2>
+ </refsect1>
+
+
+ <refsect1>
+ <title>Node status</title>
+
+ <para>
+ The current status of each node in the cluster can be viewed by the
+ <command>ctdb status</command> command.
+ </para>
+
+ <para>
+ A node can be in one of the following states:
+ </para>
+
+ <variablelist>
+ <varlistentry>
+ <term>OK</term>
+ <listitem>
+ <para>
+ This node is healthy and fully functional. It hosts public
+ addresses to provide services.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>DISCONNECTED</term>
+ <listitem>
+ <para>
+ This node is not reachable by other nodes via the private
+ network. It is not currently participating in the cluster.
+ It <emphasis>does not</emphasis> host public addresses to
+ provide services. It might be shut down.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>DISABLED</term>
+ <listitem>
+ <para>
+ This node has been administratively disabled. This node is
+ partially functional and participates in the cluster.
+ However, it <emphasis>does not</emphasis> host public
+ addresses to provide services.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>UNHEALTHY</term>
+ <listitem>
+ <para>
+ A service provided by this node has failed a health check
+ and should be investigated. This node is partially
+ functional and participates in the cluster. However, it
+ <emphasis>does not</emphasis> host public addresses to
+ provide services. Unhealthy nodes should be investigated
+ and may require an administrative action to rectify.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>BANNED</term>
+ <listitem>
+ <para>
+ CTDB is not behaving as designed on this node. For example,
+ it may have failed too many recovery attempts. Such nodes
+ are banned from participating in the cluster for a
+ configurable time period before they attempt to rejoin the
+ cluster. A banned node <emphasis>does not</emphasis> host
+ public addresses to provide services. All banned nodes
+ should be investigated and may require an administrative
+ action to rectify.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>STOPPED</term>
+ <listitem>
+ <para>
+ This node has been administratively exclude from the
+ cluster. A stopped node does no participate in the cluster
+ and <emphasis>does not</emphasis> host public addresses to
+ provide services. This state can be used while performing
+ maintenance on a node.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>PARTIALLYONLINE</term>
+ <listitem>
+ <para>
+ A node that is partially online participates in a cluster
+ like a healthy (OK) node. Some interfaces to serve public
+ addresses are down, but at least one interface is up. See
+ also <command>ctdb ifaces</command>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>CAPABILITIES</title>
+
+ <para>
+ Cluster nodes can have several different capabilities enabled.
+ These are listed below.
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term>LEADER</term>
+ <listitem>
+ <para>
+ Indicates that a node can become the CTDB cluster leader.
+ The current leader is decided via an
+ election held by all active nodes with this capability.
+ </para>
+ <para>
+ Default is YES.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>LMASTER</term>
+ <listitem>
+ <para>
+ Indicates that a node can be the location master (LMASTER)
+ for database records. The LMASTER always knows which node
+ has the latest copy of a record in a volatile database.
+ </para>
+ <para>
+ Default is YES.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <para>
+ The LEADER and LMASTER capabilities can be disabled when CTDB
+ is used to create a cluster spanning across WAN links. In this
+ case CTDB acts as a WAN accelerator.
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>LVS</title>
+
+ <para>
+ LVS is a mode where CTDB presents one single IP address for the
+ entire cluster. This is an alternative to using public IP
+ addresses and round-robin DNS to loadbalance clients across the
+ cluster.
+ </para>
+
+ <para>
+ This is similar to using a layer-4 loadbalancing switch but with
+ some restrictions.
+ </para>
+
+ <para>
+ One extra LVS public address is assigned on the public network
+ to each LVS group. Each LVS group is a set of nodes in the
+ cluster that presents the same LVS address public address to the
+ outside world. Normally there would only be one LVS group
+ spanning an entire cluster, but in situations where one CTDB
+ cluster spans multiple physical sites it might be useful to have
+ one LVS group for each site. There can be multiple LVS groups
+ in a cluster but each node can only be member of one LVS group.
+ </para>
+
+ <para>
+ Client access to the cluster is load-balanced across the HEALTHY
+ nodes in an LVS group. If no HEALTHY nodes exists then all
+ nodes in the group are used, regardless of health status. CTDB
+ will, however never load-balance LVS traffic to nodes that are
+ BANNED, STOPPED, DISABLED or DISCONNECTED. The <command>ctdb
+ lvs</command> command is used to show which nodes are currently
+ load-balanced across.
+ </para>
+
+ <para>
+ In each LVS group, one of the nodes is selected by CTDB to be
+ the LVS leader. This node receives all traffic from clients
+ coming in to the LVS public address and multiplexes it across
+ the internal network to one of the nodes that LVS is using.
+ When responding to the client, that node will send the data back
+ directly to the client, bypassing the LVS leader node. The
+ command <command>ctdb lvs leader</command> will show which node
+ is the current LVS leader.
+ </para>
+
+ <para>
+ The path used for a client I/O is:
+ <orderedlist>
+ <listitem>
+ <para>
+ Client sends request packet to LVS leader.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ LVS leader passes the request on to one node across the
+ internal network.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Selected node processes the request.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Node responds back to client.
+ </para>
+ </listitem>
+ </orderedlist>
+ </para>
+
+ <para>
+ This means that all incoming traffic to the cluster will pass
+ through one physical node, which limits scalability. You can
+ send more data to the LVS address that one physical node can
+ multiplex. This means that you should not use LVS if your I/O
+ pattern is write-intensive since you will be limited in the
+ available network bandwidth that node can handle. LVS does work
+ very well for read-intensive workloads where only smallish READ
+ requests are going through the LVS leader bottleneck and the
+ majority of the traffic volume (the data in the read replies)
+ goes straight from the processing node back to the clients. For
+ read-intensive i/o patterns you can achieve very high throughput
+ rates in this mode.
+ </para>
+
+ <para>
+ Note: you can use LVS and public addresses at the same time.
+ </para>
+
+ <para>
+ If you use LVS, you must have a permanent address configured for
+ the public interface on each node. This address must be routable
+ and the cluster nodes must be configured so that all traffic
+ back to client hosts are routed through this interface. This is
+ also required in order to allow samba/winbind on the node to
+ talk to the domain controller. This LVS IP address can not be
+ used to initiate outgoing traffic.
+ </para>
+ <para>
+ Make sure that the domain controller and the clients are
+ reachable from a node <emphasis>before</emphasis> you enable
+ LVS. Also ensure that outgoing traffic to these hosts is routed
+ out through the configured public interface.
+ </para>
+
+ <refsect2>
+ <title>Configuration</title>
+
+ <para>
+ To activate LVS on a CTDB node you must specify the
+ <varname>CTDB_LVS_PUBLIC_IFACE</varname>,
+ <varname>CTDB_LVS_PUBLIC_IP</varname> and
+ <varname>CTDB_LVS_NODES</varname> configuration variables.
+ <varname>CTDB_LVS_NODES</varname> specifies a file containing
+ the private address of all nodes in the current node's LVS
+ group.
+ </para>
+
+ <para>
+ Example:
+ <screen format="linespecific">
+CTDB_LVS_PUBLIC_IFACE=eth1
+CTDB_LVS_PUBLIC_IP=10.1.1.237
+CTDB_LVS_NODES=/usr/local/etc/ctdb/lvs_nodes
+ </screen>
+ </para>
+
+ <para>
+ Example <filename>/usr/local/etc/ctdb/lvs_nodes</filename>:
+ </para>
+ <screen format="linespecific">
+192.168.1.2
+192.168.1.3
+192.168.1.4
+ </screen>
+
+ <para>
+ Normally any node in an LVS group can act as the LVS leader.
+ Nodes that are highly loaded due to other demands maybe
+ flagged with the "follower-only" option in the
+ <varname>CTDB_LVS_NODES</varname> file to limit the LVS
+ functionality of those nodes.
+ </para>
+
+ <para>
+ LVS nodes file that excludes 192.168.1.4 from being
+ the LVS leader node:
+ </para>
+ <screen format="linespecific">
+192.168.1.2
+192.168.1.3
+192.168.1.4 follower-only
+ </screen>
+
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>TRACKING AND RESETTING TCP CONNECTIONS</title>
+
+ <para>
+ CTDB tracks TCP connections from clients to public IP addresses,
+ on known ports. When an IP address moves from one node to
+ another, all existing TCP connections to that IP address are
+ reset. The node taking over this IP address will also send
+ gratuitous ARPs (for IPv4, or neighbour advertisement, for
+ IPv6). This allows clients to reconnect quickly, rather than
+ waiting for TCP timeouts, which can be very long.
+ </para>
+
+ <para>
+ It is important that established TCP connections do not survive
+ a release and take of a public IP address on the same node.
+ Such connections can get out of sync with sequence and ACK
+ numbers, potentially causing a disruptive ACK storm.
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>NAT GATEWAY</title>
+
+ <para>
+ NAT gateway (NATGW) is an optional feature that is used to
+ configure fallback routing for nodes. This allows cluster nodes
+ to connect to external services (e.g. DNS, AD, NIS and LDAP)
+ when they do not host any public addresses (e.g. when they are
+ unhealthy).
+ </para>
+ <para>
+ This also applies to node startup because CTDB marks nodes as
+ UNHEALTHY until they have passed a "monitor" event. In this
+ context, NAT gateway helps to avoid a "chicken and egg"
+ situation where a node needs to access an external service to
+ become healthy.
+ </para>
+ <para>
+ Another way of solving this type of problem is to assign an
+ extra static IP address to a public interface on every node.
+ This is simpler but it uses an extra IP address per node, while
+ NAT gateway generally uses only one extra IP address.
+ </para>
+
+ <refsect2>
+ <title>Operation</title>
+
+ <para>
+ One extra NATGW public address is assigned on the public
+ network to each NATGW group. Each NATGW group is a set of
+ nodes in the cluster that shares the same NATGW address to
+ talk to the outside world. Normally there would only be one
+ NATGW group spanning an entire cluster, but in situations
+ where one CTDB cluster spans multiple physical sites it might
+ be useful to have one NATGW group for each site.
+ </para>
+ <para>
+ There can be multiple NATGW groups in a cluster but each node
+ can only be member of one NATGW group.
+ </para>
+ <para>
+ In each NATGW group, one of the nodes is selected by CTDB to
+ be the NATGW leader and the other nodes are consider to be
+ NATGW followers. NATGW followers establish a fallback default route
+ to the NATGW leader via the private network. When a NATGW
+ follower hosts no public IP addresses then it will use this route
+ for outbound connections. The NATGW leader hosts the NATGW
+ public IP address and routes outgoing connections from
+ follower nodes via this IP address. It also establishes a
+ fallback default route.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Configuration</title>
+
+ <para>
+ NATGW is usually configured similar to the following example configuration:
+ </para>
+ <screen format="linespecific">
+CTDB_NATGW_NODES=/usr/local/etc/ctdb/natgw_nodes
+CTDB_NATGW_PRIVATE_NETWORK=192.168.1.0/24
+CTDB_NATGW_PUBLIC_IP=10.0.0.227/24
+CTDB_NATGW_PUBLIC_IFACE=eth0
+CTDB_NATGW_DEFAULT_GATEWAY=10.0.0.1
+ </screen>
+
+ <para>
+ Normally any node in a NATGW group can act as the NATGW
+ leader. Some configurations may have special nodes that lack
+ connectivity to a public network. In such cases, those nodes
+ can be flagged with the "follower-only" option in the
+ <varname>CTDB_NATGW_NODES</varname> file to limit the NATGW
+ functionality of those nodes.
+ </para>
+
+ <para>
+ See the <citetitle>NAT GATEWAY</citetitle> section in
+ <citerefentry><refentrytitle>ctdb-script.options</refentrytitle>
+ <manvolnum>5</manvolnum></citerefentry> for more details of
+ NATGW configuration.
+ </para>
+ </refsect2>
+
+
+ <refsect2>
+ <title>Implementation details</title>
+
+ <para>
+ When the NATGW functionality is used, one of the nodes is
+ selected to act as a NAT gateway for all the other nodes in
+ the group when they need to communicate with the external
+ services. The NATGW leader is selected to be a node that is
+ most likely to have usable networks.
+ </para>
+
+ <para>
+ The NATGW leader hosts the NATGW public IP address
+ <varname>CTDB_NATGW_PUBLIC_IP</varname> on the configured public
+ interfaces <varname>CTDB_NATGW_PUBLIC_IFACE</varname> and acts as
+ a router, masquerading outgoing connections from follower nodes
+ via this IP address. If
+ <varname>CTDB_NATGW_DEFAULT_GATEWAY</varname> is set then it
+ also establishes a fallback default route to the configured
+ this gateway with a metric of 10. A metric 10 route is used
+ so it can co-exist with other default routes that may be
+ available.
+ </para>
+
+ <para>
+ A NATGW follower establishes its fallback default route to the
+ NATGW leader via the private network
+ <varname>CTDB_NATGW_PRIVATE_NETWORK</varname>with a metric of 10.
+ This route is used for outbound connections when no other
+ default route is available because the node hosts no public
+ addresses. A metric 10 routes is used so that it can co-exist
+ with other default routes that may be available when the node
+ is hosting public addresses.
+ </para>
+
+ <para>
+ <varname>CTDB_NATGW_STATIC_ROUTES</varname> can be used to
+ have NATGW create more specific routes instead of just default
+ routes.
+ </para>
+
+ <para>
+ This is implemented in the <filename>11.natgw</filename>
+ eventscript. Please see the eventscript file and the
+ <citetitle>NAT GATEWAY</citetitle> section in
+ <citerefentry><refentrytitle>ctdb-script.options</refentrytitle>
+ <manvolnum>5</manvolnum></citerefentry> for more details.
+ </para>
+
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>POLICY ROUTING</title>
+
+ <para>
+ Policy routing is an optional CTDB feature to support complex
+ network topologies. Public addresses may be spread across
+ several different networks (or VLANs) and it may not be possible
+ to route packets from these public addresses via the system's
+ default route. Therefore, CTDB has support for policy routing
+ via the <filename>13.per_ip_routing</filename> eventscript.
+ This allows routing to be specified for packets sourced from
+ each public address. The routes are added and removed as CTDB
+ moves public addresses between nodes.
+ </para>
+
+ <refsect2>
+ <title>Configuration variables</title>
+
+ <para>
+ There are 4 configuration variables related to policy routing:
+ <varname>CTDB_PER_IP_ROUTING_CONF</varname>,
+ <varname>CTDB_PER_IP_ROUTING_RULE_PREF</varname>,
+ <varname>CTDB_PER_IP_ROUTING_TABLE_ID_LOW</varname>,
+ <varname>CTDB_PER_IP_ROUTING_TABLE_ID_HIGH</varname>. See the
+ <citetitle>POLICY ROUTING</citetitle> section in
+ <citerefentry><refentrytitle>ctdb-script.options</refentrytitle>
+ <manvolnum>5</manvolnum></citerefentry> for more details.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Configuration</title>
+
+ <para>
+ The format of each line of
+ <varname>CTDB_PER_IP_ROUTING_CONF</varname> is:
+ </para>
+
+ <screen>
+&lt;public_address&gt; &lt;network&gt; [ &lt;gateway&gt; ]
+ </screen>
+
+ <para>
+ Leading whitespace is ignored and arbitrary whitespace may be
+ used as a separator. Lines that have a "public address" item
+ that doesn't match an actual public address are ignored. This
+ means that comment lines can be added using a leading
+ character such as '#', since this will never match an IP
+ address.
+ </para>
+
+ <para>
+ A line without a gateway indicates a link local route.
+ </para>
+
+ <para>
+ For example, consider the configuration line:
+ </para>
+
+ <screen>
+ 192.168.1.99 192.168.1.0/24
+ </screen>
+
+ <para>
+ If the corresponding public_addresses line is:
+ </para>
+
+ <screen>
+ 192.168.1.99/24 eth2,eth3
+ </screen>
+
+ <para>
+ <varname>CTDB_PER_IP_ROUTING_RULE_PREF</varname> is 100, and
+ CTDB adds the address to eth2 then the following routing
+ information is added:
+ </para>
+
+ <screen>
+ ip rule add from 192.168.1.99 pref 100 table ctdb.192.168.1.99
+ ip route add 192.168.1.0/24 dev eth2 table ctdb.192.168.1.99
+ </screen>
+
+ <para>
+ This causes traffic from 192.168.1.99 to 192.168.1.0/24 go via
+ eth2.
+ </para>
+
+ <para>
+ The <command>ip rule</command> command will show (something
+ like - depending on other public addresses and other routes on
+ the system):
+ </para>
+
+ <screen>
+ 0: from all lookup local
+ 100: from 192.168.1.99 lookup ctdb.192.168.1.99
+ 32766: from all lookup main
+ 32767: from all lookup default
+ </screen>
+
+ <para>
+ <command>ip route show table ctdb.192.168.1.99</command> will show:
+ </para>
+
+ <screen>
+ 192.168.1.0/24 dev eth2 scope link
+ </screen>
+
+ <para>
+ The usual use for a line containing a gateway is to add a
+ default route corresponding to a particular source address.
+ Consider this line of configuration:
+ </para>
+
+ <screen>
+ 192.168.1.99 0.0.0.0/0 192.168.1.1
+ </screen>
+
+ <para>
+ In the situation described above this will cause an extra
+ routing command to be executed:
+ </para>
+
+ <screen>
+ ip route add 0.0.0.0/0 via 192.168.1.1 dev eth2 table ctdb.192.168.1.99
+ </screen>
+
+ <para>
+ With both configuration lines, <command>ip route show table
+ ctdb.192.168.1.99</command> will show:
+ </para>
+
+ <screen>
+ 192.168.1.0/24 dev eth2 scope link
+ default via 192.168.1.1 dev eth2
+ </screen>
+ </refsect2>
+
+ <refsect2>
+ <title>Sample configuration</title>
+
+ <para>
+ Here is a more complete example configuration.
+ </para>
+
+ <screen>
+/usr/local/etc/ctdb/public_addresses:
+
+ 192.168.1.98 eth2,eth3
+ 192.168.1.99 eth2,eth3
+
+/usr/local/etc/ctdb/policy_routing:
+
+ 192.168.1.98 192.168.1.0/24
+ 192.168.1.98 192.168.200.0/24 192.168.1.254
+ 192.168.1.98 0.0.0.0/0 192.168.1.1
+ 192.168.1.99 192.168.1.0/24
+ 192.168.1.99 192.168.200.0/24 192.168.1.254
+ 192.168.1.99 0.0.0.0/0 192.168.1.1
+ </screen>
+
+ <para>
+ The routes local packets as expected, the default route is as
+ previously discussed, but packets to 192.168.200.0/24 are
+ routed via the alternate gateway 192.168.1.254.
+ </para>
+
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>NOTIFICATIONS</title>
+
+ <para>
+ When certain state changes occur in CTDB, it can be configured
+ to perform arbitrary actions via notifications. For example,
+ sending SNMP traps or emails when a node becomes unhealthy or
+ similar.
+ </para>
+
+ <para>
+ The notification mechanism runs all executable files ending in
+ ".script" in
+ <filename>/usr/local/etc/ctdb/events/notification/</filename>,
+ ignoring any failures and continuing to run all files.
+ </para>
+
+ <para>
+ CTDB currently generates notifications after CTDB changes to
+ these states:
+ </para>
+
+ <simplelist>
+ <member>init</member>
+ <member>setup</member>
+ <member>startup</member>
+ <member>healthy</member>
+ <member>unhealthy</member>
+ </simplelist>
+
+ </refsect1>
+
+ <refsect1>
+ <title>LOG LEVELS</title>
+
+ <para>
+ Valid log levels, in increasing order of verbosity, are:
+ </para>
+
+ <simplelist>
+ <member>ERROR</member>
+ <member>WARNING</member>
+ <member>NOTICE</member>
+ <member>INFO</member>
+ <member>DEBUG</member>
+ </simplelist>
+ </refsect1>
+
+
+ <refsect1>
+ <title>REMOTE CLUSTER NODES</title>
+ <para>
+It is possible to have a CTDB cluster that spans across a WAN link.
+For example where you have a CTDB cluster in your datacentre but you also
+want to have one additional CTDB node located at a remote branch site.
+This is similar to how a WAN accelerator works but with the difference
+that while a WAN-accelerator often acts as a Proxy or a MitM, in
+the ctdb remote cluster node configuration the Samba instance at the remote site
+IS the genuine server, not a proxy and not a MitM, and thus provides 100%
+correct CIFS semantics to clients.
+ </para>
+
+ <para>
+ See the cluster as one single multihomed samba server where one of
+ the NICs (the remote node) is very far away.
+ </para>
+
+ <para>
+ NOTE: This does require that the cluster filesystem you use can cope
+ with WAN-link latencies. Not all cluster filesystems can handle
+ WAN-link latencies! Whether this will provide very good WAN-accelerator
+ performance or it will perform very poorly depends entirely
+ on how optimized your cluster filesystem is in handling high latency
+ for data and metadata operations.
+ </para>
+
+ <para>
+ To activate a node as being a remote cluster node you need to
+ set the following two parameters in
+ /usr/local/etc/ctdb/ctdb.conf for the remote node:
+ <screen format="linespecific">
+[legacy]
+ lmaster capability = false
+ leader capability = false
+ </screen>
+ </para>
+
+ <para>
+ Verify with the command "ctdb getcapabilities" that that node no longer
+ has the leader or the lmaster capabilities.
+ </para>
+
+ </refsect1>
+
+
+ <refsect1>
+ <title>SEE ALSO</title>
+
+ <para>
+ <citerefentry><refentrytitle>ctdb</refentrytitle>
+ <manvolnum>1</manvolnum></citerefentry>,
+
+ <citerefentry><refentrytitle>ctdbd</refentrytitle>
+ <manvolnum>1</manvolnum></citerefentry>,
+
+ <citerefentry><refentrytitle>ctdb_diagnostics</refentrytitle>
+ <manvolnum>1</manvolnum></citerefentry>,
+
+ <citerefentry><refentrytitle>ltdbtool</refentrytitle>
+ <manvolnum>1</manvolnum></citerefentry>,
+
+ <citerefentry><refentrytitle>onnode</refentrytitle>
+ <manvolnum>1</manvolnum></citerefentry>,
+
+ <citerefentry><refentrytitle>ping_pong</refentrytitle>
+ <manvolnum>1</manvolnum></citerefentry>,
+
+ <citerefentry><refentrytitle>ctdb.conf</refentrytitle>
+ <manvolnum>5</manvolnum></citerefentry>,
+
+ <citerefentry><refentrytitle>ctdb-script.options</refentrytitle>
+ <manvolnum>5</manvolnum></citerefentry>,
+
+ <citerefentry><refentrytitle>ctdb.sysconfig</refentrytitle>
+ <manvolnum>5</manvolnum></citerefentry>,
+
+ <citerefentry><refentrytitle>ctdb-statistics</refentrytitle>
+ <manvolnum>7</manvolnum></citerefentry>,
+
+ <citerefentry><refentrytitle>ctdb-tunables</refentrytitle>
+ <manvolnum>7</manvolnum></citerefentry>,
+
+ <ulink url="https://wiki.samba.org/index.php/CTDB_and_Clustered_Samba"/>,
+
+ <ulink url="http://ctdb.samba.org/"/>
+ </para>
+ </refsect1>
+
+ <refentryinfo>
+ <author>
+ <contrib>
+ This documentation was written by
+ Ronnie Sahlberg,
+ Amitay Isaacs,
+ Martin Schwenke
+ </contrib>
+ </author>
+
+ <copyright>
+ <year>2007</year>
+ <holder>Andrew Tridgell</holder>
+ <holder>Ronnie Sahlberg</holder>
+ </copyright>
+ <legalnotice>
+ <para>
+ This program is free software; you can redistribute it and/or
+ modify it under the terms of the GNU General Public License as
+ published by the Free Software Foundation; either version 3 of
+ the License, or (at your option) any later version.
+ </para>
+ <para>
+ This program is distributed in the hope that it will be
+ useful, but WITHOUT ANY WARRANTY; without even the implied
+ warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
+ PURPOSE. See the GNU General Public License for more details.
+ </para>
+ <para>
+ You should have received a copy of the GNU General Public
+ License along with this program; if not, see
+ <ulink url="http://www.gnu.org/licenses"/>.
+ </para>
+ </legalnotice>
+ </refentryinfo>
+
+</refentry>