Diffstat (limited to 'ctdb/config/events/README')
-rw-r--r-- ctdb/config/events/README | 193
1 files changed, 193 insertions, 0 deletions
diff --git a/ctdb/config/events/README b/ctdb/config/events/README
new file mode 100644
index 0000000..6553830
--- /dev/null
+++ b/ctdb/config/events/README
@@ -0,0 +1,193 @@
+The events/ directory contains event scripts used by CTDB. Event
+scripts are triggered on certain events, such as startup, monitoring
+or public IP allocation. Scripts may be specific to services,
+networking or internal CTDB operations.
+
+Scripts are divided into subdirectories for different CTDB components.
+Right now the only component is "legacy".
+
+All event scripts start with the prefix 'NN.' where N is a digit. The
+event scripts are run in sequence based on NN. Thus 10.interface will
+be run before 60.nfs. It is recommended to keep each NN unique.
+However, scripts with the same NN prefix will be executed in
+alphanumeric sort order.
+
+As a special case, any event script whose name ends with a '~'
+character is ignored, since this is a common suffix that some editors
+append to older versions of a file. Similarly, any event script with
+more than one '.' in its name is ignored, as package managers can
+create copies with an additional suffix starting with '.' (e.g.
+.rpmnew, .dpkg-dist).
+
+Only executable event scripts are run by CTDB. Any event script that
+does not have execute permission is ignored.
+
+The event scripts are called with a varying number of arguments. The
+first argument is the event name and the remaining arguments depend
+on the event.
+
+Event scripts must return 0 for success and non-zero for failure.
+
+Output of event scripts is logged. On failure the output of the
+failing event script is included in the output of "ctdb scriptstatus".
+
+The following events are supported (with arguments shown):
+
+init
+
+ This event is triggered once when CTDB is starting up. This
+ event is used to do some basic cleanup and initialisation.
+
+ During the "init" event CTDB is not listening on its Unix
+ domain socket, so the "ctdb" CLI will not work.
+
+ Failure of this event will cause CTDB to terminate.
+
+ Example: 00.ctdb creates $CTDB_SCRIPT_VARDIR
+
+setup
+
+ This event is triggered once, after the "init" event has
+ completed.
+
+ For this and any subsequent events the CTDB Unix domain socket
+ is available, so the "ctdb" CLI will work.
+
+ Failure of this event will cause CTDB to terminate.
+
+ Example: 11.natgw checks that it has valid configuration
+
+startup
+
+ This event is triggered after the "setup" event has completed
+ and CTDB has finished its initial database recovery.
+
+ This event starts all services that are managed by CTDB. Each
+ service that is managed by CTDB should implement this event
+ and use it to (re)start the service.
+
+ If the "startup" event fails then CTDB will retry it until it
+ succeeds. There is no limit on the number of retries.
+
+ Example: 50.samba uses this event to start the Samba daemon.
+
+shutdown
+
+ This event is triggered when CTDB is shutting down.
+
+ This event shuts down all services that are managed by CTDB.
+ Each service that is managed by CTDB should implement this
+ event and use it to stop the service.
+
+ Example: 50.samba uses this event to shut down the Samba
+ daemon.
+
+monitor
+
+ This event is run periodically. The interval between
+ successive "monitor" events is configured using the
+ MonitorInterval tunable, which defaults to 15 seconds.
+
+ This event is triggered by CTDB to continuously monitor that
+ all managed services are healthy. If all event scripts
+ complete the "monitor" event successfully then the node is
+ marked HEALTHY. If any event script fails then no subsequent
+ scripts will be run for that event and the node is marked
+ UNHEALTHY.
+
+ Each service that is managed by CTDB should implement this
+ event and use it to monitor the service.
+
+ Example: 10.interface checks that each configured interface
+ for public IP addresses has a physical link established.
+
+startrecovery
+
+ This event is triggered every time a database recovery process
+ is started.
+
+ This is rarely used.
+
+recovered
+
+ This event is triggered every time a database recovery process
+ is completed.
+
+ This is rarely used.
+
+takeip <interface> <ip-address> <netmask-bits>
+
+ This event is triggered for each public IP address taken by a
+ node during IP address (re)assignment. Multiple "takeip"
+ events can be run in parallel if multiple IP addresses are
+ being assigned.
+
+ Example: In 10.interface the "ip" command (from the Linux
+ iproute2 package) is used to add the specified public IP
+ address to the specified interface. The "ip" command can
+ safely be run concurrently. However, the "iptables" command
+ cannot be run concurrently so a wrapper is used to serialise
+ runs using exclusive locking.
+
+ If substantial work is required to reconfigure a service when
+ a public IP address is taken over it can be better to defer
+ service reconfiguration to the "ipreallocated" event, after
+ all IP addresses have been assigned.
+
+ Example: 60.nfs uses ctdb_service_set_reconfigure() to flag
+ that public IP addresses have changed so that service
+ reconfiguration will occur in the "ipreallocated" event.
+
+releaseip <interface> <ip-address> <netmask-bits>
+
+ This event is triggered for each public IP address released by
+ a node during IP address (re)assignment. Multiple "releaseip"
+ events can be run in parallel if multiple IP addresses are
+ being unassigned.
+
+ In all other regards, this event is analogous to the "takeip"
+ event above.
+
+updateip <old-interface> <new-interface> <ip-address> <netmask-bits>
+
+ This event is triggered for each public IP address moved
+ between interfaces on a node during IP address (re)assignment.
+ Multiple "updateip" events can be run in parallel if multiple
+ IP addresses are being moved.
+
+ This event is only used if multiple interfaces are capable of
+ hosting an IP address, as specified in the public addresses
+ configuration file.
+
+ This event is similar to the "takeip" event above.
+
+ipreallocated
+
+ This event is triggered on all nodes as the last step of
+ public IP address (re)assignment. It is triggered
+ unconditionally after any "releaseip", "takeip" and "updateip"
+ events, even on nodes where those events did not run because
+ public IP address assignments have not changed.
+
+ This event is used to reconfigure services.
+
+ Since "ipreallocated" is always run, reconfiguration can
+ depend on the states of other nodes rather than just on
+ changes to IP addresses.
+
+ Example: 11.natgw recalculates the NAT gateway master and
+ updates the relevant network configuration on each node if the
+ NAT gateway master has changed.
+
+Additional notes for "takeip", "releaseip", "updateip",
+"ipreallocated":
+
+* Failure of any of these events causes IP allocation to be retried.
+
+* An event script can use ctdb_service_set_reconfigure() in "takeip",
+ "releaseip" or "updateip" events to flag that its service needs to
+ be reconfigured. The "ipreallocated" event can then use
+ ctdb_service_needs_reconfigure() to test whether any public IP
+ addresses changed and so determine what type of reconfiguration
+ (if any) is needed.
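
The conventions described in this README (event name as the first
argument, event-specific arguments after it, 0 for success and non-zero
for failure) can be sketched as a minimal event script body. This is an
illustration only: the function name handle_event, the messages it
prints and the idea of a "99.example" script are assumptions made for
this sketch and are not part of CTDB.

```shell
#!/bin/sh
# Hypothetical sketch of an event script such as "99.example".
# handle_event and the echoed messages are illustrative assumptions.

# Dispatch on the event name given as the first argument.
# Returns 0 on success and non-zero on failure.
handle_event () {
	event="$1"
	[ $# -gt 0 ] && shift
	case "$event" in
	startup)
		echo "starting example service"
		;;
	shutdown)
		echo "stopping example service"
		;;
	monitor)
		echo "checking example service"
		;;
	takeip|releaseip)
		# Arguments: <interface> <ip-address> <netmask-bits>
		if [ $# -ne 3 ] ; then
			echo "usage: $event <interface> <ip-address> <netmask-bits>" >&2
			return 1
		fi
		echo "$event: $2/$3 on $1"
		;;
	*)
		# Events that this script does not implement succeed quietly
		:
		;;
	esac
	return 0
}

# A real event script would finish by calling: handle_event "$@"
```

Letting unhandled events fall through to success means a script only
needs to implement the events that matter to its service.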