diff options
Diffstat (limited to '')
20 files changed, 2714 insertions, 0 deletions
diff --git a/ctdb/config/events/README b/ctdb/config/events/README new file mode 100644 index 0000000..6553830 --- /dev/null +++ b/ctdb/config/events/README @@ -0,0 +1,193 @@ +The events/ directory contains event scripts used by CTDB. Event +scripts are triggered on certain events, such as startup, monitoring +or public IP allocation. Scripts may be specific to services, +networking or internal CTDB operations. + +Scripts are divided into subdirectories for different CTDB components. +Right now the only component is "legacy". + +All event scripts start with the prefix 'NN.' where N is a digit. The +event scripts are run in sequence based on NN. Thus 10.interface will +be run before 60.nfs. It is recommended to keep each NN unique. +However, scripts with the same NN prefix will be executed in +alphanumeric sort order. + +As a special case, any eventscript that ends with a '~' character will be +ignored since this is a common postfix that some editors will append to +older versions of a file. Similarly, any eventscript with multiple '.'s +will be ignored as package managers can create copies with additional +suffix starting with '.' (e.g. .rpmnew, .dpkg-dist). + +Only executable event scripts are run by CTDB. Any event script that +does not have execute permission is ignored. + +The eventscripts are called with varying number of arguments. The +first argument is the event name and the rest of the arguments depend +on the event name. + +Event scripts must return 0 for success and non-zero for failure. + +Output of event scripts is logged. On failure the output of the +failing event script is included in the output of "ctdb scriptstatus". + +The following events are supported (with arguments shown): + +init + + This event is triggered once when CTDB is starting up. This + event is used to do some basic cleanup and initialisation. + + During the "init" event CTDB is not listening on its Unix + domain socket, so the "ctdb" CLI will not work. + + Failure of this event will cause CTDB to terminate. + + Example: 00.ctdb creates $CTDB_SCRIPT_VARDIR + +setup + + This event is triggered once, after the "init" event has + completed. + + For this and any subsequent events the CTDB Unix domain socket + is available, so the "ctdb" CLI will work. + + Failure of this event will cause CTDB to terminate. + + Example: 11.natgw checks that it has valid configuration + +startup + + This event is triggered after the "setup" event has completed + and CTDB has finished its initial database recovery. + + This event starts all services that are managed by CTDB. Each + service that is managed by CTDB should implement this event + and use it to (re)start the service. + + If the "startup" event fails then CTDB will retry it until it + succeeds. There is no limit on the number of retries. + + Example: 50.samba uses this event to start the Samba daemon. + +shutdown + + This event is triggered when CTDB is shutting down. + + This event shuts down all services that are managed by CTDB. + Each service that is managed by CTDB should implement this + event and use it to stop the service. + + Example: 50.samba uses this event to shut down the Samba + daemon. + +monitor + + This event is run periodically. The interval between + successive "monitor" events is configured using the + MonitorInterval tunable, which defaults to 15 seconds. + + This event is triggered by CTDB to continuously monitor that + all managed services are healthy. If all event scripts + complete then the monitor event successfully then the node is + marked HEALTHY. If any event script fails then no subsequent + scripts will be run for that event and the node is marked + UNHEALTHY. + + Each service that is managed by CTDB should implement this + event and use it to monitor the service. + + Example: 10.interface checks that each configured interface + for public IP addresses has a physical link established. + +startrecovery + + This event is triggered every time a database recovery process + is started. + + This is rarely used. + +recovered + + This event is triggered every time a database recovery process + is completed. + + This is rarely used. + +takeip <interface> <ip-address> <netmask-bits> + + This event is triggered for each public IP address taken by a + node during IP address (re)assignment. Multiple "takeip" + events can be run in parallel if multiple IP addresses are + being assigned. + + Example: In 10.interface the "ip" command (from the Linux + iproute2 package) is used to add the specified public IP + address to the specified interface. The "ip" command can + safely be run concurrently. However, the "iptables" command + cannot be run concurrently so a wrapper is used to serialise + runs using exclusive locking. + + If substantial work is required to reconfigure a service when + a public IP address is taken over it can be better to defer + service reconfiguration to the "ipreallocated" event, after + all IP addresses have been assigned. + + Example: 60.nfs uses ctdb_service_set_reconfigure() to flag + that public IP addresses have changed so that service + reconfiguration will occur in the "ipreallocated" event. + +releaseip <interface> <ip-address> <netmask-bits> + + This event is triggered for each public IP address released by + a node during IP address (re)assignment. Multiple "releaseip" + events can be run in parallel if multiple IP addresses are + being unassigned. + + In all other regards, this event is analogous to the "takeip" + event above. + +updateip <old-interface> <new-interface> <ip-address> <netmask-bits> + + This event is triggered for each public IP address moved + between interfaces on a node during IP address (re)assignment. + Multiple "updateip" events can be run in parallel if multiple + IP addresses are being moved. + + This event is only used if multiple interfaces are capable of + hosting an IP address, as specified in the public addresses + configuration file. + + This event is similar to the "takeip" event above. + +ipreallocated + + This event is triggered on all nodes as the last step of + public IP address (re)assignment. It is unconditionally + triggered after any "releaseip", "takeip" and "updateip" + events, even though these events may not run on some nodes if + there are no relevant changes. That is, the "ipreallocated" + event is triggered unconditionally, even on nodes where public + IP addresses assignments have not changed. + + This event is used to reconfigure services. + + Since "ipreallocated" is always run, this allows + reconfiguration to depend on the states of other nodes rather + that just IP addresses. + + Example: 11.natgw recalculates the NAT gateway master and + updates the relevant network configuration on each node if the + NAT gateway master has changed. + +Additional notes for "takeip", "releaseip", "updateip", +"ipreallocated": + +* Failure of any of these events causes IP allocation to be retried. + +* An event script can use ctdb_service_set_reconfigure() in "takeip", + "releaseip" or "updateip" events to flag that its service needs to + be reconfigured. The "ipreallocated" event can then use + ctdb_service_needs_reconfigure() to test if there were public IPs + changes to determine what type of reconfiguration (if any) is + needed. diff --git a/ctdb/config/events/legacy/00.ctdb.script b/ctdb/config/events/legacy/00.ctdb.script new file mode 100755 index 0000000..81c16af --- /dev/null +++ b/ctdb/config/events/legacy/00.ctdb.script @@ -0,0 +1,130 @@ +#!/bin/sh + +# Event script for ctdb-specific setup and other things that don't fit +# elsewhere. + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +load_script_options + +############################################################ + +# type is commonly supported and more portable than which(1) +# shellcheck disable=SC2039 +select_tdb_checker () +{ + # Find the best TDB consistency check available. + use_tdb_tool_check=false + type tdbtool >/dev/null 2>&1 && found_tdbtool=true + type tdbdump >/dev/null 2>&1 && found_tdbdump=true + + if $found_tdbtool && echo "help" | tdbtool | grep -q check ; then + use_tdb_tool_check=true + elif $found_tdbtool && $found_tdbdump ; then + cat <<EOF +WARNING: The installed 'tdbtool' does not offer the 'check' subcommand. + Using 'tdbdump' for database checks. + Consider updating 'tdbtool' for better checks! +EOF + elif $found_tdbdump ; then + cat <<EOF +WARNING: 'tdbtool' is not available. + Using 'tdbdump' to check the databases. + Consider installing a recent 'tdbtool' for better checks! +EOF + else + cat <<EOF +WARNING: Cannot check databases since neither + 'tdbdump' nor 'tdbtool check' is available. + Consider installing tdbtool or at least tdbdump! +EOF + return 1 + fi +} + +check_tdb () +{ + _db="$1" + + if $use_tdb_tool_check ; then + # tdbtool always exits with 0 :-( + if timeout 10 tdbtool "$_db" check 2>/dev/null | + grep -q "Database integrity is OK" ; then + return 0 + else + return 1 + fi + else + timeout 10 tdbdump "$_db" >/dev/null 2>/dev/null + return $? + fi +} + +check_persistent_databases () +{ + _dir="${CTDB_DBDIR_PERSISTENT:-${CTDB_VARDIR}/persistent}" + [ -d "$_dir" ] || return 0 + + for _db in "$_dir/"*.tdb.*[0-9] ; do + [ -r "$_db" ] || continue + check_tdb "$_db" || \ + die "Persistent database $_db is corrupted! CTDB will not start." + done +} + +check_non_persistent_databases () +{ + _dir="${CTDB_DBDIR:-${CTDB_VARDIR}}" + [ -d "$_dir" ] || return 0 + + for _db in "${_dir}/"*.tdb.*[0-9] ; do + [ -r "$_db" ] || continue + check_tdb "$_db" || { + _backup="${_db}.$(date +'%Y%m%d.%H%M%S').corrupt" + cat <<EOF +WARNING: database ${_db} is corrupted. + Moving to backup ${_backup} for later analysis. +EOF + mv "$_db" "$_backup" + + # Now remove excess backups + _max="${CTDB_MAX_CORRUPT_DB_BACKUPS:-10}" + _bdb="${_db##*/}" # basename + find "$_dir" -name "${_bdb}.*.corrupt" | + sort -r | + tail -n +$((_max + 1)) | + xargs rm -f + } + done +} + +############################################################ + +ctdb_check_args "$@" + +case "$1" in +init) + # make sure we have a blank state directory for the scripts to work with + rm -rf "$CTDB_SCRIPT_VARDIR" + mkdir -p "$CTDB_SCRIPT_VARDIR" || \ + die "mkdir -p ${CTDB_SCRIPT_VARDIR} - failed - $?" $? + + # Load/cache database options from configuration file + ctdb_get_db_options + + if select_tdb_checker ; then + check_persistent_databases || exit $? + check_non_persistent_databases + fi + ;; + +startup) + $CTDB attach ctdb.tdb persistent + ;; +esac + +# all OK +exit 0 diff --git a/ctdb/config/events/legacy/01.reclock.script b/ctdb/config/events/legacy/01.reclock.script new file mode 100755 index 0000000..0406875 --- /dev/null +++ b/ctdb/config/events/legacy/01.reclock.script @@ -0,0 +1,34 @@ +#!/bin/sh +# script to check accessibility to the reclock file on a node + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +case "$1" in +init) + recovery_lock=$("${CTDB_HELPER_BINDIR}/ctdb-config" \ + get cluster "recovery lock") + # xshellcheck disable=SC2181 + # Above is already complicated enough without embedding into "if" + case $? in + 0) : ;; + 2) exit 0 ;; # ENOENT: not configured + *) die "Unexpected error getting recovery lock configuration" + esac + + if [ -z "$recovery_lock" ] ; then + exit 0 + fi + + # If a helper is specified then exit because this script can't + # do anything useful + case "$recovery_lock" in + !*) exit 0 ;; + esac + + d=$(dirname "$recovery_lock") + mkdir -p "$d" + ;; +esac diff --git a/ctdb/config/events/legacy/05.system.script b/ctdb/config/events/legacy/05.system.script new file mode 100755 index 0000000..bf36dd2 --- /dev/null +++ b/ctdb/config/events/legacy/05.system.script @@ -0,0 +1,198 @@ +#!/bin/sh +# ctdb event script for checking local file system utilization + +[ -n "$CTDB_BASE" ] || + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +load_script_options + +ctdb_setup_state_dir "service" "system-monitoring" + +validate_percentage() +{ + case "$1" in + "") return 1 ;; # A failure that doesn't need a warning + [0-9] | [0-9][0-9] | 100) return 0 ;; + *) + echo "WARNING: ${1} is an invalid percentage in \"${2}\" check" + return 1 + ;; + esac +} + +check_thresholds() +{ + _thing="$1" + _thresholds="$2" + _usage="$3" + _unhealthy_callout="$4" + + case "$_thresholds" in + *:*) + _warn_threshold="${_thresholds%:*}" + _unhealthy_threshold="${_thresholds#*:}" + ;; + *) + _warn_threshold="$_thresholds" + _unhealthy_threshold="" + ;; + esac + + _t=$(echo "$_thing" | sed -e 's@/@SLASH_@g' -e 's@ @_@g') + # script_state_dir set by ctdb_setup_state_dir() + # shellcheck disable=SC2154 + _cache="${script_state_dir}/cache_${_t}" + if [ -r "$_cache" ]; then + read -r _prev <"$_cache" + else + _prev=0 + fi + if validate_percentage "$_unhealthy_threshold" "$_thing"; then + if [ "$_usage" -ge "$_unhealthy_threshold" ]; then + printf 'ERROR: %s utilization %d%% >= threshold %d%%\n' \ + "$_thing" \ + "$_usage" \ + "$_unhealthy_threshold" + # Only run unhealthy callout if passing the + # unhealthy threshold. That is, if the + # previous usage was below the threshold. + if [ "$_prev" -lt "$_unhealthy_threshold" ]; then + eval "$_unhealthy_callout" + fi + echo "$_usage" >"$_cache" + exit 1 + fi + fi + + if validate_percentage "$_warn_threshold" "$_thing"; then + if [ "$_usage" -ge "$_warn_threshold" ]; then + if [ "$_usage" = "$_prev" ]; then + return + fi + printf 'WARNING: %s utilization %d%% >= threshold %d%%\n' \ + "$_thing" \ + "$_usage" \ + "$_warn_threshold" + echo "$_usage" >"$_cache" + else + if [ ! -r "$_cache" ]; then + return + fi + printf 'NOTICE: %s utilization %d%% < threshold %d%%\n' \ + "$_thing" \ + "$_usage" \ + "$_warn_threshold" + rm -f "$_cache" + fi + fi +} + +set_monitor_filsystem_usage_defaults() +{ + _fs_defaults_cache="${script_state_dir}/cache_filsystem_usage_defaults" + + if [ ! -r "$_fs_defaults_cache" ]; then + # Determine filesystem for each database directory, generate + # an entry to warn at 90%, de-duplicate entries, put all items + # on 1 line (so the read below gets everything) + for _t in "${CTDB_DBDIR:-${CTDB_VARDIR}}" \ + "${CTDB_DBDIR_PERSISTENT:-${CTDB_VARDIR}/persistent}" \ + "${CTDB_DBDIR_STATE:-${CTDB_VARDIR}/state}"; do + df -kP "$_t" | awk 'NR == 2 { printf "%s:90\n", $6 }' + done | sort -u | xargs >"$_fs_defaults_cache" + fi + + read -r CTDB_MONITOR_FILESYSTEM_USAGE <"$_fs_defaults_cache" +} + +monitor_filesystem_usage() +{ + if [ -z "$CTDB_MONITOR_FILESYSTEM_USAGE" ]; then + set_monitor_filsystem_usage_defaults + fi + + # Check each specified filesystem, specified in format + # <fs_mount>:<fs_warn_threshold>[:fs_unhealthy_threshold] + for _fs in $CTDB_MONITOR_FILESYSTEM_USAGE; do + _fs_mount="${_fs%%:*}" + _fs_thresholds="${_fs#*:}" + + if [ ! -d "$_fs_mount" ]; then + echo "WARNING: Directory ${_fs_mount} does not exist" + continue + fi + + # Get current utilization + _fs_usage=$(df -kP "$_fs_mount" | + sed -n -e 's@.*[[:space:]]\([[:digit:]]*\)%.*@\1@p') + if [ -z "$_fs_usage" ]; then + printf 'WARNING: Unable to get FS utilization for %s\n' \ + "$_fs_mount" + continue + fi + + check_thresholds "Filesystem ${_fs_mount}" \ + "$_fs_thresholds" \ + "$_fs_usage" + done +} + +# shellcheck disable=SC2317 +# Called indirectly via check_thresholds() +dump_memory_info() +{ + get_proc "meminfo" + ps auxfww + set_proc "sysrq-trigger" "m" +} + +monitor_memory_usage() +{ + # Defaults + if [ -z "$CTDB_MONITOR_MEMORY_USAGE" ]; then + CTDB_MONITOR_MEMORY_USAGE=80 + fi + + _meminfo=$(get_proc "meminfo") + # Intentional word splitting here + # shellcheck disable=SC2046 + set -- $(echo "$_meminfo" | awk ' +$1 == "MemAvailable:" { memavail += $2 } +$1 == "MemFree:" { memfree += $2 } +$1 == "Cached:" { memfree += $2 } +$1 == "Buffers:" { memfree += $2 } +$1 == "MemTotal:" { memtotal = $2 } +$1 == "SwapFree:" { swapfree = $2 } +$1 == "SwapTotal:" { swaptotal = $2 } +END { + if (memavail != 0) { memfree = memavail ; } + if (memtotal + swaptotal != 0) { + usedtotal = memtotal - memfree + swaptotal - swapfree + print int(usedtotal / (memtotal + swaptotal) * 100) + } else { + print 0 + } +}') + _mem_usage="$1" + + check_thresholds "System memory" \ + "$CTDB_MONITOR_MEMORY_USAGE" \ + "$_mem_usage" \ + dump_memory_info +} + +case "$1" in +monitor) + # Load/cache database options from configuration file + ctdb_get_db_options + + rc=0 + monitor_filesystem_usage || rc=$? + monitor_memory_usage || rc=$? + exit $rc + ;; +esac + +exit 0 diff --git a/ctdb/config/events/legacy/06.nfs.script b/ctdb/config/events/legacy/06.nfs.script new file mode 100755 index 0000000..b937d43 --- /dev/null +++ b/ctdb/config/events/legacy/06.nfs.script @@ -0,0 +1,39 @@ +#!/bin/sh +# script to manage nfs in a clustered environment + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +service_name="nfs" + +load_script_options "service" "60.nfs" + +ctdb_setup_state_dir "service" "$service_name" + +###################################################################### + +nfs_callout_pre () +{ + _event="$1" + shift + + nfs_callout "${_event}-pre" "$@" +} + +###################################################################### + +# script_state_dir set by ctdb_setup_state_dir() +# shellcheck disable=SC2154 +nfs_callout_init "$script_state_dir" + +case "$1" in +takeip) + nfs_callout_pre "$@" + ;; + +releaseip) + nfs_callout_pre "$@" + ;; +esac diff --git a/ctdb/config/events/legacy/10.interface.script b/ctdb/config/events/legacy/10.interface.script new file mode 100755 index 0000000..fead88c --- /dev/null +++ b/ctdb/config/events/legacy/10.interface.script @@ -0,0 +1,262 @@ +#!/bin/sh + +################################# +# interface event script for ctdb +# this adds/removes IPs from your +# public interface + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +load_script_options + +ctdb_public_addresses="${CTDB_BASE}/public_addresses" + +if [ ! -f "$ctdb_public_addresses" ]; then + if [ "$1" = "init" ] ; then + echo "No public addresses file found" + fi + exit 0 +fi + +# This sets $all_interfaces as a side-effect. +get_all_interfaces () +{ + # Get all the interfaces listed in the public_addresses file + all_interfaces=$(sed -e '/^#.*/d' \ + -e 's/^[^\t ]*[\t ]*//' \ + -e 's/,/ /g' \ + -e 's/[\t ]*$//' "$ctdb_public_addresses") + + # Get the interfaces for which CTDB has public IPs configured. + # That is, for all but the 1st line, get the 1st field. + ctdb_ifaces=$($CTDB -X ifaces | sed -e '1d' -e 's@^|@@' -e 's@|.*@@') + + # Add $ctdb_ifaces and make $all_interfaces unique + # Use word splitting to squash whitespace + # shellcheck disable=SC2086 + all_interfaces=$(echo $all_interfaces $ctdb_ifaces | tr ' ' '\n' | sort -u) +} + +monitor_interfaces() +{ + get_all_interfaces + + down_interfaces_found=false + up_interfaces_found=false + + # Note that this loop must not exit early. It must process + # all interfaces so that the correct state for each interface + # is set in CTDB using setifacelink. + for _iface in $all_interfaces ; do + if interface_monitor "$_iface" ; then + up_interfaces_found=true + $CTDB setifacelink "$_iface" up >/dev/null 2>&1 + else + down_interfaces_found=true + $CTDB setifacelink "$_iface" down >/dev/null 2>&1 + fi + done + + if ! $down_interfaces_found ; then + return 0 + fi + + if ! $up_interfaces_found ; then + return 1 + fi + + if [ "$CTDB_PARTIALLY_ONLINE_INTERFACES" != "yes" ]; then + return 1 + fi + + return 0 +} + +# Sets: iface, ip, maskbits +get_iface_ip_maskbits () +{ + _iface_in="$1" + ip="$2" + _maskbits_in="$3" + + # Intentional word splitting here + # shellcheck disable=SC2046 + set -- $(ip_maskbits_iface "$ip") + if [ -n "$1" ] ; then + maskbits="$1" + iface="$2" + + if [ "$iface" != "$_iface_in" ] ; then + printf \ + 'WARNING: Public IP %s hosted on interface %s but VNN says %s\n' \ + "$ip" "$iface" "$_iface_in" + fi + if [ "$maskbits" != "$_maskbits_in" ] ; then + printf \ + 'WARNING: Public IP %s has %s bit netmask but VNN says %s\n' \ + "$ip" "$maskbits" "$_maskbits_in" + fi + else + die "ERROR: Unable to determine interface for IP ${ip}" + fi +} + +ip_block () +{ + _ip="$1" + _iface="$2" + + case "$_ip" in + *:*) _family="inet6" ;; + *) _family="inet" ;; + esac + + # Extra delete copes with previously killed script + iptables_wrapper "$_family" \ + -D INPUT -i "$_iface" -d "$_ip" -j DROP 2>/dev/null + iptables_wrapper "$_family" \ + -I INPUT -i "$_iface" -d "$_ip" -j DROP +} + +ip_unblock () +{ + _ip="$1" + _iface="$2" + + case "$_ip" in + *:*) _family="inet6" ;; + *) _family="inet" ;; + esac + + iptables_wrapper "$_family" \ + -D INPUT -i "$_iface" -d "$_ip" -j DROP 2>/dev/null +} + +ctdb_check_args "$@" + +case "$1" in +init) + # make sure that we only respond to ARP messages from the NIC where + # a particular ip address is associated. + get_proc sys/net/ipv4/conf/all/arp_filter >/dev/null 2>&1 && { + set_proc sys/net/ipv4/conf/all/arp_filter 1 + } + + _promote="sys/net/ipv4/conf/all/promote_secondaries" + get_proc "$_promote" >/dev/null 2>&1 || \ + die "Public IPs only supported if promote_secondaries is available" + + # make sure we drop any ips that might still be held if + # previous instance of ctdb got killed with -9 or similar + drop_all_public_ips + ;; + +startup) + monitor_interfaces + ;; + +shutdown) + drop_all_public_ips + ;; + +takeip) + iface=$2 + ip=$3 + maskbits=$4 + + add_ip_to_iface "$iface" "$ip" "$maskbits" || { + exit 1; + } + + # In case a previous "releaseip" for this IP was killed... + ip_unblock "$ip" "$iface" + + flush_route_cache + ;; + +releaseip) + # releasing an IP is a bit more complex than it seems. Once the IP + # is released, any open tcp connections to that IP on this host will end + # up being stuck. Some of them (such as NFS connections) will be unkillable + # so we need to use the killtcp ctdb function to kill them off. We also + # need to make sure that no new connections get established while we are + # doing this! So what we do is this: + # 1) firewall this IP, so no new external packets arrive for it + # 2) find existing connections, and kill them + # 3) remove the IP from the interface + # 4) remove the firewall rule + shift + get_iface_ip_maskbits "$@" + + ip_block "$ip" "$iface" + + kill_tcp_connections "$iface" "$ip" + + delete_ip_from_iface "$iface" "$ip" "$maskbits" || { + ip_unblock "$ip" "$iface" + exit 1 + } + + ip_unblock "$ip" "$iface" + + flush_route_cache + ;; + +updateip) + # moving an IP is a bit more complex than it seems. + # First we drop all traffic on the old interface. + # Then we try to add the ip to the new interface and before + # we finally remove it from the old interface. + # + # 1) firewall this IP, so no new external packets arrive for it + # 2) remove the IP from the old interface (and new interface, to be sure) + # 3) add the IP to the new interface + # 4) remove the firewall rule + # 5) use ctdb gratarp to propagate the new mac address + # 6) use netstat -tn to find existing connections, and tickle them + _oiface=$2 + niface=$3 + _ip=$4 + _maskbits=$5 + + get_iface_ip_maskbits "$_oiface" "$_ip" "$_maskbits" + oiface="$iface" + + # Could check maskbits too. However, that should never change + # so we want to notice if it does. + if [ "$oiface" = "$niface" ] ; then + echo "Redundant \"updateip\" - ${ip} already on ${niface}" + exit 0 + fi + + ip_block "$ip" "$oiface" + + delete_ip_from_iface "$oiface" "$ip" "$maskbits" 2>/dev/null + delete_ip_from_iface "$niface" "$ip" "$maskbits" 2>/dev/null + + add_ip_to_iface "$niface" "$ip" "$maskbits" || { + ip_unblock "$ip" "$oiface" + exit 1 + } + + ip_unblock "$ip" "$oiface" + + flush_route_cache + + # propagate the new mac address + $CTDB gratarp "$ip" "$niface" + + # tickle all existing connections, so that dropped packets + # are retransmitted and the tcp streams work + tickle_tcp_connections "$ip" + ;; + +monitor) + monitor_interfaces || exit 1 + ;; +esac + +exit 0 diff --git a/ctdb/config/events/legacy/11.natgw.script b/ctdb/config/events/legacy/11.natgw.script new file mode 100755 index 0000000..fb93dea --- /dev/null +++ b/ctdb/config/events/legacy/11.natgw.script @@ -0,0 +1,242 @@ +#!/bin/sh +# Script to set up one of the nodes as a NAT gateway for all other nodes. +# This is used to ensure that all nodes in the cluster can still originate +# traffic to the external network even if there are no public addresses +# available. +# + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +service_name="natgw" + +load_script_options + +[ -n "$CTDB_NATGW_NODES" ] || exit 0 +export CTDB_NATGW_NODES + +ctdb_setup_state_dir "failover" "$service_name" + +# script_state_dir set by ctdb_setup_state_dir() +# shellcheck disable=SC2154 +natgw_cfg_new="${script_state_dir}/cfg_new" +natgw_cfg_old="${script_state_dir}/cfg_old" +natgw_leader_old="${script_state_dir}/leader_old" + +ctdb_natgw_follower_only () +{ + _ip_address=$(ctdb_get_ip_address) + + awk -v my_ip="$_ip_address" \ + '$1 == my_ip { if ($2 ~ "follower-only") { exit 0 } else { exit 1 } }' \ + "$CTDB_NATGW_NODES" +} + +natgw_check_config () +{ + [ -r "$CTDB_NATGW_NODES" ] || \ + die "error: CTDB_NATGW_NODES=${CTDB_NATGW_NODES} unreadable" + if ! ctdb_natgw_follower_only ; then + [ -n "$CTDB_NATGW_PUBLIC_IP" ] || \ + die "Invalid configuration: CTDB_NATGW_PUBLIC_IP not set" + [ -n "$CTDB_NATGW_PUBLIC_IFACE" ] || \ + die "Invalid configuration: CTDB_NATGW_PUBLIC_IFACE not set" + fi + [ -n "$CTDB_NATGW_PRIVATE_NETWORK" ] || \ + die "Invalid configuration: CTDB_NATGW_PRIVATE_NETWORK not set" + + # The default is to create a single default route + [ -n "$CTDB_NATGW_STATIC_ROUTES" ] || CTDB_NATGW_STATIC_ROUTES="0.0.0.0/0" +} + +natgw_write_config () +{ + _f="$1" + + cat >"$_f" <<EOF +CTDB_NATGW_NODES="$CTDB_NATGW_NODES" +CTDB_NATGW_PUBLIC_IP="$CTDB_NATGW_PUBLIC_IP" +CTDB_NATGW_PUBLIC_IFACE="$CTDB_NATGW_PUBLIC_IFACE" +CTDB_NATGW_DEFAULT_GATEWAY="$CTDB_NATGW_DEFAULT_GATEWAY" +CTDB_NATGW_PRIVATE_NETWORK="$CTDB_NATGW_PRIVATE_NETWORK" +CTDB_NATGW_STATIC_ROUTES="$CTDB_NATGW_STATIC_ROUTES" +EOF +} + +natgw_config_has_changed () +{ + natgw_write_config "$natgw_cfg_new" + + # Non-existent old returns true, no log message + if [ ! -f "$natgw_cfg_old" ] ; then + return 0 + fi + + # Handle no change + if cmp "$natgw_cfg_old" "$natgw_cfg_new" >/dev/null 2>&1 ; then + return 1 + fi + + echo "NAT gateway configuration has changed" + return 0 +} + +_natgw_clear () +{ + _ip="${CTDB_NATGW_PUBLIC_IP%/*}" + _maskbits="${CTDB_NATGW_PUBLIC_IP#*/}" + + delete_ip_from_iface \ + "$CTDB_NATGW_PUBLIC_IFACE" "$_ip" "$_maskbits" >/dev/null 2>&1 + for _net_gw in $CTDB_NATGW_STATIC_ROUTES ; do + _net="${_net_gw%@*}" + ip route del "$_net" metric 10 >/dev/null 2>/dev/null + done + + # Delete the masquerading setup from a previous iteration where we + # were the NAT-GW + iptables -D POSTROUTING -t nat \ + -s "$CTDB_NATGW_PRIVATE_NETWORK" ! -d "$CTDB_NATGW_PRIVATE_NETWORK" \ + -j MASQUERADE >/dev/null 2>/dev/null + + iptables -D INPUT -p tcp --syn -d "${_ip}/32" -j REJECT 2>/dev/null +} + +natgw_clear () +{ + if [ -r "$natgw_cfg_old" ] ; then + (. "$natgw_cfg_old" ; _natgw_clear) + else + _natgw_clear + fi +} + +natgw_set_leader () +{ + set_proc sys/net/ipv4/ip_forward 1 + iptables -A POSTROUTING -t nat \ + -s "$CTDB_NATGW_PRIVATE_NETWORK" ! -d "$CTDB_NATGW_PRIVATE_NETWORK" \ + -j MASQUERADE + + # block all incoming connections to the NATGW IP address + ctdb_natgw_public_ip_host="${CTDB_NATGW_PUBLIC_IP%/*}/32" + iptables -D INPUT -p tcp --syn \ + -d "$ctdb_natgw_public_ip_host" -j REJECT 2>/dev/null + iptables -I INPUT -p tcp --syn \ + -d "$ctdb_natgw_public_ip_host" -j REJECT 2>/dev/null + + ip addr add "$CTDB_NATGW_PUBLIC_IP" dev "$CTDB_NATGW_PUBLIC_IFACE" + for _net_gw in $CTDB_NATGW_STATIC_ROUTES ; do + _net="${_net_gw%@*}" + if [ "$_net" != "$_net_gw" ] ; then + _gw="${_net_gw#*@}" + else + _gw="$CTDB_NATGW_DEFAULT_GATEWAY" + fi + + [ -n "$_gw" ] || continue + ip route add "$_net" metric 10 via "$_gw" + done +} + +natgw_set_follower () +{ + _natgwip="$1" + + for _net_gw in $CTDB_NATGW_STATIC_ROUTES ; do + _net="${_net_gw%@*}" + ip route add "$_net" via "$_natgwip" metric 10 + done +} + +natgw_ensure_leader () +{ + # Intentional word splitting here + # shellcheck disable=SC2046 + set -- $("${CTDB_HELPER_BINDIR}/ctdb_natgw" leader) + natgwleader="${1:--1}" # Default is -1, for failure above + natgwip="$2" + + if [ "$natgwleader" = "-1" ]; then + # Fail... + die "There is no NATGW leader node" + fi +} + +natgw_leader_has_changed () +{ + if [ -r "$natgw_leader_old" ] ; then + read _old_natgwleader <"$natgw_leader_old" + else + _old_natgwleader="" + fi + [ "$_old_natgwleader" != "$natgwleader" ] +} + +natgw_save_state () +{ + echo "$natgwleader" >"$natgw_leader_old" + # Created by natgw_config_has_changed() + mv "$natgw_cfg_new" "$natgw_cfg_old" +} + + +case "$1" in +setup) + natgw_check_config + ;; + +startup) + natgw_check_config + + # Error if CTDB_NATGW_PUBLIC_IP is listed in public addresses + ip_pat=$(echo "$CTDB_NATGW_PUBLIC_IP" | sed -e 's@\.@\\.@g') + ctdb_public_addresses="${CTDB_BASE}/public_addresses" + if grep -q "^${ip_pat}[[:space:]]" "$ctdb_public_addresses" ; then + die "ERROR: CTDB_NATGW_PUBLIC_IP same as a public address" + fi + + # do not send out arp requests from loopback addresses + set_proc sys/net/ipv4/conf/all/arp_announce 2 + ;; + +updatenatgw|ipreallocated) + natgw_check_config + + natgw_ensure_leader + + natgw_config_has_changed || natgw_leader_has_changed || exit 0 + + natgw_clear + + pnn=$(ctdb_get_pnn) + if [ "$pnn" = "$natgwleader" ]; then + natgw_set_leader + else + natgw_set_follower "$natgwip" + fi + + # flush our route cache + set_proc sys/net/ipv4/route/flush 1 + + # Only update saved state when NATGW successfully updated + natgw_save_state + ;; + +shutdown|removenatgw) + natgw_check_config + natgw_clear + ;; + +monitor) + natgw_check_config + + if [ -n "$CTDB_NATGW_PUBLIC_IFACE" ] ; then + interface_monitor "$CTDB_NATGW_PUBLIC_IFACE" || exit 1 + fi + ;; +esac + +exit 0 diff --git a/ctdb/config/events/legacy/11.routing.script b/ctdb/config/events/legacy/11.routing.script new file mode 100755 index 0000000..7ba7f3b --- /dev/null +++ b/ctdb/config/events/legacy/11.routing.script @@ -0,0 +1,49 @@ +#!/bin/sh + +# Attempt to add a set of static routes. +# +# Do this in "ipreallocated" rather than just "startup" because some +# of the routes might be missing because the corresponding interface +# has not previously had any IPs assigned or IPs were previously +# released and corresponding routes were dropped. +# +# Addition of some routes might fail, errors go to /dev/null. +# +# Routes to add are defined in $CTDB_BASE/static-routes. Syntax is: +# +# IFACE NET/MASK GATEWAY +# +# Example: +# +# bond1 10.3.3.0/24 10.0.0.1 + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +load_script_options + +[ -f "${CTDB_BASE}/static-routes" ] || { + exit 0 +} + +case "$1" in +ipreallocated) + while read iface dest gw; do + ip route add "$dest" via "$gw" dev "$iface" >/dev/null 2>&1 + done <"${CTDB_BASE}/static-routes" + ;; + +updateip) + oiface=$2 + niface=$3 + while read iface dest gw; do + if [ "$niface" = "$iface" ] || [ "$oiface" = "$iface" ] ; then + ip route add "$dest" via "$gw" dev "$iface" >/dev/null 2>&1 + fi + done <"${CTDB_BASE}/static-routes" + ;; +esac + +exit 0 diff --git a/ctdb/config/events/legacy/13.per_ip_routing.script b/ctdb/config/events/legacy/13.per_ip_routing.script new file mode 100755 index 0000000..d7949c6 --- /dev/null +++ b/ctdb/config/events/legacy/13.per_ip_routing.script @@ -0,0 +1,438 @@ +#!/bin/sh + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +load_script_options + +service_name="per_ip_routing" + +# Do nothing if unconfigured +[ -n "$CTDB_PER_IP_ROUTING_CONF" ] || exit 0 + +table_id_prefix="ctdb." + +[ -n "$CTDB_PER_IP_ROUTING_RULE_PREF" ] || \ + die "error: CTDB_PER_IP_ROUTING_RULE_PREF not configured" + +[ "$CTDB_PER_IP_ROUTING_TABLE_ID_LOW" -lt "$CTDB_PER_IP_ROUTING_TABLE_ID_HIGH" ] 2>/dev/null || \ + die "error: CTDB_PER_IP_ROUTING_TABLE_ID_LOW[$CTDB_PER_IP_ROUTING_TABLE_ID_LOW] and/or CTDB_PER_IP_ROUTING_TABLE_ID_HIGH[$CTDB_PER_IP_ROUTING_TABLE_ID_HIGH] improperly configured" + +if [ "$CTDB_PER_IP_ROUTING_TABLE_ID_LOW" -le 253 ] && \ + [ 255 -le "$CTDB_PER_IP_ROUTING_TABLE_ID_HIGH" ] ; then + die "error: range CTDB_PER_IP_ROUTING_TABLE_ID_LOW[$CTDB_PER_IP_ROUTING_TABLE_ID_LOW]..CTDB_PER_IP_ROUTING_TABLE_ID_HIGH[$CTDB_PER_IP_ROUTING_TABLE_ID_HIGH] must not include 253-255" +fi + +have_link_local_config () +{ + [ "$CTDB_PER_IP_ROUTING_CONF" = "__auto_link_local__" ] +} + +if ! have_link_local_config && [ ! -r "$CTDB_PER_IP_ROUTING_CONF" ] ; then + die "error: CTDB_PER_IP_ROUTING_CONF=$CTDB_PER_IP_ROUTING_CONF file not found" +fi + +ctdb_setup_state_dir "failover" "$service_name" + +###################################################################### + +ipv4_is_valid_addr() +{ + _ip="$1" + + _count=0 + # Get the shell to break up the address into 1 word per octet + # Intentional word splitting here + # shellcheck disable=SC2086 + for _o in $(export IFS="." ; echo $_ip) ; do + # The 2>/dev/null stops output from failures where an "octet" + # is not numeric. The test will still fail. + if ! [ 0 -le $_o ] && [ $_o -le 255 ] 2>/dev/null ; then + return 1 + fi + _count=$((_count + 1)) + done + + # A valid IPv4 address has 4 octets + [ $_count -eq 4 ] +} + +ensure_ipv4_is_valid_addr () +{ + _event="$1" + _ip="$2" + + ipv4_is_valid_addr "$_ip" || { + echo "$0: $_event not an ipv4 address skipping IP:$_ip" + exit 0 + } +} + +ipv4_host_addr_to_net () +{ + _host="$1" + _maskbits="$2" + + # Convert the host address to an unsigned long by splitting out + # the octets and doing the math. + _host_ul=0 + # Intentional word splitting here + # shellcheck disable=SC2086 + for _o in $(export IFS="." ; echo $_host) ; do + _host_ul=$(( (_host_ul << 8) + _o)) # work around Emacs color bug + done + + # Calculate the mask and apply it. + _mask_ul=$(( 0xffffffff << (32 - _maskbits) )) + _net_ul=$(( _host_ul & _mask_ul )) + + # Now convert to a network address one byte at a time. + _net="" + for _o in $(seq 1 4) ; do + _net="$((_net_ul & 255))${_net:+.}${_net}" + _net_ul=$((_net_ul >> 8)) + done + + echo "${_net}/${_maskbits}" +} + +###################################################################### + +ensure_rt_tables () +{ + rt_tables="$CTDB_SYS_ETCDIR/iproute2/rt_tables" + # script_state_dir set by ctdb_setup_state_dir() + # shellcheck disable=SC2154 + rt_tables_lock="${script_state_dir}/rt_tables_lock" + + # This file should always exist. Even if this didn't exist on the + # system, adding a route will have created it. What if we startup + # and immediately shutdown? Let's be sure. + if [ ! -f "$rt_tables" ] ; then + mkdir -p "${rt_tables%/*}" # dirname + touch "$rt_tables" + fi +} + +# Setup a table id to use for the given IP. We don't need to know it, +# it just needs to exist in /etc/iproute2/rt_tables. Fail if no free +# table id could be found in the configured range. +ensure_table_id_for_ip () +{ + _ip=$1 + + ensure_rt_tables + + # Maintain a table id for each IP address we've ever seen in + # rt_tables. We use a "ctdb." prefix on the label. + _label="${table_id_prefix}${_ip}" + + # This finds either the table id corresponding to the label or a + # new unused one (that is greater than all the used ones in the + # range). + ( + # Note that die() just gets us out of the subshell... + flock --timeout 30 9 || \ + die "ensure_table_id_for_ip: failed to lock file $rt_tables" + + _new="$CTDB_PER_IP_ROUTING_TABLE_ID_LOW" + while read _t _l ; do + # Skip comments + case "$_t" in + \#*) continue ;; + esac + # Found existing: done + if [ "$_l" = "$_label" ] ; then + return 0 + fi + # Potentially update the new table id to be used. The + # redirect stops error spam for a non-numeric value. + if [ "$_new" -le "$_t" ] && \ + [ "$_t" -le "$CTDB_PER_IP_ROUTING_TABLE_ID_HIGH" ] \ + 2>/dev/null ; then + _new=$((_t + 1)) + fi + done <"$rt_tables" + + # If the new table id is legal then add it to the file and + # print it. + if [ "$_new" -le "$CTDB_PER_IP_ROUTING_TABLE_ID_HIGH" ] ; then + printf '%d\t%s\n' "$_new" "$_label" >>"$rt_tables" + return 0 + else + return 1 + fi + ) 9>"$rt_tables_lock" +} + +# Clean up all the table ids that we might own. +clean_up_table_ids () +{ + ensure_rt_tables + + ( + # Note that die() just gets us out of the subshell... + flock --timeout 30 9 || \ + die "clean_up_table_ids: failed to lock file $rt_tables" + + # Delete any items from the file that have a table id in our + # range or a label matching our label. Preserve comments. + _tmp="${rt_tables}.$$.ctdb" + awk -v min="$CTDB_PER_IP_ROUTING_TABLE_ID_LOW" \ + -v max="$CTDB_PER_IP_ROUTING_TABLE_ID_HIGH" \ + -v pre="$table_id_prefix" \ + '/^#/ || + !(min <= $1 && $1 <= max) && + !(index($2, pre) == 1) { + print $0 }' "$rt_tables" >"$_tmp" + + mv "$_tmp" "$rt_tables" + ) 9>"$rt_tables_lock" +} + +###################################################################### + +# This prints the config for an IP, which is either relevant entries +# from the config file or, if set to the magic link local value, some +# link local routing config for the IP. +get_config_for_ip () +{ + _ip="$1" + + if have_link_local_config ; then + # When parsing public_addresses also split on '/'. This means + # that we get the maskbits as item #2 without further parsing. + while IFS="/$IFS" read _i _maskbits _x ; do + if [ "$_ip" = "$_i" ] ; then + printf "%s" "$_ip "; ipv4_host_addr_to_net "$_ip" "$_maskbits" + fi + done <"${CTDB_BASE}/public_addresses" + else + while read _i _rest ; do + if [ "$_ip" = "$_i" ] ; then + printf '%s\t%s\n' "$_ip" "$_rest" + fi + done <"$CTDB_PER_IP_ROUTING_CONF" + fi +} + +ip_has_configuration () +{ + _ip="$1" + + _conf=$(get_config_for_ip "$_ip") + [ -n "$_conf" ] +} + +add_routing_for_ip () +{ + _iface="$1" + _ip="$2" + + # Do nothing if no config for this IP. + ip_has_configuration "$_ip" || return 0 + + ensure_table_id_for_ip "$_ip" || \ + die "add_routing_for_ip: out of table ids in range $CTDB_PER_IP_ROUTING_TABLE_ID_LOW - $CTDB_PER_IP_ROUTING_TABLE_ID_HIGH" + + _pref="$CTDB_PER_IP_ROUTING_RULE_PREF" + _table_id="${table_id_prefix}${_ip}" + + del_routing_for_ip "$_ip" >/dev/null 2>&1 + + ip rule add from "$_ip" pref "$_pref" table "$_table_id" || \ + die "add_routing_for_ip: failed to add rule for $_ip" + + # Add routes to table for any lines matching the IP. + get_config_for_ip "$_ip" | + while read _i _dest _gw ; do + _r="$_dest ${_gw:+via} $_gw dev $_iface table $_table_id" + # Intentionally unquoted multi-word value here + # shellcheck disable=SC2086 + ip route add $_r || \ + die "add_routing_for_ip: failed to add route: $_r" + done +} + +del_routing_for_ip () +{ + _ip="$1" + + _pref="$CTDB_PER_IP_ROUTING_RULE_PREF" + _table_id="${table_id_prefix}${_ip}" + + # Do this unconditionally since we own any matching table ids. + # However, print a meaningful message if something goes wrong. + _cmd="ip rule del from $_ip pref $_pref table $_table_id" + _out=$($_cmd 2>&1) || \ + cat <<EOF +WARNING: Failed to delete policy routing rule + Command "$_cmd" failed: + $_out +EOF + # This should never usually fail, so don't redirect output. + # However, it can fail when deleting a rogue IP, since there will + # be no routes for that IP. In this case it should only fail when + # the rule deletion above has already failed because the table id + # is invalid. Therefore, go to a little bit of trouble to indent + # the failure message so that it is associated with the above + # warning message and doesn't look too nasty. + ip route flush table "$_table_id" 2>&1 | sed -e 's@^.@ &@' +} + +###################################################################### + +flush_rules_and_routes () +{ + ip rule show | + while read _p _x _i _x _t ; do + # Remove trailing colon after priority/preference. + _p="${_p%:}" + # Only remove rules that match our priority/preference. + [ "$CTDB_PER_IP_ROUTING_RULE_PREF" = "$_p" ] || continue + + echo "Removing ip rule for public address $_i for routing table $_t" + ip rule del from "$_i" table "$_t" pref "$_p" + ip route flush table "$_t" 2>/dev/null + done +} + +# Add any missing routes. Some might have gone missing if, for +# example, all IPs on the network were removed (possibly if the +# primary was removed). If $1 is "force" then (re-)add all the +# routes. +add_missing_routes () +{ + $CTDB ip -v -X | { + read _x # skip header line + + # Read the rest of the lines. We're only interested in the + # "IP" and "ActiveInterface" columns. The latter is only set + # for addresses local to this node, making it easy to skip + # non-local addresses. For each IP local address we check if + # the relevant routing table is populated and populate it if + # not. + while IFS="|" read _x _ip _x _iface _x ; do + [ -n "$_iface" ] || continue + + _table_id="${table_id_prefix}${_ip}" + if [ -z "$(ip route show table "$_table_id" 2>/dev/null)" ] || \ + [ "$1" = "force" ] ; then + add_routing_for_ip "$_iface" "$_ip" + fi + done + } || exit $? +} + +# Remove rules/routes for addresses that we're not hosting. If a +# releaseip event failed in an earlier script then we might not have +# had a chance to remove the corresponding rules/routes. +remove_bogus_routes () +{ + # Get a IPs current hosted by this node, each anchored with '@'. + _ips=$($CTDB ip -v -X | awk -F'|' 'NR > 1 && $4 != "" {printf "@%s@\n", $2}') + + # x is intentionally ignored + # shellcheck disable=SC2034 + ip rule show | + while read _p _x _i _x _t ; do + # Remove trailing colon after priority/preference. + _p="${_p%:}" + # Only remove rules that match our priority/preference. + [ "$CTDB_PER_IP_ROUTING_RULE_PREF" = "$_p" ] || continue + # Only remove rules for which we don't have an IP. This could + # be done with grep, but let's do it with shell prefix removal + # to avoid unnecessary processes. This falls through if + # "@${_i}@" isn't present in $_ips. + [ "$_ips" = "${_ips#*@"${_i}"@}" ] || continue + + echo "Removing ip rule/routes for unhosted public address $_i" + del_routing_for_ip "$_i" + done +} + +###################################################################### + +ctdb_check_args "$@" + +case "$1" in +startup) + flush_rules_and_routes + + # make sure that we only respond to ARP messages from the NIC + # where a particular ip address is associated. + get_proc sys/net/ipv4/conf/all/arp_filter >/dev/null 2>&1 && { + set_proc sys/net/ipv4/conf/all/arp_filter 1 + } + ;; + +shutdown) + flush_rules_and_routes + clean_up_table_ids + ;; + +takeip) + iface=$2 + ip=$3 + # maskbits included here so argument order is obvious + # shellcheck disable=SC2034 + maskbits=$4 + + ensure_ipv4_is_valid_addr "$1" "$ip" + add_routing_for_ip "$iface" "$ip" + + # flush our route cache + set_proc sys/net/ipv4/route/flush 1 + + $CTDB gratarp "$ip" "$iface" + ;; + +updateip) + # oiface, maskbits included here so argument order is obvious + # shellcheck disable=SC2034 + oiface=$2 + niface=$3 + ip=$4 + # shellcheck disable=SC2034 + maskbits=$5 + + ensure_ipv4_is_valid_addr "$1" "$ip" + add_routing_for_ip "$niface" "$ip" + + # flush our route cache + set_proc sys/net/ipv4/route/flush 1 + + $CTDB gratarp "$ip" "$niface" + tickle_tcp_connections "$ip" + ;; + +releaseip) + iface=$2 + ip=$3 + # maskbits included here so argument order is obvious + # shellcheck disable=SC2034 + maskbits=$4 + + ensure_ipv4_is_valid_addr "$1" "$ip" + del_routing_for_ip "$ip" + ;; + +ipreallocated) + add_missing_routes + remove_bogus_routes + ;; + +reconfigure) + echo "Reconfiguring service \"${service_name}\"..." + + add_missing_routes "force" + remove_bogus_routes + + # flush our route cache + set_proc sys/net/ipv4/route/flush 1 + ;; +esac + +exit 0 diff --git a/ctdb/config/events/legacy/20.multipathd.script b/ctdb/config/events/legacy/20.multipathd.script new file mode 100755 index 0000000..a420251 --- /dev/null +++ b/ctdb/config/events/legacy/20.multipathd.script @@ -0,0 +1,83 @@ +#!/bin/sh +# ctdb event script for monitoring the multipath daemon +# +# Configure monitporing of multipath devices by listing the device serials +# in /etc/ctdb/multipathd : +# CTDB_MONITOR_MPDEVICES="device1 device2 ..." +# + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +service_name="multipathd" + +load_script_options + +[ -n "$CTDB_MONITOR_MPDEVICES" ] || exit 0 + +ctdb_setup_state_dir "service" "$service_name" + +# script_state_dir set by ctdb_setup_state_dir() +# shellcheck disable=SC2154 +multipath_fail="${script_state_dir}/fail" + +multipathd_check_background() +{ + for _device in $CTDB_MONITOR_MPDEVICES; do + # Check multipath knows about the device + _out=$(multipath -ll "$_device") + if [ -z "$_out" ] ; then + echo "ERROR: device \"${_device}\" not known to multipathd" \ + >"$multipath_fail" + exit 1 + fi + + # Check for at least 1 active path + if ! echo "$_out" | grep 'prio=.* status=active' >/dev/null 2>&1 ; then + echo "ERROR: multipath device \"${_device}\" has no active paths" \ + >"$multipath_fail" + exit 1 + fi + done + exit 0 +} + +multipathd_check() +{ + # Run the actual check in the background since the call to + # multipath may block + multipathd_check_background </dev/null >/dev/null 2>&1 & + _pid="$!" + _timeleft=10 + + while [ $_timeleft -gt 0 ]; do + _timeleft=$((_timeleft - 1)) + + # see if the process still exists + kill -0 $_pid >/dev/null 2>&1 || { + if wait $_pid ; then + return 0 + else + cat "$multipath_fail" + rm -f "$multipath_fail" + return 1 + fi + } + sleep 1 + done + + echo "ERROR: callout to multipath checks hung" + # If hung then this probably won't work, but worth trying... + kill -9 $_pid >/dev/null 2>&1 + return 1 +} + +case "$1" in +monitor) + multipathd_check || die "multipath monitoring failed" + ;; +esac + +exit 0 diff --git a/ctdb/config/events/legacy/31.clamd.script b/ctdb/config/events/legacy/31.clamd.script new file mode 100755 index 0000000..5d60fe3 --- /dev/null +++ b/ctdb/config/events/legacy/31.clamd.script @@ -0,0 +1,37 @@ +#!/bin/sh +# event script to manage clamd in a cluster environment + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +detect_init_style + +case $CTDB_INIT_STYLE in +redhat) + service_name="clamd" + ;; +*) + service_name="clamav" + ;; +esac + +load_script_options + +case "$1" in +startup) + service "$service_name" stop > /dev/null 2>&1 + service "$service_name" start || exit $? + ;; + +shutdown) + service "$service_name"_stop + ;; + +monitor) + ctdb_check_unix_socket "$CTDB_CLAMD_SOCKET" || exit $? + ;; +esac + +exit 0 diff --git a/ctdb/config/events/legacy/40.vsftpd.script b/ctdb/config/events/legacy/40.vsftpd.script new file mode 100755 index 0000000..2d2aac4 --- /dev/null +++ b/ctdb/config/events/legacy/40.vsftpd.script @@ -0,0 +1,57 @@ +#!/bin/sh +# event strict to manage vsftpd in a cluster environment + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +service_name="vsftpd" + +service_reconfigure () +{ + # shellcheck disable=SC2317 + # Called indirectly via ctdb_service_reconfigure() + service "$service_name" restart +} + +load_script_options + +ctdb_setup_state_dir "service" "$service_name" + +port_21="vsftpd listening on TCP port 21" + +case "$1" in +startup) + service "$service_name" stop > /dev/null 2>&1 + service "$service_name" start + failcount_init "$port_21" + ;; + +shutdown) + service "$service_name" stop + ;; + +takeip|releaseip) + ctdb_service_set_reconfigure + ;; + +ipreallocated) + if ctdb_service_needs_reconfigure ; then + ctdb_service_reconfigure + fi + ;; + +monitor) + if ctdb_check_tcp_ports 21 ; then + failcount_reset "$port_21" + else + # Set defaults, if unset + : "${CTDB_VSFTPD_MONITOR_THRESHOLDS:=1:2}" + + failcount_incr "$port_21" "$CTDB_VSFTPD_MONITOR_THRESHOLDS" + fi + ;; +esac + +exit 0 diff --git a/ctdb/config/events/legacy/41.httpd.script b/ctdb/config/events/legacy/41.httpd.script new file mode 100755 index 0000000..dd90aed --- /dev/null +++ b/ctdb/config/events/legacy/41.httpd.script @@ -0,0 +1,78 @@ +#!/bin/sh +# event script to manage httpd in a cluster environment + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +detect_init_style + +case $CTDB_INIT_STYLE in +redhat) + service_name="httpd" + ;; +suse|debian|*) + service_name="apache2" + ;; +esac + +load_script_options + +ctdb_setup_state_dir "service" "$service_name" + +# RHEL5 sometimes use a SIGKILL to terminate httpd, which then leaks +# semaphores. This is a hack to clean them up. +cleanup_httpd_semaphore_leak() { + killall -q -0 "$service_name" || + for i in $(ipcs -s | awk '$3 == "apache" { print $2 }') ; do + ipcrm -s "$i" + done +} + +########## + +service_start () +{ + cleanup_httpd_semaphore_leak + service $service_name start +} +service_stop () +{ + service $service_name stop + killall -q -9 $service_name || true +} + +case "$1" in +startup) + service_start + ctdb_counter_init + ;; + +shutdown) + service_stop + ;; + +monitor) + if ctdb_check_tcp_ports 80 >/dev/null 2>/dev/null ; then + ctdb_counter_init + else + ctdb_counter_incr + num_fails=$(ctdb_counter_get) + if [ "$num_fails" -eq 2 ] ; then + echo "HTTPD is not running. Trying to restart HTTPD." + service_stop + service_start + exit 0 + elif [ "$num_fails" -ge 5 ] ; then + echo "HTTPD is not running. Trying to restart HTTPD." + service_stop + service_start + exit 1 + fi + fi + ;; +esac + +exit 0 + diff --git a/ctdb/config/events/legacy/47.samba-dcerpcd.script b/ctdb/config/events/legacy/47.samba-dcerpcd.script new file mode 100755 index 0000000..9492d55 --- /dev/null +++ b/ctdb/config/events/legacy/47.samba-dcerpcd.script @@ -0,0 +1,66 @@ +#!/bin/sh +# ctdb event script for SAMBA DCERPCD Services + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +detect_init_style + +case $CTDB_INIT_STYLE in + *) + # distributions don't have this yet, + # but assume samba-dcerpcd as service name + CTDB_SERVICE_SAMBA_DCERPCD=${CTDB_SERVICE_SAMBA_DCERPCD:-samba-dcerpcd} + ;; +esac + +load_script_options + +service_start () +{ + # make sure samba-dcerpcd is not already started + service "$CTDB_SERVICE_SAMBA_DCERPCD" stop > /dev/null 2>&1 + killall -0 -q samba-dcerpcd && { + sleep 1 + # make absolutely sure samba-dcerpcd is dead + killall -q -9 samba-dcerpcd + } + + # start Samba dcerpcd service. Start it reniced, as under very heavy load + # the number of smbd processes will mean that it leaves few cycles + # for anything else + nice_service "$CTDB_SERVICE_SAMBA_DCERPCD" start || die "Failed to start samba-dcerpcd" +} + +service_stop () +{ + service "$CTDB_SERVICE_SAMBA_DCERPCD" stop +} + +service_status () +{ + service "$CTDB_SERVICE_SAMBA_DCERPCD" status > /dev/null + test $? = 0 && return 0 + service "$CTDB_SERVICE_SAMBA_DCERPCD" status +} + +########################### + +case "$1" in +startup) + service_start + ;; + +shutdown) + service_stop + ;; + +monitor) + service_status + ;; + +esac + +exit 0 diff --git a/ctdb/config/events/legacy/48.netbios.script b/ctdb/config/events/legacy/48.netbios.script new file mode 100755 index 0000000..1531e49 --- /dev/null +++ b/ctdb/config/events/legacy/48.netbios.script @@ -0,0 +1,75 @@ +#!/bin/sh +# ctdb event script for Netbios Name Services + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +detect_init_style + +case $CTDB_INIT_STYLE in + suse) + CTDB_SERVICE_NMB=${CTDB_SERVICE_NMB:-nmb} + ;; + debian) + CTDB_SERVICE_NMB=${CTDB_SERVICE_NMB:-nmbd} + ;; + *) + # Use redhat style as default: + CTDB_SERVICE_NMB=${CTDB_SERVICE_NMB:-nmb} + ;; +esac + +service_name="netbios" + +load_script_options + +ctdb_setup_state_dir "service" "$service_name" + +service_start () +{ + # make sure nmbd is not already started + service "$CTDB_SERVICE_NMB" stop > /dev/null 2>&1 + killall -0 -q nmbd && { + sleep 1 + # make absolutely sure nmbd is dead + killall -q -9 nmbd + } + + # start Samba nmbd service. Start it reniced, as under very heavy load + # the number of smbd processes will mean that it leaves few cycles + # for anything else + nice_service "$CTDB_SERVICE_NMB" start || die "Failed to start nmbd" +} + +service_stop () +{ + service "$CTDB_SERVICE_NMB" stop +} + +service_status () +{ + service "$CTDB_SERVICE_NMB" status > /dev/null + test $? = 0 && return 0 + service "$CTDB_SERVICE_NMB" status +} + +########################### + +case "$1" in +startup) + service_start + ;; + +shutdown) + service_stop + ;; + +monitor) + service_status + ;; + +esac + +exit 0 diff --git a/ctdb/config/events/legacy/49.winbind.script b/ctdb/config/events/legacy/49.winbind.script new file mode 100755 index 0000000..852b541 --- /dev/null +++ b/ctdb/config/events/legacy/49.winbind.script @@ -0,0 +1,55 @@ +#!/bin/sh +# ctdb event script for winbind + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +CTDB_SERVICE_WINBIND=${CTDB_SERVICE_WINBIND:-winbind} + +# service_name is used by various functions +# shellcheck disable=SC2034 +service_name="winbind" + +load_script_options + +service_start () +{ + service "$CTDB_SERVICE_WINBIND" stop >/dev/null 2>&1 + killall -0 -q winbindd && { + sleep 1 + # make absolutely sure winbindd is dead + killall -q -9 winbindd + } + + service "$CTDB_SERVICE_WINBIND" start || \ + die "Failed to start winbind" +} + +service_stop () +{ + service "$CTDB_SERVICE_WINBIND" stop +} + +########################### + +case "$1" in +startup) + service_start + ;; + +shutdown) + service_stop + ;; + +monitor) + if ! out=$(wbinfo -p 2>&1) ; then + echo "ERROR: wbinfo -p returned error" + echo "$out" + exit 1 + fi + ;; +esac + +exit 0 diff --git a/ctdb/config/events/legacy/50.samba.script b/ctdb/config/events/legacy/50.samba.script new file mode 100755 index 0000000..84600e2 --- /dev/null +++ b/ctdb/config/events/legacy/50.samba.script @@ -0,0 +1,166 @@ +#!/bin/sh +# ctdb event script for Samba + +[ -n "$CTDB_BASE" ] || + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +detect_init_style + +case $CTDB_INIT_STYLE in +suse) + CTDB_SERVICE_SMB=${CTDB_SERVICE_SMB:-smb} + ;; +debian) + CTDB_SERVICE_SMB=${CTDB_SERVICE_SMB:-smbd} + ;; +*) + # Use redhat style as default: + CTDB_SERVICE_SMB=${CTDB_SERVICE_SMB:-smb} + ;; +esac + +service_name="samba" + +load_script_options + +ctdb_setup_state_dir "service" "$service_name" + +service_start() +{ + # make sure samba is not already started + service "$CTDB_SERVICE_SMB" stop >/dev/null 2>&1 + killall -0 -q smbd && { + sleep 1 + # make absolutely sure samba is dead + killall -q -9 smbd + } + # start Samba service. Start it reniced, as under very heavy load + # the number of smbd processes will mean that it leaves few cycles + # for anything else + nice_service "$CTDB_SERVICE_SMB" start || die "Failed to start samba" +} + +service_stop() +{ + service "$CTDB_SERVICE_SMB" stop + program_stack_traces "smbd" 5 +} + +###################################################################### +# Show the testparm output using a cached smb.conf to avoid delays due +# to registry access. + +# script_state_dir set by ctdb_setup_state_dir() +# shellcheck disable=SC2154 +smbconf_cache="$script_state_dir/smb.conf.cache" + +testparm_foreground_update() +{ + _timeout="$1" + + # No need to remove these temporary files, since there are only 2 + # of them. + _out="${smbconf_cache}.out" + _err="${smbconf_cache}.err" + + timeout "$_timeout" testparm -v -s >"$_out" 2>"$_err" + case $? in + 0) : ;; + 124) + if [ -f "$smbconf_cache" ]; then + echo "WARNING: smb.conf cache update timed out - using old cache file" + return 1 + else + echo "ERROR: smb.conf cache create failed - testparm command timed out" + exit 1 + fi + ;; + *) + if [ -f "$smbconf_cache" ]; then + echo "WARNING: smb.conf cache update failed - using old cache file" + cat "$_err" + return 1 + else + echo "ERROR: smb.conf cache create failed - testparm failed with:" + cat "$_err" + exit 1 + fi + ;; + esac + + # Only using $$ here to avoid a collision. This is written into + # CTDB's own state directory so there is no real need for a secure + # temporary file. + _tmpfile="${smbconf_cache}.$$" + # Patterns to exclude... + _pat='^[[:space:]]+(registry[[:space:]]+shares|include|copy|winbind[[:space:]]+separator)[[:space:]]+=' + grep -Ev "$_pat" <"$_out" >"$_tmpfile" + mv "$_tmpfile" "$smbconf_cache" # atomic + + return 0 +} + +testparm_background_update() +{ + _timeout="$1" + + testparm_foreground_update "$_timeout" >/dev/null 2>&1 </dev/null & +} + +testparm_get () +{ + _param="$1" + + sed -n \ + -e "s|^[[:space:]]*${_param}[[:space:]]*=[[:space:]]\(..*\)|\1|p" \ + "$smbconf_cache" + +} + +list_samba_shares() +{ + testparm_get "path" | sed -e 's/"//g' +} + +list_samba_ports() +{ + testparm_get "smb ports" +} + +########################### + +case "$1" in +startup) + service_start + ;; + +shutdown) + service_stop + ;; + +monitor) + testparm_foreground_update 10 + ret=$? + + smb_ports="$CTDB_SAMBA_CHECK_PORTS" + if [ -z "$smb_ports" ]; then + smb_ports=$(list_samba_ports) + [ -n "$smb_ports" ] || die "Failed to set smb ports" + fi + # Intentionally unquoted multi-word value here + # shellcheck disable=SC2086 + ctdb_check_tcp_ports $smb_ports || exit $? + + if [ "$CTDB_SAMBA_SKIP_SHARE_CHECK" != "yes" ]; then + list_samba_shares | ctdb_check_directories || exit $? + fi + + if [ $ret -ne 0 ]; then + testparm_background_update 10 + fi + ;; +esac + +exit 0 diff --git a/ctdb/config/events/legacy/60.nfs.script b/ctdb/config/events/legacy/60.nfs.script new file mode 100755 index 0000000..b7ae074 --- /dev/null +++ b/ctdb/config/events/legacy/60.nfs.script @@ -0,0 +1,301 @@ +#!/bin/sh +# script to manage nfs in a clustered environment + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +service_name="nfs" + +load_system_config "nfs" + +load_script_options + +ctdb_setup_state_dir "service" "$service_name" + +###################################################################### + +service_reconfigure () +{ + # Restart lock manager, notify clients + # shellcheck disable=SC2317 + # Called indirectly via check_thresholds() + if [ -x "${CTDB_BASE}/statd-callout" ] ; then + "${CTDB_BASE}/statd-callout" notify & + fi >/dev/null 2>&1 +} + +###################################################################### + +###################################################### +# Check the health of NFS services +# +# Use .check files in $CTDB_NFS_CHECKS_DIR. +# Default is "${CTDB_BASE}/nfs-checks.d/" +###################################################### +nfs_check_services () +{ + _dir="${CTDB_NFS_CHECKS_DIR:-${CTDB_BASE}/nfs-checks.d}" + + # Files must end with .check - avoids editor backups, RPM fu, ... + for _f in "$_dir"/[0-9][0-9].*.check ; do + [ -r "$_f" ] || continue + + _t="${_f%.check}" + _progname="${_t##*/[0-9][0-9].}" + + nfs_check_service "$_progname" <"$_f" + done +} + +###################################################### +# Check the health of an NFS service +# +# $1 - progname, passed to rpcinfo (looked up in /etc/rpc) +# +# Reads variables from stdin +# +# Variables are: +# +# * family - "tcp" or "udp" or space separated list +# default: tcp, not used with "service_check_cmd" +# * version - optional, RPC service version number +# default is to omit to check for any version, +# not used with "service_check_cmd" +# * unhealthy_after - number of check fails before unhealthy +# default: 1 +# * restart_every - number of check fails before restart +# default: 0, meaning no restart +# * service_stop_cmd - command to stop service +# default: no default, must be provided if +# restart_every > 0 +# * service_start_cmd - command to start service +# default: no default, must be provided if +# restart_every > 0 +# * service_check_cmd - command to check health of service +# default is to check RPC service using rpcinfo +# * service_debug_cmd - command to debug a service after trying to stop it; +# for example, it can be useful to print stack +# traces of threads that have not exited, since +# they may be stuck doing I/O; +# no default, see also function program_stack_traces() +# +# Quoting in values is not preserved +# +###################################################### +nfs_check_service () +{ + _progname="$1" + + # This sub-shell is created to intentionally limit the scope of + # variable values read from the .check files. + # shellcheck disable=SC2030 + ( + # Subshell to restrict scope variables... + + # Defaults + family="tcp" + version="" + unhealthy_after=1 + restart_every=0 + service_stop_cmd="" + service_start_cmd="" + service_check_cmd="" + service_debug_cmd="" + + # Eval line-by-line. Expands variable references in values. + # Also allows variable name checking, which seems useful. + while read _line ; do + case "$_line" in + \#*|"") : ;; # Ignore comments, blank lines + + family=*|version=*|\ + unhealthy_after=*|restart_every=*|\ + service_stop_cmd=*|service_start_cmd=*|\ + service_check_cmd=*|service_debug_cmd=*) + + eval "$_line" + ;; + *) + echo "ERROR: Unknown variable for ${_progname}: ${_line}" + exit 1 + esac + done + + _ok=false + if [ -n "$service_check_cmd" ] ; then + # Using eval means variables can contain semicolon separated commands + if eval "$service_check_cmd" ; then + _ok=true + else + _err="monitoring service \"${_progname}\" failed" + fi + else + if nfs_check_rpcinfo \ + "$_progname" "$version" "$family" >/dev/null ; then + _ok=true + else + _err="$ctdb_check_rpc_out" + fi + fi + + if $_ok ; then + if [ $unhealthy_after -ne 1 ] || [ $restart_every -ne 0 ] ; then + ctdb_counter_init "$_progname" + fi + exit 0 + fi + + ctdb_counter_incr "$_progname" + _failcount=$(ctdb_counter_get "$_progname") + + _unhealthy=false + if [ "$unhealthy_after" -gt 0 ] ; then + if [ "$_failcount" -ge "$unhealthy_after" ] ; then + _unhealthy=true + echo "ERROR: $_err" + fi + fi + + if [ "$restart_every" -gt 0 ] ; then + if [ $((_failcount % restart_every)) -eq 0 ] ; then + if ! $_unhealthy ; then + echo "WARNING: $_err" + fi + nfs_restart_service + fi + fi + + if $_unhealthy ; then + exit 1 + fi + + return 0 + ) || exit 1 +} + +# Uses: service_stop_cmd, service_start_cmd, service_debug_cmd +# This function is called within the sub-shell that shellcheck thinks +# loses the above variable values. +# shellcheck disable=SC2031 +nfs_restart_service () +{ + if [ -z "$service_stop_cmd" ] || [ -z "$service_start_cmd" ] ; then + die "ERROR: Can not restart service \"${_progname}\" without corresponding service_start_cmd/service_stop_cmd settings" + fi + + echo "Trying to restart service \"${_progname}\"..." + # Using eval means variables can contain semicolon separated commands + eval "$service_stop_cmd" + if [ -n "$service_debug_cmd" ] ; then + eval "$service_debug_cmd" + fi + background_with_logging eval "$service_start_cmd" +} + +###################################################### +# Check an RPC service with rpcinfo +###################################################### +ctdb_check_rpc () +{ + _progname="$1" # passed to rpcinfo (looked up in /etc/rpc) + _version="$2" # optional, not passed if empty/unset + _family="${3:-tcp}" # optional, default is "tcp" + + case "$_family" in + tcp6|udp6) + _localhost="${CTDB_RPCINFO_LOCALHOST6:-::1}" + ;; + *) + _localhost="${CTDB_RPCINFO_LOCALHOST:-127.0.0.1}" + esac + + # $_version is not quoted because it is optional + # shellcheck disable=SC2086 + if ! ctdb_check_rpc_out=$(rpcinfo -T "$_family" "$_localhost" \ + "$_progname" $_version 2>&1) ; then + ctdb_check_rpc_out="$_progname failed RPC check: +$ctdb_check_rpc_out" + echo "$ctdb_check_rpc_out" + return 1 + fi +} + +nfs_check_rpcinfo () +{ + _progname="$1" # passed to rpcinfo (looked up in /etc/rpc) + _versions="$2" # optional, space separated, not passed if empty/unset + _families="${3:-tcp}" # optional, space separated, default is "tcp" + + for _family in $_families ; do + if [ -n "$_versions" ] ; then + for _version in $_versions ; do + ctdb_check_rpc "$_progname" "$_version" "$_family" || return $? + done + else + ctdb_check_rpc "$_progname" "" "$_family" || return $? + fi + done +} + +################################################################## +# use statd-callout to update NFS lock info +################################################################## +nfs_update_lock_info () +{ + if [ -x "$CTDB_BASE/statd-callout" ] ; then + "$CTDB_BASE/statd-callout" update + fi +} + +###################################################################### + +# script_state_dir set by ctdb_setup_state_dir() +# shellcheck disable=SC2154 +nfs_callout_init "$script_state_dir" + +case "$1" in +startup) + nfs_callout "$@" || exit $? + ;; + +shutdown) + nfs_callout "$@" || exit $? + ;; + +takeip) + nfs_callout "$@" || exit $? + ctdb_service_set_reconfigure + ;; + +releaseip) + nfs_callout "$@" || exit $? + ctdb_service_set_reconfigure + ;; + +ipreallocated) + if ctdb_service_needs_reconfigure ; then + ctdb_service_reconfigure + fi + ;; + +monitor) + nfs_callout "monitor-pre" || exit $? + + # Check that directories for shares actually exist + if [ "$CTDB_NFS_SKIP_SHARE_CHECK" != "yes" ] ; then + nfs_callout "monitor-list-shares" | ctdb_check_directories || \ + exit $? + fi + + update_tickles 2049 + nfs_update_lock_info + + nfs_check_services + + nfs_callout "monitor-post" || exit $? + ;; +esac + +exit 0 diff --git a/ctdb/config/events/legacy/70.iscsi.script b/ctdb/config/events/legacy/70.iscsi.script new file mode 100755 index 0000000..e74651d --- /dev/null +++ b/ctdb/config/events/legacy/70.iscsi.script @@ -0,0 +1,87 @@ +#!/bin/sh + +# CTDB event script for TGTD based iSCSI + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +# service_name is used by various functions +# shellcheck disable=SC2034 +service_name="iscsi" + +load_script_options + +[ -z "$CTDB_START_ISCSI_SCRIPTS" ] && { + echo "No iscsi start script directory found" + exit 0 +} + +case "$1" in +ipreallocated) + all_ips=$($CTDB -X ip | tail -n +2) + + # Block the iSCSI port. Only block for the address families + # we have configured. This copes with, for example, ip6tables + # being unavailable on an IPv4-only system. + have_ipv4=false + have_ipv6=false + # x is intentionally ignored + # shellcheck disable=SC2034 + while IFS='|' read x ip pnn x ; do + case "$ip" in + *:*) have_ipv6=true ;; + *) have_ipv4=true ;; + esac + done <<EOF +$all_ips +EOF + if $have_ipv4 ; then + iptables -I INPUT 1 -p tcp --dport 3260 -j DROP + fi + if $have_ipv6 ; then + ip6tables -I INPUT 1 -p tcp --dport 3260 -j DROP + fi + + # Stop iSCSI daemon + killall -9 tgtd >/dev/null 2>/dev/null + + pnn=$(ctdb_get_pnn) + [ -n "$pnn" ] || die "Failed to get node pnn" + + # Start iSCSI daemon + tgtd >/dev/null 2>&1 + + # Run a script for each currently hosted public IP address + ips=$(echo "$all_ips" | awk -F'|' -v pnn="$pnn" '$3 == pnn {print $2}') + for ip in $ips ; do + script="${CTDB_START_ISCSI_SCRIPTS}/${ip}.sh" + if [ -x "$script" ] ; then + echo "Starting iSCSI service for public address ${ip}" + "$script" + fi + done + + # Unblock iSCSI port. These can be unconditional (compared to + # blocking above), since errors are redirected. + while iptables -D INPUT -p tcp --dport 3260 -j DROP >/dev/null 2>&1 ; do + : + done + while ip6tables -D INPUT -p tcp --dport 3260 -j DROP >/dev/null 2>&1 ; do + : + done + + ;; + +shutdown) + # Shutdown iSCSI daemon when ctdb goes down + killall -9 tgtd >/dev/null 2>&1 + ;; + +monitor) + ctdb_check_tcp_ports 3260 || exit $? + ;; +esac + +exit 0 diff --git a/ctdb/config/events/legacy/91.lvs.script b/ctdb/config/events/legacy/91.lvs.script new file mode 100755 index 0000000..8855068 --- /dev/null +++ b/ctdb/config/events/legacy/91.lvs.script @@ -0,0 +1,124 @@ +#!/bin/sh +# script to manage the lvs ip multiplexer for a single public address cluster + +[ -n "$CTDB_BASE" ] || \ + CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD") + +. "${CTDB_BASE}/functions" + +load_script_options + +[ -n "$CTDB_LVS_NODES" ] || exit 0 +export CTDB_LVS_NODES + +# type is commonly supported and more portable than which(1) +# shellcheck disable=SC2039 +if ! type ipvsadm >/dev/null 2>&1 ; then + echo "LVS configured but ipvsadm not found" + exit 0 +fi + + +lvs_follower_only () +{ + _ip_address=$(ctdb_get_ip_address) + awk -v my_ip="$_ip_address" \ + '$1 == my_ip { if ($2 ~ "follower-only") { exit 0 } else { exit 1 } }' \ + "$CTDB_LVS_NODES" +} + +lvs_check_config () +{ + [ -r "$CTDB_LVS_NODES" ] || \ + die "error: CTDB_LVS_NODES=${CTDB_LVS_NODES} unreadable" + [ -n "$CTDB_LVS_PUBLIC_IP" ] || \ + die "Invalid configuration: CTDB_LVS_PUBLIC_IP not set" + if ! lvs_follower_only ; then + [ -n "$CTDB_LVS_PUBLIC_IFACE" ] || \ + die "Invalid configuration: CTDB_LVS_PUBLIC_IFACE not set" + fi +} + +case "$1" in +setup) + lvs_check_config + ;; +startup) + lvs_check_config + + ipvsadm -D -t "$CTDB_LVS_PUBLIC_IP" >/dev/null 2>&1 + ipvsadm -D -u "$CTDB_LVS_PUBLIC_IP" >/dev/null 2>&1 + + ip addr add "${CTDB_LVS_PUBLIC_IP}/32" dev lo scope host + + # do not respond to ARPs that are for ip addresses with scope 'host' + set_proc_maybe sys/net/ipv4/conf/all/arp_ignore 3 + # do not send out arp requests from loopback addresses + set_proc_maybe sys/net/ipv4/conf/all/arp_announce 2 + ;; + +shutdown) + lvs_check_config + + ipvsadm -D -t "$CTDB_LVS_PUBLIC_IP" + ipvsadm -D -u "$CTDB_LVS_PUBLIC_IP" + + ip addr del "${CTDB_LVS_PUBLIC_IP}/32" dev lo >/dev/null 2>&1 + + flush_route_cache + ;; + +ipreallocated) + lvs_check_config + + # Kill connections + ipvsadm -D -t "$CTDB_LVS_PUBLIC_IP" >/dev/null 2>&1 + ipvsadm -D -u "$CTDB_LVS_PUBLIC_IP" >/dev/null 2>&1 + kill_tcp_connections_local_only \ + "$CTDB_LVS_PUBLIC_IFACE" "$CTDB_LVS_PUBLIC_IP" + + pnn=$(ctdb_get_pnn) + lvsleader=$("${CTDB_HELPER_BINDIR}/ctdb_lvs" leader) + if [ "$pnn" != "$lvsleader" ] ; then + # This node is not the LVS leader so change the IP address + # to have scope "host" so this node won't respond to ARPs + ip addr del "${CTDB_LVS_PUBLIC_IP}/32" dev lo >/dev/null 2>&1 + ip addr add "${CTDB_LVS_PUBLIC_IP}/32" dev lo scope host + exit 0 + fi + + # Change the scope so this node starts responding to ARPs + ip addr del "${CTDB_LVS_PUBLIC_IP}/32" dev lo >/dev/null 2>&1 + ip addr add "${CTDB_LVS_PUBLIC_IP}/32" dev lo >/dev/null 2>&1 + + ipvsadm -A -t "$CTDB_LVS_PUBLIC_IP" -p 1999999 -s lc + ipvsadm -A -u "$CTDB_LVS_PUBLIC_IP" -p 1999999 -s lc + + # Add all nodes (except this node) as LVS servers + "${CTDB_HELPER_BINDIR}/ctdb_lvs" list | + awk -v pnn="$pnn" '$1 != pnn { print $2 }' | + while read ip ; do + ipvsadm -a -t "$CTDB_LVS_PUBLIC_IP" -r "$ip" -g + ipvsadm -a -u "$CTDB_LVS_PUBLIC_IP" -r "$ip" -g + done + + # Add localhost too... + ipvsadm -a -t "$CTDB_LVS_PUBLIC_IP" -r 127.0.0.1 + ipvsadm -a -u "$CTDB_LVS_PUBLIC_IP" -r 127.0.0.1 + + $CTDB gratarp \ + "$CTDB_LVS_PUBLIC_IP" "$CTDB_LVS_PUBLIC_IFACE" >/dev/null 2>&1 + + flush_route_cache + ;; + +monitor) + lvs_check_config + + if [ -n "$CTDB_LVS_PUBLIC_IFACE" ] ; then + interface_monitor "$CTDB_LVS_PUBLIC_IFACE" || exit 1 + fi + ;; +esac + +exit 0 |