author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-17 06:48:59 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-17 06:48:59 +0000 |
commit | d835b2cae8abc71958b69362162e6a70c3d7ef63 | |
tree | 81052e3d2ce3e1bcda085f73d925e9d6257dec15 /doc/website-v1/start-guide.adoc | |
parent | Initial commit. | |
Adding upstream version 4.6.0. (refs: upstream/4.6.0, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/website-v1/start-guide.adoc')
-rw-r--r-- | doc/website-v1/start-guide.adoc | 208 |
1 files changed, 208 insertions, 0 deletions
diff --git a/doc/website-v1/start-guide.adoc b/doc/website-v1/start-guide.adoc
new file mode 100644
index 0000000..7ad6a82
--- /dev/null
+++ b/doc/website-v1/start-guide.adoc
@@ -0,0 +1,208 @@
+= Getting Started
+
+So, you've successfully installed `crmsh` on one or more machines, and
+now you want to configure a basic cluster. This guide is intended to
+provide step-by-step instructions for configuring Pacemaker
+with a single resource capable of failing over between a pair of
+nodes, and then build on that base to cover some more advanced topics
+of cluster management.
+
+****
+Haven't installed yet? Please follow the
+link:/installation[installation instructions]
+before continuing this guide. Only `crmsh` and
+its dependencies need to be installed before
+following this guide.
+****
+
+Before continuing, make sure that this command executes successfully
+on all nodes, and returns a version number that is `3.0` or higher:
+
+........
+crm --version
+........
+
+****
+In crmsh 3, the cluster init commands were replaced by the SLE HA
+bootstrap scripts. These rely on `csync2` for configuration file
+management, so make sure that you have the `csync2` command
+installed before proceeding. This requirement may be removed in
+the future.
+****
+
+.Example cluster
+**************************
+
+These are the machines used as an example in this guide. Please
+replace the references to these names and IP addresses with the values
+appropriate for your cluster:
+
+
+[options="header,footer"]
+|=======================
+|Name |IP
+|alice |10.0.0.2
+|bob |10.0.0.3
+|=======================
+**************************
+
+
+== The cluster stack
+
+The composition of the GNU/Linux cluster stack has changed somewhat
+over the years. The stack described here is the currently most common
+variant, but there are other ways of configuring these tools.
+
+Simply put, a High Availability cluster is a set of machines (commonly
+referred to as *nodes*) with redundant capacity, such that if one or
+more of these machines experience failure of any kind, the other nodes
+in the cluster can take over the responsibilities previously handled
+by the failed node.
+
+The cluster stack is a set of programs running on all of these nodes,
+communicating with each other over the network to monitor each other
+and deciding where, when and how resources are stopped, started or
+reconfigured.
+
+The main component of the stack is *Pacemaker*, the software
+responsible for managing cluster resources, allocating them to cluster
+nodes according to the rules specified in the *CIB*.
+
+The CIB is an XML document maintained by Pacemaker, which describes
+all cluster resources, their configuration and the constraints that
+decide where and how they are managed. This document is not edited
+directly, and with the help of `crmsh` it is possible to avoid
+exposure to the underlying XML at all.
+
+Beneath Pacemaker in the stack sits *Corosync*, a cluster
+communication system. Corosync provides the communication capabilities
+and cluster membership functionality used by Pacemaker. Corosync is
+configured through the file `/etc/corosync/corosync.conf`. `crmsh`
+provides tools for configuring corosync similar to Pacemaker.
+
+Aside from these two components, the stack also consists of a
+collection of *Resource Agents*. These are basically scripts that wrap
+software that the cluster needs to manage, providing a unified
+interface to configuration, supervision and management of the
+software. For example, there are agents that handle virtual IP
+resources, web servers, databases and filesystems.
+
+`crmsh` is a command line tool which interfaces against all of these
+components, providing a unified interface for configuration and
+management of the whole cluster stack.
+
+== SSH
+
+`crmsh` runs as a command line tool on any one of the cluster
+nodes. In order for it to control all cluster nodes, it needs to be
+able to execute commands remotely. `crmsh` does this by invoking
+`ssh`.
+
+Configure `/etc/hosts` on each of the nodes so that the names of the
+other nodes map to the IP addresses of those nodes. For example in a
+cluster consisting of `alice` and `bob`, executing `ping bob` when
+logged in as root on `alice` should successfully locate `bob` on the
+network. Given the IP addresses of `alice` and `bob` above, the
+following should be entered into `/etc/hosts` on both nodes:
+
+........
+10.0.0.2 alice
+10.0.0.3 bob
+........
+
+== Install and configure
+
+To configure the basic cluster, we use the `cluster init` command
+provided by `crmsh`. This command has quite a few options for
+setting up the cluster, but we will use a fairly basic configuration.
+
+........
+crm cluster init --name demo-cluster --nodes "alice bob"
+........
+
+The initialization tool will now ask a series of questions about the
+configuration, and then proceed to configure and start the cluster
+on both nodes.
+
+== Check cluster status
+
+To see if Pacemaker is running, what nodes are part of the cluster and
+what resources are active, use the `status` command:
+
+.........
+crm status
+.........
+
+If this command fails or times out, there is some problem with
+Pacemaker or Corosync on the local machine. Perhaps some dependency is
+missing, a firewall is blocking cluster communication or some other
+unrelated problem has occurred. If this is the case, the `cluster
+health` command may be of use.
+
+== Cluster health check
+
+To check the health status of the machines in the cluster, use the
+following command:
+
+........
+crm cluster health
+........
+
+This command will perform multiple diagnostics on all nodes in the
+cluster, and return information about low disk space, communication
+issues or problems with mismatching software versions between nodes,
+for example.
+
+If no cluster has been configured or there is some fundamental problem
+with cluster communications, `crmsh` may be unable to figure out what
+nodes are part of the cluster. If this is the case, the list of nodes
+can be provided to the health command directly:
+
+........
+crm cluster health nodes=alice,bob
+........
+
+== Adding a resource
+
+To test the cluster and make sure it is working properly, we can
+configure a Dummy resource. The Dummy resource agent is a simple
+resource that doesn't actually manage any software. It exposes a
+single numerical parameter called `state` which can be used to test
+the basic functionality of the cluster before introducing the
+complexities of actual resources.
+
+To configure a Dummy resource, run the following command:
+
+........
+crm configure primitive p0 Dummy
+........
+
+This creates a new resource, gives it the name `p0` and sets the
+agent for the resource to be the `Dummy` agent.
+
+`crm status` should now show the `p0` resource as started on one
+of the cluster nodes:
+
+........
+# crm status
+Last updated: Wed Jul 2 21:49:26 2014
+Last change: Wed Jul 2 21:49:19 2014
+Stack: corosync
+Current DC: alice (2) - partition with quorum
+Version: 1.1.11-c3f1a7f
+2 Nodes configured
+1 Resources configured
+
+
+Online: [ alice bob ]
+
+ p0 (ocf::heartbeat:Dummy): Started alice
+........
+
+The resource can be stopped or started using the `resource start` and
+`resource stop` commands:
+
+........
+crm resource stop p0
+crm resource start p0
+........
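
The guide added by this diff asks that `crm --version` succeed on every node and report `3.0` or higher. A minimal sketch of automating that prerequisite is shown below. It is not part of the commit: `version_at_least` and `extract_version` are hypothetical helper names, the output of `crm --version` is only assumed to contain a dotted version number such as `4.6.0`, and `sort -V` assumes GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper (not part of crmsh): succeeds when version $1 >= version $2.
# Relies on `sort -V` (version sort), which is assumed to be GNU coreutils.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Hypothetical helper: print the first dotted version number found in a string.
# Assumes the text passed in contains something like "4.6.0".
extract_version() {
    printf '%s\n' "$1" | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Run the check only when crm is actually installed on this node.
if command -v crm >/dev/null 2>&1; then
    installed=$(extract_version "$(crm --version 2>&1)")
    if version_at_least "$installed" "3.0"; then
        echo "crmsh $installed is new enough"
    else
        echo "crmsh $installed is older than 3.0, or its version could not be parsed" >&2
    fi
fi
```

Run the script on each node (for the example cluster, `alice` and `bob`) before moving on to `crm cluster init`. On a machine without `crm` it prints nothing.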