Diffstat (limited to 'doc/website-v1/start-guide.adoc')
 doc/website-v1/start-guide.adoc | 208
 1 file changed, 208 insertions(+), 0 deletions(-)
diff --git a/doc/website-v1/start-guide.adoc b/doc/website-v1/start-guide.adoc
new file mode 100644
index 0000000..7ad6a82
--- /dev/null
+++ b/doc/website-v1/start-guide.adoc
@@ -0,0 +1,208 @@
+= Getting Started
+
+So, you've successfully installed `crmsh` on one or more machines, and
+now you want to configure a basic cluster. This guide provides
+step-by-step instructions for configuring Pacemaker with a single
+resource capable of failing over between a pair of nodes, and then
+builds on that base to cover some more advanced topics of cluster
+management.
+
+****
+Haven't installed yet? Please follow the
+link:/installation[installation instructions]
+before continuing this guide. Only `crmsh` and
+its dependencies need to be installed before
+following this guide.
+****
+
+Before continuing, make sure that this command executes successfully
+on all nodes, and returns a version number that is `3.0` or higher:
+
+........
+crm --version
+........
+
+****
+In crmsh 3, the cluster init commands were replaced by the SLE HA
+bootstrap scripts. These rely on `csync2` for configuration file
+management, so make sure that you have the `csync2` command
+installed before proceeding. This requirement may be removed in
+the future.
+****
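+
+Before running the bootstrap commands, it is worth verifying that
+`csync2` is actually available on each node (package names vary by
+distribution, but it is commonly packaged simply as `csync2`):
+
+........
+command -v csync2 || echo "csync2 not found, please install it"
+........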
+
+.Example cluster
+**************************
+
+These are the machines used as an example in this guide. Please
+replace the references to these names and IP addresses with the
+values appropriate for your cluster:
+
+
+[options="header,footer"]
+|=======================
+|Name |IP
+|alice |10.0.0.2
+|bob |10.0.0.3
+|=======================
+**************************
+
+
+== The cluster stack
+
+The composition of the GNU/Linux cluster stack has changed somewhat
+over the years. The stack described here is the currently most common
+variant, but there are other ways of configuring these tools.
+
+Simply put, a High Availability cluster is a set of machines (commonly
+referred to as *nodes*) with redundant capacity, such that if one or
+more of these machines experience failure of any kind, the other nodes
+in the cluster can take over the responsibilities previously handled
+by the failed node.
+
+The cluster stack is a set of programs running on all of these nodes,
+communicating with each other over the network to monitor each other
+and deciding where, when and how resources are stopped, started or
+reconfigured.
+
+The main component of the stack is *Pacemaker*, the software
+responsible for managing cluster resources, allocating them to cluster
+nodes according to the rules specified in the *CIB*.
+
+The CIB is an XML document maintained by Pacemaker, which describes
+all cluster resources, their configuration and the constraints that
+decide where and how they are managed. This document is not edited
+directly; with the help of `crmsh`, it is possible to avoid dealing
+with the underlying XML entirely.
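+
+For example, the cluster configuration held in the CIB can be viewed
+in `crmsh` syntax, or as the underlying XML if desired:
+
+........
+crm configure show
+crm configure show xml
+........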
+
+Beneath Pacemaker in the stack sits *Corosync*, a cluster
+communication system. Corosync provides the communication capabilities
+and cluster membership functionality used by Pacemaker. Corosync is
+configured through the file `/etc/corosync/corosync.conf`. `crmsh`
+provides tools for configuring Corosync, just as it does for
+Pacemaker.
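+
+For example, the current Corosync configuration can be displayed and
+edited from within `crmsh` (assuming a version of `crmsh` that
+provides the `corosync` sublevel):
+
+........
+crm corosync show
+crm corosync edit
+........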
+
+Aside from these two components, the stack also consists of a
+collection of *Resource Agents*. These are basically scripts that wrap
+software that the cluster needs to manage, providing a unified
+interface to configuration, supervision and management of the
+software. For example, there are agents that handle virtual IP
+resources, web servers, databases and filesystems.
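+
+The available resource agent classes and agents can be listed from
+within `crmsh`, and each agent documents its own parameters:
+
+........
+crm ra classes
+crm ra list ocf heartbeat
+crm ra info ocf:heartbeat:IPaddr2
+........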
+
+`crmsh` is a command line tool which interfaces against all of these
+components, providing a unified interface for configuration and
+management of the whole cluster stack.
+
+== SSH
+
+`crmsh` runs as a command line tool on any one of the cluster
+nodes. In order for it to control all cluster nodes, it needs to be
+able to execute commands remotely. `crmsh` does this by invoking
+`ssh`.
+
+Configure `/etc/hosts` on each of the nodes so that the names of the
+other nodes map to the IP addresses of those nodes. For example in a
+cluster consisting of `alice` and `bob`, executing `ping bob` when
+logged in as root on `alice` should successfully locate `bob` on the
+network. Given the IP addresses of `alice` and `bob` above, the
+following should be entered into `/etc/hosts` on both nodes:
+
+........
+10.0.0.2 alice
+10.0.0.3 bob
+........
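+
+The bootstrap scripts connect to the other nodes over `ssh` as
+`root`, so passwordless `root` SSH access between the nodes makes
+cluster administration considerably smoother. One way to set this up
+(assuming `root` logins over SSH are permitted) is:
+
+........
+# on alice, as root
+ssh-keygen -t rsa
+ssh-copy-id root@bob
+........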
+
+== Install and configure
+
+To configure the basic cluster, we use the `cluster init` command
+provided by `crmsh`. This command has quite a few options for
+setting up the cluster, but we will use a fairly basic configuration.
+
+........
+crm cluster init --name demo-cluster --nodes "alice bob"
+........
+
+The initialization tool will now ask a series of questions about the
+configuration, and then proceed to configure and start the cluster
+on both nodes.
+
+== Check cluster status
+
+To see if Pacemaker is running, what nodes are part of the cluster and
+what resources are active, use the `status` command:
+
+.........
+crm status
+.........
+
+If this command fails or times out, there is some problem with
+Pacemaker or Corosync on the local machine. Perhaps some dependency is
+missing, a firewall is blocking cluster communication or some other
+unrelated problem has occurred. If this is the case, the `cluster
+health` command may be of use.
+
+== Cluster health check
+
+To check the health status of the machines in the cluster, use the
+following command:
+
+........
+crm cluster health
+........
+
+This command will perform multiple diagnostics on all nodes in the
+cluster, and return information about low disk space, communication
+issues or problems with mismatching software versions between nodes,
+for example.
+
+If no cluster has been configured or there is some fundamental problem
+with cluster communications, `crmsh` may be unable to figure out what
+nodes are part of the cluster. If this is the case, the list of nodes
+can be provided to the health command directly:
+
+........
+crm cluster health nodes=alice,bob
+........
+
+== Adding a resource
+
+To test the cluster and make sure it is working properly, we can
+configure a Dummy resource. The Dummy resource agent is a simple
+agent that doesn't actually manage any software. It exposes a
+single parameter called `state` which can be used to test the
+basic functionality of the cluster before introducing the
+complexities of actual resources.
+
+To configure a Dummy resource, run the following command:
+
+........
+crm configure primitive p0 Dummy
+........
+
+This creates a new resource, gives it the name `p0` and sets the
+agent for the resource to be the `Dummy` agent.
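+
+To inspect the configuration of the newly created resource, the
+`configure show` command accepts a resource name:
+
+........
+crm configure show p0
+........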
+
+`crm status` should now show the `p0` resource as started on one
+of the cluster nodes:
+
+........
+# crm status
+Last updated: Wed Jul 2 21:49:26 2014
+Last change: Wed Jul 2 21:49:19 2014
+Stack: corosync
+Current DC: alice (2) - partition with quorum
+Version: 1.1.11-c3f1a7f
+2 Nodes configured
+1 Resources configured
+
+
+Online: [ alice bob ]
+
+ p0 (ocf::heartbeat:Dummy): Started alice
+........
+
+The resource can be stopped or started using the `resource start` and
+`resource stop` commands:
+
+........
+crm resource stop p0
+crm resource start p0
+........
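+
+With the Dummy resource running, failover can be simulated by
+putting the node currently hosting `p0` into standby and bringing it
+back online afterwards (assuming `p0` is currently running on
+`alice`):
+
+........
+crm node standby alice
+crm status
+crm node online alice
+........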