author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-17 06:53:20 +0000
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-17 06:53:20 +0000
commit | e5a812082ae033afb1eed82c0f2df3d0f6bdc93f (patch)
tree | a6716c9275b4b413f6c9194798b34b91affb3cc7 /doc/sphinx/Pacemaker_Administration
parent | Initial commit. (diff)
download | pacemaker-e5a812082ae033afb1eed82c0f2df3d0f6bdc93f.tar.xz, pacemaker-e5a812082ae033afb1eed82c0f2df3d0f6bdc93f.zip
Adding upstream version 2.1.6. (upstream/2.1.6)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/sphinx/Pacemaker_Administration')
-rw-r--r-- | doc/sphinx/Pacemaker_Administration/agents.rst | 443
-rw-r--r-- | doc/sphinx/Pacemaker_Administration/alerts.rst | 311
-rw-r--r-- | doc/sphinx/Pacemaker_Administration/cluster.rst | 21
-rw-r--r-- | doc/sphinx/Pacemaker_Administration/configuring.rst | 278
-rw-r--r-- | doc/sphinx/Pacemaker_Administration/index.rst | 36
-rw-r--r-- | doc/sphinx/Pacemaker_Administration/installing.rst | 9
-rw-r--r-- | doc/sphinx/Pacemaker_Administration/intro.rst | 21
-rw-r--r-- | doc/sphinx/Pacemaker_Administration/pcs-crmsh.rst | 441
-rw-r--r-- | doc/sphinx/Pacemaker_Administration/tools.rst | 562
-rw-r--r-- | doc/sphinx/Pacemaker_Administration/troubleshooting.rst | 123
-rw-r--r-- | doc/sphinx/Pacemaker_Administration/upgrading.rst | 534
11 files changed, 2779 insertions, 0 deletions
diff --git a/doc/sphinx/Pacemaker_Administration/agents.rst b/doc/sphinx/Pacemaker_Administration/agents.rst new file mode 100644 index 0000000..e5b17e2 --- /dev/null +++ b/doc/sphinx/Pacemaker_Administration/agents.rst @@ -0,0 +1,443 @@ +.. index:: + single: resource agent + +Resource Agents +--------------- + + +Action Completion +################# + +If one resource depends on another resource via constraints, the cluster will +interpret an expected result as sufficient to continue with dependent actions. +This may cause timing issues if the resource agent start returns before the +service is not only launched but fully ready to perform its function, or if the +resource agent stop returns before the service has fully released all its +claims on system resources. At a minimum, the start or stop should not return +before a status command would return the expected (started or stopped) result. + + +.. index:: + single: OCF resource agent + single: resource agent; OCF + +OCF Resource Agents +################### + +.. index:: + single: OCF resource agent; location + +Location of Custom Scripts +__________________________ + +OCF Resource Agents are found in ``/usr/lib/ocf/resource.d/$PROVIDER`` + +When creating your own agents, you are encouraged to create a new directory +under ``/usr/lib/ocf/resource.d/`` so that they are not confused with (or +overwritten by) the agents shipped by existing providers. + +So, for example, if you choose the provider name of big-corp and want a new +resource named big-app, you would create a resource agent called +``/usr/lib/ocf/resource.d/big-corp/big-app`` and define a resource: + +.. code-block: xml + + <primitive id="custom-app" class="ocf" provider="big-corp" type="big-app"/> + + +.. index:: + single: OCF resource agent; action + +Actions +_______ + +All OCF resource agents are required to implement the following actions. + +.. table:: **Required Actions for OCF Agents** + + +--------------+-------------+------------------------------------------------+ + | Action | Description | Instructions | + +==============+=============+================================================+ + | start | Start the | .. index:: | + | | resource | single: OCF resource agent; start | + | | | single: start action | + | | | | + | | | Return 0 on success and an appropriate | + | | | error code otherwise. Must not report | + | | | success until the resource is fully | + | | | active. | + +--------------+-------------+------------------------------------------------+ + | stop | Stop the | .. index:: | + | | resource | single: OCF resource agent; stop | + | | | single: stop action | + | | | | + | | | Return 0 on success and an appropriate | + | | | error code otherwise. Must not report | + | | | success until the resource is fully | + | | | stopped. | + +--------------+-------------+------------------------------------------------+ + | monitor | Check the | .. index:: | + | | resource's | single: OCF resource agent; monitor | + | | state | single: monitor action | + | | | | + | | | Exit 0 if the resource is running, 7 | + | | | if it is stopped, and any other OCF | + | | | exit code if it is failed. NOTE: The | + | | | monitor script should test the state | + | | | of the resource on the local machine | + | | | only. | + +--------------+-------------+------------------------------------------------+ + | meta-data | Describe | .. 
index:: | + | | the | single: OCF resource agent; meta-data | + | | resource | single: meta-data action | + | | | | + | | | Provide information about this | + | | | resource in the XML format defined by | + | | | the OCF standard. Exit with 0. NOTE: | + | | | This is *not* required to be performed | + | | | as root. | + +--------------+-------------+------------------------------------------------+ + +OCF resource agents may optionally implement additional actions. Some are used +only with advanced resource types such as clones. + +.. table:: **Optional Actions for OCF Resource Agents** + + +--------------+-------------+------------------------------------------------+ + | Action | Description | Instructions | + +==============+=============+================================================+ + | validate-all | This should | .. index:: | + | | validate | single: OCF resource agent; validate-all | + | | the | single: validate-all action | + | | instance | | + | | parameters | Return 0 if parameters are valid, 2 if | + | | provided. | not valid, and 6 if resource is not | + | | | configured. | + +--------------+-------------+------------------------------------------------+ + | promote | Bring the | .. index:: | + | | local | single: OCF resource agent; promote | + | | instance of | single: promote action | + | | a promotable| | + | | clone | Return 0 on success | + | | resource to | | + | | the promoted| | + | | role. | | + +--------------+-------------+------------------------------------------------+ + | demote | Bring the | .. index:: | + | | local | single: OCF resource agent; demote | + | | instance of | single: demote action | + | | a promotable| | + | | clone | Return 0 on success | + | | resource to | | + | | the | | + | | unpromoted | | + | | role. | | + +--------------+-------------+------------------------------------------------+ + | notify | Used by the | .. index:: | + | | cluster to | single: OCF resource agent; notify | + | | send | single: notify action | + | | the agent | | + | | pre- and | Must not fail. Must exit with 0 | + | | post- | | + | | notification| | + | | events | | + | | telling the | | + | | resource | | + | | what has | | + | | happened and| | + | | will happen.| | + +--------------+-------------+------------------------------------------------+ + | reload | Reload the | .. index:: | + | | service's | single: OCF resource agent; reload | + | | own | single: reload action | + | | config. | | + | | | Not used by Pacemaker | + +--------------+-------------+------------------------------------------------+ + | reload-agent | Make | .. index:: | + | | effective | single: OCF resource agent; reload-agent | + | | any changes | single: reload-agent action | + | | in instance | | + | | parameters | This is used when the agent can handle a | + | | marked as | change in some of its parameters more | + | | reloadable | efficiently than stopping and starting the | + | | in the | resource. | + | | agent's | | + | | meta-data. | | + +--------------+-------------+------------------------------------------------+ + | recover | Restart the | .. index:: | + | | service. | single: OCF resource agent; recover | + | | | single: recover action | + | | | | + | | | Not used by Pacemaker | + +--------------+-------------+------------------------------------------------+ + +.. important:: + + If you create a new OCF resource agent, use `ocf-tester` to verify that the + agent complies with the OCF standard properly. + + +.. 
index:: + single: OCF resource agent; return code + +How are OCF Return Codes Interpreted? +_____________________________________ + +The first thing the cluster does is to check the return code against +the expected result. If the result does not match the expected value, +then the operation is considered to have failed, and recovery action is +initiated. + +There are three types of failure recovery: + +.. table:: **Types of recovery performed by the cluster** + + +-------+--------------------------------------------+--------------------------------------+ + | Type | Description | Action Taken by the Cluster | + +=======+============================================+======================================+ + | soft | .. index:: | Restart the resource or move it to a | + | | single: OCF resource agent; soft error | new location | + | | | | + | | A transient error occurred | | + +-------+--------------------------------------------+--------------------------------------+ + | hard | .. index:: | Move the resource elsewhere and | + | | single: OCF resource agent; hard error | prevent it from being retried on the | + | | | current node | + | | A non-transient error that | | + | | may be specific to the | | + | | current node | | + +-------+--------------------------------------------+--------------------------------------+ + | fatal | .. index:: | Stop the resource and prevent it | + | | single: OCF resource agent; fatal error | from being started on any cluster | + | | | node | + | | A non-transient error that | | + | | will be common to all | | + | | cluster nodes (e.g. a bad | | + | | configuration was specified) | | + +-------+--------------------------------------------+--------------------------------------+ + +.. _ocf_return_codes: + +OCF Return Codes +________________ + +The following table outlines the different OCF return codes and the type of +recovery the cluster will initiate when a failure code is received. Although +counterintuitive, even actions that return 0 (aka. ``OCF_SUCCESS``) can be +considered to have failed, if 0 was not the expected return value. + +.. table:: **OCF Exit Codes and their Recovery Types** + + +-------+-----------------------+---------------------------------------------------+----------+ + | Exit | OCF Alias | Description | Recovery | + | Code | | | | + +=======+=======================+===================================================+==========+ + | 0 | OCF_SUCCESS | .. index:: | soft | + | | | single: OCF_SUCCESS | | + | | | single: OCF return code; OCF_SUCCESS | | + | | | pair: OCF return code; 0 | | + | | | | | + | | | Success. The command completed successfully. | | + | | | This is the expected result for all start, | | + | | | stop, promote and demote commands. | | + +-------+-----------------------+---------------------------------------------------+----------+ + | 1 | OCF_ERR_GENERIC | .. index:: | soft | + | | | single: OCF_ERR_GENERIC | | + | | | single: OCF return code; OCF_ERR_GENERIC | | + | | | pair: OCF return code; 1 | | + | | | | | + | | | Generic "there was a problem" error code. | | + +-------+-----------------------+---------------------------------------------------+----------+ + | 2 | OCF_ERR_ARGS | .. index:: | hard | + | | | single: OCF_ERR_ARGS | | + | | | single: OCF return code; OCF_ERR_ARGS | | + | | | pair: OCF return code; 2 | | + | | | | | + | | | The resource's parameter values are not valid on | | + | | | this machine (for example, a value refers to a | | + | | | file not found on the local host). 
| | + +-------+-----------------------+---------------------------------------------------+----------+ + | 3 | OCF_ERR_UNIMPLEMENTED | .. index:: | hard | + | | | single: OCF_ERR_UNIMPLEMENTED | | + | | | single: OCF return code; OCF_ERR_UNIMPLEMENTED | | + | | | pair: OCF return code; 3 | | + | | | | | + | | | The requested action is not implemented. | | + +-------+-----------------------+---------------------------------------------------+----------+ + | 4 | OCF_ERR_PERM | .. index:: | hard | + | | | single: OCF_ERR_PERM | | + | | | single: OCF return code; OCF_ERR_PERM | | + | | | pair: OCF return code; 4 | | + | | | | | + | | | The resource agent does not have | | + | | | sufficient privileges to complete the task. | | + +-------+-----------------------+---------------------------------------------------+----------+ + | 5 | OCF_ERR_INSTALLED | .. index:: | hard | + | | | single: OCF_ERR_INSTALLED | | + | | | single: OCF return code; OCF_ERR_INSTALLED | | + | | | pair: OCF return code; 5 | | + | | | | | + | | | The tools required by the resource are | | + | | | not installed on this machine. | | + +-------+-----------------------+---------------------------------------------------+----------+ + | 6 | OCF_ERR_CONFIGURED | .. index:: | fatal | + | | | single: OCF_ERR_CONFIGURED | | + | | | single: OCF return code; OCF_ERR_CONFIGURED | | + | | | pair: OCF return code; 6 | | + | | | | | + | | | The resource's parameter values are inherently | | + | | | invalid (for example, a required parameter was | | + | | | not given). | | + +-------+-----------------------+---------------------------------------------------+----------+ + | 7 | OCF_NOT_RUNNING | .. index:: | N/A | + | | | single: OCF_NOT_RUNNING | | + | | | single: OCF return code; OCF_NOT_RUNNING | | + | | | pair: OCF return code; 7 | | + | | | | | + | | | The resource is safely stopped. This should only | | + | | | be returned by monitor actions, not stop actions. | | + +-------+-----------------------+---------------------------------------------------+----------+ + | 8 | OCF_RUNNING_PROMOTED | .. index:: | soft | + | | | single: OCF_RUNNING_PROMOTED | | + | | | single: OCF return code; OCF_RUNNING_PROMOTED | | + | | | pair: OCF return code; 8 | | + | | | | | + | | | The resource is running in the promoted role. | | + +-------+-----------------------+---------------------------------------------------+----------+ + | 9 | OCF_FAILED_PROMOTED | .. index:: | soft | + | | | single: OCF_FAILED_PROMOTED | | + | | | single: OCF return code; OCF_FAILED_PROMOTED | | + | | | pair: OCF return code; 9 | | + | | | | | + | | | The resource is (or might be) in the promoted | | + | | | role but has failed. The resource will be | | + | | | demoted, stopped and then started (and possibly | | + | | | promoted) again. | | + +-------+-----------------------+---------------------------------------------------+----------+ + | 190 | OCF_DEGRADED | .. index:: | none | + | | | single: OCF_DEGRADED | | + | | | single: OCF return code; OCF_DEGRADED | | + | | | pair: OCF return code; 190 | | + | | | | | + | | | The resource is properly active, but in such a | | + | | | condition that future failures are more likely. | | + +-------+-----------------------+---------------------------------------------------+----------+ + | 191 | OCF_DEGRADED_PROMOTED | .. 
index:: | none | + | | | single: OCF_DEGRADED_PROMOTED | | + | | | single: OCF return code; OCF_DEGRADED_PROMOTED | | + | | | pair: OCF return code; 191 | | + | | | | | + | | | The resource is properly active in the promoted | | + | | | role, but in such a condition that future | | + | | | failures are more likely. | | + +-------+-----------------------+---------------------------------------------------+----------+ + | other | *none* | Custom error code. | soft | + +-------+-----------------------+---------------------------------------------------+----------+ + +Exceptions to the recovery handling described above: + +* Probes (non-recurring monitor actions) that find a resource active + (or in the promoted role) will not result in recovery action unless it is + also found active elsewhere. +* The recovery action taken when a resource is found active more than + once is determined by the resource's ``multiple-active`` property. +* Recurring actions that return ``OCF_ERR_UNIMPLEMENTED`` + do not cause any type of recovery. +* Actions that return one of the "degraded" codes will be treated the same as + if they had returned success, but status output will indicate that the + resource is degraded. + + +.. index:: + single: resource agent; LSB + single: LSB resource agent + single: init script + +LSB Resource Agents (Init Scripts) +################################## + +LSB Compliance +______________ + +The relevant part of the +`LSB specifications <http://refspecs.linuxfoundation.org/lsb.shtml>`_ +includes a description of all the return codes listed here. + +Assuming `some_service` is configured correctly and currently +inactive, the following sequence will help you determine if it is +LSB-compatible: + +#. Start (stopped): + + .. code-block:: none + + # /etc/init.d/some_service start ; echo "result: $?" + + * Did the service start? + * Did the echo command print ``result: 0`` (in addition to the init script's + usual output)? + +#. Status (running): + + .. code-block:: none + + # /etc/init.d/some_service status ; echo "result: $?" + + * Did the script accept the command? + * Did the script indicate the service was running? + * Did the echo command print ``result: 0`` (in addition to the init script's + usual output)? + +#. Start (running): + + .. code-block:: none + + # /etc/init.d/some_service start ; echo "result: $?" + + * Is the service still running? + * Did the echo command print ``result: 0`` (in addition to the init + script's usual output)? + +#. Stop (running): + + .. code-block:: none + + # /etc/init.d/some_service stop ; echo "result: $?" + + * Was the service stopped? + * Did the echo command print ``result: 0`` (in addition to the init + script's usual output)? + +#. Status (stopped): + + .. code-block:: none + + # /etc/init.d/some_service status ; echo "result: $?" + + * Did the script accept the command? + * Did the script indicate the service was not running? + * Did the echo command print ``result: 3`` (in addition to the init + script's usual output)? + +#. Stop (stopped): + + .. code-block:: none + + # /etc/init.d/some_service stop ; echo "result: $?" + + * Is the service still stopped? + * Did the echo command print ``result: 0`` (in addition to the init + script's usual output)? + +#. Status (failed): + + This step is not readily testable and relies on manual inspection of the script. + + The script can use one of the error codes (other than 3) listed in the + LSB spec to indicate that it is active but failed. 
This tells the + cluster that before moving the resource to another node, it needs to + stop it on the existing one first. + +If the answer to any of the above questions is no, then the script is not +LSB-compliant. Your options are then to either fix the script or write an OCF +agent based on the existing script. diff --git a/doc/sphinx/Pacemaker_Administration/alerts.rst b/doc/sphinx/Pacemaker_Administration/alerts.rst new file mode 100644 index 0000000..c0f54c6 --- /dev/null +++ b/doc/sphinx/Pacemaker_Administration/alerts.rst @@ -0,0 +1,311 @@ +.. index:: + single: alert; agents + +Alert Agents +------------ + +.. index:: + single: alert; sample agents + +Using the Sample Alert Agents +############################# + +Pacemaker provides several sample alert agents, installed in +``/usr/share/pacemaker/alerts`` by default. + +While these sample scripts may be copied and used as-is, they are provided +mainly as templates to be edited to suit your purposes. See their source code +for the full set of instance attributes they support. + +.. topic:: Sending cluster events as SNMP v2c traps + + .. code-block:: xml + + <configuration> + <alerts> + <alert id="snmp_alert" path="/path/to/alert_snmp.sh"> + <instance_attributes id="config_for_alert_snmp"> + <nvpair id="trap_node_states" name="trap_node_states" + value="all"/> + </instance_attributes> + <meta_attributes id="config_for_timestamp"> + <nvpair id="ts_fmt" name="timestamp-format" + value="%Y-%m-%d,%H:%M:%S.%01N"/> + </meta_attributes> + <recipient id="snmp_destination" value="192.168.1.2"/> + </alert> + </alerts> + </configuration> + +.. note:: **SNMP alert agent attributes** + + The ``timestamp-format`` meta-attribute should always be set to + ``%Y-%m-%d,%H:%M:%S.%01N`` when using the SNMP agent, to match the SNMP + standard. + + The SNMP agent provides a number of instance attributes in addition to the + one used in the example above. The most useful are ``trap_version``, which + defaults to ``2c``, and ``trap_community``, which defaults to ``public``. + See the source code for more details. + +.. topic:: Sending cluster events as SNMP v3 traps + + .. code-block:: xml + + <configuration> + <alerts> + <alert id="snmp_alert" path="/path/to/alert_snmp.sh"> + <instance_attributes id="config_for_alert_snmp"> + <nvpair id="trap_node_states" name="trap_node_states" + value="all"/> + <nvpair id="trap_version" name="trap_version" value="3"/> + <nvpair id="trap_community" name="trap_community" value=""/> + <nvpair id="trap_options" name="trap_options" + value="-l authNoPriv -a MD5 -u testuser -A secret1"/> + </instance_attributes> + <meta_attributes id="config_for_timestamp"> + <nvpair id="ts_fmt" name="timestamp-format" + value="%Y-%m-%d,%H:%M:%S.%01N"/> + </meta_attributes> + <recipient id="snmp_destination" value="192.168.1.2"/> + </alert> + </alerts> + </configuration> + +.. note:: **SNMP v3 trap configuration** + + To use SNMP v3, ``trap_version`` must be set to ``3``. ``trap_community`` + will be ignored. + + The example above uses the ``trap_options`` instance attribute to override + the security level, authentication protocol, authentication user, and + authentication password from snmp.conf. These will be passed to the snmptrap + command. Passing the password on the command line is considered insecure; + specify authentication and privacy options suitable for your environment. + +.. topic:: Sending cluster events as e-mails + + .. 
code-block:: xml + + <configuration> + <alerts> + <alert id="smtp_alert" path="/path/to/alert_smtp.sh"> + <instance_attributes id="config_for_alert_smtp"> + <nvpair id="email_sender" name="email_sender" + value="donotreply@example.com"/> + </instance_attributes> + <recipient id="smtp_destination" value="admin@example.com"/> + </alert> + </alerts> + </configuration> + + +.. index:: + single: alert; agent development + +Writing an Alert Agent +###################### + +.. index:: + single: alert; environment variables + single: environment variable; alert agents + +.. table:: **Environment variables passed to alert agents** + :class: longtable + :widths: 1 3 + + +---------------------------+----------------------------------------------------------------+ + | Environment Variable | Description | + +===========================+================================================================+ + | CRM_alert_kind | .. index:: | + | | single:environment variable; CRM_alert_kind | + | | single:CRM_alert_kind | + | | | + | | The type of alert (``node``, ``fencing``, ``resource``, or | + | | ``attribute``) | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_node | .. index:: | + | | single:environment variable; CRM_alert_node | + | | single:CRM_alert_node | + | | | + | | Name of affected node | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_node_sequence | .. index:: | + | | single:environment variable; CRM_alert_sequence | + | | single:CRM_alert_sequence | + | | | + | | A sequence number increased whenever an alert is being issued | + | | on the local node, which can be used to reference the order in | + | | which alerts have been issued by Pacemaker. An alert for an | + | | event that happened later in time reliably has a higher | + | | sequence number than alerts for earlier events. | + | | | + | | Be aware that this number has no cluster-wide meaning. | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_recipient | .. index:: | + | | single:environment variable; CRM_alert_recipient | + | | single:CRM_alert_recipient | + | | | + | | The configured recipient | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_timestamp | .. index:: | + | | single:environment variable; CRM_alert_timestamp | + | | single:CRM_alert_timestamp | + | | | + | | A timestamp created prior to executing the agent, in the | + | | format specified by the ``timestamp-format`` meta-attribute. | + | | This allows the agent to have a reliable, high-precision time | + | | of when the event occurred, regardless of when the agent | + | | itself was invoked (which could potentially be delayed due to | + | | system load, etc.). | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_timestamp_epoch | .. index:: | + | | single:environment variable; CRM_alert_timestamp_epoch | + | | single:CRM_alert_timestamp_epoch | + | | | + | | The same time as ``CRM_alert_timestamp``, expressed as the | + | | integer number of seconds since January 1, 1970. This (along | + | | with ``CRM_alert_timestamp_usec``) can be useful for alert | + | | agents that need to format time in a specific way rather than | + | | let the user configure it. 
| + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_timestamp_usec | .. index:: | + | | single:environment variable; CRM_alert_timestamp_usec | + | | single:CRM_alert_timestamp_usec | + | | | + | | The same time as ``CRM_alert_timestamp``, expressed as the | + | | integer number of microseconds since | + | | ``CRM_alert_timestamp_epoch``. | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_version | .. index:: | + | | single:environment variable; CRM_alert_version | + | | single:CRM_alert_version | + | | | + | | The version of Pacemaker sending the alert | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_desc | .. index:: | + | | single:environment variable; CRM_alert_desc | + | | single:CRM_alert_desc | + | | | + | | Detail about event. For ``node`` alerts, this is the node's | + | | current state (``member`` or ``lost``). For ``fencing`` | + | | alerts, this is a summary of the requested fencing operation, | + | | including origin, target, and fencing operation error code, if | + | | any. For ``resource`` alerts, this is a readable string | + | | equivalent of ``CRM_alert_status``. | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_nodeid | .. index:: | + | | single:environment variable; CRM_alert_nodeid | + | | single:CRM_alert_nodeid | + | | | + | | ID of node whose status changed (provided with ``node`` alerts | + | | only) | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_rc | .. index:: | + | | single:environment variable; CRM_alert_rc | + | | single:CRM_alert_rc | + | | | + | | The numerical return code of the fencing or resource operation | + | | (provided with ``fencing`` and ``resource`` alerts only) | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_task | .. index:: | + | | single:environment variable; CRM_alert_task | + | | single:CRM_alert_task | + | | | + | | The requested fencing or resource operation (provided with | + | | ``fencing`` and ``resource`` alerts only) | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_exec_time | .. index:: | + | | single:environment variable; CRM_alert_exec_time | + | | single:CRM_alert_exec_time | + | | | + | | The (wall-clock) time, in milliseconds, that it took to | + | | execute the action. If the action timed out, | + | | ``CRM_alert_status`` will be 2, ``CRM_alert_desc`` will be | + | | "Timed Out", and this value will be the action timeout. May | + | | not be supported on all platforms. (``resource`` alerts only) | + | | *(since 2.0.1)* | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_interval | .. index:: | + | | single:environment variable; CRM_alert_interval | + | | single:CRM_alert_interval | + | | | + | | The interval of the resource operation (``resource`` alerts | + | | only) | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_rsc | .. 
index:: | + | | single:environment variable; CRM_alert_rsc | + | | single:CRM_alert_rsc | + | | | + | | The name of the affected resource (``resource`` alerts only) | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_status | .. index:: | + | | single:environment variable; CRM_alert_status | + | | single:CRM_alert_status | + | | | + | | A numerical code used by Pacemaker to represent the operation | + | | result (``resource`` alerts only) | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_target_rc | .. index:: | + | | single:environment variable; CRM_alert_target_rc | + | | single:CRM_alert_target_rc | + | | | + | | The expected numerical return code of the operation | + | | (``resource`` alerts only) | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_attribute_name | .. index:: | + | | single:environment variable; CRM_alert_attribute_name | + | | single:CRM_alert_attribute_name | + | | | + | | The name of the node attribute that changed (``attribute`` | + | | alerts only) | + +---------------------------+----------------------------------------------------------------+ + | CRM_alert_attribute_value | .. index:: | + | | single:environment variable; CRM_alert_attribute_value | + | | single:CRM_alert_attribute_value | + | | | + | | The new value of the node attribute that changed | + | | (``attribute`` alerts only) | + +---------------------------+----------------------------------------------------------------+ + +Special concerns when writing alert agents: + +* Alert agents may be called with no recipient (if none is configured), + so the agent must be able to handle this situation, even if it + only exits in that case. (Users may modify the configuration in + stages, and add a recipient later.) + +* If more than one recipient is configured for an alert, the alert agent will + be called once per recipient. If an agent is not able to run concurrently, it + should be configured with only a single recipient. The agent is free, + however, to interpret the recipient as a list. + +* When a cluster event occurs, all alerts are fired off at the same time as + separate processes. Depending on how many alerts and recipients are + configured, and on what is done within the alert agents, + a significant load burst may occur. The agent could be written to take + this into consideration, for example by queueing resource-intensive actions + into some other instance, instead of directly executing them. + +* Alert agents are run as the ``hacluster`` user, which has a minimal set + of permissions. If an agent requires additional privileges, it is + recommended to configure ``sudo`` to allow the agent to run the necessary + commands as another user with the appropriate privileges. + +* As always, take care to validate and sanitize user-configured parameters, + such as ``CRM_alert_timestamp`` (whose content is specified by the + user-configured ``timestamp-format``), ``CRM_alert_recipient,`` and all + instance attributes. Mostly this is needed simply to protect against + configuration errors, but if some user can modify the CIB without having + ``hacluster``-level access to the cluster nodes, it is a potential security + concern as well, to avoid the possibility of code injection. + +.. 
note:: **ocf:pacemaker:ClusterMon compatibility** + + The alerts interface is designed to be backward compatible with the external + scripts interface used by the ``ocf:pacemaker:ClusterMon`` resource, which + is now deprecated. To preserve this compatibility, the environment variables + passed to alert agents are available prepended with ``CRM_notify_`` + as well as ``CRM_alert_``. One break in compatibility is that ``ClusterMon`` + ran external scripts as the ``root`` user, while alert agents are run as the + ``hacluster`` user. diff --git a/doc/sphinx/Pacemaker_Administration/cluster.rst b/doc/sphinx/Pacemaker_Administration/cluster.rst new file mode 100644 index 0000000..3713733 --- /dev/null +++ b/doc/sphinx/Pacemaker_Administration/cluster.rst @@ -0,0 +1,21 @@ +.. index:: + single: cluster layer + +The Cluster Layer +----------------- + +Pacemaker utilizes an underlying cluster layer for two purposes: + +* obtaining quorum +* messaging between nodes + +.. index:: + single: cluster layer; Corosync + single: Corosync + +Currently, only Corosync 2 and later is supported for this layer. + +This document assumes you have configured the cluster nodes in Corosync +already. High-level cluster management tools are available that can configure +Corosync for you. If you want the lower-level details, see the +`Corosync documentation <https://corosync.github.io/corosync/>`_. diff --git a/doc/sphinx/Pacemaker_Administration/configuring.rst b/doc/sphinx/Pacemaker_Administration/configuring.rst new file mode 100644 index 0000000..415dd81 --- /dev/null +++ b/doc/sphinx/Pacemaker_Administration/configuring.rst @@ -0,0 +1,278 @@ +.. index:: + single: configuration + single: CIB + +Configuring Pacemaker +--------------------- + +Pacemaker's configuration, the CIB, is stored in XML format. Cluster +administrators have multiple options for modifying the configuration either via +the XML, or at a more abstract (and easier for humans to understand) level. + +Pacemaker reacts to configuration changes as soon as they are saved. +Pacemaker's command-line tools and most higher-level tools provide the ability +to batch changes together and commit them at once, rather than make a series of +small changes, which could cause avoid unnecessary actions as Pacemaker +responds to each change individually. + +Pacemaker tracks revisions to the configuration and will reject any update +older than the current revision. Thus, it is a good idea to serialize all +changes to the configuration. Avoid attempting simultaneous changes, whether on +the same node or different nodes, and whether manually or using some automated +configuration tool. + +.. note:: + + It is not necessary to update the configuration on all cluster nodes. + Pacemaker immediately synchronizes changes to all active members of the + cluster. To reduce bandwidth, the cluster only broadcasts the incremental + updates that result from your changes and uses checksums to ensure that each + copy is consistent. + + +Configuration Using Higher-level Tools +###################################### + +Most users will benefit from using higher-level tools provided by projects +separate from Pacemaker. Some of the most commonly used include the crm shell, +hawk, and pcs. [#]_ + +See those projects' documentation for details on how to configure Pacemaker +using them. + + +Configuration Using Pacemaker's Command-Line Tools +################################################## + +Pacemaker provides lower-level, command-line tools to manage the cluster. 
Most +configuration tasks can be performed with these tools, without needing any XML +knowledge. + +To enable STONITH for example, one could run: + +.. code-block:: none + + # crm_attribute --name stonith-enabled --update 1 + +Or, to check whether **node1** is allowed to run resources, there is: + +.. code-block:: none + + # crm_standby --query --node node1 + +Or, to change the failure threshold of **my-test-rsc**, one can use: + +.. code-block:: none + + # crm_resource -r my-test-rsc --set-parameter migration-threshold --parameter-value 3 --meta + +Examples of using these tools for specific cases will be given throughout this +document where appropriate. See the man pages for further details. + +See :ref:`cibadmin` for how to edit the CIB using XML. + +See :ref:`crm_shadow` for a way to make a series of changes, then commit them +all at once to the live cluster. + + +.. index:: + single: configuration; CIB properties + single: CIB; properties + single: CIB property + +Working with CIB Properties +___________________________ + +Although these fields can be written to by the user, in +most cases the cluster will overwrite any values specified by the +user with the "correct" ones. + +To change the ones that can be specified by the user, for example +``admin_epoch``, one should use: + +.. code-block:: none + + # cibadmin --modify --xml-text '<cib admin_epoch="42"/>' + +A complete set of CIB properties will look something like this: + +.. topic:: XML attributes set for a cib element + + .. code-block:: xml + + <cib crm_feature_set="3.0.7" validate-with="pacemaker-1.2" + admin_epoch="42" epoch="116" num_updates="1" + cib-last-written="Mon Jan 12 15:46:39 2015" update-origin="rhel7-1" + update-client="crm_attribute" have-quorum="1" dc-uuid="1"> + + +.. index:: + single: configuration; cluster options + +Querying and Setting Cluster Options +____________________________________ + +Cluster options can be queried and modified using the ``crm_attribute`` tool. +To get the current value of ``cluster-delay``, you can run: + +.. code-block:: none + + # crm_attribute --query --name cluster-delay + +which is more simply written as + +.. code-block:: none + + # crm_attribute -G -n cluster-delay + +If a value is found, you'll see a result like this: + +.. code-block:: none + + # crm_attribute -G -n cluster-delay + scope=crm_config name=cluster-delay value=60s + +If no value is found, the tool will display an error: + +.. code-block:: none + + # crm_attribute -G -n clusta-deway + scope=crm_config name=clusta-deway value=(null) + Error performing operation: No such device or address + +To use a different value (for example, 30 seconds), simply run: + +.. code-block:: none + + # crm_attribute --name cluster-delay --update 30s + +To go back to the cluster's default value, you can delete the value, for example: + +.. code-block:: none + + # crm_attribute --name cluster-delay --delete + Deleted crm_config option: id=cib-bootstrap-options-cluster-delay name=cluster-delay + + +When Options are Listed More Than Once +______________________________________ + +If you ever see something like the following, it means that the option you're +modifying is present more than once. + +.. topic:: Deleting an option that is listed twice + + .. 
code-block:: none + + # crm_attribute --name batch-limit --delete + + Please choose from one of the matches below and supply the 'id' with --id + Multiple attributes match name=batch-limit in crm_config: + Value: 50 (set=cib-bootstrap-options, id=cib-bootstrap-options-batch-limit) + Value: 100 (set=custom, id=custom-batch-limit) + +In such cases, follow the on-screen instructions to perform the requested +action. To determine which value is currently being used by the cluster, refer +to the "Rules" chapter of *Pacemaker Explained*. + + +.. index:: + single: configuration; remote + +.. _remote_connection: + +Connecting from a Remote Machine +################################ + +Provided Pacemaker is installed on a machine, it is possible to connect to the +cluster even if the machine itself is not in the same cluster. To do this, one +simply sets up a number of environment variables and runs the same commands as +when working on a cluster node. + +.. table:: **Environment Variables Used to Connect to Remote Instances of the CIB** + + +----------------------+-----------+------------------------------------------------+ + | Environment Variable | Default | Description | + +======================+===========+================================================+ + | CIB_user | $USER | .. index:: | + | | | single: CIB_user | + | | | single: environment variable; CIB_user | + | | | | + | | | The user to connect as. Needs to be | + | | | part of the ``haclient`` group on | + | | | the target host. | + +----------------------+-----------+------------------------------------------------+ + | CIB_passwd | | .. index:: | + | | | single: CIB_passwd | + | | | single: environment variable; CIB_passwd | + | | | | + | | | The user's password. Read from the | + | | | command line if unset. | + +----------------------+-----------+------------------------------------------------+ + | CIB_server | localhost | .. index:: | + | | | single: CIB_server | + | | | single: environment variable; CIB_server | + | | | | + | | | The host to contact | + +----------------------+-----------+------------------------------------------------+ + | CIB_port | | .. index:: | + | | | single: CIB_port | + | | | single: environment variable; CIB_port | + | | | | + | | | The port on which to contact the server; | + | | | required. | + +----------------------+-----------+------------------------------------------------+ + | CIB_encrypted | TRUE | .. index:: | + | | | single: CIB_encrypted | + | | | single: environment variable; CIB_encrypted | + | | | | + | | | Whether to encrypt network traffic | + +----------------------+-----------+------------------------------------------------+ + +So, if **c001n01** is an active cluster node and is listening on port 1234 +for connections, and **someuser** is a member of the **haclient** group, +then the following would prompt for **someuser**'s password and return +the cluster's current configuration: + +.. code-block:: none + + # export CIB_port=1234; export CIB_server=c001n01; export CIB_user=someuser; + # cibadmin -Q + +For security reasons, the cluster does not listen for remote connections by +default. If you wish to allow remote access, you need to set the +``remote-tls-port`` (encrypted) or ``remote-clear-port`` (unencrypted) CIB +properties (i.e., those kept in the ``cib`` tag, like ``num_updates`` and +``epoch``). + +.. 
table:: **Extra top-level CIB properties for remote access** + + +----------------------+-----------+------------------------------------------------------+ + | CIB Property | Default | Description | + +======================+===========+======================================================+ + | remote-tls-port | | .. index:: | + | | | single: remote-tls-port | + | | | single: CIB property; remote-tls-port | + | | | | + | | | Listen for encrypted remote connections | + | | | on this port. | + +----------------------+-----------+------------------------------------------------------+ + | remote-clear-port | | .. index:: | + | | | single: remote-clear-port | + | | | single: CIB property; remote-clear-port | + | | | | + | | | Listen for plaintext remote connections | + | | | on this port. | + +----------------------+-----------+------------------------------------------------------+ + +.. important:: + + The Pacemaker version on the administration host must be the same or greater + than the version(s) on the cluster nodes. Otherwise, it may not have the + schema files necessary to validate the CIB. + + +.. rubric:: Footnotes + +.. [#] For a list, see "Configuration Tools" at + https://clusterlabs.org/components.html diff --git a/doc/sphinx/Pacemaker_Administration/index.rst b/doc/sphinx/Pacemaker_Administration/index.rst new file mode 100644 index 0000000..327ad31 --- /dev/null +++ b/doc/sphinx/Pacemaker_Administration/index.rst @@ -0,0 +1,36 @@ +Pacemaker Administration +======================== + +*Managing Pacemaker Clusters* + + +Abstract +-------- +This document has instructions and tips for system administrators who +manage high-availability clusters using Pacemaker. + + +Table of Contents +----------------- + +.. toctree:: + :maxdepth: 3 + :numbered: + + intro + installing + cluster + configuring + tools + troubleshooting + upgrading + alerts + agents + pcs-crmsh + + +Index +----- + +* :ref:`genindex` +* :ref:`search` diff --git a/doc/sphinx/Pacemaker_Administration/installing.rst b/doc/sphinx/Pacemaker_Administration/installing.rst new file mode 100644 index 0000000..44a3f5f --- /dev/null +++ b/doc/sphinx/Pacemaker_Administration/installing.rst @@ -0,0 +1,9 @@ +Installing Cluster Software +--------------------------- + +.. index:: installation + +Most major Linux distributions have pacemaker packages in their standard +package repositories, or the software can be built from source code. +See the `Install wiki page <https://wiki.clusterlabs.org/wiki/Install>`_ +for details. diff --git a/doc/sphinx/Pacemaker_Administration/intro.rst b/doc/sphinx/Pacemaker_Administration/intro.rst new file mode 100644 index 0000000..067e293 --- /dev/null +++ b/doc/sphinx/Pacemaker_Administration/intro.rst @@ -0,0 +1,21 @@ +Introduction +------------ + +The Scope of this Document +########################## + +The purpose of this document is to help system administrators learn how to +manage a Pacemaker cluster. + +System administrators may be interested in other parts of the +`Pacemaker documentation set <https://www.clusterlabs.org/pacemaker/doc/>`_ +such as *Clusters from Scratch*, a step-by-step guide to setting up an example +cluster, and *Pacemaker Explained*, an exhaustive reference for cluster +configuration. + +Multiple higher-level tools (both command-line and GUI) are available to +simplify cluster management. However, this document focuses on the lower-level +command-line tools that come with Pacemaker itself. 
The concepts are applicable +to the higher-level tools, though the syntax would differ. + +.. include:: ../shared/pacemaker-intro.rst diff --git a/doc/sphinx/Pacemaker_Administration/pcs-crmsh.rst b/doc/sphinx/Pacemaker_Administration/pcs-crmsh.rst new file mode 100644 index 0000000..61ab4e6 --- /dev/null +++ b/doc/sphinx/Pacemaker_Administration/pcs-crmsh.rst @@ -0,0 +1,441 @@ +Quick Comparison of pcs and crm shell +------------------------------------- + +``pcs`` and ``crm shell`` are two popular higher-level command-line interfaces +to Pacemaker. Each has its own syntax; this chapter gives a quick comparion of +how to accomplish the same tasks using either one. Some examples also show the +equivalent command using low-level Pacmaker command-line tools. + +These examples show the simplest syntax; see the respective man pages for all +possible options. + +Show Cluster Configuration and Status +##################################### + +.. topic:: Show Configuration (Raw XML) + + .. code-block:: none + + crmsh # crm configure show xml + pcs # pcs cluster cib + pacemaker # cibadmin -Q + +.. topic:: Show Configuration (Human-friendly) + + .. code-block:: none + + crmsh # crm configure show + pcs # pcs config + +.. topic:: Show Cluster Status + + .. code-block:: none + + crmsh # crm status + pcs # pcs status + pacemaker # crm_mon -1 + +Manage Nodes +############ + +.. topic:: Put node "pcmk-1" in standby mode + + .. code-block:: none + + crmsh # crm node standby pcmk-1 + pcs-0.9 # pcs cluster standby pcmk-1 + pcs-0.10 # pcs node standby pcmk-1 + pacemaker # crm_standby -N pcmk-1 -v on + +.. topic:: Remove node "pcmk-1" from standby mode + + .. code-block:: none + + crmsh # crm node online pcmk-1 + pcs-0.9 # pcs cluster unstandby pcmk-1 + pcs-0.10 # pcs node unstandby pcmk-1 + pacemaker # crm_standby -N pcmk-1 -v off + +Manage Cluster Properties +######################### + +.. topic:: Set the "stonith-enabled" cluster property to "false" + + .. code-block:: none + + crmsh # crm configure property stonith-enabled=false + pcs # pcs property set stonith-enabled=false + pacemaker # crm_attribute -n stonith-enabled -v false + +Show Resource Agent Information +############################### + +.. topic:: List Resource Agent (RA) Classes + + .. code-block:: none + + crmsh # crm ra classes + pcs # pcs resource standards + pacmaker # crm_resource --list-standards + +.. topic:: List Available Resource Agents (RAs) by Standard + + .. code-block:: none + + crmsh # crm ra list ocf + pcs # pcs resource agents ocf + pacemaker # crm_resource --list-agents ocf + +.. topic:: List Available Resource Agents (RAs) by OCF Provider + + .. code-block:: none + + crmsh # crm ra list ocf pacemaker + pcs # pcs resource agents ocf:pacemaker + pacemaker # crm_resource --list-agents ocf:pacemaker + +.. topic:: List Available Resource Agent Parameters + + .. code-block:: none + + crmsh # crm ra info IPaddr2 + pcs # pcs resource describe IPaddr2 + pacemaker # crm_resource --show-metadata ocf:heartbeat:IPaddr2 + +You can also use the full ``class:provider:type`` format with crmsh and pcs if +multiple RAs with the same name are available. + +.. topic:: Show Available Fence Agent Parameters + + .. code-block:: none + + crmsh # crm ra info stonith:fence_ipmilan + pcs # pcs stonith describe fence_ipmilan + +Manage Resources +################ + +.. topic:: Create a Resource + + .. 
code-block:: none + + crmsh # crm configure primitive ClusterIP ocf:heartbeat:IPaddr2 \ + params ip=192.168.122.120 cidr_netmask=24 \ + op monitor interval=30s + pcs # pcs resource create ClusterIP IPaddr2 ip=192.168.122.120 cidr_netmask=24 + +pcs determines the standard and provider (``ocf:heartbeat``) automatically +since ``IPaddr2`` is unique, and automatically creates operations (including +monitor) based on the agent's meta-data. + +.. topic:: Show Configuration of All Resources + + .. code-block:: none + + crmsh # crm configure show + pcs-0.9 # pcs resource show --full + pcs-0.10 # pcs resource config + +.. topic:: Show Configuration of One Resource + + .. code-block:: none + + crmsh # crm configure show ClusterIP + pcs-0.9 # pcs resource show ClusterIP + pcs-0.10 # pcs resource config ClusterIP + +.. topic:: Show Configuration of Fencing Resources + + .. code-block:: none + + crmsh # crm resource status + pcs-0.9 # pcs stonith show --full + pcs-0.10 # pcs stonith config + +.. topic:: Start a Resource + + .. code-block:: none + + crmsh # crm resource start ClusterIP + pcs # pcs resource enable ClusterIP + pacemaker # crm_resource -r ClusterIP --set-parameter target-role --meta -v Started + +.. topic:: Stop a Resource + + .. code-block:: none + + crmsh # crm resource stop ClusterIP + pcs # pcs resource disable ClusterIP + pacemaker # crm_resource -r ClusterIP --set-parameter target-role --meta -v Stopped + +.. topic:: Remove a Resource + + .. code-block:: none + + crmsh # crm configure delete ClusterIP + pcs # pcs resource delete ClusterIP + +.. topic:: Modify a Resource's Instance Parameters + + .. code-block:: none + + crmsh # crm resource param ClusterIP set clusterip_hash=sourceip + pcs # pcs resource update ClusterIP clusterip_hash=sourceip + pacemaker # crm_resource -r ClusterIP --set-parameter clusterip_hash -v sourceip + +crmsh also has an `edit` command which edits the simplified CIB syntax +(same commands as the command line) via a configurable text editor. + +.. topic:: Modify a Resource's Instance Parameters Interactively + + .. code-block:: none + + crmsh # crm configure edit ClusterIP + +Using the interactive shell mode of crmsh, multiple changes can be +edited and verified before committing to the live configuration: + +.. topic:: Make Multiple Configuration Changes Interactively + + .. code-block:: none + + crmsh # crm configure + crmsh # edit + crmsh # verify + crmsh # commit + +.. topic:: Delete a Resource's Instance Parameters + + .. code-block:: none + + crmsh # crm resource param ClusterIP delete nic + pcs # pcs resource update ClusterIP nic= + pacemaker # crm_resource -r ClusterIP --delete-parameter nic + +.. topic:: List Current Resource Defaults + + .. code-block:: none + + crmsh # crm configure show type:rsc_defaults + pcs # pcs resource defaults + pacemaker # cibadmin -Q --scope rsc_defaults + +.. topic:: Set Resource Defaults + + .. code-block:: none + + crmsh # crm configure rsc_defaults resource-stickiness=100 + pcs # pcs resource defaults resource-stickiness=100 + +.. topic:: List Current Operation Defaults + + .. code-block:: none + + crmsh # crm configure show type:op_defaults + pcs # pcs resource op defaults + pacemaker # cibadmin -Q --scope op_defaults + +.. topic:: Set Operation Defaults + + .. code-block:: none + + crmsh # crm configure op_defaults timeout=240s + pcs # pcs resource op defaults timeout=240s + +.. topic:: Enable Resource Agent Tracing for a Resource + + .. code-block:: none + + crmsh # crm resource trace Website + +.. 
topic:: Clear Fail Counts for a Resource + + .. code-block:: none + + crmsh # crm resource cleanup Website + pcs # pcs resource cleanup Website + pacemaker # crm_resource --cleanup -r Website + +.. topic:: Create a Clone Resource + + .. code-block:: none + + crmsh # crm configure clone WebIP ClusterIP meta globally-unique=true clone-max=2 clone-node-max=2 + pcs # pcs resource clone ClusterIP globally-unique=true clone-max=2 clone-node-max=2 + +.. topic:: Create a Promotable Clone Resource + + .. code-block:: none + + crmsh # crm configure ms WebDataClone WebData \ + meta master-max=1 master-node-max=1 \ + clone-max=2 clone-node-max=1 notify=true + pcs-0.9 # pcs resource master WebDataClone WebData \ + master-max=1 master-node-max=1 \ + clone-max=2 clone-node-max=1 notify=true + pcs-0.10 # pcs resource promotable WebData WebDataClone \ + promoted-max=1 promoted-node-max=1 \ + clone-max=2 clone-node-max=1 notify=true + +pcs will generate the clone name automatically if it is omitted from the +command line. + + +Manage Constraints +################## + +.. topic:: Create a Colocation Constraint + + .. code-block:: none + + crmsh # crm configure colocation website-with-ip INFINITY: WebSite ClusterIP + pcs # pcs constraint colocation add ClusterIP with WebSite INFINITY + +.. topic:: Create a Colocation Constraint Based on Role + + .. code-block:: none + + crmsh # crm configure colocation another-ip-with-website inf: AnotherIP WebSite:Master + pcs # pcs constraint colocation add Started AnotherIP with Promoted WebSite INFINITY + +.. topic:: Create an Ordering Constraint + + .. code-block:: none + + crmsh # crm configure order apache-after-ip mandatory: ClusterIP WebSite + pcs # pcs constraint order ClusterIP then WebSite + +.. topic:: Create an Ordering Constraint Based on Role + + .. code-block:: none + + crmsh # crm configure order ip-after-website Mandatory: WebSite:Master AnotherIP + pcs # pcs constraint order promote WebSite then start AnotherIP + +.. topic:: Create a Location Constraint + + .. code-block:: none + + crmsh # crm configure location prefer-pcmk-1 WebSite 50: pcmk-1 + pcs # pcs constraint location WebSite prefers pcmk-1=50 + +.. topic:: Create a Location Constraint Based on Role + + .. code-block:: none + + crmsh # crm configure location prefer-pcmk-1 WebSite rule role=Master 50: \#uname eq pcmk-1 + pcs # pcs constraint location WebSite rule role=Promoted 50 \#uname eq pcmk-1 + +.. topic:: Move a Resource to a Specific Node (by Creating a Location Constraint) + + .. code-block:: none + + crmsh # crm resource move WebSite pcmk-1 + pcs # pcs resource move WebSite pcmk-1 + pacemaker # crm_resource -r WebSite --move -N pcmk-1 + +.. topic:: Move a Resource Away from Its Current Node (by Creating a Location Constraint) + + .. code-block:: none + + crmsh # crm resource ban Website pcmk-2 + pcs # pcs resource ban Website pcmk-2 + pacemaker # crm_resource -r WebSite --move + +.. topic:: Remove any Constraints Created by Moving a Resource + + .. code-block:: none + + crmsh # crm resource unmove WebSite + pcs # pcs resource clear WebSite + pacemaker # crm_resource -r WebSite --clear + +Advanced Configuration +###################### + +Manipulate Configuration Elements by Type +_________________________________________ + +.. topic:: List Constraints with IDs + + .. code-block:: none + + pcs # pcs constraint list --full + +.. topic:: Remove Constraint by ID + + .. 
code-block:: none + + pcs # pcs constraint remove cli-ban-Website-on-pcmk-1 + crmsh # crm configure remove cli-ban-Website-on-pcmk-1 + +crmsh's `show` and `edit` commands can be used to manage resources and +constraints by type: + +.. topic:: Show Configuration Elements + + .. code-block:: none + + crmsh # crm configure show type:primitive + crmsh # crm configure edit type:colocation + +Batch Changes +_____________ + +.. topic:: Make Multiple Changes and Apply Together + + .. code-block:: none + + crmsh # crm + crmsh # cib new drbd_cfg + crmsh # configure primitive WebData ocf:linbit:drbd params drbd_resource=wwwdata \ + op monitor interval=60s + crmsh # configure ms WebDataClone WebData meta master-max=1 master-node-max=1 \ + clone-max=2 clone-node-max=1 notify=true + crmsh # cib commit drbd_cfg + crmsh # quit + + pcs # pcs cluster cib drbd_cfg + pcs # pcs -f drbd_cfg resource create WebData ocf:linbit:drbd drbd_resource=wwwdata \ + op monitor interval=60s + pcs-0.9 # pcs -f drbd_cfg resource master WebDataClone WebData \ + master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true + pcs-0.10 # pcs -f drbd_cfg resource promotable WebData WebDataClone \ + promoted-max=1 promoted-node-max=1 clone-max=2 clone-node-max=1 notify=true + pcs # pcs cluster cib-push drbd_cfg + +Template Creation +_________________ + +.. topic:: Create Resource Template Based on Existing Primitives of Same Type + + .. code-block:: none + + crmsh # crm configure assist template ClusterIP AdminIP + +Log Analysis +____________ + +.. topic:: Show Information About Recent Cluster Events + + .. code-block:: none + + crmsh # crm history + crmsh # peinputs + crmsh # transition pe-input-10 + crmsh # transition log pe-input-10 + +Configuration Scripts +_____________________ + +.. topic:: Script Multiple-step Cluster Configurations + + .. code-block:: none + + crmsh # crm script show apache + crmsh # crm script run apache \ + id=WebSite \ + install=true \ + virtual-ip:ip=192.168.0.15 \ + database:id=WebData \ + database:install=true diff --git a/doc/sphinx/Pacemaker_Administration/tools.rst b/doc/sphinx/Pacemaker_Administration/tools.rst new file mode 100644 index 0000000..5a6044d --- /dev/null +++ b/doc/sphinx/Pacemaker_Administration/tools.rst @@ -0,0 +1,562 @@ +.. index:: command-line tool + +Using Pacemaker Command-Line Tools +---------------------------------- + +.. index:: + single: command-line tool; output format + +.. _cmdline_output: + +Controlling Command Line Output +############################### + +Some of the pacemaker command line utilities have been converted to a new +output system. Among these tools are ``crm_mon`` and ``stonith_admin``. This +is an ongoing project, and more tools will be converted over time. This system +lets you control the formatting of output with ``--output-as=`` and the +destination of output with ``--output-to=``. + +The available formats vary by tool, but at least plain text and XML are +supported by all tools that use the new system. The default format is plain +text. The default destination is stdout but can be redirected to any file. +Some formats support command line options for changing the style of the output. +For instance: + +.. code-block:: none + + # crm_mon --help-output + Usage: + crm_mon [OPTION?] + + Provides a summary of cluster's current state. + + Outputs varying levels of detail in a number of different formats. 
+ + Output Options: + --output-as=FORMAT Specify output format as one of: console (default), html, text, xml + --output-to=DEST Specify file name for output (or "-" for stdout) + --html-cgi Add text needed to use output in a CGI program + --html-stylesheet=URI Link to an external CSS stylesheet + --html-title=TITLE Page title + --text-fancy Use more highly formatted output + +.. index:: + single: crm_mon + single: command-line tool; crm_mon + +.. _crm_mon: + +Monitor a Cluster with crm_mon +############################## + +The ``crm_mon`` utility displays the current state of an active cluster. It can +show the cluster status organized by node or by resource, and can be used in +either single-shot or dynamically updating mode. It can also display operations +performed and information about failures. + +Using this tool, you can examine the state of the cluster for irregularities, +and see how it responds when you cause or simulate failures. + +See the manual page or the output of ``crm_mon --help`` for a full description +of its many options. + +.. topic:: Sample output from crm_mon -1 + + .. code-block:: none + + Cluster Summary: + * Stack: corosync + * Current DC: node2 (version 2.0.0-1) - partition with quorum + * Last updated: Mon Jan 29 12:18:42 2018 + * Last change: Mon Jan 29 12:18:40 2018 by root via crm_attribute on node3 + * 5 nodes configured + * 2 resources configured + + Node List: + * Online: [ node1 node2 node3 node4 node5 ] + + * Active resources: + * Fencing (stonith:fence_xvm): Started node1 + * IP (ocf:heartbeat:IPaddr2): Started node2 + +.. topic:: Sample output from crm_mon -n -1 + + .. code-block:: none + + Cluster Summary: + * Stack: corosync + * Current DC: node2 (version 2.0.0-1) - partition with quorum + * Last updated: Mon Jan 29 12:21:48 2018 + * Last change: Mon Jan 29 12:18:40 2018 by root via crm_attribute on node3 + * 5 nodes configured + * 2 resources configured + + * Node List: + * Node node1: online + * Fencing (stonith:fence_xvm): Started + * Node node2: online + * IP (ocf:heartbeat:IPaddr2): Started + * Node node3: online + * Node node4: online + * Node node5: online + +As mentioned in an earlier chapter, the DC is the node is where decisions are +made. The cluster elects a node to be DC as needed. The only significance of +the choice of DC to an administrator is the fact that its logs will have the +most information about why decisions were made. + +.. index:: + pair: crm_mon; CSS + +.. _crm_mon_css: + +Styling crm_mon HTML output +___________________________ + +Various parts of ``crm_mon``'s HTML output have a CSS class associated with +them. Not everything does, but some of the most interesting portions do. In +the following example, the status of each node has an ``online`` class and the +details of each resource have an ``rsc-ok`` class. + +.. code-block:: html + + <h2>Node List</h2> + <ul> + <li> + <span>Node: cluster01</span><span class="online"> online</span> + </li> + <li><ul><li><span class="rsc-ok">ping (ocf::pacemaker:ping): Started</span></li></ul></li> + <li> + <span>Node: cluster02</span><span class="online"> online</span> + </li> + <li><ul><li><span class="rsc-ok">ping (ocf::pacemaker:ping): Started</span></li></ul></li> + </ul> + +By default, a stylesheet for styling these classes is included in the head of +the HTML output. The relevant portions of this stylesheet that would be used +in the above example is: + +.. 
code-block:: css + + <style> + .online { color: green } + .rsc-ok { color: green } + </style> + +If you want to override some or all of the styling, simply create your own +stylesheet, place it on a web server, and pass ``--html-stylesheet=<URL>`` +to ``crm_mon``. The link is added after the default stylesheet, so your +changes take precedence. You don't need to duplicate the entire default. +Only include what you want to change. + +.. index:: + single: cibadmin + single: command-line tool; cibadmin + +.. _cibadmin: + +Edit the CIB XML with cibadmin +############################## + +The most flexible tool for modifying the configuration is Pacemaker's +``cibadmin`` command. With ``cibadmin``, you can query, add, remove, update +or replace any part of the configuration. All changes take effect immediately, +so there is no need to perform a reload-like operation. + +The simplest way of using ``cibadmin`` is to use it to save the current +configuration to a temporary file, edit that file with your favorite +text or XML editor, and then upload the revised configuration. + +.. topic:: Safely using an editor to modify the cluster configuration + + .. code-block:: none + + # cibadmin --query > tmp.xml + # vi tmp.xml + # cibadmin --replace --xml-file tmp.xml + +Some of the better XML editors can make use of a RELAX NG schema to +help make sure any changes you make are valid. The schema describing +the configuration can be found in ``pacemaker.rng``, which may be +deployed in a location such as ``/usr/share/pacemaker`` depending on your +operating system distribution and how you installed the software. + +If you want to modify just one section of the configuration, you can +query and replace just that section to avoid modifying any others. + +.. topic:: Safely using an editor to modify only the resources section + + .. code-block:: none + + # cibadmin --query --scope resources > tmp.xml + # vi tmp.xml + # cibadmin --replace --scope resources --xml-file tmp.xml + +To quickly delete a part of the configuration, identify the object you wish to +delete by XML tag and id. For example, you might search the CIB for all +STONITH-related configuration: + +.. topic:: Searching for STONITH-related configuration items + + .. code-block:: none + + # cibadmin --query | grep stonith + <nvpair id="cib-bootstrap-options-stonith-action" name="stonith-action" value="reboot"/> + <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="1"/> + <primitive id="child_DoFencing" class="stonith" type="external/vmware"> + <lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith"> + <lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith"> + <lrm_resource id="child_DoFencing:1" type="external/vmware" class="stonith"> + <lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith"> + <lrm_resource id="child_DoFencing:2" type="external/vmware" class="stonith"> + <lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith"> + <lrm_resource id="child_DoFencing:3" type="external/vmware" class="stonith"> + +If you wanted to delete the ``primitive`` tag with id ``child_DoFencing``, +you would run: + +.. code-block:: none + + # cibadmin --delete --xml-text '<primitive id="child_DoFencing"/>' + +See the cibadmin man page for more options. + +.. warning:: + + Never edit the live ``cib.xml`` file directly. Pacemaker will detect such + changes and refuse to use the configuration. + + +.. 
index:: + single: crm_shadow + single: command-line tool; crm_shadow + +.. _crm_shadow: + +Batch Configuration Changes with crm_shadow +########################################### + +Often, it is desirable to preview the effects of a series of configuration +changes before updating the live configuration all at once. For this purpose, +``crm_shadow`` creates a "shadow" copy of the configuration and arranges for +all the command-line tools to use it. + +To begin, simply invoke ``crm_shadow --create`` with a name of your choice, +and follow the simple on-screen instructions. Shadow copies are identified with +a name to make it possible to have more than one. + +.. warning:: + + Read this section and the on-screen instructions carefully; failure to do so + could result in destroying the cluster's active configuration! + +.. topic:: Creating and displaying the active sandbox + + .. code-block:: none + + # crm_shadow --create test + Setting up shadow instance + Type Ctrl-D to exit the crm_shadow shell + shadow[test]: + shadow[test] # crm_shadow --which + test + +From this point on, all cluster commands will automatically use the shadow copy +instead of talking to the cluster's active configuration. Once you have +finished experimenting, you can either make the changes active via the +``--commit`` option, or discard them using the ``--delete`` option. Again, be +sure to follow the on-screen instructions carefully! + +For a full list of ``crm_shadow`` options and commands, invoke it with the +``--help`` option. + +.. topic:: Use sandbox to make multiple changes all at once, discard them, and verify real configuration is untouched + + .. code-block:: none + + shadow[test] # crm_failcount -r rsc_c001n01 -G + scope=status name=fail-count-rsc_c001n01 value=0 + shadow[test] # crm_standby --node c001n02 -v on + shadow[test] # crm_standby --node c001n02 -G + scope=nodes name=standby value=on + + shadow[test] # cibadmin --erase --force + shadow[test] # cibadmin --query + <cib crm_feature_set="3.0.14" validate-with="pacemaker-3.0" epoch="112" num_updates="2" admin_epoch="0" cib-last-written="Mon Jan 8 23:26:47 2018" update-origin="rhel7-1" update-client="crm_node" update-user="root" have-quorum="1" dc-uuid="1"> + <configuration> + <crm_config/> + <nodes/> + <resources/> + <constraints/> + </configuration> + <status/> + </cib> + shadow[test] # crm_shadow --delete test --force + Now type Ctrl-D to exit the crm_shadow shell + shadow[test] # exit + # crm_shadow --which + No active shadow configuration defined + # cibadmin -Q + <cib crm_feature_set="3.0.14" validate-with="pacemaker-3.0" epoch="110" num_updates="2" admin_epoch="0" cib-last-written="Mon Jan 8 23:26:47 2018" update-origin="rhel7-1" update-client="crm_node" update-user="root" have-quorum="1"> + <configuration> + <crm_config> + <cluster_property_set id="cib-bootstrap-options"> + <nvpair id="cib-bootstrap-1" name="stonith-enabled" value="1"/> + <nvpair id="cib-bootstrap-2" name="pe-input-series-max" value="30000"/> + +See the next section, :ref:`crm_simulate`, for how to test your changes before +committing them to the live cluster. + + +.. index:: + single: crm_simulate + single: command-line tool; crm_simulate + +.. _crm_simulate: + +Simulate Cluster Activity with crm_simulate +########################################### + +The command-line tool `crm_simulate` shows the results of the same logic +the cluster itself uses to respond to a particular cluster configuration and +status. 
+ +As always, the man page is the primary documentation, and should be consulted +for further details. This section aims for a better conceptual explanation and +practical examples. + +Replaying cluster decision-making logic +_______________________________________ + +At any given time, one node in a Pacemaker cluster will be elected DC, and that +node will run Pacemaker's scheduler to make decisions. + +Each time decisions need to be made (a "transition"), the DC will have log +messages like "Calculated transition ... saving inputs in ..." with a file +name. You can grab the named file and replay the cluster logic to see why +particular decisions were made. The file contains the live cluster +configuration at that moment, so you can also look at it directly to see the +value of node attributes, etc., at that time. + +The simplest usage is (replacing $FILENAME with the actual file name): + +.. topic:: Simulate cluster response to a given CIB + + .. code-block:: none + + # crm_simulate --simulate --xml-file $FILENAME + +That will show the cluster state when the process started, the actions that +need to be taken ("Transition Summary"), and the resulting cluster state if the +actions succeed. Most actions will have a brief description of why they were +required. + +The transition inputs may be compressed. ``crm_simulate`` can handle these +compressed files directly, though if you want to edit the file, you'll need to +uncompress it first. + +You can do the same simulation for the live cluster configuration at the +current moment. This is useful mainly when using ``crm_shadow`` to create a +sandbox version of the CIB; the ``--live-check`` option will use the shadow CIB +if one is in effect. + +.. topic:: Simulate cluster response to current live CIB or shadow CIB + + .. code-block:: none + + # crm_simulate --simulate --live-check + + +Why decisions were made +_______________________ + +To get further insight into the "why", it gets user-unfriendly very quickly. If +you add the ``--show-scores`` option, you will also see all the scores that +went into the decision-making. The node with the highest cumulative score for a +resource will run it. You can look for ``-INFINITY`` scores in particular to +see where complete bans came into effect. + +You can also add ``-VVVV`` to get more detailed messages about what's happening +under the hood. You can add up to two more V's even, but that's usually useful +only if you're a masochist or tracing through the source code. + + +Visualizing the action sequence +_______________________________ + +Another handy feature is the ability to generate a visual graph of the actions +needed, using the ``--save-dotfile`` option. This relies on the separate +Graphviz [#]_ project. + +.. topic:: Generate a visual graph of cluster actions from a saved CIB + + .. code-block:: none + + # crm_simulate --simulate --xml-file $FILENAME --save-dotfile $FILENAME.dot + # dot $FILENAME.dot -Tsvg > $FILENAME.svg + +``$FILENAME.dot`` will contain a GraphViz representation of the cluster's +response to your changes, including all actions with their ordering +dependencies. + +``$FILENAME.svg`` will be the same information in a standard graphical format +that you can view in your browser or other app of choice. You could, of course, +use other ``dot`` options to generate other formats. 
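+
+For instance, a minimal variation of the command above (assuming Graphviz is
+installed; ``$FILENAME`` is the same placeholder) renders the graph as a PNG
+image instead of an SVG:
+
+.. topic:: Render the action graph as a PNG image
+
+   .. code-block:: none
+
+      # dot $FILENAME.dot -Tpng -o $FILENAME.png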
+ +How to interpret the graphical output: + + * Bubbles indicate actions, and arrows indicate ordering dependencies + * Resource actions have text of the form + ``<RESOURCE>_<ACTION>_<INTERVAL_IN_MS> <NODE>`` indicating that the + specified action will be executed for the specified resource on the + specified node, once if interval is 0 or at specified recurring interval + otherwise + * Actions with black text will be sent to the executor (that is, the + appropriate agent will be invoked) + * Actions with orange text are "pseudo" actions that the cluster uses + internally for ordering but require no real activity + * Actions with a solid green border are part of the transition (that is, the + cluster will attempt to execute them in the given order -- though a + transition can be interrupted by action failure or new events) + * Dashed arrows indicate dependencies that are not present in the transition + graph + * Actions with a dashed border will not be executed. If the dashed border is + blue, the cluster does not feel the action needs to be executed. If the + dashed border is red, the cluster would like to execute the action but + cannot. Any actions depending on an action with a dashed border will not be + able to execute. + * Loops should not happen, and should be reported as a bug if found. + +.. topic:: Small Cluster Transition + + .. image:: ../shared/images/Policy-Engine-small.png + :alt: An example transition graph as represented by Graphviz + :align: center + +In the above example, it appears that a new node, ``pcmk-2``, has come online +and that the cluster is checking to make sure ``rsc1``, ``rsc2`` and ``rsc3`` +are not already running there (indicated by the ``rscN_monitor_0`` entries). +Once it did that, and assuming the resources were not active there, it would +have liked to stop ``rsc1`` and ``rsc2`` on ``pcmk-1`` and move them to +``pcmk-2``. However, there appears to be some problem and the cluster cannot or +is not permitted to perform the stop actions which implies it also cannot +perform the start actions. For some reason, the cluster does not want to start +``rsc3`` anywhere. + +.. topic:: Complex Cluster Transition + + .. image:: ../shared/images/Policy-Engine-big.png + :alt: Complex transition graph that you're not expected to be able to read + :align: center + + +What-if scenarios +_________________ + +You can make changes to the saved or shadow CIB and simulate it again, to see +how Pacemaker would react differently. You can edit the XML by hand, use +command-line tools such as ``cibadmin`` with either a shadow CIB or the +``CIB_file`` environment variable set to the filename, or use higher-level tool +support (see the man pages of the specific tool you're using for how to perform +actions on a saved CIB file rather than the live CIB). + +You can also inject node failures and/or action failures into the simulation; +see the ``crm_simulate`` man page for more details. + +This capability is useful when using a shadow CIB to edit the configuration. +Before committing the changes to the live cluster with ``crm_shadow --commit``, +you can use ``crm_simulate`` to see how the cluster will react to the changes. + +.. _crm_attribute: + +.. 
index:: + single: attrd_updater + single: command-line tool; attrd_updater + single: crm_attribute + single: command-line tool; crm_attribute + +Manage Node Attributes, Cluster Options and Defaults with crm_attribute and attrd_updater +######################################################################################### + +``crm_attribute`` and ``attrd_updater`` are confusingly similar tools with subtle +differences. + +``attrd_updater`` can query and update node attributes. ``crm_attribute`` can query +and update not only node attributes, but also cluster options, resource +defaults, and operation defaults. + +To understand the differences, it helps to understand the various types of node +attribute. + +.. table:: **Types of Node Attributes** + + +-----------+----------+-------------------+------------------+----------------+----------------+ + | Type | Recorded | Recorded in | Survive full | Manageable by | Manageable by | + | | in CIB? | attribute manager | cluster restart? | crm_attribute? | attrd_updater? | + | | | memory? | | | | + +===========+==========+===================+==================+================+================+ + | permanent | yes | no | yes | yes | no | + +-----------+----------+-------------------+------------------+----------------+----------------+ + | transient | yes | yes | no | yes | yes | + +-----------+----------+-------------------+------------------+----------------+----------------+ + | private | no | yes | no | no | yes | + +-----------+----------+-------------------+------------------+----------------+----------------+ + +As you can see from the table above, ``crm_attribute`` can manage permanent and +transient node attributes, while ``attrd_updater`` can manage transient and +private node attributes. + +The difference between the two tools lies mainly in *how* they update node +attributes: ``attrd_updater`` always contacts the Pacemaker attribute manager +directly, while ``crm_attribute`` will contact the attribute manager only for +transient node attributes, and will instead modify the CIB directly for +permanent node attributes (and for transient node attributes when unable to +contact the attribute manager). + +By contacting the attribute manager directly, ``attrd_updater`` can change +an attribute's "dampening" (whether changes are immediately flushed to the CIB +or after a specified amount of time, to minimize disk writes for frequent +changes), set private node attributes (which are never written to the CIB), and +set attributes for nodes that don't yet exist. + +By modifying the CIB directly, ``crm_attribute`` can set permanent node +attributes (which are only in the CIB and not managed by the attribute +manager), and can be used with saved CIB files and shadow CIBs. + +However a transient node attribute is set, it is synchronized between the CIB +and the attribute manager, on all nodes. + + +.. 
index:: + single: crm_failcount + single: command-line tool; crm_failcount + single: crm_node + single: command-line tool; crm_node + single: crm_report + single: command-line tool; crm_report + single: crm_standby + single: command-line tool; crm_standby + single: crm_verify + single: command-line tool; crm_verify + single: stonith_admin + single: command-line tool; stonith_admin + +Other Commonly Used Tools +######################### + +Other command-line tools include: + +* ``crm_failcount``: query or delete resource fail counts +* ``crm_node``: manage cluster nodes +* ``crm_report``: generate a detailed cluster report for bug submissions +* ``crm_resource``: manage cluster resources +* ``crm_standby``: manage standby status of nodes +* ``crm_verify``: validate a CIB +* ``stonith_admin``: manage fencing devices + +See the manual pages for details. + +.. rubric:: Footnotes + +.. [#] Graph visualization software. See http://www.graphviz.org/ for details. diff --git a/doc/sphinx/Pacemaker_Administration/troubleshooting.rst b/doc/sphinx/Pacemaker_Administration/troubleshooting.rst new file mode 100644 index 0000000..22c9dc8 --- /dev/null +++ b/doc/sphinx/Pacemaker_Administration/troubleshooting.rst @@ -0,0 +1,123 @@ +.. index:: troubleshooting + +Troubleshooting Cluster Problems +-------------------------------- + +.. index:: logging, pacemaker.log + +Logging +####### + +Pacemaker by default logs messages of ``notice`` severity and higher to the +system log, and messages of ``info`` severity and higher to the detail log, +which by default is ``/var/log/pacemaker/pacemaker.log``. + +Logging options can be controlled via environment variables at Pacemaker +start-up. Where these are set varies by operating system (often +``/etc/sysconfig/pacemaker`` or ``/etc/default/pacemaker``). See the comments +in that file for details. + +Because cluster problems are often highly complex, involving multiple machines, +cluster daemons, and managed services, Pacemaker logs rather verbosely to +provide as much context as possible. It is an ongoing priority to make these +logs more user-friendly, but by necessity there is a lot of obscure, low-level +information that can make them difficult to follow. + +The default log rotation configuration shipped with Pacemaker (typically +installed in ``/etc/logrotate.d/pacemaker``) rotates the log when it reaches +100MB in size, or weekly, whichever comes first. + +If you configure debug or (Heaven forbid) trace-level logging, the logs can +grow enormous quite quickly. Because rotated logs are by default named with the +year, month, and day only, this can cause name collisions if your logs exceed +100MB in a single day. You can add ``dateformat -%Y%m%d-%H`` to the rotation +configuration to avoid this. + +Reading the Logs +################ + +When troubleshooting, first check the system log or journal for errors or +warnings from Pacemaker components (conveniently, they will all have +"pacemaker" in their logged process name). For example: + +.. code-block:: none + + # grep 'pacemaker.*\(error\|warning\)' /var/log/messages + Mar 29 14:04:19 node1 pacemaker-controld[86636]: error: Result of monitor operation for rn2 on node1: Timed Out after 45s (Remote executor did not respond) + +If that doesn't give sufficient information, next look at the ``notice`` level +messages from ``pacemaker-controld``. These will show changes in the state of +cluster nodes. On the DC, this will also show resource actions attempted. For +example: + +.. 
code-block:: none + + # grep 'pacemaker-controld.*notice:' /var/log/messages + ... output skipped for brevity ... + Mar 29 14:05:36 node1 pacemaker-controld[86636]: notice: Node rn2 state is now lost + ... more output skipped for brevity ... + Mar 29 14:12:17 node1 pacemaker-controld[86636]: notice: Initiating stop operation rsc1_stop_0 on node4 + ... more output skipped for brevity ... + +Of course, you can use other tools besides ``grep`` to search the logs. + + +.. index:: transition + +Transitions +########### + +A key concept in understanding how a Pacemaker cluster functions is a +*transition*. A transition is a set of actions that need to be taken to bring +the cluster from its current state to the desired state (as expressed by the +configuration). + +Whenever a relevant event happens (a node joining or leaving the cluster, +a resource failing, etc.), the controller will ask the scheduler to recalculate +the status of the cluster, which generates a new transition. The controller +then performs the actions in the transition in the proper order. + +Each transition can be identified in the DC's logs by a line like: + +.. code-block:: none + + notice: Calculated transition 19, saving inputs in /var/lib/pacemaker/pengine/pe-input-1463.bz2 + +The file listed as the "inputs" is a snapshot of the cluster configuration and +state at that moment (the CIB). This file can help determine why particular +actions were scheduled. The ``crm_simulate`` command, described in +:ref:`crm_simulate`, can be used to replay the file. + +The log messages immediately before the "saving inputs" message will include +any actions that the scheduler thinks need to be done. + + +Node Failures +############# + +When a node fails, and looking at errors and warnings doesn't give an obvious +explanation, try to answer questions like the following based on log messages: + +* When and what was the last successful message on the node itself, or about + that node in the other nodes' logs? +* Did pacemaker-controld on the other nodes notice the node leave? +* Did pacemaker-controld on the DC invoke the scheduler and schedule a new + transition? +* Did the transition include fencing the failed node? +* Was fencing attempted? +* Did fencing succeed? + +Resource Failures +################# + +When a resource fails, and looking at errors and warnings doesn't give an +obvious explanation, try to answer questions like the following based on log +messages: + +* Did pacemaker-controld record the result of the failed resource action? +* What was the failed action's execution status and exit status? +* What code in the resource agent could result in those status codes? +* Did pacemaker-controld on the DC invoke the scheduler and schedule a new + transition? +* Did the new transition include recovery of the resource? +* Were the recovery actions initiated, and what were their results? diff --git a/doc/sphinx/Pacemaker_Administration/upgrading.rst b/doc/sphinx/Pacemaker_Administration/upgrading.rst new file mode 100644 index 0000000..1ca2a4e --- /dev/null +++ b/doc/sphinx/Pacemaker_Administration/upgrading.rst @@ -0,0 +1,534 @@ +.. index:: upgrade + +Upgrading a Pacemaker Cluster +----------------------------- + +.. index:: version + +Pacemaker Versioning +#################### + +Pacemaker has an overall release version, plus separate version numbers for +certain internal components. + +.. index:: + single: version; release + +* **Pacemaker release version:** This version consists of three numbers + (*x.y.z*). 
+ + The major version number (the *x* in *x.y.z*) increases when at least some + rolling upgrades are not possible from the previous major version. For example, + a rolling upgrade from 1.0.8 to 1.1.15 should always be supported, but a + rolling upgrade from 1.0.8 to 2.0.0 may not be possible. + + The minor version (the *y* in *x.y.z*) increases when there are significant + changes in cluster default behavior, tool behavior, and/or the API interface + (for software that utilizes Pacemaker libraries). The main benefit is to alert + you to pay closer attention to the release notes, to see if you might be + affected. + + The release counter (the *z* in *x.y.z*) is increased with all public releases + of Pacemaker, which typically include both bug fixes and new features. + +.. index:: + single: feature set + single: version; feature set + +* **CRM feature set:** This version number applies to the communication between + full cluster nodes, and is used to avoid problems in mixed-version clusters. + + The major version number increases when nodes with different versions would not + work (rolling upgrades are not allowed). The minor version number increases + when mixed-version clusters are allowed only during rolling upgrades. The + minor-minor version number is ignored, but allows resource agents to detect + cluster support for various features. [#]_ + + Pacemaker ensures that the longest-running node is the cluster's DC. This + ensures new features are not enabled until all nodes are upgraded to support + them. + +.. index:: + single: version; Pacemaker Remote protocol + +* **Pacemaker Remote protocol version:** This version applies to communication + between a Pacemaker Remote node and the cluster. It increases when an older + cluster node would have problems hosting the connection to a newer + Pacemaker Remote node. To avoid these problems, Pacemaker Remote nodes will + accept connections only from cluster nodes with the same or newer + Pacemaker Remote protocol version. + + Unlike with CRM feature set differences between full cluster nodes, + mixed Pacemaker Remote protocol versions between Pacemaker Remote nodes and + full cluster nodes are fine, as long as the Pacemaker Remote nodes have the + older version. This can be useful, for example, to host a legacy application + in an older operating system version used as a Pacemaker Remote node. + +.. index:: + single: version; XML schema + +* **XML schema version:** Pacemaker’s configuration syntax — what's allowed in + the Configuration Information Base (CIB) — has its own version. This allows + the configuration syntax to evolve over time while still allowing clusters + with older configurations to work without change. + + +.. index:: + single: upgrade; methods + +Upgrading Cluster Software +########################## + +There are three approaches to upgrading a cluster, each with advantages and +disadvantages. + +.. 
table:: **Upgrade Methods** + + +---------------------------------------------------+----------+----------+--------+---------+----------+----------+ + | Method | Available| Can be | Service| Service | Exercises| Allows | + | | between | used with| outage | recovery| failover | change of| + | | all | Pacemaker| during | during | logic | messaging| + | | versions | Remote | upgrade| upgrade | | layer | + | | | nodes | | | | [#]_ | + +===================================================+==========+==========+========+=========+==========+==========+ + | Complete cluster shutdown | yes | yes | always | N/A | no | yes | + +---------------------------------------------------+----------+----------+--------+---------+----------+----------+ + | Rolling (node by node) | no | yes | always | yes | yes | no | + | | | | [#]_ | | | | + +---------------------------------------------------+----------+----------+--------+---------+----------+----------+ + | Detach and reattach | yes | no | only | no | no | yes | + | | | | due to | | | | + | | | | failure| | | | + +---------------------------------------------------+----------+----------+--------+---------+----------+----------+ + + +.. index:: + single: upgrade; shutdown + +Complete Cluster Shutdown +_________________________ + +In this scenario, one shuts down all cluster nodes and resources, +then upgrades all the nodes before restarting the cluster. + +#. On each node: + + a. Shutdown the cluster software (pacemaker and the messaging layer). + #. Upgrade the Pacemaker software. This may also include upgrading the + messaging layer and/or the underlying operating system. + #. Check the configuration with the ``crm_verify`` tool. + +#. On each node: + + a. Start the cluster software. + +Currently, only Corosync version 2 and greater is supported as the cluster +layer, but if another stack is supported in the future, the stack does not +need to be the same one before the upgrade. + +One variation of this approach is to build a new cluster on new hosts. +This allows the new version to be tested beforehand, and minimizes downtime by +having the new nodes ready to be placed in production as soon as the old nodes +are shut down. + + +.. index:: + single: upgrade; rolling upgrade + +Rolling (node by node) +______________________ + +In this scenario, each node is removed from the cluster, upgraded, and then +brought back online, until all nodes are running the newest version. + +Special considerations when planning a rolling upgrade: + +* If you plan to upgrade other cluster software -- such as the messaging layer -- + at the same time, consult that software's documentation for its compatibility + with a rolling upgrade. + +* If the major version number is changing in the Pacemaker version you are + upgrading to, a rolling upgrade may not be possible. Read the new version's + release notes (as well the information here) for what limitations may exist. + +* If the CRM feature set is changing in the Pacemaker version you are upgrading + to, you should run a mixed-version cluster only during a small rolling + upgrade window. If one of the older nodes drops out of the cluster for any + reason, it will not be able to rejoin until it is upgraded. + +* If the Pacemaker Remote protocol version is changing, all cluster nodes + should be upgraded before upgrading any Pacemaker Remote nodes. 
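+
+When weighing these considerations, it helps to record the Pacemaker release
+version and CRM feature set currently running on each node before you begin.
+One way to do so (the exact output format varies by version and build) is:
+
+.. topic:: Check a node's Pacemaker version and feature set
+
+   .. code-block:: none
+
+      # pacemakerd --features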
+ +See the ClusterLabs wiki's +`release calendar <https://wiki.clusterlabs.org/wiki/ReleaseCalendar>`_ +to figure out whether the CRM feature set and/or Pacemaker Remote protocol +version changed between the the Pacemaker release versions in your rolling +upgrade. + +To perform a rolling upgrade, on each node in turn: + +#. Put the node into standby mode, and wait for any active resources + to be moved cleanly to another node. (This step is optional, but + allows you to deal with any resource issues before the upgrade.) +#. Shutdown the cluster software (pacemaker and the messaging layer) on the node. +#. Upgrade the Pacemaker software. This may also include upgrading the + messaging layer and/or the underlying operating system. +#. If this is the first node to be upgraded, check the configuration + with the ``crm_verify`` tool. +#. Start the messaging layer. + This must be the same messaging layer (currently only Corosync version 2 and + greater is supported) that the rest of the cluster is using. + +.. note:: + + Even if a rolling upgrade from the current version of the cluster to the + newest version is not directly possible, it may be possible to perform a + rolling upgrade in multiple steps, by upgrading to an intermediate version + first. + +.. table:: **Version Compatibility Table** + + +-------------------------+---------------------------+ + | Version being Installed | Oldest Compatible Version | + +=========================+===========================+ + | Pacemaker 2.y.z | Pacemaker 1.1.11 [#]_ | + +-------------------------+---------------------------+ + | Pacemaker 1.y.z | Pacemaker 1.0.0 | + +-------------------------+---------------------------+ + | Pacemaker 0.7.z | Pacemaker 0.6.z | + +-------------------------+---------------------------+ + +.. index:: + single: upgrade; detach and reattach + +Detach and Reattach +___________________ + +The reattach method is a variant of a complete cluster shutdown, where the +resources are left active and get re-detected when the cluster is restarted. + +This method may not be used if the cluster contains any Pacemaker Remote nodes. + +#. Tell the cluster to stop managing services. This is required to allow the + services to remain active after the cluster shuts down. + + .. code-block:: none + + # crm_attribute --name maintenance-mode --update true + +#. On each node, shutdown the cluster software (pacemaker and the messaging + layer), and upgrade the Pacemaker software. This may also include upgrading + the messaging layer. While the underlying operating system may be upgraded + at the same time, that will be more likely to cause outages in the detached + services (certainly, if a reboot is required). +#. Check the configuration with the ``crm_verify`` tool. +#. On each node, start the cluster software. + Currently, only Corosync version 2 and greater is supported as the cluster + layer, but if another stack is supported in the future, the stack does not + need to be the same one before the upgrade. +#. Verify that the cluster re-detected all resources correctly. +#. Allow the cluster to resume managing resources again: + + .. code-block:: none + + # crm_attribute --name maintenance-mode --delete + +.. note:: + + While the goal of the detach-and-reattach method is to avoid disturbing + running services, resources may still move after the upgrade if any + resource's location is governed by a rule based on transient node + attributes. Transient node attributes are erased when the node leaves the + cluster. 
A common example is using the ``ocf:pacemaker:ping`` resource to + set a node attribute used to locate other resources. + +.. index:: + pair: upgrade; CIB + +Upgrading the Configuration +########################### + +The CIB schema version can change from one Pacemaker version to another. + +After cluster software is upgraded, the cluster will continue to use the older +schema version that it was previously using. This can be useful, for example, +when administrators have written tools that modify the configuration, and are +based on the older syntax. [#]_ + +However, when using an older syntax, new features may be unavailable, and there +is a performance impact, since the cluster must do a non-persistent +configuration upgrade before each transition. So while using the old syntax is +possible, it is not advisable to continue using it indefinitely. + +Even if you wish to continue using the old syntax, it is a good idea to +follow the upgrade procedure outlined below, except for the last step, to ensure +that the new software has no problems with your existing configuration (since it +will perform much the same task internally). + +If you are brave, it is sufficient simply to run ``cibadmin --upgrade``. + +A more cautious approach would proceed like this: + +#. Create a shadow copy of the configuration. The later commands will + automatically operate on this copy, rather than the live configuration. + + .. code-block:: none + + # crm_shadow --create shadow + +.. index:: + single: configuration; verify + +#. Verify the configuration is valid with the new software (which may be + stricter about syntax mistakes, or may have dropped support for deprecated + features): + + .. code-block:: none + + # crm_verify --live-check + +#. Fix any errors or warnings. +#. Perform the upgrade: + + .. code-block:: none + + # cibadmin --upgrade + +#. If this step fails, there are three main possibilities: + + a. The configuration was not valid to start with (did you do steps 2 and + 3?). + #. The transformation failed; `report a bug <https://bugs.clusterlabs.org/>`_. + #. The transformation was successful but produced an invalid result. + + If the result of the transformation is invalid, you may see a number of + errors from the validation library. If these are not helpful, visit the + `Validation FAQ wiki page <https://wiki.clusterlabs.org/wiki/Validation_FAQ>`_ + and/or try the manual upgrade procedure described below. + +#. Check the changes: + + .. code-block:: none + + # crm_shadow --diff + + If at this point there is anything about the upgrade that you wish to + fine-tune (for example, to change some of the automatic IDs), now is the + time to do so: + + .. code-block:: none + + # crm_shadow --edit + + This will open the configuration in your favorite editor (whichever is + specified by the standard ``$EDITOR`` environment variable). + +#. Preview how the cluster will react: + + .. code-block:: none + + # crm_simulate --live-check --save-dotfile shadow.dot -S + # dot -Tsvg shadow.dot -o shadow.svg + + You can then view shadow.svg with any compatible image viewer or web + browser. Verify that either no resource actions will occur or that you are + happy with any that are scheduled. If the output contains actions you do + not expect (possibly due to changes to the score calculations), you may need + to make further manual changes. See :ref:`crm_simulate` for further details + on how to interpret the output of ``crm_simulate`` and ``dot``. + +#. Upload the changes: + + .. 
code-block:: none + + # crm_shadow --commit shadow --force + + In the unlikely event this step fails, please report a bug. + +.. note:: + + It is also possible to perform the configuration upgrade steps manually: + + #. Locate the ``upgrade*.xsl`` conversion scripts provided with the source + code. These will often be installed in a location such as + ``/usr/share/pacemaker``, or may be obtained from the + `source repository <https://github.com/ClusterLabs/pacemaker/tree/main/xml>`_. + + #. Run the conversion scripts that apply to your older version, for example: + + .. code-block:: none + + # xsltproc /path/to/upgrade06.xsl config06.xml > config10.xml + + #. Locate the ``pacemaker.rng`` script (from the same location as the xsl + files). + #. Check the XML validity: + + .. code-block:: none + + # xmllint --relaxng /path/to/pacemaker.rng config10.xml + + The advantage of this method is that it can be performed without the cluster + running, and any validation errors are often more informative. + + +What Changed in 2.1 +################### + +The Pacemaker 2.1 release is fully backward-compatible in both the CIB XML and +the C API. Highlights: + +* Pacemaker now supports the **OCF Resource Agent API version 1.1**. + Most notably, the ``Master`` and ``Slave`` role names have been renamed to + ``Promoted`` and ``Unpromoted``. + +* Pacemaker now supports colocations where the dependent resource does not + affect the primary resource's placement (via a new ``influence`` colocation + constraint option and ``critical`` resource meta-attribute). This is intended + for cases where a less-important resource must be colocated with an essential + resource, but it is preferred to leave the less-important resource stopped if + it fails, rather than move both resources. + +* If Pacemaker is built with libqb 2.0 or later, the detail log will use + **millisecond-resolution timestamps**. + +* In addition to crm_mon and stonith_admin, the crmadmin, crm_resource, + crm_simulate, and crm_verify commands now support the ``--output-as`` and + ``--output-to`` options, including **XML output** (which scripts and + higher-level tools are strongly recommended to use instead of trying to parse + the text output, which may change from release to release). + +For a detailed list of changes, see the release notes and the +`Pacemaker 2.1 Changes <https://wiki.clusterlabs.org/wiki/Pacemaker_2.1_Changes>`_ +page on the ClusterLabs wiki. + + +What Changed in 2.0 +################### + +The main goal of the 2.0 release was to remove support for deprecated syntax, +along with some small changes in default configuration behavior and tool +behavior. Highlights: + +* Only Corosync version 2 and greater is now supported as the underlying + cluster layer. Support for Heartbeat and Corosync 1 (including CMAN) is + removed. + +* The Pacemaker detail log file is now stored in + ``/var/log/pacemaker/pacemaker.log`` by default. + +* The record-pending cluster property now defaults to true, which + allows status tools such as crm_mon to show operations that are in + progress. + +* Support for a number of deprecated build options, environment variables, + and configuration settings has been removed. + +* The ``master`` tag has been deprecated in favor of using the ``clone`` tag + with the new ``promotable`` meta-attribute set to ``true``. "Master/slave" + clone resources are now referred to as "promotable" clone resources. + +* The public API for Pacemaker libraries that software applications can use + has changed significantly. 
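+
+As an illustration of the ``master`` deprecation noted above, a resource that
+was previously wrapped in a ``master`` tag can be expressed as a promotable
+clone along these lines (a minimal sketch; the IDs and the ``WebData``
+primitive are placeholders, and instance attributes and operations are
+omitted):
+
+.. code-block:: xml
+
+   <clone id="WebDataClone">
+     <meta_attributes id="WebDataClone-meta_attributes">
+       <nvpair id="WebDataClone-meta_attributes-promotable"
+               name="promotable" value="true"/>
+     </meta_attributes>
+     <primitive id="WebData" class="ocf" provider="linbit" type="drbd"/>
+   </clone>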
+ +For a detailed list of changes, see the release notes and the +`Pacemaker 2.0 Changes <https://wiki.clusterlabs.org/wiki/Pacemaker_2.0_Changes>`_ +page on the ClusterLabs wiki. + + +What Changed in 1.0 +################### + +New +___ + +* Failure timeouts. +* New section for resource and operation defaults. +* Tool for making offline configuration changes. +* ``Rules``, ``instance_attributes``, ``meta_attributes`` and sets of + operations can be defined once and referenced in multiple places. +* The CIB now accepts XPath-based create/modify/delete operations. See + ``cibadmin --help``. +* Multi-dimensional colocation and ordering constraints. +* The ability to connect to the CIB from non-cluster machines. +* Allow recurring actions to be triggered at known times. + + +Changed +_______ + +* Syntax + + * All resource and cluster options now use dashes (-) instead of underscores + (_) + * ``master_slave`` was renamed to ``master`` + * The ``attributes`` container tag was removed + * The operation field ``pre-req`` has been renamed ``requires`` + * All operations must have an ``interval``, ``start``/``stop`` must have it + set to zero + +* The ``stonith-enabled`` option now defaults to true. +* The cluster will refuse to start resources if ``stonith-enabled`` is true (or + unset) and no STONITH resources have been defined +* The attributes of colocation and ordering constraints were renamed for + clarity. +* ``resource-failure-stickiness`` has been replaced by ``migration-threshold``. +* The parameters for command-line tools have been made consistent +* Switched to 'RelaxNG' schema validation and 'libxml2' parser + + * id fields are now XML IDs which have the following limitations: + + * id's cannot contain colons (:) + * id's cannot begin with a number + * id's must be globally unique (not just unique for that tag) + + * Some fields (such as those in constraints that refer to resources) are + IDREFs. + + This means that they must reference existing resources or objects in + order for the configuration to be valid. Removing an object which is + referenced elsewhere will therefore fail. + + * The CIB representation, from which a MD5 digest is calculated to verify + CIBs on the nodes, has changed. + + This means that every CIB update will require a full refresh on any + upgraded nodes until the cluster is fully upgraded to 1.0. This will result + in significant performance degradation and it is therefore highly + inadvisable to run a mixed 1.0/0.6 cluster for any longer than absolutely + necessary. + +* Ping node information no longer needs to be added to ``ha.cf``. Simply + include the lists of hosts in your ping resource(s). + + +Removed +_______ + + +* Syntax + + * It is no longer possible to set resource meta options as top-level + attributes. Use meta-attributes instead. + * Resource and operation defaults are no longer read from ``crm_config``. + +.. rubric:: Footnotes + +.. [#] Before CRM feature set 3.1.0 (Pacemaker 2.0.0), the minor-minor version + number was treated the same as the minor version. + +.. [#] Currently, Corosync version 2 and greater is the only supported cluster + stack, but other stacks have been supported by past versions, and may be + supported by future versions. + +.. [#] Any active resources will be moved off the node being upgraded, so there + will be at least a brief outage unless all resources can be migrated + "live". + +.. 
[#] Rolling upgrades from Pacemaker 1.1.z to 2.y.z are possible only if the + cluster uses corosync version 2 or greater as its messaging layer, and + the Cluster Information Base (CIB) uses schema 1.0 or higher in its + ``validate-with`` property. + +.. [#] As of Pacemaker 2.0.0, only schema versions pacemaker-1.0 and higher + are supported (excluding pacemaker-1.1, which was a special case).