diff --git a/doc/sphinx/Pacemaker_Explained/nodes.rst b/doc/sphinx/Pacemaker_Explained/nodes.rst
new file mode 100644
index 0000000..6fcadb3
--- /dev/null
+++ b/doc/sphinx/Pacemaker_Explained/nodes.rst
@@ -0,0 +1,441 @@
+Cluster Nodes
+-------------
+
+Defining a Cluster Node
+_______________________
+
+Each cluster node will have an entry in the ``nodes`` section containing at
+least an ID and a name. A cluster node's ID is defined by the cluster layer
+(Corosync).
+
+.. topic:: **Example Corosync cluster node entry**
+
+ .. code-block:: xml
+
+ <node id="101" uname="pcmk-1"/>
+
+Under normal circumstances, the administrator should let the cluster populate
+this information automatically from the cluster layer.
+
+
+.. _node_name:
+
+Where Pacemaker Gets the Node Name
+##################################
+
+The name that Pacemaker uses for a node in the configuration does not have to
+be the same as its local hostname. Pacemaker determines a Corosync node's name
+from the following sources, in order of preference:
+
+* The value of ``name`` in the ``nodelist`` section of ``corosync.conf``
+* The value of ``ring0_addr`` in the ``nodelist`` section of ``corosync.conf``
+* The local hostname (value of ``uname -n``)
+
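+For illustration, a ``corosync.conf`` ``nodelist`` entry that provides both a
+``name`` and a ``ring0_addr`` might look like the following sketch (the
+address, name, and ID here are hypothetical):
+
+.. code-block:: none
+
+   nodelist {
+       node {
+           ring0_addr: 192.168.122.101
+           name: pcmk-1
+           nodeid: 101
+       }
+   }
+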
+If the cluster is running, the ``crm_node -n`` command will display the local
+node's name as used by the cluster.
+
+If a Corosync ``nodelist`` is used, ``crm_node --name-for-id`` will display
+the name of the node with the given Corosync ``nodeid``, for example:
+
+.. code-block:: none
+
+ crm_node --name-for-id 2
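+
+On a hypothetical cluster where Corosync node ID 2 belongs to a node named
+``pcmk-2``, this command would simply print ``pcmk-2``.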
+
+
+.. index::
+ single: node; attribute
+ single: node attribute
+
+.. _node_attributes:
+
+Node Attributes
+_______________
+
+Pacemaker allows node-specific values to be specified using *node attributes*.
+A node attribute has a name, and may have a distinct value for each node.
+
+Node attributes come in two types, *permanent* and *transient*. Permanent node
+attributes are kept within the ``node`` entry, and keep their values even if
+the cluster restarts on a node. Transient node attributes are kept in the CIB's
+``status`` section, and go away when the cluster stops on the node.
+
+While certain node attributes have specific meanings to the cluster, they are
+mainly intended to allow administrators and resource agents to track any
+information desired.
+
+For example, an administrator might choose to define node attributes for how
+much RAM and disk space each node has, which OS each uses, or which server room
+rack each node is in.
+
+Users can configure :ref:`rules` that use node attributes to affect where
+resources are placed.
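+
+For example, the following location constraint sketch (assuming a custom node
+attribute named ``rack``, which could be set as described in the next section,
+and a hypothetical resource named ``WebServer``) would keep the resource off
+any node in rack 3:
+
+.. code-block:: xml
+
+   <rsc_location id="loc-avoid-rack-3" rsc="WebServer">
+     <rule id="loc-avoid-rack-3-rule" score="-INFINITY">
+       <expression id="loc-avoid-rack-3-expr" attribute="rack"
+                   operation="eq" value="3"/>
+     </rule>
+   </rsc_location>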
+
+Setting and querying node attributes
+####################################
+
+Node attributes can be set and queried using the ``crm_attribute`` and
+``attrd_updater`` commands, so that the user does not have to deal with XML
+configuration directly.
+
+Here is an example command to set a permanent node attribute, and the XML
+configuration that would be generated:
+
+.. topic:: **Result of using crm_attribute to specify which kernel pcmk-1 is running**
+
+ .. code-block:: none
+
+ # crm_attribute --type nodes --node pcmk-1 --name kernel --update $(uname -r)
+
+ .. code-block:: xml
+
+ <node id="1" uname="pcmk-1">
+ <instance_attributes id="nodes-1-attributes">
+ <nvpair id="nodes-1-kernel" name="kernel" value="3.10.0-862.14.4.el7.x86_64"/>
+ </instance_attributes>
+ </node>
+
+To read back the value that was just set:
+
+.. code-block:: none
+
+ # crm_attribute --type nodes --node pcmk-1 --name kernel --query
+ scope=nodes name=kernel value=3.10.0-862.14.4.el7.x86_64
+
+The ``--type nodes`` indicates that this is a permanent node attribute;
+``--type status`` would indicate a transient node attribute.
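+
+Transient attributes can also be managed with ``attrd_updater``. A minimal
+sketch, using a hypothetical attribute name:
+
+.. code-block:: none
+
+   # attrd_updater --node pcmk-1 --name my-attr --update 1
+   # attrd_updater --node pcmk-1 --name my-attr --query
+   # attrd_updater --node pcmk-1 --name my-attr --delete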
+
+Special node attributes
+#######################
+
+Certain node attributes have special meaning to the cluster.
+
+Node attribute names beginning with ``#`` are considered reserved for these
+special attributes. Some special attributes do not start with ``#``, for
+historical reasons.
+
+Certain special attributes are set automatically by the cluster, should never
+be modified directly, and can be used only within :ref:`rules`; these are
+listed under
+:ref:`built-in node attributes <node-attribute-expressions-special>`.
+
+For true/false values, the cluster considers a value of "1", "y", "yes", "on",
+or "true" (case-insensitively) to be true; "0", "n", "no", "off", "false", or
+unset to be false; and anything else to be an error.
+
+.. table:: **Node attributes with special significance**
+ :class: longtable
+ :widths: 1 2
+
+ +----------------------------+-----------------------------------------------------+
+ | Name | Description |
+ +============================+=====================================================+
+ | fail-count-* | .. index:: |
+ | | pair: node attribute; fail-count |
+ | | |
+ | | Attributes whose names start with |
+ | | ``fail-count-`` are managed by the cluster |
+ | | to track how many times particular resource |
+ | | operations have failed on this node. These |
+ | | should be queried and cleared via the |
+ | | ``crm_failcount`` or |
+ | | ``crm_resource --cleanup`` commands rather |
+ | | than directly. |
+ +----------------------------+-----------------------------------------------------+
+ | last-failure-* | .. index:: |
+ | | pair: node attribute; last-failure |
+ | | |
+ | | Attributes whose names start with |
+ | | ``last-failure-`` are managed by the cluster |
+ | | to track when particular resource operations |
+ | | have most recently failed on this node. |
+ | | These should be cleared via the |
+ | | ``crm_failcount`` or |
+ | | ``crm_resource --cleanup`` commands rather |
+ | | than directly. |
+ +----------------------------+-----------------------------------------------------+
+ | maintenance | .. index:: |
+ | | pair: node attribute; maintenance |
+ | | |
+ | | Similar to the ``maintenance-mode`` |
+ | | :ref:`cluster option <cluster_options>`, but |
+ | | for a single node. If true, resources will |
+ | | not be started or stopped on the node, |
+ | | resources and individual clone instances |
+ | | running on the node will become unmanaged, |
+ | | and any recurring operations for those will |
+ | | be cancelled. |
+ | | |
+ | | **Warning:** Restarting pacemaker on a node that is |
+ | | in single-node maintenance mode will likely |
+ | | lead to undesirable effects. If |
+ | | ``maintenance`` is set as a transient |
+ | | attribute, it will be erased when |
+ | | Pacemaker is stopped, which will |
+ | | immediately take the node out of |
+ | | maintenance mode and likely get it |
+ | | fenced. Even if permanent, if Pacemaker |
+ | | is restarted, any resources active on the |
+ | | node will have their local history erased |
+ | | when the node rejoins, so the cluster |
+ | | will no longer consider them running on |
+ | | the node and thus will consider them |
+ | | managed again, leading them to be started |
+ | | elsewhere. This behavior might be |
+ | | improved in a future release. |
+ +----------------------------+-----------------------------------------------------+
+ | probe_complete | .. index:: |
+ | | pair: node attribute; probe_complete |
+ | | |
+ | | This is managed by the cluster to detect |
+ | | when nodes need to be reprobed, and should |
+ | | never be used directly. |
+ +----------------------------+-----------------------------------------------------+
+ | resource-discovery-enabled | .. index:: |
+ | | pair: node attribute; resource-discovery-enabled |
+ | | |
+ | | If the node is a remote node, fencing is enabled, |
+ | | and this attribute is explicitly set to false |
+ | | (unset means true in this case), resource discovery |
+ | | (probes) will not be done on this node. This is |
+ | | highly discouraged; the ``resource-discovery`` |
+ | | location constraint property is preferred for this |
+ | | purpose. |
+ +----------------------------+-----------------------------------------------------+
+ | shutdown | .. index:: |
+ | | pair: node attribute; shutdown |
+ | | |
+ | | This is managed by the cluster to orchestrate the |
+ | | shutdown of a node, and should never be used |
+ | | directly. |
+ +----------------------------+-----------------------------------------------------+
+ | site-name | .. index:: |
+ | | pair: node attribute; site-name |
+ | | |
+ | | If set, this will be used as the value of the |
+ | | ``#site-name`` node attribute used in rules. (If |
+ | | not set, the value of the ``cluster-name`` cluster |
+ | | option will be used as ``#site-name`` instead.) |
+ +----------------------------+-----------------------------------------------------+
+ | standby | .. index:: |
+ | | pair: node attribute; standby |
+ | | |
+ | | If true, the node is in standby mode. This is |
+ | | typically set and queried via the ``crm_standby`` |
+ | | command rather than directly. |
+ +----------------------------+-----------------------------------------------------+
+ | terminate | .. index:: |
+ | | pair: node attribute; terminate |
+ | | |
+ | | If the value is true or begins with any nonzero |
+ | | number, the node will be fenced. This is typically |
+ | | set by tools rather than directly. |
+ +----------------------------+-----------------------------------------------------+
+ | #digests-* | .. index:: |
+ | | pair: node attribute; #digests |
+ | | |
+ | | Attributes whose names start with ``#digests-`` are |
+ | | managed by the cluster to detect when |
+ | | :ref:`unfencing` needs to be redone, and should |
+ | | never be used directly. |
+ +----------------------------+-----------------------------------------------------+
+ | #node-unfenced | .. index:: |
+ | | pair: node attribute; #node-unfenced |
+ | | |
+ | | When the node was last unfenced (as seconds since |
+ | | the epoch). This is managed by the cluster and |
+ | | should never be used directly. |
+ +----------------------------+-----------------------------------------------------+
+
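+As an example of the recommended tools, fail counts for a hypothetical
+resource named ``myrsc`` could be inspected and cleared, and a node placed
+into standby, with commands like:
+
+.. code-block:: none
+
+   # crm_failcount --query --resource myrsc --node pcmk-1
+   # crm_resource --cleanup --resource myrsc --node pcmk-1
+   # crm_standby --node pcmk-1 --update on
+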
+.. index::
+ single: node; health
+
+.. _node-health:
+
+Tracking Node Health
+____________________
+
+A node may be functioning adequately as far as cluster membership is concerned,
+and yet be "unhealthy" in some respect that makes it an undesirable location
+for resources. For example, a disk drive may be reporting SMART errors, or the
+CPU may be highly loaded.
+
+Pacemaker offers a way to automatically move resources off unhealthy nodes.
+
+.. index::
+ single: node attribute; health
+
+Node Health Attributes
+######################
+
+Pacemaker will treat any node attribute whose name starts with ``#health`` as
+an indicator of node health. Node health attributes may have one of the
+following values:
+
+.. table:: **Allowed Values for Node Health Attributes**
+ :widths: 1 4
+
+ +------------+--------------------------------------------------------------+
+ | Value | Intended significance |
+ +============+==============================================================+
+ | ``red`` | .. index:: |
+ | | single: red; node health attribute value |
+ | | single: node attribute; health (red) |
+ | | |
+ | | This indicator is unhealthy |
+ +------------+--------------------------------------------------------------+
+ | ``yellow`` | .. index:: |
+ | | single: yellow; node health attribute value |
+ | | single: node attribute; health (yellow) |
+ | | |
+ | | This indicator is becoming unhealthy |
+ +------------+--------------------------------------------------------------+
+ | ``green`` | .. index:: |
+ | | single: green; node health attribute value |
+ | | single: node attribute; health (green) |
+ | | |
+ | | This indicator is healthy |
+ +------------+--------------------------------------------------------------+
+ | *integer* | .. index:: |
+ | | single: score; node health attribute value |
+ | | single: node attribute; health (score) |
+ | | |
+ | | A numeric score to apply to all resources on this node (0 or |
+ | | positive is healthy, negative is unhealthy) |
+ +------------+--------------------------------------------------------------+
+
+
+.. index::
+ pair: cluster option; node-health-strategy
+
+Node Health Strategy
+####################
+
+Pacemaker assigns a node health score to each node, as the sum of the values of
+all its node health attributes. This score will be used as a location
+constraint applied to this node for all resources.
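+
+The combined effect of these scores on placement can be inspected, without
+changing anything, using ``crm_simulate``:
+
+.. code-block:: none
+
+   # crm_simulate --live-check --show-scores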
+
+The ``node-health-strategy`` cluster option controls how Pacemaker responds to
+changes in node health attributes, and how it translates ``red``, ``yellow``,
+and ``green`` to scores.
+
+Allowed values are:
+
+.. table:: **Node Health Strategies**
+ :widths: 1 4
+
+ +----------------+----------------------------------------------------------+
+ | Value | Effect |
+ +================+==========================================================+
+ | none | .. index:: |
+ | | single: node-health-strategy; none |
+ | | single: none; node-health-strategy value |
+ | | |
+ | | Do not track node health attributes at all. |
+ +----------------+----------------------------------------------------------+
+ | migrate-on-red | .. index:: |
+ | | single: node-health-strategy; migrate-on-red |
+ | | single: migrate-on-red; node-health-strategy value |
+ | | |
+ | | Assign the value of ``-INFINITY`` to ``red``, and 0 to |
+ | | ``yellow`` and ``green``. This will cause all resources |
+ | | to move off the node if any attribute is ``red``. |
+ +----------------+----------------------------------------------------------+
+ | only-green | .. index:: |
+ | | single: node-health-strategy; only-green |
+ | | single: only-green; node-health-strategy value |
+ | | |
+ | | Assign the value of ``-INFINITY`` to ``red`` and |
+ | | ``yellow``, and 0 to ``green``. This will cause all |
+ | | resources to move off the node if any attribute is |
+ | | ``red`` or ``yellow``. |
+ +----------------+----------------------------------------------------------+
+ | progressive | .. index:: |
+ | | single: node-health-strategy; progressive |
+ | | single: progressive; node-health-strategy value |
+ | | |
+ | | Assign the value of the ``node-health-red`` cluster |
+ | | option to ``red``, the value of ``node-health-yellow`` |
+ | | to ``yellow``, and the value of ``node-health-green`` to |
+ | | ``green``. Each node is additionally assigned a score of |
+ | | ``node-health-base`` (this allows resources to start |
+ | | even if some attributes are ``yellow``). This strategy |
+ | | gives the administrator finer control over how important |
+ | | each value is. |
+ +----------------+----------------------------------------------------------+
+ | custom | .. index:: |
+ | | single: node-health-strategy; custom |
+ | | single: custom; node-health-strategy value |
+ | | |
+ | | Track node health attributes using the same values as |
+ | | ``progressive`` for ``red``, ``yellow``, and ``green``, |
+ | | but do not take them into account. The administrator is |
+ | | expected to implement a policy by defining :ref:`rules` |
+ | | referencing node health attributes. |
+ +----------------+----------------------------------------------------------+
+
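+For example, the ``progressive`` strategy could be enabled and tuned with
+cluster option settings like the following (the particular scores here are
+only illustrative):
+
+.. code-block:: none
+
+   # crm_attribute --type crm_config --name node-health-strategy --update progressive
+   # crm_attribute --type crm_config --name node-health-base --update 100
+   # crm_attribute --type crm_config --name node-health-yellow --update=-10
+   # crm_attribute --type crm_config --name node-health-red --update=-INFINITY
+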
+
+Exempting a Resource from Health Restrictions
+#############################################
+
+If you want a resource to be able to run on a node even if its health score
+would otherwise prevent it, set the resource's ``allow-unhealthy-nodes``
+meta-attribute to ``true`` *(available since 2.1.3)*.
+
+This is particularly useful for node health agents, so that they can keep
+running and detect when the node becomes healthy again. Without this setting,
+a health agent would itself be banned from an unhealthy node, and you would
+have to investigate and clear the health attribute manually once the node is
+healthy again before resources could run on it.
+
+If you want the meta-attribute to apply to a clone, it must be set on the clone
+itself, not on the resource being cloned.
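+
+For example, with a hypothetical health agent clone named
+``HealthSMART-clone``, the meta-attribute could be set with:
+
+.. code-block:: none
+
+   # crm_resource --resource HealthSMART-clone --meta \
+       --set-parameter allow-unhealthy-nodes --parameter-value true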
+
+
+Configuring Node Health Agents
+##############################
+
+Since Pacemaker calculates node health based on node attributes, any method
+that sets node attributes may be used to measure node health. The most common
+are resource agents and custom daemons.
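+
+For instance, a custom script run periodically from cron could implement a
+simple disk check along these lines (a hypothetical sketch; the attribute
+name and threshold are arbitrary):
+
+.. code-block:: none
+
+   #!/bin/sh
+   # Mark this node yellow when the root filesystem is over 90% full,
+   # and green otherwise. attrd_updater affects the local node by default.
+   usage=$(df --output=pcent / | tail -n 1 | tr -dc '0-9')
+   if [ "$usage" -gt 90 ]; then
+       attrd_updater --name "#health-disk" --update yellow
+   else
+       attrd_updater --name "#health-disk" --update green
+   fi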
+
+Pacemaker provides examples that can be used directly or as a basis for custom
+code. The ``ocf:pacemaker:HealthCPU``, ``ocf:pacemaker:HealthIOWait``, and
+``ocf:pacemaker:HealthSMART`` resource agents set node health attributes based
+on CPU and disk status.
+
+To take advantage of this feature, add the resource to your cluster (generally
+as a cloned resource with a recurring monitor action, to continually check the
+health of all nodes). For example:
+
+.. topic:: Example HealthIOWait resource configuration
+
+ .. code-block:: xml
+
+ <clone id="resHealthIOWait-clone">
+ <primitive class="ocf" id="HealthIOWait" provider="pacemaker" type="HealthIOWait">
+ <instance_attributes id="resHealthIOWait-instance_attributes">
+ <nvpair id="resHealthIOWait-instance_attributes-red_limit" name="red_limit" value="30"/>
+ <nvpair id="resHealthIOWait-instance_attributes-yellow_limit" name="yellow_limit" value="10"/>
+ </instance_attributes>
+ <operations>
+ <op id="resHealthIOWait-monitor-interval-5" interval="5" name="monitor" timeout="5"/>
+ <op id="resHealthIOWait-start-interval-0s" interval="0s" name="start" timeout="10s"/>
+ <op id="resHealthIOWait-stop-interval-0s" interval="0s" name="stop" timeout="10s"/>
+ </operations>
+ </primitive>
+ </clone>
+
+The resource agents use ``attrd_updater`` to set the appropriate health
+status for each node running the resource, as a node attribute whose name
+starts with ``#health`` (for ``HealthIOWait``, the node attribute is named
+``#health-iowait``).
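+
+The current value on a given node can be checked with a query such as:
+
+.. code-block:: none
+
+   # attrd_updater --query --name "#health-iowait" --node "node1"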
+
+When a node is no longer faulty, you can make it eligible to run resources
+again, without waiting for the next monitor interval, by setting its node
+health attribute to ``green``. For example:
+
+.. topic:: **Force node1 to be marked as healthy**
+
+ .. code-block:: none
+
+ # attrd_updater --name "#health-iowait" --update "green" --node "node1"