author: Daniel Baumann <daniel.baumann@progress-linux.org>, 2024-06-03 13:39:28 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org>, 2024-06-03 13:39:28 +0000
commit: 924f5ea83e48277e014ebf0d19a27187cb93e2f7
tree: 75920a275bba045f6d108204562c218a9a26ea15 /doc/sphinx/Pacemaker_Explained/cluster-options.rst
parent: Adding upstream version 2.1.7.
Adding upstream version 2.1.8~rc1. (refs: upstream/2.1.8_rc1, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/sphinx/Pacemaker_Explained/cluster-options.rst')
-rw-r--r-- doc/sphinx/Pacemaker_Explained/cluster-options.rst | 242
1 files changed, 80 insertions, 162 deletions
diff --git a/doc/sphinx/Pacemaker_Explained/cluster-options.rst b/doc/sphinx/Pacemaker_Explained/cluster-options.rst
index 77bd7e6..042ed0b 100644
--- a/doc/sphinx/Pacemaker_Explained/cluster-options.rst
+++ b/doc/sphinx/Pacemaker_Explained/cluster-options.rst
@@ -62,143 +62,6 @@ Normally, you will use command-line tools that abstract the XML, so the
distinction will be unimportant; both properties and options are cluster
settings you can tweak.
-Configuration Value Types
-#########################
-
-Throughout this document, configuration values will be designated as having one
-of the following types:
-
-.. list-table:: **Configuration Value Types**
- :class: longtable
- :widths: 1 3
- :header-rows: 1
-
- * - Type
- - Description
- * - .. _boolean:
-
- .. index::
- pair: type; boolean
-
- boolean
- - Case-insensitive text value where ``1``, ``yes``, ``y``, ``on``,
- and ``true`` evaluate as true and ``0``, ``no``, ``n``, ``off``,
- ``false``, and unset evaluate as false
- * - .. _date_time:
-
- .. index::
- pair: type; date/time
-
- date/time
- - Textual timestamp like ``Sat Dec 21 11:47:45 2013``
- * - .. _duration:
-
- .. index::
- pair: type; duration
-
- duration
- - A time duration, specified either like a :ref:`timeout <timeout>` or an
- `ISO 8601 duration <https://en.wikipedia.org/wiki/ISO_8601#Durations>`_.
- A duration may be up to approximately 49 days but is intended for much
- smaller time periods.
- * - .. _enumeration:
-
- .. index::
- pair: type; enumeration
-
- enumeration
- - Text that must be one of a set of defined values (which will be listed
- in the description)
- * - .. _integer:
-
- .. index::
- pair: type; integer
-
- integer
- - 32-bit signed integer value (-2,147,483,648 to 2,147,483,647)
- * - .. _nonnegative_integer:
-
- .. index::
- pair: type; nonnegative integer
-
- nonnegative integer
- - 32-bit nonnegative integer value (0 to 2,147,483,647)
- * - .. _port:
-
- .. index::
- pair: type; port
-
- port
- - Integer TCP port number (0 to 65535)
- * - .. _score:
-
- .. index::
- pair: type; score
-
- score
- - A Pacemaker score can be an integer between -1,000,000 and 1,000,000, or
- a string alias: ``INFINITY`` or ``+INFINITY`` is equivalent to
- 1,000,000, ``-INFINITY`` is equivalent to -1,000,000, and ``red``,
- ``yellow``, and ``green`` are equivalent to integers as described in
- :ref:`node-health`.
- * - .. _text:
-
- .. index::
- pair: type; text
-
- text
- - A text string
- * - .. _timeout:
-
- .. index::
- pair: type; timeout
-
- timeout
- - A time duration, specified as a bare number (in which case it is
- considered to be in seconds) or a number with a unit (``ms`` or ``msec``
- for milliseconds, ``us`` or ``usec`` for microseconds, ``s`` or ``sec``
- for seconds, ``m`` or ``min`` for minutes, ``h`` or ``hr`` for hours)
- optionally with whitespace before and/or after the number.
- * - .. _version:
-
- .. index::
- pair: type; version
-
- version
- - Version number (any combination of alphanumeric characters, dots, and
- dashes, starting with a number).
-
-
-Scores
-______
-
-Scores are integral to how Pacemaker works. Practically everything from moving
-a resource to deciding which resource to stop in a degraded cluster is achieved
-by manipulating scores in some way.
-
-Scores are calculated per resource and node. Any node with a negative score for
-a resource can't run that resource. The cluster places a resource on the node
-with the highest score for it.
-
-Score addition and subtraction follow these rules:
-
-* Any value (including ``INFINITY``) - ``INFINITY`` = ``-INFINITY``
-* ``INFINITY`` + any value other than ``-INFINITY`` = ``INFINITY``
-
-.. note::
-
- What if you want to use a score higher than 1,000,000? Typically this possibility
- arises when someone wants to base the score on some external metric that might
- go above 1,000,000.
-
- The short answer is you can't.
-
- The long answer is it is sometimes possible work around this limitation
- creatively. You may be able to set the score to some computed value based on
- the external metric rather than use the metric directly. For nodes, you can
- store the metric as a node attribute, and query the attribute when computing
- the score (possibly as part of a custom resource agent).
-
CIB Properties
##############
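The score rules in the removed "Scores" section of the hunk above can be sketched in Python. This is an illustration only, not Pacemaker's implementation: the names ``parse_score`` and ``add_scores`` are hypothetical, aliases are matched case-sensitively, the ``red``/``yellow``/``green`` aliases are left out, and clamping ordinary sums to the documented range is an assumption.

```python
INFINITY = 1_000_000  # documented magnitude of INFINITY


def parse_score(text):
    """Convert a score string to an integer within [-INFINITY, INFINITY].

    Hypothetical helper: handles only integers and the INFINITY aliases.
    """
    s = text.strip()
    if s in ("INFINITY", "+INFINITY"):
        return INFINITY
    if s == "-INFINITY":
        return -INFINITY
    # Assumption: out-of-range integers are clamped rather than rejected.
    return max(-INFINITY, min(INFINITY, int(s)))


def add_scores(a, b):
    """Add two scores following the two documented rules."""
    # Any value (including INFINITY) - INFINITY = -INFINITY
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    # INFINITY + any value other than -INFINITY = INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    # Assumption: ordinary sums are clamped to the valid range.
    return max(-INFINITY, min(INFINITY, a + b))
```

For example, ``add_scores(parse_score("INFINITY"), parse_score("-INFINITY"))`` yields ``-INFINITY``, which is why a single ``-INFINITY`` constraint bans a resource from a node regardless of its other preferences.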
@@ -321,6 +184,15 @@ holds. So the decision was made to place them in an easy-to-find location.
-
- Node ID of the cluster's current designated controller (DC). Used and
maintained by the cluster.
+ * - .. _execution_date:
+
+ .. index::
+ pair: execution-date; cib
+
+ execution-date
+ - :ref:`epoch time <epoch_time>`
+ -
+ - Time to use when evaluating rules.
.. _cluster_options:
@@ -427,6 +299,29 @@ values, by running the ``man pacemaker-schedulerd`` and
- The number of :ref:`live migration <live-migration>` actions that the
cluster is allowed to execute in parallel on a node. A value of -1 means
unlimited.
+ * - .. _load_threshold:
+
+ .. index::
+ pair: cluster option; load-threshold
+
+ load-threshold
+ - :ref:`percentage <percentage>`
+ - 80%
+ - Maximum amount of system load that should be used by cluster nodes. The
+ cluster will slow down its recovery process when the amount of system
+ resources used (currently CPU) approaches this limit.
+ * - .. _node_action_limit:
+
+ .. index::
+ pair: cluster option; node-action-limit
+
+ node-action-limit
+ - :ref:`integer <integer>`
+ - 0
+ - Maximum number of jobs that can be scheduled per node. If nonpositive or
+ invalid, double the number of cores is used as the maximum number of jobs
+ per node. :ref:`PCMK_node_action_limit <pcmk_node_action_limit>`
+ overrides this option on a per-node basis.
* - .. _symmetric_cluster:
.. index::
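The fallback described for ``node-action-limit`` in the hunk above can be sketched as follows. This is a rough illustration under stated assumptions: ``effective_node_action_limit`` is a hypothetical name, the core count comes from Python's ``os.cpu_count()`` rather than from Pacemaker, and the per-node ``PCMK_node_action_limit`` override is not modeled.

```python
import os


def effective_node_action_limit(configured, cores=None):
    """Return the per-node job limit implied by a configured value.

    Hypothetical helper: a nonpositive or invalid value falls back to
    twice the number of cores, as the option description states.
    """
    if cores is None:
        # Assumption: os.cpu_count() stands in for the node's core count.
        cores = os.cpu_count() or 1
    try:
        limit = int(configured)
    except (TypeError, ValueError):
        limit = 0  # treat an invalid value like the default of 0
    return limit if limit > 0 else 2 * cores
```

With the default value of 0 on a 4-core node, this sketch gives a limit of 8 jobs.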
@@ -558,6 +453,22 @@ values, by running the ``man pacemaker-schedulerd`` and
- How many times fencing can fail for a target before the cluster will no
longer immediately re-attempt it. Any value below 1 will be ignored, and
the default will be used instead.
+ * - .. _have_watchdog:
+
+ .. index::
+ pair: cluster option; have-watchdog
+
+ have-watchdog
+ - :ref:`boolean <boolean>`
+ - *detected*
+ - Whether watchdog integration is enabled. This is set automatically by the
+ cluster according to whether SBD is detected to be in use.
+ User-configured values are ignored. The value `true` is meaningful if
+ diskless SBD is used and
+ :ref:`stonith-watchdog-timeout <stonith_watchdog_timeout>` is nonzero. In
+ that case, if fencing is required, watchdog-based self-fencing will be
+ performed via SBD without requiring a fencing resource explicitly
+ configured.
* - .. _stonith_watchdog_timeout:
.. index::
@@ -568,23 +479,29 @@ values, by running the ``man pacemaker-schedulerd`` and
- 0
- If nonzero, and the cluster detects ``have-watchdog`` as ``true``, then
watchdog-based self-fencing will be performed via SBD when fencing is
- required, without requiring a fencing resource explicitly configured.
-
- If this is set to a positive value, unseen nodes are assumed to
- self-fence within this much time.
+ required.
- **Warning:** It must be ensured that this value is larger than the
- ``SBD_WATCHDOG_TIMEOUT`` environment variable on all nodes. Pacemaker
- verifies the settings individually on all nodes and prevents startup or
- shuts down if configured wrongly on the fly. It is strongly recommended
- that ``SBD_WATCHDOG_TIMEOUT`` be set to the same value on all nodes.
+ If this is set to a positive value, lost nodes are assumed to achieve
+ self-fencing within this much time.
+
+ This does not require a fencing resource to be explicitly configured,
+ though a fence_watchdog resource can be configured, to limit use to
+ specific nodes.
+
+ If this is set to 0 (the default), the cluster will never assume
+ watchdog-based self-fencing.
+
+ If this is set to a negative value, the cluster will use twice the local
+ value of the ``SBD_WATCHDOG_TIMEOUT`` environment variable if that is
+ positive, or otherwise treat this as 0.
- If this is set to a negative value, and ``SBD_WATCHDOG_TIMEOUT`` is set,
- twice that value will be used.
+ **Warning:** When used, this timeout must be larger than
+ ``SBD_WATCHDOG_TIMEOUT`` on all nodes that use watchdog-based SBD, and
+ Pacemaker will refuse to start on any of those nodes where this is not
+ true for the local value or SBD is not active. When this is set to a
+ negative value, ``SBD_WATCHDOG_TIMEOUT`` must be set to the same value
+ on all nodes that use SBD, otherwise data corruption or loss could occur.
- **Warning:** In this case, it is essential (and currently not verified
- by pacemaker) that ``SBD_WATCHDOG_TIMEOUT`` is set to the same value on
- all nodes.
* - .. _concurrent-fencing:
.. index::
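The revised ``stonith-watchdog-timeout`` rules in the hunk above (positive, zero, and negative values) can be summarized in a short Python sketch. All function names are hypothetical, values are taken to be whole seconds, and representing an unset ``SBD_WATCHDOG_TIMEOUT`` as ``None`` is an assumption; this is not how Pacemaker itself is written.

```python
def effective_watchdog_timeout(configured_s, sbd_watchdog_timeout_s=None):
    """Resolve stonith-watchdog-timeout to the value the cluster would use."""
    if configured_s >= 0:
        # Positive values are used as-is; 0 means never assume self-fencing.
        return configured_s
    # Negative: twice the local SBD_WATCHDOG_TIMEOUT if that is positive,
    # otherwise treated as 0.
    if sbd_watchdog_timeout_s is not None and sbd_watchdog_timeout_s > 0:
        return 2 * sbd_watchdog_timeout_s
    return 0


def watchdog_self_fencing_assumed(have_watchdog, configured_s,
                                  sbd_watchdog_timeout_s=None):
    """True if the cluster would rely on watchdog-based self-fencing."""
    effective = effective_watchdog_timeout(configured_s, sbd_watchdog_timeout_s)
    return bool(have_watchdog) and effective != 0


def local_timeout_is_safe(effective_s, sbd_watchdog_timeout_s):
    """Check the warning's condition on one node (assumed simplification):
    when used, the timeout must exceed the local SBD_WATCHDOG_TIMEOUT."""
    if effective_s == 0:
        return True  # timeout not in use, nothing to check
    return (sbd_watchdog_timeout_s is not None
            and effective_s > sbd_watchdog_timeout_s)
```

Note that with a negative setting each node derives the timeout from its own local ``SBD_WATCHDOG_TIMEOUT``, which is why the warning requires that variable to be identical on all nodes that use SBD.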
@@ -607,12 +524,13 @@ values, by running the ``man pacemaker-schedulerd`` and
- :ref:`enumeration <enumeration>`
- stop
- How should a cluster node react if notified of its own fencing? A
- cluster node may receive notification of its own fencing if fencing is
- misconfigured, or if fabric fencing is in use that doesn't cut cluster
- communication. Allowed values are ``stop`` to attempt to immediately
- stop Pacemaker and stay stopped, or ``panic`` to attempt to immediately
- reboot the local node, falling back to stop on failure. The default is
- likely to be changed to ``panic`` in a future release. *(since 2.0.3)*
+ cluster node may receive notification of a "succeeded" fencing that
+ targeted it if fencing is misconfigured, or if fabric fencing is in use
+ that doesn't cut cluster communication. Allowed values are ``stop`` to
+ attempt to immediately stop Pacemaker and stay stopped, or ``panic`` to
+ attempt to immediately reboot the local node, falling back to stop on
+ failure. The default is likely to be changed to ``panic`` in a future
+ release. *(since 2.0.3)*
* - .. _priority_fencing_delay:
.. index::
@@ -784,7 +702,7 @@ values, by running the ``man pacemaker-schedulerd`` and
node-health-red
- :ref:`score <score>`
- - 0
+ - -INFINITY
- The score to use for a node health attribute whose value is ``red``.
Only used when ``node-health-strategy`` is ``progressive`` or
``custom``.
@@ -797,10 +715,10 @@ values, by running the ``man pacemaker-schedulerd`` and
- :ref:`duration <duration>`
- 15min
- Pacemaker is primarily event-driven, and looks ahead to know when to
- recheck the cluster for failure timeouts and most time-based rules
- *(since 2.0.3)*. However, it will also recheck the cluster after this
- amount of inactivity. This has two goals: rules with ``date_spec`` are
- only guaranteed to be checked this often, and it also serves as a
+ recheck the cluster for failure-timeout settings and most time-based
+ rules *(since 2.0.3)*. However, it will also recheck the cluster after
+ this amount of inactivity. This has two goals: rules with ``date_spec``
+ are only guaranteed to be checked this often, and it also serves as a
fail-safe for some kinds of scheduler bugs. A value of 0 disables this
polling.
* - .. _shutdown_lock: