.. _alerts:

.. index::
   single: alert
   single: resource; alert
   single: node; alert
   single: fencing; alert
   pair: XML element; alert
   pair: XML element; alerts

Alerts
------

*Alerts* may be configured to take some external action when a cluster event
occurs (node failure, resource starting or stopping, etc.).


.. index::
   pair: alert; agent

Alert Agents
############

As with resource agents, the cluster calls an external program (an
*alert agent*) to handle alerts. The cluster passes information about the event
to the agent via environment variables. Agents can do anything desired with
this information (send an e-mail, log to a file, update a monitoring system,
etc.).

.. topic:: Simple alert configuration

   .. code-block:: xml

      <configuration>
         <alerts>
            <alert id="my-alert" path="/path/to/my-script.sh" />
         </alerts>
      </configuration>

In the example above, the cluster will call ``my-script.sh`` for each event.

Multiple alert agents may be configured; the cluster will call all of them for
each event.
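
As a minimal sketch of this, the following configuration defines two agents.
The ids and script paths are placeholders chosen for illustration, not agents
shipped with Pacemaker.

.. topic:: Alert configuration with two agents (illustrative)

   .. code-block:: xml

      <configuration>
         <alerts>
            <!-- Hypothetical agents; both are called for every event -->
            <alert id="my-alert-1" path="/path/to/first-script.sh" />
            <alert id="my-alert-2" path="/path/to/second-script.sh" />
         </alerts>
      </configuration>

With a configuration like this, the cluster would call both
``first-script.sh`` and ``second-script.sh`` for each event.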

Alert agents will be called only on cluster nodes. They will be called for
events involving Pacemaker Remote nodes, but they will never be called *on*
those nodes.
   
For more information about sample alert agents provided by Pacemaker and about
developing custom alert agents, see the *Pacemaker Administration* document.


.. index::
   single: alert; recipient
   pair: XML element; recipient

Alert Recipients
################
   
Usually, alerts are directed towards a recipient. Thus, each alert may be
additionally configured with one or more recipients. The cluster will call the
agent separately for each recipient.
   
.. topic:: Alert configuration with recipient

   .. code-block:: xml

      <configuration>
         <alerts>
            <alert id="my-alert" path="/path/to/my-script.sh">
               <recipient id="my-alert-recipient" value="some-address"/>
            </alert>
         </alerts>
      </configuration>
   
In the above example, the cluster will call ``my-script.sh`` for each event,
passing the recipient ``some-address`` as an environment variable.

The recipient may be anything the alert agent can recognize, such as an IP
address, an e-mail address, a file name, or whatever else the particular agent
supports.
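
Because the agent is called separately for each recipient, a single alert can
fan out to several destinations. The sketch below assumes a hypothetical agent
that treats the recipient value as a file name; the ids and paths are
placeholders.

.. topic:: Alert configuration with two recipients (illustrative)

   .. code-block:: xml

      <configuration>
         <alerts>
            <alert id="my-alert" path="/path/to/my-script.sh">
               <!-- Hypothetical file-name recipients; one call per recipient -->
               <recipient id="my-alert-recipient1" value="/path/to/first.log"/>
               <recipient id="my-alert-recipient2" value="/path/to/second.log"/>
            </alert>
         </alerts>
      </configuration>

With this configuration, the cluster would call ``my-script.sh`` twice for
each event, once per recipient value.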
   
   
.. index::
   single: alert; meta-attributes
   single: meta-attribute; alert meta-attributes

Alert Meta-Attributes
#####################
   
As with resources, meta-attributes can be configured for alerts to change
whether and how Pacemaker calls them.
   
.. table:: **Meta-Attributes of an Alert**
   :class: longtable
   :widths: 1 1 3
   
   +------------------+---------------+-----------------------------------------------------+
   | Meta-Attribute   | Default       | Description                                         |
   +==================+===============+=====================================================+
   | enabled          | true          | .. index::                                          |
   |                  |               |    single: alert; meta-attribute, enabled           |
   |                  |               |    single: meta-attribute; enabled (alert)          |
   |                  |               |    single: enabled; alert meta-attribute            |
   |                  |               |                                                     |
   |                  |               | If false for an alert, the alert will not be used.  |
   |                  |               | If true for an alert and false for a particular     |
   |                  |               | recipient of that alert, that recipient will not be |
   |                  |               | used. *(since 2.1.6)*                               |
   +------------------+---------------+-----------------------------------------------------+
   | timestamp-format | %H:%M:%S.%06N | .. index::                                          |
   |                  |               |    single: alert; meta-attribute, timestamp-format  |
   |                  |               |    single: meta-attribute; timestamp-format (alert) |
   |                  |               |    single: timestamp-format; alert meta-attribute   |
   |                  |               |                                                     |
   |                  |               | Format the cluster will use when sending the        |
   |                  |               | event's timestamp to the agent. This is a string as |
   |                  |               | used with the ``date(1)`` command.                  |
   +------------------+---------------+-----------------------------------------------------+
   | timeout          | 30s           | .. index::                                          |
   |                  |               |    single: alert; meta-attribute, timeout           |
   |                  |               |    single: meta-attribute; timeout (alert)          |
   |                  |               |    single: timeout; alert meta-attribute            |
   |                  |               |                                                     |
   |                  |               | If the alert agent does not complete within this    |
   |                  |               | amount of time, it will be terminated.              |
   +------------------+---------------+-----------------------------------------------------+
   
Meta-attributes can be configured per alert and/or per recipient.
   
.. topic:: Alert configuration with meta-attributes

   .. code-block:: xml

      <configuration>
         <alerts>
            <alert id="my-alert" path="/path/to/my-script.sh">
               <meta_attributes id="my-alert-attributes">
                  <nvpair id="my-alert-attributes-timeout" name="timeout"
                          value="15s"/>
               </meta_attributes>
               <recipient id="my-alert-recipient1" value="someuser@example.com">
                  <meta_attributes id="my-alert-recipient1-attributes">
                     <nvpair id="my-alert-recipient1-timestamp-format"
                             name="timestamp-format" value="%D %H:%M"/>
                  </meta_attributes>
               </recipient>
               <recipient id="my-alert-recipient2" value="otheruser@example.com">
                  <meta_attributes id="my-alert-recipient2-attributes">
                     <nvpair id="my-alert-recipient2-timestamp-format"
                             name="timestamp-format" value="%c"/>
                  </meta_attributes>
               </recipient>
            </alert>
         </alerts>
      </configuration>
   
In the above example, ``my-script.sh`` will be called twice for each
event, with each call using a 15-second timeout. One call will be passed the
recipient ``someuser@example.com`` and a timestamp in the format ``%D %H:%M``,
while the other call will be passed the recipient ``otheruser@example.com`` and
a timestamp in the format ``%c``.
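
The ``enabled`` meta-attribute can be set per recipient in the same way. As a
minimal sketch, assuming a Pacemaker version that supports ``enabled`` (2.1.6
or later, per the table above), the following configuration would silence one
recipient while leaving the other active. The ids are illustrative.

.. topic:: Alert configuration with one disabled recipient (illustrative)

   .. code-block:: xml

      <configuration>
         <alerts>
            <alert id="my-alert" path="/path/to/my-script.sh">
               <recipient id="my-alert-recipient1" value="someuser@example.com"/>
               <recipient id="my-alert-recipient2" value="otheruser@example.com">
                  <meta_attributes id="my-alert-recipient2-attributes">
                     <!-- Disables only this recipient; the alert stays enabled -->
                     <nvpair id="my-alert-recipient2-enabled"
                             name="enabled" value="false"/>
                  </meta_attributes>
               </recipient>
            </alert>
         </alerts>
      </configuration>

Here the agent would be called only for ``someuser@example.com``.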
   
   
.. index::
   single: alert; instance attributes
   single: instance attribute; alert instance attributes

Alert Instance Attributes
#########################
   
As with resource agents, agent-specific configuration values may be configured
as instance attributes. These will be passed to the agent as additional
environment variables. The number, names and allowed values of these instance
attributes are completely up to the particular agent.
   
.. topic:: Alert configuration with instance attributes

   .. code-block:: xml

      <configuration>
         <alerts>
            <alert id="my-alert" path="/path/to/my-script.sh">
               <meta_attributes id="my-alert-attributes">
                  <nvpair id="my-alert-attributes-timeout" name="timeout"
                          value="15s"/>
               </meta_attributes>
               <instance_attributes id="my-alert-options">
                  <nvpair id="my-alert-options-debug" name="debug"
                          value="false"/>
               </instance_attributes>
               <recipient id="my-alert-recipient1"
                          value="someuser@example.com"/>
            </alert>
         </alerts>
      </configuration>
   
   
.. index::
   single: alert; filters
   pair: XML element; select
   pair: XML element; select_nodes
   pair: XML element; select_fencing
   pair: XML element; select_resources
   pair: XML element; select_attributes
   pair: XML element; attribute

Alert Filters
#############
   
By default, an alert agent will be called for node events, fencing events, and
resource events. An agent may choose to ignore certain types of events, but
there is still the overhead of calling it for those events. To eliminate that
overhead, you may select which types of events the agent should receive.

Alert filters are configured within a ``select`` element inside an ``alert``
element.

.. list-table:: **Possible alert filters**
   :class: longtable
   :widths: 1 3
   :header-rows: 1

   * - Name
     - Events alerted
   * - select_nodes
     - A node joins or leaves the cluster (whether at the cluster layer for
       cluster nodes, or via a remote connection for Pacemaker Remote nodes).
   * - select_fencing
     - Fencing or unfencing of a node completes (whether successfully or not).
   * - select_resources
     - A resource action other than meta-data completes (whether successfully
       or not).
   * - select_attributes
     - A transient attribute value update is sent to the CIB.

.. topic:: Alert configuration to receive only node events and fencing events

   .. code-block:: xml

      <configuration>
         <alerts>
            <alert id="my-alert" path="/path/to/my-script.sh">
               <select>
                  <select_nodes />
                  <select_fencing />
               </select>
               <recipient id="my-alert-recipient1"
                          value="someuser@example.com"/>
            </alert>
         </alerts>
      </configuration>
   
With ``<select_attributes>`` (the only event type not enabled by default), the
agent will receive alerts when a node attribute changes. If you wish the agent
to be called only when certain attributes change, you can configure that as well.
   
.. topic:: Alert configuration to be called when certain node attributes change

   .. code-block:: xml

      <configuration>
         <alerts>
            <alert id="my-alert" path="/path/to/my-script.sh">
               <select>
                  <select_attributes>
                     <attribute id="alert-standby" name="standby" />
                     <attribute id="alert-shutdown" name="shutdown" />
                  </select_attributes>
               </select>
               <recipient id="my-alert-recipient1" value="someuser@example.com"/>
            </alert>
         </alerts>
      </configuration>
   
Node attribute alerts are currently considered experimental. Alerts may be
limited to attributes set via ``attrd_updater``, and agents may be called
multiple times with the same attribute value.
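
Because attribute events are not enabled by default, an agent that should
receive every event type needs all four filters listed explicitly. The
following is a minimal sketch, assuming that an empty ``<select_attributes />``
selects changes to any attribute, as the description above implies.

.. topic:: Alert configuration to receive all event types (illustrative)

   .. code-block:: xml

      <configuration>
         <alerts>
            <alert id="my-alert" path="/path/to/my-script.sh">
               <select>
                  <select_nodes />
                  <select_fencing />
                  <select_resources />
                  <!-- Assumed: no attribute children means any attribute -->
                  <select_attributes />
               </select>
               <recipient id="my-alert-recipient1"
                          value="someuser@example.com"/>
            </alert>
         </alerts>
      </configuration>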