doc/rados/operations/devices.rst


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227

.. _devices:

Device Management
=================

Device management allows Ceph to address hardware failure. Ceph tracks hardware
storage devices (HDDs, SSDs) to see which devices are managed by which daemons.
Ceph also collects health metrics about these devices. By doing so, Ceph can
provide tools that predict hardware failure and can automatically respond to
hardware failure.

Device tracking
---------------

To see a list of the storage devices that are in use, run the following
command:

.. prompt:: bash $

   ceph device ls

Alternatively, to list devices by daemon or by host, run a command of one of
the following forms:

.. prompt:: bash $

   ceph device ls-by-daemon <daemon>
   ceph device ls-by-host <host>

To see information about the location of an specific device and about how the
device is being consumed, run a command of the following form:

.. prompt:: bash $

   ceph device info <devid>

Identifying physical devices
----------------------------

To make the replacement of failed disks easier and less error-prone, you can
(in some cases) "blink" the drive's LEDs on hardware enclosures by running a
command of the following form::

  device light on|off <devid> [ident|fault] [--force]

.. note:: Using this command to blink the lights might not work. Whether it
   works will depend upon such factors as your kernel revision, your SES
   firmware, or the setup of your HBA.

The ``<devid>`` parameter is the device identification. To retrieve this
information, run the following command:

.. prompt:: bash $

   ceph device ls

The ``[ident|fault]`` parameter determines which kind of light will blink.  By
default, the `identification` light is used.

.. note:: This command works only if the Cephadm or the Rook `orchestrator
   <https://docs.ceph.com/docs/master/mgr/orchestrator/#orchestrator-cli-module>`_
   module is enabled.  To see which orchestrator module is enabled, run the
   following command:

   .. prompt:: bash $

      ceph orch status

The command that makes the drive's LEDs blink is `lsmcli`. To customize this
command, configure it via a Jinja2 template by running commands of the
following forms::

   ceph config-key set mgr/cephadm/blink_device_light_cmd "<template>"
   ceph config-key set mgr/cephadm/<host>/blink_device_light_cmd "lsmcli local-disk-{{ ident_fault }}-led-{{'on' if on else 'off'}} --path '{{ path or dev }}'"

The following arguments can be used to customize the Jinja2 template:

* ``on``
    A boolean value.
* ``ident_fault``
    A string that contains `ident` or `fault`.
* ``dev``
    A string that contains the device ID: for example, `SanDisk_X400_M.2_2280_512GB_162924424784`.
* ``path``
    A string that contains the device path: for example, `/dev/sda`.

.. _enabling-monitoring:

Enabling monitoring
-------------------

Ceph can also monitor the health metrics associated with your device. For
example, SATA drives implement a standard called SMART that provides a wide
range of internal metrics about the device's usage and health (for example: the
number of hours powered on, the number of power cycles, the number of
unrecoverable read errors). Other device types such as SAS and NVMe present a
similar set of metrics (via slightly different standards).  All of these
metrics can be collected by Ceph via the ``smartctl`` tool.

You can enable or disable health monitoring by running one of the following
commands:

.. prompt:: bash $

   ceph device monitoring on
   ceph device monitoring off

Scraping
--------

If monitoring is enabled, device metrics will be scraped automatically at
regular intervals. To configure that interval, run a command of the following
form:

.. prompt:: bash $

   ceph config set mgr mgr/devicehealth/scrape_frequency <seconds>

By default, device metrics are scraped once every 24 hours.

To manually scrape all devices, run the following command:
   
.. prompt:: bash $

   ceph device scrape-health-metrics

To scrape a single device, run a command of the following form:

.. prompt:: bash $

   ceph device scrape-health-metrics <device-id>

To scrape a single daemon's devices, run a command of the following form:

.. prompt:: bash $

   ceph device scrape-daemon-health-metrics <who>

To retrieve the stored health metrics for a device (optionally for a specific
timestamp),  run a command of the following form:

.. prompt:: bash $

   ceph device get-health-metrics <devid> [sample-timestamp]

Failure prediction
------------------

Ceph can predict drive life expectancy and device failures by analyzing the
health metrics that it collects. The prediction modes are as follows:

* *none*: disable device failure prediction.
* *local*: use a pre-trained prediction model from the ``ceph-mgr`` daemon.

To configure the prediction mode, run a command of the following form:

.. prompt:: bash $

   ceph config set global device_failure_prediction_mode <mode>

Under normal conditions, failure prediction runs periodically in the
background.  For this reason, life expectancy values might be populated only
after a significant amount of time has passed.  The life expectancy of all
devices is displayed in the output of the following command:

.. prompt:: bash $

   ceph device ls

To see the metadata of a specific device, run a command of the following form:

.. prompt:: bash $

   ceph device info <devid>

To explicitly force prediction of a specific device's life expectancy, run a
command of the following form:

.. prompt:: bash $

   ceph device predict-life-expectancy <devid>

In addition to Ceph's internal device failure prediction, you might have an
external source of information about device failures. To inform Ceph of a
specific device's life expectancy, run a command of the following form:

.. prompt:: bash $

   ceph device set-life-expectancy <devid> <from> [<to>]

Life expectancies are expressed as a time interval. This means that the
uncertainty of the life expectancy can be expressed in the form of a range of
time, and perhaps a wide range of time. The interval's end can be left
unspecified.

Health alerts
-------------

The ``mgr/devicehealth/warn_threshold`` configuration option controls the
health check for an expected device failure. If the device is expected to fail
within the specified time interval, an alert is raised.

To check the stored life expectancy of all devices and generate any appropriate
health alert, run the following command:

.. prompt:: bash $

   ceph device check-health

Automatic Migration
-------------------

The ``mgr/devicehealth/self_heal`` option (enabled by default) automatically
migrates data away from devices that are expected to fail soon. If this option
is enabled, the module marks such devices ``out`` so that automatic migration
will occur.

.. note:: The ``mon_osd_min_up_ratio`` configuration option can help prevent
   this process from cascading to total failure. If the "self heal" module
   marks ``out`` so many OSDs that the ratio value of ``mon_osd_min_up_ratio``
   is exceeded, then the cluster raises the ``DEVICE_HEALTH_TOOMANY`` health
   check. For instructions on what to do in this situation, see
   :ref:`DEVICE_HEALTH_TOOMANY<rados_health_checks_device_health_toomany>`.

The ``mgr/devicehealth/mark_out_threshold`` configuration option specifies the
time interval for automatic migration. If a device is expected to fail within
the specified time interval, it will be automatically marked ``out``.