=======================
 Config Settings
=======================

See `Block Device`_ for additional details.

Generic IO Settings
===================

``rbd_compression_hint``

:Description: Hint to send to the OSDs on write operations. If set to
              ``compressible`` and the OSD ``bluestore_compression_mode``
              setting is ``passive``, the OSD will attempt to compress data.
              If set to ``incompressible`` and the OSD compression setting
              is ``aggressive``, the OSD will not attempt to compress data.
:Type: Enum
:Required: No
:Default: ``none``
:Values: ``none``, ``compressible``, ``incompressible``
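
The interaction between the hint and the OSD compression mode can be sketched
in Python. This is an illustrative model, not Ceph code: ``osd_will_compress``
is a hypothetical helper, and the ``none`` and ``force`` modes are taken from
the BlueStore ``bluestore_compression_mode`` setting rather than from the
description above. The real OSD applies further checks that are not modeled
here.

.. code-block:: python

   HINTS = {"none", "compressible", "incompressible"}


   def osd_will_compress(hint: str, mode: str) -> bool:
       """Return True if an OSD in ``mode`` would attempt compression."""
       if hint not in HINTS:
           raise ValueError(f"unknown rbd_compression_hint: {hint!r}")
       if mode == "none":
           # Compression is switched off on the OSD; the hint is ignored.
           return False
       if mode == "passive":
           # Compress only when the client says the data is compressible.
           return hint == "compressible"
       if mode == "aggressive":
           # Compress unless the client says the data is incompressible.
           return hint != "incompressible"
       if mode == "force":
           # Always attempt compression, whatever the hint says.
           return True
       raise ValueError(f"unknown bluestore_compression_mode: {mode!r}")


   if __name__ == "__main__":
       for mode in ("passive", "aggressive"):
           for hint in sorted(HINTS):
               print(mode, hint, osd_will_compress(hint, mode))

With the default hint of ``none``, this model compresses only when the OSD mode
is ``aggressive`` or ``force``.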


``rbd_read_from_replica_policy``

:Description: Policy for determining which OSD will receive read operations.
              If set to ``default``, each PG's primary OSD will always be used
              for read operations. If set to ``balance``, read operations will
              be sent to a randomly selected OSD within the replica set. If set
              to ``localize``, read operations will be sent to the closest OSD
              as determined by the CRUSH map. Note: this feature requires the
              cluster to be configured with a minimum compatible OSD release of
              Octopus.
:Type: Enum
:Required: No
:Default: ``default``
:Values: ``default``, ``balance``, ``localize``

Cache Settings
=======================

.. sidebar:: Kernel Caching

	The kernel driver for Ceph block devices can use the Linux page cache to
	improve performance.

The user space implementation of the Ceph block device (i.e., ``librbd``) cannot
take advantage of the Linux page cache, so it includes its own in-memory
caching, called "RBD caching." RBD caching behaves just like well-behaved hard
disk caching.  When the OS sends a barrier or a flush request, all dirty data is
written to the OSDs. This means that using write-back caching is just as safe as
using a well-behaved physical hard disk with a VM that properly sends flushes
(i.e., Linux kernel >= 2.6.32). The cache uses a Least Recently Used (LRU)
algorithm, and in write-back mode it can coalesce contiguous requests for
better throughput.

The librbd cache is enabled by default and supports three different cache
policies: write-around, write-back, and write-through. Under both the
write-around and write-back policies, writes return immediately unless more
than ``rbd_cache_max_dirty`` bytes have not yet been written to the storage
cluster. Unlike the write-back policy, the write-around policy does not attempt
to service read requests from the cache, and is therefore faster for
high-performance write workloads. Under the write-through policy, writes return
only when the data is on disk on all replicas, but reads may come from the
cache.

Until it receives the first flush request, the cache behaves like a
write-through cache. This ensures crash-consistent behavior for older operating
systems that do not send flushes.

If the librbd cache is disabled, writes and
reads go directly to the storage cluster, and writes return only when the data
is on disk on all replicas.

.. note::
   The cache is in memory on the client, and each RBD image has
   its own cache. Because the cache is local to the client, there is no
   coherency if other clients access the same image. Running GFS or OCFS on
   top of RBD will not work with caching enabled.


Option settings for RBD should be set in the ``[client]``
section of your configuration file or the central config store. These settings
include:

``rbd_cache``

:Description: Enable caching for RADOS Block Device (RBD).
:Type: Boolean
:Required: No
:Default: ``true``


``rbd_cache_policy``

:Description: Select the caching policy for librbd.
:Type: Enum
:Required: No
:Default: ``writearound``
:Values: ``writearound``, ``writeback``, ``writethrough``


``rbd_cache_writethrough_until_flush``

:Description: Start out in ``writethrough`` mode, and switch to ``writeback``
              after the first flush request is received. Enabling is a
              conservative but safe strategy in case VMs running on RBD volumes
              are too old to send flushes, like the ``virtio`` driver in Linux
              kernels older than 2.6.32.
:Type: Boolean
:Required: No
:Default: ``true``


``rbd_cache_size``

:Description: The per-volume RBD client cache size in bytes.
:Type: 64-bit Integer
:Required: No
:Default: ``32 MiB``
:Policies: write-back and write-through


``rbd_cache_max_dirty``

:Description: The ``dirty`` limit in bytes at which the cache triggers write-back. If ``0``, the cache uses write-through caching.
:Type: 64-bit Integer
:Required: No
:Constraint: Must be less than ``rbd_cache_size``.
:Default: ``24 MiB``
:Policies: write-around and write-back


``rbd_cache_target_dirty``

:Description: The ``dirty target`` in bytes at which the cache begins writing data to the storage cluster. Does not block writes to the cache.
:Type: 64-bit Integer
:Required: No
:Constraint: Must be less than ``rbd_cache_max_dirty``.
:Default: ``16 MiB``
:Policies: write-back


``rbd_cache_max_dirty_age``

:Description: The number of seconds that dirty data stays in the cache before write-back starts.
:Type: Float
:Required: No
:Default: ``1.0``
:Policies: write-back
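
The ``Constraint`` fields above chain together: ``rbd_cache_target_dirty`` must
be less than ``rbd_cache_max_dirty``, which must be less than
``rbd_cache_size``. The following Python sketch checks a set of values against
those constraints before they are written to a configuration file.
``check_cache_sizes`` is a hypothetical helper, not part of Ceph. Skipping the
target check when ``rbd_cache_max_dirty`` is ``0`` (write-through) is an
assumption made here, not something this document states.

.. code-block:: python

   MIB = 1024 * 1024


   def check_cache_sizes(cache_size=32 * MIB,
                         max_dirty=24 * MIB,
                         target_dirty=16 * MIB):
       """Return a list of constraint violations (empty if all are met)."""
       problems = []
       if max_dirty >= cache_size:
           problems.append(
               "rbd_cache_max_dirty must be less than rbd_cache_size")
       # Assumption: with max_dirty == 0 the cache is write-through, so the
       # dirty target is treated as irrelevant and is not checked.
       if max_dirty > 0 and target_dirty >= max_dirty:
           problems.append(
               "rbd_cache_target_dirty must be less than rbd_cache_max_dirty")
       return problems


   if __name__ == "__main__":
       # The defaults from this section satisfy both constraints.
       print(check_cache_sizes())
       # Raising max_dirty to the full cache size violates the first one.
       print(check_cache_sizes(max_dirty=32 * MIB))

The default arguments mirror the defaults listed in this section (32 MiB,
24 MiB, and 16 MiB).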


.. _Block Device: ../../rbd


Read-ahead Settings
=======================

librbd supports read-ahead/prefetching to optimize small, sequential reads.
This should normally be handled by the guest OS in the case of a VM,
but boot loaders may not issue efficient reads. Read-ahead is automatically
disabled if caching is disabled or if the policy is write-around.


``rbd_readahead_trigger_requests``

:Description: Number of sequential read requests necessary to trigger read-ahead.
:Type: Integer
:Required: No
:Default: ``10``


``rbd_readahead_max_bytes``

:Description: Maximum size of a read-ahead request.  If zero, read-ahead is disabled.
:Type: 64-bit Integer
:Required: No
:Default: ``512 KiB``


``rbd_readahead_disable_after_bytes``

:Description: After this many bytes have been read from an RBD image, read-ahead
              is disabled for that image until it is closed.  This allows the
              guest OS to take over read-ahead once it is booted.  If zero,
              read-ahead stays enabled.
:Type: 64-bit Integer
:Required: No
:Default: ``50 MiB``
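
To show how the three read-ahead settings interact, here is a toy model in
Python. It is a sketch based only on the descriptions above and does not
reproduce the librbd implementation. ``ReadaheadModel`` and ``on_read`` are
hypothetical names, and the model always prefetches
``rbd_readahead_max_bytes``, which simplifies whatever request sizing librbd
actually performs.

.. code-block:: python

   class ReadaheadModel:
       """Toy model of the read-ahead settings for a single open image."""

       def __init__(self, trigger_requests=10, max_bytes=512 * 1024,
                    disable_after_bytes=50 * 1024 * 1024):
           self.trigger_requests = trigger_requests
           self.max_bytes = max_bytes
           self.disable_after_bytes = disable_after_bytes
           self.sequential = 0      # length of the current sequential run
           self.next_offset = None  # where a sequential read would start
           self.total_read = 0      # bytes read since the image was opened

       def on_read(self, offset, length):
           """Record a read; return how many bytes to prefetch (0 = none)."""
           self.total_read += length
           if offset == self.next_offset:
               self.sequential += 1
           else:
               self.sequential = 1  # a non-sequential read restarts the run
           self.next_offset = offset + length

           if self.max_bytes == 0:
               return 0  # read-ahead disabled outright
           if (self.disable_after_bytes
                   and self.total_read >= self.disable_after_bytes):
               return 0  # the guest OS is expected to take over
           if self.sequential < self.trigger_requests:
               return 0  # not enough sequential requests yet
           return self.max_bytes


   if __name__ == "__main__":
       model = ReadaheadModel(trigger_requests=3, max_bytes=4096,
                              disable_after_bytes=0)
       for offset in (0, 512, 1024):
           print(offset, model.on_read(offset, 512))

Because ``total_read`` only grows, read-ahead stays off in this model once the
``rbd_readahead_disable_after_bytes`` threshold is crossed, matching the
"until it is closed" wording above.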


Image Features
==============

RBD supports advanced features, which can be specified on the command line when
creating images. The default features can be configured via
``rbd_default_features = <sum of feature numeric values>`` or
``rbd_default_features = <comma-delimited list of CLI values>``.

``Layering``

:Description: Layering enables cloning.
:Internal value: 1
:CLI value: layering
:Added in: v0.52 (Bobtail)
:KRBD support: since v3.10
:Default: yes

``Striping v2``

:Description: Striping spreads data across multiple objects. Striping helps with
              parallelism for sequential read/write workloads.
:Internal value: 2
:CLI value: striping
:Added in: v0.55 (Bobtail)
:KRBD support: since v3.10 (default striping only, "fancy" striping added in v4.17)
:Default: yes

``Exclusive locking``

:Description: When enabled, it requires a client to acquire a lock on an object
              before making a write. Exclusive lock should only be enabled when
              a single client is accessing an image at any given time.
:Internal value: 4
:CLI value: exclusive-lock
:Added in: v0.92 (Hammer)
:KRBD support: since v4.9
:Default: yes

``Object map``

:Description: Object map support depends on exclusive lock support. Block
              devices are thin provisioned, which means that they only store
              data that actually has been written, i.e., they are *sparse*.
              Object map support helps track which objects actually exist (have
              data stored on a device). Enabling object map support speeds up
              I/O operations for cloning, importing and exporting a sparsely
              populated image, and deleting.
:Internal value: 8
:CLI value: object-map
:Added in: v0.93 (Hammer)
:KRBD support: since v5.3
:Default: yes


``Fast-diff``

:Description: Fast-diff support depends on object map support and exclusive lock
              support. It adds another property to the object map, which makes
              it much faster to generate diffs between snapshots of an image.
              It is also much faster to calculate the actual data usage of a
              snapshot or volume (``rbd du``).
:Internal value: 16
:CLI value: fast-diff
:Added in: v9.0.1 (Infernalis)
:KRBD support: since v5.3
:Default: yes


``Deep-flatten``

:Description: Deep-flatten enables ``rbd flatten`` to work on all snapshots of
              an image, in addition to the image itself. Without it, snapshots
              of an image will still rely on the parent, so the parent cannot be
              deleted until the snapshots are first deleted. Deep-flatten makes
              a parent independent of its clones, even if they have snapshots,
              at the expense of using additional OSD device space.
:Internal value: 32
:CLI value: deep-flatten
:Added in: v9.0.2 (Infernalis)
:KRBD support: since v5.1
:Default: yes


``Journaling``

:Description: Journaling support depends on exclusive lock support. Journaling
              records all modifications to an image in the order they occur. RBD
              mirroring can utilize the journal to replicate a crash-consistent
              image to a remote cluster.  It is best to let ``rbd-mirror``
              manage this feature only as needed, as enabling it long term may
              result in substantial additional OSD space consumption.
:Internal value: 64
:CLI value: journaling
:Added in: v10.0.1 (Jewel)
:KRBD support: no
:Default: no


``Data pool``

:Description: On erasure-coded pools, the image data block objects need to be stored on a separate pool from the image metadata.
:Internal value: 128
:Added in: v11.1.0 (Kraken)
:KRBD support: since v4.11
:Default: no


``Operations``

:Description: Used to restrict older clients from performing certain maintenance operations against an image (e.g. clone, snap create).
:Internal value: 256
:Added in: v13.0.2 (Mimic)
:KRBD support: since v4.16


``Migrating``

:Description: Used to restrict older clients from opening an image when it is in migration state.
:Internal value: 512
:Added in: v14.0.1 (Nautilus)
:KRBD support: no

``Non-primary``

:Description: Used to restrict changes to non-primary images using snapshot-based mirroring.
:Internal value: 1024
:Added in: v15.2.0 (Octopus)
:KRBD support: no
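
Because ``rbd_default_features`` accepts either a sum of internal values or a
list of CLI values, it can help to convert between the two. The Python sketch
below uses only the internal values, CLI values, and dependencies listed in
this section. ``features_to_mask`` and ``mask_to_features`` are hypothetical
helpers, not part of Ceph, and features without a CLI value in this section
are omitted.

.. code-block:: python

   # CLI value -> internal value, as listed in this section.
   FEATURES = {
       "layering": 1,
       "striping": 2,
       "exclusive-lock": 4,
       "object-map": 8,
       "fast-diff": 16,
       "deep-flatten": 32,
       "journaling": 64,
   }

   # Dependencies stated in the feature descriptions above.
   REQUIRES = {
       "object-map": {"exclusive-lock"},
       "fast-diff": {"object-map", "exclusive-lock"},
       "journaling": {"exclusive-lock"},
   }


   def features_to_mask(names):
       """Sum the internal values of the named features."""
       selected = set(names)
       unknown = selected - set(FEATURES)
       if unknown:
           raise ValueError(f"unknown features: {sorted(unknown)}")
       for name in sorted(selected):
           missing = REQUIRES.get(name, set()) - selected
           if missing:
               raise ValueError(f"{name} requires {sorted(missing)}")
       return sum(FEATURES[name] for name in selected)


   def mask_to_features(mask):
       """List the CLI values whose bits are set in ``mask``."""
       return [name for name, bit in FEATURES.items() if mask & bit]


   if __name__ == "__main__":
       names = ["layering", "exclusive-lock", "object-map"]
       mask = features_to_mask(names)
       print(f"rbd_default_features = {mask}")
       print(f"rbd_default_features = {','.join(mask_to_features(mask))}")

Both printed lines describe the same feature set, once as a sum and once as a
comma-delimited list.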


QoS Settings
============

librbd supports limiting per-image IO, controlled by the following
settings.

``rbd_qos_iops_limit``

:Description: The desired limit of IO operations per second.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd_qos_bps_limit``

:Description: The desired limit of IO bytes per second.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd_qos_read_iops_limit``

:Description: The desired limit of read operations per second.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd_qos_write_iops_limit``

:Description: The desired limit of write operations per second.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd_qos_read_bps_limit``

:Description: The desired limit of read bytes per second.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd_qos_write_bps_limit``

:Description: The desired limit of write bytes per second.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd_qos_iops_burst``

:Description: The desired burst limit of IO operations.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd_qos_bps_burst``

:Description: The desired burst limit of IO bytes.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd_qos_read_iops_burst``

:Description: The desired burst limit of read operations.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd_qos_write_iops_burst``

:Description: The desired burst limit of write operations.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd_qos_read_bps_burst``

:Description: The desired burst limit of read bytes.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd_qos_write_bps_burst``

:Description: The desired burst limit of write bytes.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd_qos_iops_burst_seconds``

:Description: The desired burst duration in seconds of IO operations.
:Type: Unsigned Integer
:Required: No
:Default: ``1``


``rbd_qos_bps_burst_seconds``

:Description: The desired burst duration in seconds of IO bytes.
:Type: Unsigned Integer
:Required: No
:Default: ``1``


``rbd_qos_read_iops_burst_seconds``

:Description: The desired burst duration in seconds of read operations.
:Type: Unsigned Integer
:Required: No
:Default: ``1``


``rbd_qos_write_iops_burst_seconds``

:Description: The desired burst duration in seconds of write operations.
:Type: Unsigned Integer
:Required: No
:Default: ``1``


``rbd_qos_read_bps_burst_seconds``

:Description: The desired burst duration in seconds of read bytes.
:Type: Unsigned Integer
:Required: No
:Default: ``1``


``rbd_qos_write_bps_burst_seconds``

:Description: The desired burst duration in seconds of write bytes.
:Type: Unsigned Integer
:Required: No
:Default: ``1``


``rbd_qos_schedule_tick_min``

:Description: The minimum schedule tick (in milliseconds) for QoS.
:Type: Unsigned Integer
:Required: No
:Default: ``50``
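
The relationship between a ``*_limit`` setting and its ``*_burst`` counterpart
can be pictured with a generic token bucket, sketched below in Python. This is
an illustration of the general technique, not a description of how librbd
enforces these settings. ``TokenBucket`` is a hypothetical class. Treating
``0`` as "no limit" and using the larger of the limit and the burst as the
bucket capacity are assumptions made for this sketch. The ``*_burst_seconds``
and ``rbd_qos_schedule_tick_min`` settings are not modeled.

.. code-block:: python

   class TokenBucket:
       """Generic token bucket: refills at ``limit`` tokens per second."""

       def __init__(self, limit, burst=0):
           self.limit = limit
           # Assumption: the bucket holds at most max(limit, burst) tokens.
           self.capacity = max(limit, burst)
           self.tokens = float(self.capacity)

       def advance(self, seconds):
           """Refill the bucket for ``seconds`` of elapsed time."""
           self.tokens = min(self.capacity,
                             self.tokens + self.limit * seconds)

       def try_consume(self, amount=1):
           """Take ``amount`` tokens if available; return True on success."""
           if self.limit == 0:
               return True  # assumption: 0 means throttling is disabled
           if self.tokens >= amount:
               self.tokens -= amount
               return True
           return False


   if __name__ == "__main__":
       # Modeled on rbd_qos_iops_limit = 100 and rbd_qos_iops_burst = 150.
       bucket = TokenBucket(limit=100, burst=150)
       print(bucket.try_consume(150))  # the full burst fits
       print(bucket.try_consume(1))    # the bucket is now empty
       bucket.advance(0.5)             # half a second refills 50 tokens
       print(bucket.try_consume(50))

The same shape applies to the byte-based settings if each token stands for one
byte instead of one operation.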