author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-07 18:45:59 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-07 18:45:59 +0000 |
commit | 19fcec84d8d7d21e796c7624e521b60d28ee21ed (patch) | |
tree | 42d26aa27d1e3f7c0b8bd3fd14e7d7082f5008dc /doc/ceph-volume/simple | |
parent | Initial commit. (diff) | |
Adding upstream version 16.2.11+ds. (refs: upstream/16.2.11+ds, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/ceph-volume/simple')
-rw-r--r-- | doc/ceph-volume/simple/activate.rst | 80 |
-rw-r--r-- | doc/ceph-volume/simple/index.rst | 32 |
-rw-r--r-- | doc/ceph-volume/simple/scan.rst | 176 |
-rw-r--r-- | doc/ceph-volume/simple/systemd.rst | 28 |
4 files changed, 316 insertions, 0 deletions
diff --git a/doc/ceph-volume/simple/activate.rst b/doc/ceph-volume/simple/activate.rst
new file mode 100644
index 000000000..2b2795d0b
--- /dev/null
+++ b/doc/ceph-volume/simple/activate.rst
@@ -0,0 +1,80 @@
+.. _ceph-volume-simple-activate:
+
+``activate``
+============
+Once :ref:`ceph-volume-simple-scan` has been completed, and all the metadata
+captured for an OSD has been persisted to ``/etc/ceph/osd/{id}-{uuid}.json``,
+the OSD is ready to be "activated".
+
+This activation process **disables** all ``ceph-disk`` systemd units by masking
+them, to prevent the UDEV/ceph-disk interaction that will attempt to start them
+up at boot time.
+
+The disabling of ``ceph-disk`` units is done only when calling ``ceph-volume
+simple activate`` directly, but is avoided when being called by systemd when
+the system is booting up.
+
+The activation process requires both the :term:`OSD id` and :term:`OSD uuid`.
+To activate parsed OSDs::
+
+    ceph-volume simple activate 0 6cc43680-4f6e-4feb-92ff-9c7ba204120e
+
+The above command will assume that a JSON configuration will be found in::
+
+    /etc/ceph/osd/0-6cc43680-4f6e-4feb-92ff-9c7ba204120e.json
+
+Alternatively, using a path to a JSON file directly is also possible::
+
+    ceph-volume simple activate --file /etc/ceph/osd/0-6cc43680-4f6e-4feb-92ff-9c7ba204120e.json
+
+requiring uuids
+^^^^^^^^^^^^^^^
+The :term:`OSD uuid` is required as an extra step to ensure that the
+right OSD is being activated. It is entirely possible that a previous OSD with
+the same id exists and would end up activating the incorrect one.
+
+
+Discovery
+---------
+With OSDs previously scanned by ``ceph-volume``, a *discovery* process is
+performed using ``blkid`` and ``lvm``. There is currently support only for
+devices with GPT partitions and LVM logical volumes.
+
+The GPT partitions will have a ``PARTUUID`` that can be queried by calling out
+to ``blkid``, and the logical volumes will have a ``lv_uuid`` that can be
+queried against ``lvs`` (the LVM tool to list logical volumes).
+
+This discovery process ensures that devices can be correctly detected even if
+they are repurposed into another system or if their name changes (as in the
+case of non-persisting names like ``/dev/sda1``).
+
+The JSON configuration file used to map what devices go to what OSD will then
+coordinate the mounting and symlinking as part of activation.
+
+To ensure that the symlinks are always correct, if they exist in the OSD
+directory, the symlinks will be re-done.
+
+A systemd unit will capture the :term:`OSD id` and :term:`OSD uuid` and
+persist them. Internally, the activation will enable it like::
+
+    systemctl enable ceph-volume@simple-$id-$uuid
+
+For example::
+
+    systemctl enable ceph-volume@simple-0-8715BEB4-15C5-49DE-BA6F-401086EC7B41
+
+Would start the discovery process for the OSD with an id of ``0`` and a UUID of
+``8715BEB4-15C5-49DE-BA6F-401086EC7B41``.
+
+
+The systemd process will call out to activate passing the information needed to
+identify the OSD and its devices, and it will proceed to:
+
+#. mount the device in the corresponding location (by convention this is
+   ``/var/lib/ceph/osd/<cluster name>-<osd id>/``)
+
+#. ensure that all required devices are ready for that OSD and properly linked,
+   regardless of objectstore used (filestore or bluestore). The symbolic link will
+   **always** be re-done to ensure that the correct device is linked.
+
+#. start the ``ceph-osd@0`` systemd unit
diff --git a/doc/ceph-volume/simple/index.rst b/doc/ceph-volume/simple/index.rst
new file mode 100644
index 000000000..315dea99a
--- /dev/null
+++ b/doc/ceph-volume/simple/index.rst
@@ -0,0 +1,32 @@
+.. _ceph-volume-simple:
+
+``simple``
+==========
+Implements the functionality needed to manage OSDs from the ``simple`` subcommand:
+``ceph-volume simple``
+
+**Command Line Subcommands**
+
+* :ref:`ceph-volume-simple-scan`
+
+* :ref:`ceph-volume-simple-activate`
+
+* :ref:`ceph-volume-simple-systemd`
+
+
+By *taking over* management, it disables all ``ceph-disk`` systemd units used
+to trigger devices at startup, relying on basic (customizable) JSON
+configuration and systemd for starting up OSDs.
+
+This process involves two steps:
+
+#. :ref:`Scan <ceph-volume-simple-scan>` the running OSD or the data device
+#. :ref:`Activate <ceph-volume-simple-activate>` the scanned OSD
+
+The scanning will infer everything that ``ceph-volume`` needs to start the OSD,
+so that when activation is needed, the OSD can start normally without getting
+interference from ``ceph-disk``.
+
+As part of the activation process, the systemd units for ``ceph-disk`` in charge
+of reacting to ``udev`` events are linked to ``/dev/null`` so that they are
+fully inactive.
diff --git a/doc/ceph-volume/simple/scan.rst b/doc/ceph-volume/simple/scan.rst
new file mode 100644
index 000000000..2749b14b6
--- /dev/null
+++ b/doc/ceph-volume/simple/scan.rst
@@ -0,0 +1,176 @@
+.. _ceph-volume-simple-scan:
+
+``scan``
+========
+Scanning captures any important details from an already-deployed OSD
+so that ``ceph-volume`` can manage it without the need of any other startup
+workflows or tools (like ``udev`` or ``ceph-disk``). Encryption with LUKS or
+PLAIN formats is fully supported.
+
+The command has the ability to inspect a running OSD, by inspecting the
+directory where the OSD data is stored, or by consuming the data partition.
+The command can also scan all running OSDs if no path or device is provided.
+
+Once scanned, the metadata will (by default) be persisted as JSON in
+a file in ``/etc/ceph/osd``. This ``JSON`` file will use the naming convention
+of: ``{OSD ID}-{OSD FSID}.json``. For an OSD with an id of 1 and an FSID like
+``86ebd829-1405-43d3-8fd6-4cbc9b6ecf96``, the absolute path of the file would
+be::
+
+    /etc/ceph/osd/1-86ebd829-1405-43d3-8fd6-4cbc9b6ecf96.json
+
+The ``scan`` subcommand will refuse to write to this file if it already exists.
+If overwriting the contents is needed, the ``--force`` flag must be used::
+
+    ceph-volume simple scan --force {path}
+
+If there is no need to persist the ``JSON`` metadata, there is support to send
+the contents to ``stdout`` (no file will be written)::
+
+    ceph-volume simple scan --stdout {path}
+
+
+.. _ceph-volume-simple-scan-directory:
+
+Running OSDs scan
+-----------------
+Using this command without providing an OSD directory or device will scan the
+directories of any currently running OSDs. If a running OSD was not created
+by ceph-disk, it will be ignored and not scanned.
+
+To scan all running ceph-disk OSDs, the command would look like::
+
+    ceph-volume simple scan
+
+Directory scan
+--------------
+The directory scan will capture OSD file contents from interesting files. There
+are a few files that must exist in order to have a successful scan:
+
+* ``ceph_fsid``
+* ``fsid``
+* ``keyring``
+* ``ready``
+* ``type``
+* ``whoami``
+
+If the OSD is encrypted, it will additionally add the following keys:
+
+* ``encrypted``
+* ``encryption_type``
+* ``lockbox_keyring``
+
+In the case of any other file, as long as it is not a binary or a directory, it
+will also get captured and persisted as part of the JSON object.
+
+The convention for the keys in the JSON object is that any file name will be
+a key, and its contents will be its value. If the contents are a single line
+(like in the case of the ``whoami``) the contents are trimmed, and the newline
+is dropped. For example, for an OSD with an id of 1, this is what the JSON entry
+would look like::
+
+    "whoami": "1",
+
+For files that may have more than one line, the contents are left as-is, except
+for keyrings which are treated specially and parsed to extract the keyring. For
+example, a ``keyring`` that gets read as::
+
+    [osd.1]\n\tkey = AQBBJ/dZp57NIBAAtnuQS9WOS0hnLVe0rZnE6Q==\n
+
+Would get stored as::
+
+    "keyring": "AQBBJ/dZp57NIBAAtnuQS9WOS0hnLVe0rZnE6Q==",
+
+
+For a directory like ``/var/lib/ceph/osd/ceph-1``, the command could look
+like::
+
+    ceph-volume simple scan /var/lib/ceph/osd/ceph-1
+
+
+.. _ceph-volume-simple-scan-device:
+
+Device scan
+-----------
+When an OSD directory is not available (OSD is not running, or device is not
+mounted) the ``scan`` command is able to introspect the device to capture
+required data. Just like :ref:`ceph-volume-simple-scan-directory`, it would
+still require a few files present. This means that the device to be scanned
+**must be** the data partition of the OSD.
+
+As long as the data partition of the OSD is being passed in as an argument, the
+sub-command can scan its contents.
+
+In the case where the device is already mounted, the tool can detect this
+scenario and capture file contents from that directory.
+
+If the device is not mounted, a temporary directory will be created, and the
+device will be mounted temporarily just for scanning the contents. Once
+contents are scanned, the device will be unmounted.
+
+For a device like ``/dev/sda1`` which **must** be a data partition, the command
+could look like::
+
+    ceph-volume simple scan /dev/sda1
+
+
+.. _ceph-volume-simple-scan-json:
+
+``JSON`` contents
+-----------------
+The contents of the JSON object are very simple. The scan not only will persist
+information from the special OSD files and their contents, but will also
+validate paths and device UUIDs. Unlike what ``ceph-disk`` would do, by storing
+them in ``{device type}_uuid`` files, the tool will persist them as part of the
+device type key.
+
+For example, a ``block.db`` device would look something like::
+
+    "block.db": {
+        "path": "/dev/disk/by-partuuid/6cc43680-4f6e-4feb-92ff-9c7ba204120e",
+        "uuid": "6cc43680-4f6e-4feb-92ff-9c7ba204120e"
+    },
+
+But it will also persist the ``ceph-disk`` special file generated, like so::
+
+    "block.db_uuid": "6cc43680-4f6e-4feb-92ff-9c7ba204120e",
+
+This duplication is in place because the tool is trying to ensure the
+following:
+
+#. Support OSDs that may not have ceph-disk special files
+#. Check the most up-to-date information on the device, by querying against LVM
+   and ``blkid``
+#. Support both logical volumes and GPT devices
+
+This is a sample of the ``JSON`` metadata, from an OSD that is using ``bluestore``::
+
+    {
+        "active": "ok",
+        "block": {
+            "path": "/dev/disk/by-partuuid/40fd0a64-caa5-43a3-9717-1836ac661a12",
+            "uuid": "40fd0a64-caa5-43a3-9717-1836ac661a12"
+        },
+        "block.db": {
+            "path": "/dev/disk/by-partuuid/6cc43680-4f6e-4feb-92ff-9c7ba204120e",
+            "uuid": "6cc43680-4f6e-4feb-92ff-9c7ba204120e"
+        },
+        "block.db_uuid": "6cc43680-4f6e-4feb-92ff-9c7ba204120e",
+        "block_uuid": "40fd0a64-caa5-43a3-9717-1836ac661a12",
+        "bluefs": "1",
+        "ceph_fsid": "c92fc9eb-0610-4363-aafc-81ddf70aaf1b",
+        "cluster_name": "ceph",
+        "data": {
+            "path": "/dev/sdr1",
+            "uuid": "86ebd829-1405-43d3-8fd6-4cbc9b6ecf96"
+        },
+        "fsid": "86ebd829-1405-43d3-8fd6-4cbc9b6ecf96",
+        "keyring": "AQBBJ/dZp57NIBAAtnuQS9WOS0hnLVe0rZnE6Q==",
+        "kv_backend": "rocksdb",
+        "magic": "ceph osd volume v026",
+        "mkfs_done": "yes",
+        "ready": "ready",
+        "systemd": "",
+        "type": "bluestore",
+        "whoami": "3"
+    }
diff --git a/doc/ceph-volume/simple/systemd.rst b/doc/ceph-volume/simple/systemd.rst
new file mode 100644
index 000000000..aa5bebffe
--- /dev/null
+++ b/doc/ceph-volume/simple/systemd.rst
@@ -0,0 +1,28 @@
+.. _ceph-volume-simple-systemd:
+
+systemd
+=======
+Upon startup, it will identify the logical volume by loading the JSON file in
+``/etc/ceph/osd/{id}-{uuid}.json`` corresponding to the instance name of the
+systemd unit.
+
+After identifying the correct volume, it will then proceed to mount it by using
+the OSD destination conventions, that is::
+
+    /var/lib/ceph/osd/{cluster name}-{osd id}
+
+For our example OSD with an id of ``0``, that means the identified device will
+be mounted at::
+
+
+    /var/lib/ceph/osd/ceph-0
+
+
+Once that process is complete, a call will be made to start the OSD::
+
+    systemctl start ceph-osd@0
+
+The systemd portion of this process is handled by the ``ceph-volume simple
+trigger`` sub-command, which is only in charge of parsing metadata coming from
+systemd and startup, and then dispatching to ``ceph-volume simple activate`` which
+would proceed with activation.
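
The directory-scan conventions documented in ``scan.rst`` above (file name becomes the JSON key, single-line contents are trimmed, keyrings are reduced to the key itself, binaries and directories are skipped) can be illustrated with a small Python sketch. This is not the ``ceph-volume`` implementation: the helper names ``parse_keyring``, ``scan_directory`` and ``json_path`` are hypothetical, treating a file that fails UTF-8 decoding as "binary" is an assumption, and so is applying the keyring rule to every file whose name ends in ``keyring``.

```python
import json
import os
import tempfile


def parse_keyring(contents):
    """Return the value of the first ``key = ...`` line (hypothetical helper)."""
    for line in contents.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == "key":
            return value.strip()
    # No ``key`` line found: fall back to the trimmed text (assumption).
    return contents.strip()


def scan_directory(path):
    """Map each readable text file in ``path`` to a JSON-ready key/value pair."""
    metadata = {}
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if not os.path.isfile(full):
            continue  # directories are not captured
        try:
            with open(full, encoding="utf-8") as handle:
                contents = handle.read()
        except UnicodeDecodeError:
            continue  # assumption: undecodable files count as binary
        if name.endswith("keyring"):
            metadata[name] = parse_keyring(contents)
        elif len(contents.splitlines()) <= 1:
            metadata[name] = contents.strip()  # single line: drop the newline
        else:
            metadata[name] = contents  # multi-line contents are left as-is
    return metadata


def json_path(osd_id, osd_fsid, base="/etc/ceph/osd"):
    """Build the ``{OSD ID}-{OSD FSID}.json`` path described in scan.rst."""
    return f"{base}/{osd_id}-{osd_fsid}.json"


if __name__ == "__main__":
    # Build a throwaway directory that mimics a few OSD files.
    with tempfile.TemporaryDirectory() as osd_dir:
        samples = {
            "whoami": "1\n",
            "type": "bluestore\n",
            "keyring": "[osd.1]\n\tkey = AQBBJ/dZp57NIBAAtnuQS9WOS0hnLVe0rZnE6Q==\n",
        }
        for name, text in samples.items():
            with open(os.path.join(osd_dir, name), "w", encoding="utf-8") as handle:
                handle.write(text)
        metadata = scan_directory(osd_dir)
        print(json.dumps(metadata, indent=4, sort_keys=True))
        print(json_path(metadata["whoami"], "86ebd829-1405-43d3-8fd6-4cbc9b6ecf96"))
```

The sketch only covers the file-to-key mapping; the device entries such as ``block.db`` with their ``path`` and ``uuid`` values come from ``blkid`` and LVM queries in the real tool and are deliberately left out here.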