summaryrefslogtreecommitdiffstats
path: root/doc/ceph-volume/simple
diff options
context:
space:
mode:
authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-07 18:45:59 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-07 18:45:59 +0000
commit19fcec84d8d7d21e796c7624e521b60d28ee21ed (patch)
tree42d26aa27d1e3f7c0b8bd3fd14e7d7082f5008dc /doc/ceph-volume/simple
parentInitial commit. (diff)
downloadceph-19fcec84d8d7d21e796c7624e521b60d28ee21ed.tar.xz
ceph-19fcec84d8d7d21e796c7624e521b60d28ee21ed.zip
Adding upstream version 16.2.11+ds.upstream/16.2.11+dsupstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/ceph-volume/simple')
-rw-r--r--doc/ceph-volume/simple/activate.rst80
-rw-r--r--doc/ceph-volume/simple/index.rst32
-rw-r--r--doc/ceph-volume/simple/scan.rst176
-rw-r--r--doc/ceph-volume/simple/systemd.rst28
4 files changed, 316 insertions, 0 deletions
diff --git a/doc/ceph-volume/simple/activate.rst b/doc/ceph-volume/simple/activate.rst
new file mode 100644
index 000000000..2b2795d0b
--- /dev/null
+++ b/doc/ceph-volume/simple/activate.rst
@@ -0,0 +1,80 @@
+.. _ceph-volume-simple-activate:
+
+``activate``
+============
+Once :ref:`ceph-volume-simple-scan` has been completed, and all the metadata
+captured for an OSD has been persisted to ``/etc/ceph/osd/{id}-{uuid}.json``
+the OSD is now ready to get "activated".
+
+This activation process **disables** all ``ceph-disk`` systemd units by masking
+them, to prevent the UDEV/ceph-disk interaction that will attempt to start them
+up at boot time.
+
+The disabling of ``ceph-disk`` units is done only when calling ``ceph-volume
+simple activate`` directly, but is avoided when being called by systemd when
+the system is booting up.
+
+The activation process requires using both the :term:`OSD id` and :term:`OSD uuid`
+To activate parsed OSDs::
+
+ ceph-volume simple activate 0 6cc43680-4f6e-4feb-92ff-9c7ba204120e
+
+The above command will assume that a JSON configuration will be found in::
+
+ /etc/ceph/osd/0-6cc43680-4f6e-4feb-92ff-9c7ba204120e.json
+
+Alternatively, using a path to a JSON file directly is also possible::
+
+ ceph-volume simple activate --file /etc/ceph/osd/0-6cc43680-4f6e-4feb-92ff-9c7ba204120e.json
+
+requiring uuids
+^^^^^^^^^^^^^^^
+The :term:`OSD uuid` is being required as an extra step to ensure that the
+right OSD is being activated. It is entirely possible that a previous OSD with
+the same id exists and would end up activating the incorrect one.
+
+
+Discovery
+---------
+With OSDs previously scanned by ``ceph-volume``, a *discovery* process is
+performed using ``blkid`` and ``lvm``. There is currently support only for
+devices with GPT partitions and LVM logical volumes.
+
+The GPT partitions will have a ``PARTUUID`` that can be queried by calling out
+to ``blkid``, and the logical volumes will have a ``lv_uuid`` that can be
+queried against ``lvs`` (the LVM tool to list logical volumes).
+
+This discovery process ensures that devices can be correctly detected even if
+they are repurposed into another system or if their name changes (as in the
+case of non-persisting names like ``/dev/sda1``)
+
+The JSON configuration file used to map what devices go to what OSD will then
+coordinate the mounting and symlinking as part of activation.
+
+To ensure that the symlinks are always correct, if they exist in the OSD
+directory, the symlinks will be re-done.
+
+A systemd unit will capture the :term:`OSD id` and :term:`OSD uuid` and
+persist it. Internally, the activation will enable it like::
+
+ systemctl enable ceph-volume@simple-$id-$uuid
+
+For example::
+
+ systemctl enable ceph-volume@simple-0-8715BEB4-15C5-49DE-BA6F-401086EC7B41
+
+Would start the discovery process for the OSD with an id of ``0`` and a UUID of
+``8715BEB4-15C5-49DE-BA6F-401086EC7B41``.
+
+
+The systemd process will call out to activate passing the information needed to
+identify the OSD and its devices, and it will proceed to:
+
+# mount the device in the corresponding location (by convention this is
+ ``/var/lib/ceph/osd/<cluster name>-<osd id>/``)
+
+# ensure that all required devices are ready for that OSD and properly linked,
+regardless of objectstore used (filestore or bluestore). The symbolic link will
+**always** be re-done to ensure that the correct device is linked.
+
+# start the ``ceph-osd@0`` systemd unit
diff --git a/doc/ceph-volume/simple/index.rst b/doc/ceph-volume/simple/index.rst
new file mode 100644
index 000000000..315dea99a
--- /dev/null
+++ b/doc/ceph-volume/simple/index.rst
@@ -0,0 +1,32 @@
+.. _ceph-volume-simple:
+
+``simple``
+==========
+Implements the functionality needed to manage OSDs from the ``simple`` subcommand:
+``ceph-volume simple``
+
+**Command Line Subcommands**
+
+* :ref:`ceph-volume-simple-scan`
+
+* :ref:`ceph-volume-simple-activate`
+
+* :ref:`ceph-volume-simple-systemd`
+
+
+By *taking over* management, it disables all ``ceph-disk`` systemd units used
+to trigger devices at startup, relying on basic (customizable) JSON
+configuration and systemd for starting up OSDs.
+
+This process involves two steps:
+
+#. :ref:`Scan <ceph-volume-simple-scan>` the running OSD or the data device
+#. :ref:`Activate <ceph-volume-simple-activate>` the scanned OSD
+
+The scanning will infer everything that ``ceph-volume`` needs to start the OSD,
+so that when activation is needed, the OSD can start normally without getting
+interference from ``ceph-disk``.
+
+As part of the activation process the systemd units for ``ceph-disk`` in charge
+of reacting to ``udev`` events, are linked to ``/dev/null`` so that they are
+fully inactive.
diff --git a/doc/ceph-volume/simple/scan.rst b/doc/ceph-volume/simple/scan.rst
new file mode 100644
index 000000000..2749b14b6
--- /dev/null
+++ b/doc/ceph-volume/simple/scan.rst
@@ -0,0 +1,176 @@
+.. _ceph-volume-simple-scan:
+
+``scan``
+========
+Scanning allows to capture any important details from an already-deployed OSD
+so that ``ceph-volume`` can manage it without the need of any other startup
+workflows or tools (like ``udev`` or ``ceph-disk``). Encryption with LUKS or
+PLAIN formats is fully supported.
+
+The command has the ability to inspect a running OSD, by inspecting the
+directory where the OSD data is stored, or by consuming the data partition.
+The command can also scan all running OSDs if no path or device is provided.
+
+Once scanned, information will (by default) persist the metadata as JSON in
+a file in ``/etc/ceph/osd``. This ``JSON`` file will use the naming convention
+of: ``{OSD ID}-{OSD FSID}.json``. An OSD with an id of 1, and an FSID like
+``86ebd829-1405-43d3-8fd6-4cbc9b6ecf96`` the absolute path of the file would
+be::
+
+ /etc/ceph/osd/1-86ebd829-1405-43d3-8fd6-4cbc9b6ecf96.json
+
+The ``scan`` subcommand will refuse to write to this file if it already exists.
+If overwriting the contents is needed, the ``--force`` flag must be used::
+
+ ceph-volume simple scan --force {path}
+
+If there is no need to persist the ``JSON`` metadata, there is support to send
+the contents to ``stdout`` (no file will be written)::
+
+ ceph-volume simple scan --stdout {path}
+
+
+.. _ceph-volume-simple-scan-directory:
+
+Running OSDs scan
+-----------------
+Using this command without providing an OSD directory or device will scan the
+directories of any currently running OSDs. If a running OSD was not created
+by ceph-disk it will be ignored and not scanned.
+
+To scan all running ceph-disk OSDs, the command would look like::
+
+ ceph-volume simple scan
+
+Directory scan
+--------------
+The directory scan will capture OSD file contents from interesting files. There
+are a few files that must exist in order to have a successful scan:
+
+* ``ceph_fsid``
+* ``fsid``
+* ``keyring``
+* ``ready``
+* ``type``
+* ``whoami``
+
+If the OSD is encrypted, it will additionally add the following keys:
+
+* ``encrypted``
+* ``encryption_type``
+* ``lockbox_keyring``
+
+In the case of any other file, as long as it is not a binary or a directory, it
+will also get captured and persisted as part of the JSON object.
+
+The convention for the keys in the JSON object is that any file name will be
+a key, and its contents will be its value. If the contents are a single line
+(like in the case of the ``whoami``) the contents are trimmed, and the newline
+is dropped. For example with an OSD with an id of 1, this is how the JSON entry
+would look like::
+
+ "whoami": "1",
+
+For files that may have more than one line, the contents are left as-is, except
+for keyrings which are treated specially and parsed to extract the keyring. For
+example, a ``keyring`` that gets read as::
+
+ [osd.1]\n\tkey = AQBBJ/dZp57NIBAAtnuQS9WOS0hnLVe0rZnE6Q==\n
+
+Would get stored as::
+
+ "keyring": "AQBBJ/dZp57NIBAAtnuQS9WOS0hnLVe0rZnE6Q==",
+
+
+For a directory like ``/var/lib/ceph/osd/ceph-1``, the command could look
+like::
+
+ ceph-volume simple scan /var/lib/ceph/osd/ceph1
+
+
+.. _ceph-volume-simple-scan-device:
+
+Device scan
+-----------
+When an OSD directory is not available (OSD is not running, or device is not
+mounted) the ``scan`` command is able to introspect the device to capture
+required data. Just like :ref:`ceph-volume-simple-scan-directory`, it would
+still require a few files present. This means that the device to be scanned
+**must be** the data partition of the OSD.
+
+As long as the data partition of the OSD is being passed in as an argument, the
+sub-command can scan its contents.
+
+In the case where the device is already mounted, the tool can detect this
+scenario and capture file contents from that directory.
+
+If the device is not mounted, a temporary directory will be created, and the
+device will be mounted temporarily just for scanning the contents. Once
+contents are scanned, the device will be unmounted.
+
+For a device like ``/dev/sda1`` which **must** be a data partition, the command
+could look like::
+
+ ceph-volume simple scan /dev/sda1
+
+
+.. _ceph-volume-simple-scan-json:
+
+``JSON`` contents
+-----------------
+The contents of the JSON object is very simple. The scan not only will persist
+information from the special OSD files and their contents, but will also
+validate paths and device UUIDs. Unlike what ``ceph-disk`` would do, by storing
+them in ``{device type}_uuid`` files, the tool will persist them as part of the
+device type key.
+
+For example, a ``block.db`` device would look something like::
+
+ "block.db": {
+ "path": "/dev/disk/by-partuuid/6cc43680-4f6e-4feb-92ff-9c7ba204120e",
+ "uuid": "6cc43680-4f6e-4feb-92ff-9c7ba204120e"
+ },
+
+But it will also persist the ``ceph-disk`` special file generated, like so::
+
+ "block.db_uuid": "6cc43680-4f6e-4feb-92ff-9c7ba204120e",
+
+This duplication is in place because the tool is trying to ensure the
+following:
+
+# Support OSDs that may not have ceph-disk special files
+# Check the most up-to-date information on the device, by querying against LVM
+and ``blkid``
+# Support both logical volumes and GPT devices
+
+This is a sample ``JSON`` metadata, from an OSD that is using ``bluestore``::
+
+ {
+ "active": "ok",
+ "block": {
+ "path": "/dev/disk/by-partuuid/40fd0a64-caa5-43a3-9717-1836ac661a12",
+ "uuid": "40fd0a64-caa5-43a3-9717-1836ac661a12"
+ },
+ "block.db": {
+ "path": "/dev/disk/by-partuuid/6cc43680-4f6e-4feb-92ff-9c7ba204120e",
+ "uuid": "6cc43680-4f6e-4feb-92ff-9c7ba204120e"
+ },
+ "block.db_uuid": "6cc43680-4f6e-4feb-92ff-9c7ba204120e",
+ "block_uuid": "40fd0a64-caa5-43a3-9717-1836ac661a12",
+ "bluefs": "1",
+ "ceph_fsid": "c92fc9eb-0610-4363-aafc-81ddf70aaf1b",
+ "cluster_name": "ceph",
+ "data": {
+ "path": "/dev/sdr1",
+ "uuid": "86ebd829-1405-43d3-8fd6-4cbc9b6ecf96"
+ },
+ "fsid": "86ebd829-1405-43d3-8fd6-4cbc9b6ecf96",
+ "keyring": "AQBBJ/dZp57NIBAAtnuQS9WOS0hnLVe0rZnE6Q==",
+ "kv_backend": "rocksdb",
+ "magic": "ceph osd volume v026",
+ "mkfs_done": "yes",
+ "ready": "ready",
+ "systemd": "",
+ "type": "bluestore",
+ "whoami": "3"
+ }
diff --git a/doc/ceph-volume/simple/systemd.rst b/doc/ceph-volume/simple/systemd.rst
new file mode 100644
index 000000000..aa5bebffe
--- /dev/null
+++ b/doc/ceph-volume/simple/systemd.rst
@@ -0,0 +1,28 @@
+.. _ceph-volume-simple-systemd:
+
+systemd
+=======
+Upon startup, it will identify the logical volume by loading the JSON file in
+``/etc/ceph/osd/{id}-{uuid}.json`` corresponding to the instance name of the
+systemd unit.
+
+After identifying the correct volume it will then proceed to mount it by using
+the OSD destination conventions, that is::
+
+ /var/lib/ceph/osd/{cluster name}-{osd id}
+
+For our example OSD with an id of ``0``, that means the identified device will
+be mounted at::
+
+
+ /var/lib/ceph/osd/ceph-0
+
+
+Once that process is complete, a call will be made to start the OSD::
+
+ systemctl start ceph-osd@0
+
+The systemd portion of this process is handled by the ``ceph-volume simple
+trigger`` sub-command, which is only in charge of parsing metadata coming from
+systemd and startup, and then dispatching to ``ceph-volume simple activate`` which
+would proceed with activation.