diff --git a/doc/cephadm/host-management.rst b/doc/cephadm/host-management.rst
new file mode 100644
index 000000000..4b964c5f4
--- /dev/null
+++ b/doc/cephadm/host-management.rst
@@ -0,0 +1,605 @@
+.. _orchestrator-cli-host-management:
+
+===============
+Host Management
+===============
+
+Listing Hosts
+=============
+
+Run a command of this form to list hosts associated with the cluster:
+
+.. prompt:: bash #
+
+ ceph orch host ls [--format yaml] [--host-pattern <name>] [--label <label>] [--host-status <status>] [--detail]
+
+In commands of this form, the arguments "host-pattern", "label", and
+"host-status" are optional and are used for filtering.
+
+- "host-pattern" is a regex that matches against hostnames and returns only
+ matching hosts.
+- "label" returns only hosts with the specified label.
+- "host-status" returns only hosts with the specified status (currently
+ "offline" or "maintenance").
+- Any combination of these filtering flags is valid. It is possible to filter
+ against name, label and status simultaneously, or to filter against any
+ proper subset of name, label and status, as in the example below.
+
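+For example, to list only the hosts that carry the ``mon`` label and are
+currently in maintenance (the label name is illustrative; labels are free
+form):
+
+.. prompt:: bash #
+
+ ceph orch host ls --label mon --host-status maintenance
+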
+The "detail" parameter provides more host related information for cephadm based
+clusters. For example:
+
+.. prompt:: bash #
+
+ ceph orch host ls --detail
+
+::
+
+ HOSTNAME ADDRESS LABELS STATUS VENDOR/MODEL CPU HDD SSD NIC
+ ceph-master 192.168.122.73 _admin QEMU (Standard PC (Q35 + ICH9, 2009)) 4C/4T 4/1.6TB - 1
+ 1 hosts in cluster
+
+.. _cephadm-adding-hosts:
+
+Adding Hosts
+============
+
+Hosts must have these :ref:`cephadm-host-requirements` installed.
+Hosts without all the necessary requirements will fail to be added to the cluster.
+
+To add each new host to the cluster, perform two steps:
+
+#. Install the cluster's public SSH key in the new host's root user's ``authorized_keys`` file:
+
+ .. prompt:: bash #
+
+ ssh-copy-id -f -i /etc/ceph/ceph.pub root@*<new-host>*
+
+ For example:
+
+ .. prompt:: bash #
+
+ ssh-copy-id -f -i /etc/ceph/ceph.pub root@host2
+ ssh-copy-id -f -i /etc/ceph/ceph.pub root@host3
+
+#. Tell Ceph that the new node is part of the cluster:
+
+ .. prompt:: bash #
+
+ ceph orch host add *<newhost>* [*<ip>*] [*<label1> ...*]
+
+ For example:
+
+ .. prompt:: bash #
+
+ ceph orch host add host2 10.10.0.102
+ ceph orch host add host3 10.10.0.103
+
+ It is best to explicitly provide the host IP address. If an IP is
+ not provided, then the host name will be immediately resolved via
+ DNS and that IP will be used.
+
+ One or more labels can also be included to immediately label the
+ new host. For example, by default the ``_admin`` label will make
+ cephadm maintain a copy of the ``ceph.conf`` file and a
+ ``client.admin`` keyring file in ``/etc/ceph``:
+
+ .. prompt:: bash #
+
+ ceph orch host add host4 10.10.0.104 --labels _admin
+
+.. _cephadm-removing-hosts:
+
+Removing Hosts
+==============
+
+A host can safely be removed from the cluster after all daemons are removed
+from it.
+
+To drain all daemons from a host, run a command of the following form:
+
+.. prompt:: bash #
+
+ ceph orch host drain *<host>*
+
+The ``_no_schedule`` and ``_no_conf_keyring`` labels will be applied to the
+host. See :ref:`cephadm-special-host-labels`.
+
+If you only want to drain daemons but leave the managed Ceph conf and keyring
+files on the host, you may pass the ``--keep-conf-keyring`` flag to the
+drain command.
+
+.. prompt:: bash #
+
+ ceph orch host drain *<host>* --keep-conf-keyring
+
+This will apply the ``_no_schedule`` label to the host but not the
+``_no_conf_keyring`` label.
+
+All OSDs on the host will be scheduled to be removed. You can check the progress of the OSD removal operation with the following command:
+
+.. prompt:: bash #
+
+ ceph orch osd rm status
+
+See :ref:`cephadm-osd-removal` for more details about OSD removal.
+
+The ``orch host drain`` command also supports a ``--zap-osd-devices``
+flag. Setting this flag while draining a host will cause cephadm to zap
+the devices of the OSDs it is removing as part of the drain process:
+
+.. prompt:: bash #
+
+ ceph orch host drain *<host>* --zap-osd-devices
+
+Use the following command to determine whether any daemons are still on the
+host:
+
+.. prompt:: bash #
+
+ ceph orch ps <host>
+
+After all daemons have been removed from the host, remove the host from the
+cluster by running the following command:
+
+.. prompt:: bash #
+
+ ceph orch host rm <host>
+
+Offline host removal
+--------------------
+
+Even if a host is offline and can not be recovered, it can be removed from the
+cluster by running a command of the following form:
+
+.. prompt:: bash #
+
+ ceph orch host rm <host> --offline --force
+
+.. warning:: This can potentially cause data loss. This command forcefully
+ purges OSDs from the cluster by calling ``osd purge-actual`` for each OSD.
+ Any service specs that still contain this host should be manually updated.
+
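+One way to review which service specs still reference the removed host is to
+export the current specs and inspect them (shown as a sketch; edit and
+re-apply any affected specs as needed):
+
+.. prompt:: bash #
+
+ ceph orch ls --export
+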
+.. _orchestrator-host-labels:
+
+Host labels
+===========
+
+The orchestrator supports assigning labels to hosts. Labels
+are free form and have no particular meaning by themselves, and each host
+can have multiple labels. They can be used to specify the placement
+of daemons. See :ref:`orch-placement-by-labels`.
+
+Labels can be added when adding a host with the ``--labels`` flag:
+
+.. prompt:: bash #
+
+ ceph orch host add my_hostname --labels=my_label1
+ ceph orch host add my_hostname --labels=my_label1,my_label2
+
+To add a label to an existing host, run:
+
+.. prompt:: bash #
+
+ ceph orch host label add my_hostname my_label
+
+To remove a label, run:
+
+.. prompt:: bash #
+
+ ceph orch host label rm my_hostname my_label
+
+
+.. _cephadm-special-host-labels:
+
+Special host labels
+-------------------
+
+The following host labels have a special meaning to cephadm. All start with ``_``.
+
+* ``_no_schedule``: *Do not schedule or deploy daemons on this host*.
+
+ This label prevents cephadm from deploying daemons on this host. If it is added to
+ an existing host that already contains Ceph daemons, it will cause cephadm to move
+ those daemons elsewhere (except OSDs, which are not removed automatically).
+
+* ``_no_conf_keyring``: *Do not deploy config files or keyrings on this host*.
+
+ This label is effectively the same as ``_no_schedule``, but instead of working for
+ daemons it works for client keyrings and Ceph conf files that are being managed
+ by cephadm.
+
+* ``_no_autotune_memory``: *Do not autotune memory on this host*.
+
+ This label will prevent daemon memory from being tuned even when the
+ ``osd_memory_target_autotune`` or similar option is enabled for one or more daemons
+ on that host.
+
+* ``_admin``: *Distribute client.admin and ceph.conf to this host*.
+
+ By default, the ``_admin`` label is applied to the first host in the cluster (where
+ bootstrap was originally run), and the ``client.admin`` key is set to be distributed
+ to that host via the ``ceph orch client-keyring ...`` function. Adding this label
+ to additional hosts will normally cause cephadm to deploy config and keyring files
+ in ``/etc/ceph`` (see the example after this list). Starting with versions 16.2.10
+ (Pacific) and 17.2.1 (Quincy), cephadm also stores config and keyring files in the
+ ``/var/lib/ceph/<fsid>/config`` directory, in addition to the default location
+ ``/etc/ceph/``.
+
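+For example, to have cephadm maintain an admin keyring and config on an
+additional host (the host name is illustrative), add the ``_admin`` label with
+the label command shown earlier:
+
+.. prompt:: bash #
+
+ ceph orch host label add host5 _admin
+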
+Maintenance Mode
+================
+
+Place a host into or take it out of maintenance mode (this stops all Ceph daemons on the host):
+
+.. prompt:: bash #
+
+ ceph orch host maintenance enter <hostname> [--force] [--yes-i-really-mean-it]
+ ceph orch host maintenance exit <hostname>
+
+The ``--force`` flag allows the user to bypass warnings (but not alerts). The ``--yes-i-really-mean-it``
+flag bypasses all safety checks and will attempt to force the host into maintenance mode no
+matter what.
+
+.. warning:: Using the ``--yes-i-really-mean-it`` flag to force the host to enter maintenance
+ mode can potentially cause loss of data availability, the mon quorum to break down due
+ to too few running monitors, mgr module commands (such as ``ceph orch . . .`` commands)
+ to become unresponsive, and a number of other possible issues. Please only use this
+ flag if you're absolutely certain you know what you're doing.
+
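+For example, to stop the Ceph daemons on ``host2`` before planned hardware
+maintenance and to resume them afterwards (the host name is illustrative):
+
+.. prompt:: bash #
+
+ ceph orch host maintenance enter host2
+ ceph orch host maintenance exit host2
+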
+See also :ref:`cephadm-fqdn`.
+
+Rescanning Host Devices
+=======================
+
+Some servers and external enclosures may not register device removal or insertion with the
+kernel. In these scenarios, you'll need to perform a host rescan. A rescan is typically
+non-disruptive, and can be performed with the following CLI command:
+
+.. prompt:: bash #
+
+ ceph orch host rescan <hostname> [--with-summary]
+
+The ``--with-summary`` flag provides a breakdown of the number of HBAs found and scanned, together
+with any that failed:
+
+.. prompt:: bash [ceph:root@rh9-ceph1/]#
+
+ ceph orch host rescan rh9-ceph1 --with-summary
+
+::
+
+ Ok. 2 adapters detected: 2 rescanned, 0 skipped, 0 failed (0.32s)
+
+Creating many hosts at once
+===========================
+
+Many hosts can be added at once using
+``ceph orch apply -i`` by submitting a multi-document YAML file:
+
+.. code-block:: yaml
+
+ service_type: host
+ hostname: node-00
+ addr: 192.168.0.10
+ labels:
+ - example1
+ - example2
+ ---
+ service_type: host
+ hostname: node-01
+ addr: 192.168.0.11
+ labels:
+ - grafana
+ ---
+ service_type: host
+ hostname: node-02
+ addr: 192.168.0.12
+
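+Save the documents to a file (for example ``hosts.yaml``; the name is
+arbitrary) and submit them with:
+
+.. prompt:: bash #
+
+ ceph orch apply -i hosts.yaml
+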
+This can be combined with :ref:`service specifications<orchestrator-cli-service-spec>`
+to create a cluster spec file that deploys a whole cluster in one command. See
+``cephadm bootstrap --apply-spec`` to do this during bootstrap. The cluster
+SSH key must be copied to hosts prior to adding them.
+
+Setting the initial CRUSH location of a host
+============================================
+
+Hosts can contain a ``location`` identifier which will instruct cephadm to
+create a new CRUSH host located in the specified hierarchy.
+
+.. code-block:: yaml
+
+ service_type: host
+ hostname: node-00
+ addr: 192.168.0.10
+ location:
+ rack: rack1
+
+.. note::
+
+ The ``location`` attribute affects only the initial CRUSH location. Subsequent
+ changes to the ``location`` property will be ignored. Also, removing a host will not remove
+ any CRUSH buckets.
+
+See also :ref:`crush_map_default_types`.
+
+OS Tuning Profiles
+==================
+
+Cephadm can be used to manage operating-system-tuning profiles that apply sets
+of sysctl settings to sets of hosts.
+
+Create a YAML spec file in the following format:
+
+.. code-block:: yaml
+
+ profile_name: 23-mon-host-profile
+ placement:
+ hosts:
+ - mon-host-01
+ - mon-host-02
+ settings:
+ fs.file-max: 1000000
+ vm.swappiness: '13'
+
+Apply the tuning profile with the following command:
+
+.. prompt:: bash #
+
+ ceph orch tuned-profile apply -i <tuned-profile-file-name>
+
+This profile is written to ``/etc/sysctl.d/`` on each host that matches the
+hosts specified in the placement block of the YAML spec, and ``sysctl --system`` is
+run on the host.
+
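+To spot-check that the settings are in effect, you can query one of them
+directly on a host from the placement block (the value comes from the example
+spec above):
+
+.. prompt:: bash #
+
+ sysctl vm.swappiness
+
+::
+
+ vm.swappiness = 13
+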
+.. note::
+
+ The exact filename that the profile is written to within ``/etc/sysctl.d/``
+ is ``<profile-name>-cephadm-tuned-profile.conf``, where ``<profile-name>`` is
+ the ``profile_name`` setting that you specify in the YAML spec. Because
+ sysctl settings are applied in lexicographical order (sorted by the filename
+ in which the setting is specified), you may want to set the ``profile_name``
+ in your spec so that it is applied before or after other conf files.
+
+.. note::
+
+ These settings are applied only at the host level, and are not specific
+ to any particular daemon or container.
+
+.. note::
+
+ Applying tuned profiles is idempotent when the ``--no-overwrite`` option is
+ passed. Moreover, if the ``--no-overwrite`` option is passed, existing
+ profiles with the same name are not overwritten.
+
+
+Viewing Profiles
+----------------
+
+Run the following command to view all the profiles that cephadm currently manages:
+
+.. prompt:: bash #
+
+ ceph orch tuned-profile ls
+
+.. note::
+
+ To make modifications and re-apply a profile, pass ``--format yaml`` to the
+ ``tuned-profile ls`` command. The ``tuned-profile ls --format yaml`` command
+ presents the profiles in a format that is easy to copy and re-apply.
+
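+For example, to capture the managed profiles in a file that can be edited and
+re-applied (the file name is arbitrary):
+
+.. prompt:: bash #
+
+ ceph orch tuned-profile ls --format yaml > tuned-profiles.yaml
+
+After editing the file, re-apply a profile with the ``ceph orch tuned-profile apply -i``
+command shown earlier.
+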
+
+Removing Profiles
+-----------------
+
+To remove a previously applied profile, run this command:
+
+.. prompt:: bash #
+
+ ceph orch tuned-profile rm <profile-name>
+
+When a profile is removed, cephadm cleans up the file previously written to ``/etc/sysctl.d``.
+
+
+Modifying Profiles
+------------------
+
+Profiles can be modified by re-applying a YAML spec with the same name as the
+profile that you want to modify. Alternatively, individual settings within an
+existing profile can be adjusted with the following commands.
+
+To add or modify a setting in an existing profile:
+
+.. prompt:: bash #
+
+ ceph orch tuned-profile add-setting <profile-name> <setting-name> <value>
+
+To remove a setting from an existing profile:
+
+.. prompt:: bash #
+
+ ceph orch tuned-profile rm-setting <profile-name> <setting-name>
+
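+For example, using the ``23-mon-host-profile`` spec shown earlier, the following
+commands raise ``vm.swappiness`` and drop the ``fs.file-max`` setting:
+
+.. prompt:: bash #
+
+ ceph orch tuned-profile add-setting 23-mon-host-profile vm.swappiness 20
+ ceph orch tuned-profile rm-setting 23-mon-host-profile fs.file-max
+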
+.. note::
+
+ Modifying the placement requires re-applying a profile with the same name.
+ Remember that profiles are tracked by their names, so when a profile with the
+ same name as an existing profile is applied, it overwrites the old profile
+ unless the ``--no-overwrite`` flag is passed.
+
+SSH Configuration
+=================
+
+Cephadm uses SSH to connect to remote hosts. SSH uses a key to authenticate
+with those hosts in a secure way.
+
+
+Default behavior
+----------------
+
+Cephadm stores an SSH key in the monitor that is used to
+connect to remote hosts. When the cluster is bootstrapped, this SSH
+key is generated automatically and no additional configuration
+is necessary.
+
+A *new* SSH key can be generated with:
+
+.. prompt:: bash #
+
+ ceph cephadm generate-key
+
+The public portion of the SSH key can be retrieved with:
+
+.. prompt:: bash #
+
+ ceph cephadm get-pub-key
+
+The currently stored SSH key can be deleted with:
+
+.. prompt:: bash #
+
+ ceph cephadm clear-key
+
+You can make use of an existing key by directly importing it with:
+
+.. prompt:: bash #
+
+ ceph config-key set mgr/cephadm/ssh_identity_key -i <key>
+ ceph config-key set mgr/cephadm/ssh_identity_pub -i <pub>
+
+You will then need to restart the mgr daemon to reload the configuration with:
+
+.. prompt:: bash #
+
+ ceph mgr fail
+
+.. _cephadm-ssh-user:
+
+Configuring a different SSH user
+----------------------------------
+
+Cephadm must be able to log into all the Ceph cluster nodes as a user
+that has enough privileges to download container images, start containers
+and execute commands without prompting for a password. If you do not want
+to use the "root" user (the default option in cephadm), you must provide
+cephadm with the name of the user that is going to be used to perform all
+cephadm operations. Use the command:
+
+.. prompt:: bash #
+
+ ceph cephadm set-user <user>
+
+Prior to running this, the cluster SSH key needs to be added to this user's
+``authorized_keys`` file, and non-root users must have passwordless sudo access.
+
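+For example, assuming a (hypothetical) ``cephadm`` user with passwordless sudo
+already exists on every host, you could distribute the cluster key and switch
+to that user as follows:
+
+.. prompt:: bash #
+
+ ssh-copy-id -f -i /etc/ceph/ceph.pub cephadm@host2
+ ssh-copy-id -f -i /etc/ceph/ceph.pub cephadm@host3
+ ceph cephadm set-user cephadm
+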
+
+Customizing the SSH configuration
+---------------------------------
+
+Cephadm generates an appropriate ``ssh_config`` file that is
+used for connecting to remote hosts. This configuration looks
+something like this::
+
+ Host *
+ User root
+ StrictHostKeyChecking no
+ UserKnownHostsFile /dev/null
+
+There are two ways to customize this configuration for your environment:
+
+#. Import a customized configuration file that will be stored
+ by the monitor with:
+
+ .. prompt:: bash #
+
+ ceph cephadm set-ssh-config -i <ssh_config_file>
+
+ To remove a customized SSH config and revert back to the default behavior:
+
+ .. prompt:: bash #
+
+ ceph cephadm clear-ssh-config
+
+#. You can configure a file location for the SSH configuration file with:
+
+ .. prompt:: bash #
+
+ ceph config set mgr mgr/cephadm/ssh_config_file <path>
+
+ We do *not recommend* this approach. The path name must be
+ visible to *any* mgr daemon, and cephadm runs all daemons as
+ containers. That means that the file must either be placed
+ inside a customized container image for your deployment, or
+ manually distributed to the mgr data directory
+ (``/var/lib/ceph/<cluster-fsid>/mgr.<id>`` on the host, visible at
+ ``/var/lib/ceph/mgr/ceph-<id>`` from inside the container).
+
+Setting up CA signed keys for the cluster
+-----------------------------------------
+
+Cephadm also supports using CA signed keys for SSH authentication
+across cluster nodes. In this setup, instead of a private key and a
+public key, cephadm needs a private key and a certificate created by
+signing the corresponding public key with a CA key. For more info
+on setting up nodes for authentication using a CA signed key, see
+:ref:`cephadm-bootstrap-ca-signed-keys`. Once you have your private
+key and signed cert, they can be set up for cephadm to use by running:
+
+.. prompt:: bash #
+
+ ceph config-key set mgr/cephadm/ssh_identity_key -i <private-key-file>
+ ceph config-key set mgr/cephadm/ssh_identity_cert -i <signed-cert-file>
+
+.. _cephadm-fqdn:
+
+Fully qualified domain names vs bare host names
+===============================================
+
+.. note::
+
+ cephadm demands that the name of the host given via ``ceph orch host add``
+ equals the output of ``hostname`` on remote hosts.
+
+Otherwise cephadm can't be sure that names returned by
+``ceph * metadata`` match the hosts known to cephadm. This might result
+in a :ref:`cephadm-stray-host` warning.
+
+When configuring new hosts, there are two **valid** ways to set the
+``hostname`` of a host:
+
+1. Using the bare host name. In this case:
+
+- ``hostname`` returns the bare host name.
+- ``hostname -f`` returns the FQDN.
+
+2. Using the fully qualified domain name as the host name. In this case:
+
+- ``hostname`` returns the FQDN.
+- ``hostname -s`` returns the bare host name.
+
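+You can check which convention a host uses by comparing the two commands on
+that host (the output shown is illustrative; the lines correspond to the
+commands in order):
+
+.. prompt:: bash #
+
+ hostname
+ hostname -f
+
+::
+
+ host2
+ host2.example.com
+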
+Note that ``man hostname`` recommends that ``hostname`` return the bare
+host name:
+
+ The FQDN (Fully Qualified Domain Name) of the system is the
+ name that the resolver(3) returns for the host name, such as,
+ ursula.example.com. It is usually the hostname followed by the DNS
+ domain name (the part after the first dot). You can check the FQDN
+ using ``hostname --fqdn`` or the domain name using ``dnsdomainname``.
+
+ .. code-block:: none
+
+ You cannot change the FQDN with hostname or dnsdomainname.
+
+ The recommended method of setting the FQDN is to make the hostname
+ be an alias for the fully qualified name using /etc/hosts, DNS, or
+ NIS. For example, if the hostname was "ursula", one might have
+ a line in /etc/hosts which reads
+
+ 127.0.1.1 ursula.example.com ursula
+
+In other words, ``man hostname`` recommends that ``hostname`` return the bare
+host name. This in turn means that Ceph will return the bare host names
+when executing ``ceph * metadata``, and that cephadm likewise
+requires the bare host name when adding a host to the cluster:
+``ceph orch host add <bare-name>``.
+
+..
+ TODO: This chapter needs to provide way for users to configure
+ Grafana in the dashboard, as this is right now very hard to do.