Diffstat (limited to 'doc/cephadm')
-rw-r--r--  doc/cephadm/adoption.rst             | 63
-rw-r--r--  doc/cephadm/client-setup.rst         | 36
-rw-r--r--  doc/cephadm/compatibility.rst        |  4
-rw-r--r--  doc/cephadm/host-management.rst      | 72
-rw-r--r--  doc/cephadm/install.rst              | 71
-rw-r--r--  doc/cephadm/operations.rst           | 53
-rw-r--r--  doc/cephadm/services/index.rst       | 24
-rw-r--r--  doc/cephadm/services/monitoring.rst  | 31
-rw-r--r--  doc/cephadm/services/nfs.rst         |  2
-rw-r--r--  doc/cephadm/services/osd.rst         |  2
-rw-r--r--  doc/cephadm/services/rgw.rst         |  5
-rw-r--r--  doc/cephadm/troubleshooting.rst      | 27
-rw-r--r--  doc/cephadm/upgrade.rst              | 14
13 files changed, 256 insertions, 148 deletions
diff --git a/doc/cephadm/adoption.rst b/doc/cephadm/adoption.rst index 86254a16c..2ebce606c 100644 --- a/doc/cephadm/adoption.rst +++ b/doc/cephadm/adoption.rst @@ -22,20 +22,20 @@ Preparation #. Make sure that the ``cephadm`` command line tool is available on each host in the existing cluster. See :ref:`get-cephadm` to learn how. -#. Prepare each host for use by ``cephadm`` by running this command: +#. Prepare each host for use by ``cephadm`` by running this command on that host: .. prompt:: bash # cephadm prepare-host #. Choose a version of Ceph to use for the conversion. This procedure will work - with any release of Ceph that is Octopus (15.2.z) or later, inclusive. The + with any release of Ceph that is Octopus (15.2.z) or later. The latest stable release of Ceph is the default. You might be upgrading from an earlier Ceph release at the same time that you're performing this - conversion; if you are upgrading from an earlier release, make sure to + conversion. If you are upgrading from an earlier release, make sure to follow any upgrade-related instructions for that release. - Pass the image to cephadm with the following command: + Pass the Ceph container image to cephadm with the following command: .. prompt:: bash # @@ -50,25 +50,27 @@ Preparation cephadm ls - Before starting the conversion process, ``cephadm ls`` shows all existing - daemons to have a style of ``legacy``. As the adoption process progresses, - adopted daemons will appear with a style of ``cephadm:v1``. + Before starting the conversion process, ``cephadm ls`` reports all existing + daemons with the style ``legacy``. As the adoption process progresses, + adopted daemons will appear with the style ``cephadm:v1``. Adoption process ---------------- -#. Make sure that the ceph configuration has been migrated to use the cluster - config database. If the ``/etc/ceph/ceph.conf`` is identical on each host, - then the following command can be run on one single host and will affect all - hosts: +#. Make sure that the ceph configuration has been migrated to use the cluster's + central config database. If ``/etc/ceph/ceph.conf`` is identical on all + hosts, then the following command can be run on one host and will take + effect for all hosts: .. prompt:: bash # ceph config assimilate-conf -i /etc/ceph/ceph.conf If there are configuration variations between hosts, you will need to repeat - this command on each host. During this adoption process, view the cluster's + this command on each host, taking care that if there are conflicting option + settings across hosts, the values from the last host will be used. During this + adoption process, view the cluster's central configuration to confirm that it is complete by running the following command: @@ -76,36 +78,36 @@ Adoption process ceph config dump -#. Adopt each monitor: +#. Adopt each Monitor: .. prompt:: bash # cephadm adopt --style legacy --name mon.<hostname> - Each legacy monitor should stop, quickly restart as a cephadm + Each legacy Monitor will stop, quickly restart as a cephadm container, and rejoin the quorum. -#. Adopt each manager: +#. Adopt each Manager: .. prompt:: bash # cephadm adopt --style legacy --name mgr.<hostname> -#. Enable cephadm: +#. Enable cephadm orchestration: .. prompt:: bash # ceph mgr module enable cephadm ceph orch set backend cephadm -#. Generate an SSH key: +#. Generate an SSH key for cephadm: .. prompt:: bash # ceph cephadm generate-key ceph cephadm get-pub-key > ~/ceph.pub -#. Install the cluster SSH key on each host in the cluster: +#. 
Install the cephadm SSH key on each host in the cluster: .. prompt:: bash # @@ -118,9 +120,10 @@ Adoption process SSH keys. .. note:: - It is also possible to have cephadm use a non-root user to SSH + It is also possible to arrange for cephadm to use a non-root user to SSH into cluster hosts. This user needs to have passwordless sudo access. - Use ``ceph cephadm set-user <user>`` and copy the SSH key to that user. + Use ``ceph cephadm set-user <user>`` and copy the SSH key to that user's + home directory on each host. See :ref:`cephadm-ssh-user` #. Tell cephadm which hosts to manage: @@ -129,10 +132,10 @@ Adoption process ceph orch host add <hostname> [ip-address] - This will perform a ``cephadm check-host`` on each host before adding it; - this check ensures that the host is functioning properly. The IP address - argument is recommended; if not provided, then the host name will be resolved - via DNS. + This will run ``cephadm check-host`` on each host before adding it. + This check ensures that the host is functioning properly. The IP address + argument is recommended. If the address is not provided, then the host name + will be resolved via DNS. #. Verify that the adopted monitor and manager daemons are visible: @@ -153,8 +156,8 @@ Adoption process cephadm adopt --style legacy --name osd.1 cephadm adopt --style legacy --name osd.2 -#. Redeploy MDS daemons by telling cephadm how many daemons to run for - each file system. List file systems by name with the command ``ceph fs +#. Redeploy CephFS MDS daemons (if deployed) by telling cephadm how many daemons to run for + each file system. List CephFS file systems by name with the command ``ceph fs ls``. Run the following command on the master nodes to redeploy the MDS daemons: @@ -189,19 +192,19 @@ Adoption process systemctl stop ceph-mds.target rm -rf /var/lib/ceph/mds/ceph-* -#. Redeploy RGW daemons. Cephadm manages RGW daemons by zone. For each - zone, deploy new RGW daemons with cephadm: +#. Redeploy Ceph Object Gateway RGW daemons if deployed. Cephadm manages RGW + daemons by zone. For each zone, deploy new RGW daemons with cephadm: .. prompt:: bash # ceph orch apply rgw <svc_id> [--realm=<realm>] [--zone=<zone>] [--port=<port>] [--ssl] [--placement=<placement>] where *<placement>* can be a simple daemon count, or a list of - specific hosts (see :ref:`orchestrator-cli-placement-spec`), and the + specific hosts (see :ref:`orchestrator-cli-placement-spec`). The zone and realm arguments are needed only for a multisite setup. After the daemons have started and you have confirmed that they are - functioning, stop and remove the old, legacy daemons: + functioning, stop and remove the legacy daemons: .. prompt:: bash # diff --git a/doc/cephadm/client-setup.rst b/doc/cephadm/client-setup.rst index f98ba798b..0f38773b1 100644 --- a/doc/cephadm/client-setup.rst +++ b/doc/cephadm/client-setup.rst @@ -1,36 +1,36 @@ ======================= Basic Ceph Client Setup ======================= -Client machines require some basic configuration to interact with -Ceph clusters. This section describes how to configure a client machine -so that it can interact with a Ceph cluster. +Client hosts require basic configuration to interact with +Ceph clusters. This section describes how to perform this configuration. .. note:: - Most client machines need to install only the `ceph-common` package - and its dependencies. Such a setup supplies the basic `ceph` and - `rados` commands, as well as other commands including `mount.ceph` - and `rbd`. 
+ Most client hosts need to install only the ``ceph-common`` package + and its dependencies. Such an installation supplies the basic ``ceph`` and + ``rados`` commands, as well as other commands including ``mount.ceph`` + and ``rbd``. Config File Setup ================= -Client machines usually require smaller configuration files (here -sometimes called "config files") than do full-fledged cluster members. +Client hosts usually require smaller configuration files (here +sometimes called "config files") than do back-end cluster hosts. To generate a minimal config file, log into a host that has been -configured as a client or that is running a cluster daemon, and then run the following command: +configured as a client or that is running a cluster daemon, then +run the following command: .. prompt:: bash # ceph config generate-minimal-conf This command generates a minimal config file that tells the client how -to reach the Ceph monitors. The contents of this file should usually -be installed in ``/etc/ceph/ceph.conf``. +to reach the Ceph Monitors. This file should usually +be copied to ``/etc/ceph/ceph.conf`` on each client host. Keyring Setup ============= Most Ceph clusters run with authentication enabled. This means that -the client needs keys in order to communicate with the machines in the -cluster. To generate a keyring file with credentials for `client.fs`, +the client needs keys in order to communicate with Ceph daemons. +To generate a keyring file with credentials for ``client.fs``, log into an running cluster member and run the following command: .. prompt:: bash $ @@ -40,6 +40,10 @@ log into an running cluster member and run the following command: The resulting output is directed into a keyring file, typically ``/etc/ceph/ceph.keyring``. -To gain a broader understanding of client keyring distribution and administration, you should read :ref:`client_keyrings_and_configs`. +To gain a broader understanding of client keyring distribution and administration, +you should read :ref:`client_keyrings_and_configs`. -To see an example that explains how to distribute ``ceph.conf`` configuration files to hosts that are tagged with the ``bare_config`` label, you should read the section called "Distributing ceph.conf to hosts tagged with bare_config" in the section called :ref:`etc_ceph_conf_distribution`. +To see an example that explains how to distribute ``ceph.conf`` configuration +files to hosts that are tagged with the ``bare_config`` label, you should read +the subsection named "Distributing ceph.conf to hosts tagged with bare_config" +under the heading :ref:`etc_ceph_conf_distribution`. diff --git a/doc/cephadm/compatibility.rst b/doc/cephadm/compatibility.rst index 46ab62a62..8dd301f1a 100644 --- a/doc/cephadm/compatibility.rst +++ b/doc/cephadm/compatibility.rst @@ -30,8 +30,8 @@ This table shows which version pairs are expected to work or not work together: .. note:: - While not all podman versions have been actively tested against - all Ceph versions, there are no known issues with using podman + While not all Podman versions have been actively tested against + all Ceph versions, there are no known issues with using Podman version 3.0 or greater with Ceph Quincy and later releases. .. 
warning:: diff --git a/doc/cephadm/host-management.rst b/doc/cephadm/host-management.rst index 4b964c5f4..1fba8e582 100644 --- a/doc/cephadm/host-management.rst +++ b/doc/cephadm/host-management.rst @@ -74,9 +74,9 @@ To add each new host to the cluster, perform two steps: ceph orch host add host2 10.10.0.102 ceph orch host add host3 10.10.0.103 - It is best to explicitly provide the host IP address. If an IP is + It is best to explicitly provide the host IP address. If an address is not provided, then the host name will be immediately resolved via - DNS and that IP will be used. + DNS and the result will be used. One or more labels can also be included to immediately label the new host. For example, by default the ``_admin`` label will make @@ -104,7 +104,7 @@ To drain all daemons from a host, run a command of the following form: The ``_no_schedule`` and ``_no_conf_keyring`` labels will be applied to the host. See :ref:`cephadm-special-host-labels`. -If you only want to drain daemons but leave managed ceph conf and keyring +If you want to drain daemons but leave managed `ceph.conf` and keyring files on the host, you may pass the ``--keep-conf-keyring`` flag to the drain command. @@ -115,7 +115,8 @@ drain command. This will apply the ``_no_schedule`` label to the host but not the ``_no_conf_keyring`` label. -All OSDs on the host will be scheduled to be removed. You can check the progress of the OSD removal operation with the following command: +All OSDs on the host will be scheduled to be removed. You can check +progress of the OSD removal operation with the following command: .. prompt:: bash # @@ -148,7 +149,7 @@ cluster by running the following command: Offline host removal -------------------- -Even if a host is offline and can not be recovered, it can be removed from the +If a host is offline and can not be recovered, it can be removed from the cluster by running a command of the following form: .. prompt:: bash # @@ -250,8 +251,8 @@ Rescanning Host Devices ======================= Some servers and external enclosures may not register device removal or insertion with the -kernel. In these scenarios, you'll need to perform a host rescan. A rescan is typically -non-disruptive, and can be performed with the following CLI command: +kernel. In these scenarios, you'll need to perform a device rescan on the appropriate host. +A rescan is typically non-disruptive, and can be performed with the following CLI command: .. prompt:: bash # @@ -314,19 +315,43 @@ create a new CRUSH host located in the specified hierarchy. .. note:: - The ``location`` attribute will be only affect the initial CRUSH location. Subsequent - changes of the ``location`` property will be ignored. Also, removing a host will not remove - any CRUSH buckets. + The ``location`` attribute will be only affect the initial CRUSH location. + Subsequent changes of the ``location`` property will be ignored. Also, + removing a host will not remove an associated CRUSH bucket unless the + ``--rm-crush-entry`` flag is provided to the ``orch host rm`` command See also :ref:`crush_map_default_types`. +Removing a host from the CRUSH map +================================== + +The ``ceph orch host rm`` command has support for removing the associated host bucket +from the CRUSH map. This is done by providing the ``--rm-crush-entry`` flag. + +.. 
prompt:: bash [ceph:root@host1/]# + + ceph orch host rm host1 --rm-crush-entry + +When this flag is specified, cephadm will attempt to remove the host bucket +from the CRUSH map as part of the host removal process. Note that if +it fails to do so, cephadm will report the failure and the host will remain under +cephadm control. + +.. note:: + + Removal from the CRUSH map will fail if there are OSDs deployed on the + host. If you would like to remove all the host's OSDs as well, please start + by using the ``ceph orch host drain`` command to do so. Once the OSDs + have been removed, then you may direct cephadm remove the CRUSH bucket + along with the host using the ``--rm-crush-entry`` flag. + OS Tuning Profiles ================== -Cephadm can be used to manage operating-system-tuning profiles that apply sets -of sysctl settings to sets of hosts. +Cephadm can be used to manage operating system tuning profiles that apply +``sysctl`` settings to sets of hosts. -Create a YAML spec file in the following format: +To do so, create a YAML spec file in the following format: .. code-block:: yaml @@ -345,18 +370,21 @@ Apply the tuning profile with the following command: ceph orch tuned-profile apply -i <tuned-profile-file-name> -This profile is written to ``/etc/sysctl.d/`` on each host that matches the -hosts specified in the placement block of the yaml, and ``sysctl --system`` is +This profile is written to a file under ``/etc/sysctl.d/`` on each host +specified in the ``placement`` block, then ``sysctl --system`` is run on the host. .. note:: The exact filename that the profile is written to within ``/etc/sysctl.d/`` is ``<profile-name>-cephadm-tuned-profile.conf``, where ``<profile-name>`` is - the ``profile_name`` setting that you specify in the YAML spec. Because + the ``profile_name`` setting that you specify in the YAML spec. We suggest + naming these profiles following the usual ``sysctl.d`` `NN-xxxxx` convention. Because sysctl settings are applied in lexicographical order (sorted by the filename - in which the setting is specified), you may want to set the ``profile_name`` - in your spec so that it is applied before or after other conf files. + in which the setting is specified), you may want to carefully choose + the ``profile_name`` in your spec so that it is applied before or after other + conf files. Careful selection ensures that values supplied here override or + do not override those in other ``sysctl.d`` files as desired. .. note:: @@ -365,7 +393,7 @@ run on the host. .. note:: - Applying tuned profiles is idempotent when the ``--no-overwrite`` option is + Applying tuning profiles is idempotent when the ``--no-overwrite`` option is passed. Moreover, if the ``--no-overwrite`` option is passed, existing profiles with the same name are not overwritten. @@ -525,7 +553,7 @@ There are two ways to customize this configuration for your environment: We do *not recommend* this approach. The path name must be visible to *any* mgr daemon, and cephadm runs all daemons as - containers. That means that the file either need to be placed + containers. 
That means that the file must either be placed inside a customized container image for your deployment, or manually distributed to the mgr data directory (``/var/lib/ceph/<cluster-fsid>/mgr.<id>`` on the host, visible at @@ -578,8 +606,8 @@ Note that ``man hostname`` recommends ``hostname`` to return the bare host name: The FQDN (Fully Qualified Domain Name) of the system is the - name that the resolver(3) returns for the host name, such as, - ursula.example.com. It is usually the hostname followed by the DNS + name that the resolver(3) returns for the host name, for example + ``ursula.example.com``. It is usually the short hostname followed by the DNS domain name (the part after the first dot). You can check the FQDN using ``hostname --fqdn`` or the domain name using ``dnsdomainname``. diff --git a/doc/cephadm/install.rst b/doc/cephadm/install.rst index b1aa736e2..0ab4531ff 100644 --- a/doc/cephadm/install.rst +++ b/doc/cephadm/install.rst @@ -4,7 +4,7 @@ Deploying a new Ceph cluster ============================ -Cephadm creates a new Ceph cluster by "bootstrapping" on a single +Cephadm creates a new Ceph cluster by bootstrapping a single host, expanding the cluster to encompass any additional hosts, and then deploying the needed services. @@ -18,7 +18,7 @@ Requirements - Python 3 - Systemd - Podman or Docker for running containers -- Time synchronization (such as chrony or NTP) +- Time synchronization (such as Chrony or the legacy ``ntpd``) - LVM2 for provisioning storage devices Any modern Linux distribution should be sufficient. Dependencies @@ -45,6 +45,13 @@ There are two ways to install ``cephadm``: Choose either the distribution-specific method or the curl-based method. Do not attempt to use both these methods on one system. +.. note:: Recent versions of cephadm are distributed as an executable compiled + from source code. Unlike for earlier versions of Ceph it is no longer + sufficient to copy a single script from Ceph's git tree and run it. If you + wish to run cephadm using a development version you should create your own + build of cephadm. See :ref:`compiling-cephadm` for details on how to create + your own standalone cephadm executable. + .. _cephadm_install_distros: distribution-specific installations @@ -85,9 +92,9 @@ that case, you can install cephadm directly. For example: curl-based installation ----------------------- -* First, determine what version of Ceph you will need. You can use the releases +* First, determine what version of Ceph you wish to install. You can use the releases page to find the `latest active releases <https://docs.ceph.com/en/latest/releases/#active-releases>`_. - For example, we might look at that page and find that ``18.2.0`` is the latest + For example, we might find that ``18.2.1`` is the latest active release. * Use ``curl`` to fetch a build of cephadm for that release. @@ -113,7 +120,7 @@ curl-based installation * If you encounter any issues with running cephadm due to errors including the message ``bad interpreter``, then you may not have Python or the correct version of Python installed. The cephadm tool requires Python 3.6 - and above. You can manually run cephadm with a particular version of Python by + or later. You can manually run cephadm with a particular version of Python by prefixing the command with your installed Python version. For example: .. 
prompt:: bash # @@ -121,6 +128,11 @@ curl-based installation python3.8 ./cephadm <arguments...> +* Although the standalone cephadm is sufficient to bootstrap a cluster, it is + best to have the ``cephadm`` command installed on the host. To install + the packages that provide the ``cephadm`` command, run the following + commands: + .. _cephadm_update: update cephadm @@ -166,7 +178,7 @@ What to know before you bootstrap The first step in creating a new Ceph cluster is running the ``cephadm bootstrap`` command on the Ceph cluster's first host. The act of running the ``cephadm bootstrap`` command on the Ceph cluster's first host creates the Ceph -cluster's first "monitor daemon", and that monitor daemon needs an IP address. +cluster's first Monitor daemon. You must pass the IP address of the Ceph cluster's first host to the ``ceph bootstrap`` command, so you'll need to know the IP address of that host. @@ -187,13 +199,13 @@ Run the ``ceph bootstrap`` command: This command will: -* Create a monitor and manager daemon for the new cluster on the local +* Create a Monitor and a Manager daemon for the new cluster on the local host. * Generate a new SSH key for the Ceph cluster and add it to the root user's ``/root/.ssh/authorized_keys`` file. * Write a copy of the public key to ``/etc/ceph/ceph.pub``. * Write a minimal configuration file to ``/etc/ceph/ceph.conf``. This - file is needed to communicate with the new cluster. + file is needed to communicate with Ceph daemons. * Write a copy of the ``client.admin`` administrative (privileged!) secret key to ``/etc/ceph/ceph.client.admin.keyring``. * Add the ``_admin`` label to the bootstrap host. By default, any host @@ -205,7 +217,7 @@ This command will: Further information about cephadm bootstrap ------------------------------------------- -The default bootstrap behavior will work for most users. But if you'd like +The default bootstrap process will work for most users. But if you'd like immediately to know more about ``cephadm bootstrap``, read the list below. Also, you can run ``cephadm bootstrap -h`` to see all of ``cephadm``'s @@ -216,15 +228,15 @@ available options. journald. If you want Ceph to write traditional log files to ``/var/log/ceph/$fsid``, use the ``--log-to-file`` option during bootstrap. -* Larger Ceph clusters perform better when (external to the Ceph cluster) +* Larger Ceph clusters perform best when (external to the Ceph cluster) public network traffic is separated from (internal to the Ceph cluster) cluster traffic. The internal cluster traffic handles replication, recovery, and heartbeats between OSD daemons. You can define the :ref:`cluster network<cluster-network>` by supplying the ``--cluster-network`` option to the ``bootstrap`` - subcommand. This parameter must define a subnet in CIDR notation (for example + subcommand. This parameter must be a subnet in CIDR notation (for example ``10.90.90.0/24`` or ``fe80::/64``). -* ``cephadm bootstrap`` writes to ``/etc/ceph`` the files needed to access +* ``cephadm bootstrap`` writes to ``/etc/ceph`` files needed to access the new cluster. This central location makes it possible for Ceph packages installed on the host (e.g., packages that give access to the cephadm command line interface) to find these files. @@ -245,12 +257,12 @@ available options. EOF $ ./cephadm bootstrap --config initial-ceph.conf ... 
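As a combined sketch of the bootstrap options discussed above, the following invocation supplies a monitor IP, a cluster network, an initial config file, and file-based logging in one command. The addresses and the config file name are illustrative assumptions, not values taken from this change:

.. prompt:: bash #

   ./cephadm bootstrap --mon-ip 10.1.0.10 \
       --cluster-network 10.90.90.0/24 \
       --config initial-ceph.conf \
       --log-to-file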
-* The ``--ssh-user *<user>*`` option makes it possible to choose which SSH +* The ``--ssh-user *<user>*`` option makes it possible to designate which SSH user cephadm will use to connect to hosts. The associated SSH key will be added to ``/home/*<user>*/.ssh/authorized_keys``. The user that you designate with this option must have passwordless sudo access. -* If you are using a container on an authenticated registry that requires +* If you are using a container image from a registry that requires login, you may add the argument: * ``--registry-json <path to json file>`` @@ -261,7 +273,7 @@ available options. Cephadm will attempt to log in to this registry so it can pull your container and then store the login info in its config database. Other hosts added to - the cluster will then also be able to make use of the authenticated registry. + the cluster will then also be able to make use of the authenticated container registry. * See :ref:`cephadm-deployment-scenarios` for additional examples for using ``cephadm bootstrap``. @@ -326,7 +338,7 @@ Add all hosts to the cluster by following the instructions in By default, a ``ceph.conf`` file and a copy of the ``client.admin`` keyring are maintained in ``/etc/ceph`` on all hosts that have the ``_admin`` label. This -label is initially applied only to the bootstrap host. We usually recommend +label is initially applied only to the bootstrap host. We recommend that one or more other hosts be given the ``_admin`` label so that the Ceph CLI (for example, via ``cephadm shell``) is easily accessible on multiple hosts. To add the ``_admin`` label to additional host(s), run a command of the following form: @@ -339,9 +351,10 @@ the ``_admin`` label to additional host(s), run a command of the following form: Adding additional MONs ====================== -A typical Ceph cluster has three or five monitor daemons spread +A typical Ceph cluster has three or five Monitor daemons spread across different hosts. We recommend deploying five -monitors if there are five or more nodes in your cluster. +Monitors if there are five or more nodes in your cluster. Most clusters do not +benefit from seven or more Monitors. Please follow :ref:`deploy_additional_monitors` to deploy additional MONs. @@ -366,12 +379,12 @@ See :ref:`osd_autotune`. To deploy hyperconverged Ceph with TripleO, please refer to the TripleO documentation: `Scenario: Deploy Hyperconverged Ceph <https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/cephadm.html#scenario-deploy-hyperconverged-ceph>`_ -In other cases where the cluster hardware is not exclusively used by Ceph (hyperconverged), +In other cases where the cluster hardware is not exclusively used by Ceph (converged infrastructure), reduce the memory consumption of Ceph like so: .. prompt:: bash # - # hyperconverged only: + # converged only: ceph config set mgr mgr/cephadm/autotune_memory_target_ratio 0.2 Then enable memory autotuning: @@ -400,9 +413,11 @@ Different deployment scenarios Single host ----------- -To configure a Ceph cluster to run on a single host, use the -``--single-host-defaults`` flag when bootstrapping. For use cases of this, see -:ref:`one-node-cluster`. +To deploy a Ceph cluster running on a single host, use the +``--single-host-defaults`` flag when bootstrapping. For use cases, see +:ref:`one-node-cluster`. Such clusters are generally not suitable for +production. 
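For reference, a single-host deployment of the kind described here might be bootstrapped as follows; the monitor IP is a placeholder:

.. prompt:: bash #

   cephadm bootstrap --mon-ip 192.168.0.2 --single-host-defaults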
+ The ``--single-host-defaults`` flag sets the following configuration options:: @@ -419,8 +434,8 @@ Deployment in an isolated environment ------------------------------------- You might need to install cephadm in an environment that is not connected -directly to the internet (such an environment is also called an "isolated -environment"). This can be done if a custom container registry is used. Either +directly to the Internet (an "isolated" or "airgapped" +environment). This requires the use of a custom container registry. Either of two kinds of custom container registry can be used in this scenario: (1) a Podman-based or Docker-based insecure registry, or (2) a secure registry. @@ -569,9 +584,9 @@ in order to have cephadm use them for SSHing between cluster hosts Note that this setup does not require installing the corresponding public key from the private key passed to bootstrap on other nodes. In fact, cephadm will reject the ``--ssh-public-key`` argument when passed along with ``--ssh-signed-cert``. -Not because having the public key breaks anything, but because it is not at all needed -for this setup and it helps bootstrap differentiate if the user wants the CA signed -keys setup or standard pubkey encryption. What this means is, SSH key rotation +This is not because having the public key breaks anything, but rather because it is not at all needed +and helps the bootstrap command differentiate if the user wants the CA signed +keys setup or standard pubkey encryption. What this means is that SSH key rotation would simply be a matter of getting another key signed by the same CA and providing cephadm with the new private key and signed cert. No additional distribution of keys to cluster nodes is needed after the initial setup of the CA key as a trusted key, diff --git a/doc/cephadm/operations.rst b/doc/cephadm/operations.rst index 5d8fdaca8..623cf1635 100644 --- a/doc/cephadm/operations.rst +++ b/doc/cephadm/operations.rst @@ -328,15 +328,15 @@ You can disable this health warning by running the following command: Cluster Configuration Checks ---------------------------- -Cephadm periodically scans each of the hosts in the cluster in order -to understand the state of the OS, disks, NICs etc. These facts can -then be analysed for consistency across the hosts in the cluster to +Cephadm periodically scans each host in the cluster in order +to understand the state of the OS, disks, network interfacess etc. This information can +then be analyzed for consistency across the hosts in the cluster to identify any configuration anomalies. Enabling Cluster Configuration Checks ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The configuration checks are an **optional** feature, and are enabled +These configuration checks are an **optional** feature, and are enabled by running the following command: .. prompt:: bash # @@ -346,7 +346,7 @@ by running the following command: States Returned by Cluster Configuration Checks ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The configuration checks are triggered after each host scan (1m). The +Configuration checks are triggered after each host scan. 
The cephadm log entries will show the current state and outcome of the configuration checks as follows: @@ -383,14 +383,14 @@ To list all the configuration checks and their current states, run the following # ceph cephadm config-check ls NAME HEALTHCHECK STATUS DESCRIPTION - kernel_security CEPHADM_CHECK_KERNEL_LSM enabled checks SELINUX/Apparmor profiles are consistent across cluster hosts - os_subscription CEPHADM_CHECK_SUBSCRIPTION enabled checks subscription states are consistent for all cluster hosts - public_network CEPHADM_CHECK_PUBLIC_MEMBERSHIP enabled check that all hosts have a NIC on the Ceph public_network + kernel_security CEPHADM_CHECK_KERNEL_LSM enabled check that SELINUX/Apparmor profiles are consistent across cluster hosts + os_subscription CEPHADM_CHECK_SUBSCRIPTION enabled check that subscription states are consistent for all cluster hosts + public_network CEPHADM_CHECK_PUBLIC_MEMBERSHIP enabled check that all hosts have a network interface on the Ceph public_network osd_mtu_size CEPHADM_CHECK_MTU enabled check that OSD hosts share a common MTU setting - osd_linkspeed CEPHADM_CHECK_LINKSPEED enabled check that OSD hosts share a common linkspeed - network_missing CEPHADM_CHECK_NETWORK_MISSING enabled checks that the cluster/public networks defined exist on the Ceph hosts - ceph_release CEPHADM_CHECK_CEPH_RELEASE enabled check for Ceph version consistency - ceph daemons should be on the same release (unless upgrade is active) - kernel_version CEPHADM_CHECK_KERNEL_VERSION enabled checks that the MAJ.MIN of the kernel on Ceph hosts is consistent + osd_linkspeed CEPHADM_CHECK_LINKSPEED enabled check that OSD hosts share a common network link speed + network_missing CEPHADM_CHECK_NETWORK_MISSING enabled check that the cluster/public networks as defined exist on the Ceph hosts + ceph_release CEPHADM_CHECK_CEPH_RELEASE enabled check for Ceph version consistency: all Ceph daemons should be the same release unless upgrade is in progress + kernel_version CEPHADM_CHECK_KERNEL_VERSION enabled checks that the maj.min version of the kernel is consistent across Ceph hosts The name of each configuration check can be used to enable or disable a specific check by running a command of the following form: : @@ -414,31 +414,31 @@ flagged as an anomaly and a healthcheck (WARNING) state raised. CEPHADM_CHECK_SUBSCRIPTION ~~~~~~~~~~~~~~~~~~~~~~~~~~ -This check relates to the status of vendor subscription. This check is -performed only for hosts using RHEL, but helps to confirm that all hosts are +This check relates to the status of OS vendor subscription. This check is +performed only for hosts using RHEL and helps to confirm that all hosts are covered by an active subscription, which ensures that patches and updates are available. CEPHADM_CHECK_PUBLIC_MEMBERSHIP ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -All members of the cluster should have NICs configured on at least one of the +All members of the cluster should have a network interface configured on at least one of the public network subnets. Hosts that are not on the public network will rely on routing, which may affect performance. CEPHADM_CHECK_MTU ~~~~~~~~~~~~~~~~~ -The MTU of the NICs on OSDs can be a key factor in consistent performance. This +The MTU of the network interfaces on OSD hosts can be a key factor in consistent performance. This check examines hosts that are running OSD services to ensure that the MTU is -configured consistently within the cluster. This is determined by establishing +configured consistently within the cluster. 
This is determined by determining the MTU setting that the majority of hosts is using. Any anomalies result in a -Ceph health check. +health check. CEPHADM_CHECK_LINKSPEED ~~~~~~~~~~~~~~~~~~~~~~~ -This check is similar to the MTU check. Linkspeed consistency is a factor in -consistent cluster performance, just as the MTU of the NICs on the OSDs is. -This check determines the linkspeed shared by the majority of OSD hosts, and a -health check is run for any hosts that are set at a lower linkspeed rate. +This check is similar to the MTU check. Link speed consistency is a factor in +consistent cluster performance, as is the MTU of the OSD node network interfaces. +This check determines the link speed shared by the majority of OSD hosts, and a +health check is run for any hosts that are set at a lower link speed rate. CEPHADM_CHECK_NETWORK_MISSING ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -448,15 +448,14 @@ a health check is raised. CEPHADM_CHECK_CEPH_RELEASE ~~~~~~~~~~~~~~~~~~~~~~~~~~ -Under normal operations, the Ceph cluster runs daemons under the same ceph -release (that is, the Ceph cluster runs all daemons under (for example) -Octopus). This check determines the active release for each daemon, and +Under normal operations, the Ceph cluster runs daemons that are of the same Ceph +release (for example, Reef). This check determines the active release for each daemon, and reports any anomalies as a healthcheck. *This check is bypassed if an upgrade -process is active within the cluster.* +is in process.* CEPHADM_CHECK_KERNEL_VERSION ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The OS kernel version (maj.min) is checked for consistency across the hosts. +The OS kernel version (maj.min) is checked for consistency across hosts. The kernel version of the majority of the hosts is used as the basis for identifying anomalies. diff --git a/doc/cephadm/services/index.rst b/doc/cephadm/services/index.rst index 82f83bfac..c1da5d15f 100644 --- a/doc/cephadm/services/index.rst +++ b/doc/cephadm/services/index.rst @@ -357,7 +357,9 @@ Or in YAML: Placement by pattern matching ----------------------------- -Daemons can be placed on hosts as well: +Daemons can be placed on hosts using a host pattern as well. +By default, the host pattern is matched using fnmatch which supports +UNIX shell-style wildcards (see https://docs.python.org/3/library/fnmatch.html): .. prompt:: bash # @@ -385,6 +387,26 @@ Or in YAML: placement: host_pattern: "*" +The host pattern also has support for using a regex. To use a regex, you +must either add "regex: " to the start of the pattern when using the +command line, or specify a ``pattern_type`` field to be "regex" +when using YAML. + +On the command line: + +.. prompt:: bash # + + ceph orch apply prometheus --placement='regex:FOO[0-9]|BAR[0-9]' + +In YAML: + +.. code-block:: yaml + + service_type: prometheus + placement: + host_pattern: + pattern: 'FOO[0-9]|BAR[0-9]' + pattern_type: regex Changing the number of daemons ------------------------------ diff --git a/doc/cephadm/services/monitoring.rst b/doc/cephadm/services/monitoring.rst index a17a5ba03..d95504796 100644 --- a/doc/cephadm/services/monitoring.rst +++ b/doc/cephadm/services/monitoring.rst @@ -83,6 +83,37 @@ steps below: ceph orch apply grafana +Enabling security for the monitoring stack +---------------------------------------------- + +By default, in a cephadm-managed cluster, the monitoring components are set up and configured without enabling security measures. 
+While this suffices for certain deployments, others with strict security needs may find it necessary to protect the +monitoring stack against unauthorized access. In such cases, cephadm relies on a specific configuration parameter, +`mgr/cephadm/secure_monitoring_stack`, which toggles the security settings for all monitoring components. To activate security +measures, set this option to ``true`` with a command of the following form: + + .. prompt:: bash # + + ceph config set mgr mgr/cephadm/secure_monitoring_stack true + +This change will trigger a sequence of reconfigurations across all monitoring daemons, typically requiring +few minutes until all components are fully operational. The updated secure configuration includes the following modifications: + +#. Prometheus: basic authentication is required to access the web portal and TLS is enabled for secure communication. +#. Alertmanager: basic authentication is required to access the web portal and TLS is enabled for secure communication. +#. Node Exporter: TLS is enabled for secure communication. +#. Grafana: TLS is enabled and authentication is requiered to access the datasource information. + +In this secure setup, users will need to setup authentication +(username/password) for both Prometheus and Alertmanager. By default the +username and password are set to ``admin``/``admin``. The user can change these +value with the commands ``ceph orch prometheus set-credentials`` and ``ceph +orch alertmanager set-credentials`` respectively. These commands offer the +flexibility to input the username/password either as parameters or via a JSON +file, which enhances security. Additionally, Cephadm provides the commands +`orch prometheus get-credentials` and `orch alertmanager get-credentials` to +retrieve the current credentials. + .. _cephadm-monitoring-centralized-logs: Centralized Logging in Ceph diff --git a/doc/cephadm/services/nfs.rst b/doc/cephadm/services/nfs.rst index 2f12c5916..ab616ddcb 100644 --- a/doc/cephadm/services/nfs.rst +++ b/doc/cephadm/services/nfs.rst @@ -15,7 +15,7 @@ Deploying NFS ganesha ===================== Cephadm deploys NFS Ganesha daemon (or set of daemons). The configuration for -NFS is stored in the ``nfs-ganesha`` pool and exports are managed via the +NFS is stored in the ``.nfs`` pool and exports are managed via the ``ceph nfs export ...`` commands and via the dashboard. To deploy a NFS Ganesha gateway, run the following command: diff --git a/doc/cephadm/services/osd.rst b/doc/cephadm/services/osd.rst index f62b0f831..aa906e239 100644 --- a/doc/cephadm/services/osd.rst +++ b/doc/cephadm/services/osd.rst @@ -232,7 +232,7 @@ Remove an OSD Removing an OSD from a cluster involves two steps: -#. evacuating all placement groups (PGs) from the cluster +#. evacuating all placement groups (PGs) from the OSD #. removing the PG-free OSD from the cluster The following command performs these two steps: diff --git a/doc/cephadm/services/rgw.rst b/doc/cephadm/services/rgw.rst index 20ec39a88..ed0b14936 100644 --- a/doc/cephadm/services/rgw.rst +++ b/doc/cephadm/services/rgw.rst @@ -246,6 +246,7 @@ It is a yaml format file with the following properties: virtual_interface_networks: [ ... ] # optional: list of CIDR networks use_keepalived_multicast: <bool> # optional: Default is False. vrrp_interface_network: <string>/<string> # optional: ex: 192.168.20.0/24 + health_check_interval: <string> # optional: Default is 2s. ssl_cert: | # optional: SSL certificate and key -----BEGIN CERTIFICATE----- ... 
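To show where the new ``health_check_interval`` property fits into an ingress specification, here is a sketch of applying such a spec; the service id, hosts, ports, and virtual IP are illustrative assumptions rather than values taken from this change:

.. prompt:: bash #

   cat > rgw-ingress.yaml <<EOF
   service_type: ingress
   service_id: rgw.default
   placement:
     hosts:
       - host1
       - host2
   spec:
     backend_service: rgw.default
     virtual_ip: 192.168.20.10/24
     frontend_port: 8080
     monitor_port: 1967
     health_check_interval: 2s
   EOF
   ceph orch apply -i rgw-ingress.yaml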
@@ -273,6 +274,7 @@ It is a yaml format file with the following properties: monitor_port: <integer> # ex: 1967, used by haproxy for load balancer status virtual_interface_networks: [ ... ] # optional: list of CIDR networks first_virtual_router_id: <integer> # optional: default 50 + health_check_interval: <string> # optional: Default is 2s. ssl_cert: | # optional: SSL certificate and key -----BEGIN CERTIFICATE----- ... @@ -321,6 +323,9 @@ where the properties of this service specification are: keepalived will have different virtual_router_id. In the case of using ``virtual_ips_list``, each IP will create its own virtual router. So the first one will have ``first_virtual_router_id``, second one will have ``first_virtual_router_id`` + 1, etc. Valid values go from 1 to 255. +* ``health_check_interval`` + Default is 2 seconds. This parameter can be used to set the interval between health checks + for the haproxy with the backend servers. .. _ingress-virtual-ip: diff --git a/doc/cephadm/troubleshooting.rst b/doc/cephadm/troubleshooting.rst index d891ebaf2..a7afaa108 100644 --- a/doc/cephadm/troubleshooting.rst +++ b/doc/cephadm/troubleshooting.rst @@ -32,7 +32,7 @@ completely by running the following commands: ceph orch set backend '' ceph mgr module disable cephadm -These commands disable all of the ``ceph orch ...`` CLI commands. All +These commands disable all ``ceph orch ...`` CLI commands. All previously deployed daemon containers continue to run and will start just as they were before you ran these commands. @@ -56,7 +56,7 @@ following form: ceph orch ls --service_name=<service-name> --format yaml -This will return something in the following form: +This will return information in the following form: .. code-block:: yaml @@ -252,16 +252,17 @@ For more detail on operations of this kind, see Accessing the Admin Socket -------------------------- -Each Ceph daemon provides an admin socket that bypasses the MONs (See -:ref:`rados-monitoring-using-admin-socket`). +Each Ceph daemon provides an admin socket that allows runtime option setting and statistic reading. See +:ref:`rados-monitoring-using-admin-socket`. #. To access the admin socket, enter the daemon container on the host:: [root@mon1 ~]# cephadm enter --name <daemon-name> -#. Run a command of the following form to see the admin socket's configuration:: +#. Run a command of the following forms to see the admin socket's configuration and other available actions:: [ceph: root@mon1 /]# ceph --admin-daemon /var/run/ceph/ceph-<daemon-name>.asok config show + [ceph: root@mon1 /]# ceph --admin-daemon /var/run/ceph/ceph-<daemon-name>.asok help Running Various Ceph Tools -------------------------------- @@ -444,11 +445,11 @@ Running repeated debugging sessions When using ``cephadm shell``, as in the example above, any changes made to the container that is spawned by the shell command are ephemeral. After the shell session exits, the files that were downloaded and installed cease to be -available. You can simply re-run the same commands every time ``cephadm -shell`` is invoked, but in order to save time and resources one can create a -new container image and use it for repeated debugging sessions. +available. You can simply re-run the same commands every time ``cephadm shell`` +is invoked, but to save time and resources you can create a new container image +and use it for repeated debugging sessions. 
-In the following example, we create a simple file that will construct the +In the following example, we create a simple file that constructs the container image. The command below uses podman but it is expected to work correctly even if ``podman`` is replaced with ``docker``:: @@ -463,14 +464,14 @@ correctly even if ``podman`` is replaced with ``docker``:: The above file creates a new local image named ``ceph:debugging``. This image can be used on the same machine that built it. The image can also be pushed to -a container repository or saved and copied to a node runing other Ceph -containers. Consult the ``podman`` or ``docker`` documentation for more +a container repository or saved and copied to a node that is running other Ceph +containers. See the ``podman`` or ``docker`` documentation for more information about the container workflow. After the image has been built, it can be used to initiate repeat debugging sessions. By using an image in this way, you avoid the trouble of having to -re-install the debug tools and debuginfo packages every time you need to run a -debug session. To debug a core file using this image, in the same way as +re-install the debug tools and the debuginfo packages every time you need to +run a debug session. To debug a core file using this image, in the same way as previously described, run: .. prompt:: bash # diff --git a/doc/cephadm/upgrade.rst b/doc/cephadm/upgrade.rst index e0a9f610a..9bb1a6b4d 100644 --- a/doc/cephadm/upgrade.rst +++ b/doc/cephadm/upgrade.rst @@ -2,7 +2,7 @@ Upgrading Ceph ============== -Cephadm can safely upgrade Ceph from one bugfix release to the next. For +Cephadm can safely upgrade Ceph from one point release to the next. For example, you can upgrade from v15.2.0 (the first Octopus release) to the next point release, v15.2.1. @@ -137,25 +137,25 @@ UPGRADE_NO_STANDBY_MGR ---------------------- This alert (``UPGRADE_NO_STANDBY_MGR``) means that Ceph does not detect an -active standby manager daemon. In order to proceed with the upgrade, Ceph -requires an active standby manager daemon (which you can think of in this +active standby Manager daemon. In order to proceed with the upgrade, Ceph +requires an active standby Manager daemon (which you can think of in this context as "a second manager"). -You can ensure that Cephadm is configured to run 2 (or more) managers by +You can ensure that Cephadm is configured to run two (or more) Managers by running the following command: .. prompt:: bash # ceph orch apply mgr 2 # or more -You can check the status of existing mgr daemons by running the following +You can check the status of existing Manager daemons by running the following command: .. prompt:: bash # ceph orch ps --daemon-type mgr -If an existing mgr daemon has stopped, you can try to restart it by running the +If an existing Manager daemon has stopped, you can try to restart it by running the following command: .. prompt:: bash # @@ -183,7 +183,7 @@ Using customized container images ================================= For most users, upgrading requires nothing more complicated than specifying the -Ceph version number to upgrade to. In such cases, cephadm locates the specific +Ceph version to which to upgrade. In such cases, cephadm locates the specific Ceph container image to use by combining the ``container_image_base`` configuration option (default: ``docker.io/ceph/ceph``) with a tag of ``vX.Y.Z``. |
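As a concrete sketch of the upgrade workflow referenced above, an upgrade is normally started by version number and then monitored; the version shown is an illustrative example:

.. prompt:: bash #

   ceph orch upgrade start --ceph-version 18.2.1
   ceph orch upgrade status

When a customized container image is used instead, the same ``ceph orch upgrade start`` command accepts an ``--image`` argument naming the fully qualified image.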