author Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-07 18:45:59 +0000
committer Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-07 18:45:59 +0000
commit 19fcec84d8d7d21e796c7624e521b60d28ee21ed
tree 42d26aa27d1e3f7c0b8bd3fd14e7d7082f5008dc /doc/cephadm/services
parent Initial commit.
Adding upstream version 16.2.11+ds. (refs: upstream/16.2.11+ds, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/cephadm/services')
-rw-r--r--  doc/cephadm/services/custom-container.rst   79
-rw-r--r--  doc/cephadm/services/index.rst             658
-rw-r--r--  doc/cephadm/services/iscsi.rst              80
-rw-r--r--  doc/cephadm/services/mds.rst                49
-rw-r--r--  doc/cephadm/services/mgr.rst                43
-rw-r--r--  doc/cephadm/services/mon.rst               179
-rw-r--r--  doc/cephadm/services/monitoring.rst        457
-rw-r--r--  doc/cephadm/services/nfs.rst               120
-rw-r--r--  doc/cephadm/services/osd.rst               936
-rw-r--r--  doc/cephadm/services/rgw.rst               324
-rw-r--r--  doc/cephadm/services/snmp-gateway.rst      171
11 files changed, 3096 insertions, 0 deletions
diff --git a/doc/cephadm/services/custom-container.rst b/doc/cephadm/services/custom-container.rst
new file mode 100644
index 000000000..3ece248c5
--- /dev/null
+++ b/doc/cephadm/services/custom-container.rst
@@ -0,0 +1,79 @@
+========================
+Custom Container Service
+========================
+
+The orchestrator enables custom containers to be deployed using a YAML file.
+A corresponding :ref:`orchestrator-cli-service-spec` must look like:
+
+.. code-block:: yaml
+
+    service_type: container
+    service_id: foo
+    placement:
+      ...
+    spec:
+      image: docker.io/library/foo:latest
+      entrypoint: /usr/bin/foo
+      uid: 1000
+      gid: 1000
+      args:
+        - "--net=host"
+        - "--cpus=2"
+      ports:
+        - 8080
+        - 8443
+      envs:
+        - SECRET=mypassword
+        - PORT=8080
+        - PUID=1000
+        - PGID=1000
+      volume_mounts:
+        CONFIG_DIR: /etc/foo
+      bind_mounts:
+        - ['type=bind', 'source=lib/modules', 'destination=/lib/modules', 'ro=true']
+      dirs:
+        - CONFIG_DIR
+      files:
+        CONFIG_DIR/foo.conf:
+          - refresh=true
+          - username=xyz
+          - "port: 1234"
+
+where the properties of a service specification are:
+
+* ``service_id``
+ A unique name of the service.
+* ``image``
+ The name of the Docker image.
+* ``uid``
+ The UID to use when creating directories and files in the host system.
+* ``gid``
+ The GID to use when creating directories and files in the host system.
+* ``entrypoint``
+ Overwrite the default ENTRYPOINT of the image.
+* ``args``
+ A list of additional Podman/Docker command line arguments.
+* ``ports``
+ A list of TCP ports to open in the host firewall.
+* ``envs``
+ A list of environment variables.
+* ``bind_mounts``
+ When you use a bind mount, a file or directory on the host machine
+ is mounted into the container. Relative ``source=...`` paths will be
+ located below ``/var/lib/ceph/<cluster-fsid>/<daemon-name>``.
+* ``volume_mounts``
+ When you use a volume mount, a new directory is created within
+ Docker’s storage directory on the host machine, and Docker manages
+ that directory’s contents. Relative source paths will be located below
+ ``/var/lib/ceph/<cluster-fsid>/<daemon-name>``.
+* ``dirs``
+ A list of directories that are created below
+ ``/var/lib/ceph/<cluster-fsid>/<daemon-name>``.
+* ``files``
+ A dictionary, where the key is the relative path of the file and the
+ value is the file content. The content must be double quoted when using
+ a string. Use '\\n' for line breaks in that case. Otherwise define
+ multi-line content as a list of strings. The given files will be created
+ below the directory ``/var/lib/ceph/<cluster-fsid>/<daemon-name>``.
+ The absolute path of the directory where the file will be created must
+ exist. Use the ``dirs`` property to create it if necessary.
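+
+As a rough sketch of the two content styles described for ``files`` (the file
+name ``bar.conf`` and all values here are illustrative assumptions only), a
+double-quoted string with ``\n`` line breaks and a list of strings might be
+written like this:
+
+.. code-block:: yaml
+
+    spec:
+      dirs:
+        - CONFIG_DIR            # created first so the files have a parent directory
+      files:
+        # Single double-quoted string; "\n" marks each line break.
+        CONFIG_DIR/foo.conf: "refresh=true\nusername=xyz\n"
+        # The same kind of content as a list, one entry per line.
+        CONFIG_DIR/bar.conf:
+          - refresh=true
+          - username=xyz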
diff --git a/doc/cephadm/services/index.rst b/doc/cephadm/services/index.rst
new file mode 100644
index 000000000..26fd8864a
--- /dev/null
+++ b/doc/cephadm/services/index.rst
@@ -0,0 +1,658 @@
+==================
+Service Management
+==================
+
+A service is a group of daemons configured together. See these chapters
+for details on individual services:
+
+.. toctree::
+ :maxdepth: 1
+
+ mon
+ mgr
+ osd
+ rgw
+ mds
+ nfs
+ iscsi
+ custom-container
+ monitoring
+ snmp-gateway
+
+Service Status
+==============
+
+
+To see the status of one
+of the services running in the Ceph cluster, do the following:
+
+#. Use the command line to print a list of services.
+#. Locate the service whose status you want to check.
+#. Print the status of the service.
+
+The following command prints a list of services known to the orchestrator. To
+limit the output to services only on a specified host, use the optional
+``--host`` parameter. To limit the output to services of only a particular
+type, use the optional ``--service_type`` parameter (mon, osd, mgr, mds, rgw):
+
+ .. prompt:: bash #
+
+ ceph orch ls [--service_type type] [--service_name name] [--export] [--format f] [--refresh]
+
+Discover the status of a particular service or daemon:
+
+ .. prompt:: bash #
+
+ ceph orch ls --service_type type --service_name <name> [--refresh]
+
+To export the service specifications known to the orchestrator, run the following command:
+
+ .. prompt:: bash #
+
+ ceph orch ls --export
+
+The service specifications are exported as YAML, which can then be passed to
+the ``ceph orch apply -i`` command.
+
+For information about retrieving the specifications of single services (including examples of commands), see :ref:`orchestrator-cli-service-spec-retrieve`.
+
+Daemon Status
+=============
+
+A daemon is a systemd unit that is running and part of a service.
+
+To see the status of a daemon, do the following:
+
+#. Print a list of all daemons known to the orchestrator.
+#. Query the status of the target daemon.
+
+First, print a list of all daemons known to the orchestrator:
+
+ .. prompt:: bash #
+
+ ceph orch ps [--hostname host] [--daemon_type type] [--service_name name] [--daemon_id id] [--format f] [--refresh]
+
+Then query the status of a particular service instance (mon, osd, mds, rgw).
+For OSDs the id is the numeric OSD ID. For MDS services the id is the file
+system name:
+
+ .. prompt:: bash #
+
+ ceph orch ps --daemon_type osd --daemon_id 0
+
+.. _orchestrator-cli-service-spec:
+
+Service Specification
+=====================
+
+A *Service Specification* is a data structure that is used to specify the
+deployment of services. In addition to parameters such as ``placement`` or
+``networks``, the user can set initial values of service configuration parameters
+by means of the ``config`` section. For each param/value configuration pair,
+cephadm calls the following command to set its value:
+
+ .. prompt:: bash #
+
+ ceph config set <service-name> <param> <value>
+
+cephadm raises health warnings if invalid configuration parameters are
+found in the spec (``CEPHADM_INVALID_CONFIG_OPTION``) or if an error occurs while
+applying the new configuration option(s) (``CEPHADM_FAILED_SET_OPTION``).
+
+Here is an example of a service specification in YAML:
+
+.. code-block:: yaml
+
+    service_type: rgw
+    service_id: realm.zone
+    placement:
+      hosts:
+        - host1
+        - host2
+        - host3
+    config:
+      param_1: val_1
+      ...
+      param_N: val_N
+    unmanaged: false
+    networks:
+      - 192.169.142.0/24
+    spec:
+      # Additional service specific attributes.
+
+In this example, the properties of this service specification are:
+
+.. py:currentmodule:: ceph.deployment.service_spec
+
+.. autoclass:: ServiceSpec
+ :members:
+
+Each service type can have additional service-specific properties.
+
+Service specifications of type ``mon``, ``mgr``, and the monitoring
+types do not require a ``service_id``.
+
+A service of type ``osd`` is described in :ref:`drivegroups`
+
+Many service specifications can be applied at once using ``ceph orch apply -i``
+by submitting a multi-document YAML file::
+
+    cat <<EOF | ceph orch apply -i -
+    service_type: mon
+    placement:
+      host_pattern: "mon*"
+    ---
+    service_type: mgr
+    placement:
+      host_pattern: "mgr*"
+    ---
+    service_type: osd
+    service_id: default_drive_group
+    placement:
+      host_pattern: "osd*"
+    data_devices:
+      all: true
+    EOF
+
+.. _orchestrator-cli-service-spec-retrieve:
+
+Retrieving the running Service Specification
+--------------------------------------------
+
+If the services have been started via ``ceph orch apply...``, then directly changing
+the Service Specification is complicated. Instead of attempting to change it
+directly, we suggest exporting the running Service Specification by
+running commands of the following form:
+
+ .. prompt:: bash #
+
+ ceph orch ls --service-name rgw.<realm>.<zone> --export > rgw.<realm>.<zone>.yaml
+ ceph orch ls --service-type mgr --export > mgr.yaml
+ ceph orch ls --export > cluster.yaml
+
+The Specification can then be changed and re-applied as above.
+
+Updating Service Specifications
+-------------------------------
+
+The Ceph Orchestrator maintains a declarative state of each
+service in a ``ServiceSpec``. For certain operations, like updating
+the RGW HTTP port, we need to update the existing
+specification.
+
+1. List the current ``ServiceSpec``:
+
+ .. prompt:: bash #
+
+ ceph orch ls --service_name=<service-name> --export > myservice.yaml
+
+2. Update the yaml file:
+
+ .. prompt:: bash #
+
+ vi myservice.yaml
+
+3. Apply the new ``ServiceSpec``:
+
+ .. prompt:: bash #
+
+ ceph orch apply -i myservice.yaml [--dry-run]
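+
+As a hedged sketch of step 2 for the RGW HTTP port case mentioned above, the
+edited ``myservice.yaml`` might look like the following. The service name,
+host, and port value are illustrative assumptions, and ``rgw_frontend_port``
+is assumed here to be the RGW spec field that holds the port:
+
+.. code-block:: yaml
+
+    service_type: rgw
+    service_id: myrgw            # hypothetical service
+    placement:
+      hosts:
+        - host1
+    spec:
+      rgw_frontend_port: 8080    # changed value; the previous port is replaced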
+
+.. _orchestrator-cli-placement-spec:
+
+Daemon Placement
+================
+
+For the orchestrator to deploy a *service*, it needs to know where to deploy
+*daemons*, and how many to deploy. This is the role of a placement
+specification. Placement specifications can either be passed as command line arguments
+or in a YAML file.
+
+.. note::
+
+ cephadm will not deploy daemons on hosts with the ``_no_schedule`` label; see :ref:`cephadm-special-host-labels`.
+
+.. note::
+ The **apply** command can be confusing. For this reason, we recommend using
+ YAML specifications.
+
+ Each ``ceph orch apply <service-name>`` command supersedes the one before it.
+ If you do not use the proper syntax, you will clobber your work
+ as you go.
+
+ For example:
+
+ .. prompt:: bash #
+
+ ceph orch apply mon host1
+ ceph orch apply mon host2
+ ceph orch apply mon host3
+
+ This results in only one host having a monitor applied to it: host3.
+
+ (The first command creates a monitor on host1. Then the second command
+ clobbers the monitor on host1 and creates a monitor on host2. Then the
+ third command clobbers the monitor on host2 and creates a monitor on
+ host3. In this scenario, at this point, there is a monitor ONLY on
+ host3.)
+
+ To make certain that a monitor is applied to each of these three hosts,
+ run a command like this:
+
+ .. prompt:: bash #
+
+ ceph orch apply mon "host1,host2,host3"
+
+ There is another way to apply monitors to multiple hosts: a ``yaml`` file
+ can be used. Instead of using the "ceph orch apply mon" commands, run a
+ command of this form:
+
+ .. prompt:: bash #
+
+ ceph orch apply -i file.yaml
+
+ Here is a sample **file.yaml** file:
+
+ .. code-block:: yaml
+
+     service_type: mon
+     placement:
+       hosts:
+         - host1
+         - host2
+         - host3
+
+Explicit placements
+-------------------
+
+Daemons can be explicitly placed on hosts by simply specifying them:
+
+ .. prompt:: bash #
+
+    ceph orch apply prometheus --placement="host1 host2 host3"
+
+Or in YAML:
+
+.. code-block:: yaml
+
+    service_type: prometheus
+    placement:
+      hosts:
+        - host1
+        - host2
+        - host3
+
+MONs and other services may require some enhanced network specifications:
+
+ .. prompt:: bash #
+
+    ceph orch daemon add mon --placement="myhost:[v2:1.2.3.4:3300,v1:1.2.3.4:6789]=name"
+
+where ``[v2:1.2.3.4:3300,v1:1.2.3.4:6789]`` is the network address of the monitor
+and ``=name`` specifies the name of the new monitor.
+
+.. _orch-placement-by-labels:
+
+Placement by labels
+-------------------
+
+Daemon placement can be limited to hosts that match a specific label. To add
+the label ``mylabel`` to the appropriate hosts, run this command:
+
+ .. prompt:: bash #
+
+ ceph orch host label add *<hostname>* mylabel
+
+ To view the current hosts and labels, run this command:
+
+ .. prompt:: bash #
+
+ ceph orch host ls
+
+ For example:
+
+ .. prompt:: bash #
+
+ ceph orch host label add host1 mylabel
+ ceph orch host label add host2 mylabel
+ ceph orch host label add host3 mylabel
+ ceph orch host ls
+
+ .. code-block:: bash
+
+    HOST   ADDR   LABELS   STATUS
+    host1         mylabel
+    host2         mylabel
+    host3         mylabel
+    host4
+    host5
+
+Now, tell cephadm to deploy daemons based on the label by running
+this command:
+
+ .. prompt:: bash #
+
+    ceph orch apply prometheus --placement="label:mylabel"
+
+Or in YAML:
+
+.. code-block:: yaml
+
+    service_type: prometheus
+    placement:
+      label: "mylabel"
+
+* See :ref:`orchestrator-host-labels`
+
+Placement by pattern matching
+-----------------------------
+
+Daemons can also be placed on hosts whose names match a pattern:
+
+ .. prompt:: bash #
+
+    ceph orch apply prometheus --placement='myhost[1-3]'
+
+Or in YAML:
+
+.. code-block:: yaml
+
+    service_type: prometheus
+    placement:
+      host_pattern: "myhost[1-3]"
+
+To place a service on *all* hosts, use ``"*"``:
+
+ .. prompt:: bash #
+
+    ceph orch apply node-exporter --placement='*'
+
+Or in YAML:
+
+.. code-block:: yaml
+
+    service_type: node-exporter
+    placement:
+      host_pattern: "*"
+
+
+Changing the number of daemons
+------------------------------
+
+If ``count`` is specified, cephadm creates only that number of daemons:
+
+ .. prompt:: bash #
+
+    ceph orch apply prometheus --placement=3
+
+To deploy *daemons* on a subset of hosts, specify the count:
+
+ .. prompt:: bash #
+
+    ceph orch apply prometheus --placement="2 host1 host2 host3"
+
+If the count is greater than the number of hosts, cephadm deploys one per host:
+
+ .. prompt:: bash #
+
+    ceph orch apply prometheus --placement="3 host1 host2"
+
+The command immediately above results in two Prometheus daemons.
+
+YAML can also be used to specify limits, in the following way:
+
+.. code-block:: yaml
+
+    service_type: prometheus
+    placement:
+      count: 3
+
+YAML can also be used to specify limits on hosts:
+
+.. code-block:: yaml
+
+    service_type: prometheus
+    placement:
+      count: 2
+      hosts:
+        - host1
+        - host2
+        - host3
+
+.. _cephadm_co_location:
+
+Co-location of daemons
+----------------------
+
+Cephadm supports the deployment of multiple daemons on the same host:
+
+.. code-block:: yaml
+
+    service_type: rgw
+    placement:
+      label: rgw
+      count_per_host: 2
+
+The main reason for deploying multiple daemons per host is the additional
+performance that can be gained by running multiple RGW and MDS daemons on the same host.
+
+See also:
+
+* :ref:`cephadm_mgr_co_location`.
+* :ref:`cephadm-rgw-designated_gateways`.
+
+This feature was introduced in Pacific.
+
+Algorithm description
+---------------------
+
+Cephadm's declarative state consists of a list of service specifications
+containing placement specifications.
+
+Cephadm continually compares a list of daemons actually running in the cluster
+against the list in the service specifications. Cephadm adds new daemons and
+removes old daemons as necessary in order to conform to the service
+specifications.
+
+Cephadm does the following to maintain compliance with the service
+specifications.
+
+Cephadm first selects a list of candidate hosts. Cephadm seeks explicit host
+names and selects them. If cephadm finds no explicit host names, it looks for
+label specifications. If no label is defined in the specification, cephadm
+selects hosts based on a host pattern. If no host pattern is defined, as a last
+resort, cephadm selects all known hosts as candidates.
+
+Cephadm is aware of existing daemons running services and tries to avoid moving
+them.
+
+Cephadm supports the deployment of a specific number of daemons for a service.
+Consider the following service specification:
+
+.. code-block:: yaml
+
+    service_type: mds
+    service_id: myfs
+    placement:
+      count: 3
+      label: myfs
+
+This service specification instructs cephadm to deploy three daemons on hosts
+labeled ``myfs`` across the cluster.
+
+If there are fewer than three daemons deployed on the candidate hosts, cephadm
+randomly chooses hosts on which to deploy new daemons.
+
+If there are more than three daemons deployed on the candidate hosts, cephadm
+removes existing daemons.
+
+Finally, cephadm removes daemons on hosts that are outside of the list of
+candidate hosts.
+
+.. note::
+
+ There is a special case that cephadm must consider.
+
+ If there are fewer hosts selected by the placement specification than
+ demanded by ``count``, cephadm will deploy only on the selected hosts.
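+
+To illustrate the selection order described above, here is a hedged sketch
+using a hypothetical five-host cluster (``host1`` to ``host5``) in which only
+``host1``, ``host2``, and ``host3`` carry the label ``myfs``. The outcome
+noted in the comments follows from the rules in this section; which of the
+labeled hosts are picked is not determined by the spec:
+
+.. code-block:: yaml
+
+    service_type: mds
+    service_id: myfs
+    placement:
+      label: myfs   # no explicit hosts are given, so the label selects the candidates
+      count: 2      # candidates: host1, host2, host3; two of them receive a daemon
+    # Any myfs MDS daemon found on host4 or host5 lies outside the candidate
+    # list and is removed. Daemons already running on candidate hosts are
+    # kept where possible rather than moved.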
+
+Extra Container Arguments
+=========================
+
+.. warning::
+ The arguments provided for extra container args are limited to whatever arguments are available for a ``run`` command from whichever container engine you are using. Providing any arguments the ``run`` command does not support (or invalid values for arguments) will cause the daemon to fail to start.
+
+
+Cephadm supports providing extra miscellaneous container arguments for
+specific cases when they may be necessary. For example, a user who needs
+to limit the number of CPUs that their mon daemons use could apply
+a spec like this:
+
+.. code-block:: yaml
+
+    service_type: mon
+    service_name: mon
+    placement:
+      hosts:
+        - host1
+        - host2
+        - host3
+    extra_container_args:
+      - "--cpus=2"
+
+This would cause each mon daemon to be deployed with ``--cpus=2``.
+
+Mounting Files with Extra Container Arguments
+---------------------------------------------
+
+A common use case for extra container arguments is to mount additional
+files within the container. However, some intuitive formats for doing
+so can cause deployment to fail (see https://tracker.ceph.com/issues/57338).
+The recommended syntax for mounting a file with extra container arguments is:
+
+.. code-block:: yaml
+
+ extra_container_args:
+ - "-v"
+ - "/absolute/file/path/on/host:/absolute/file/path/in/container"
+
+For example:
+
+.. code-block:: yaml
+
+ extra_container_args:
+ - "-v"
+ - "/opt/ceph_cert/host.cert:/etc/grafana/certs/cert_file:ro"
+
+.. _orch-rm:
+
+Removing a Service
+==================
+
+In order to remove a service including the removal
+of all daemons of that service, run
+
+.. prompt:: bash
+
+ ceph orch rm <service-name>
+
+For example:
+
+.. prompt:: bash
+
+ ceph orch rm rgw.myrgw
+
+.. _cephadm-spec-unmanaged:
+
+Disabling automatic deployment of daemons
+=========================================
+
+Cephadm supports disabling the automated deployment and removal of daemons on a
+per service basis. The CLI supports two commands for this.
+
+In order to fully remove a service, see :ref:`orch-rm`.
+
+Disabling automatic management of daemons
+-----------------------------------------
+
+To disable the automatic management of daemons, set ``unmanaged=True`` in the
+:ref:`orchestrator-cli-service-spec` (``mgr.yaml``).
+
+``mgr.yaml``:
+
+.. code-block:: yaml
+
+    service_type: mgr
+    unmanaged: true
+    placement:
+      label: mgr
+
+
+.. prompt:: bash #
+
+ ceph orch apply -i mgr.yaml
+
+
+.. note::
+
+ After you apply this change in the Service Specification, cephadm will no
+ longer deploy any new daemons (even if the placement specification matches
+ additional hosts).
+
+Deploying a daemon on a host manually
+-------------------------------------
+
+.. note::
+
+ This workflow has a very limited use case and should only be used
+ in rare circumstances.
+
+To manually deploy a daemon on a host, follow these steps:
+
+Modify the service spec for a service by getting the
+existing spec, adding ``unmanaged: true``, and applying the modified spec.
+
+Then manually deploy the daemon using the following:
+
+ .. prompt:: bash #
+
+ ceph orch daemon add <daemon-type> --placement=<placement spec>
+
+For example:
+
+ .. prompt:: bash #
+
+ ceph orch daemon add mgr --placement=my_host
+
+.. note::
+
+ Removing ``unmanaged: true`` from the service spec will
+ enable the reconciliation loop for this service and will
+ potentially lead to the removal of the daemon, depending
+ on the placement spec.
+
+Removing a daemon from a host manually
+--------------------------------------
+
+To manually remove a daemon, run a command of the following form:
+
+ .. prompt:: bash #
+
+ ceph orch daemon rm <daemon name>... [--force]
+
+For example:
+
+ .. prompt:: bash #
+
+ ceph orch daemon rm mgr.my_host.xyzxyz
+
+.. note::
+
+ For managed services (``unmanaged=False``), cephadm will automatically
+ deploy a new daemon a few seconds later.
+
+See also
+--------
+
+* See :ref:`cephadm-osd-declarative` for special handling of unmanaged OSDs.
+* See also :ref:`cephadm-pause`
diff --git a/doc/cephadm/services/iscsi.rst b/doc/cephadm/services/iscsi.rst
new file mode 100644
index 000000000..e039e8d9a
--- /dev/null
+++ b/doc/cephadm/services/iscsi.rst
@@ -0,0 +1,80 @@
+=============
+iSCSI Service
+=============
+
+.. _cephadm-iscsi:
+
+Deploying iSCSI
+===============
+
+To deploy an iSCSI gateway, create a yaml file containing a
+service specification for iscsi:
+
+.. code-block:: yaml
+
+    service_type: iscsi
+    service_id: iscsi
+    placement:
+      hosts:
+        - host1
+        - host2
+    spec:
+      pool: mypool  # RADOS pool where ceph-iscsi config data is stored.
+      trusted_ip_list: "IP_ADDRESS_1,IP_ADDRESS_2"
+      api_port: ... # optional
+      api_user: ... # optional
+      api_password: ... # optional
+      api_secure: true/false # optional
+      ssl_cert: | # optional
+        ...
+      ssl_key: | # optional
+        ...
+
+For example:
+
+.. code-block:: yaml
+
+    service_type: iscsi
+    service_id: iscsi
+    placement:
+      hosts:
+        - [...]
+    spec:
+      pool: iscsi_pool
+      trusted_ip_list: "IP_ADDRESS_1,IP_ADDRESS_2,IP_ADDRESS_3,..."
+      api_user: API_USERNAME
+      api_password: API_PASSWORD
+      ssl_cert: |
+        -----BEGIN CERTIFICATE-----
+        MIIDtTCCAp2gAwIBAgIYMC4xNzc1NDQxNjEzMzc2MjMyXzxvQ7EcMA0GCSqGSIb3
+        DQEBCwUAMG0xCzAJBgNVBAYTAlVTMQ0wCwYDVQQIDARVdGFoMRcwFQYDVQQHDA5T
+        [...]
+        -----END CERTIFICATE-----
+      ssl_key: |
+        -----BEGIN PRIVATE KEY-----
+        MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQC5jdYbjtNTAKW4
+        /CwQr/7wOiLGzVxChn3mmCIF3DwbL/qvTFTX2d8bDf6LjGwLYloXHscRfxszX/4h
+        [...]
+        -----END PRIVATE KEY-----
+
+.. py:currentmodule:: ceph.deployment.service_spec
+
+.. autoclass:: IscsiServiceSpec
+ :members:
+
+
+The specification can then be applied using:
+
+.. prompt:: bash #
+
+ ceph orch apply -i iscsi.yaml
+
+
+See :ref:`orchestrator-cli-placement-spec` for details of the placement specification.
+
+See also: :ref:`orchestrator-cli-service-spec`.
+
+Further Reading
+===============
+
+* RBD: :ref:`ceph-iscsi`
diff --git a/doc/cephadm/services/mds.rst b/doc/cephadm/services/mds.rst
new file mode 100644
index 000000000..949a0fa5d
--- /dev/null
+++ b/doc/cephadm/services/mds.rst
@@ -0,0 +1,49 @@
+===========
+MDS Service
+===========
+
+
+.. _orchestrator-cli-cephfs:
+
+Deploy CephFS
+=============
+
+One or more MDS daemons are required to use the :term:`CephFS` file system.
+These are created automatically if the newer ``ceph fs volume``
+interface is used to create a new file system. For more information,
+see :ref:`fs-volumes-and-subvolumes`.
+
+For example:
+
+.. prompt:: bash #
+
+ ceph fs volume create <fs_name> --placement="<placement spec>"
+
+where ``fs_name`` is the name of the CephFS and ``placement`` is a
+:ref:`orchestrator-cli-placement-spec`.
+
+To deploy MDS daemons manually, use this specification:
+
+.. code-block:: yaml
+
+    service_type: mds
+    service_id: fs_name
+    placement:
+      count: 3
+
+
+The specification can then be applied using:
+
+.. prompt:: bash #
+
+ ceph orch apply -i mds.yaml
+
+See :ref:`orchestrator-cli-stateless-services` for manually deploying
+MDS daemons on the CLI.
+
+Further Reading
+===============
+
+* :ref:`ceph-file-system`
+
+
diff --git a/doc/cephadm/services/mgr.rst b/doc/cephadm/services/mgr.rst
new file mode 100644
index 000000000..133a00d77
--- /dev/null
+++ b/doc/cephadm/services/mgr.rst
@@ -0,0 +1,43 @@
+.. _mgr-cephadm-mgr:
+
+===========
+MGR Service
+===========
+
+The cephadm MGR service hosts various modules, such as the :ref:`mgr-dashboard`
+and the cephadm manager module.
+
+.. _cephadm-mgr-networks:
+
+Specifying Networks
+-------------------
+
+The MGR service supports binding only to a specific IP within a network.
+
+Example spec file (leveraging a default placement):
+
+.. code-block:: yaml
+
+    service_type: mgr
+    networks:
+      - 192.169.142.0/24
+
+.. _cephadm_mgr_co_location:
+
+Allow co-location of MGR daemons
+================================
+
+In deployment scenarios with just a single host, cephadm still needs
+to deploy at least two MGR daemons in order to allow an automated
+upgrade of the cluster. See ``mgr_standby_modules`` in
+the :ref:`mgr-administrator-guide` for further details.
+
+See also: :ref:`cephadm_co_location`.
+
+
+Further Reading
+===============
+
+* :ref:`ceph-manager-daemon`
+* :ref:`cephadm-manually-deploy-mgr`
+
diff --git a/doc/cephadm/services/mon.rst b/doc/cephadm/services/mon.rst
new file mode 100644
index 000000000..6326b73f4
--- /dev/null
+++ b/doc/cephadm/services/mon.rst
@@ -0,0 +1,179 @@
+===========
+MON Service
+===========
+
+.. _deploy_additional_monitors:
+
+Deploying additional monitors
+=============================
+
+A typical Ceph cluster has three or five monitor daemons that are spread
+across different hosts. We recommend deploying five monitors if there are
+five or more nodes in your cluster.
+
+.. _CIDR: https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation
+
+Ceph deploys monitor daemons automatically as the cluster grows and Ceph
+scales back monitor daemons automatically as the cluster shrinks. The
+smooth execution of this automatic growing and shrinking depends upon
+proper subnet configuration.
+
+The cephadm bootstrap procedure assigns the first monitor daemon in the
+cluster to a particular subnet. ``cephadm`` designates that subnet as the
+default subnet of the cluster. New monitor daemons will be assigned by
+default to that subnet unless cephadm is instructed to do otherwise.
+
+If all of the ceph monitor daemons in your cluster are in the same subnet,
+manual administration of the ceph monitor daemons is not necessary.
+``cephadm`` will automatically add up to five monitors to the subnet, as
+needed, as new hosts are added to the cluster.
+
+By default, cephadm will deploy 5 daemons on arbitrary hosts. See
+:ref:`orchestrator-cli-placement-spec` for details of specifying
+the placement of daemons.
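+
+As a hedged illustration of overriding that default (the count of three is an
+arbitrary example value, not a recommendation drawn from this page), a
+placement specification for monitors might look like this:
+
+.. code-block:: yaml
+
+    service_type: mon
+    placement:
+      count: 3   # deploy three monitors instead of the default five
+
+Such a file could then be applied with ``ceph orch apply -i``, as described
+in :ref:`orchestrator-cli-service-spec`.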
+
+Designating a Particular Subnet for Monitors
+--------------------------------------------
+
+To designate a particular IP subnet for use by ceph monitor daemons, use a
+command of the following form, including the subnet's address in `CIDR`_
+format (e.g., ``10.1.2.0/24``):
+
+ .. prompt:: bash #
+
+ ceph config set mon public_network *<mon-cidr-network>*
+
+ For example:
+
+ .. prompt:: bash #
+
+ ceph config set mon public_network 10.1.2.0/24
+
+Cephadm deploys new monitor daemons only on hosts that have IP addresses in
+the designated subnet.
+
+You can also specify two public networks by using a list of networks:
+
+ .. prompt:: bash #
+
+ ceph config set mon public_network *<mon-cidr-network1>,<mon-cidr-network2>*
+
+ For example:
+
+ .. prompt:: bash #
+
+    ceph config set mon public_network 10.1.2.0/24,192.168.0.0/24
+
+
+Deploying Monitors on a Particular Network
+------------------------------------------
+
+You can explicitly specify the IP address or CIDR network for each monitor and
+control where each monitor is placed. To disable automated monitor deployment,
+run this command:
+
+ .. prompt:: bash #
+
+ ceph orch apply mon --unmanaged
+
+ To deploy each additional monitor:
+
+ .. prompt:: bash #
+
+    ceph orch daemon add mon *<host1:ip-or-network1>*
+
+ For example, to deploy a second monitor on ``newhost1`` using an IP
+ address ``10.1.2.123`` and a third monitor on ``newhost2`` in
+ network ``10.1.2.0/24``, run the following commands:
+
+ .. prompt:: bash #
+
+ ceph orch apply mon --unmanaged
+ ceph orch daemon add mon newhost1:10.1.2.123
+ ceph orch daemon add mon newhost2:10.1.2.0/24
+
+ Now, enable automatic placement of daemons:
+
+ .. prompt:: bash #
+
+ ceph orch apply mon --placement="newhost1,newhost2,newhost3" --dry-run
+
+ See :ref:`orchestrator-cli-placement-spec` for details of specifying
+ the placement of daemons.
+
+ Finally, apply this new placement by dropping ``--dry-run``:
+
+ .. prompt:: bash #
+
+ ceph orch apply mon --placement="newhost1,newhost2,newhost3"
+
+
+Moving Monitors to a Different Network
+--------------------------------------
+
+To move Monitors to a new network, deploy new monitors on the new network and
+subsequently remove monitors from the old network. It is not advised to
+modify and inject the ``monmap`` manually.
+
+First, disable the automated placement of daemons:
+
+ .. prompt:: bash #
+
+ ceph orch apply mon --unmanaged
+
+To deploy each additional monitor:
+
+ .. prompt:: bash #
+
+ ceph orch daemon add mon *<newhost1:ip-or-network1>*
+
+For example, to deploy a second monitor on ``newhost1`` using an IP
+address ``10.1.2.123`` and a third monitor on ``newhost2`` in
+network ``10.1.2.0/24``, run the following commands:
+
+ .. prompt:: bash #
+
+ ceph orch apply mon --unmanaged
+ ceph orch daemon add mon newhost1:10.1.2.123
+ ceph orch daemon add mon newhost2:10.1.2.0/24
+
+ Subsequently remove monitors from the old network:
+
+ .. prompt:: bash #
+
+ ceph orch daemon rm *mon.<oldhost1>*
+
+ Update the ``public_network``:
+
+ .. prompt:: bash #
+
+ ceph config set mon public_network *<mon-cidr-network>*
+
+ For example:
+
+ .. prompt:: bash #
+
+ ceph config set mon public_network 10.1.2.0/24
+
+ Now, enable automatic placement of daemons:
+
+ .. prompt:: bash #
+
+ ceph orch apply mon --placement="newhost1,newhost2,newhost3" --dry-run
+
+ See :ref:`orchestrator-cli-placement-spec` for details of specifying
+ the placement of daemons.
+
+ Finally, apply this new placement by dropping ``--dry-run``:
+
+ .. prompt:: bash #
+
+ ceph orch apply mon --placement="newhost1,newhost2,newhost3"
+
+Further Reading
+===============
+
+* :ref:`rados-operations`
+* :ref:`rados-troubleshooting-mon`
+* :ref:`cephadm-restore-quorum`
+
diff --git a/doc/cephadm/services/monitoring.rst b/doc/cephadm/services/monitoring.rst
new file mode 100644
index 000000000..86e3e3f69
--- /dev/null
+++ b/doc/cephadm/services/monitoring.rst
@@ -0,0 +1,457 @@
+.. _mgr-cephadm-monitoring:
+
+Monitoring Services
+===================
+
+Ceph Dashboard uses `Prometheus <https://prometheus.io/>`_, `Grafana
+<https://grafana.com/>`_, and related tools to store and visualize detailed
+metrics on cluster utilization and performance. Ceph users have three options:
+
+#. Have cephadm deploy and configure these services. This is the default
+ when bootstrapping a new cluster unless the ``--skip-monitoring-stack``
+ option is used.
+#. Deploy and configure these services manually. This is recommended for users
+ with existing prometheus services in their environment (and in cases where
+ Ceph is running in Kubernetes with Rook).
+#. Skip the monitoring stack completely. Some Ceph dashboard graphs will
+ not be available.
+
+The monitoring stack consists of `Prometheus <https://prometheus.io/>`_,
+Prometheus exporters (:ref:`mgr-prometheus`, `Node exporter
+<https://prometheus.io/docs/guides/node-exporter/>`_), `Prometheus Alert
+Manager <https://prometheus.io/docs/alerting/alertmanager/>`_ and `Grafana
+<https://grafana.com/>`_.
+
+.. note::
+
+ Prometheus' security model presumes that untrusted users have access to the
+ Prometheus HTTP endpoint and logs. Untrusted users have access to all the
+ (meta)data Prometheus collects that is contained in the database, plus a
+ variety of operational and debugging information.
+
+ However, Prometheus' HTTP API is limited to read-only operations.
+ Configurations can *not* be changed using the API and secrets are not
+ exposed. Moreover, Prometheus has some built-in measures to mitigate the
+ impact of denial of service attacks.
+
+ Please see `Prometheus' Security model
+ <https://prometheus.io/docs/operating/security/>`_ for more detailed
+ information.
+
+Deploying monitoring with cephadm
+---------------------------------
+
+The default behavior of ``cephadm`` is to deploy a basic monitoring stack. It
+is possible, however, to have a Ceph cluster without a monitoring stack and to
+want to add one later. This can happen if you passed the
+``--skip-monitoring-stack`` option to ``cephadm`` when bootstrapping the
+cluster, or if you converted an existing cluster (which had no monitoring
+stack) to cephadm management.
+
+To set up monitoring on a Ceph cluster that has no monitoring, follow the
+steps below:
+
+#. Deploy a node-exporter service on every node of the cluster. The node-exporter provides host-level metrics like CPU and memory utilization:
+
+ .. prompt:: bash #
+
+ ceph orch apply node-exporter
+
+#. Deploy alertmanager:
+
+ .. prompt:: bash #
+
+ ceph orch apply alertmanager
+
+#. Deploy Prometheus. A single Prometheus instance is sufficient, but
+ for high availability (HA) you might want to deploy two:
+
+ .. prompt:: bash #
+
+ ceph orch apply prometheus
+
+ or
+
+ .. prompt:: bash #
+
+ ceph orch apply prometheus --placement 'count:2'
+
+#. Deploy grafana:
+
+ .. prompt:: bash #
+
+ ceph orch apply grafana
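+
+As an alternative to the individual commands above, the four services can be
+described in one multi-document YAML file. This is a sketch rather than a
+required layout: the file name ``monitoring.yaml`` and the placements shown
+are assumptions chosen for illustration.
+
+.. code-block:: yaml
+
+    service_type: node-exporter
+    placement:
+      host_pattern: '*'
+    ---
+    service_type: alertmanager
+    placement:
+      count: 1
+    ---
+    service_type: prometheus
+    placement:
+      count: 2
+    ---
+    service_type: grafana
+    placement:
+      count: 1
+
+The file can be previewed with ``ceph orch apply -i monitoring.yaml --dry-run``
+and then applied by running the same command without ``--dry-run``.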
+
+.. _cephadm-monitoring-networks-ports:
+
+Networks and Ports
+~~~~~~~~~~~~~~~~~~
+
+All monitoring services can have the network and port they bind to configured
+with a YAML service specification.
+
+Example spec file:
+
+.. code-block:: yaml
+
+    service_type: grafana
+    service_name: grafana
+    placement:
+      count: 1
+    networks:
+    - 192.169.142.0/24
+    spec:
+      port: 4200
+
+.. _cephadm_monitoring-images:
+
+Using custom images
+~~~~~~~~~~~~~~~~~~~
+
+It is possible to install or upgrade monitoring components based on other
+images. To do so, the name of the image to be used needs to be stored in the
+configuration first. The following configuration options are available.
+
+- ``container_image_prometheus``
+- ``container_image_grafana``
+- ``container_image_alertmanager``
+- ``container_image_node_exporter``
+
+Custom images can be set with the ``ceph config`` command
+
+.. code-block:: bash
+
+ ceph config set mgr mgr/cephadm/<option_name> <value>
+
+For example
+
+.. code-block:: bash
+
+ ceph config set mgr mgr/cephadm/container_image_prometheus prom/prometheus:v1.4.1
+
+If monitoring stack daemons of the type whose image you changed were already
+running, you must redeploy them so that they actually use the new image.
+
+For example, if you changed the Prometheus image:
+
+.. prompt:: bash #
+
+ ceph orch redeploy prometheus
+
+
+.. note::
+
+ By setting a custom image, the default value will be overridden (but not
+ overwritten). The default value changes when updates become available.
+ Once you set a custom image, the component that uses it is no longer
+ updated automatically. You will need to manually update the configuration
+ (image name and tag) to be able to install updates.
+
+ If you later choose to follow the recommendations instead, you can reset
+ the custom image that you set. After that, the default value will be
+ used again. Use ``ceph config rm`` to reset the configuration option:
+
+ .. code-block:: bash
+
+ ceph config rm mgr mgr/cephadm/<option_name>
+
+ For example
+
+ .. code-block:: bash
+
+ ceph config rm mgr mgr/cephadm/container_image_prometheus
+
+See also :ref:`cephadm-airgap`.
+
+.. _cephadm-overwrite-jinja2-templates:
+
+Using custom configuration files
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+By overriding cephadm templates, it is possible to completely customize the
+configuration files for monitoring services.
+
+Internally, cephadm already uses `Jinja2
+<https://jinja.palletsprojects.com/en/2.11.x/>`_ templates to generate the
+configuration files for all monitoring components. To be able to customize the
+configuration of Prometheus, Grafana or the Alertmanager it is possible to store
+a Jinja2 template for each service that will be used for configuration
+generation instead. This template will be evaluated every time a service of that
+kind is deployed or reconfigured. That way, the custom configuration is
+preserved and automatically applied on future deployments of these services.
+
+.. note::
+
+ The configuration of the custom template is also preserved when the default
+ configuration of cephadm changes. If the updated configuration is to be used,
+ the custom template needs to be migrated *manually* after each upgrade of Ceph.
+
+Option names
+""""""""""""
+
+The following templates for files that will be generated by cephadm can be
+overridden. These are the names to be used when storing with ``ceph config-key
+set``:
+
+- ``services/alertmanager/alertmanager.yml``
+- ``services/grafana/ceph-dashboard.yml``
+- ``services/grafana/grafana.ini``
+- ``services/prometheus/prometheus.yml``
+- ``services/prometheus/alerting/custom_alerts.yml``
+
+You can look up the file templates that are currently used by cephadm in
+``src/pybind/mgr/cephadm/templates``:
+
+- ``services/alertmanager/alertmanager.yml.j2``
+- ``services/grafana/ceph-dashboard.yml.j2``
+- ``services/grafana/grafana.ini.j2``
+- ``services/prometheus/prometheus.yml.j2``
+
+Usage
+"""""
+
+The following command applies a single line value:
+
+.. code-block:: bash
+
+ ceph config-key set mgr/cephadm/<option_name> <value>
+
+To set the contents of a file as a template, use the ``-i`` argument:
+
+.. code-block:: bash
+
+ ceph config-key set mgr/cephadm/<option_name> -i $PWD/<filename>
+
+.. note::
+
+ When using files as input to ``config-key`` an absolute path to the file must
+ be used.
+
+
+Then the configuration file for the service needs to be recreated.
+This is done using ``reconfig``. For more details see the following example.
+
+Example
+"""""""
+
+.. code-block:: bash
+
+ # set the contents of ./prometheus.yml.j2 as template
+ ceph config-key set mgr/cephadm/services/prometheus/prometheus.yml \
+ -i $PWD/prometheus.yml.j2
+
+ # reconfig the prometheus service
+ ceph orch reconfig prometheus
+
+.. code-block:: bash
+
+ # set additional custom alerting rules for Prometheus
+ ceph config-key set mgr/cephadm/services/prometheus/alerting/custom_alerts.yml \
+ -i $PWD/custom_alerts.yml
+
+ # Note that custom alerting rules are not parsed by Jinja and hence escaping
+ # will not be an issue.
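+
+The content of ``custom_alerts.yml`` is an ordinary Prometheus rule file. A
+minimal sketch follows; the group name, alert name, duration and labels are
+illustrative assumptions, and the expression assumes that the
+``ceph_health_status`` metric exported by the mgr ``prometheus`` module
+reports ``2`` for ``HEALTH_ERR``:
+
+.. code-block:: yaml
+
+    groups:
+      - name: custom
+        rules:
+          - alert: CustomCephHealthError
+            expr: ceph_health_status == 2
+            for: 5m
+            labels:
+              severity: critical
+            annotations:
+              description: Ceph has been in HEALTH_ERR for more than 5 minutes.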
+
+Deploying monitoring without cephadm
+------------------------------------
+
+If you have an existing prometheus monitoring infrastructure, or would like
+to manage it yourself, you need to configure it to integrate with your Ceph
+cluster.
+
+* Enable the prometheus module in the ceph-mgr daemon
+
+ .. code-block:: bash
+
+ ceph mgr module enable prometheus
+
+ By default, ceph-mgr presents prometheus metrics on port 9283 on each host
+ running a ceph-mgr daemon. Configure prometheus to scrape these.
+
+* To enable the dashboard's prometheus-based alerting, see :ref:`dashboard-alerting`.
+
+* To enable dashboard integration with Grafana, see :ref:`dashboard-grafana`.
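+
+For the scrape step above, a minimal sketch of the relevant part of an
+external ``prometheus.yml`` might look like this. The host names
+``mgr-host1`` and ``mgr-host2`` and the job name are placeholders; only the
+port 9283 comes from the default described above:
+
+.. code-block:: yaml
+
+    scrape_configs:
+      - job_name: ceph
+        honor_labels: true
+        static_configs:
+          - targets:
+              - mgr-host1:9283
+              - mgr-host2:9283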
+
+Disabling monitoring
+--------------------
+
+To disable monitoring and remove the software that supports it, run the following commands:
+
+.. code-block:: console
+
+ $ ceph orch rm grafana
+ $ ceph orch rm prometheus --force # this will delete metrics data collected so far
+ $ ceph orch rm node-exporter
+ $ ceph orch rm alertmanager
+ $ ceph mgr module disable prometheus
+
+See also :ref:`orch-rm`.
+
+Setting up RBD-Image monitoring
+-------------------------------
+
+For performance reasons, monitoring of RBD images is disabled by default. For more information please see
+:ref:`prometheus-rbd-io-statistics`. While it is disabled, the overview and details dashboards will stay empty in Grafana
+and the metrics will not be visible in Prometheus.
+
+Setting up Prometheus
+-----------------------
+
+Setting Prometheus Retention Time
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Cephadm provides the option to set the Prometheus TSDB retention time using
+a ``retention_time`` field in the Prometheus service spec. The value defaults
+to 15 days (15d). If you would like a different value, such as 1 year (1y), you
+can apply a service spec similar to:
+
+.. code-block:: yaml
+
+    service_type: prometheus
+    placement:
+      count: 1
+    spec:
+      retention_time: "1y"
+
+.. note::
+
+ If you already had Prometheus daemon(s) deployed before and are updating an
+ existing spec as opposed to doing a fresh Prometheus deployment, you must also
+ tell cephadm to redeploy the Prometheus daemon(s) to put this change into effect.
+ This can be done with a ``ceph orch redeploy prometheus`` command.
+
+Setting up Grafana
+------------------
+
+Manually setting the Grafana URL
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Cephadm automatically configures Prometheus, Grafana, and Alertmanager in
+all cases except one.
+
+In some setups, the Dashboard user's browser might not be able to access the
+Grafana URL that is configured in Ceph Dashboard. This can happen when the
+cluster and the accessing user are in different DNS zones.
+
+If this is the case, you can use a configuration option for Ceph Dashboard
+to set the URL that the user's browser will use to access Grafana. This
+value will never be altered by cephadm. To set this configuration option,
+issue the following command:
+
+ .. prompt:: bash $
+
+ ceph dashboard set-grafana-frontend-api-url <grafana-server-api>
+
+It might take a minute or two for services to be deployed. After the
+services have been deployed, you should see something like this when you issue the command ``ceph orch ls``:
+
+.. code-block:: console
+
+ $ ceph orch ls
+ NAME RUNNING REFRESHED IMAGE NAME IMAGE ID SPEC
+ alertmanager 1/1 6s ago docker.io/prom/alertmanager:latest 0881eb8f169f present
+ crash 2/2 6s ago docker.io/ceph/daemon-base:latest-master-devel mix present
+ grafana 1/1 0s ago docker.io/pcuzner/ceph-grafana-el8:latest f77afcf0bcf6 absent
+ node-exporter 2/2 6s ago docker.io/prom/node-exporter:latest e5a616e4b9cf present
+ prometheus 1/1 6s ago docker.io/prom/prometheus:latest e935122ab143 present
+
+Configuring SSL/TLS for Grafana
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+``cephadm`` deploys Grafana using the certificate defined in the ceph
+key/value store. If no certificate is specified, ``cephadm`` generates a
+self-signed certificate during the deployment of the Grafana service.
+
+A custom certificate can be configured using the following commands:
+
+.. prompt:: bash #
+
+ ceph config-key set mgr/cephadm/grafana_key -i $PWD/key.pem
+ ceph config-key set mgr/cephadm/grafana_crt -i $PWD/certificate.pem
+
+If you have already deployed Grafana, run ``reconfig`` on the service to
+update its configuration:
+
+.. prompt:: bash #
+
+ ceph orch reconfig grafana
+
+The ``reconfig`` command also sets the proper URL for Ceph Dashboard.
+
+Setting the initial admin password
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+By default, Grafana will not create an initial
+admin user. In order to create the admin user, please create a file
+``grafana.yaml`` with this content:
+
+.. code-block:: yaml
+
+    service_type: grafana
+    spec:
+      initial_admin_password: mypassword
+
+Then apply this specification:
+
+.. code-block:: bash
+
+ ceph orch apply -i grafana.yaml
+ ceph orch redeploy grafana
+
+Grafana will now create an admin user called ``admin`` with the
+given password.
+
+
+Setting up Alertmanager
+-----------------------
+
+Adding Alertmanager webhooks
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To add new webhooks to the Alertmanager configuration, add additional
+webhook URLs like so:
+
+.. code-block:: yaml
+
+    service_type: alertmanager
+    spec:
+      user_data:
+        default_webhook_urls:
+        - "https://foo"
+        - "https://bar"
+
+Where ``default_webhook_urls`` is a list of additional URLs that are
+added to the default receivers' ``<webhook_configs>`` configuration.
+
+Run ``reconfig`` on the service to update its configuration:
+
+.. prompt:: bash #
+
+ ceph orch reconfig alertmanager
+
+Turn on Certificate Validation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you are using certificates for alertmanager and want to make sure
+these certs are verified, you should set the ``secure`` option to
+``true`` in your alertmanager spec (this defaults to ``false``).
+
+.. code-block:: yaml
+
+    service_type: alertmanager
+    spec:
+      secure: true
+
+If you already had alertmanager daemons running before applying the spec,
+you must reconfigure them to update their configuration:
+
+.. prompt:: bash #
+
+ ceph orch reconfig alertmanager
+
+Further Reading
+---------------
+
+* :ref:`mgr-prometheus`
diff --git a/doc/cephadm/services/nfs.rst b/doc/cephadm/services/nfs.rst
new file mode 100644
index 000000000..c48d0f765
--- /dev/null
+++ b/doc/cephadm/services/nfs.rst
@@ -0,0 +1,120 @@
+.. _deploy-cephadm-nfs-ganesha:
+
+===========
+NFS Service
+===========
+
+.. note:: Only the NFSv4 protocol is supported.
+
+The simplest way to manage NFS is via the ``ceph nfs cluster ...``
+commands; see :ref:`mgr-nfs`. This document covers how to manage the
+cephadm services directly, which should only be necessary for unusual NFS
+configurations.
+
+Deploying NFS Ganesha
+=====================
+
+Cephadm deploys an NFS Ganesha daemon (or a set of daemons). The configuration for
+NFS is stored in the ``nfs-ganesha`` pool and exports are managed via the
+``ceph nfs export ...`` commands and via the dashboard.
+
+To deploy an NFS Ganesha gateway, run the following command:
+
+.. prompt:: bash #
+
+ ceph orch apply nfs *<svc_id>* [--port *<port>*] [--placement ...]
+
+For example, to deploy NFS with a service id of *foo* on the default
+port 2049 with the default placement of a single daemon:
+
+.. prompt:: bash #
+
+ ceph orch apply nfs foo
+
+See :ref:`orchestrator-cli-placement-spec` for the details of the placement
+specification.
+
+Service Specification
+=====================
+
+Alternatively, an NFS service can be applied using a YAML specification.
+
+.. code-block:: yaml
+
+    service_type: nfs
+    service_id: mynfs
+    placement:
+      hosts:
+        - host1
+        - host2
+    spec:
+      port: 12345
+
+In this example, we run the server on the non-default ``port`` of
+12345 (instead of the default 2049) on ``host1`` and ``host2``.
+
+The specification can then be applied by running the following command:
+
+.. prompt:: bash #
+
+ ceph orch apply -i nfs.yaml
+
+.. _cephadm-ha-nfs:
+
+High-availability NFS
+=====================
+
+Deploying an *ingress* service for an existing *nfs* service will provide:
+
+* a stable, virtual IP that can be used to access the NFS server
+* fail-over between hosts if there is a host failure
+* load distribution across multiple NFS gateways (although this is rarely necessary)
+
+Ingress for NFS can be deployed for an existing NFS service
+(``nfs.mynfs`` in this example) with the following specification:
+
+.. code-block:: yaml
+
+    service_type: ingress
+    service_id: nfs.mynfs
+    placement:
+      count: 2
+    spec:
+      backend_service: nfs.mynfs
+      frontend_port: 2049
+      monitor_port: 9000
+      virtual_ip: 10.0.0.123/24
+
+A few notes:
+
+ * The *virtual_ip* must include a CIDR prefix length, as in the
+ example above. The virtual IP will normally be configured on the
+ first identified network interface that has an existing IP in the
+ same subnet. You can also specify a *virtual_interface_networks*
+ property to match against IPs in other networks; see
+ :ref:`ingress-virtual-ip` for more information.
+ * The *monitor_port* is used to access the haproxy load status
+ page. The user is ``admin`` by default, but can be modified
+ via an *admin* property in the spec. If a password is not
+ specified via a *password* property in the spec, the auto-generated password
+ can be found with:
+
+ .. prompt:: bash #
+
+ ceph config-key get mgr/cephadm/ingress.*{svc_id}*/monitor_password
+
+ For example:
+
+ .. prompt:: bash #
+
+ ceph config-key get mgr/cephadm/ingress.nfs.myfoo/monitor_password
+
+ * The backend service (``nfs.mynfs`` in this example) should include
+ a *port* property that is not 2049 to avoid conflicting with the
+ ingress service, which could be placed on the same host(s).
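+
+As a sketch of the last note, the backend service and its ingress can be
+described together. The port ``12049`` is an arbitrary assumption; any free
+port other than 2049 would serve the same purpose. The other values are taken
+from the examples above:
+
+.. code-block:: yaml
+
+    service_type: nfs
+    service_id: mynfs
+    placement:
+      hosts:
+        - host1
+        - host2
+    spec:
+      port: 12049
+    ---
+    service_type: ingress
+    service_id: nfs.mynfs
+    placement:
+      count: 2
+    spec:
+      backend_service: nfs.mynfs
+      frontend_port: 2049
+      monitor_port: 9000
+      virtual_ip: 10.0.0.123/24
+
+With this layout, clients would connect to the virtual IP on port 2049, and
+haproxy would forward traffic to the NFS daemons on port 12049.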
+
+Further Reading
+===============
+
+* CephFS: :ref:`cephfs-nfs`
+* MGR: :ref:`mgr-nfs`
diff --git a/doc/cephadm/services/osd.rst b/doc/cephadm/services/osd.rst
new file mode 100644
index 000000000..de0d4f82a
--- /dev/null
+++ b/doc/cephadm/services/osd.rst
@@ -0,0 +1,936 @@
+***********
+OSD Service
+***********
+.. _device management: ../rados/operations/devices
+.. _libstoragemgmt: https://github.com/libstorage/libstoragemgmt
+
+List Devices
+============
+
+``ceph-volume`` scans each host in the cluster from time to time in order
+to determine which devices are present and whether they are eligible to be
+used as OSDs.
+
+To print a list of devices discovered by ``cephadm``, run this command:
+
+.. prompt:: bash #
+
+ ceph orch device ls [--hostname=...] [--wide] [--refresh]
+
+Example
+::
+
+ Hostname Path Type Serial Size Health Ident Fault Available
+ srv-01 /dev/sdb hdd 15P0A0YFFRD6 300G Unknown N/A N/A No
+ srv-01 /dev/sdc hdd 15R0A08WFRD6 300G Unknown N/A N/A No
+ srv-01 /dev/sdd hdd 15R0A07DFRD6 300G Unknown N/A N/A No
+ srv-01 /dev/sde hdd 15P0A0QDFRD6 300G Unknown N/A N/A No
+ srv-02 /dev/sdb hdd 15R0A033FRD6 300G Unknown N/A N/A No
+ srv-02 /dev/sdc hdd 15R0A05XFRD6 300G Unknown N/A N/A No
+ srv-02 /dev/sde hdd 15R0A0ANFRD6 300G Unknown N/A N/A No
+ srv-02 /dev/sdf hdd 15R0A06EFRD6 300G Unknown N/A N/A No
+ srv-03 /dev/sdb hdd 15R0A0OGFRD6 300G Unknown N/A N/A No
+ srv-03 /dev/sdc hdd 15R0A0P7FRD6 300G Unknown N/A N/A No
+ srv-03 /dev/sdd hdd 15R0A0O7FRD6 300G Unknown N/A N/A No
+
+Using the ``--wide`` option provides all details relating to the device,
+including any reasons that the device might not be eligible for use as an OSD.
+
+In the above example you can see fields named "Health", "Ident", and "Fault".
+This information is provided by integration with `libstoragemgmt`_. By default,
+this integration is disabled (because `libstoragemgmt`_ may not be 100%
+compatible with your hardware). To make ``cephadm`` include these fields,
+enable cephadm's "enhanced device scan" option as follows:
+
+.. prompt:: bash #
+
+ ceph config set mgr mgr/cephadm/device_enhanced_scan true
+
+.. warning::
+ Although the libstoragemgmt library performs standard SCSI inquiry calls,
+ there is no guarantee that your firmware fully implements these standards.
+ This can lead to erratic behaviour and even bus resets on some older
+ hardware. It is therefore recommended that, before enabling this feature,
+ you test your hardware's compatibility with libstoragemgmt first to avoid
+ unplanned interruptions to services.
+
+ There are a number of ways to test compatibility, but the simplest may be
+ to use the cephadm shell to call libstoragemgmt directly - ``cephadm shell
+ lsmcli ldl``. If your hardware is supported you should see something like
+ this:
+
+ ::
+
+ Path | SCSI VPD 0x83 | Link Type | Serial Number | Health Status
+ ----------------------------------------------------------------------------
+ /dev/sda | 50000396082ba631 | SAS | 15P0A0R0FRD6 | Good
+ /dev/sdb | 50000396082bbbf9 | SAS | 15P0A0YFFRD6 | Good
+
+
+After you have enabled libstoragemgmt support, the output will look something
+like this:
+
+::
+
+ # ceph orch device ls
+ Hostname Path Type Serial Size Health Ident Fault Available
+ srv-01 /dev/sdb hdd 15P0A0YFFRD6 300G Good Off Off No
+ srv-01 /dev/sdc hdd 15R0A08WFRD6 300G Good Off Off No
+ :
+
+In this example, libstoragemgmt has confirmed the health of the drives and the ability to
+interact with the Identification and Fault LEDs on the drive enclosures. For further
+information about interacting with these LEDs, refer to `device management`_.
+
+.. note::
+ The current release of `libstoragemgmt`_ (1.8.8) supports SCSI, SAS, and SATA based
+ local disks only. There is no official support for NVMe devices (PCIe).
+
+.. _cephadm-deploy-osds:
+
+Deploy OSDs
+===========
+
+Listing Storage Devices
+-----------------------
+
+In order to deploy an OSD, there must be a storage device that is *available* on
+which the OSD will be deployed.
+
+Run this command to display an inventory of storage devices on all cluster hosts:
+
+.. prompt:: bash #
+
+ ceph orch device ls
+
+A storage device is considered *available* if all of the following
+conditions are met:
+
+* The device must have no partitions.
+* The device must not have any LVM state.
+* The device must not be mounted.
+* The device must not contain a file system.
+* The device must not contain a Ceph BlueStore OSD.
+* The device must be larger than 5 GB.
+
+Ceph will not provision an OSD on a device that is not available.
+
+Creating New OSDs
+-----------------
+
+There are a few ways to create new OSDs:
+
+* Tell Ceph to consume any available and unused storage device:
+
+ .. prompt:: bash #
+
+ ceph orch apply osd --all-available-devices
+
+* Create an OSD from a specific device on a specific host:
+
+ .. prompt:: bash #
+
+ ceph orch daemon add osd *<host>*:*<device-path>*
+
+ For example:
+
+ .. prompt:: bash #
+
+ ceph orch daemon add osd host1:/dev/sdb
+
+ Advanced OSD creation from specific devices on a specific host:
+
+ .. prompt:: bash #
+
+ ceph orch daemon add osd host1:data_devices=/dev/sda,/dev/sdb,db_devices=/dev/sdc,osds_per_device=2
+
+* You can use :ref:`drivegroups` to categorize device(s) based on their
+ properties. This might be useful in forming a clearer picture of which
+ devices are available to consume. Properties include device type (SSD or
+ HDD), device model names, size, and the hosts on which the devices exist:
+
+ .. prompt:: bash #
+
+ ceph orch apply -i spec.yml
+
+Dry Run
+-------
+
+The ``--dry-run`` flag causes the orchestrator to present a preview of what
+will happen without actually creating the OSDs.
+
+For example:
+
+ .. prompt:: bash #
+
+ ceph orch apply osd --all-available-devices --dry-run
+
+ ::
+
+ NAME HOST DATA DB WAL
+ all-available-devices node1 /dev/vdb - -
+ all-available-devices node2 /dev/vdc - -
+ all-available-devices node3 /dev/vdd - -
+
+.. _cephadm-osd-declarative:
+
+Declarative State
+-----------------
+
+The effect of ``ceph orch apply`` is persistent. This means that drives that
+are added to the system after the ``ceph orch apply`` command completes will be
+automatically found and added to the cluster. It also means that drives that
+become available (by zapping, for example) after the ``ceph orch apply``
+command completes will be automatically found and added to the cluster.
+
+We will examine the effects of the following command:
+
+ .. prompt:: bash #
+
+ ceph orch apply osd --all-available-devices
+
+After running the above command:
+
+* If you add new disks to the cluster, they will automatically be used to
+ create new OSDs.
+* If you remove an OSD and clean the LVM physical volume, a new OSD will be
+ created automatically.
+
+To avoid this behavior (that is, to disable the automatic creation of OSDs on
+available devices), use the ``unmanaged`` parameter:
+
+.. prompt:: bash #
+
+ ceph orch apply osd --all-available-devices --unmanaged=true
+
+.. note::
+
+ Keep these three facts in mind:
+
+ - The default behavior of ``ceph orch apply`` causes cephadm to reconcile continuously. This means that cephadm creates OSDs as soon as new drives are detected.
+
+ - Setting ``unmanaged: True`` disables the creation of OSDs. If ``unmanaged: True`` is set, nothing will happen even if you apply a new OSD service.
+
+ - ``ceph orch daemon add`` creates OSDs, but does not add an OSD service.
+
+* For cephadm, see also :ref:`cephadm-spec-unmanaged`.
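+
+The ``unmanaged`` flag can also be set in a service specification file. The
+following sketch assumes that the OSD service was created with
+``--all-available-devices``; the ``service_id`` and ``placement`` shown are
+assumptions and should be checked against the output of
+``ceph orch ls --export`` on your cluster:
+
+.. code-block:: yaml
+
+    service_type: osd
+    service_id: all-available-devices
+    placement:
+      host_pattern: '*'
+    unmanaged: true
+    spec:
+      data_devices:
+        all: true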
+
+.. _cephadm-osd-removal:
+
+Remove an OSD
+=============
+
+Removing an OSD from a cluster involves two steps:
+
+#. evacuating all placement groups (PGs) from the OSD
+#. removing the PG-free OSD from the cluster
+
+The following command performs these two steps:
+
+.. prompt:: bash #
+
+ ceph orch osd rm <osd_id(s)> [--replace] [--force]
+
+Example:
+
+.. prompt:: bash #
+
+ ceph orch osd rm 0
+
+Expected output::
+
+ Scheduled OSD(s) for removal
+
+OSDs that are not safe to destroy will be rejected.
+
+.. note::
+ After removing OSDs, if the drives the OSDs were deployed on once again
+ become available, cephadm may automatically try to deploy more OSDs
+ on these drives if they match an existing drivegroup spec. If you deployed
+ the OSDs you are removing with a spec and don't want any new OSDs deployed on
+ the drives after removal, it's best to modify the drivegroup spec before removal.
+ Either set ``unmanaged: true`` to stop it from picking up new drives at all,
+ or modify it in some way that it no longer matches the drives used for the
+ OSDs you wish to remove. Then re-apply the spec. For more info on drivegroup
+ specs see :ref:`drivegroups`. For more info on the declarative nature of
+ cephadm in reference to deploying OSDs, see :ref:`cephadm-osd-declarative`.
+
+Monitoring OSD State
+--------------------
+
+You can query the state of OSD removal operations with the following command:
+
+.. prompt:: bash #
+
+ ceph orch osd rm status
+
+Expected output::
+
+ OSD_ID HOST STATE PG_COUNT REPLACE FORCE STARTED_AT
+ 2 cephadm-dev done, waiting for purge 0 True False 2020-07-17 13:01:43.147684
+ 3 cephadm-dev draining 17 False True 2020-07-17 13:01:45.162158
+ 4 cephadm-dev started 42 False True 2020-07-17 13:01:45.162158
+
+
+When no PGs are left on the OSD, it will be decommissioned and removed from the cluster.
+
+.. note::
+ After removing an OSD, if you wipe the LVM physical volume in the device used by the removed OSD, a new OSD will be created.
+ For more information on this, read about the ``unmanaged`` parameter in :ref:`cephadm-osd-declarative`.
+
+Stopping OSD Removal
+--------------------
+
+It is possible to stop queued OSD removals by using the following command:
+
+.. prompt:: bash #
+
+ ceph orch osd rm stop <osd_id(s)>
+
+Example:
+
+.. prompt:: bash #
+
+ ceph orch osd rm stop 4
+
+Expected output::
+
+ Stopped OSD(s) removal
+
+This resets the OSD to its initial state and takes it off the removal queue.
+
+.. _cephadm-replacing-an-osd:
+
+Replacing an OSD
+----------------
+
+.. prompt:: bash #
+
+ ceph orch osd rm <osd_id(s)> --replace [--force]
+
+Example:
+
+.. prompt:: bash #
+
+ ceph orch osd rm 4 --replace
+
+Expected output::
+
+ Scheduled OSD(s) for replacement
+
+This follows the same procedure as in the "Remove an OSD" section, with
+one exception: the OSD is not permanently removed from the CRUSH hierarchy, but is
+instead assigned a 'destroyed' flag.
+
+.. note::
+ The new OSD that will replace the removed OSD must be created on the same host
+ as the OSD that was removed.
+
+**Preserving the OSD ID**
+
+The 'destroyed' flag is used to determine which OSD ids will be reused in the
+next OSD deployment.
+
+If you use OSDSpecs for OSD deployment, your newly added disks will be assigned
+the OSD ids of their replaced counterparts. This assumes that the new disks
+still match the OSDSpecs.
+
+Use the ``--dry-run`` flag to make certain that the ``ceph orch apply osd``
+command does what you want it to. The ``--dry-run`` flag shows you what the
+outcome of the command will be without making the changes you specify. When
+you are satisfied that the command will do what you want, run the command
+without the ``--dry-run`` flag.
+
+.. tip::
+
+ The name of your OSDSpec can be retrieved with the command ``ceph orch ls``
+
+Alternatively, you can use your OSDSpec file:
+
+.. prompt:: bash #
+
+ ceph orch apply -i <osd_spec_file> --dry-run
+
+Expected output::
+
+ NAME HOST DATA DB WAL
+ <name_of_osd_spec> node1 /dev/vdb - -
+
+
+When this output reflects your intention, omit the ``--dry-run`` flag to
+execute the deployment.
+
+
+Erasing Devices (Zapping Devices)
+---------------------------------
+
+Erase (zap) a device so that it can be reused. ``zap`` calls ``ceph-volume
+zap`` on the remote host.
+
+.. prompt:: bash #
+
+ ceph orch device zap <hostname> <path>
+
+Example command:
+
+.. prompt:: bash #
+
+ ceph orch device zap my_hostname /dev/sdx
+
+.. note::
+ If the unmanaged flag is unset, cephadm automatically deploys drives that
+ match the OSDSpec. For example, if you use the
+ ``all-available-devices`` option when creating OSDs, when you ``zap`` a
+ device the cephadm orchestrator automatically creates a new OSD in the
+ device. To disable this behavior, see :ref:`cephadm-osd-declarative`.
+
+
+.. _osd_autotune:
+
+Automatically tuning OSD memory
+===============================
+
+OSD daemons will adjust their memory consumption based on the
+``osd_memory_target`` config option (several gigabytes, by
+default). If Ceph is deployed on dedicated nodes that are not sharing
+memory with other services, cephadm can automatically adjust the per-OSD
+memory consumption based on the total amount of RAM and the number of deployed
+OSDs.
+
+.. warning:: Cephadm sets ``osd_memory_target_autotune`` to ``true`` by default, which is unsuitable for hyperconverged infrastructures.
+
+Cephadm will start with a fraction
+(``mgr/cephadm/autotune_memory_target_ratio``, which defaults to
+``.7``) of the total RAM in the system, subtract any memory
+consumed by non-autotuned daemons (non-OSDs, and OSDs for which
+``osd_memory_target_autotune`` is false), and then divide the result by the
+number of remaining OSDs.
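+
+As a worked illustration of this calculation, assume a hypothetical host with
+128 GiB of RAM, the default ratio of ``.7``, non-autotuned daemons that
+account for 9.6 GiB, and 10 autotuned OSDs. These figures are invented for
+the example; the exact amount that cephadm attributes to non-autotuned
+daemons is not described here::
+
+    128 GiB * 0.7        = 89.6 GiB   (share of RAM considered)
+    89.6 GiB - 9.6 GiB   = 80.0 GiB   (left for autotuned OSDs)
+    80.0 GiB / 10 OSDs   =  8.0 GiB   (osd_memory_target per OSD)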
+
+The final targets are reflected in the config database with options like::
+
+ WHO MASK LEVEL OPTION VALUE
+ osd host:foo basic osd_memory_target 126092301926
+ osd host:bar basic osd_memory_target 6442450944
+
+Both the limits and the current memory consumed by each daemon are visible from
+the ``ceph orch ps`` output in the ``MEM LIMIT`` column::
+
+ NAME HOST PORTS STATUS REFRESHED AGE MEM USED MEM LIMIT VERSION IMAGE ID CONTAINER ID
+ osd.1 dael running (3h) 10s ago 3h 72857k 117.4G 17.0.0-3781-gafaed750 7015fda3cd67 9e183363d39c
+ osd.2 dael running (81m) 10s ago 81m 63989k 117.4G 17.0.0-3781-gafaed750 7015fda3cd67 1f0cc479b051
+ osd.3 dael running (62m) 10s ago 62m 64071k 117.4G 17.0.0-3781-gafaed750 7015fda3cd67 ac5537492f27
+
+To exclude an OSD from memory autotuning, disable the autotune option
+for that OSD and also set a specific memory target. For example:
+
+ .. prompt:: bash #
+
+ ceph config set osd.123 osd_memory_target_autotune false
+ ceph config set osd.123 osd_memory_target 16G
+
+
+.. _drivegroups:
+
+Advanced OSD Service Specifications
+===================================
+
+:ref:`orchestrator-cli-service-spec`\s of type ``osd`` are a way to describe a
+cluster layout, using the properties of disks. Service specifications give the
+user an abstract way to tell Ceph which disks should turn into OSDs with which
+configurations, without knowing the specifics of device names and paths.
+
+Service specifications make it possible to define a yaml or json file that can
+be used to reduce the amount of manual work involved in creating OSDs.
+
+For example, instead of running the following command:
+
+.. prompt:: bash [monitor.1]#
+
+ ceph orch daemon add osd *<host>*:*<path-to-device>*
+
+for each device and each host, we can define a yaml or json file that allows us
+to describe the layout. Here's the most basic example.
+
+Create a file called (for example) ``osd_spec.yml``:
+
+.. code-block:: yaml
+
+    service_type: osd
+    service_id: default_drive_group  # custom name of the osd spec
+    placement:
+      host_pattern: '*'              # which hosts to target
+    spec:
+      data_devices:                  # the type of devices you are applying specs to
+        all: true                    # a filter, check below for a full list
+
+This means:
+
+#. Turn any available device (ceph-volume decides what 'available' is) into an
+   OSD on all hosts that match the glob pattern '*'. (The glob pattern matches
+   against the registered hosts from ``ceph orch host ls``.) A more detailed
+   section on ``host_pattern`` is available below.
+
+#. Then pass it to ``ceph orch apply`` like this:
+
+ .. prompt:: bash [monitor.1]#
+
+ ceph orch apply -i /path/to/osd_spec.yml
+
+ This instruction will be issued to all the matching hosts, and will deploy
+ these OSDs.
+
+ Setups more complex than the one specified by the ``all`` filter are
+ possible. See :ref:`osd_filters` for details.
+
+ A ``--dry-run`` flag can be passed to the ``apply osd`` command to display a
+ synopsis of the proposed layout.
+
+Example:
+
+.. prompt:: bash [monitor.1]#
+
+ ceph orch apply -i /path/to/osd_spec.yml --dry-run
+
+
+
+.. _osd_filters:
+
+Filters
+-------
+
+.. note::
+ Filters are applied using an `AND` gate by default. This means that a drive
+ must fulfill all filter criteria in order to get selected. This behavior can
+ be adjusted by setting ``filter_logic: OR`` in the OSD specification.
+
+Filters are used to assign disks to groups, using their attributes to group
+them.
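+
+As a rough illustration of the ``AND``/``OR`` behavior described in the note
+above, the following Python sketch evaluates a set of filters against a disk.
+It is not cephadm's implementation: the ``disk_selected`` helper and the
+attribute dictionaries are hypothetical, and matching is simplified to exact
+equality.
+

```python
def disk_selected(disk, filters, filter_logic="AND"):
    """Return True if the disk passes the filters under the given logic.

    disk and filters are plain dicts (hypothetical stand-ins for the
    attributes reported by ceph-volume and the filters in an OSD spec).
    """
    checks = [disk.get(key) == wanted for key, wanted in filters.items()]
    if filter_logic == "AND":
        return all(checks)   # every filter must match
    return any(checks)       # OR: one matching filter is enough


disk = {"vendor": "VendorA", "rotational": 1}
filters = {"vendor": "VendorA", "rotational": 0}

print(disk_selected(disk, filters))                     # AND: rotational differs
print(disk_selected(disk, filters, filter_logic="OR"))  # OR: vendor matches
```

+
+With the default ``AND`` logic the disk is rejected because one filter fails;
+with ``OR`` the matching vendor is sufficient.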
+
+The attributes are based on ceph-volume's disk query. You can retrieve
+information about the attributes with this command:
+
+.. code-block:: bash
+
+ ceph-volume inventory </path/to/disk>
+
+Vendor or Model
+^^^^^^^^^^^^^^^
+
+Specific disks can be targeted by vendor or model:
+
+.. code-block:: yaml
+
+ model: disk_model_name
+
+or
+
+.. code-block:: yaml
+
+ vendor: disk_vendor_name
+
+
+Size
+^^^^
+
+Specific disks can be targeted by size:
+
+.. code-block:: yaml
+
+ size: size_spec
+
+Size specs
+__________
+
+Size specifications can be of the following forms:
+
+* LOW:HIGH
+* :HIGH
+* LOW:
+* EXACT
+
+Concrete examples:
+
+To include disks of an exact size:
+
+.. code-block:: yaml
+
+ size: '10G'
+
+To include disks within a given range of size:
+
+.. code-block:: yaml
+
+ size: '10G:40G'
+
+To include disks that are less than or equal to 10G in size:
+
+.. code-block:: yaml
+
+ size: ':10G'
+
+To include disks equal to or greater than 40G in size:
+
+.. code-block:: yaml
+
+ size: '40G:'
+
+Sizes don't have to be specified exclusively in gigabytes (G).
+
+Other units of size are supported: megabyte (M), gigabyte (G) and terabyte (T).
+Appending ``B`` for byte is also supported: ``MB``, ``GB``, ``TB``.
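+
+To make the four size forms concrete, here is a minimal Python sketch of how
+such a spec could be evaluated. It is an illustration only, not the parser
+used by ceph-volume or cephadm: the ``parse_size`` and ``size_matches``
+helpers are hypothetical, and the use of decimal units and strict equality
+for ``EXACT`` are assumptions.
+

```python
import re

# Assumption: decimal units; the real implementation may differ.
UNITS = {"M": 10**6, "G": 10**9, "T": 10**12}


def parse_size(text):
    """Convert a string such as '10G' or '512GB' to a number of bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([MGT])B?", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return float(number) * UNITS[unit]


def size_matches(spec, disk_size):
    """Return True if disk_size satisfies a LOW:HIGH, :HIGH, LOW: or EXACT spec."""
    size = parse_size(disk_size)
    if ":" not in spec:                    # EXACT
        return size == parse_size(spec)
    low, high = spec.split(":", 1)
    if low and size < parse_size(low):     # lower bound (inclusive)
        return False
    if high and size > parse_size(high):   # upper bound (inclusive)
        return False
    return True


for spec in ("10G", "10G:40G", ":10G", "40G:"):
    print(spec, size_matches(spec, "20G"))
```

+
+Both bounds are treated as inclusive, which follows the "less than or equal
+to" and "equal to or greater than" wording above.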
+
+
+Rotational
+^^^^^^^^^^
+
+This operates on the 'rotational' attribute of the disk.
+
+.. code-block:: yaml
+
+ rotational: 0 | 1
+
+``1`` to match all disks that are rotational
+
+``0`` to match all disks that are non-rotational (SSD, NVMe, etc.)
+
+
+All
+^^^
+
+This will take all disks that are 'available'.
+
+.. note:: This filter is only valid in the ``data_devices`` section.
+
+.. code-block:: yaml
+
+ all: true
+
+
+Limiter
+^^^^^^^
+
+If you have specified some valid filters but want to limit the number of disks that they match, use the ``limit`` directive:
+
+.. code-block:: yaml
+
+ limit: 2
+
+For example, if you used `vendor` to match all disks that are from `VendorA`
+but want to use only the first two, you could use `limit`:
+
+.. code-block:: yaml
+
+    data_devices:
+      vendor: VendorA
+      limit: 2
+
+.. note:: `limit` is a last resort and shouldn't be used if it can be avoided.
+
+
+Additional Options
+------------------
+
+There are multiple optional settings you can use to change the way OSDs are deployed.
+Add these options to the base level of an OSD spec for them to take effect.
+
+This example deploys all OSDs with encryption enabled:
+
+.. code-block:: yaml
+
+    service_type: osd
+    service_id: example_osd_spec
+    placement:
+      host_pattern: '*'
+    spec:
+      data_devices:
+        all: true
+      encrypted: true
+
+See the full list of options in the ``DriveGroupSpec`` reference:
+
+.. py:currentmodule:: ceph.deployment.drive_group
+
+.. autoclass:: DriveGroupSpec
+ :members:
+ :exclude-members: from_json
+
+Examples
+========
+
+The simple case
+---------------
+
+All nodes have the same setup:
+
+.. code-block:: none
+
+ 20 HDDs
+ Vendor: VendorA
+ Model: HDD-123-foo
+ Size: 4TB
+
+ 2 SSDs
+ Vendor: VendorB
+ Model: MC-55-44-ZX
+ Size: 512GB
+
+This is a common setup and can be described quite easily:
+
+.. code-block:: yaml
+
+    service_type: osd
+    service_id: osd_spec_default
+    placement:
+      host_pattern: '*'
+    spec:
+      data_devices:
+        model: HDD-123-foo  # Note, HDD-123 would also be valid
+      db_devices:
+        model: MC-55-44-ZX  # Same here, MC-55-44 is valid
+
+However, we can improve it by filtering on core properties of the drives instead:
+
+.. code-block:: yaml
+
+    service_type: osd
+    service_id: osd_spec_default
+    placement:
+      host_pattern: '*'
+    spec:
+      data_devices:
+        rotational: 1
+      db_devices:
+        rotational: 0
+
+Now all rotating devices are declared as data devices, and all non-rotating devices are used as shared devices (WAL, DB).
+
+If you know that drives larger than 2TB will always be the slower data devices, you can also filter by size:
+
+.. code-block:: yaml
+
+    service_type: osd
+    service_id: osd_spec_default
+    placement:
+      host_pattern: '*'
+    spec:
+      data_devices:
+        size: '2TB:'
+      db_devices:
+        size: ':2TB'
+
+.. note:: All of the above OSD specs are equally valid. Which of those you want to use depends on taste and on how much you expect your node layout to change.
+
+
+Multiple OSD specs for a single host
+------------------------------------
+
+Here we have two distinct setups:
+
+.. code-block:: none
+
+ 20 HDDs
+ Vendor: VendorA
+ Model: HDD-123-foo
+ Size: 4TB
+
+ 12 SSDs
+ Vendor: VendorB
+ Model: MC-55-44-ZX
+ Size: 512GB
+
+ 2 NVMEs
+ Vendor: VendorC
+ Model: NVME-QQQQ-987
+ Size: 256GB
+
+
+* 20 HDDs should share 2 SSDs
+* 10 SSDs should share 2 NVMes
+
+This can be described with two layouts.
+
+.. code-block:: yaml
+
+    service_type: osd
+    service_id: osd_spec_hdd
+    placement:
+      host_pattern: '*'
+    spec:
+      data_devices:
+        rotational: 1       # HDDs are the rotational devices
+      db_devices:
+        model: MC-55-44-ZX
+        limit: 2 # db_slots is actually to be favoured here, but it's not implemented yet
+    ---
+    service_type: osd
+    service_id: osd_spec_ssd
+    placement:
+      host_pattern: '*'
+    spec:
+      data_devices:
+        model: MC-55-44-ZX
+      db_devices:
+        vendor: VendorC
+
+This would create the desired layout by using all HDDs as data_devices with two SSDs assigned as dedicated db/wal devices.
+The remaining ten SSDs will be data_devices that have the 'VendorC' NVMEs assigned as dedicated db/wal devices.
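+
+The intended split (20 HDDs sharing 2 SSDs, 10 SSDs sharing 2 NVMes) can be
+checked with a small Python sketch. This is only a model of the bookkeeping,
+not of how cephadm or ceph-volume allocate devices: the ``claim`` helper is
+hypothetical, and it assumes that a device claimed by the first spec is no
+longer available to the second.
+

```python
def build_inventory():
    """Hypothetical inventory mirroring the hardware listing above."""
    groups = [
        ("HDD-123-foo", 1, 20),    # model, rotational, count
        ("MC-55-44-ZX", 0, 12),
        ("NVME-QQQQ-987", 0, 2),
    ]
    return [
        {"model": model, "rotational": rotational}
        for model, rotational, count in groups
        for _ in range(count)
    ]


def claim(disks, used, predicate, limit=None):
    """Claim unused disks that satisfy predicate, honouring an optional limit."""
    picked = []
    for index, disk in enumerate(disks):
        if limit is not None and len(picked) >= limit:
            break
        if index not in used and predicate(disk):
            picked.append(index)
    used.update(picked)  # assumption: claimed disks are no longer available
    return picked


def is_hdd(disk):
    return disk["rotational"] == 1


def is_ssd(disk):
    return disk["model"] == "MC-55-44-ZX"


def is_nvme(disk):
    return disk["model"] == "NVME-QQQQ-987"


def plan_layout():
    """Apply the HDD layout first, then the SSD layout, and count devices."""
    disks, used = build_inventory(), set()
    return {
        "hdd_data": len(claim(disks, used, is_hdd)),
        "hdd_db": len(claim(disks, used, is_ssd, limit=2)),
        "ssd_data": len(claim(disks, used, is_ssd)),
        "ssd_db": len(claim(disks, used, is_nvme)),
    }


print(plan_layout())
```
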
+
+Multiple hosts with the same disk layout
+----------------------------------------
+
+Assuming the cluster has different kinds of hosts each with similar disk
+layout, it is recommended to apply different OSD specs matching only one
+set of hosts. Typically you will have a spec for multiple hosts with the
+same layout.
+
+The service id is the unique key: if a new OSD spec with an already
+applied service id is applied, the existing OSD spec will be superseded.
+cephadm will then create new OSD daemons based on the new spec
+definition. Existing OSD daemons will not be affected. See :ref:`cephadm-osd-declarative`.
+
+Node1-5
+
+.. code-block:: none
+
+ 20 HDDs
+ Vendor: Intel
+ Model: SSD-123-foo
+ Size: 4TB
+ 2 SSDs
+ Vendor: VendorA
+ Model: MC-55-44-ZX
+ Size: 512GB
+
+Node6-10
+
+.. code-block:: none
+
+ 5 NVMEs
+ Vendor: Intel
+ Model: SSD-123-foo
+ Size: 4TB
+ 20 SSDs
+ Vendor: VendorA
+ Model: MC-55-44-ZX
+ Size: 512GB
+
+You can use the 'placement' key in the layout to target certain nodes.
+
+.. code-block:: yaml
+
+    service_type: osd
+    service_id: disk_layout_a
+    placement:
+      label: disk_layout_a
+    spec:
+      data_devices:
+        rotational: 1
+      db_devices:
+        rotational: 0
+    ---
+    service_type: osd
+    service_id: disk_layout_b
+    placement:
+      label: disk_layout_b
+    spec:
+      data_devices:
+        model: MC-55-44-ZX
+      db_devices:
+        model: SSD-123-foo
+
+This applies different OSD specs to different hosts depending on the ``placement`` key.
+See :ref:`orchestrator-cli-placement-spec`.
+
+.. note::
+
+ Assuming each host has a unique disk layout, each OSD
+ spec needs to have a different service id
+
+
+Dedicated wal + db
+------------------
+
+All previous cases co-located the WALs with the DBs.
+It is, however, also possible to deploy the WAL on a dedicated device if that makes sense for your setup.
+
+.. code-block:: none
+
+ 20 HDDs
+ Vendor: VendorA
+ Model: SSD-123-foo
+ Size: 4TB
+
+ 2 SSDs
+ Vendor: VendorB
+ Model: MC-55-44-ZX
+ Size: 512GB
+
+ 2 NVMEs
+ Vendor: VendorC
+ Model: NVME-QQQQ-987
+ Size: 256GB
+
+
+The OSD spec for this case would look like the following (using the `model` filter):
+
+.. code-block:: yaml
+
+    service_type: osd
+    service_id: osd_spec_default
+    placement:
+      host_pattern: '*'
+    spec:
+      data_devices:
+        model: SSD-123-foo    # the 20 HDDs in the listing above
+      db_devices:
+        model: MC-55-44-ZX    # the 2 SSDs
+      wal_devices:
+        model: NVME-QQQQ-987  # the 2 NVMEs
+
+
+It is also possible to specify device paths directly on specific hosts, as in the following example:
+
+.. code-block:: yaml
+
+    service_type: osd
+    service_id: osd_using_paths
+    placement:
+      hosts:
+        - Node01
+        - Node02
+    spec:
+      data_devices:
+        paths:
+        - /dev/sdb
+      db_devices:
+        paths:
+        - /dev/sdc
+      wal_devices:
+        paths:
+        - /dev/sdd
+
+
+This can easily be done with other filters as well, such as ``size`` or ``vendor``.
+
+.. _cephadm-osd-activate:
+
+Activate existing OSDs
+======================
+
+If the OS of a host has been reinstalled, existing OSDs need to be activated
+again. For this use case, cephadm provides a wrapper for :ref:`ceph-volume-lvm-activate` that
+activates all existing OSDs on a host.
+
+.. prompt:: bash #
+
+ ceph cephadm osd activate <host>...
+
+This will scan all existing disks for OSDs and deploy corresponding daemons.
+
+Further Reading
+===============
+
+* :ref:`ceph-volume`
+* :ref:`rados-index`
diff --git a/doc/cephadm/services/rgw.rst b/doc/cephadm/services/rgw.rst
new file mode 100644
index 000000000..0f9b14650
--- /dev/null
+++ b/doc/cephadm/services/rgw.rst
@@ -0,0 +1,324 @@
+===========
+RGW Service
+===========
+
+.. _cephadm-deploy-rgw:
+
+Deploy RGWs
+===========
+
+Cephadm deploys radosgw as a collection of daemons that manage a
+single-cluster deployment or a particular *realm* and *zone* in a
+multisite deployment. (For more information about realms and zones,
+see :ref:`multisite`.)
+
+Note that with cephadm, radosgw daemons are configured via the monitor
+configuration database instead of via a `ceph.conf` or the command line. If
+that configuration isn't already in place (usually in the
+``client.rgw.<something>`` section), then the radosgw
+daemons will start up with default settings (e.g., binding to port
+80).
+
+To deploy a set of radosgw daemons, with an arbitrary service name
+*name*, run the following command:
+
+.. prompt:: bash #
+
+ ceph orch apply rgw *<name>* [--realm=*<realm-name>*] [--zone=*<zone-name>*] --placement="*<num-daemons>* [*<host1>* ...]"
+
+Trivial setup
+-------------
+
+For example, to deploy 2 RGW daemons (the default) for a single-cluster RGW deployment
+under the arbitrary service id *foo*:
+
+.. prompt:: bash #
+
+ ceph orch apply rgw foo
+
+.. _cephadm-rgw-designated_gateways:
+
+Designated gateways
+-------------------
+
+A common scenario is to have a labeled set of hosts that will act
+as gateways, with multiple instances of radosgw running on consecutive
+ports 8000 and 8001:
+
+.. prompt:: bash #
+
+ ceph orch host label add gwhost1 rgw # the 'rgw' label can be anything
+ ceph orch host label add gwhost2 rgw
+ ceph orch apply rgw foo '--placement=label:rgw count-per-host:2' --port=8000
+
+See also: :ref:`cephadm_co_location`.
+
+.. _cephadm-rgw-networks:
+
+Specifying Networks
+-------------------
+
+The networks that the RGW service binds to can be configured with a YAML service specification.
+
+Example spec file:
+
+.. code-block:: yaml
+
+    service_type: rgw
+    service_id: foo
+    placement:
+      label: rgw
+      count_per_host: 2
+    networks:
+    - 192.169.142.0/24
+    spec:
+      rgw_frontend_port: 8080
+
+
+Multisite zones
+---------------
+
+To deploy RGWs serving the multisite *myorg* realm and the *us-east-1* zone on
+*myhost1* and *myhost2*:
+
+.. prompt:: bash #
+
+ ceph orch apply rgw east --realm=myorg --zone=us-east-1 --placement="2 myhost1 myhost2"
+
+Note that in a multisite situation, cephadm only deploys the daemons. It does not create
+or update the realm or zone configurations. To create a new realm and zone, run
+commands like the following:
+
+.. prompt:: bash #
+
+ radosgw-admin realm create --rgw-realm=<realm-name> --default
+
+.. prompt:: bash #
+
+ radosgw-admin zonegroup create --rgw-zonegroup=<zonegroup-name> --master --default
+
+.. prompt:: bash #
+
+ radosgw-admin zone create --rgw-zonegroup=<zonegroup-name> --rgw-zone=<zone-name> --master --default
+
+.. prompt:: bash #
+
+ radosgw-admin period update --rgw-realm=<realm-name> --commit
+
+See :ref:`orchestrator-cli-placement-spec` for details of the placement
+specification. See :ref:`multisite` for more information on setting up multisite RGW.
+
+Setting up HTTPS
+----------------
+
+In order to enable HTTPS for RGW services, apply a spec file following this scheme:
+
+.. code-block:: yaml
+
+ service_type: rgw
+ service_id: myrgw
+ spec:
+ rgw_frontend_ssl_certificate: |
+ -----BEGIN PRIVATE KEY-----
+ V2VyIGRhcyBsaWVzdCBpc3QgZG9vZi4gTG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFt
+ ZXQsIGNvbnNldGV0dXIgc2FkaXBzY2luZyBlbGl0ciwgc2VkIGRpYW0gbm9udW15
+ IGVpcm1vZCB0ZW1wb3IgaW52aWR1bnQgdXQgbGFib3JlIGV0IGRvbG9yZSBtYWdu
+ YSBhbGlxdXlhbSBlcmF0LCBzZWQgZGlhbSB2b2x1cHR1YS4gQXQgdmVybyBlb3Mg
+ ZXQgYWNjdXNhbSBldCBqdXN0byBkdW8=
+ -----END PRIVATE KEY-----
+ -----BEGIN CERTIFICATE-----
+ V2VyIGRhcyBsaWVzdCBpc3QgZG9vZi4gTG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFt
+ ZXQsIGNvbnNldGV0dXIgc2FkaXBzY2luZyBlbGl0ciwgc2VkIGRpYW0gbm9udW15
+ IGVpcm1vZCB0ZW1wb3IgaW52aWR1bnQgdXQgbGFib3JlIGV0IGRvbG9yZSBtYWdu
+ YSBhbGlxdXlhbSBlcmF0LCBzZWQgZGlhbSB2b2x1cHR1YS4gQXQgdmVybyBlb3Mg
+ ZXQgYWNjdXNhbSBldCBqdXN0byBkdW8=
+ -----END CERTIFICATE-----
+ ssl: true
+
+Then apply this yaml document:
+
+.. prompt:: bash #
+
+ ceph orch apply -i myrgw.yaml
+
+Note that the value of ``rgw_frontend_ssl_certificate`` is a literal string, as
+indicated by the ``|`` character, which preserves newline characters.
+
+Service specification
+---------------------
+
+.. py:currentmodule:: ceph.deployment.service_spec
+
+.. autoclass:: RGWSpec
+ :members:
+
+.. _orchestrator-haproxy-service-spec:
+
+High availability service for RGW
+=================================
+
+The *ingress* service allows you to create a high availability endpoint
+for RGW with a minimum set of configuration options. The orchestrator will
+deploy and manage a combination of haproxy and keepalived to provide load
+balancing on a floating virtual IP.
+
+If SSL is used, then SSL must be configured and terminated by the ingress service
+and not RGW itself.
+
+.. image:: ../../images/HAProxy_for_RGW.svg
+
+There are N hosts where the ingress service is deployed. Each host
+has a haproxy daemon and a keepalived daemon. A virtual IP is
+automatically configured on only one of these hosts at a time.
+
+Each keepalived daemon checks every few seconds whether the haproxy
+daemon on the same host is responding. Keepalived will also check
+that the master keepalived daemon is running without problems. If the
+"master" keepalived daemon or the active haproxy is not responding,
+one of the remaining keepalived daemons running in backup mode will be
+elected as master, and the virtual IP will be moved to that node.
+
+The active haproxy acts like a load balancer, distributing all RGW requests
+between all the RGW daemons available.
+
+Prerequisites
+-------------
+
+* An existing RGW service, without SSL. (If you want SSL service, the certificate
+ should be configured on the ingress service, not the RGW service.)
+
+Deploying
+---------
+
+Use the command::
+
+ ceph orch apply -i <ingress_spec_file>
+
+Service specification
+---------------------
+
+The service specification is a YAML file with the following properties:
+
+.. code-block:: yaml
+
+ service_type: ingress
+ service_id: rgw.something # adjust to match your existing RGW service
+ placement:
+ hosts:
+ - host1
+ - host2
+ - host3
+ spec:
+ backend_service: rgw.something # adjust to match your existing RGW service
+ virtual_ip: <string>/<string> # ex: 192.168.20.1/24
+ frontend_port: <integer> # ex: 8080
+ monitor_port: <integer> # ex: 1967, used by haproxy for load balancer status
+ virtual_interface_networks: [ ... ] # optional: list of CIDR networks
+ ssl_cert: | # optional: SSL certificate and key
+ -----BEGIN CERTIFICATE-----
+ ...
+ -----END CERTIFICATE-----
+ -----BEGIN PRIVATE KEY-----
+ ...
+ -----END PRIVATE KEY-----
+
+To configure more than one virtual IP, use ``virtual_ips_list`` as in the
+following example:
+
+.. code-block:: yaml
+
+ service_type: ingress
+ service_id: rgw.something # adjust to match your existing RGW service
+ placement:
+ hosts:
+ - host1
+ - host2
+ - host3
+ spec:
+ backend_service: rgw.something # adjust to match your existing RGW service
+ virtual_ips_list:
+ - <string>/<string> # ex: 192.168.20.1/24
+ - <string>/<string> # ex: 192.168.20.2/24
+ - <string>/<string> # ex: 192.168.20.3/24
+ frontend_port: <integer> # ex: 8080
+ monitor_port: <integer> # ex: 1967, used by haproxy for load balancer status
+ virtual_interface_networks: [ ... ] # optional: list of CIDR networks
+ ssl_cert: | # optional: SSL certificate and key
+ -----BEGIN CERTIFICATE-----
+ ...
+ -----END CERTIFICATE-----
+ -----BEGIN PRIVATE KEY-----
+ ...
+ -----END PRIVATE KEY-----
+
+
+where the properties of this service specification are:
+
+* ``service_type``
+ Mandatory and set to "ingress"
+* ``service_id``
+ The name of the service. We suggest naming this after the service you are
+ controlling ingress for (e.g., ``rgw.foo``).
+* ``placement hosts``
+ The hosts where it is desired to run the HA daemons. An haproxy and a
+ keepalived container will be deployed on these hosts. These hosts do not need
+ to match the nodes where RGW is deployed.
+* ``virtual_ip``
+ The virtual IP (and network) in CIDR format where the ingress service will be available.
+* ``virtual_ips_list``
+    A list of virtual IP addresses in CIDR format where the ingress service will be available.
+    Each virtual IP address will be primary on one node running the ingress service. The number
+    of virtual IP addresses must be less than or equal to the number of ingress nodes.
+* ``virtual_interface_networks``
+ A list of networks to identify which ethernet interface to use for the virtual IP.
+* ``frontend_port``
+ The port used to access the ingress service.
+* ``ssl_cert``
+    SSL certificate, if SSL is to be enabled. This must contain both the certificate and
+    private key blocks in .pem format.
+
+.. _ingress-virtual-ip:
+
+Selecting ethernet interfaces for the virtual IP
+------------------------------------------------
+
+You cannot simply provide the name of the network interface on which
+to configure the virtual IP because interface names tend to vary
+across hosts (and/or reboots). Instead, cephadm will select
+interfaces based on other existing IP addresses that are already
+configured.
+
+Normally, the virtual IP will be configured on the first network
+interface that has an existing IP in the same subnet. For example, if
+the virtual IP is 192.168.0.80/24 and eth2 has the static IP
+192.168.0.40/24, cephadm will use eth2.
+
+In some cases, the virtual IP may not belong to the same subnet as an existing static
+IP. In such cases, you can provide a list of subnets to match against existing IPs,
+and cephadm will put the virtual IP on the first network interface to match. For example,
+if the virtual IP is 192.168.0.80/24 and we want it on the same interface as the machine's
+static IP in 10.10.0.0/16, you can use a spec like::
+
+    service_type: ingress
+    service_id: rgw.something
+    spec:
+      virtual_ip: 192.168.0.80/24
+      virtual_interface_networks:
+        - 10.10.0.0/16
+      ...
+
+A consequence of this strategy is that you cannot currently configure the virtual IP
+on an interface that has no existing IP address. In this situation, we suggest
+configuring a "dummy" IP address in an unroutable network on the correct interface
+and referencing that dummy network in the networks list (see above).
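+
+The selection rule described in this section can be sketched in Python with
+the standard ``ipaddress`` module. This is an illustration of the rule as
+stated here, not cephadm's code: the ``pick_interface`` helper and the
+``host_addrs`` list of (interface, address) pairs are hypothetical.
+

```python
import ipaddress


def pick_interface(virtual_ip, host_addrs, virtual_interface_networks=()):
    """Return the first interface whose existing IP is in a candidate network.

    host_addrs is a list of (interface_name, 'ip/prefix') pairs in host order.
    Without virtual_interface_networks, the virtual IP's own subnet is used.
    """
    if virtual_interface_networks:
        candidates = [ipaddress.ip_network(n) for n in virtual_interface_networks]
    else:
        candidates = [ipaddress.ip_interface(virtual_ip).network]
    for name, addr in host_addrs:
        ip = ipaddress.ip_interface(addr).ip
        if any(ip in net for net in candidates):
            return name
    return None  # no interface has an existing IP in a candidate network


host = [("eth0", "10.10.0.5/16"), ("eth2", "192.168.0.40/24")]

# Default: match on the virtual IP's own subnet.
print(pick_interface("192.168.0.80/24", host))
# With an explicit network list: match on 10.10.0.0/16 instead.
print(pick_interface("192.168.0.80/24", host, ["10.10.0.0/16"]))
```

+
+The ``None`` result corresponds to the limitation above: an interface with no
+existing IP address can never be selected.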
+
+
+Useful hints for ingress
+------------------------
+
+* It is good to have at least 3 RGW daemons.
+* We recommend at least 3 hosts for the ingress service.
+
+Further Reading
+===============
+
+* :ref:`object-gateway`
diff --git a/doc/cephadm/services/snmp-gateway.rst b/doc/cephadm/services/snmp-gateway.rst
new file mode 100644
index 000000000..f927fdfd0
--- /dev/null
+++ b/doc/cephadm/services/snmp-gateway.rst
@@ -0,0 +1,171 @@
+====================
+SNMP Gateway Service
+====================
+
+SNMP_ is still a widely used protocol for monitoring distributed systems and devices across a variety of hardware
+and software platforms. Ceph's SNMP integration focuses on forwarding alerts from its Prometheus Alertmanager
+cluster to a gateway daemon. The gateway daemon transforms the alert into an SNMP Notification and sends
+it on to a designated SNMP management platform. The gateway daemon is from the snmp_notifier_ project,
+which provides SNMP V2c and V3 support (authentication and encryption).
+
+Ceph's SNMP gateway service deploys one instance of the gateway by default. You may increase this
+by providing placement information. However, bear in mind that if you enable multiple SNMP gateway daemons,
+your SNMP management platform will receive multiple notifications for the same event.
+
+.. _SNMP: https://en.wikipedia.org/wiki/Simple_Network_Management_Protocol
+.. _snmp_notifier: https://github.com/maxwo/snmp_notifier
+
+Compatibility
+=============
+The table below shows the SNMP versions that are supported by the gateway implementation:
+
+================ =========== ===============================================
+ SNMP Version Supported Notes
+================ =========== ===============================================
+ V1 ❌ Not supported by snmp_notifier
+ V2c ✔
+ V3 authNoPriv ✔ uses username/password authentication, without
+ encryption (NoPriv = no privacy)
+ V3 authPriv ✔ uses username/password authentication with
+ encryption to the SNMP management platform
+================ =========== ===============================================
+
+
+Deploying an SNMP Gateway
+=========================
+Both SNMP V2c and V3 provide credentials support. In the case of V2c, this is just the community string, but for V3
+environments you must provide additional authentication information. These credentials are not supported on the command
+line when deploying the service. Instead, you must create the service using a credentials file (in yaml format), or
+specify the complete service definition in a yaml file.
+
+Command format
+--------------
+
+.. prompt:: bash #
+
+ ceph orch apply snmp-gateway <snmp_version:V2c|V3> <destination> [<port:int>] [<engine_id>] [<auth_protocol: MD5|SHA>] [<privacy_protocol:DES|AES>] [<placement>] ...
+
+
+Usage notes:
+
+- You must supply the ``--snmp-version`` parameter.
+- The ``--destination`` parameter must be of the format ``hostname:port`` (there is no default).
+- You may omit ``--port``; it defaults to 9464.
+- The ``--engine-id`` is a unique identifier for the device (in hex) and is required for SNMP V3 only.
+  Suggested value: ``8000C53F<fsid>``, where ``<fsid>`` is the fsid of your cluster without the '-' symbols.
+- For SNMP V3, the ``--auth-protocol`` setting defaults to **SHA**.
+- For SNMP V3 with encryption, you must define the ``--privacy-protocol``.
+- You **must** provide ``-i <filename>`` to pass the secrets/passwords to the orchestrator.
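+
+The suggested engine ID from the notes above is simple to build. The Python
+sketch below shows the string manipulation only: the ``suggested_engine_id``
+helper is hypothetical and the fsid is a made-up placeholder, so substitute
+the fsid of your own cluster.
+

```python
def suggested_engine_id(fsid):
    """Prefix the cluster fsid (dashes removed) with 8000C53F."""
    return "8000C53F" + fsid.replace("-", "")


# Placeholder fsid for illustration only; use your cluster's real fsid.
example_fsid = "01234567-89ab-cdef-0123-456789abcdef"
print(suggested_engine_id(example_fsid))
```
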
+
+Deployment Examples
+===================
+
+SNMP V2c
+--------
+Here's an example for V2c, showing CLI and service based deployments:
+
+.. prompt:: bash #
+
+   ceph orch apply snmp-gateway --port 9464 --snmp-version=V2c --destination=192.168.122.73:162 -i ./snmp_creds.yaml
+
+with a credentials file that contains:
+
+.. code-block:: yaml
+
+ ---
+ snmp_community: public
+
+Alternatively, you can create a yaml definition for the gateway and apply it from a single file:
+
+.. prompt:: bash #
+
+ ceph orch apply -i snmp-gateway.yml
+
+with the file containing the following configuration:
+
+.. code-block:: yaml
+
+    service_type: snmp-gateway
+    service_name: snmp-gateway
+    placement:
+      count: 1
+    spec:
+      credentials:
+        snmp_community: public
+      port: 9464
+      snmp_destination: 192.168.122.73:162
+      snmp_version: V2c
+
+
+SNMP V3 (authNoPriv)
+--------------------
+Deploying an snmp-gateway service that supports SNMP V3 with authentication only would look like this:
+
+.. prompt:: bash #
+
+   ceph orch apply snmp-gateway --snmp-version=V3 --engine-id=8000C53F00000000 --destination=192.168.122.1:162 -i ./snmpv3_creds.yml
+
+with a credentials file such as:
+
+.. code-block:: yaml
+
+ ---
+ snmp_v3_auth_username: myuser
+ snmp_v3_auth_password: mypassword
+
+or as a service configuration file:
+
+.. code-block:: yaml
+
+    service_type: snmp-gateway
+    service_name: snmp-gateway
+    placement:
+      count: 1
+    spec:
+      credentials:
+        snmp_v3_auth_password: mypassword
+        snmp_v3_auth_username: myuser
+      engine_id: 8000C53F00000000
+      port: 9464
+      snmp_destination: 192.168.122.1:162
+      snmp_version: V3
+
+
+SNMP V3 (authPriv)
+------------------
+
+Defining an SNMP V3 gateway service that implements authentication and privacy (encryption) requires two additional values:
+
+.. prompt:: bash #
+
+   ceph orch apply snmp-gateway --snmp-version=V3 --engine-id=8000C53F00000000 --destination=192.168.122.1:162 --privacy-protocol=AES -i ./snmpv3_creds.yml
+
+with a credentials file such as:
+
+.. code-block:: yaml
+
+ ---
+ snmp_v3_auth_username: myuser
+ snmp_v3_auth_password: mypassword
+ snmp_v3_priv_password: mysecret
+
+
+.. note::
+
+ The credentials are stored on the host, restricted to the root user and passed to the snmp_notifier daemon as
+ an environment file (``--env-file``), to limit exposure.
+
+
+AlertManager Integration
+========================
+When an SNMP gateway service is deployed or updated, the Prometheus Alertmanager configuration is automatically updated to forward any
+alert that has an OID_ label to the SNMP gateway daemon for processing.
+
+.. _OID: https://en.wikipedia.org/wiki/Object_identifier
+
+Implementing the MIB
+======================
+To make sense of the SNMP Notification/Trap, you'll need to apply the MIB to your SNMP management platform. The MIB (CEPH-MIB.txt) can
+be downloaded from the main Ceph repo_.
+
+.. _repo: https://github.com/ceph/ceph/tree/master/monitoring/snmp