diff options
Diffstat (limited to '')
-rw-r--r-- | doc/cephadm/services/custom-container.rst | 79 | ||||
-rw-r--r-- | doc/cephadm/services/index.rst | 871 | ||||
-rw-r--r-- | doc/cephadm/services/iscsi.rst | 87 | ||||
-rw-r--r-- | doc/cephadm/services/mds.rst | 61 | ||||
-rw-r--r-- | doc/cephadm/services/mgr.rst | 43 | ||||
-rw-r--r-- | doc/cephadm/services/mon.rst | 237 | ||||
-rw-r--r-- | doc/cephadm/services/monitoring.rst | 543 | ||||
-rw-r--r-- | doc/cephadm/services/nfs.rst | 215 | ||||
-rw-r--r-- | doc/cephadm/services/osd.rst | 997 | ||||
-rw-r--r-- | doc/cephadm/services/rgw.rst | 371 | ||||
-rw-r--r-- | doc/cephadm/services/snmp-gateway.rst | 171 | ||||
-rw-r--r-- | doc/cephadm/services/tracing.rst | 45 |
12 files changed, 3720 insertions, 0 deletions
diff --git a/doc/cephadm/services/custom-container.rst b/doc/cephadm/services/custom-container.rst new file mode 100644 index 000000000..3ece248c5 --- /dev/null +++ b/doc/cephadm/services/custom-container.rst @@ -0,0 +1,79 @@ +======================== +Custom Container Service +======================== + +The orchestrator enables custom containers to be deployed using a YAML file. +A corresponding :ref:`orchestrator-cli-service-spec` must look like: + +.. code-block:: yaml + + service_type: container + service_id: foo + placement: + ... + spec: + image: docker.io/library/foo:latest + entrypoint: /usr/bin/foo + uid: 1000 + gid: 1000 + args: + - "--net=host" + - "--cpus=2" + ports: + - 8080 + - 8443 + envs: + - SECRET=mypassword + - PORT=8080 + - PUID=1000 + - PGID=1000 + volume_mounts: + CONFIG_DIR: /etc/foo + bind_mounts: + - ['type=bind', 'source=lib/modules', 'destination=/lib/modules', 'ro=true'] + dirs: + - CONFIG_DIR + files: + CONFIG_DIR/foo.conf: + - refresh=true + - username=xyz + - "port: 1234" + +where the properties of a service specification are: + +* ``service_id`` + A unique name of the service. +* ``image`` + The name of the Docker image. +* ``uid`` + The UID to use when creating directories and files in the host system. +* ``gid`` + The GID to use when creating directories and files in the host system. +* ``entrypoint`` + Overwrite the default ENTRYPOINT of the image. +* ``args`` + A list of additional Podman/Docker command line arguments. +* ``ports`` + A list of TCP ports to open in the host firewall. +* ``envs`` + A list of environment variables. +* ``bind_mounts`` + When you use a bind mount, a file or directory on the host machine + is mounted into the container. Relative `source=...` paths will be + located below `/var/lib/ceph/<cluster-fsid>/<daemon-name>`. +* ``volume_mounts`` + When you use a volume mount, a new directory is created within + Docker’s storage directory on the host machine, and Docker manages + that directory’s contents. Relative source paths will be located below + `/var/lib/ceph/<cluster-fsid>/<daemon-name>`. +* ``dirs`` + A list of directories that are created below + `/var/lib/ceph/<cluster-fsid>/<daemon-name>`. +* ``files`` + A dictionary, where the key is the relative path of the file and the + value the file content. The content must be double quoted when using + a string. Use '\\n' for line breaks in that case. Otherwise define + multi-line content as list of strings. The given files will be created + below the directory `/var/lib/ceph/<cluster-fsid>/<daemon-name>`. + The absolute path of the directory where the file will be created must + exist. Use the `dirs` property to create them if necessary. diff --git a/doc/cephadm/services/index.rst b/doc/cephadm/services/index.rst new file mode 100644 index 000000000..82f83bfac --- /dev/null +++ b/doc/cephadm/services/index.rst @@ -0,0 +1,871 @@ +================== +Service Management +================== + +A service is a group of daemons configured together. See these chapters +for details on individual services: + +.. toctree:: + :maxdepth: 1 + + mon + mgr + osd + rgw + mds + nfs + iscsi + custom-container + monitoring + snmp-gateway + tracing + +Service Status +============== + + +To see the status of one +of the services running in the Ceph cluster, do the following: + +#. Use the command line to print a list of services. +#. Locate the service whose status you want to check. +#. Print the status of the service. + +The following command prints a list of services known to the orchestrator. To +limit the output to services only on a specified host, use the optional +``--host`` parameter. To limit the output to services of only a particular +type, use the optional ``--type`` parameter (mon, osd, mgr, mds, rgw): + + .. prompt:: bash # + + ceph orch ls [--service_type type] [--service_name name] [--export] [--format f] [--refresh] + +Discover the status of a particular service or daemon: + + .. prompt:: bash # + + ceph orch ls --service_type type --service_name <name> [--refresh] + +To export the service specifications knows to the orchestrator, run the following command. + + .. prompt:: bash # + + ceph orch ls --export + +The service specifications exported with this command will be exported as yaml +and that yaml can be used with the ``ceph orch apply -i`` command. + +For information about retrieving the specifications of single services (including examples of commands), see :ref:`orchestrator-cli-service-spec-retrieve`. + +Daemon Status +============= + +A daemon is a systemd unit that is running and part of a service. + +To see the status of a daemon, do the following: + +#. Print a list of all daemons known to the orchestrator. +#. Query the status of the target daemon. + +First, print a list of all daemons known to the orchestrator: + + .. prompt:: bash # + + ceph orch ps [--hostname host] [--daemon_type type] [--service_name name] [--daemon_id id] [--format f] [--refresh] + +Then query the status of a particular service instance (mon, osd, mds, rgw). +For OSDs the id is the numeric OSD ID. For MDS services the id is the file +system name: + + .. prompt:: bash # + + ceph orch ps --daemon_type osd --daemon_id 0 + +.. note:: + The output of the command ``ceph orch ps`` may not reflect the current status of the daemons. By default, + the status is updated every 10 minutes. This interval can be shortened by modifying the ``mgr/cephadm/daemon_cache_timeout`` + configuration variable (in seconds) e.g: ``ceph config set mgr mgr/cephadm/daemon_cache_timeout 60`` would reduce the refresh + interval to one minute. The information is updated every ``daemon_cache_timeout`` seconds unless the ``--refresh`` option + is used. This option would trigger a request to refresh the information, which may take some time depending on the size of + the cluster. In general ``REFRESHED`` value indicates how recent the information displayed by ``ceph orch ps`` and similar + commands is. + +.. _orchestrator-cli-service-spec: + +Service Specification +===================== + +A *Service Specification* is a data structure that is used to specify the +deployment of services. In addition to parameters such as `placement` or +`networks`, the user can set initial values of service configuration parameters +by means of the `config` section. For each param/value configuration pair, +cephadm calls the following command to set its value: + + .. prompt:: bash # + + ceph config set <service-name> <param> <value> + +cephadm raises health warnings in case invalid configuration parameters are +found in the spec (`CEPHADM_INVALID_CONFIG_OPTION`) or if any error while +trying to apply the new configuration option(s) (`CEPHADM_FAILED_SET_OPTION`). + +Here is an example of a service specification in YAML: + +.. code-block:: yaml + + service_type: rgw + service_id: realm.zone + placement: + hosts: + - host1 + - host2 + - host3 + config: + param_1: val_1 + ... + param_N: val_N + unmanaged: false + networks: + - 192.169.142.0/24 + spec: + # Additional service specific attributes. + +In this example, the properties of this service specification are: + +.. py:currentmodule:: ceph.deployment.service_spec + +.. autoclass:: ServiceSpec + :members: + +Each service type can have additional service-specific properties. + +Service specifications of type ``mon``, ``mgr``, and the monitoring +types do not require a ``service_id``. + +A service of type ``osd`` is described in :ref:`drivegroups` + +Many service specifications can be applied at once using ``ceph orch apply -i`` +by submitting a multi-document YAML file:: + + cat <<EOF | ceph orch apply -i - + service_type: mon + placement: + host_pattern: "mon*" + --- + service_type: mgr + placement: + host_pattern: "mgr*" + --- + service_type: osd + service_id: default_drive_group + placement: + host_pattern: "osd*" + data_devices: + all: true + EOF + +.. _orchestrator-cli-service-spec-retrieve: + +Retrieving the running Service Specification +-------------------------------------------- + +If the services have been started via ``ceph orch apply...``, then directly changing +the Services Specification is complicated. Instead of attempting to directly change +the Services Specification, we suggest exporting the running Service Specification by +following these instructions: + + .. prompt:: bash # + + ceph orch ls --service-name rgw.<realm>.<zone> --export > rgw.<realm>.<zone>.yaml + ceph orch ls --service-type mgr --export > mgr.yaml + ceph orch ls --export > cluster.yaml + +The Specification can then be changed and re-applied as above. + +Updating Service Specifications +------------------------------- + +The Ceph Orchestrator maintains a declarative state of each +service in a ``ServiceSpec``. For certain operations, like updating +the RGW HTTP port, we need to update the existing +specification. + +1. List the current ``ServiceSpec``: + + .. prompt:: bash # + + ceph orch ls --service_name=<service-name> --export > myservice.yaml + +2. Update the yaml file: + + .. prompt:: bash # + + vi myservice.yaml + +3. Apply the new ``ServiceSpec``: + + .. prompt:: bash # + + ceph orch apply -i myservice.yaml [--dry-run] + +.. _orchestrator-cli-placement-spec: + +Daemon Placement +================ + +For the orchestrator to deploy a *service*, it needs to know where to deploy +*daemons*, and how many to deploy. This is the role of a placement +specification. Placement specifications can either be passed as command line arguments +or in a YAML files. + +.. note:: + + cephadm will not deploy daemons on hosts with the ``_no_schedule`` label; see :ref:`cephadm-special-host-labels`. + +.. note:: + The **apply** command can be confusing. For this reason, we recommend using + YAML specifications. + + Each ``ceph orch apply <service-name>`` command supersedes the one before it. + If you do not use the proper syntax, you will clobber your work + as you go. + + For example: + + .. prompt:: bash # + + ceph orch apply mon host1 + ceph orch apply mon host2 + ceph orch apply mon host3 + + This results in only one host having a monitor applied to it: host 3. + + (The first command creates a monitor on host1. Then the second command + clobbers the monitor on host1 and creates a monitor on host2. Then the + third command clobbers the monitor on host2 and creates a monitor on + host3. In this scenario, at this point, there is a monitor ONLY on + host3.) + + To make certain that a monitor is applied to each of these three hosts, + run a command like this: + + .. prompt:: bash # + + ceph orch apply mon "host1,host2,host3" + + There is another way to apply monitors to multiple hosts: a ``yaml`` file + can be used. Instead of using the "ceph orch apply mon" commands, run a + command of this form: + + .. prompt:: bash # + + ceph orch apply -i file.yaml + + Here is a sample **file.yaml** file + + .. code-block:: yaml + + service_type: mon + placement: + hosts: + - host1 + - host2 + - host3 + +Explicit placements +------------------- + +Daemons can be explicitly placed on hosts by simply specifying them: + + .. prompt:: bash # + + ceph orch apply prometheus --placement="host1 host2 host3" + +Or in YAML: + +.. code-block:: yaml + + service_type: prometheus + placement: + hosts: + - host1 + - host2 + - host3 + +MONs and other services may require some enhanced network specifications: + + .. prompt:: bash # + + ceph orch daemon add mon --placement="myhost:[v2:1.2.3.4:3300,v1:1.2.3.4:6789]=name" + +where ``[v2:1.2.3.4:3300,v1:1.2.3.4:6789]`` is the network address of the monitor +and ``=name`` specifies the name of the new monitor. + +.. _orch-placement-by-labels: + +Placement by labels +------------------- + +Daemon placement can be limited to hosts that match a specific label. To set +a label ``mylabel`` to the appropriate hosts, run this command: + + .. prompt:: bash # + + ceph orch host label add *<hostname>* mylabel + + To view the current hosts and labels, run this command: + + .. prompt:: bash # + + ceph orch host ls + + For example: + + .. prompt:: bash # + + ceph orch host label add host1 mylabel + ceph orch host label add host2 mylabel + ceph orch host label add host3 mylabel + ceph orch host ls + + .. code-block:: bash + + HOST ADDR LABELS STATUS + host1 mylabel + host2 mylabel + host3 mylabel + host4 + host5 + +Now, Tell cephadm to deploy daemons based on the label by running +this command: + + .. prompt:: bash # + + ceph orch apply prometheus --placement="label:mylabel" + +Or in YAML: + +.. code-block:: yaml + + service_type: prometheus + placement: + label: "mylabel" + +* See :ref:`orchestrator-host-labels` + +Placement by pattern matching +----------------------------- + +Daemons can be placed on hosts as well: + + .. prompt:: bash # + + ceph orch apply prometheus --placement='myhost[1-3]' + +Or in YAML: + +.. code-block:: yaml + + service_type: prometheus + placement: + host_pattern: "myhost[1-3]" + +To place a service on *all* hosts, use ``"*"``: + + .. prompt:: bash # + + ceph orch apply node-exporter --placement='*' + +Or in YAML: + +.. code-block:: yaml + + service_type: node-exporter + placement: + host_pattern: "*" + + +Changing the number of daemons +------------------------------ + +By specifying ``count``, only the number of daemons specified will be created: + + .. prompt:: bash # + + ceph orch apply prometheus --placement=3 + +To deploy *daemons* on a subset of hosts, specify the count: + + .. prompt:: bash # + + ceph orch apply prometheus --placement="2 host1 host2 host3" + +If the count is bigger than the amount of hosts, cephadm deploys one per host: + + .. prompt:: bash # + + ceph orch apply prometheus --placement="3 host1 host2" + +The command immediately above results in two Prometheus daemons. + +YAML can also be used to specify limits, in the following way: + +.. code-block:: yaml + + service_type: prometheus + placement: + count: 3 + +YAML can also be used to specify limits on hosts: + +.. code-block:: yaml + + service_type: prometheus + placement: + count: 2 + hosts: + - host1 + - host2 + - host3 + +.. _cephadm_co_location: + +Co-location of daemons +---------------------- + +Cephadm supports the deployment of multiple daemons on the same host: + +.. code-block:: yaml + + service_type: rgw + placement: + label: rgw + count_per_host: 2 + +The main reason for deploying multiple daemons per host is an additional +performance benefit for running multiple RGW and MDS daemons on the same host. + +See also: + +* :ref:`cephadm_mgr_co_location`. +* :ref:`cephadm-rgw-designated_gateways`. + +This feature was introduced in Pacific. + +Algorithm description +--------------------- + +Cephadm's declarative state consists of a list of service specifications +containing placement specifications. + +Cephadm continually compares a list of daemons actually running in the cluster +against the list in the service specifications. Cephadm adds new daemons and +removes old daemons as necessary in order to conform to the service +specifications. + +Cephadm does the following to maintain compliance with the service +specifications. + +Cephadm first selects a list of candidate hosts. Cephadm seeks explicit host +names and selects them. If cephadm finds no explicit host names, it looks for +label specifications. If no label is defined in the specification, cephadm +selects hosts based on a host pattern. If no host pattern is defined, as a last +resort, cephadm selects all known hosts as candidates. + +Cephadm is aware of existing daemons running services and tries to avoid moving +them. + +Cephadm supports the deployment of a specific amount of services. +Consider the following service specification: + +.. code-block:: yaml + + service_type: mds + service_name: myfs + placement: + count: 3 + label: myfs + +This service specification instructs cephadm to deploy three daemons on hosts +labeled ``myfs`` across the cluster. + +If there are fewer than three daemons deployed on the candidate hosts, cephadm +randomly chooses hosts on which to deploy new daemons. + +If there are more than three daemons deployed on the candidate hosts, cephadm +removes existing daemons. + +Finally, cephadm removes daemons on hosts that are outside of the list of +candidate hosts. + +.. note:: + + There is a special case that cephadm must consider. + + If there are fewer hosts selected by the placement specification than + demanded by ``count``, cephadm will deploy only on the selected hosts. + +.. _cephadm-extra-container-args: + +Extra Container Arguments +========================= + +.. warning:: + The arguments provided for extra container args are limited to whatever arguments are available for + a `run` command from whichever container engine you are using. Providing any arguments the `run` + command does not support (or invalid values for arguments) will cause the daemon to fail to start. + +.. note:: + + For arguments passed to the process running inside the container rather than the for + the container runtime itself, see :ref:`cephadm-extra-entrypoint-args` + + +Cephadm supports providing extra miscellaneous container arguments for +specific cases when they may be necessary. For example, if a user needed +to limit the amount of cpus their mon daemons make use of they could apply +a spec like + +.. code-block:: yaml + + service_type: mon + service_name: mon + placement: + hosts: + - host1 + - host2 + - host3 + extra_container_args: + - "--cpus=2" + +which would cause each mon daemon to be deployed with `--cpus=2`. + +There are two ways to express arguments in the ``extra_container_args`` list. +To start, an item in the list can be a string. When passing an argument +as a string and the string contains spaces, Cephadm will automatically split it +into multiple arguments. For example, ``--cpus 2`` would become ``["--cpus", +"2"]`` when processed. Example: + +.. code-block:: yaml + + service_type: mon + service_name: mon + placement: + hosts: + - host1 + - host2 + - host3 + extra_container_args: + - "--cpus 2" + +As an alternative, an item in the list can be an object (mapping) containing +the required key "argument" and an optional key "split". The value associated +with the ``argument`` key must be a single string. The value associated with +the ``split`` key is a boolean value. The ``split`` key explicitly controls if +spaces in the argument value cause the value to be split into multiple +arguments. If ``split`` is true then Cephadm will automatically split the value +into multiple arguments. If ``split`` is false then spaces in the value will +be retained in the argument. The default, when ``split`` is not provided, is +false. Examples: + +.. code-block:: yaml + + service_type: mon + service_name: mon + placement: + hosts: + - tiebreaker + extra_container_args: + # No spaces, always treated as a single argument + - argument: "--timout=3000" + # Splitting explicitly disabled, one single argument + - argument: "--annotation=com.example.name=my favorite mon" + split: false + # Splitting explicitly enabled, will become two arguments + - argument: "--cpuset-cpus 1-3,7-11" + split: true + # Splitting implicitly disabled, one single argument + - argument: "--annotation=com.example.note=a simple example" + +Mounting Files with Extra Container Arguments +--------------------------------------------- + +A common use case for extra container arguments is to mount additional +files within the container. Older versions of Ceph did not support spaces +in arguments and therefore the examples below apply to the widest range +of Ceph versions. + +.. code-block:: yaml + + extra_container_args: + - "-v" + - "/absolute/file/path/on/host:/absolute/file/path/in/container" + +For example: + +.. code-block:: yaml + + extra_container_args: + - "-v" + - "/opt/ceph_cert/host.cert:/etc/grafana/certs/cert_file:ro" + +.. _cephadm-extra-entrypoint-args: + +Extra Entrypoint Arguments +========================== + +.. note:: + + For arguments intended for the container runtime rather than the process inside + it, see :ref:`cephadm-extra-container-args` + +Similar to extra container args for the container runtime, Cephadm supports +appending to args passed to the entrypoint process running +within a container. For example, to set the collector textfile directory for +the node-exporter service , one could apply a service spec like + +.. code-block:: yaml + + service_type: node-exporter + service_name: node-exporter + placement: + host_pattern: '*' + extra_entrypoint_args: + - "--collector.textfile.directory=/var/lib/node_exporter/textfile_collector2" + +There are two ways to express arguments in the ``extra_entrypoint_args`` list. +To start, an item in the list can be a string. When passing an argument as a +string and the string contains spaces, cephadm will automatically split it into +multiple arguments. For example, ``--debug_ms 10`` would become +``["--debug_ms", "10"]`` when processed. Example: + +.. code-block:: yaml + + service_type: mon + service_name: mon + placement: + hosts: + - host1 + - host2 + - host3 + extra_entrypoint_args: + - "--debug_ms 2" + +As an alternative, an item in the list can be an object (mapping) containing +the required key "argument" and an optional key "split". The value associated +with the ``argument`` key must be a single string. The value associated with +the ``split`` key is a boolean value. The ``split`` key explicitly controls if +spaces in the argument value cause the value to be split into multiple +arguments. If ``split`` is true then cephadm will automatically split the value +into multiple arguments. If ``split`` is false then spaces in the value will +be retained in the argument. The default, when ``split`` is not provided, is +false. Examples: + +.. code-block:: yaml + + # An theoretical data migration service + service_type: pretend + service_name: imagine1 + placement: + hosts: + - host1 + extra_entrypoint_args: + # No spaces, always treated as a single argument + - argument: "--timout=30m" + # Splitting explicitly disabled, one single argument + - argument: "--import=/mnt/usb/My Documents" + split: false + # Splitting explicitly enabled, will become two arguments + - argument: "--tag documents" + split: true + # Splitting implicitly disabled, one single argument + - argument: "--title=Imported Documents" + + +Custom Config Files +=================== + +Cephadm supports specifying miscellaneous config files for daemons. +To do so, users must provide both the content of the config file and the +location within the daemon's container at which it should be mounted. After +applying a YAML spec with custom config files specified and having cephadm +redeploy the daemons for which the config files are specified, these files will +be mounted within the daemon's container at the specified location. + +Example service spec: + +.. code-block:: yaml + + service_type: grafana + service_name: grafana + custom_configs: + - mount_path: /etc/example.conf + content: | + setting1 = value1 + setting2 = value2 + - mount_path: /usr/share/grafana/example.cert + content: | + -----BEGIN PRIVATE KEY----- + V2VyIGRhcyBsaWVzdCBpc3QgZG9vZi4gTG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFt + ZXQsIGNvbnNldGV0dXIgc2FkaXBzY2luZyBlbGl0ciwgc2VkIGRpYW0gbm9udW15 + IGVpcm1vZCB0ZW1wb3IgaW52aWR1bnQgdXQgbGFib3JlIGV0IGRvbG9yZSBtYWdu + YSBhbGlxdXlhbSBlcmF0LCBzZWQgZGlhbSB2b2x1cHR1YS4gQXQgdmVybyBlb3Mg + ZXQgYWNjdXNhbSBldCBqdXN0byBkdW8= + -----END PRIVATE KEY----- + -----BEGIN CERTIFICATE----- + V2VyIGRhcyBsaWVzdCBpc3QgZG9vZi4gTG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFt + ZXQsIGNvbnNldGV0dXIgc2FkaXBzY2luZyBlbGl0ciwgc2VkIGRpYW0gbm9udW15 + IGVpcm1vZCB0ZW1wb3IgaW52aWR1bnQgdXQgbGFib3JlIGV0IGRvbG9yZSBtYWdu + YSBhbGlxdXlhbSBlcmF0LCBzZWQgZGlhbSB2b2x1cHR1YS4gQXQgdmVybyBlb3Mg + ZXQgYWNjdXNhbSBldCBqdXN0byBkdW8= + -----END CERTIFICATE----- + +To make these new config files actually get mounted within the +containers for the daemons + +.. prompt:: bash + + ceph orch redeploy <service-name> + +For example: + +.. prompt:: bash + + ceph orch redeploy grafana + +.. _orch-rm: + +Removing a Service +================== + +In order to remove a service including the removal +of all daemons of that service, run + +.. prompt:: bash + + ceph orch rm <service-name> + +For example: + +.. prompt:: bash + + ceph orch rm rgw.myrgw + +.. _cephadm-spec-unmanaged: + +Disabling automatic deployment of daemons +========================================= + +Cephadm supports disabling the automated deployment and removal of daemons on a +per service basis. The CLI supports two commands for this. + +In order to fully remove a service, see :ref:`orch-rm`. + +Disabling automatic management of daemons +----------------------------------------- + +To disable the automatic management of dameons, set ``unmanaged=True`` in the +:ref:`orchestrator-cli-service-spec` (``mgr.yaml``). + +``mgr.yaml``: + +.. code-block:: yaml + + service_type: mgr + unmanaged: true + placement: + label: mgr + + +.. prompt:: bash # + + ceph orch apply -i mgr.yaml + +Cephadm also supports setting the unmanaged parameter to true or false +using the ``ceph orch set-unmanaged`` and ``ceph orch set-managed`` commands. +The commands take the service name (as reported in ``ceph orch ls``) as +the only argument. For example, + +.. prompt:: bash # + + ceph orch set-unmanaged mon + +would set ``unmanaged: true`` for the mon service and + +.. prompt:: bash # + + ceph orch set-managed mon + +would set ``unmanaged: false`` for the mon service + +.. note:: + + After you apply this change in the Service Specification, cephadm will no + longer deploy any new daemons (even if the placement specification matches + additional hosts). + +.. note:: + + The "osd" service used to track OSDs that are not tied to any specific + service spec is special and will always be marked unmanaged. Attempting + to modify it with ``ceph orch set-unmanaged`` or ``ceph orch set-managed`` + will result in a message ``No service of name osd found. Check "ceph orch ls" for all known services`` + +Deploying a daemon on a host manually +------------------------------------- + +.. note:: + + This workflow has a very limited use case and should only be used + in rare circumstances. + +To manually deploy a daemon on a host, follow these steps: + +Modify the service spec for a service by getting the +existing spec, adding ``unmanaged: true``, and applying the modified spec. + +Then manually deploy the daemon using the following: + + .. prompt:: bash # + + ceph orch daemon add <daemon-type> --placement=<placement spec> + +For example : + + .. prompt:: bash # + + ceph orch daemon add mgr --placement=my_host + +.. note:: + + Removing ``unmanaged: true`` from the service spec will + enable the reconciliation loop for this service and will + potentially lead to the removal of the daemon, depending + on the placement spec. + +Removing a daemon from a host manually +-------------------------------------- + +To manually remove a daemon, run a command of the following form: + + .. prompt:: bash # + + ceph orch daemon rm <daemon name>... [--force] + +For example: + + .. prompt:: bash # + + ceph orch daemon rm mgr.my_host.xyzxyz + +.. note:: + + For managed services (``unmanaged=False``), cephadm will automatically + deploy a new daemon a few seconds later. + +See also +-------- + +* See :ref:`cephadm-osd-declarative` for special handling of unmanaged OSDs. +* See also :ref:`cephadm-pause` diff --git a/doc/cephadm/services/iscsi.rst b/doc/cephadm/services/iscsi.rst new file mode 100644 index 000000000..bb2de7921 --- /dev/null +++ b/doc/cephadm/services/iscsi.rst @@ -0,0 +1,87 @@ +============= +iSCSI Service +============= + +.. _cephadm-iscsi: + +Deploying iSCSI +=============== + +To deploy an iSCSI gateway, create a yaml file containing a +service specification for iscsi: + +.. code-block:: yaml + + service_type: iscsi + service_id: iscsi + placement: + hosts: + - host1 + - host2 + spec: + pool: mypool # RADOS pool where ceph-iscsi config data is stored. + trusted_ip_list: "IP_ADDRESS_1,IP_ADDRESS_2" + api_port: ... # optional + api_user: ... # optional + api_password: ... # optional + api_secure: true/false # optional + ssl_cert: | # optional + ... + ssl_key: | # optional + ... + +For example: + +.. code-block:: yaml + + service_type: iscsi + service_id: iscsi + placement: + hosts: + - [...] + spec: + pool: iscsi_pool + trusted_ip_list: "IP_ADDRESS_1,IP_ADDRESS_2,IP_ADDRESS_3,..." + api_user: API_USERNAME + api_password: API_PASSWORD + ssl_cert: | + -----BEGIN CERTIFICATE----- + MIIDtTCCAp2gAwIBAgIYMC4xNzc1NDQxNjEzMzc2MjMyXzxvQ7EcMA0GCSqGSIb3 + DQEBCwUAMG0xCzAJBgNVBAYTAlVTMQ0wCwYDVQQIDARVdGFoMRcwFQYDVQQHDA5T + [...] + -----END CERTIFICATE----- + ssl_key: | + -----BEGIN PRIVATE KEY----- + MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQC5jdYbjtNTAKW4 + /CwQr/7wOiLGzVxChn3mmCIF3DwbL/qvTFTX2d8bDf6LjGwLYloXHscRfxszX/4h + [...] + -----END PRIVATE KEY----- + +.. py:currentmodule:: ceph.deployment.service_spec + +.. autoclass:: IscsiServiceSpec + :members: + + +The specification can then be applied using: + +.. prompt:: bash # + + ceph orch apply -i iscsi.yaml + + +See :ref:`orchestrator-cli-placement-spec` for details of the placement specification. + +See also: :ref:`orchestrator-cli-service-spec`. + +Configuring iSCSI client +======================== + +The containerized iscsi service can be used from any host by +:ref:`configuring-the-iscsi-initiators`, which will use TCP/IP to send SCSI +commands to the iSCSI target (gateway). + +Further Reading +=============== + +* Ceph iSCSI Overview: :ref:`ceph-iscsi` diff --git a/doc/cephadm/services/mds.rst b/doc/cephadm/services/mds.rst new file mode 100644 index 000000000..96b7c2dda --- /dev/null +++ b/doc/cephadm/services/mds.rst @@ -0,0 +1,61 @@ +=========== +MDS Service +=========== + + +.. _orchestrator-cli-cephfs: + +Deploy CephFS +============= + +One or more MDS daemons is required to use the :term:`CephFS` file system. +These are created automatically if the newer ``ceph fs volume`` +interface is used to create a new file system. For more information, +see :ref:`fs-volumes-and-subvolumes`. + +For example: + +.. prompt:: bash # + + ceph fs volume create <fs_name> --placement="<placement spec>" + +where ``fs_name`` is the name of the CephFS and ``placement`` is a +:ref:`orchestrator-cli-placement-spec`. For example, to place +MDS daemons for the new ``foo`` volume on hosts labeled with ``mds``: + +.. prompt:: bash # + + ceph fs volume create foo --placement="label:mds" + +You can also update the placement after-the-fact via: + +.. prompt:: bash # + + ceph orch apply mds foo 'mds-[012]' + +For manually deploying MDS daemons, use this specification: + +.. code-block:: yaml + + service_type: mds + service_id: fs_name + placement: + count: 3 + label: mds + + +The specification can then be applied using: + +.. prompt:: bash # + + ceph orch apply -i mds.yaml + +See :ref:`orchestrator-cli-stateless-services` for manually deploying +MDS daemons on the CLI. + +Further Reading +=============== + +* :ref:`ceph-file-system` + + diff --git a/doc/cephadm/services/mgr.rst b/doc/cephadm/services/mgr.rst new file mode 100644 index 000000000..9baff3a7a --- /dev/null +++ b/doc/cephadm/services/mgr.rst @@ -0,0 +1,43 @@ +.. _mgr-cephadm-mgr: + +=========== +MGR Service +=========== + +The cephadm MGR service hosts multiple modules. These include the +:ref:`mgr-dashboard` and the cephadm manager module. + +.. _cephadm-mgr-networks: + +Specifying Networks +------------------- + +The MGR service supports binding only to a specific IP within a network. + +example spec file (leveraging a default placement): + +.. code-block:: yaml + + service_type: mgr + networks: + - 192.169.142.0/24 + +.. _cephadm_mgr_co_location: + +Allow co-location of MGR daemons +================================ + +In deployment scenarios with just a single host, cephadm still needs +to deploy at least two MGR daemons in order to allow an automated +upgrade of the cluster. See ``mgr_standby_modules`` in +the :ref:`mgr-administrator-guide` for further details. + +See also: :ref:`cephadm_co_location`. + + +Further Reading +=============== + +* :ref:`ceph-manager-daemon` +* :ref:`cephadm-manually-deploy-mgr` + diff --git a/doc/cephadm/services/mon.rst b/doc/cephadm/services/mon.rst new file mode 100644 index 000000000..389dc450e --- /dev/null +++ b/doc/cephadm/services/mon.rst @@ -0,0 +1,237 @@ +=========== +MON Service +=========== + +.. _deploy_additional_monitors: + +Deploying additional monitors +============================= + +A typical Ceph cluster has three or five monitor daemons that are spread +across different hosts. We recommend deploying five monitors if there are +five or more nodes in your cluster. + +.. _CIDR: https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation + +Ceph deploys monitor daemons automatically as the cluster grows and Ceph +scales back monitor daemons automatically as the cluster shrinks. The +smooth execution of this automatic growing and shrinking depends upon +proper subnet configuration. + +The cephadm bootstrap procedure assigns the first monitor daemon in the +cluster to a particular subnet. ``cephadm`` designates that subnet as the +default subnet of the cluster. New monitor daemons will be assigned by +default to that subnet unless cephadm is instructed to do otherwise. + +If all of the ceph monitor daemons in your cluster are in the same subnet, +manual administration of the ceph monitor daemons is not necessary. +``cephadm`` will automatically add up to five monitors to the subnet, as +needed, as new hosts are added to the cluster. + +By default, cephadm will deploy 5 daemons on arbitrary hosts. See +:ref:`orchestrator-cli-placement-spec` for details of specifying +the placement of daemons. + +Designating a Particular Subnet for Monitors +-------------------------------------------- + +To designate a particular IP subnet for use by ceph monitor daemons, use a +command of the following form, including the subnet's address in `CIDR`_ +format (e.g., ``10.1.2.0/24``): + + .. prompt:: bash # + + ceph config set mon public_network *<mon-cidr-network>* + + For example: + + .. prompt:: bash # + + ceph config set mon public_network 10.1.2.0/24 + +Cephadm deploys new monitor daemons only on hosts that have IP addresses in +the designated subnet. + +You can also specify two public networks by using a list of networks: + + .. prompt:: bash # + + ceph config set mon public_network *<mon-cidr-network1>,<mon-cidr-network2>* + + For example: + + .. prompt:: bash # + + ceph config set mon public_network 10.1.2.0/24,192.168.0.1/24 + + +Deploying Monitors on a Particular Network +------------------------------------------ + +You can explicitly specify the IP address or CIDR network for each monitor and +control where each monitor is placed. To disable automated monitor deployment, +run this command: + + .. prompt:: bash # + + ceph orch apply mon --unmanaged + + To deploy each additional monitor: + + .. prompt:: bash # + + ceph orch daemon add mon *<host1:ip-or-network1> + + For example, to deploy a second monitor on ``newhost1`` using an IP + address ``10.1.2.123`` and a third monitor on ``newhost2`` in + network ``10.1.2.0/24``, run the following commands: + + .. prompt:: bash # + + ceph orch apply mon --unmanaged + ceph orch daemon add mon newhost1:10.1.2.123 + ceph orch daemon add mon newhost2:10.1.2.0/24 + + Now, enable automatic placement of Daemons + + .. prompt:: bash # + + ceph orch apply mon --placement="newhost1,newhost2,newhost3" --dry-run + + See :ref:`orchestrator-cli-placement-spec` for details of specifying + the placement of daemons. + + Finally apply this new placement by dropping ``--dry-run`` + + .. prompt:: bash # + + ceph orch apply mon --placement="newhost1,newhost2,newhost3" + + +Moving Monitors to a Different Network +-------------------------------------- + +To move Monitors to a new network, deploy new monitors on the new network and +subsequently remove monitors from the old network. It is not advised to +modify and inject the ``monmap`` manually. + +First, disable the automated placement of daemons: + + .. prompt:: bash # + + ceph orch apply mon --unmanaged + +To deploy each additional monitor: + + .. prompt:: bash # + + ceph orch daemon add mon *<newhost1:ip-or-network1>* + +For example, to deploy a second monitor on ``newhost1`` using an IP +address ``10.1.2.123`` and a third monitor on ``newhost2`` in +network ``10.1.2.0/24``, run the following commands: + + .. prompt:: bash # + + ceph orch apply mon --unmanaged + ceph orch daemon add mon newhost1:10.1.2.123 + ceph orch daemon add mon newhost2:10.1.2.0/24 + + Subsequently remove monitors from the old network: + + .. prompt:: bash # + + ceph orch daemon rm *mon.<oldhost1>* + + Update the ``public_network``: + + .. prompt:: bash # + + ceph config set mon public_network *<mon-cidr-network>* + + For example: + + .. prompt:: bash # + + ceph config set mon public_network 10.1.2.0/24 + + Now, enable automatic placement of Daemons + + .. prompt:: bash # + + ceph orch apply mon --placement="newhost1,newhost2,newhost3" --dry-run + + See :ref:`orchestrator-cli-placement-spec` for details of specifying + the placement of daemons. + + Finally apply this new placement by dropping ``--dry-run`` + + .. prompt:: bash # + + ceph orch apply mon --placement="newhost1,newhost2,newhost3" + + +Setting Crush Locations for Monitors +------------------------------------ + +Cephadm supports setting CRUSH locations for mon daemons +using the mon service spec. The CRUSH locations are set +by hostname. When cephadm deploys a mon on a host that matches +a hostname specified in the CRUSH locations, it will add +``--set-crush-location <CRUSH-location>`` where the CRUSH location +is the first entry in the list of CRUSH locations for that +host. If multiple CRUSH locations are set for one host, cephadm +will attempt to set the additional locations using the +"ceph mon set_location" command. + +.. note:: + + Setting the CRUSH location in the spec is the recommended way of + replacing tiebreaker mon daemons, as they require having a location + set when they are added. + + .. note:: + + Tiebreaker mon daemons are a part of stretch mode clusters. For more + info on stretch mode clusters see :ref:`stretch_mode` + +Example syntax for setting the CRUSH locations: + +.. code-block:: yaml + + service_type: mon + service_name: mon + placement: + count: 5 + spec: + crush_locations: + host1: + - datacenter=a + host2: + - datacenter=b + - rack=2 + host3: + - datacenter=a + +.. note:: + + Sometimes, based on the timing of mon daemons being admitted to the mon + quorum, cephadm may fail to set the CRUSH location for some mon daemons + when multiple locations are specified. In this case, the recommended + action is to re-apply the same mon spec to retrigger the service action. + +.. note:: + + Mon daemons will only get the ``--set-crush-location`` flag set when cephadm + actually deploys them. This means if a spec is applied that includes a CRUSH + location for a mon that is already deployed, the flag may not be set until + a redeploy command is issued for that mon daemon. + + +Further Reading +=============== + +* :ref:`rados-operations` +* :ref:`rados-troubleshooting-mon` +* :ref:`cephadm-restore-quorum` + diff --git a/doc/cephadm/services/monitoring.rst b/doc/cephadm/services/monitoring.rst new file mode 100644 index 000000000..a17a5ba03 --- /dev/null +++ b/doc/cephadm/services/monitoring.rst @@ -0,0 +1,543 @@ +.. _mgr-cephadm-monitoring: + +Monitoring Services +=================== + +Ceph Dashboard uses `Prometheus <https://prometheus.io/>`_, `Grafana +<https://grafana.com/>`_, and related tools to store and visualize detailed +metrics on cluster utilization and performance. Ceph users have three options: + +#. Have cephadm deploy and configure these services. This is the default + when bootstrapping a new cluster unless the ``--skip-monitoring-stack`` + option is used. +#. Deploy and configure these services manually. This is recommended for users + with existing prometheus services in their environment (and in cases where + Ceph is running in Kubernetes with Rook). +#. Skip the monitoring stack completely. Some Ceph dashboard graphs will + not be available. + +The monitoring stack consists of `Prometheus <https://prometheus.io/>`_, +Prometheus exporters (:ref:`mgr-prometheus`, `Node exporter +<https://prometheus.io/docs/guides/node-exporter/>`_), `Prometheus Alert +Manager <https://prometheus.io/docs/alerting/alertmanager/>`_ and `Grafana +<https://grafana.com/>`_. + +.. note:: + + Prometheus' security model presumes that untrusted users have access to the + Prometheus HTTP endpoint and logs. Untrusted users have access to all the + (meta)data Prometheus collects that is contained in the database, plus a + variety of operational and debugging information. + + However, Prometheus' HTTP API is limited to read-only operations. + Configurations can *not* be changed using the API and secrets are not + exposed. Moreover, Prometheus has some built-in measures to mitigate the + impact of denial of service attacks. + + Please see `Prometheus' Security model + <https://prometheus.io/docs/operating/security/>` for more detailed + information. + +Deploying monitoring with cephadm +--------------------------------- + +The default behavior of ``cephadm`` is to deploy a basic monitoring stack. It +is however possible that you have a Ceph cluster without a monitoring stack, +and you would like to add a monitoring stack to it. (Here are some ways that +you might have come to have a Ceph cluster without a monitoring stack: You +might have passed the ``--skip-monitoring stack`` option to ``cephadm`` during +the installation of the cluster, or you might have converted an existing +cluster (which had no monitoring stack) to cephadm management.) + +To set up monitoring on a Ceph cluster that has no monitoring, follow the +steps below: + +#. Deploy a node-exporter service on every node of the cluster. The node-exporter provides host-level metrics like CPU and memory utilization: + + .. prompt:: bash # + + ceph orch apply node-exporter + +#. Deploy alertmanager: + + .. prompt:: bash # + + ceph orch apply alertmanager + +#. Deploy Prometheus. A single Prometheus instance is sufficient, but + for high availability (HA) you might want to deploy two: + + .. prompt:: bash # + + ceph orch apply prometheus + + or + + .. prompt:: bash # + + ceph orch apply prometheus --placement 'count:2' + +#. Deploy grafana: + + .. prompt:: bash # + + ceph orch apply grafana + +.. _cephadm-monitoring-centralized-logs: + +Centralized Logging in Ceph +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Ceph now provides centralized logging with Loki & Promtail. Centralized Log Management (CLM) consolidates all log data and pushes it to a central repository, +with an accessible and easy-to-use interface. Centralized logging is designed to make your life easier. +Some of the advantages are: + +#. **Linear event timeline**: it is easier to troubleshoot issues analyzing a single chain of events than thousands of different logs from a hundred nodes. +#. **Real-time live log monitoring**: it is impractical to follow logs from thousands of different sources. +#. **Flexible retention policies**: with per-daemon logs, log rotation is usually set to a short interval (1-2 weeks) to save disk usage. +#. **Increased security & backup**: logs can contain sensitive information and expose usage patterns. Additionally, centralized logging allows for HA, etc. + +Centralized Logging in Ceph is implemented using two new services - ``loki`` & ``promtail``. + +Loki: It is basically a log aggregation system and is used to query logs. It can be configured as a datasource in Grafana. + +Promtail: It acts as an agent that gathers logs from the system and makes them available to Loki. + +These two services are not deployed by default in a Ceph cluster. To enable the centralized logging you can follow the steps mentioned here :ref:`centralized-logging`. + +.. _cephadm-monitoring-networks-ports: + +Networks and Ports +~~~~~~~~~~~~~~~~~~ + +All monitoring services can have the network and port they bind to configured with a yaml service specification. By default +cephadm will use ``https`` protocol when configuring Grafana daemons unless the user explicitly sets the protocol to ``http``. + +example spec file: + +.. code-block:: yaml + + service_type: grafana + service_name: grafana + placement: + count: 1 + networks: + - 192.169.142.0/24 + spec: + port: 4200 + protocol: http + +.. _cephadm_monitoring-images: + +Using custom images +~~~~~~~~~~~~~~~~~~~ + +It is possible to install or upgrade monitoring components based on other +images. To do so, the name of the image to be used needs to be stored in the +configuration first. The following configuration options are available. + +- ``container_image_prometheus`` +- ``container_image_grafana`` +- ``container_image_alertmanager`` +- ``container_image_node_exporter`` + +Custom images can be set with the ``ceph config`` command + +.. code-block:: bash + + ceph config set mgr mgr/cephadm/<option_name> <value> + +For example + +.. code-block:: bash + + ceph config set mgr mgr/cephadm/container_image_prometheus prom/prometheus:v1.4.1 + +If there were already running monitoring stack daemon(s) of the type whose +image you've changed, you must redeploy the daemon(s) in order to have them +actually use the new image. + +For example, if you had changed the prometheus image + +.. prompt:: bash # + + ceph orch redeploy prometheus + + +.. note:: + + By setting a custom image, the default value will be overridden (but not + overwritten). The default value changes when updates become available. + By setting a custom image, you will not be able to update the component + you have set the custom image for automatically. You will need to + manually update the configuration (image name and tag) to be able to + install updates. + + If you choose to go with the recommendations instead, you can reset the + custom image you have set before. After that, the default value will be + used again. Use ``ceph config rm`` to reset the configuration option + + .. code-block:: bash + + ceph config rm mgr mgr/cephadm/<option_name> + + For example + + .. code-block:: bash + + ceph config rm mgr mgr/cephadm/container_image_prometheus + +See also :ref:`cephadm-airgap`. + +.. _cephadm-overwrite-jinja2-templates: + +Using custom configuration files +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +By overriding cephadm templates, it is possible to completely customize the +configuration files for monitoring services. + +Internally, cephadm already uses `Jinja2 +<https://jinja.palletsprojects.com/en/2.11.x/>`_ templates to generate the +configuration files for all monitoring components. Starting from version 17.2.3, +cephadm supports Prometheus http service discovery, and uses this endpoint for the +definition and management of the embedded Prometheus service. The endpoint listens on +``https://<mgr-ip>:8765/sd/`` (the port is +configurable through the variable ``service_discovery_port``) and returns scrape target +information in `http_sd_config format +<https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config/>`_ + +Customers with external monitoring stack can use `ceph-mgr` service discovery endpoint +to get scraping configuration. Root certificate of the server can be obtained by the +following command: + + .. prompt:: bash # + + ceph orch sd dump cert + +The configuration of Prometheus, Grafana, or Alertmanager may be customized by storing +a Jinja2 template for each service. This template will be evaluated every time a service +of that kind is deployed or reconfigured. That way, the custom configuration is preserved +and automatically applied on future deployments of these services. + +.. note:: + + The configuration of the custom template is also preserved when the default + configuration of cephadm changes. If the updated configuration is to be used, + the custom template needs to be migrated *manually* after each upgrade of Ceph. + +Option names +"""""""""""" + +The following templates for files that will be generated by cephadm can be +overridden. These are the names to be used when storing with ``ceph config-key +set``: + +- ``services/alertmanager/alertmanager.yml`` +- ``services/grafana/ceph-dashboard.yml`` +- ``services/grafana/grafana.ini`` +- ``services/prometheus/prometheus.yml`` +- ``services/prometheus/alerting/custom_alerts.yml`` +- ``services/loki.yml`` +- ``services/promtail.yml`` + +You can look up the file templates that are currently used by cephadm in +``src/pybind/mgr/cephadm/templates``: + +- ``services/alertmanager/alertmanager.yml.j2`` +- ``services/grafana/ceph-dashboard.yml.j2`` +- ``services/grafana/grafana.ini.j2`` +- ``services/prometheus/prometheus.yml.j2`` +- ``services/loki.yml.j2`` +- ``services/promtail.yml.j2`` + +Usage +""""" + +The following command applies a single line value: + +.. code-block:: bash + + ceph config-key set mgr/cephadm/<option_name> <value> + +To set contents of files as template use the ``-i`` argument: + +.. code-block:: bash + + ceph config-key set mgr/cephadm/<option_name> -i $PWD/<filename> + +.. note:: + + When using files as input to ``config-key`` an absolute path to the file must + be used. + + +Then the configuration file for the service needs to be recreated. +This is done using `reconfig`. For more details see the following example. + +Example +""""""" + +.. code-block:: bash + + # set the contents of ./prometheus.yml.j2 as template + ceph config-key set mgr/cephadm/services/prometheus/prometheus.yml \ + -i $PWD/prometheus.yml.j2 + + # reconfig the prometheus service + ceph orch reconfig prometheus + +.. code-block:: bash + + # set additional custom alerting rules for Prometheus + ceph config-key set mgr/cephadm/services/prometheus/alerting/custom_alerts.yml \ + -i $PWD/custom_alerts.yml + + # Note that custom alerting rules are not parsed by Jinja and hence escaping + # will not be an issue. + +Deploying monitoring without cephadm +------------------------------------ + +If you have an existing prometheus monitoring infrastructure, or would like +to manage it yourself, you need to configure it to integrate with your Ceph +cluster. + +* Enable the prometheus module in the ceph-mgr daemon + + .. code-block:: bash + + ceph mgr module enable prometheus + + By default, ceph-mgr presents prometheus metrics on port 9283 on each host + running a ceph-mgr daemon. Configure prometheus to scrape these. + +To make this integration easier, cephadm provides a service discovery endpoint at +``https://<mgr-ip>:8765/sd/``. This endpoint can be used by an external +Prometheus server to retrieve target information for a specific service. Information returned +by this endpoint uses the format specified by the Prometheus `http_sd_config option +<https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config/>`_ + +Here's an example prometheus job definition that uses the cephadm service discovery endpoint + + .. code-block:: bash + + - job_name: 'ceph-exporter' + http_sd_configs: + - url: http://<mgr-ip>:8765/sd/prometheus/sd-config?service=ceph-exporter + + +* To enable the dashboard's prometheus-based alerting, see :ref:`dashboard-alerting`. + +* To enable dashboard integration with Grafana, see :ref:`dashboard-grafana`. + +Disabling monitoring +-------------------- + +To disable monitoring and remove the software that supports it, run the following commands: + +.. code-block:: console + + $ ceph orch rm grafana + $ ceph orch rm prometheus --force # this will delete metrics data collected so far + $ ceph orch rm node-exporter + $ ceph orch rm alertmanager + $ ceph mgr module disable prometheus + +See also :ref:`orch-rm`. + +Setting up RBD-Image monitoring +------------------------------- + +Due to performance reasons, monitoring of RBD images is disabled by default. For more information please see +:ref:`prometheus-rbd-io-statistics`. If disabled, the overview and details dashboards will stay empty in Grafana +and the metrics will not be visible in Prometheus. + +Setting up Prometheus +----------------------- + +Setting Prometheus Retention Size and Time +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Cephadm can configure Prometheus TSDB retention by specifying ``retention_time`` +and ``retention_size`` values in the Prometheus service spec. +The retention time value defaults to 15 days (15d). Users can set a different value/unit where +supported units are: 'y', 'w', 'd', 'h', 'm' and 's'. The retention size value defaults +to 0 (disabled). Supported units in this case are: 'B', 'KB', 'MB', 'GB', 'TB', 'PB' and 'EB'. + +In the following example spec we set the retention time to 1 year and the size to 1GB. + +.. code-block:: yaml + + service_type: prometheus + placement: + count: 1 + spec: + retention_time: "1y" + retention_size: "1GB" + +.. note:: + + If you already had Prometheus daemon(s) deployed before and are updating an + existent spec as opposed to doing a fresh Prometheus deployment, you must also + tell cephadm to redeploy the Prometheus daemon(s) to put this change into effect. + This can be done with a ``ceph orch redeploy prometheus`` command. + +Setting up Grafana +------------------ + +Manually setting the Grafana URL +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Cephadm automatically configures Prometheus, Grafana, and Alertmanager in +all cases except one. + +In a some setups, the Dashboard user's browser might not be able to access the +Grafana URL that is configured in Ceph Dashboard. This can happen when the +cluster and the accessing user are in different DNS zones. + +If this is the case, you can use a configuration option for Ceph Dashboard +to set the URL that the user's browser will use to access Grafana. This +value will never be altered by cephadm. To set this configuration option, +issue the following command: + + .. prompt:: bash $ + + ceph dashboard set-grafana-frontend-api-url <grafana-server-api> + +It might take a minute or two for services to be deployed. After the +services have been deployed, you should see something like this when you issue the command ``ceph orch ls``: + +.. code-block:: console + + $ ceph orch ls + NAME RUNNING REFRESHED IMAGE NAME IMAGE ID SPEC + alertmanager 1/1 6s ago docker.io/prom/alertmanager:latest 0881eb8f169f present + crash 2/2 6s ago docker.io/ceph/daemon-base:latest-master-devel mix present + grafana 1/1 0s ago docker.io/pcuzner/ceph-grafana-el8:latest f77afcf0bcf6 absent + node-exporter 2/2 6s ago docker.io/prom/node-exporter:latest e5a616e4b9cf present + prometheus 1/1 6s ago docker.io/prom/prometheus:latest e935122ab143 present + +Configuring SSL/TLS for Grafana +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +``cephadm`` deploys Grafana using the certificate defined in the ceph +key/value store. If no certificate is specified, ``cephadm`` generates a +self-signed certificate during the deployment of the Grafana service. Each +certificate is specific for the host it was generated on. + +A custom certificate can be configured using the following commands: + +.. prompt:: bash # + + ceph config-key set mgr/cephadm/{hostname}/grafana_key -i $PWD/key.pem + ceph config-key set mgr/cephadm/{hostname}/grafana_crt -i $PWD/certificate.pem + +Where `hostname` is the hostname for the host where grafana service is deployed. + +If you have already deployed Grafana, run ``reconfig`` on the service to +update its configuration: + +.. prompt:: bash # + + ceph orch reconfig grafana + +The ``reconfig`` command also sets the proper URL for Ceph Dashboard. + +Setting the initial admin password +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +By default, Grafana will not create an initial +admin user. In order to create the admin user, please create a file +``grafana.yaml`` with this content: + +.. code-block:: yaml + + service_type: grafana + spec: + initial_admin_password: mypassword + +Then apply this specification: + +.. code-block:: bash + + ceph orch apply -i grafana.yaml + ceph orch redeploy grafana + +Grafana will now create an admin user called ``admin`` with the +given password. + +Turning off anonymous access +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +By default, cephadm allows anonymous users (users who have not provided any +login information) limited, viewer only access to the grafana dashboard. In +order to set up grafana to only allow viewing from logged in users, you can +set ``anonymous_access: False`` in your grafana spec. + +.. code-block:: yaml + + service_type: grafana + placement: + hosts: + - host1 + spec: + anonymous_access: False + initial_admin_password: "mypassword" + +Since deploying grafana with anonymous access set to false without an initial +admin password set would make the dashboard inaccessible, cephadm requires +setting the ``initial_admin_password`` when ``anonymous_access`` is set to false. + + +Setting up Alertmanager +----------------------- + +Adding Alertmanager webhooks +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To add new webhooks to the Alertmanager configuration, add additional +webhook urls like so: + +.. code-block:: yaml + + service_type: alertmanager + spec: + user_data: + default_webhook_urls: + - "https://foo" + - "https://bar" + +Where ``default_webhook_urls`` is a list of additional URLs that are +added to the default receivers' ``<webhook_configs>`` configuration. + +Run ``reconfig`` on the service to update its configuration: + +.. prompt:: bash # + + ceph orch reconfig alertmanager + +Turn on Certificate Validation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If you are using certificates for alertmanager and want to make sure +these certs are verified, you should set the "secure" option to +true in your alertmanager spec (this defaults to false). + +.. code-block:: yaml + + service_type: alertmanager + spec: + secure: true + +If you already had alertmanager daemons running before applying the spec +you must reconfigure them to update their configuration + +.. prompt:: bash # + + ceph orch reconfig alertmanager + +Further Reading +--------------- + +* :ref:`mgr-prometheus` diff --git a/doc/cephadm/services/nfs.rst b/doc/cephadm/services/nfs.rst new file mode 100644 index 000000000..2f12c5916 --- /dev/null +++ b/doc/cephadm/services/nfs.rst @@ -0,0 +1,215 @@ +.. _deploy-cephadm-nfs-ganesha: + +=========== +NFS Service +=========== + +.. note:: Only the NFSv4 protocol is supported. + +The simplest way to manage NFS is via the ``ceph nfs cluster ...`` +commands; see :ref:`mgr-nfs`. This document covers how to manage the +cephadm services directly, which should only be necessary for unusual NFS +configurations. + +Deploying NFS ganesha +===================== + +Cephadm deploys NFS Ganesha daemon (or set of daemons). The configuration for +NFS is stored in the ``nfs-ganesha`` pool and exports are managed via the +``ceph nfs export ...`` commands and via the dashboard. + +To deploy a NFS Ganesha gateway, run the following command: + +.. prompt:: bash # + + ceph orch apply nfs *<svc_id>* [--port *<port>*] [--placement ...] + +For example, to deploy NFS with a service id of *foo* on the default +port 2049 with the default placement of a single daemon: + +.. prompt:: bash # + + ceph orch apply nfs foo + +See :ref:`orchestrator-cli-placement-spec` for the details of the placement +specification. + +Service Specification +===================== + +Alternatively, an NFS service can be applied using a YAML specification. + +.. code-block:: yaml + + service_type: nfs + service_id: mynfs + placement: + hosts: + - host1 + - host2 + spec: + port: 12345 + +In this example, we run the server on the non-default ``port`` of +12345 (instead of the default 2049) on ``host1`` and ``host2``. + +The specification can then be applied by running the following command: + +.. prompt:: bash # + + ceph orch apply -i nfs.yaml + +.. _cephadm-ha-nfs: + +High-availability NFS +===================== + +Deploying an *ingress* service for an existing *nfs* service will provide: + +* a stable, virtual IP that can be used to access the NFS server +* fail-over between hosts if there is a host failure +* load distribution across multiple NFS gateways (although this is rarely necessary) + +Ingress for NFS can be deployed for an existing NFS service +(``nfs.mynfs`` in this example) with the following specification: + +.. code-block:: yaml + + service_type: ingress + service_id: nfs.mynfs + placement: + count: 2 + spec: + backend_service: nfs.mynfs + frontend_port: 2049 + monitor_port: 9000 + virtual_ip: 10.0.0.123/24 + +A few notes: + + * The *virtual_ip* must include a CIDR prefix length, as in the + example above. The virtual IP will normally be configured on the + first identified network interface that has an existing IP in the + same subnet. You can also specify a *virtual_interface_networks* + property to match against IPs in other networks; see + :ref:`ingress-virtual-ip` for more information. + * The *monitor_port* is used to access the haproxy load status + page. The user is ``admin`` by default, but can be modified by + via an *admin* property in the spec. If a password is not + specified via a *password* property in the spec, the auto-generated password + can be found with: + + .. prompt:: bash # + + ceph config-key get mgr/cephadm/ingress.*{svc_id}*/monitor_password + + For example: + + .. prompt:: bash # + + ceph config-key get mgr/cephadm/ingress.nfs.myfoo/monitor_password + + * The backend service (``nfs.mynfs`` in this example) should include + a *port* property that is not 2049 to avoid conflicting with the + ingress service, which could be placed on the same host(s). + +NFS with virtual IP but no haproxy +---------------------------------- + +Cephadm also supports deploying nfs with keepalived but not haproxy. This +offers a virtual ip supported by keepalived that the nfs daemon can directly bind +to instead of having traffic go through haproxy. + +In this setup, you'll either want to set up the service using the nfs module +(see :ref:`nfs-module-cluster-create`) or place the ingress service first, so +the virtual IP is present for the nfs daemon to bind to. The ingress service +should include the attribute ``keepalive_only`` set to true. For example + +.. code-block:: yaml + + service_type: ingress + service_id: nfs.foo + placement: + count: 1 + hosts: + - host1 + - host2 + - host3 + spec: + backend_service: nfs.foo + monitor_port: 9049 + virtual_ip: 192.168.122.100/24 + keepalive_only: true + +Then, an nfs service could be created that specifies a ``virtual_ip`` attribute +that will tell it to bind to that specific IP. + +.. code-block:: yaml + + service_type: nfs + service_id: foo + placement: + count: 1 + hosts: + - host1 + - host2 + - host3 + spec: + port: 2049 + virtual_ip: 192.168.122.100 + +Note that in these setups, one should make sure to include ``count: 1`` in the +nfs placement, as it's only possible for one nfs daemon to bind to the virtual IP. + +NFS with HAProxy Protocol Support +---------------------------------- + +Cephadm supports deploying NFS in High-Availability mode with additional +HAProxy protocol support. This works just like High-availability NFS but also +supports client IP level configuration on NFS Exports. This feature requires +`NFS-Ganesha v5.0`_ or later. + +.. _NFS-Ganesha v5.0: https://github.com/nfs-ganesha/nfs-ganesha/wiki/ReleaseNotes_5 + +To use this mode, you'll either want to set up the service using the nfs module +(see :ref:`nfs-module-cluster-create`) or manually create services with the +extra parameter ``enable_haproxy_protocol`` set to true. Both NFS Service and +Ingress service must have ``enable_haproxy_protocol`` set to the same value. +For example: + +.. code-block:: yaml + + service_type: ingress + service_id: nfs.foo + placement: + count: 1 + hosts: + - host1 + - host2 + - host3 + spec: + backend_service: nfs.foo + monitor_port: 9049 + virtual_ip: 192.168.122.100/24 + enable_haproxy_protocol: true + +.. code-block:: yaml + + service_type: nfs + service_id: foo + placement: + count: 1 + hosts: + - host1 + - host2 + - host3 + spec: + port: 2049 + enable_haproxy_protocol: true + + +Further Reading +=============== + +* CephFS: :ref:`cephfs-nfs` +* MGR: :ref:`mgr-nfs` diff --git a/doc/cephadm/services/osd.rst b/doc/cephadm/services/osd.rst new file mode 100644 index 000000000..f62b0f831 --- /dev/null +++ b/doc/cephadm/services/osd.rst @@ -0,0 +1,997 @@ +*********** +OSD Service +*********** +.. _device management: ../rados/operations/devices +.. _libstoragemgmt: https://github.com/libstorage/libstoragemgmt + +List Devices +============ + +``ceph-volume`` scans each host in the cluster from time to time in order +to determine which devices are present and whether they are eligible to be +used as OSDs. + +To print a list of devices discovered by ``cephadm``, run this command: + +.. prompt:: bash # + + ceph orch device ls [--hostname=...] [--wide] [--refresh] + +Example:: + + Hostname Path Type Serial Size Health Ident Fault Available + srv-01 /dev/sdb hdd 15P0A0YFFRD6 300G Unknown N/A N/A No + srv-01 /dev/sdc hdd 15R0A08WFRD6 300G Unknown N/A N/A No + srv-01 /dev/sdd hdd 15R0A07DFRD6 300G Unknown N/A N/A No + srv-01 /dev/sde hdd 15P0A0QDFRD6 300G Unknown N/A N/A No + srv-02 /dev/sdb hdd 15R0A033FRD6 300G Unknown N/A N/A No + srv-02 /dev/sdc hdd 15R0A05XFRD6 300G Unknown N/A N/A No + srv-02 /dev/sde hdd 15R0A0ANFRD6 300G Unknown N/A N/A No + srv-02 /dev/sdf hdd 15R0A06EFRD6 300G Unknown N/A N/A No + srv-03 /dev/sdb hdd 15R0A0OGFRD6 300G Unknown N/A N/A No + srv-03 /dev/sdc hdd 15R0A0P7FRD6 300G Unknown N/A N/A No + srv-03 /dev/sdd hdd 15R0A0O7FRD6 300G Unknown N/A N/A No + +Using the ``--wide`` option provides all details relating to the device, +including any reasons that the device might not be eligible for use as an OSD. + +In the above example you can see fields named "Health", "Ident", and "Fault". +This information is provided by integration with `libstoragemgmt`_. By default, +this integration is disabled (because `libstoragemgmt`_ may not be 100% +compatible with your hardware). To make ``cephadm`` include these fields, +enable cephadm's "enhanced device scan" option as follows; + +.. prompt:: bash # + + ceph config set mgr mgr/cephadm/device_enhanced_scan true + +.. warning:: + Although the libstoragemgmt library performs standard SCSI inquiry calls, + there is no guarantee that your firmware fully implements these standards. + This can lead to erratic behaviour and even bus resets on some older + hardware. It is therefore recommended that, before enabling this feature, + you test your hardware's compatibility with libstoragemgmt first to avoid + unplanned interruptions to services. + + There are a number of ways to test compatibility, but the simplest may be + to use the cephadm shell to call libstoragemgmt directly - ``cephadm shell + lsmcli ldl``. If your hardware is supported you should see something like + this: + + :: + + Path | SCSI VPD 0x83 | Link Type | Serial Number | Health Status + ---------------------------------------------------------------------------- + /dev/sda | 50000396082ba631 | SAS | 15P0A0R0FRD6 | Good + /dev/sdb | 50000396082bbbf9 | SAS | 15P0A0YFFRD6 | Good + + +After you have enabled libstoragemgmt support, the output will look something +like this: + +:: + + # ceph orch device ls + Hostname Path Type Serial Size Health Ident Fault Available + srv-01 /dev/sdb hdd 15P0A0YFFRD6 300G Good Off Off No + srv-01 /dev/sdc hdd 15R0A08WFRD6 300G Good Off Off No + : + +In this example, libstoragemgmt has confirmed the health of the drives and the ability to +interact with the Identification and Fault LEDs on the drive enclosures. For further +information about interacting with these LEDs, refer to `device management`_. + +.. note:: + The current release of `libstoragemgmt`_ (1.8.8) supports SCSI, SAS, and SATA based + local disks only. There is no official support for NVMe devices (PCIe) + +.. _cephadm-deploy-osds: + +Deploy OSDs +=========== + +Listing Storage Devices +----------------------- + +In order to deploy an OSD, there must be a storage device that is *available* on +which the OSD will be deployed. + +Run this command to display an inventory of storage devices on all cluster hosts: + +.. prompt:: bash # + + ceph orch device ls + +A storage device is considered *available* if all of the following +conditions are met: + +* The device must have no partitions. +* The device must not have any LVM state. +* The device must not be mounted. +* The device must not contain a file system. +* The device must not contain a Ceph BlueStore OSD. +* The device must be larger than 5 GB. + +Ceph will not provision an OSD on a device that is not available. + +Creating New OSDs +----------------- + +There are a few ways to create new OSDs: + +* Tell Ceph to consume any available and unused storage device: + + .. prompt:: bash # + + ceph orch apply osd --all-available-devices + +* Create an OSD from a specific device on a specific host: + + .. prompt:: bash # + + ceph orch daemon add osd *<host>*:*<device-path>* + + For example: + + .. prompt:: bash # + + ceph orch daemon add osd host1:/dev/sdb + + Advanced OSD creation from specific devices on a specific host: + + .. prompt:: bash # + + ceph orch daemon add osd host1:data_devices=/dev/sda,/dev/sdb,db_devices=/dev/sdc,osds_per_device=2 + +* Create an OSD on a specific LVM logical volume on a specific host: + + .. prompt:: bash # + + ceph orch daemon add osd *<host>*:*<lvm-path>* + + For example: + + .. prompt:: bash # + + ceph orch daemon add osd host1:/dev/vg_osd/lvm_osd1701 + +* You can use :ref:`drivegroups` to categorize device(s) based on their + properties. This might be useful in forming a clearer picture of which + devices are available to consume. Properties include device type (SSD or + HDD), device model names, size, and the hosts on which the devices exist: + + .. prompt:: bash # + + ceph orch apply -i spec.yml + +Dry Run +------- + +The ``--dry-run`` flag causes the orchestrator to present a preview of what +will happen without actually creating the OSDs. + +For example: + +.. prompt:: bash # + + ceph orch apply osd --all-available-devices --dry-run + +:: + + NAME HOST DATA DB WAL + all-available-devices node1 /dev/vdb - - + all-available-devices node2 /dev/vdc - - + all-available-devices node3 /dev/vdd - - + +.. _cephadm-osd-declarative: + +Declarative State +----------------- + +The effect of ``ceph orch apply`` is persistent. This means that drives that +are added to the system after the ``ceph orch apply`` command completes will be +automatically found and added to the cluster. It also means that drives that +become available (by zapping, for example) after the ``ceph orch apply`` +command completes will be automatically found and added to the cluster. + +We will examine the effects of the following command: + +.. prompt:: bash # + + ceph orch apply osd --all-available-devices + +After running the above command: + +* If you add new disks to the cluster, they will automatically be used to + create new OSDs. +* If you remove an OSD and clean the LVM physical volume, a new OSD will be + created automatically. + +If you want to avoid this behavior (disable automatic creation of OSD on available devices), use the ``unmanaged`` parameter: + +.. prompt:: bash # + + ceph orch apply osd --all-available-devices --unmanaged=true + +.. note:: + + Keep these three facts in mind: + + - The default behavior of ``ceph orch apply`` causes cephadm constantly to reconcile. This means that cephadm creates OSDs as soon as new drives are detected. + + - Setting ``unmanaged: True`` disables the creation of OSDs. If ``unmanaged: True`` is set, nothing will happen even if you apply a new OSD service. + + - ``ceph orch daemon add`` creates OSDs, but does not add an OSD service. + +* For cephadm, see also :ref:`cephadm-spec-unmanaged`. + +.. _cephadm-osd-removal: + +Remove an OSD +============= + +Removing an OSD from a cluster involves two steps: + +#. evacuating all placement groups (PGs) from the cluster +#. removing the PG-free OSD from the cluster + +The following command performs these two steps: + +.. prompt:: bash # + + ceph orch osd rm <osd_id(s)> [--replace] [--force] + +Example: + +.. prompt:: bash # + + ceph orch osd rm 0 + +Expected output:: + + Scheduled OSD(s) for removal + +OSDs that are not safe to destroy will be rejected. + +.. note:: + After removing OSDs, if the drives the OSDs were deployed on once again + become available, cephadm may automatically try to deploy more OSDs + on these drives if they match an existing drivegroup spec. If you deployed + the OSDs you are removing with a spec and don't want any new OSDs deployed on + the drives after removal, it's best to modify the drivegroup spec before removal. + Either set ``unmanaged: true`` to stop it from picking up new drives at all, + or modify it in some way that it no longer matches the drives used for the + OSDs you wish to remove. Then re-apply the spec. For more info on drivegroup + specs see :ref:`drivegroups`. For more info on the declarative nature of + cephadm in reference to deploying OSDs, see :ref:`cephadm-osd-declarative` + +Monitoring OSD State +-------------------- + +You can query the state of OSD operation with the following command: + +.. prompt:: bash # + + ceph orch osd rm status + +Expected output:: + + OSD_ID HOST STATE PG_COUNT REPLACE FORCE STARTED_AT + 2 cephadm-dev done, waiting for purge 0 True False 2020-07-17 13:01:43.147684 + 3 cephadm-dev draining 17 False True 2020-07-17 13:01:45.162158 + 4 cephadm-dev started 42 False True 2020-07-17 13:01:45.162158 + + +When no PGs are left on the OSD, it will be decommissioned and removed from the cluster. + +.. note:: + After removing an OSD, if you wipe the LVM physical volume in the device used by the removed OSD, a new OSD will be created. + For more information on this, read about the ``unmanaged`` parameter in :ref:`cephadm-osd-declarative`. + +Stopping OSD Removal +-------------------- + +It is possible to stop queued OSD removals by using the following command: + +.. prompt:: bash # + + ceph orch osd rm stop <osd_id(s)> + +Example: + +.. prompt:: bash # + + ceph orch osd rm stop 4 + +Expected output:: + + Stopped OSD(s) removal + +This resets the initial state of the OSD and takes it off the removal queue. + +.. _cephadm-replacing-an-osd: + +Replacing an OSD +---------------- + +.. prompt:: bash # + + ceph orch osd rm <osd_id(s)> --replace [--force] + +Example: + +.. prompt:: bash # + + ceph orch osd rm 4 --replace + +Expected output:: + + Scheduled OSD(s) for replacement + +This follows the same procedure as the procedure in the "Remove OSD" section, with +one exception: the OSD is not permanently removed from the CRUSH hierarchy, but is +instead assigned a 'destroyed' flag. + +.. note:: + The new OSD that will replace the removed OSD must be created on the same host + as the OSD that was removed. + +**Preserving the OSD ID** + +The 'destroyed' flag is used to determine which OSD ids will be reused in the +next OSD deployment. + +If you use OSDSpecs for OSD deployment, your newly added disks will be assigned +the OSD ids of their replaced counterparts. This assumes that the new disks +still match the OSDSpecs. + +Use the ``--dry-run`` flag to make certain that the ``ceph orch apply osd`` +command does what you want it to. The ``--dry-run`` flag shows you what the +outcome of the command will be without making the changes you specify. When +you are satisfied that the command will do what you want, run the command +without the ``--dry-run`` flag. + +.. tip:: + + The name of your OSDSpec can be retrieved with the command ``ceph orch ls`` + +Alternatively, you can use your OSDSpec file: + +.. prompt:: bash # + + ceph orch apply -i <osd_spec_file> --dry-run + +Expected output:: + + NAME HOST DATA DB WAL + <name_of_osd_spec> node1 /dev/vdb - - + + +When this output reflects your intention, omit the ``--dry-run`` flag to +execute the deployment. + + +Erasing Devices (Zapping Devices) +--------------------------------- + +Erase (zap) a device so that it can be reused. ``zap`` calls ``ceph-volume +zap`` on the remote host. + +.. prompt:: bash # + + ceph orch device zap <hostname> <path> + +Example command: + +.. prompt:: bash # + + ceph orch device zap my_hostname /dev/sdx + +.. note:: + If the unmanaged flag is unset, cephadm automatically deploys drives that + match the OSDSpec. For example, if you use the + ``all-available-devices`` option when creating OSDs, when you ``zap`` a + device the cephadm orchestrator automatically creates a new OSD in the + device. To disable this behavior, see :ref:`cephadm-osd-declarative`. + + +.. _osd_autotune: + +Automatically tuning OSD memory +=============================== + +OSD daemons will adjust their memory consumption based on the +``osd_memory_target`` config option (several gigabytes, by +default). If Ceph is deployed on dedicated nodes that are not sharing +memory with other services, cephadm can automatically adjust the per-OSD +memory consumption based on the total amount of RAM and the number of deployed +OSDs. + +.. warning:: Cephadm sets ``osd_memory_target_autotune`` to ``true`` by default which is unsuitable for hyperconverged infrastructures. + +Cephadm will start with a fraction +(``mgr/cephadm/autotune_memory_target_ratio``, which defaults to +``.7``) of the total RAM in the system, subtract off any memory +consumed by non-autotuned daemons (non-OSDs, for OSDs for which +``osd_memory_target_autotune`` is false), and then divide by the +remaining OSDs. + +The final targets are reflected in the config database with options like:: + + WHO MASK LEVEL OPTION VALUE + osd host:foo basic osd_memory_target 126092301926 + osd host:bar basic osd_memory_target 6442450944 + +Both the limits and the current memory consumed by each daemon are visible from +the ``ceph orch ps`` output in the ``MEM LIMIT`` column:: + + NAME HOST PORTS STATUS REFRESHED AGE MEM USED MEM LIMIT VERSION IMAGE ID CONTAINER ID + osd.1 dael running (3h) 10s ago 3h 72857k 117.4G 17.0.0-3781-gafaed750 7015fda3cd67 9e183363d39c + osd.2 dael running (81m) 10s ago 81m 63989k 117.4G 17.0.0-3781-gafaed750 7015fda3cd67 1f0cc479b051 + osd.3 dael running (62m) 10s ago 62m 64071k 117.4G 17.0.0-3781-gafaed750 7015fda3cd67 ac5537492f27 + +To exclude an OSD from memory autotuning, disable the autotune option +for that OSD and also set a specific memory target. For example, + +.. prompt:: bash # + + ceph config set osd.123 osd_memory_target_autotune false + ceph config set osd.123 osd_memory_target 16G + + +.. _drivegroups: + +Advanced OSD Service Specifications +=================================== + +:ref:`orchestrator-cli-service-spec`\s of type ``osd`` are a way to describe a +cluster layout, using the properties of disks. Service specifications give the +user an abstract way to tell Ceph which disks should turn into OSDs with which +configurations, without knowing the specifics of device names and paths. + +Service specifications make it possible to define a yaml or json file that can +be used to reduce the amount of manual work involved in creating OSDs. + +For example, instead of running the following command: + +.. prompt:: bash [monitor.1]# + + ceph orch daemon add osd *<host>*:*<path-to-device>* + +for each device and each host, we can define a yaml or json file that allows us +to describe the layout. Here's the most basic example. + +Create a file called (for example) ``osd_spec.yml``: + +.. code-block:: yaml + + service_type: osd + service_id: default_drive_group # custom name of the osd spec + placement: + host_pattern: '*' # which hosts to target + spec: + data_devices: # the type of devices you are applying specs to + all: true # a filter, check below for a full list + +This means : + +#. Turn any available device (ceph-volume decides what 'available' is) into an + OSD on all hosts that match the glob pattern '*'. (The glob pattern matches + against the registered hosts from `host ls`) A more detailed section on + host_pattern is available below. + +#. Then pass it to `osd create` like this: + + .. prompt:: bash [monitor.1]# + + ceph orch apply -i /path/to/osd_spec.yml + + This instruction will be issued to all the matching hosts, and will deploy + these OSDs. + + Setups more complex than the one specified by the ``all`` filter are + possible. See :ref:`osd_filters` for details. + + A ``--dry-run`` flag can be passed to the ``apply osd`` command to display a + synopsis of the proposed layout. + +Example + +.. prompt:: bash [monitor.1]# + + ceph orch apply -i /path/to/osd_spec.yml --dry-run + + + +.. _osd_filters: + +Filters +------- + +.. note:: + Filters are applied using an `AND` gate by default. This means that a drive + must fulfill all filter criteria in order to get selected. This behavior can + be adjusted by setting ``filter_logic: OR`` in the OSD specification. + +Filters are used to assign disks to groups, using their attributes to group +them. + +The attributes are based off of ceph-volume's disk query. You can retrieve +information about the attributes with this command: + +.. code-block:: bash + + ceph-volume inventory </path/to/disk> + +Vendor or Model +^^^^^^^^^^^^^^^ + +Specific disks can be targeted by vendor or model: + +.. code-block:: yaml + + model: disk_model_name + +or + +.. code-block:: yaml + + vendor: disk_vendor_name + + +Size +^^^^ + +Specific disks can be targeted by `Size`: + +.. code-block:: yaml + + size: size_spec + +Size specs +__________ + +Size specifications can be of the following forms: + +* LOW:HIGH +* :HIGH +* LOW: +* EXACT + +Concrete examples: + +To include disks of an exact size + +.. code-block:: yaml + + size: '10G' + +To include disks within a given range of size: + +.. code-block:: yaml + + size: '10G:40G' + +To include disks that are less than or equal to 10G in size: + +.. code-block:: yaml + + size: ':10G' + +To include disks equal to or greater than 40G in size: + +.. code-block:: yaml + + size: '40G:' + +Sizes don't have to be specified exclusively in Gigabytes(G). + +Other units of size are supported: Megabyte(M), Gigabyte(G) and Terabyte(T). +Appending the (B) for byte is also supported: ``MB``, ``GB``, ``TB``. + + +Rotational +^^^^^^^^^^ + +This operates on the 'rotational' attribute of the disk. + +.. code-block:: yaml + + rotational: 0 | 1 + +`1` to match all disks that are rotational + +`0` to match all disks that are non-rotational (SSD, NVME etc) + + +All +^^^ + +This will take all disks that are 'available' + +.. note:: This is exclusive for the data_devices section. + +.. code-block:: yaml + + all: true + + +Limiter +^^^^^^^ + +If you have specified some valid filters but want to limit the number of disks that they match, use the ``limit`` directive: + +.. code-block:: yaml + + limit: 2 + +For example, if you used `vendor` to match all disks that are from `VendorA` +but want to use only the first two, you could use `limit`: + +.. code-block:: yaml + + data_devices: + vendor: VendorA + limit: 2 + +.. note:: `limit` is a last resort and shouldn't be used if it can be avoided. + + +Additional Options +------------------ + +There are multiple optional settings you can use to change the way OSDs are deployed. +You can add these options to the base level of an OSD spec for it to take effect. + +This example would deploy all OSDs with encryption enabled. + +.. code-block:: yaml + + service_type: osd + service_id: example_osd_spec + placement: + host_pattern: '*' + spec: + data_devices: + all: true + encrypted: true + +See a full list in the DriveGroupSpecs + +.. py:currentmodule:: ceph.deployment.drive_group + +.. autoclass:: DriveGroupSpec + :members: + :exclude-members: from_json + + +Examples +======== + +The simple case +--------------- + +All nodes with the same setup + +.. code-block:: none + + 20 HDDs + Vendor: VendorA + Model: HDD-123-foo + Size: 4TB + + 2 SSDs + Vendor: VendorB + Model: MC-55-44-ZX + Size: 512GB + +This is a common setup and can be described quite easily: + +.. code-block:: yaml + + service_type: osd + service_id: osd_spec_default + placement: + host_pattern: '*' + spec: + data_devices: + model: HDD-123-foo # Note, HDD-123 would also be valid + db_devices: + model: MC-55-44-XZ # Same here, MC-55-44 is valid + +However, we can improve it by reducing the filters on core properties of the drives: + +.. code-block:: yaml + + service_type: osd + service_id: osd_spec_default + placement: + host_pattern: '*' + spec: + data_devices: + rotational: 1 + db_devices: + rotational: 0 + +Now, we enforce all rotating devices to be declared as 'data devices' and all non-rotating devices will be used as shared_devices (wal, db) + +If you know that drives with more than 2TB will always be the slower data devices, you can also filter by size: + +.. code-block:: yaml + + service_type: osd + service_id: osd_spec_default + placement: + host_pattern: '*' + spec: + data_devices: + size: '2TB:' + db_devices: + size: ':2TB' + +.. note:: All of the above OSD specs are equally valid. Which of those you want to use depends on taste and on how much you expect your node layout to change. + + +Multiple OSD specs for a single host +------------------------------------ + +Here we have two distinct setups + +.. code-block:: none + + 20 HDDs + Vendor: VendorA + Model: HDD-123-foo + Size: 4TB + + 12 SSDs + Vendor: VendorB + Model: MC-55-44-ZX + Size: 512GB + + 2 NVMEs + Vendor: VendorC + Model: NVME-QQQQ-987 + Size: 256GB + + +* 20 HDDs should share 2 SSDs +* 10 SSDs should share 2 NVMes + +This can be described with two layouts. + +.. code-block:: yaml + + service_type: osd + service_id: osd_spec_hdd + placement: + host_pattern: '*' + spec: + data_devices: + rotational: 1 + db_devices: + model: MC-55-44-XZ + limit: 2 # db_slots is actually to be favoured here, but it's not implemented yet + --- + service_type: osd + service_id: osd_spec_ssd + placement: + host_pattern: '*' + spec: + data_devices: + model: MC-55-44-XZ + db_devices: + vendor: VendorC + +This would create the desired layout by using all HDDs as data_devices with two SSD assigned as dedicated db/wal devices. +The remaining SSDs(10) will be data_devices that have the 'VendorC' NVMEs assigned as dedicated db/wal devices. + +Multiple hosts with the same disk layout +---------------------------------------- + +Assuming the cluster has different kinds of hosts each with similar disk +layout, it is recommended to apply different OSD specs matching only one +set of hosts. Typically you will have a spec for multiple hosts with the +same layout. + +The service id as the unique key: In case a new OSD spec with an already +applied service id is applied, the existing OSD spec will be superseded. +cephadm will now create new OSD daemons based on the new spec +definition. Existing OSD daemons will not be affected. See :ref:`cephadm-osd-declarative`. + +Node1-5 + +.. code-block:: none + + 20 HDDs + Vendor: VendorA + Model: SSD-123-foo + Size: 4TB + 2 SSDs + Vendor: VendorB + Model: MC-55-44-ZX + Size: 512GB + +Node6-10 + +.. code-block:: none + + 5 NVMEs + Vendor: VendorA + Model: SSD-123-foo + Size: 4TB + 20 SSDs + Vendor: VendorB + Model: MC-55-44-ZX + Size: 512GB + +You can use the 'placement' key in the layout to target certain nodes. + +.. code-block:: yaml + + service_type: osd + service_id: disk_layout_a + placement: + label: disk_layout_a + spec: + data_devices: + rotational: 1 + db_devices: + rotational: 0 + --- + service_type: osd + service_id: disk_layout_b + placement: + label: disk_layout_b + spec: + data_devices: + model: MC-55-44-XZ + db_devices: + model: SSD-123-foo + + +This applies different OSD specs to different hosts depending on the `placement` key. +See :ref:`orchestrator-cli-placement-spec` + +.. note:: + + Assuming each host has a unique disk layout, each OSD + spec needs to have a different service id + + +Dedicated wal + db +------------------ + +All previous cases co-located the WALs with the DBs. +It's however possible to deploy the WAL on a dedicated device as well, if it makes sense. + +.. code-block:: none + + 20 HDDs + Vendor: VendorA + Model: SSD-123-foo + Size: 4TB + + 2 SSDs + Vendor: VendorB + Model: MC-55-44-ZX + Size: 512GB + + 2 NVMEs + Vendor: VendorC + Model: NVME-QQQQ-987 + Size: 256GB + + +The OSD spec for this case would look like the following (using the `model` filter): + +.. code-block:: yaml + + service_type: osd + service_id: osd_spec_default + placement: + host_pattern: '*' + spec: + data_devices: + model: MC-55-44-XZ + db_devices: + model: SSD-123-foo + wal_devices: + model: NVME-QQQQ-987 + + +It is also possible to specify directly device paths in specific hosts like the following: + +.. code-block:: yaml + + service_type: osd + service_id: osd_using_paths + placement: + hosts: + - Node01 + - Node02 + spec: + data_devices: + paths: + - /dev/sdb + db_devices: + paths: + - /dev/sdc + wal_devices: + paths: + - /dev/sdd + + +This can easily be done with other filters, like `size` or `vendor` as well. + +It's possible to specify the `crush_device_class` parameter within the +DriveGroup spec, and it's applied to all the devices defined by the `paths` +keyword: + +.. code-block:: yaml + + service_type: osd + service_id: osd_using_paths + placement: + hosts: + - Node01 + - Node02 + crush_device_class: ssd + spec: + data_devices: + paths: + - /dev/sdb + - /dev/sdc + db_devices: + paths: + - /dev/sdd + wal_devices: + paths: + - /dev/sde + +The `crush_device_class` parameter, however, can be defined for each OSD passed +using the `paths` keyword with the following syntax: + +.. code-block:: yaml + + service_type: osd + service_id: osd_using_paths + placement: + hosts: + - Node01 + - Node02 + crush_device_class: ssd + spec: + data_devices: + paths: + - path: /dev/sdb + crush_device_class: ssd + - path: /dev/sdc + crush_device_class: nvme + db_devices: + paths: + - /dev/sdd + wal_devices: + paths: + - /dev/sde + +.. _cephadm-osd-activate: + +Activate existing OSDs +====================== + +In case the OS of a host was reinstalled, existing OSDs need to be activated +again. For this use case, cephadm provides a wrapper for :ref:`ceph-volume-lvm-activate` that +activates all existing OSDs on a host. + +.. prompt:: bash # + + ceph cephadm osd activate <host>... + +This will scan all existing disks for OSDs and deploy corresponding daemons. + +Further Reading +=============== + +* :ref:`ceph-volume` +* :ref:`rados-index` diff --git a/doc/cephadm/services/rgw.rst b/doc/cephadm/services/rgw.rst new file mode 100644 index 000000000..20ec39a88 --- /dev/null +++ b/doc/cephadm/services/rgw.rst @@ -0,0 +1,371 @@ +=========== +RGW Service +=========== + +.. _cephadm-deploy-rgw: + +Deploy RGWs +=========== + +Cephadm deploys radosgw as a collection of daemons that manage a +single-cluster deployment or a particular *realm* and *zone* in a +multisite deployment. (For more information about realms and zones, +see :ref:`multisite`.) + +Note that with cephadm, radosgw daemons are configured via the monitor +configuration database instead of via a `ceph.conf` or the command line. If +that configuration isn't already in place (usually in the +``client.rgw.<something>`` section), then the radosgw +daemons will start up with default settings (e.g., binding to port +80). + +To deploy a set of radosgw daemons, with an arbitrary service name +*name*, run the following command: + +.. prompt:: bash # + + ceph orch apply rgw *<name>* [--realm=*<realm-name>*] [--zone=*<zone-name>*] --placement="*<num-daemons>* [*<host1>* ...]" + +Trivial setup +------------- + +For example, to deploy 2 RGW daemons (the default) for a single-cluster RGW deployment +under the arbitrary service id *foo*: + +.. prompt:: bash # + + ceph orch apply rgw foo + +.. _cephadm-rgw-designated_gateways: + +Designated gateways +------------------- + +A common scenario is to have a labeled set of hosts that will act +as gateways, with multiple instances of radosgw running on consecutive +ports 8000 and 8001: + +.. prompt:: bash # + + ceph orch host label add gwhost1 rgw # the 'rgw' label can be anything + ceph orch host label add gwhost2 rgw + ceph orch apply rgw foo '--placement=label:rgw count-per-host:2' --port=8000 + +See also: :ref:`cephadm_co_location`. + +.. _cephadm-rgw-networks: + +Specifying Networks +------------------- + +The RGW service can have the network they bind to configured with a yaml service specification. + +example spec file: + +.. code-block:: yaml + + service_type: rgw + service_id: foo + placement: + label: rgw + count_per_host: 2 + networks: + - 192.169.142.0/24 + spec: + rgw_frontend_port: 8080 + +Passing Frontend Extra Arguments +-------------------------------- + +The RGW service specification can be used to pass extra arguments to the rgw frontend by using +the `rgw_frontend_extra_args` arguments list. + +example spec file: + +.. code-block:: yaml + + service_type: rgw + service_id: foo + placement: + label: rgw + count_per_host: 2 + spec: + rgw_realm: myrealm + rgw_zone: myzone + rgw_frontend_type: "beast" + rgw_frontend_port: 5000 + rgw_frontend_extra_args: + - "tcp_nodelay=1" + - "max_header_size=65536" + +.. note:: cephadm combines the arguments from the `spec` section and the ones from + the `rgw_frontend_extra_args` into a single space-separated arguments list + which is used to set the value of `rgw_frontends` configuration parameter. + +Multisite zones +--------------- + +To deploy RGWs serving the multisite *myorg* realm and the *us-east-1* zone on +*myhost1* and *myhost2*: + +.. prompt:: bash # + + ceph orch apply rgw east --realm=myorg --zonegroup=us-east-zg-1 --zone=us-east-1 --placement="2 myhost1 myhost2" + +Note that in a multisite situation, cephadm only deploys the daemons. It does not create +or update the realm or zone configurations. To create a new realms, zones and zonegroups +you can use :ref:`mgr-rgw-module` or manually using something like: + +.. prompt:: bash # + + radosgw-admin realm create --rgw-realm=<realm-name> + +.. prompt:: bash # + + radosgw-admin zonegroup create --rgw-zonegroup=<zonegroup-name> --master + +.. prompt:: bash # + + radosgw-admin zone create --rgw-zonegroup=<zonegroup-name> --rgw-zone=<zone-name> --master + +.. prompt:: bash # + + radosgw-admin period update --rgw-realm=<realm-name> --commit + +See :ref:`orchestrator-cli-placement-spec` for details of the placement +specification. See :ref:`multisite` for more information of setting up multisite RGW. + +See also :ref:`multisite`. + +Setting up HTTPS +---------------- + +In order to enable HTTPS for RGW services, apply a spec file following this scheme: + +.. code-block:: yaml + + service_type: rgw + service_id: myrgw + spec: + rgw_frontend_ssl_certificate: | + -----BEGIN PRIVATE KEY----- + V2VyIGRhcyBsaWVzdCBpc3QgZG9vZi4gTG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFt + ZXQsIGNvbnNldGV0dXIgc2FkaXBzY2luZyBlbGl0ciwgc2VkIGRpYW0gbm9udW15 + IGVpcm1vZCB0ZW1wb3IgaW52aWR1bnQgdXQgbGFib3JlIGV0IGRvbG9yZSBtYWdu + YSBhbGlxdXlhbSBlcmF0LCBzZWQgZGlhbSB2b2x1cHR1YS4gQXQgdmVybyBlb3Mg + ZXQgYWNjdXNhbSBldCBqdXN0byBkdW8= + -----END PRIVATE KEY----- + -----BEGIN CERTIFICATE----- + V2VyIGRhcyBsaWVzdCBpc3QgZG9vZi4gTG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFt + ZXQsIGNvbnNldGV0dXIgc2FkaXBzY2luZyBlbGl0ciwgc2VkIGRpYW0gbm9udW15 + IGVpcm1vZCB0ZW1wb3IgaW52aWR1bnQgdXQgbGFib3JlIGV0IGRvbG9yZSBtYWdu + YSBhbGlxdXlhbSBlcmF0LCBzZWQgZGlhbSB2b2x1cHR1YS4gQXQgdmVybyBlb3Mg + ZXQgYWNjdXNhbSBldCBqdXN0byBkdW8= + -----END CERTIFICATE----- + ssl: true + +Then apply this yaml document: + +.. prompt:: bash # + + ceph orch apply -i myrgw.yaml + +Note the value of ``rgw_frontend_ssl_certificate`` is a literal string as +indicated by a ``|`` character preserving newline characters. + +Service specification +--------------------- + +.. py:currentmodule:: ceph.deployment.service_spec + +.. autoclass:: RGWSpec + :members: + +.. _orchestrator-haproxy-service-spec: + +High availability service for RGW +================================= + +The *ingress* service allows you to create a high availability endpoint +for RGW with a minimum set of configuration options. The orchestrator will +deploy and manage a combination of haproxy and keepalived to provide load +balancing on a floating virtual IP. + +If the RGW service is configured with SSL enabled, then the ingress service +will use the `ssl` and `verify none` options in the backend configuration. +Trust verification is disabled because the backends are accessed by IP +address instead of FQDN. + +.. image:: ../../images/HAProxy_for_RGW.svg + +There are N hosts where the ingress service is deployed. Each host +has a haproxy daemon and a keepalived daemon. A virtual IP is +automatically configured on only one of these hosts at a time. + +Each keepalived daemon checks every few seconds whether the haproxy +daemon on the same host is responding. Keepalived will also check +that the master keepalived daemon is running without problems. If the +"master" keepalived daemon or the active haproxy is not responding, +one of the remaining keepalived daemons running in backup mode will be +elected as master, and the virtual IP will be moved to that node. + +The active haproxy acts like a load balancer, distributing all RGW requests +between all the RGW daemons available. + +Prerequisites +------------- + +* An existing RGW service. + +Deploying +--------- + +Use the command:: + + ceph orch apply -i <ingress_spec_file> + +Service specification +--------------------- + +It is a yaml format file with the following properties: + +.. code-block:: yaml + + service_type: ingress + service_id: rgw.something # adjust to match your existing RGW service + placement: + hosts: + - host1 + - host2 + - host3 + spec: + backend_service: rgw.something # adjust to match your existing RGW service + virtual_ip: <string>/<string> # ex: 192.168.20.1/24 + frontend_port: <integer> # ex: 8080 + monitor_port: <integer> # ex: 1967, used by haproxy for load balancer status + virtual_interface_networks: [ ... ] # optional: list of CIDR networks + use_keepalived_multicast: <bool> # optional: Default is False. + vrrp_interface_network: <string>/<string> # optional: ex: 192.168.20.0/24 + ssl_cert: | # optional: SSL certificate and key + -----BEGIN CERTIFICATE----- + ... + -----END CERTIFICATE----- + -----BEGIN PRIVATE KEY----- + ... + -----END PRIVATE KEY----- + +.. code-block:: yaml + + service_type: ingress + service_id: rgw.something # adjust to match your existing RGW service + placement: + hosts: + - host1 + - host2 + - host3 + spec: + backend_service: rgw.something # adjust to match your existing RGW service + virtual_ips_list: + - <string>/<string> # ex: 192.168.20.1/24 + - <string>/<string> # ex: 192.168.20.2/24 + - <string>/<string> # ex: 192.168.20.3/24 + frontend_port: <integer> # ex: 8080 + monitor_port: <integer> # ex: 1967, used by haproxy for load balancer status + virtual_interface_networks: [ ... ] # optional: list of CIDR networks + first_virtual_router_id: <integer> # optional: default 50 + ssl_cert: | # optional: SSL certificate and key + -----BEGIN CERTIFICATE----- + ... + -----END CERTIFICATE----- + -----BEGIN PRIVATE KEY----- + ... + -----END PRIVATE KEY----- + + +where the properties of this service specification are: + +* ``service_type`` + Mandatory and set to "ingress" +* ``service_id`` + The name of the service. We suggest naming this after the service you are + controlling ingress for (e.g., ``rgw.foo``). +* ``placement hosts`` + The hosts where it is desired to run the HA daemons. An haproxy and a + keepalived container will be deployed on these hosts. These hosts do not need + to match the nodes where RGW is deployed. +* ``virtual_ip`` + The virtual IP (and network) in CIDR format where the ingress service will be available. +* ``virtual_ips_list`` + The virtual IP address in CIDR format where the ingress service will be available. + Each virtual IP address will be primary on one node running the ingress service. The number + of virtual IP addresses must be less than or equal to the number of ingress nodes. +* ``virtual_interface_networks`` + A list of networks to identify which ethernet interface to use for the virtual IP. +* ``frontend_port`` + The port used to access the ingress service. +* ``ssl_cert``: + SSL certificate, if SSL is to be enabled. This must contain the both the certificate and + private key blocks in .pem format. +* ``use_keepalived_multicast`` + Default is False. By default, cephadm will deploy keepalived config to use unicast IPs, + using the IPs of the hosts. The IPs chosen will be the same IPs cephadm uses to connect + to the machines. But if multicast is prefered, we can set ``use_keepalived_multicast`` + to ``True`` and Keepalived will use multicast IP (224.0.0.18) to communicate between instances, + using the same interfaces as where the VIPs are. +* ``vrrp_interface_network`` + By default, cephadm will configure keepalived to use the same interface where the VIPs are + for VRRP communication. If another interface is needed, it can be set via ``vrrp_interface_network`` + with a network to identify which ethernet interface to use. +* ``first_virtual_router_id`` + Default is 50. When deploying more than 1 ingress, this parameter can be used to ensure each + keepalived will have different virtual_router_id. In the case of using ``virtual_ips_list``, + each IP will create its own virtual router. So the first one will have ``first_virtual_router_id``, + second one will have ``first_virtual_router_id`` + 1, etc. Valid values go from 1 to 255. + +.. _ingress-virtual-ip: + +Selecting ethernet interfaces for the virtual IP +------------------------------------------------ + +You cannot simply provide the name of the network interface on which +to configure the virtual IP because interface names tend to vary +across hosts (and/or reboots). Instead, cephadm will select +interfaces based on other existing IP addresses that are already +configured. + +Normally, the virtual IP will be configured on the first network +interface that has an existing IP in the same subnet. For example, if +the virtual IP is 192.168.0.80/24 and eth2 has the static IP +192.168.0.40/24, cephadm will use eth2. + +In some cases, the virtual IP may not belong to the same subnet as an existing static +IP. In such cases, you can provide a list of subnets to match against existing IPs, +and cephadm will put the virtual IP on the first network interface to match. For example, +if the virtual IP is 192.168.0.80/24 and we want it on the same interface as the machine's +static IP in 10.10.0.0/16, you can use a spec like:: + + service_type: ingress + service_id: rgw.something + spec: + virtual_ip: 192.168.0.80/24 + virtual_interface_networks: + - 10.10.0.0/16 + ... + +A consequence of this strategy is that you cannot currently configure the virtual IP +on an interface that has no existing IP address. In this situation, we suggest +configuring a "dummy" IP address is an unroutable network on the correct interface +and reference that dummy network in the networks list (see above). + + +Useful hints for ingress +------------------------ + +* It is good to have at least 3 RGW daemons. +* We recommend at least 3 hosts for the ingress service. + +Further Reading +=============== + +* :ref:`object-gateway` +* :ref:`mgr-rgw-module` diff --git a/doc/cephadm/services/snmp-gateway.rst b/doc/cephadm/services/snmp-gateway.rst new file mode 100644 index 000000000..f927fdfd0 --- /dev/null +++ b/doc/cephadm/services/snmp-gateway.rst @@ -0,0 +1,171 @@ +==================== +SNMP Gateway Service +==================== + +SNMP_ is still a widely used protocol, to monitor distributed systems and devices across a variety of hardware +and software platforms. Ceph's SNMP integration focuses on forwarding alerts from it's Prometheus Alertmanager +cluster to a gateway daemon. The gateway daemon, transforms the alert into an SNMP Notification and sends +it on to a designated SNMP management platform. The gateway daemon is from the snmp_notifier_ project, +which provides SNMP V2c and V3 support (authentication and encryption). + +Ceph's SNMP gateway service deploys one instance of the gateway by default. You may increase this +by providing placement information. However, bear in mind that if you enable multiple SNMP gateway daemons, +your SNMP management platform will receive multiple notifications for the same event. + +.. _SNMP: https://en.wikipedia.org/wiki/Simple_Network_Management_Protocol +.. _snmp_notifier: https://github.com/maxwo/snmp_notifier + +Compatibility +============= +The table below shows the SNMP versions that are supported by the gateway implementation + +================ =========== =============================================== + SNMP Version Supported Notes +================ =========== =============================================== + V1 ❌ Not supported by snmp_notifier + V2c ✔ + V3 authNoPriv ✔ uses username/password authentication, without + encryption (NoPriv = no privacy) + V3 authPriv ✔ uses username/password authentication with + encryption to the SNMP management platform +================ =========== =============================================== + + +Deploying an SNMP Gateway +========================= +Both SNMP V2c and V3 provide credentials support. In the case of V2c, this is just the community string - but for V3 +environments you must provide additional authentication information. These credentials are not supported on the command +line when deploying the service. Instead, you must create the service using a credentials file (in yaml format), or +specify the complete service definition in a yaml file. + +Command format +-------------- + +.. prompt:: bash # + + ceph orch apply snmp-gateway <snmp_version:V2c|V3> <destination> [<port:int>] [<engine_id>] [<auth_protocol: MD5|SHA>] [<privacy_protocol:DES|AES>] [<placement>] ... + + +Usage Notes + +- you must supply the ``--snmp-version`` parameter +- the ``--destination`` parameter must be of the format hostname:port (no default) +- you may omit ``--port``. It defaults to 9464 +- the ``--engine-id`` is a unique identifier for the device (in hex) and required for SNMP v3 only. + Suggested value: 8000C53F<fsid> where the fsid is from your cluster, without the '-' symbols +- for SNMP V3, the ``--auth-protocol`` setting defaults to **SHA** +- for SNMP V3, with encryption you must define the ``--privacy-protocol`` +- you **must** provide a -i <filename> to pass the secrets/passwords to the orchestrator + +Deployment Examples +=================== + +SNMP V2c +-------- +Here's an example for V2c, showing CLI and service based deployments + +.. prompt:: bash # + + ceph orch apply snmp-gateway --port 9464 --snmp_version=V2c --destination=192.168.122.73:162 -i ./snmp_creds.yaml + +with a credentials file that contains; + +.. code-block:: yaml + + --- + snmp_community: public + +Alternatively, you can create a yaml definition for the gateway and apply it from a single file + +.. prompt:: bash # + + ceph orch apply -i snmp-gateway.yml + +with the file containing the following configuration + +.. code-block:: yaml + + service_type: snmp-gateway + service_name: snmp-gateway + placement: + count: 1 + spec: + credentials: + snmp_community: public + port: 9464 + snmp_destination: 192.168.122.73:162 + snmp_version: V2c + + +SNMP V3 (authNoPriv) +-------------------- +Deploying an snmp-gateway service supporting SNMP V3 with authentication only, would look like this; + +.. prompt:: bash # + + ceph orch apply snmp-gateway --snmp-version=V3 --engine-id=800C53F000000 --destination=192.168.122.1:162 -i ./snmpv3_creds.yml + +with a credentials file as; + +.. code-block:: yaml + + --- + snmp_v3_auth_username: myuser + snmp_v3_auth_password: mypassword + +or as a service configuration file + +.. code-block:: yaml + + service_type: snmp-gateway + service_name: snmp-gateway + placement: + count: 1 + spec: + credentials: + snmp_v3_auth_password: mypassword + snmp_v3_auth_username: myuser + engine_id: 800C53F000000 + port: 9464 + snmp_destination: 192.168.122.1:162 + snmp_version: V3 + + +SNMP V3 (authPriv) +------------------ + +Defining an SNMP V3 gateway service that implements authentication and privacy (encryption), requires two additional values + +.. prompt:: bash # + + ceph orch apply snmp-gateway --snmp-version=V3 --engine-id=800C53F000000 --destination=192.168.122.1:162 --privacy-protocol=AES -i ./snmpv3_creds.yml + +with a credentials file as; + +.. code-block:: yaml + + --- + snmp_v3_auth_username: myuser + snmp_v3_auth_password: mypassword + snmp_v3_priv_password: mysecret + + +.. note:: + + The credentials are stored on the host, restricted to the root user and passed to the snmp_notifier daemon as + an environment file (``--env-file``), to limit exposure. + + +AlertManager Integration +======================== +When an SNMP gateway service is deployed or updated, the Prometheus Alertmanager configuration is automatically updated to forward any +alert that has an OID_ label to the SNMP gateway daemon for processing. + +.. _OID: https://en.wikipedia.org/wiki/Object_identifier + +Implementing the MIB +====================== +To make sense of the SNMP Notification/Trap, you'll need to apply the MIB to your SNMP management platform. The MIB (CEPH-MIB.txt) can +downloaded from the main Ceph repo_ + +.. _repo: https://github.com/ceph/ceph/tree/master/monitoring/snmp diff --git a/doc/cephadm/services/tracing.rst b/doc/cephadm/services/tracing.rst new file mode 100644 index 000000000..e96d601d2 --- /dev/null +++ b/doc/cephadm/services/tracing.rst @@ -0,0 +1,45 @@ +================ +Tracing Services +================ + +.. _cephadm-tracing: + + +Jaeger Tracing +============== + +Ceph uses Jaeger as the tracing backend. in order to use tracing, we need to deploy those services. + +Further details on tracing in ceph: + +`Ceph Tracing documentation <https://docs.ceph.com/en/latest/jaegertracing/#jaeger-distributed-tracing/>`_ + +Deployment +========== + +Jaeger services consist of 3 services: + +1. Jaeger Agent + +2. Jaeger Collector + +3. Jaeger Query + +Jaeger requires a database for the traces. we use ElasticSearch (version 6) by default. + + +To deploy jaeger tracing service, when not using your own ElasticSearch: + +#. Deploy jaeger services, with a new elasticsearch container: + + .. prompt:: bash # + + ceph orch apply jaeger + + +#. Deploy jaeger services, with existing elasticsearch cluster and existing jaeger query (deploy agents and collectors): + + .. prompt:: bash # + + ceph orch apply jaeger --without-query --es_nodes=ip:port,.. + |