diff options
Diffstat (limited to '')
-rw-r--r-- | doc/mgr/modules.rst | 476 |
1 files changed, 476 insertions, 0 deletions
diff --git a/doc/mgr/modules.rst b/doc/mgr/modules.rst new file mode 100644 index 000000000..8979b4e6a --- /dev/null +++ b/doc/mgr/modules.rst @@ -0,0 +1,476 @@ + + +.. _mgr-module-dev: + +ceph-mgr module developer's guide +================================= + +.. warning:: + + This is developer documentation, describing Ceph internals that + are only relevant to people writing ceph-mgr modules. + +Creating a module +----------------- + +In pybind/mgr/, create a python module. Within your module, create a class +that inherits from ``MgrModule``. For ceph-mgr to detect your module, your +directory must contain a file called `module.py`. + +The most important methods to override are: + +* a ``serve`` member function for server-type modules. This + function should block forever. +* a ``notify`` member function if your module needs to + take action when new cluster data is available. +* a ``handle_command`` member function if your module + exposes CLI commands. + +Some modules interface with external orchestrators to deploy +Ceph services. These also inherit from ``Orchestrator``, which adds +additional methods to the base ``MgrModule`` class. See +:ref:`Orchestrator modules <orchestrator-modules>` for more on +creating these modules. + +Installing a module +------------------- + +Once your module is present in the location set by the +``mgr module path`` configuration setting, you can enable it +via the ``ceph mgr module enable`` command:: + + ceph mgr module enable mymodule + +Note that the MgrModule interface is not stable, so any modules maintained +outside of the Ceph tree are liable to break when run against any newer +or older versions of Ceph. + +Logging +------- + +Logging in Ceph manager modules is done as in any other Python program. Just +import the ``logging`` package and get a logger instance with the +``logging.getLogger`` function. + +Each module has a ``log_level`` option that specifies the current Python +logging level of the module. +To change or query the logging level of the module use the following Ceph +commands:: + + ceph config get mgr mgr/<module_name>/log_level + ceph config set mgr mgr/<module_name>/log_level <info|debug|critical|error|warning|> + +The logging level used upon the module's start is determined by the current +logging level of the mgr daemon, unless if the ``log_level`` option was +previously set with the ``config set ...`` command. The mgr daemon logging +level is mapped to the module python logging level as follows: + +* <= 0 is CRITICAL +* <= 1 is WARNING +* <= 4 is INFO +* <= +inf is DEBUG + +We can unset the module log level and fallback to the mgr daemon logging level +by running the following command:: + + ceph config set mgr mgr/<module_name>/log_level '' + +By default, modules' logging messages are processed by the Ceph logging layer +where they will be recorded in the mgr daemon's log file. +But it's also possible to send a module's logging message to it's own file. + +The module's log file will be located in the same directory as the mgr daemon's +log file with the following name pattern:: + + <mgr_daemon_log_file_name>.<module_name>.log + +To enable the file logging on a module use the following command:: + + ceph config set mgr mgr/<module_name>/log_to_file true + +When the module's file logging is enabled, module's logging messages stop +being written to the mgr daemon's log file and are only written to the +module's log file. + +It's also possible to check the status and disable the file logging with the +following commands:: + + ceph config get mgr mgr/<module_name>/log_to_file + ceph config set mgr mgr/<module_name>/log_to_file false + + + + +Exposing commands +----------------- + +There are two approaches for exposing a command. The first one is to +use the ``@CLICommand`` decorator to decorate the method which handles +the command. like this + +.. code:: python + + @CLICommand('antigravity send to blackhole', + perm='rw') + def send_to_blackhole(self, oid: str, blackhole: Optional[str] = None, inbuf: Optional[str] = None): + ''' + Send the specified object to black hole + ''' + obj = self.find_object(oid) + if obj is None: + return HandleCommandResult(-errno.ENOENT, stderr=f"object '{oid}' not found") + if blackhole is not None and inbuf is not None: + try: + location = self.decrypt(blackhole, passphrase=inbuf) + except ValueError: + return HandleCommandResult(-errno.EINVAL, stderr='unable to decrypt location') + else: + location = blackhole + self.send_object_to(obj, location) + return HandleCommandResult(stdout=f'the black hole swallowed '{oid}'") + +The first parameter passed to ``CLICommand`` is the "name" of the command. +Since there are lots of commands in Ceph, we tend to group related commands +with a common prefix. In this case, "antigravity" is used for this purpose. +As the author is probably designing a module which is also able to launch +rockets into the deep space. + +The `type annotations <https://www.python.org/dev/peps/pep-0484/>`_ for the +method parameters are mandatory here, so the usage of the command can be +properly reported to the ``ceph`` CLI, and the manager daemon can convert +the serialized command parameters sent by the clients to the expected type +before passing them to the handler method. With properly implemented types, +one can also perform some sanity checks against the parameters! + +The names of the parameters are part of the command interface, so please +try to take the backward compatibility into consideration when changing +them. But you **cannot** change name of ``inbuf`` parameter, it is used +to pass the content of the file specified by ``ceph --in-file`` option. + +The docstring of the method is used for the description of the command. + +The manager daemon cooks the usage of the command from these ingredients, +like:: + + antigravity send to blackhole <oid> [<blackhole>] Send the specified object to black hole + +as part of the output of ``ceph --help``. + +In addition to ``@CLICommand``, you could also use ``@CLIReadCommand`` or +``@CLIWriteCommand`` if your command only requires read permissions or +write permissions respectively. + +The second one is to set the ``COMMANDS`` class attribute of your module to +a list of dicts like this:: + + COMMANDS = [ + { + "cmd": "foobar name=myarg,type=CephString", + "desc": "Do something awesome", + "perm": "rw", + # optional: + "poll": "true" + } + ] + +The ``cmd`` part of each entry is parsed in the same way as internal +Ceph mon and admin socket commands (see mon/MonCommands.h in +the Ceph source for examples). Note that the "poll" field is optional, +and is set to False by default; this indicates to the ``ceph`` CLI +that it should call this command repeatedly and output results (see +``ceph -h`` and its ``--period`` option). + +Each command is expected to return a tuple ``(retval, stdout, stderr)``. +``retval`` is an integer representing a libc error code (e.g. EINVAL, +EPERM, or 0 for no error), ``stdout`` is a string containing any +non-error output, and ``stderr`` is a string containing any progress or +error explanation output. Either or both of the two strings may be empty. + +Implement the ``handle_command`` function to respond to the commands +when they are sent: + + +.. py:currentmodule:: mgr_module +.. automethod:: MgrModule.handle_command + +Configuration options +--------------------- + +Modules can load and store configuration options using the +``set_module_option`` and ``get_module_option`` methods. + +.. note:: Use ``set_module_option`` and ``get_module_option`` to + manage user-visible configuration options that are not blobs (like + certificates). If you want to persist module-internal data or + binary configuration data consider using the `KV store`_. + +You must declare your available configuration options in the +``MODULE_OPTIONS`` class attribute, like this: + +:: + + MODULE_OPTIONS = [ + { + "name": "my_option" + } + ] + +If you try to use set_module_option or get_module_option on options not declared +in ``MODULE_OPTIONS``, an exception will be raised. + +You may choose to provide setter commands in your module to perform +high level validation. Users can also modify configuration using +the normal `ceph config set` command, where the configuration options +for a mgr module are named like `mgr/<module name>/<option>`. + +If a configuration option is different depending on which node the mgr +is running on, then use *localized* configuration ( +``get_localized_module_option``, ``set_localized_module_option``). +This may be necessary for options such as what address to listen on. +Localized options may also be set externally with ``ceph config set``, +where they key name is like ``mgr/<module name>/<mgr id>/<option>`` + +If you need to load and store data (e.g. something larger, binary, or multiline), +use the KV store instead of configuration options (see next section). + +Hints for using config options: + +* Reads are fast: ceph-mgr keeps a local in-memory copy, so in many cases + you can just do a get_module_option every time you use a option, rather than + copying it out into a variable. +* Writes block until the value is persisted (i.e. round trip to the monitor), + but reads from another thread will see the new value immediately. +* If a user has used `config set` from the command line, then the new + value will become visible to `get_module_option` immediately, although the + mon->mgr update is asynchronous, so `config set` will return a fraction + of a second before the new value is visible on the mgr. +* To delete a config value (i.e. revert to default), just pass ``None`` to + set_module_option. + +.. automethod:: MgrModule.get_module_option +.. automethod:: MgrModule.set_module_option +.. automethod:: MgrModule.get_localized_module_option +.. automethod:: MgrModule.set_localized_module_option + +KV store +-------- + +Modules have access to a private (per-module) key value store, which +is implemented using the monitor's "config-key" commands. Use +the ``set_store`` and ``get_store`` methods to access the KV store from +your module. + +The KV store commands work in a similar way to the configuration +commands. Reads are fast, operating from a local cache. Writes block +on persistence and do a round trip to the monitor. + +This data can be access from outside of ceph-mgr using the +``ceph config-key [get|set]`` commands. Key names follow the same +conventions as configuration options. Note that any values updated +from outside of ceph-mgr will not be seen by running modules until +the next restart. Users should be discouraged from accessing module KV +data externally -- if it is necessary for users to populate data, modules +should provide special commands to set the data via the module. + +Use the ``get_store_prefix`` function to enumerate keys within +a particular prefix (i.e. all keys starting with a particular substring). + + +.. automethod:: MgrModule.get_store +.. automethod:: MgrModule.set_store +.. automethod:: MgrModule.get_localized_store +.. automethod:: MgrModule.set_localized_store +.. automethod:: MgrModule.get_store_prefix + + +Accessing cluster data +---------------------- + +Modules have access to the in-memory copies of the Ceph cluster's +state that the mgr maintains. Accessor functions as exposed +as members of MgrModule. + +Calls that access the cluster or daemon state are generally going +from Python into native C++ routines. There is some overhead to this, +but much less than for example calling into a REST API or calling into +an SQL database. + +There are no consistency rules about access to cluster structures or +daemon metadata. For example, an OSD might exist in OSDMap but +have no metadata, or vice versa. On a healthy cluster these +will be very rare transient states, but modules should be written +to cope with the possibility. + +Note that these accessors must not be called in the modules ``__init__`` +function. This will result in a circular locking exception. + +.. automethod:: MgrModule.get +.. automethod:: MgrModule.get_server +.. automethod:: MgrModule.list_servers +.. automethod:: MgrModule.get_metadata +.. automethod:: MgrModule.get_daemon_status +.. automethod:: MgrModule.get_perf_schema +.. automethod:: MgrModule.get_counter +.. automethod:: MgrModule.get_mgr_id + +Exposing health checks +---------------------- + +Modules can raise first class Ceph health checks, which will be reported +in the output of ``ceph status`` and in other places that report on the +cluster's health. + +If you use ``set_health_checks`` to report a problem, be sure to call +it again with an empty dict to clear your health check when the problem +goes away. + +.. automethod:: MgrModule.set_health_checks + +What if the mons are down? +-------------------------- + +The manager daemon gets much of its state (such as the cluster maps) +from the monitor. If the monitor cluster is inaccessible, whichever +manager was active will continue to run, with the latest state it saw +still in memory. + +However, if you are creating a module that shows the cluster state +to the user then you may well not want to mislead them by showing +them that out of date state. + +To check if the manager daemon currently has a connection to +the monitor cluster, use this function: + +.. automethod:: MgrModule.have_mon_connection + +Reporting if your module cannot run +----------------------------------- + +If your module cannot be run for any reason (such as a missing dependency), +then you can report that by implementing the ``can_run`` function. + +.. automethod:: MgrModule.can_run + +Note that this will only work properly if your module can always be imported: +if you are importing a dependency that may be absent, then do it in a +try/except block so that your module can be loaded far enough to use +``can_run`` even if the dependency is absent. + +Sending commands +---------------- + +A non-blocking facility is provided for sending monitor commands +to the cluster. + +.. automethod:: MgrModule.send_command + +Receiving notifications +----------------------- + +The manager daemon calls the ``notify`` function on all active modules +when certain important pieces of cluster state are updated, such as the +cluster maps. + +The actual data is not passed into this function, rather it is a cue for +the module to go and read the relevant structure if it is interested. Most +modules ignore most types of notification: to ignore a notification +simply return from this function without doing anything. + +.. automethod:: MgrModule.notify + +Accessing RADOS or CephFS +------------------------- + +If you want to use the librados python API to access data stored in +the Ceph cluster, you can access the ``rados`` attribute of your +``MgrModule`` instance. This is an instance of ``rados.Rados`` which +has been constructed for you using the existing Ceph context (an internal +detail of the C++ Ceph code) of the mgr daemon. + +Always use this specially constructed librados instance instead of +constructing one by hand. + +Similarly, if you are using libcephfs to access the file system, then +use the libcephfs ``create_with_rados`` to construct it from the +``MgrModule.rados`` librados instance, and thereby inherit the correct context. + +Remember that your module may be running while other parts of the cluster +are down: do not assume that librados or libcephfs calls will return +promptly -- consider whether to use timeouts or to block if the rest of +the cluster is not fully available. + +Implementing standby mode +------------------------- + +For some modules, it is useful to run on standby manager daemons as well +as on the active daemon. For example, an HTTP server can usefully +serve HTTP redirect responses from the standby managers so that +the user can point his browser at any of the manager daemons without +having to worry about which one is active. + +Standby manager daemons look for a subclass of ``StandbyModule`` +in each module. If the class is not found then the module is not +used at all on standby daemons. If the class is found, then +its ``serve`` method is called. Implementations of ``StandbyModule`` +must inherit from ``mgr_module.MgrStandbyModule``. + +The interface of ``MgrStandbyModule`` is much restricted compared to +``MgrModule`` -- none of the Ceph cluster state is available to +the module. ``serve`` and ``shutdown`` methods are used in the same +way as a normal module class. The ``get_active_uri`` method enables +the standby module to discover the address of its active peer in +order to make redirects. See the ``MgrStandbyModule`` definition +in the Ceph source code for the full list of methods. + +For an example of how to use this interface, look at the source code +of the ``dashboard`` module. + +Communicating between modules +----------------------------- + +Modules can invoke member functions of other modules. + +.. automethod:: MgrModule.remote + +Be sure to handle ``ImportError`` to deal with the case that the desired +module is not enabled. + +If the remote method raises a python exception, this will be converted +to a RuntimeError on the calling side, where the message string describes +the exception that was originally thrown. If your logic intends +to handle certain errors cleanly, it is better to modify the remote method +to return an error value instead of raising an exception. + +At time of writing, inter-module calls are implemented without +copies or serialization, so when you return a python object, you're +returning a reference to that object to the calling module. It +is recommend *not* to rely on this reference passing, as in future the +implementation may change to serialize arguments and return +values. + + +Shutting down cleanly +--------------------- + +If a module implements the ``serve()`` method, it should also implement +the ``shutdown()`` method to shutdown cleanly: misbehaving modules +may otherwise prevent clean shutdown of ceph-mgr. + +Limitations +----------- + +It is not possible to call back into C++ code from a module's +``__init__()`` method. For example calling ``self.get_module_option()`` at +this point will result in an assertion failure in ceph-mgr. For modules +that implement the ``serve()`` method, it usually makes sense to do most +initialization inside that method instead. + +Is something missing? +--------------------- + +The ceph-mgr python interface is not set in stone. If you have a need +that is not satisfied by the current interface, please bring it up +on the ceph-devel mailing list. While it is desired to avoid bloating +the interface, it is not generally very hard to expose existing data +to the Python code when there is a good reason. + |