author Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-21 11:54:28 +0000
committer Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-21 11:54:28 +0000
commit e6918187568dbd01842d8d1d2c808ce16a894239 (patch)
tree 64f88b554b444a49f656b6c656111a145cbbaa28 /doc/dev/developer_guide
parent Initial commit.
Adding upstream version 18.2.2. (upstream/18.2.2)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/dev/developer_guide')
-rw-r--r-- doc/dev/developer_guide/basic-workflow.rst 587
-rw-r--r-- doc/dev/developer_guide/dash-devel.rst 2748
-rw-r--r-- doc/dev/developer_guide/debugging-gdb.rst 43
-rw-r--r-- doc/dev/developer_guide/essentials.rst 346
-rw-r--r-- doc/dev/developer_guide/index.rst 25
-rw-r--r-- doc/dev/developer_guide/intro.rst 25
-rw-r--r-- doc/dev/developer_guide/issue-tracker.rst 39
-rw-r--r-- doc/dev/developer_guide/jaegertracing.rst 63
-rw-r--r-- doc/dev/developer_guide/merging.rst 138
-rw-r--r-- doc/dev/developer_guide/running-tests-locally.rst 171
-rw-r--r-- doc/dev/developer_guide/testing_integration_tests/index.rst 16
-rw-r--r-- doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-debugging-tips.rst 158
-rw-r--r-- doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-intro.rst 660
-rw-r--r-- doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-kernel.rst 71
-rw-r--r-- doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-workflow.rst 293
-rw-r--r-- doc/dev/developer_guide/testing_integration_tests/tests-sentry-developers-guide.rst 6
-rw-r--r-- doc/dev/developer_guide/tests-unit-tests.rst 177
17 files changed, 5566 insertions, 0 deletions
diff --git a/doc/dev/developer_guide/basic-workflow.rst b/doc/dev/developer_guide/basic-workflow.rst
new file mode 100644
index 000000000..27000fa2b
--- /dev/null
+++ b/doc/dev/developer_guide/basic-workflow.rst
@@ -0,0 +1,587 @@
+.. _basic workflow dev guide:
+
+Basic Workflow
+==============
+
+The following chart illustrates the basic Ceph development workflow:
+
+.. ditaa::
+
+            Upstream Code                      Your Local Environment
+
+           /----------\        git clone           /-------------\
+           |   Ceph   | -------------------------> | ceph/main   |
+           \----------/                            \-------------/
+                ^                                         |
+                |                                         | git branch fix_1
+                | git merge                               |
+                |                                         v
+           /----------------\  git commit --amend  /-------------\
+           |  ninja check   |--------------------> | ceph/fix_1  |
+           | ceph--qa--suite|                      \-------------/
+           \----------------/                             |
+                ^                                         | fix changes
+                |                                         | test changes
+                | review                                  | git commit
+                |                                         |
+                |                                         v
+           /--------------\                        /-------------\
+           |   github     |<---------------------- | ceph/fix_1  |
+           | pull request |         git push       \-------------/
+           \--------------/
+
+This page assumes that you are a new contributor with an idea for a bugfix or
+an enhancement, but you do not know how to proceed. Watch the `Getting Started
+with Ceph Development <https://www.youtube.com/watch?v=t5UIehZ1oLs>`_ video for
+a practical summary of this workflow.
+
+Updating the tracker
+--------------------
+
+Find the :ref:`issue-tracker` (Redmine) number of the bug you intend to fix. If
+no tracker issue exists, create one. There is only one case in which you do not
+have to create a Redmine tracker issue: the case of minor documentation changes.
+
+Simple documentation cleanup does not require a corresponding tracker issue.
+Major documentation changes do require a tracker issue. Major documentation
+changes include adding new documentation chapters or files, and making
+substantial changes to the structure or content of the documentation.
+
+A (Redmine) tracker ticket explains the issue (bug) to other Ceph developers to
+keep them informed as the bug nears resolution. Provide a useful, clear title
+and include detailed information in the description. When composing the title
+of the ticket, ask yourself "If I need to search for this ticket two years from
+now, which keywords am I likely to search for?" Then include those keywords in
+the title.
+
+If your tracker permissions are elevated, assign the bug to yourself by setting
+the ``Assignee`` field. If your tracker permissions have not been elevated,
+just add a comment with a short message that says "I am working on this issue".
+
+Ceph Workflow Overview
+----------------------
+
+Three repositories are involved in the Ceph workflow. They are:
+
+1. The upstream repository (ceph/ceph)
+2. Your fork of the upstream repository (your_github_id/ceph)
+3. Your local working copy of the repository (on your workstation)
+
+The procedure for making changes to the Ceph repository is as follows:
+
+#. Configure your local environment
+
+ #. :ref:`Create a fork<forking>` of the "upstream Ceph"
+ repository.
+
+ #. :ref:`Clone the fork<cloning>` to your local filesystem.
+
+#. Fix the bug
+
+ #. :ref:`Synchronize local main with upstream main<synchronizing>`.
+
+ #. :ref:`Create a bugfix branch<bugfix_branch>` in your local working copy.
+
+ #. :ref:`Make alterations to the local working copy of the repository in your
+ local filesystem<fixing_bug_locally>`.
+
+ #. :ref:`Push the changes in your local working copy to your fork<push_changes>`.
+
+#. Create a Pull Request to push the change upstream.
+
+ #. Create a Pull Request that asks for your changes to be added into the
+ "upstream Ceph" repository.
+
+Preparing Your Local Working Copy of the Ceph Repository
+--------------------------------------------------------
+
+The procedures in this section, "Preparing Your Local Working Copy of the Ceph
+Repository", must be followed only when you are first setting up your local
+environment. If this is your first time working with the Ceph project, then
+these commands are necessary and are the first commands that you should run.
+
+.. _forking:
+
+Creating a Fork of the Ceph Repository
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+See the `GitHub documentation
+<https://help.github.com/articles/fork-a-repo/#platform-linux>`_ for
+detailed instructions on forking. In short, if your GitHub username is
+"mygithubaccount", your fork of the upstream repo will appear at
+``https://github.com/mygithubaccount/ceph``.
+
+.. _cloning:
+
+Cloning Your Fork
+^^^^^^^^^^^^^^^^^
+
+After you have created your fork, clone it by running the following command:
+
+.. prompt:: bash $
+
+ git clone https://github.com/mygithubaccount/ceph
+
+You must fork the Ceph repository before you clone it. If you fail to fork,
+you cannot open a `GitHub pull request
+<https://docs.github.com/en/free-pro-team@latest/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request>`_.
+
+For more information on using GitHub, refer to `GitHub Help
+<https://help.github.com/>`_.
+
+Configuring Your Local Environment
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The commands in this section configure your local git environment so that it
+generates "Signed-off-by:" tags. These commands also set up your local
+environment so that it can stay synchronized with the upstream repository.
+
+These commands are necessary only during the initial setup of your local
+working copy, that is, the first time that you work with the Ceph repository.
+They are not optional, however: if you fail to run them, you will not be able
+to work on the Ceph repository.
+
+1. Configure your local git environment with your name and email address.
+
+ .. note::
+ These commands will work only from within the ``ceph/`` directory
+ that was created when you cloned your fork.
+
+ .. prompt:: bash $
+
+ git config user.name "FIRST_NAME LAST_NAME"
+ git config user.email "MY_NAME@example.com"
+
+2. Add the upstream repo as a "remote" and fetch it:
+
+ .. prompt:: bash $
+
+ git remote add ceph https://github.com/ceph/ceph.git
+ git fetch ceph
+
+ These commands fetch all the branches and commits from ``ceph/ceph.git`` into
+ the local git repo as ``remotes/ceph/$BRANCH_NAME``. The fetched branches can
+ be referenced as ``ceph/$BRANCH_NAME`` in local git commands.
+
+Fixing the Bug
+--------------
+
+.. _synchronizing:
+
+Synchronizing Local Main with Upstream Main
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In your local working copy, there is a copy of the ``main`` branch in
+``remotes/origin/main``. This is called "local main". This copy of the
+main branch (https://github.com/your_github_id/ceph.git) is "frozen in time"
+at the moment that you cloned it, but the upstream repo
+(https://github.com/ceph/ceph.git, typically abbreviated to ``ceph/ceph.git``)
+that it was forked from is not frozen in time: the upstream repo is still being
+updated by other contributors.
+
+Because upstream main is continually receiving updates from other
+contributors, the upstream repo will drift farther and farther from the state
+it was in when you forked and cloned it, and your fork will fall behind.
+
+Keep your fork's ``main`` branch synchronized with upstream main to reduce drift
+between your fork's main branch and the upstream main branch.
+
+Here are the commands for keeping your fork synchronized with the
+upstream repository:
+
+.. prompt:: bash $
+
+ git fetch ceph
+ git checkout main
+ git reset --hard ceph/main
+ git push -u origin main
+
+Follow this procedure often to keep your local ``main`` in sync with upstream
+``main``.
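+
+The effect of the four commands above can be tried out safely before running
+them against GitHub. The following is a minimal sketch, not an official Ceph
+script: throwaway local repositories stand in for ``ceph/ceph`` and for your
+fork, and the directory names, the ``commit`` helper, and the identity used for
+the test commits are all hypothetical.

```shell
set -eu

# Throwaway local repositories stand in for ceph/ceph ("upstream") and for
# your GitHub fork ("fork.git"); all paths and names here are hypothetical.
work=$(mktemp -d)

commit() {  # commit <repo> <message>: create an empty commit in <repo>
    git -C "$1" -c user.name='Example Dev' -c user.email='dev@example.com' \
        commit -q --allow-empty -m "$2"
}

git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/main
commit "$work/upstream" 'initial upstream commit'

git clone -q --bare "$work/upstream" "$work/fork.git"   # "fork" the upstream
git clone -q "$work/fork.git" "$work/local"             # clone the fork
git -C "$work/local" remote add ceph "$work/upstream"

commit "$work/upstream" 'upstream moves ahead'          # upstream drifts

# The four commands from this section, run inside the local working copy.
cd "$work/local"
git fetch -q ceph
git checkout -q main
git reset -q --hard ceph/main
git push -q -u origin main

upstream_head=$(git -C "$work/upstream" rev-parse main)
local_head=$(git rev-parse main)
fork_head=$(git -C "$work/fork.git" rev-parse main)
echo "upstream=$upstream_head local=$local_head fork=$fork_head"
```

+After the sequence, all three ``main`` branches point at the same commit. Note
+that ``git reset --hard`` discards any commits that exist only on your local
+``main``, which is one reason to do bugfix work on a separate branch.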
+
+If the command ``git status`` returns a line that reads "Untracked files", see
+:ref:`the procedure on updating submodules <update-submodules>`.
+
+.. _bugfix_branch:
+
+Creating a Bugfix branch
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Create a branch for your bugfix:
+
+.. prompt:: bash $
+
+ git checkout main
+ git checkout -b fix_1
+ git push -u origin fix_1
+
+The first command (``git checkout main``) makes sure that the bugfix branch
+"fix_1" is created from your local ``main`` branch, which matches the most
+recent state of upstream main if you have just synchronized it.
+
+The second command (git checkout -b fix_1) creates a "bugfix branch" called
+"fix_1" in your local working copy of the repository. The changes that you make
+in order to fix the bug will be committed to this branch.
+
+The third command (git push -u origin fix_1) pushes the bugfix branch from
+your local working repository to your fork of the upstream repository.
+
+.. _fixing_bug_locally:
+
+Fixing the bug in the local working copy
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+#. **Updating the tracker**
+
+ In the `Ceph issue tracker <https://tracker.ceph.com>`_, change the status
+ of the tracker issue to "In progress". This communicates to other Ceph
+ contributors that you have begun working on a fix, which helps to avoid
+ duplication of effort. If you don't have permission to change that field,
+ just comment that you are working on the issue.
+
+#. **Fixing the bug itself**
+
+ This guide cannot tell you how to fix the bug that you have chosen to fix.
+ This guide assumes that you know what requires improvement, and that you
+ know what to do to provide that improvement.
+
+ It might be that your fix is simple and requires only minimal testing. But
+ that's unlikely. It is more likely that the process of fixing your bug will
+ be iterative and will involve trial, error, skill, and patience.
+
+ For a detailed discussion of the tools available for validating bugfixes,
+ see the chapters on testing.
+
+Pushing the Fix to Your Fork
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You have finished work on the bugfix. You have tested the bugfix, and you
+believe that it works.
+
+#. Commit the changes to your local working copy.
+
+ Commit the changes to the ``fix_1`` branch of your local working copy by using
+ the ``--signoff`` option (here represented as the ``s`` portion of the ``-as``
+ flag; the ``a`` portion stages all modified tracked files):
+
+ .. prompt:: bash $
+
+ git commit -as
+
+ .. _push_changes:
+
+#. Push the changes to your fork:
+
+ Push the changes from the ``fix_1`` branch of your local working copy to the
+ ``fix_1`` branch of your fork of the upstream repository:
+
+ .. prompt:: bash $
+
+ git push origin fix_1
+
+ .. note::
+
+ In the command ``git push origin fix_1``, ``origin`` is the name of your
+ fork of the upstream Ceph repository, and can be thought of as a nickname
+ for ``git@github.com:username/ceph.git``, where ``username`` is your
+ GitHub username.
+
+ It is possible that ``origin`` is not the name of your fork. Discover the
+ name of your fork by running ``git remote -v``, as shown here:
+
+ .. code-block:: bash
+
+ $ git remote -v
+ ceph https://github.com/ceph/ceph.git (fetch)
+ ceph https://github.com/ceph/ceph.git (push)
+ origin git@github.com:username/ceph.git (fetch)
+ origin git@github.com:username/ceph.git (push)
+
+ The line::
+
+ origin git@github.com:username/ceph.git (fetch)
+
+ and the line::
+
+ origin git@github.com:username/ceph.git (push)
+
+ provide the information that "origin" is the name of your fork of the
+ Ceph repository.
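+
+Before pushing, it can be worth confirming that the commit message really
+carries the ``Signed-off-by:`` line that ``git commit -as`` is meant to add.
+The following is a minimal sketch: ``has_signoff`` is a hypothetical helper
+(not part of git or Ceph), and the pattern it uses is only an approximation of
+a well-formed sign-off line.

```shell
# has_signoff: succeed if the commit message on stdin contains a line of the
# form "Signed-off-by: Name <user@host>". Hypothetical helper for illustration.
has_signoff() {
    grep -Eq '^Signed-off-by: .+ <[^<> ]+@[^<> ]+>$'
}

signed_msg='doc: fix typo in basic-workflow

Signed-off-by: FIRST_NAME LAST_NAME <MY_NAME@example.com>'
unsigned_msg='doc: fix typo in basic-workflow'

if printf '%s\n' "$signed_msg" | has_signoff; then
    signed_result=ok
else
    signed_result=missing
fi

if printf '%s\n' "$unsigned_msg" | has_signoff; then
    unsigned_result=ok
else
    unsigned_result=missing
fi

echo "signed: $signed_result, unsigned: $unsigned_result"
```

+In a real working copy, the same helper could be fed the latest commit message
+with ``git log -1 --format=%B | has_signoff``.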
+
+
+Opening a GitHub pull request
+-----------------------------
+
+After you have pushed the bugfix to your fork, open a GitHub pull request
+(PR). This makes your bugfix visible to the community of Ceph contributors.
+They will review it. They may perform additional testing on your bugfix, and
+they might request changes to the bugfix.
+
+Be prepared to receive suggestions and constructive criticism in the form of
+comments within the PR.
+
+If you don't know how to create and manage pull requests, read `this GitHub
+pull request tutorial`_.
+
+.. _`this GitHub pull request tutorial`:
+ https://help.github.com/articles/using-pull-requests/
+
+To learn what constitutes a "good" pull request, see
+the `Git Commit Good Practice`_ article at the `OpenStack Project Wiki`_.
+
+.. _`Git Commit Good Practice`: https://wiki.openstack.org/wiki/GitCommitMessages
+.. _`OpenStack Project Wiki`: https://wiki.openstack.org/wiki/Main_Page
+
+See also our own `Submitting Patches
+<https://github.com/ceph/ceph/blob/main/SubmittingPatches.rst>`_ document.
+
+After your pull request (PR) has been opened, update the :ref:`issue-tracker`
+by adding a comment directing other contributors to your PR. The comment can be
+as simple as this::
+
+ *PR*: https://github.com/ceph/ceph/pull/$NUMBER_OF_YOUR_PULL_REQUEST
+
+Understanding Automated PR validation
+-------------------------------------
+
+When you create or update your PR, the Ceph project's `Continuous Integration
+(CI) <https://en.wikipedia.org/wiki/Continuous_integration>`_ infrastructure
+automatically tests it. At the time of this writing (May 2022), the automated
+CI testing included many tests. These five are among them:
+
+#. a test to check that the commits are properly signed (see :ref:`submitting-patches`)
+#. a test to check that the documentation builds
+#. a test to check that the submodules are unmodified
+#. a test to check that the API is in order
+#. a :ref:`make check<make-check>` test
+
+Additional tests may be run depending on which files your PR modifies.
+
+The :ref:`make check<make-check>` test builds the PR and runs it through a
+battery of tests. These tests run on servers that are operated by the Ceph
+Continuous Integration (CI) team. When the tests have completed their run, the
+result is shown on GitHub in the pull request itself.
+
+Test your modifications before you open a PR. Refer to the chapters
+on testing for details.
+
+Notes on PR make check test
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The GitHub :ref:`make check<make-check>` test is driven by a Jenkins instance.
+
+Jenkins merges your PR branch into the latest version of the base branch before
+it starts any tests. This means that you don't have to rebase the PR in order
+to pick up any fixes.
+
+You can trigger PR tests at any time by adding a comment to the PR. The
+comment should contain the string "test this please". Since a human who is
+subscribed to the PR might interpret that as a request for them to test
+the PR, you must address Jenkins directly. For example, write "jenkins retest
+this please". If you need to run only one of the tests, you can request it with
+a command like "jenkins test signed". A list of these requests is automatically
+added to the end of each new PR's description, so check there to find the
+single test you need.
+
+If there is a build failure and you aren't sure what caused it, check the
+:ref:`make check<make-check>` log. To access the make check log, click the
+"details" link (next to the :ref:`make check<make-check>` test in the PR) to
+enter the Jenkins web GUI. Then click "Console Output" (on the left).
+
+Jenkins is configured to search logs for strings that are known to have been
+associated with :ref:`make check<make-check>` failures in the past. However,
+there is no guarantee that these known strings are associated with any given
+:ref:`make check<make-check>` failure. You'll have to read through the log to
+determine the cause of your specific failure.
+
+Integration tests AKA ceph-qa-suite
+-----------------------------------
+
+It may be necessary to test your fix on real Ceph clusters that run on physical
+or virtual hardware. Tests designed for this purpose live in the `ceph/qa
+sub-directory`_ and are run via the `teuthology framework`_.
+
+.. _`ceph/qa sub-directory`: https://github.com/ceph/ceph/tree/main/qa/
+.. _`teuthology repository`: https://github.com/ceph/teuthology
+.. _`teuthology framework`: https://github.com/ceph/teuthology
+
+The Ceph community has access to the `Sepia lab
+<https://wiki.sepia.ceph.com/doku.php>`_ where `integration tests`_ can be run
+on physical hardware.
+
+Other contributors might add tags like ``needs-qa`` to your PR. This allows PRs
+to be merged into a single branch and then efficiently tested together.
+Teuthology test suites can take hours (and even days in some cases) to
+complete, so batching tests reduces contention for resources and saves a lot of
+time.
+
+To request access to the Sepia lab, start `here
+<https://wiki.sepia.ceph.com/doku.php?id=vpnaccess>`_.
+
+Integration testing is discussed in more detail in the `integration
+tests`_ chapter.
+
+.. _integration tests: ../testing_integration_tests/tests-integration-testing-teuthology-intro
+
+Code review
+-----------
+
+Once your bugfix has been thoroughly tested, or even during this process,
+it will be subjected to code review by other developers. This typically
+takes the form of comments in the PR itself, but can be supplemented
+by discussions on :ref:`irc` and the :ref:`mailing-list`.
+
+Amending your PR
+----------------
+
+While your PR is going through testing and `Code Review`_, you can
+modify it at any time by editing files in your local branch.
+
+After updates are committed locally (to the ``fix_1`` branch in our
+example), they need to be pushed to GitHub so they appear in the PR.
+
+Modifying the PR is done by adding commits to the ``fix_1`` branch upon
+which it is based, often followed by rebasing to modify the branch's git
+history. See `this tutorial
+<https://www.atlassian.com/git/tutorials/rewriting-history>`_ for a good
+introduction to rebasing. When you are done with your modifications, you
+will need to force push your branch with:
+
+.. prompt:: bash $
+
+ git push --force origin fix_1
+
+Why do we take these extra steps instead of simply adding additional commits
+to the PR? It is best practice for a PR to consist of a single commit; this
+makes for clean history, eases peer review of your changes, and facilitates
+merges. In rare circumstances it also makes it easier to cleanly revert
+changes.
+
+Merging
+-------
+
+The bugfix process completes when a project lead merges your PR.
+
+When this happens, it is a signal for you (or the lead who merged the PR)
+to change the :ref:`issue-tracker` status to "Resolved". Some issues may be
+flagged for backporting, in which case the status should be changed to
+"Pending Backport" (see the :ref:`backporting` chapter for details).
+
+See also :ref:`merging` for more information on merging.
+
+Proper Merge Commit Format
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This is the most basic form of a merge commit::
+
+ doc/component: title of the commit
+
+ Reviewed-by: Reviewer Name <rname@example.com>
+
+This consists of two parts:
+
+#. The title of the commit / PR to be merged.
+#. The name and email address of the reviewer. Enclose the reviewer's email
+ address in angle brackets.
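+
+The two-part format above can also be produced mechanically. The following is
+a minimal sketch: ``merge_message`` is a hypothetical helper written for this
+illustration, not a tool shipped with Ceph, and the title, name, and address
+are the placeholders from the example above.

```shell
# merge_message <title> <reviewer name> <reviewer email>
# Print a basic merge commit message: the title, a blank line, and a
# Reviewed-by line with the email address enclosed in angle brackets.
merge_message() {
    printf '%s\n\nReviewed-by: %s <%s>\n' "$1" "$2" "$3"
}

message=$(merge_message 'doc/component: title of the commit' \
                        'Reviewer Name' 'rname@example.com')
printf '%s\n' "$message"
```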
+
+Using a browser extension to auto-fill the merge message
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If you use a browser for merging GitHub PRs, the easiest way to fill in
+the merge message is with the `"Ceph GitHub Helper Extension"
+<https://github.com/tspmelo/ceph-github-helper>`_ (available for `Chrome
+<https://chrome.google.com/webstore/detail/ceph-github-helper/ikpfebikkeabmdnccbimlomheocpgkmn>`_
+and `Firefox <https://addons.mozilla.org/en-US/firefox/addon/ceph-github-helper/>`_).
+
+After enabling this extension, if you go to a GitHub PR page, a vertical helper
+will be displayed at the top-right corner. If you click on the user silhouette
+button, the merge message input will be automatically populated.
+
+Using .githubmap to Find a Reviewer's Email Address
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+If you cannot find the email address of the reviewer on his or her GitHub
+page, you can look it up in the **.githubmap** file, which can be found in
+the repository at **/ceph/.githubmap**.
+
+Using "git log" to find a Reviewer's Email Address
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+If you cannot find a reviewer's email address by using the above methods, you
+can search the git log for their email address. Reviewers are likely to have
+committed something before. If they have made previous contributions, the git
+log will probably contain their email address.
+
+Use the following command:
+
+.. prompt:: bash [branch-under-review]$
+
+ git log
+
+Using ptl-tool to Generate Merge Commits
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Another method of generating merge commits involves using Patrick Donnelly's
+**ptl-tool** to pull commits. This tool can be found at
+**/ceph/src/script/ptl-tool.py**. Merge commits that have been generated by
+the **ptl-tool** have the following form::
+
+ Merge PR #36257 into main
+ * refs/pull/36257/head:
+ client: move client_lock to _unmount()
+ client: add timer_lock support
+ Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
+
+Miscellaneous
+-------------
+
+--set-upstream
+^^^^^^^^^^^^^^
+
+If you forget to include the ``--set-upstream origin x`` option in your ``git
+push`` command, you will see the following error message:
+
+::
+
+ fatal: The current branch {x} has no upstream branch.
+ To push the current branch and set the remote as upstream, use
+ git push --set-upstream origin {x}
+
+To set up git to automatically create the upstream branch that corresponds to
+the branch in your local working copy, run this command. Because it uses
+``--global``, the setting applies to every repository on your workstation, not
+only to ``ceph/``:
+
+.. prompt:: bash $
+
+ git config --global push.autoSetupRemote true
+
+Deleting a Branch Locally
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To delete the branch named ``localBranchName`` from the local working copy, run
+a command of this form:
+
+.. prompt:: bash $
+
+ git branch -d localBranchName
+
+Deleting a Branch Remotely
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To delete the branch named ``remoteBranchName`` from the ``origin`` remote
+(which is your fork of ``ceph/ceph``, as described in :ref:`forking`), run
+a command of this form:
+
+.. prompt:: bash $
+
+ git push origin --delete remoteBranchName
+
+Searching a File Longitudinally for a String
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To search for the commit that introduced a given string (in this example, that
+string is ``foo``) into a given file (in this example, that file is
+``file.rst``), run a command of this form:
+
+.. prompt:: bash $
+
+ git log -S 'foo' file.rst
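+
+To see how ``-S`` narrows the history, the search can be tried on a small
+scratch repository. The following is a minimal sketch: the repository, the
+commit messages, the ``commit_file`` helper, and the identity used for the
+commits are all made up for the illustration.

```shell
set -eu

repo=$(mktemp -d)   # throwaway repository; every name below is hypothetical
git init -q "$repo"

commit_file() {  # commit_file <message>: stage file.rst and commit it
    git -C "$repo" add file.rst
    git -C "$repo" -c user.name='Example Dev' -c user.email='dev@example.com' \
        commit -q -m "$1"
}

printf 'alpha\n'           > "$repo/file.rst"; commit_file 'add file'
printf 'alpha\nfoo\n'      > "$repo/file.rst"; commit_file 'introduce foo'
printf 'alpha\nfoo\nbar\n' > "$repo/file.rst"; commit_file 'add bar'

# -S lists only commits that change the number of occurrences of the string.
found=$(git -C "$repo" log -S 'foo' --format=%s -- file.rst)
echo "$found"
```

+Only the commit that added ``foo`` is reported. The later commit touches the
+same file but leaves the number of occurrences of ``foo`` unchanged, so it is
+skipped.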
diff --git a/doc/dev/developer_guide/dash-devel.rst b/doc/dev/developer_guide/dash-devel.rst
new file mode 100644
index 000000000..1277cecc5
--- /dev/null
+++ b/doc/dev/developer_guide/dash-devel.rst
@@ -0,0 +1,2748 @@
+.. _dashdevel:
+
+Ceph Dashboard Developer Documentation
+======================================
+
+.. contents:: Table of Contents
+
+Feature Design
+--------------
+
+To promote collaboration on new Ceph Dashboard features, the first step is
+the definition of a design document. These documents then form the basis of
+implementation scope and permit wider participation in the evolution of the
+Ceph Dashboard UI.
+
+.. toctree::
+ :maxdepth: 1
+ :caption: Design Documents:
+
+ UI Design Goals <../dashboard/ui_goals>
+
+
+Preliminary Steps
+-----------------
+
+The following documentation chapters expect a running Ceph cluster and at
+least a running ``dashboard`` manager module (with few exceptions). This
+chapter gives an introduction on how to set up such a system for development,
+without the need to set up a full-blown production environment. All options
+introduced in this chapter are based on a so-called ``vstart`` environment.
+
+.. note::
+
+ Every ``vstart`` environment needs Ceph `to be compiled`_ from its GitHub
+ repository, though Docker environments simplify that step by providing a
+ shell script that contains those instructions.
+
+ One exception to this rule is the `build-free`_ capability of
+ `ceph-dev`_. See below for more information.
+
+.. _to be compiled: https://docs.ceph.com/docs/master/install/build-ceph/
+
+vstart
+~~~~~~
+
+"vstart" is actually a shell script in the ``src/`` directory of the Ceph
+repository (``src/vstart.sh``). It is used to start a single node Ceph
+cluster on the machine where it is executed. Several required and some
+optional Ceph internal services are started automatically when it is used to
+start a Ceph cluster. vstart is the basis for the three most commonly used
+development environments in Ceph Dashboard.
+
+You can read more about vstart in :ref:`Deploying a development cluster
+<dev_deploying_a_development_cluster>`. Additional information for developers
+can also be found in the `Developer Guide`_.
+
+.. _Developer Guide: https://docs.ceph.com/docs/master/dev/quick_guide/
+
+Host-based vs Docker-based Development Environments
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This document introduces you to three different development environments, all
+based on vstart. Those are:
+
+- vstart running on your host system
+
+- vstart running in a Docker environment
+
+ * ceph-dev-docker_
+ * ceph-dev_
+
+ Besides their independent development branches and sometimes slightly
+ different approaches, they also differ with respect to their underlying
+ operating systems.
+
+ ========= ====================== ========
+ Release   ceph-dev-docker        ceph-dev
+ ========= ====================== ========
+ Mimic     openSUSE Leap 15       CentOS 7
+ Nautilus  openSUSE Leap 15       CentOS 7
+ Octopus   openSUSE Leap 15.2     CentOS 8
+ --------- ---------------------- --------
+ Master    openSUSE Tumbleweed    CentOS 8
+ ========= ====================== ========
+
+.. note::
+
+ Regardless of which of these environments you choose, you need to
+ compile Ceph in that environment. If you compiled Ceph on your host system,
+ you would have to recompile it on Docker to be able to switch to a Docker
+ based solution. The same is true vice versa. If you previously used a
+ Docker development environment and compiled Ceph there and you now want to
+ switch to your host system, you will also need to recompile Ceph (or
+ compile Ceph using another separate repository).
+
+ `ceph-dev`_ is an exception to this rule as one of the options it provides
+ is `build-free`_. This is accomplished through a Ceph installation using
+ RPM system packages. You will still be able to work with a local GitHub
+ repository like you are used to.
+
+
+Development environment on your host system
+...........................................
+
+- No need to learn or have experience with Docker, jump in right away.
+
+- Limited number of scripts to support automation (like Ceph compilation).
+
+- No pre-configured easy-to-start services (Prometheus, Grafana, etc).
+
+- Limited number of host operating systems supported, depending on which
+ Ceph version is supposed to be used.
+
+- Dependencies need to be installed on your host.
+
+- You might find yourself in the situation where you need to upgrade your
+ host operating system (for instance due to a change of the GCC version used
+ to compile Ceph).
+
+
+Development environments based on Docker
+........................................
+
+- Some overhead in learning Docker if you are not used to it yet.
+
+- Both Docker projects provide you with scripts that help you get started
+ and automate recurring tasks.
+
+- Both Docker environments come with partly pre-configured external services
+ which can be used to attach to or complement Ceph Dashboard features, like
+
+ - Prometheus
+ - Grafana
+ - Node-Exporter
+ - Shibboleth
+ - HAProxy
+
+- Works independently of the operating system you use on your host.
+
+
+.. _build-free: https://github.com/rhcs-dashboard/ceph-dev#quick-install-rpm-based
+
+vstart on your host system
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The vstart script is usually called from your ``build/`` directory like so:
+
+.. code::
+
+ ../src/vstart.sh -n -d
+
+In this case ``-n`` ensures that a new vstart cluster is created and that a
+possibly previously created cluster isn't re-used. ``-d`` enables debug
+messages in log files. There are several more options to choose from. You can
+get a list using the ``--help`` argument.
+
+At the end of the output of vstart, there should be information about the
+dashboard and its URLs::
+
+ vstart cluster complete. Use stop.sh to stop. See out/* (e.g. 'tail -f out/????') for debug output.
+
+ dashboard urls: https://192.168.178.84:41259, https://192.168.178.84:43259, https://192.168.178.84:45259
+ w/ user/pass: admin / admin
+ restful urls: https://192.168.178.84:42259, https://192.168.178.84:44259, https://192.168.178.84:46259
+ w/ user/pass: admin / 598da51f-8cd1-4161-a970-b2944d5ad200
+
+During development (especially backend development), you will occasionally
+want to check whether the dashboard manager module is still running. To do so,
+you can call ``./bin/ceph mgr services`` manually. It will list all the URLs of
+successfully enabled services. Only URLs of services which are available over
+HTTP(S) will be listed there. Ceph Dashboard is one of these services. It
+should look similar to the following output:
+
+.. code::
+
+ $ ./bin/ceph mgr services
+ {
+ "dashboard": "https://home:41931/",
+ "restful": "https://home:42931/"
+ }
+
+By default, this environment uses a randomly chosen port for Ceph Dashboard,
+and you need to use this command to find out which port was chosen.
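+
+Because the port differs from one cluster to the next, scripts have to read
+the URL from the command's output rather than hard-code it. The following is a
+minimal sketch, assuming output shaped like the sample shown earlier in this
+section: ``dashboard_url`` is a hypothetical helper, not part of Ceph, and a
+JSON-aware tool would be more robust than ``sed``.

```shell
# dashboard_url: read the output of "ceph mgr services" on stdin and print the
# value of the "dashboard" key. Hypothetical helper; it assumes one key per
# line, as in the sample output.
dashboard_url() {
    sed -n 's/.*"dashboard": *"\([^"]*\)".*/\1/p'
}

# Sample output copied from this section; a real run would use
#   ./bin/ceph mgr services | dashboard_url
sample='{
    "dashboard": "https://home:41931/",
    "restful": "https://home:42931/"
}'

url=$(printf '%s\n' "$sample" | dashboard_url)
echo "$url"
```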
+
+Docker
+~~~~~~
+
+Docker development environments usually ship with a lot of useful scripts.
+``ceph-dev-docker`` for instance contains a file called ``start-ceph.sh``,
+which cleans up log files, always starts a Rados Gateway service, sets some
+Ceph Dashboard configuration options and automatically runs a frontend proxy,
+all before or after starting up your vstart cluster.
+
+Instructions on how to use those environments are contained in their
+respective repository README files.
+
+- ceph-dev-docker_
+- ceph-dev_
+
+.. _ceph-dev-docker: https://github.com/ricardoasmarques/ceph-dev-docker
+.. _ceph-dev: https://github.com/rhcs-dashboard/ceph-dev
+
+Frontend Development
+--------------------
+
+Before you can start the dashboard from within a development environment, you
+will need to generate the frontend code and either use a compiled and running
+Ceph cluster (e.g. started by ``vstart.sh``) or the standalone development web
+server.
+
+The build process is based on `Node.js <https://nodejs.org/>`_ and requires the
+`Node Package Manager <https://www.npmjs.com/>`_ ``npm`` to be installed.
+
+Prerequisites
+~~~~~~~~~~~~~
+
+ * Node 18.17.0 or higher
+ * NPM 9.6.7 or higher
+
+nodeenv:
+ During Ceph's build we create a virtualenv with ``node`` and ``npm``
+ installed, which can be used as an alternative to installing node/npm in your
+ system.
+
+ If you want to use the node installed in the virtualenv you just need to
+ activate the virtualenv before you run any npm commands. To activate it run
+ ``. build/src/pybind/mgr/dashboard/node-env/bin/activate``.
+
+ Once you finish, you can simply run ``deactivate`` and exit the virtualenv.
+
+Angular CLI:
+ If you do not have the `Angular CLI <https://github.com/angular/angular-cli>`_
+ installed globally, then you need to prefix ``ng`` commands with
+ ``npm run``.
+
+Package installation
+~~~~~~~~~~~~~~~~~~~~
+
+Run ``npm ci`` in directory ``src/pybind/mgr/dashboard/frontend`` to
+install the required packages locally.
+
+Adding or updating packages
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Run the following commands to add/update a package::
+
+ npm install <PACKAGE_NAME>
+ npm ci
+
+Setting up a Development Server
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Create the ``proxy.conf.json`` file based on ``proxy.conf.json.sample``.
+
+Run ``npm start`` for a dev server.
+Navigate to ``http://localhost:4200/``. The app will automatically
+reload if you change any of the source files.
+
+Code Scaffolding
+~~~~~~~~~~~~~~~~
+
+Run ``ng generate component component-name`` to generate a new
+component. You can also use
+``ng generate directive|pipe|service|class|guard|interface|enum|module``.
+
+Build the Project
+~~~~~~~~~~~~~~~~~
+
+Run ``npm run build`` to build the project. The build artifacts will be
+stored in the ``dist/`` directory. Use the ``--prod`` flag for a
+production build (``npm run build -- --prod``). Navigate to ``https://localhost:8443``.
+
+Build the Code Documentation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Run ``npm run doc-build`` to generate code docs in the ``documentation/``
+directory. To make them accessible locally for a web browser, run
+``npm run doc-serve`` and they will become available at ``http://localhost:8444``.
+With ``npm run compodoc -- <opts>`` you may
+`fully configure it <https://compodoc.app/guides/usage.html>`_.
+
+Code linting and formatting
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We use the following tools to lint and format the code in all our TS, SCSS and
+HTML files:
+
+- `codelyzer <http://codelyzer.com/>`_
+- `html-linter <https://github.com/chinchiheather/html-linter>`_
+- `htmllint-cli <https://github.com/htmllint/htmllint-cli>`_
+- `Prettier <https://prettier.io/>`_
+- `ESLint <https://eslint.org/>`_
+- `stylelint <https://stylelint.io/>`_
+
+We added two npm scripts to help run these tools:
+
+- ``npm run lint`` checks frontend files against all linters
+- ``npm run fix`` tries to fix all the detected linting errors
+
+Ceph Dashboard and Bootstrap
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Currently we are using Bootstrap on the Ceph Dashboard as a CSS framework. This means that most of our SCSS and HTML
+code can make use of all the utilities and other advantages Bootstrap offers. In the past we often used our own
+custom styles, which led to a growing number of single-use variables and doubly defined variables that were
+sometimes never removed. It also led to inconsistent styling, because people forgot to change a color or to adjust
+a custom SCSS class.
+
+To get the current version of Bootstrap used inside Ceph please refer to the ``package.json`` and search for:
+
+- ``bootstrap``: For the Bootstrap version used.
+- ``@ng-bootstrap``: For the version of the Angular bindings which we are using.
+
+In the future, please do the following when visiting a component:
+
+- Does this HTML/SCSS code use custom code? If yes, is it needed? If not, clean it up before changing the things
+ you want to fix or change.
+- If you are creating a new component, make use of Bootstrap as much as reasonably possible. Don't try to
+ reinvent the wheel.
+- If possible, look up whether Bootstrap has guidelines on how to extend it properly to achieve what you want to
+ achieve.
+
+The more Bootstrap-like our code is, the easier it is to theme and maintain, and the fewer bugs we will have. Also,
+since Bootstrap is a framework that is designed with usability and user experience in mind, using it improves both.
+The biggest benefit of all is that there is less code for us to maintain, which makes the code easier to read for
+beginners and even easier for people who are already familiar with it.
+
+Writing Unit Tests
+~~~~~~~~~~~~~~~~~~
+
+To write unit tests efficiently, we have a small collection of tools that
+we use within test suites.
+
+Those tools can be found under
+``src/pybind/mgr/dashboard/frontend/src/testing/``, especially take
+a look at ``unit-test-helper.ts``.
+
+There you will be able to find:
+
+``configureTestBed`` that replaces the initial ``TestBed``
+methods. It takes the same arguments as ``TestBed.configureTestingModule``.
+Using it will run your tests a lot faster in development, as it doesn't
+recreate everything from scratch on every test. To use the default behaviour
+pass ``true`` as the second argument.
+
+``PermissionHelper`` to help determine if
+the correct actions are shown based on the current permissions and selection
+in a list.
+
+``FormHelper`` which makes testing a form a lot easier
+with a few simple methods. It allows you to set a control or multiple
+controls, expect a control to be valid or to have an error, or do both with
+one method. Additionally, you can expect a template element or multiple elements
+to be visible in the rendered template.
+
+Running Unit Tests
+~~~~~~~~~~~~~~~~~~
+
+Run ``npm run test`` to execute the unit tests via `Jest
+<https://facebook.github.io/jest/>`_.
+
+If you get errors on all tests, it could be because `Jest
+<https://facebook.github.io/jest/>`__ or something else was updated.
+There are a few ways to try to resolve this:
+
+- Remove all modules with ``rm -rf dist node_modules`` and run ``npm install``
+ again in order to reinstall them
+- Clear the cache of jest by running ``npx jest --clearCache``
+
+Running End-to-End (E2E) Tests
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We use `Cypress <https://www.cypress.io/>`__ to run our frontend E2E tests.
+
+E2E Prerequisites
+.................
+
+You need to build the frontend beforehand.
+
+In some environments, depending on your user permissions and the ``CYPRESS_CACHE_FOLDER``,
+you might need to run ``npm ci`` with the ``--unsafe-perm`` flag.
+
+You might need to install additional packages to be able to run Cypress.
+Please run ``npx cypress verify`` to verify it.
+
+run-frontend-e2e-tests.sh
+.........................
+
+Our ``run-frontend-e2e-tests.sh`` script is the go-to solution when you wish to
+do a full-scale E2E run.
+It verifies that everything needed is installed, starts a new vstart cluster
+and runs the full test suite.
+
+Start all frontend E2E tests with::
+
+ $ cd src/pybind/mgr/dashboard
+ $ ./run-frontend-e2e-tests.sh
+
+Report:
+ You can follow the e2e report on the terminal and you can find the screenshots
+ of failed test cases by opening the following directory::
+
+ src/pybind/mgr/dashboard/frontend/cypress/screenshots/
+
+Device:
+ You can force the script to use a specific device with the ``-d`` flag::
+
+ $ ./run-frontend-e2e-tests.sh -d <chrome|chromium|electron|docker>
+
+Remote:
+ By default this script will stop and start a new vstart cluster.
+ If you want to run the tests outside the ceph environment, you will need to
+ manually define the dashboard url using ``-r`` and, optionally, credentials
+ (``-u``, ``-p``)::
+
+ $ ./run-frontend-e2e-tests.sh -r <DASHBOARD_URL> -u <E2E_LOGIN_USER> -p <E2E_LOGIN_PWD>
+
+Note:
+ When using docker as your device, you might need to run the script with sudo
+ permissions.
+
+run-cephadm-e2e-tests.sh
+.........................
+
+``run-cephadm-e2e-tests.sh`` runs a subset of E2E tests to verify that the Dashboard and cephadm as
+Orchestrator backend behave correctly.
+
+Prerequisites: you need to install `KCLI
+<https://kcli.readthedocs.io/en/latest/>`_ and Node.js on your local machine.
+
+Configure KCLI plan requirements::
+
+ $ sudo chown -R $(id -un) /var/lib/libvirt/images
+ $ mkdir -p /var/lib/libvirt/images/ceph-dashboard
+ $ kcli create pool -p /var/lib/libvirt/images/ceph-dashboard ceph-dashboard
+ $ kcli create network -c 192.168.100.0/24 ceph-dashboard
+
+Note:
+ This script is intended to run as a Jenkins job, so the cleanup is triggered only in a Jenkins
+ environment. When running locally, shut down the cluster yourself when desired (e.g. after debugging).
+
+Start E2E tests by running::
+
+ $ cd <your/ceph/repo/dir>
+ $ sudo chown -R $(id -un) src/pybind/mgr/dashboard/frontend/{dist,node_modules,src/environments}
+ $ ./src/pybind/mgr/dashboard/ci/cephadm/run-cephadm-e2e-tests.sh
+
+Note:
+ On Fedora 35, a permission error can occur when trying to mount the shared_folders. This can be
+ fixed by running::
+
+ $ sudo setfacl -R -m u:qemu:rwx <abs-path-to-your-user-home>
+
+ or by setting the appropriate permissions on your $HOME directory.
+
+You can also start a cluster in development mode (so the frontend build starts in watch mode and you
+only have to reload the page for the changes to be reflected) by running::
+
+ $ ./src/pybind/mgr/dashboard/ci/cephadm/start-cluster.sh --dev-mode
+
+Note:
+ Add ``--expanded`` if you need a cluster ready to deploy services (one with enough monitor
+ daemons spread across different hosts and enough OSDs).
+
+Test your changes by running::
+
+ $ ./src/pybind/mgr/dashboard/ci/cephadm/run-cephadm-e2e-tests.sh
+
+Shut down the cluster by running::
+
+ $ kcli delete plan -y ceph
+ $ # In development mode, also kill the npm build watch process (e.g., pkill -f "ng build")
+
+Other running options
+.....................
+
+During active development, it is not recommended to run the previous script,
+as it is not prepared for constant file changes.
+Instead you should use one of the following commands:
+
+- ``npm run e2e`` - This will run ``ng serve`` and open the Cypress Test Runner.
+- ``npm run e2e:ci`` - This will run ``ng serve`` and run the Cypress Test Runner once.
+- ``npx cypress run`` - This calls cypress directly and will run the Cypress Test Runner.
+ You need to have a running frontend server.
+- ``npx cypress open`` - This calls cypress directly and will open the Cypress Test Runner.
+ You need to have a running frontend server.
+
+Calling Cypress directly has the advantage that you can use any of the available
+`flags <https://docs.cypress.io/guides/guides/command-line.html#cypress-run>`__
+to customize your test run and you don't need to start a frontend server each time.
+
+Using one of the ``open`` commands will open a Cypress application where you
+can see all the test files you have and run each individually.
+It runs in watch mode, so if you make any changes to test files,
+it will retrigger the test run.
+This cannot be used inside docker, as it requires an X11 environment to open.
+
+By default Cypress will look for the web page at ``https://localhost:4200/``.
+If you are serving it at a different URL, you will need to configure it by
+exporting the environment variable ``CYPRESS_BASE_URL`` with the new value.
+E.g.: ``CYPRESS_BASE_URL=https://localhost:41076/ npx cypress open``
+
+CYPRESS_CACHE_FOLDER
+.....................
+
+When installing cypress via npm, a binary of the cypress app will also be
+downloaded and stored in a cache folder.
+This removes the need to download it every time you run ``npm ci`` or even when
+using cypress in a separate project.
+
+By default Cypress uses ``~/.cache`` to store the binary.
+To prevent changes to the user home directory, we have changed this folder to
+``/ceph/build/src/pybind/mgr/dashboard/cypress``, so when you build ceph or run
+``run-frontend-e2e-tests.sh`` this is the directory Cypress will use.
+
+When using any other command to install or run cypress,
+it will go back to the default directory. It is recommended that you export the
+``CYPRESS_CACHE_FOLDER`` environment variable with a fixed directory, so you always
+use the same directory no matter which command you use.
+
+
+Writing End-to-End Tests
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PageHelper class
+....................
+
+The ``PageHelper`` class is intended for general-purpose code that
+can be used on various pages or suites.
+
+Examples are:
+
+- ``navigateTo()`` - navigates to a specific page and waits for it to load
+- ``getFirstTableCell()`` - returns the first table cell. You can also pass a
+ string with the desired content and it will return the first cell that
+ contains it.
+- ``getTabsCount()`` - returns the number of tabs
+
+Every method that could be useful on several pages belongs there. Also, methods
+which enhance the derived classes of the PageHelper belong there. A good
+example for such a case is the ``restrictTo()`` decorator. It ensures that a
+method implemented in a subclass of PageHelper is called on the correct page.
+It will also show a developer-friendly warning if this is not the case.
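+
+The idea behind such a guard can be sketched without Cypress. The example
+below is only an illustration under stated assumptions: it uses a plain
+wrapper function instead of decorator syntax, and ``currentPath`` as well as
+``countPoolRows`` are hypothetical stand-ins. It does not reproduce the
+dashboard's actual ``restrictTo()`` implementation.
+

```typescript
// Stand-in for the browser location; in a real suite Cypress provides this.
let currentPath = '#/pool';

// Wraps `fn` so that it refuses to run unless `currentPath` equals `page`.
function restrictTo<A extends unknown[], R>(
  page: string,
  methodName: string,
  fn: (...args: A) => R
): (...args: A) => R {
  return (...args: A): R => {
    if (currentPath !== page) {
      // Developer-friendly message naming the method and both paths.
      throw new Error(
        `Method ${methodName} is supposed to be run on path "${page}", ` +
          `but was run on "${currentPath}"`
      );
    }
    return fn(...args);
  };
}

// Hypothetical page-specific helper that only makes sense on the pool page.
const countPoolRows = restrictTo('#/pool', 'countPoolRows', (rows: string[]) => rows.length);

console.log(countPoolRows(['rbd', 'cephfs_data']));
```

+
+Calling ``countPoolRows()`` while ``currentPath`` points at another page
+throws an error that names the method and both paths, which is the kind of
+feedback the guard is meant to give.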
+
+Subclasses of PageHelper
+........................
+
+Helper Methods
+""""""""""""""
+
+To make code that is specific to a particular suite reusable, put it in a
+class derived from ``PageHelper``. For instance, for the pool suite, such
+methods would be ``create()``, ``exist()`` and ``delete()``. These methods are
+specific to a pool but are also useful for other suites.
+
+Methods that return HTML elements which can only be found on a specific page
+should be implemented in the ``PageHelper`` subclass for that page, either
+inside its helper methods or as separate methods of their own.
+
+Using PageHelpers
+"""""""""""""""""
+
+In any suite, an instance of the specific ``Helper`` class should be
+instantiated and called directly.
+
+.. code:: TypeScript
+
+ const pools = new PoolPageHelper();
+
+ it('should create a pool', () => {
+ pools.exist(poolName, false);
+ pools.navigateTo('create');
+ pools.create(poolName, 8);
+ pools.exist(poolName, true);
+ });
+
+Code Style
+..........
+
+Please refer to the official `Cypress Core Concepts
+<https://docs.cypress.io/guides/core-concepts/introduction-to-cypress.html#Cypress-Can-Be-Simple-Sometimes>`__
+for a better insight on how to write and structure tests.
+
+``describe()`` vs ``it()``
+""""""""""""""""""""""""""
+
+Both ``describe()`` and ``it()`` are function blocks, meaning that any
+executable code necessary for the test can be contained in either block.
+However, TypeScript scoping rules still apply, therefore any variables declared
+in a ``describe()`` are available to the ``it()`` blocks inside of it.
+
+``describe()`` blocks are typically containers for tests, allowing you to break
+tests into multiple parts. Likewise, any setup that must be done before your
+tests run can be initialized within the ``describe()`` block. Here is an example:
+
+.. code:: TypeScript
+
+ describe('create, edit & delete image test', () => {
+ const poolName = 'e2e_images_pool';
+
+ before(() => {
+ cy.login();
+ pools.navigateTo('create');
+ pools.create(poolName, 8, 'rbd');
+ pools.exist(poolName, true);
+ });
+
+ beforeEach(() => {
+ cy.login();
+ images.navigateTo();
+ });
+
+ //...
+
+ });
+
+As shown, we can initialize the variable ``poolName`` as well as run commands
+before our test suite begins (creating a pool). ``describe()`` block messages
+should state what the test suite covers.
+
+``it()`` blocks are typically parts of an overarching test. They contain the
+functionality of the test suite, each performing an individual role.
+Here is an example:
+
+.. code:: TypeScript
+
+ describe('create, edit & delete image test', () => {
+ //...
+
+ it('should create image', () => {
+ images.createImage(imageName, poolName, '1');
+ images.getFirstTableCell(imageName).should('exist');
+ });
+
+ it('should edit image', () => {
+ images.editImage(imageName, poolName, newImageName, '2');
+ images.getFirstTableCell(newImageName).should('exist');
+ });
+
+ //...
+ });
+
+As shown in the previous example, our ``describe()`` test suite creates,
+edits and deletes an image. Therefore, each ``it()`` completes one of these
+steps: one for creating, one for editing, and so on. Likewise, every ``it()``
+block's message should be in lowercase and written so that "it" can be the
+prefix of the message. For example, ``it('edits the test image', () => ...)``
+vs. ``it('image edit test', () => ...)``. The first example makes
+grammatical sense with ``it()`` as the prefix, whereas the second message does
+not. ``it()`` should describe what the individual test is doing and what it
+expects to happen.
+
+
+Visual Regression Testing
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For visual regression testing, we use `Applitools Eyes <https://applitools.com/products-eyes/>`_,
+an AI-powered automated visual regression testing tool.
+Applitools integrates with our existing Cypress E2E tests.
+The tests are currently located at ``ceph/src/pybind/mgr/dashboard/frontend/cypress/integration/visualTests`` and
+follow the naming convention ``<component-name>.vrt-spec.ts``.
+
+Running Visual Regression Tests Locally
+.......................................
+
+To run the tests locally, you'll need an Applitools API key. If you don't have one, you can sign up
+for a free account. After obtaining the API key, export it as the environment variable ``APPLITOOLS_API_KEY``.
+
+Now you can run the tests like normal Cypress E2E tests, using either ``npx cypress open`` or, in headless mode, ``npx cypress run``.
+
+Capturing Screenshots
+.....................
+
+Baseline screenshots are the screenshots against which checkpoint screenshots
+(or the screenshots from your feature branch) will be tested.
+
+To capture baseline screenshots, run the tests against the master branch,
+then switch to your feature branch and run the tests again to capture checkpoint screenshots.
+
+To see your screenshots, log in to applitools.com. The landing page shows the
+Applitools Eyes test runner, where you can see all your screenshots. Any visual regression or difference (diff) between your baseline and checkpoint screenshots is highlighted with a mask over the diff.
+
+Writing More Visual Regression Tests
+....................................
+
+Please refer to `Applitools's official cypress sdk documentation <https://www.npmjs.com/package/@applitools/eyes-cypress#usage>`_ to write more tests.
+
+Visual Regression Tests In Jenkins
+..................................
+
+Currently, all visual regression tests are being run under `ceph dashboard tests <https://jenkins.ceph.com/job/ceph-dashboard-pull-requests>`_ GitHub check in the Jenkins job.
+
+Accepting or Rejecting Differences
+..................................
+
+Currently, only the Ceph Dashboard team has read and write access to the Applitools test runner. If the tests report differences, you can either accept them, which updates the baseline screenshots, or, if the differences are due to a genuine regression, fail them. To perform these actions, please follow `this <https://applitools.com/docs/topics/test-manager/pages/page-test-results/tm-accepting-and-rejecting-steps.html>`_ guide.
+
+Debugging Regressions
+.....................
+
+If you're running the tests locally and regressions are reported, you can take advantage of `Applitools's Root Cause Analysis feature <https://applitools.com/docs/topics/test-manager/viewers/root-cause-analysis.html>`_ to find the cause of the regression.
+
+
+Differences between Frontend Unit Tests and End-to-End (E2E) Tests / FAQ
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+General introduction about testing and E2E/unit tests
+
+
+What are E2E/unit tests designed for?
+.....................................
+
+E2E test:
+
+An E2E test requires a fully functional system and tests the interaction of all components
+of the application (Ceph, back-end, front-end).
+E2E tests are designed to mimic the behavior of the user when interacting with the application,
+for example in workflows like creating/editing/deleting an item.
+The tests should also verify that certain items are displayed as a user would see them
+when clicking through the UI (for example a menu entry, or a pool that has been
+created during a test, whose properties should be displayed in the table).
+
+Angular Unit Tests:
+
+Unit tests, as the name suggests, are tests for smaller units of the code.
+Those tests are designed for testing all kinds of Angular components (e.g. services, pipes etc.).
+They do not require a connection to the backend, hence those tests are independent of it.
+The expected data of the backend is mocked in the frontend and by using this data
+the functionality of the frontend can be tested without having to have real data from the backend.
+As previously mentioned, data is either mocked or, in a simple case, contains a static input,
+a function call and an expected static output.
+More complex examples include the state of a component (attributes of the component class),
+that define how the output changes according to the given input.
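+
+The two cases described above can be sketched in plain TypeScript. This is a
+minimal illustration, not dashboard code: ``ShortenPipe``, ``PoolApi``,
+``PoolListComponent`` and ``expectEqual`` are hypothetical names, and a real
+test would use Jest and Angular's ``TestBed`` instead of the tiny assertion
+helper shown here.
+

```typescript
// Hypothetical pipe-like class: its logic can be tested without a backend.
class ShortenPipe {
  transform(value: string, maxLength: number): string {
    return value.length > maxLength ? value.slice(0, maxLength) + '...' : value;
  }
}

// Hypothetical backend interface and a mock that returns canned data,
// so the test never talks to a real backend.
interface PoolApi {
  listPoolNames(): string[];
}
const mockPoolApi: PoolApi = { listPoolNames: () => ['rbd', 'cephfs_data'] };

// Hypothetical component whose output depends on its state (`filter`).
class PoolListComponent {
  filter = '';
  constructor(private api: PoolApi) {}
  visiblePools(): string[] {
    return this.api.listPoolNames().filter((name) => name.indexOf(this.filter) !== -1);
  }
}

// Minimal stand-in for a test framework's equality assertion.
function expectEqual<T>(actual: T, expected: T): void {
  if (JSON.stringify(actual) !== JSON.stringify(expected)) {
    throw new Error(`expected ${JSON.stringify(expected)}, got ${JSON.stringify(actual)}`);
  }
}

// Simple case: static input, function call, expected static output.
expectEqual(new ShortenPipe().transform('dashboard', 4), 'dash...');

// Stateful case: the same call yields different output after the state changes.
const component = new PoolListComponent(mockPoolApi);
expectEqual(component.visiblePools(), ['rbd', 'cephfs_data']);
component.filter = 'ceph';
expectEqual(component.visiblePools(), ['cephfs_data']);
```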
+
+Which E2E/unit tests are considered to be valid?
+................................................
+
+This is not easy to answer, but new tests that are written in the same way as already existing
+dashboard tests should generally be considered valid.
+Unit tests should focus on the component to be tested.
+This is either an Angular component, directive, service, pipe, etc.
+
+E2E tests should focus on testing the functionality of the whole application.
+Approximately a third of the overall E2E tests should verify the correctness
+of user visible elements.
+
+What should an E2E/unit test look like?
+.......................................
+
+Unit tests should focus on the described purpose
+and shouldn't try to test other things in the same ``it`` block.
+
+E2E tests should contain a description that either verifies
+the correctness of a user visible element or a complete process
+like for example the creation/validation/deletion of a pool.
+
+What should an E2E/unit test cover?
+...................................
+
+E2E tests should mostly, but not exclusively, cover interaction with the backend.
+This way the interaction with the backend is utilized to write integration tests.
+
+A unit test should mostly cover critical or complex functionality
+of a component (Angular Components, Services, Pipes, Directives, etc).
+
+What should an E2E/unit test NOT cover?
+.......................................
+
+Avoid duplicate testing: do not write E2E tests for what's already
+been covered by frontend unit tests and vice versa.
+It may not be possible to completely avoid an overlap.
+
+Unit tests should not be used to extensively click through components and E2E tests
+shouldn't be used to extensively test a single component of Angular.
+
+Best practices/guideline
+........................
+
+As a general guideline we try to follow the 70/20/10 approach - 70% unit tests,
+20% integration tests and 10% end-to-end tests.
+For further information please refer to `this document
+<https://testing.googleblog.com/2015/04/just-say-no-to-more-end-to-end-tests.html>`__
+and the included "Testing Pyramid".
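+
+As a worked example of the 70/20/10 guideline, the following sketch splits a
+given number of tests into the three categories. The function name
+``splitTestBudget`` is hypothetical, and the split is only a rough target,
+not a rule enforced anywhere in the build.
+

```typescript
interface TestBudget {
  unit: number;
  integration: number;
  e2e: number;
}

// Splits `total` tests roughly 70/20/10 between unit, integration and E2E tests.
function splitTestBudget(total: number): TestBudget {
  const unit = Math.round((total * 70) / 100);
  const integration = Math.round((total * 20) / 100);
  // Give the remainder to E2E so the three parts always add up to `total`.
  return { unit, integration, e2e: total - unit - integration };
}

console.log(splitTestBudget(200));
```

+
+For 200 tests this yields 140 unit tests, 40 integration tests and 20 E2E
+tests.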
+
+Further Help
+~~~~~~~~~~~~
+
+To get more help on the Angular CLI use ``ng help`` or go check out the
+`Angular CLI
+README <https://github.com/angular/angular-cli/blob/master/README.md>`__.
+
+Example of a Generator
+~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+ # Create module 'Core'
+ src/app> ng generate module core -m=app --routing
+
+ # Create module 'Auth' under module 'Core'
+ src/app/core> ng generate module auth -m=core --routing
+ or, alternatively:
+ src/app> ng generate module core/auth -m=core --routing
+
+ # Create component 'Login' under module 'Auth'
+ src/app/core/auth> ng generate component login -m=core/auth
+ or, alternatively:
+ src/app> ng generate component core/auth/login -m=core/auth
+
+Frontend Typescript Code Style Guide Recommendations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Group the imports based on their source and separate the groups with a blank
+line.
+
+The source groups can be either from Angular, external or internal.
+
+Example:
+
+.. code:: javascript
+
+ import { Component } from '@angular/core';
+ import { Router } from '@angular/router';
+
+ import { ToastrService } from 'ngx-toastr';
+
+ import { Credentials } from '../../../shared/models/credentials.model';
+ import { HostService } from './services/host.service';
+
+Frontend components
+~~~~~~~~~~~~~~~~~~~
+
+There are several components that can be reused on different pages.
+These components are declared in the components module:
+``src/pybind/mgr/dashboard/frontend/src/app/shared/components``.
+
+Helper
+~~~~~~
+
+This component should be used to provide additional information to the user.
+
+Example:
+
+.. code:: html
+
+ <cd-helper>
+ Some <strong>helper</strong> html text
+ </cd-helper>
+
+Terminology and wording
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Instead of using the Ceph component names, the approach
+suggested is to use the logical/generic names (Block over RBD, Filesystem over
+CephFS, Object over RGW). Nevertheless, as Ceph-Dashboard cannot completely hide
+the Ceph internals, some Ceph-specific names might remain visible.
+
+Regarding the wording for action labels and other textual elements (form titles,
+buttons, etc.), the chosen approach is to follow `these guidelines
+<https://www.patternfly.org/styles/terminology-and-wording/#terminology-and-wording-for-action-labels>`_.
+As a rule of thumb, 'Create' and 'Delete' are the proper wording for most forms,
+instead of 'Add' and 'Remove', unless some already created item is either added
+or removed to/from a set of items (e.g.: 'Add permission' to a user vs. 'Create
+(new) permission').
+
+In order to enforce the use of this wording, a service ``ActionLabelsI18n`` has
+been created, which provides translated labels for use in UI elements.
+
+Frontend branding
+~~~~~~~~~~~~~~~~~
+
+Every vendor can customize the 'Ceph dashboard' to their needs. Whether it is a
+logo, an HTML template or a TypeScript file, every file inside the frontend
+folder can be replaced.
+
+To replace files, open ``./frontend/angular.json`` and scroll to the section
+``fileReplacements`` inside the production configuration. Here you can add the
+files you wish to brand. We recommend placing the branded version of a file in
+the same directory as the original one and adding ``.brand`` to the file
+name, right in front of the file extension. A ``fileReplacement`` could, for
+example, look like this:
+
+.. code:: javascript
+
+ {
+ "replace": "src/app/core/auth/login/login.component.html",
+ "with": "src/app/core/auth/login/login.component.brand.html"
+ }
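+
+The naming scheme described above can be sketched as a small helper that
+derives the branded file name from the original path. ``brandedPath`` and
+``fileReplacement`` are hypothetical helpers for illustration only; Angular
+itself still expects the entries to be written into ``angular.json``.
+

```typescript
// Shape of one entry in the `fileReplacements` array of angular.json.
interface FileReplacement {
  replace: string;
  with: string;
}

// Inserts ".brand" right in front of the file extension.
function brandedPath(original: string): string {
  const lastSlash = original.lastIndexOf('/');
  const lastDot = original.lastIndexOf('.');
  if (lastDot <= lastSlash + 1) {
    // No extension (or a dot-file): append the suffix instead.
    return original + '.brand';
  }
  return original.slice(0, lastDot) + '.brand' + original.slice(lastDot);
}

// Builds one replacement entry for the given original file.
function fileReplacement(original: string): FileReplacement {
  return { replace: original, with: brandedPath(original) };
}

console.log(JSON.stringify(fileReplacement('src/app/core/auth/login/login.component.html'), null, 2));
```

+
+For the login template this produces the same pair of paths as the
+``fileReplacement`` entry shown above.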
+
+To serve or build the branded user interface run::
+
+ $ npm run start -- --prod
+
+or::
+
+ $ npm run build -- --prod
+
+Unfortunately, it is currently not possible to use multiple configurations at
+the same time when serving or building the UI. That means a configuration just
+for the branding ``fileReplacements`` is not an option, because you want to use
+the production configuration anyway
+(https://github.com/angular/angular-cli/issues/10612).
+Furthermore, it is not possible to use glob expressions for
+``fileReplacements``. As long as that feature hasn't been implemented, you have
+to add the file replacements manually to the angular.json file
+(https://github.com/angular/angular-cli/issues/12354).
+
+Nevertheless, you should stick to the suggested naming scheme because it will
+make it easier for you to use glob expressions once they are supported.
+
+To change the variable defaults or add your own, you can overwrite them in
+``./frontend/src/styles/vendor/_variables.scss``.
+Just reassign the variable you want to change, for example ``$color-primary: teal;``.
+To overwrite or extend the default CSS, you can add your own styles in
+``./frontend/src/styles/vendor/_style-overrides.scss``.
+
+UI Style Guide
+~~~~~~~~~~~~~~
+
+The style guide documents Ceph Dashboard standards and helps maintain
+consistency across the project. It is an effort to make it easier for
+contributors to create and decide on mockups and designs for the
+Dashboard.
+
+The development environment for Ceph Dashboard has live reloading enabled so
+any changes made in UI are reflected in open browser windows. Ceph Dashboard
+uses Bootstrap as the main third-party CSS library.
+
+Avoid duplication of code. Be consistent with the existing UI by reusing
+existing SCSS declarations as much as possible.
+
+Always check for existing code similar to what you want to write.
+You should always try to keep the same look-and-feel as the existing code.
+
+Colors
+......
+
+All the colors used in Ceph Dashboard UI are listed in
+``frontend/src/styles/defaults/_bootstrap-defaults.scss``. If you use a new color,
+always define a color variable in ``_bootstrap-defaults.scss`` and
+use the variable instead of hard-coded color values so that changes to the
+color are reflected in similar UI elements.
+
+The main color for the Ceph Dashboard is ``$primary``. The primary color is
+used in navigation components and as the ``$border-color`` for input components
+of forms.
+
+The secondary color is ``$secondary`` and is the background color for Ceph
+Dashboard.
+
+Buttons
+.......
+
+Buttons are used for performing actions such as “Submit”, “Edit”, “Create” and
+“Update”.
+
+**Forms:** When used to submit forms anywhere in the Dashboard, the main action
+button should use the ``cd-submit-button`` component and the secondary button should
+use the ``cd-back-button`` component. The text on the action button should be the same as the
+form title and follow title case. The text on the secondary button should be
+``Cancel``. The ``Perform action`` button should always be on the right, while the ``Cancel``
+button should always be on the left.
+
+**Modals:** The main action button should use the ``cd-submit-button`` component and
+the secondary button should use the ``cd-back-button`` component. The text on the action
+button should follow title case and correspond to the action to be performed.
+The text on the secondary button should be ``Close``.
+
+**Disclosure Button:** Disclosure buttons should be used to allow users to
+display and hide additional content in the interface.
+
+**Action Button:** Use the action button to perform actions such as editing or updating
+a component. All action buttons should have an icon corresponding to the action they
+perform, and the button text should follow title case. The button color should be the
+same as the form's main button color.
+
+**Drop Down Buttons:** Use dropdown buttons to display predefined lists of
+actions. All dropdown buttons have icons corresponding to the action they
+perform.
+
+Links
+.....
+
+Use text hyperlinks as navigation to guide users to a new page in the application
+or to anchor users to a section within a page. The color of the hyperlinks
+should be ``$primary``.
+
+Forms
+.....
+
+Mark invalid form fields with a red outline and show a meaningful error message.
+Use red as the font color for the message and be as specific as possible.
+``This field is required.`` should be the exact error message for required fields.
+Mark valid forms with a green outline and a green tick at the end of the form.
+Sections should not have a bigger header than the parent.
+
+Modals
+......
+
+Blur any interface elements in the background to bring the modal content into
+focus. The heading of the modal should reflect the action it can perform and
+should be clearly stated at the top of the modal. Use the ``cd-back-button``
+component in the footer for closing the modal.
+
+Icons
+.....
+
+We use `Fork Awesome <https://forkaweso.me/Fork-Awesome/>`_ classes for icons.
+The icons in use are listed in `src/app/shared/enum/icons.enum.ts`. Reference
+them from there in the HTML so that it's easier to change them later. When
+icons are next to text, they should be center-aligned horizontally. If icons
+are stacked, they should also be center-aligned vertically. Use small icons
+with buttons. For notifications use large icons.
+
+Navigation
+..........
+
+For local navigation use tabs. For overall navigation use expandable vertical
+navigation to collapse and expand items as needed.
+
+Alerts and notifications
+........................
+
+Default notifications should have the `text-info` color. Success notifications should
+have the `text-success` color. Failure notifications should have the `text-danger` color.
+
+Error Handling
+~~~~~~~~~~~~~~
+
+For handling front-end errors, there is a generic Error Component which can be
+found in ``./src/pybind/mgr/dashboard/frontend/src/app/core/error``. To report
+a new error, extend the ``DashboardError`` class in the ``error.ts`` file and
+add a specific header and message for the new error. Some generic error classes
+are already in place, such as ``DashboardNotFoundError`` and
+``DashboardForbiddenError``, which can be reused in different scenarios.
+
+For example: ``throw new DashboardNotFoundError()``.
+
+Internationalization (i18n)
+---------------------------
+
+How to extract messages from source code?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To extract the I18N messages from the templates and the TypeScript files just
+run the following command in ``src/pybind/mgr/dashboard/frontend``::
+
+ $ npm run i18n:extract
+
+This will extract all marked messages from the HTML templates first and then
+add all marked strings from the TypeScript files to the translation template.
+Since the extraction from TypeScript files is still not supported by Angular
+itself, we are using the
+`ngx-translator <https://github.com/ngx-translate/i18n-polyfill>`_ extractor to
+parse the TypeScript files.
+
+If the command ran successfully, it has created or updated the file
+``src/locale/messages.xlf``.
+
+The file isn't tracked by git. You can use it to start translating offline or
+to add/update the resource files on transifex.
+
+Supported languages
+~~~~~~~~~~~~~~~~~~~
+
+All our supported languages should be registered in both exports in
+``supported-languages.enum.ts`` and have a corresponding test in
+``language-selector.component.spec.ts``.
+
+The ``SupportedLanguages`` enum will provide the list for the default language selection.
+
+Translating process
+~~~~~~~~~~~~~~~~~~~
+
+To facilitate the translation process of the dashboard we are using a web tool
+called `transifex <https://www.transifex.com/>`_.
+
+If you wish to help translate the dashboard into any language, go to our `transifex
+project page <https://www.transifex.com/ceph/ceph-dashboard/>`_, join the
+project, and start translating immediately.
+
+All translations will then be reviewed and later pushed upstream.
+
+Updating translated messages
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Any time there are new messages translated and reviewed in a specific language
+we should update the translation file upstream.
+
+To do that, check the settings in the i18n config file
+``src/pybind/mgr/dashboard/frontend/i18n.config.json`` and make sure that the
+organization is *ceph*, the project is *ceph-dashboard* and the resource is
+the one you want to pull from and push to, e.g. *Master:master*. To find a list
+of available resources visit `<https://www.transifex.com/ceph/ceph-dashboard/content/>`_.
+
+After checking the config, go to the directory ``src/pybind/mgr/dashboard/frontend`` and run::
+
+ $ npm run i18n
+
+This command will extract all marked messages from the HTML templates and
+TypeScript files. Once the source file has been created it will push it to
+transifex and pull the latest translations. It will also fill all the
+untranslated strings with the source string.
+The tool will ask you for an API token, unless you added it by running::
+
+ $ npm run i18n:token
+
+To create a transifex api token visit `<https://www.transifex.com/user/settings/api/>`_.
+
+After the command has run successfully, build the UI and check that everything
+works as expected. You might also want to run the frontend tests.
+
+Add a new release resource to transifex
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In order to organize the translations, we create a
+`transifex resource <https://www.transifex.com/ceph/ceph-dashboard/content/>`_
+for every Ceph release. This means, once a new version has been released, the
+``src/pybind/mgr/dashboard/frontend/i18n.config.json`` needs to be updated on
+the release branch.
+
+Please replace::
+
+  "resource": "Master:master"
+
+by::
+
+  "resource": "<Release-name>:<release-name>"
+
+E.g. the resource definition for the pacific release::
+
+  "resource": "Pacific:pacific"
+
+Note:
+ The first part of the resource definition (before the colon) needs to be
+ written with a capital letter.
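+
+As a minimal sketch of this naming rule, the resource value can be derived from
+the release name. The helper ``transifex_resource`` is a hypothetical name used
+for illustration only; it is not part of the dashboard tooling.
+
```python
def transifex_resource(release_name: str) -> str:
    """Build the resource value for a release, e.g. 'pacific' -> 'Pacific:pacific'."""
    name = release_name.strip().lower()
    if not name:
        raise ValueError("release name must not be empty")
    # Capitalized part before the colon, lowercase part after it.
    return f"{name.capitalize()}:{name}"


print(transifex_resource("pacific"))
```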
+
+Suggestions
+~~~~~~~~~~~
+
+Strings need to start and end on the same line as the element:
+
+.. code-block:: html
+
+ <!-- avoid -->
+ <span i18n>
+ Foo
+ </span>
+
+ <!-- recommended -->
+ <span i18n>Foo</span>
+
+
+ <!-- avoid -->
+ <span i18n>
+ Foo bar baz.
+ Foo bar baz.
+ </span>
+
+ <!-- recommended -->
+ <span i18n>Foo bar baz.
+ Foo bar baz.</span>
+
+Isolated interpolations should not be translated:
+
+.. code-block:: html
+
+ <!-- avoid -->
+ <span i18n>{{ foo }}</span>
+
+ <!-- recommended -->
+ <span>{{ foo }}</span>
+
+Interpolations used in a sentence should be kept in the translation:
+
+.. code-block:: html
+
+ <!-- recommended -->
+ <span i18n>There are {{ x }} OSDs.</span>
+
+Remove elements that are outside the context of the translation:
+
+.. code-block:: html
+
+  <!-- avoid -->
+  <label i18n>
+    Profile
+    <span class="required"></span>
+  </label>
+
+  <!-- recommended -->
+  <label>
+    <ng-container i18n>Profile</ng-container>
+    <span class="required"></span>
+  </label>
+
+Keep elements that affect the sentence:
+
+.. code-block:: html
+
+ <!-- recommended -->
+ <span i18n>Profile <b>foo</b> will be removed.</span>
+
+
+.. _accessibility:
+
+Accessibility
+-------------
+
+Many parts of the Ceph Dashboard are modeled on `Web Content Accessibility Guidelines (WCAG) 2.1 <https://www.w3.org/TR/WCAG21/>`_ level A accessibility conformance guidelines.
+By implementing accessibility best practices, you are improving the usability of the Ceph Dashboard for blind and visually impaired users.
+
+Summary
+~~~~~~~
+
+A few things you should check before introducing a new code change include:
+
+1) Add `ARIA labels and descriptions <https://www.w3.org/TR/wai-aria/>`_ to actionable HTML elements.
+2) Don't forget to tag ARIA labels/descriptions or any user-readable text for translation (i18n-title, i18n-aria-label...).
+3) Add `ARIA roles <https://www.w3.org/TR/wai-aria/#usage_intro>`_ to tag HTML elements that behave differently from their intended behaviour (<a> tags behaving as <buttons>) or that provide extended behaviours (roles).
+4) Avoid poor `color contrast choices <https://www.w3.org/TR/WCAG21/#contrast-minimum>`_ (foreground-background) when styling a component. Here are some :ref:`tools <color-contrast-checkers>` you can use.
+5) When testing menus or dropdowns, be sure to scan them with an :ref:`accessibility checker <accessibility-checkers>` in both opened and closed states. Sometimes issues are hidden when menus are closed.
+
+.. _accessibility-checkers:
+
+Accessibility checkers
+~~~~~~~~~~~~~~~~~~~~~~
+
+During development, you can test the accessibility compliance of your features using one of the tools below:
+
+- `Accessibility insights plugin <https://accessibilityinsights.io/downloads/>`_
+- `Site Improve plugin <https://www.siteimprove.com/integrations/browser-extensions/>`_
+- `Axe devtools <https://www.deque.com/axe/devtools/>`_
+
+Testing with two or more of these tools can greatly improve the detection of accessibility violations.
+
+.. _color-contrast-checkers:
+
+Color contrast checkers
+~~~~~~~~~~~~~~~~~~~~~~~
+
+When adding new colors, making sure they are accessible is also important. Here are some tools which can help with color contrast testing:
+
+- `Accessible web color-contrast checker <https://accessibleweb.com/color-contrast-checker/>`_
+- `Colorsafe generator <https://colorsafe.co/>`_
+
+Accessibility linters
+~~~~~~~~~~~~~~~~~~~~~
+
+If you use VSCode, you may install the `axe accessibility linter <https://marketplace.visualstudio.com/items?itemName=deque-systems.vscode-axe-linter>`_,
+which can help you catch and fix potential issues during development.
+
+Accessibility testing
+~~~~~~~~~~~~~~~~~~~~~
+
+Our e2e testing suite, which is based on Cypress, supports the addition of accessibility tests using `axe-core <https://github.com/dequelabs/axe-core>`_
+and `cypress-axe <https://github.com/component-driven/cypress-axe>`_. A custom Cypress command, `cy.checkAccessibility`, can also be used directly.
+This is a great way to prevent accessibility regressions on high impact components.
+
+Tests can be found under the `a11y folder <./src/pybind/mgr/dashboard/frontend/cypress/integration/a11y>`_ in the dashboard. Here is an example:
+
+.. code:: TypeScript
+
+  describe('Navigation accessibility', { retries: 0 }, () => {
+    const shared = new NavigationPageHelper();
+
+    beforeEach(() => {
+      cy.login();
+      shared.navigateTo();
+    });
+
+    it('top-nav should have no accessibility violations', () => {
+      cy.injectAxe();
+      cy.checkAccessibility('.cd-navbar-top');
+    });
+
+    it('sidebar should have no accessibility violations', () => {
+      cy.injectAxe();
+      cy.checkAccessibility('nav[id=sidebar]');
+    });
+
+  });
+
+Additional guidelines
+~~~~~~~~~~~~~~~~~~~~~
+
+If you're unsure about which UI pattern to follow in order to implement an accessibility fix, `patternfly <https://www.patternfly.org/v4/accessibility/accessibility-fundamentals>`_ guidelines can be used.
+
+Backend Development
+-------------------
+
+The Python backend code of this module requires a number of Python modules to be
+installed. They are listed in file ``requirements.txt``. Using `pip
+<https://pypi.python.org/pypi/pip>`_ you may install all required dependencies
+by issuing ``pip install -r requirements.txt`` in directory
+``src/pybind/mgr/dashboard``.
+
+If you're using the `ceph-dev-docker development environment
+<https://github.com/ricardoasmarques/ceph-dev-docker/>`_, simply run
+``./install_deps.sh`` from the toplevel directory to install them.
+
+Unit Testing
+~~~~~~~~~~~~
+
+In the dashboard we have two different kinds of backend tests:
+
+1. Unit tests based on ``tox``.
+2. API tests based on Teuthology.
+
+Unit tests based on tox
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+We included a ``tox`` configuration file that will run the unit tests under
+Python 3, as well as linting tools to guarantee the uniformity of code.
+
+You need to install ``tox`` and ``coverage`` before running it. You can install
+the packages via your operating system's package management tools, e.g. by
+running ``dnf install python-tox python-coverage`` on Fedora Linux.
+
+Alternatively, you can use Python's native package installation method::
+
+ $ pip install tox
+ $ pip install coverage
+
+To run the tests, run ``src/script/run_tox.sh`` in the dashboard directory (where
+``tox.ini`` is located)::
+
+ ## Run Python 3 tests+lint commands:
+ $ ../../../script/run_tox.sh --tox-env py3,lint,check
+
+ ## Run Python 3 arbitrary command (e.g. 1 single test):
+ $ ../../../script/run_tox.sh --tox-env py3 "" tests/test_rgw_client.py::RgwClientTest::test_ssl_verify
+
+You can also run tox instead of ``run_tox.sh``::
+
+ ## Run Python 3 tests command:
+ $ tox -e py3
+
+ ## Run Python 3 arbitrary command (e.g. 1 single test):
+ $ tox -e py3 tests/test_rgw_client.py::RgwClientTest::test_ssl_verify
+
+Python files can be automatically fixed and formatted according to PEP8
+standards by using ``run_tox.sh --tox-env fix`` or ``tox -e fix``.
+
+We also collect coverage information from the backend code when you run tests. You can check the
+coverage information provided by the tox output, or by running the following
+command after tox has finished successfully::
+
+ $ coverage html
+
+This command will create a directory ``htmlcov`` with an HTML representation of
+the code coverage of the backend.
+
+API tests based on Teuthology
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+How to run existing API tests:
+ To run the API tests against a real Ceph cluster, we leverage the Teuthology
+ framework. This has the advantage of catching bugs originating from changes in
+ the internal Ceph code.
+
+ Our ``run-backend-api-tests.sh`` script will start a ``vstart`` Ceph cluster
+ before running the Teuthology tests, and then it stops the cluster after the
+ tests are run. Of course this implies that you have built/compiled Ceph
+ previously.
+
+ Start all dashboard tests by running::
+
+ $ ./run-backend-api-tests.sh
+
+ Or, start one or multiple specific tests by specifying the test name::
+
+ $ ./run-backend-api-tests.sh tasks.mgr.dashboard.test_pool.PoolTest
+
+ Or, ``source`` the script and run the tests manually::
+
+ $ source run-backend-api-tests.sh
+ $ run_teuthology_tests [tests]...
+ $ cleanup_teuthology
+
+How to write your own tests:
+ There are two possible ways to write your own API tests:
+
+ The first is by extending one of the existing test classes in the
+ ``qa/tasks/mgr/dashboard`` directory.
+
+ The second way is by adding your own API test module if you're creating a new
+ controller for example. To do so you'll just need to add the file containing
+ your new test class to the ``qa/tasks/mgr/dashboard`` directory and implement
+ all your tests there.
+
+ .. note:: Don't forget to add the path of the newly created module to
+ ``modules`` section in ``qa/suites/rados/mgr/tasks/dashboard.yaml``.
+
+ Short example: Let's assume you created a new controller called
+ ``my_new_controller.py`` and the related test module
+ ``test_my_new_controller.py``. You'll need to add
+ ``tasks.mgr.dashboard.test_my_new_controller`` to the ``modules`` section in
+ the ``dashboard.yaml`` file.
+
+ Also, if you're removing test modules, remember to remove the related entry
+ from the ``modules`` section. Otherwise the Teuthology test run will fail.
+
+ Please run your API tests on your dev environment (as explained above)
+ before submitting a pull request. Also make sure that a full QA run in
+ Teuthology/sepia lab (based on your changes) has completed successfully
+ before it gets merged. You don't need to schedule the QA run yourself, just
+ add the 'needs-qa' label to your pull request as soon as you think it's ready
+ for merging (e.g. make check was successful, the pull request is approved and
+ all comments have been addressed). One of the developers who has access to
+ Teuthology/the sepia lab will take care of it and report the result back to
+ you.
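+
+The entry for the ``modules`` section follows from the location of the test
+file below the ``qa`` directory. The following is a minimal sketch of that
+mapping; ``teuthology_module_name`` is a hypothetical helper for illustration
+and not part of the Ceph tree.
+
```python
from pathlib import PurePosixPath


def teuthology_module_name(test_file: str, qa_root: str = "qa") -> str:
    """Turn a test file path below the qa directory into a dotted module name."""
    # Drop the leading 'qa/' and the '.py' suffix, then join the parts with dots.
    relative = PurePosixPath(test_file).relative_to(qa_root).with_suffix("")
    return ".".join(relative.parts)


print(teuthology_module_name("qa/tasks/mgr/dashboard/test_my_new_controller.py"))
```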
+
+
+How to add a new controller?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A controller is a Python class that extends the ``BaseController`` class
+and is decorated with either the ``@Controller``, ``@ApiController`` or
+``@UiApiController`` decorators. The Python class must be stored inside a Python
+file located under the ``controllers`` directory. The Dashboard module will
+automatically load your new controller upon start.
+
+``@ApiController`` and ``@UiApiController`` are both specializations of the
+``@Controller`` decorator.
+
+The ``@ApiController`` should be used for controllers that provide an API-like
+REST interface and the ``@UiApiController`` should be used for endpoints consumed
+by the UI but that are not part of the 'public' API. For any other kinds of
+controllers the ``@Controller`` decorator should be used.
+
+A controller has a URL prefix path associated that is specified in the
+controller decorator, and all endpoints exposed by the controller will share
+the same URL prefix path.
+
+A controller's endpoint is exposed by implementing a method on the controller
+class decorated with the ``@Endpoint`` decorator.
+
+For example, create a file ``ping.py`` under the ``controllers`` directory with
+the following code:
+
+.. code-block:: python
+
+  from ..tools import Controller, ApiController, UiApiController, BaseController, Endpoint
+
+  @Controller('/ping')
+  class Ping(BaseController):
+      @Endpoint()
+      def hello(self):
+          return {'msg': "Hello"}
+
+  @ApiController('/ping')
+  class ApiPing(BaseController):
+      @Endpoint()
+      def hello(self):
+          return {'msg': "Hello"}
+
+  @UiApiController('/ping')
+  class UiApiPing(BaseController):
+      @Endpoint()
+      def hello(self):
+          return {'msg': "Hello"}
+
+The ``hello`` endpoint of the ``Ping`` controller can be reached by the
+following URL: https://mgr_hostname:8443/ping/hello using HTTP GET requests.
+As you can see the controller URL path ``/ping`` is concatenated to the
+method name ``hello`` to generate the endpoint's URL.
+
+In the case of the ``ApiPing`` controller, the ``hello`` endpoint can be
+reached by the following URL: https://mgr_hostname:8443/api/ping/hello using an
+HTTP GET request.
+The API controller URL path ``/ping`` is prefixed by the ``/api`` path and then
+concatenated to the method name ``hello`` to generate the endpoint's URL.
+Internally, the ``@ApiController`` is actually calling the ``@Controller``
+decorator by passing an additional decorator parameter called ``base_url``::
+
+ @ApiController('/ping') <=> @Controller('/ping', base_url="/api")
+
+``UiApiPing`` works in a similar way to ``ApiPing``, but the URL will be
+prefixed by ``/ui-api``: https://mgr_hostname:8443/ui-api/ping/hello. ``@UiApiController`` is
+also a ``@Controller`` extension::
+
+ @UiApiController('/ping') <=> @Controller('/ping', base_url="/ui-api")
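+
+The concatenation described above can be sketched in a few lines of plain
+Python. This only illustrates the URL scheme and is not the dashboard's
+implementation; ``endpoint_url`` is a hypothetical helper name.
+
```python
def endpoint_url(controller_path: str, method_name: str, base_url: str = "") -> str:
    """Join base URL, controller path and method name into an endpoint path."""
    parts = [base_url, controller_path, method_name]
    # Strip stray slashes so that 'ping' and '/ping' behave the same.
    segments = [part.strip("/") for part in parts if part.strip("/")]
    return "/" + "/".join(segments)


print(endpoint_url("/ping", "hello"))                      # @Controller
print(endpoint_url("/ping", "hello", base_url="/api"))     # @ApiController
print(endpoint_url("/ping", "hello", base_url="/ui-api"))  # @UiApiController
```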
+
+The ``@Endpoint`` decorator also supports many parameters to customize the
+endpoint:
+
+* ``method="GET"``: the HTTP method allowed to access this endpoint.
+* ``path="/<method_name>"``: the URL path of the endpoint, excluding the
+ controller URL path prefix.
+* ``path_params=[]``: list of method parameter names that correspond to URL
+ path parameters. Can only be used when ``method in ['POST', 'PUT']``.
+* ``query_params=[]``: list of method parameter names that correspond to URL
+ query parameters.
+* ``json_response=True``: indicates if the endpoint response should be
+ serialized in JSON format.
+* ``proxy=False``: indicates if the endpoint should be used as a proxy.
+
+An endpoint method may have parameters declared. Depending on the HTTP method
+defined for the endpoint the method parameters might be considered either
+path parameters, query parameters, or body parameters.
+
+For ``GET`` and ``DELETE`` methods, the method's non-optional parameters are
+considered path parameters by default. Optional parameters are considered
+query parameters. By specifying ``query_params`` in the endpoint
+decorator it is possible to make a non-optional parameter a query
+parameter.
+
+For ``POST`` and ``PUT`` methods, all method parameters are considered
+body parameters by default. To override this default, one can use the
+``path_params`` and ``query_params`` to specify which method parameters are
+path and query parameters respectively.
+Body parameters are decoded from the request body, either from a form format, or
+from a dictionary in JSON format.
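+
+The default rules above can be summarized in a small sketch based on
+``inspect.signature``. It is an illustration under stated assumptions, not the
+dashboard's code: ``classify_params`` is a hypothetical helper, it does not
+enforce the restriction on ``path_params``, and the two signatures mirror the
+``index`` and ``post`` methods used as examples in this section.
+
```python
import inspect


def classify_params(func, method="GET", path_params=(), query_params=()):
    """Split a handler's parameters into path, query and body parameters."""
    result = {"path": [], "query": [], "body": []}
    for name, param in inspect.signature(func).parameters.items():
        if name == "self":
            continue
        optional = param.default is not inspect.Parameter.empty
        if name in query_params:
            kind = "query"  # explicit override
        elif name in path_params:
            kind = "path"   # explicit override
        elif method in ("POST", "PUT"):
            kind = "body"   # default for POST and PUT
        else:
            kind = "query" if optional else "path"  # default for GET and DELETE
        result[kind].append(name)
    return result


def index(self, key, opt1, opt2=None): ...
def post(self, key1, key2, data1, data2=None): ...


print(classify_params(index, "GET", query_params=["opt1"]))
print(classify_params(post, "POST", path_params=["key1", "key2"]))
```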
+
+Let's use an example to better understand the possible ways to customize an
+endpoint:
+
+.. code-block:: python
+
+  from ..tools import Controller, BaseController, Endpoint
+
+  @Controller('/ping')
+  class Ping(BaseController):
+
+      # URL: /ping/{key}?opt1=...&opt2=...
+      @Endpoint(path="/", query_params=['opt1'])
+      def index(self, key, opt1, opt2=None):
+          """..."""
+
+      # URL: /ping/{key}?opt1=...&opt2=...
+      @Endpoint(query_params=['opt1'])
+      def __call__(self, key, opt1, opt2=None):
+          """..."""
+
+      # URL: /ping/post/{key1}/{key2}
+      @Endpoint('POST', path_params=['key1', 'key2'])
+      def post(self, key1, key2, data1, data2=None):
+          """..."""
+
+
+In the above example we see how the ``path`` option can be used to override the
+generated endpoint URL in order to not use the method's name in the URL. In the
+``index`` method we set the ``path`` to ``"/"`` to generate an endpoint that is
+accessible by the root URL of the controller.
+
+An alternative approach to generate an endpoint that is accessible through just
+the controller's path URL is by using the ``__call__`` method, as we show in
+the above example.
+
+From the third method you can see that the path parameters are collected from
+the URL by parsing the list of values separated by slashes ``/`` that come
+after the URL path: ``/ping`` in the case of the ``index`` method, and
+``/ping/post`` in the case of the ``post`` method.
+
+Defining path parameters in endpoints' URLs using Python methods' parameters
+is very easy, but it is still a bit strict with respect to the position of these
+parameters in the URL structure.
+Sometimes we may want to explicitly define a URL scheme that
+contains path parameters mixed with static parts of the URL.
+Our controller infrastructure also supports the declaration of URL paths with
+explicit path parameters at both the controller level and method level.
+
+Consider the following example:
+
+.. code-block:: python
+
+  from ..tools import Controller, BaseController, Endpoint
+
+  @Controller('/ping/{node}/stats')
+  class Ping(BaseController):
+
+      # URL: /ping/{node}/stats/{date}/latency?unit=...
+      @Endpoint(path="/{date}/latency")
+      def latency(self, node, date, unit="ms"):
+          """ ..."""
+
+In this example we explicitly declare a path parameter ``{node}`` in the
+controller URL path, and a path parameter ``{date}`` in the ``latency``
+method. The endpoint for the ``latency`` method is then accessible through
+the URL: https://mgr_hostname:8443/ping/{node}/stats/{date}/latency .
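+
+To illustrate how such a URL template relates to concrete request paths, here
+is a minimal sketch that extracts the explicit path parameters with a regular
+expression. The dashboard relies on its own routing for this; ``match_template``
+and the sample values are assumptions made for the example.
+
```python
import re


def match_template(template: str, url_path: str):
    """Return the path parameters of url_path, or None if it does not match."""
    # Split on '{name}' placeholders; odd indices hold the parameter names.
    pieces = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        f"(?P<{piece}>[^/]+)" if i % 2 else re.escape(piece)
        for i, piece in enumerate(pieces)
    )
    match = re.fullmatch(pattern, url_path)
    return match.groupdict() if match else None


template = "/ping/{node}/stats/{date}/latency"
print(match_template(template, "/ping/node1/stats/2024-04-21/latency"))
print(match_template(template, "/ping/node1/latency"))
```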
+
+For a full set of examples on how to use the ``@Endpoint``
+decorator please check the unit test file: ``tests/test_controllers.py``.
+There you will find many examples of how to customize endpoint methods.
+
+
+Implementing Proxy Controller
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Sometimes you might need to relay some requests from the Dashboard frontend
+directly to an external service.
+For that purpose we provide a decorator called ``@Proxy``.
+(As a concrete example, check the ``controllers/rgw.py`` file where we
+implemented an RGW Admin Ops proxy.)
+
+
+The ``@Proxy`` decorator is a wrapper of the ``@Endpoint`` decorator that
+already customizes the endpoint for working as a proxy.
+A proxy endpoint works by capturing the URL path that follows the controller
+URL prefix path, and does not do any decoding of the request body.
+
+Example:
+
+.. code-block:: python
+
+  from ..tools import Controller, BaseController, Proxy
+
+  @Controller('/foo/proxy')
+  class FooServiceProxy(BaseController):
+
+      @Proxy()
+      def proxy(self, path, **params):
+          """
+          if requested URL is "/foo/proxy/access/service?opt=1"
+          then path is "access/service" and params is {'opt': '1'}
+          """
+
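+The split into ``path`` and ``params`` described in the docstring above can be
+reproduced with the standard library. This is only a sketch of the behaviour,
+not the ``@Proxy`` implementation; ``split_proxy_request`` is a hypothetical
+helper name.
+
```python
from urllib.parse import parse_qsl, urlsplit


def split_proxy_request(url: str, prefix: str = "/foo/proxy"):
    """Return (path, params) for a URL below the controller prefix."""
    parts = urlsplit(url)
    if not parts.path.startswith(prefix):
        raise ValueError(f"{url!r} is not below {prefix!r}")
    # Everything after the controller prefix is handed over as 'path'.
    path = parts.path[len(prefix):].lstrip("/")
    # Query arguments are handed over as keyword parameters.
    params = dict(parse_qsl(parts.query))
    return path, params


print(split_proxy_request("/foo/proxy/access/service?opt=1"))
```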
+
+How does the RESTController work?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We also provide a simple mechanism to create REST based controllers using the
+``RESTController`` class. Any class which inherits from ``RESTController`` will,
+by default, return JSON.
+
+The ``RESTController`` is basically an additional abstraction layer which eases
+and unifies the work with collections. A collection is just an array of objects
+with a specific type. ``RESTController`` enables some default mappings of
+request types and given parameters to specific method names. This may sound
+complicated at first, but it's fairly easy. Let's have a look at the following
+example:
+
+.. code-block:: python
+
+  import cherrypy
+  from ..tools import ApiController, RESTController
+
+  @ApiController('ping')
+  class Ping(RESTController):
+      def list(self):
+          return {"msg": "Hello"}
+
+      def get(self, id):
+          return self.objects[id]
+
+In this case, the ``list`` method is automatically used for all requests to
+``api/ping`` where no additional argument is given and where the request type
+is ``GET``. If the request is given an additional argument, the ID in our
+case, it won't map to ``list`` anymore but to ``get`` and return the element
+with the given ID (assuming that ``self.objects`` has been filled before). The
+same applies to other request types:
+
++--------------+------------+----------------+-------------+
+| Request type | Arguments  | Method         | Status Code |
++==============+============+================+=============+
+| GET          | No         | list           | 200         |
++--------------+------------+----------------+-------------+
+| PUT          | No         | bulk_set       | 200         |
++--------------+------------+----------------+-------------+
+| POST         | No         | create         | 201         |
++--------------+------------+----------------+-------------+
+| DELETE       | No         | bulk_delete    | 204         |
++--------------+------------+----------------+-------------+
+| GET          | Yes        | get            | 200         |
++--------------+------------+----------------+-------------+
+| PUT          | Yes        | set            | 200         |
++--------------+------------+----------------+-------------+
+| DELETE       | Yes        | delete         | 204         |
++--------------+------------+----------------+-------------+
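+
+The table can be read as a lookup keyed by the request type and by whether an
+argument follows the controller URL. The sketch below encodes exactly that
+table; ``REST_METHOD_MAP`` and ``resolve`` are hypothetical names, and the real
+``RESTController`` derives its routes differently.
+
```python
# (request type, argument given) -> (controller method, status code)
REST_METHOD_MAP = {
    ("GET", False): ("list", 200),
    ("PUT", False): ("bulk_set", 200),
    ("POST", False): ("create", 201),
    ("DELETE", False): ("bulk_delete", 204),
    ("GET", True): ("get", 200),
    ("PUT", True): ("set", 200),
    ("DELETE", True): ("delete", 204),
}


def resolve(request_type: str, path: str, prefix: str = "/api/ping"):
    """Look up the method name and status code for a request below the prefix."""
    has_argument = bool(path[len(prefix):].strip("/"))
    return REST_METHOD_MAP[(request_type.upper(), has_argument)]


print(resolve("GET", "/api/ping"))
print(resolve("GET", "/api/ping/42"))
print(resolve("DELETE", "/api/ping/42"))
```
+
+A combination that is not listed in the table, such as ``POST`` with an
+argument, raises a ``KeyError`` in this sketch.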
+
+To use a custom endpoint for the above listed methods, you can
+use ``@RESTController.MethodMap``:
+
+.. code-block:: python
+
+  import cherrypy
+  from ..tools import ApiController, RESTController
+
+  @RESTController.MethodMap(version='0.1')
+  def create(self):
+      return {"msg": "Hello"}
+
+This decorator supports three parameters to customize the
+endpoint:
+
+* ``resource``: resource id.
+* ``status=200``: set the HTTP status response code.
+* ``version``: version.
+
+How to use a custom API endpoint in a RESTController?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you don't have any access restriction you can use ``@Endpoint``. If you
+have set a permission scope to restrict access to your endpoints,
+``@Endpoint`` will fail, as it doesn't know which permission property should be
+used. To use a custom endpoint inside a restricted ``RESTController`` use
+``@RESTController.Collection`` instead. You can also choose
+``@RESTController.Resource`` if you have set a ``RESOURCE_ID`` in your
+``RESTController`` class.
+
+.. code-block:: python
+
+  import cherrypy
+  from ..tools import ApiController, RESTController
+
+  @ApiController('ping', Scope.Ping)
+  class Ping(RESTController):
+      RESOURCE_ID = 'ping'
+
+      @RESTController.Resource('GET')
+      def some_get_endpoint(self):
+          return {"msg": "Hello"}
+
+      @RESTController.Collection('POST')
+      def some_post_endpoint(self, **data):
+          return {"msg": data}
+
+Both decorators also support five parameters to customize the
+endpoint:
+
+* ``method="GET"``: the HTTP method allowed to access this endpoint.
+* ``path="/<method_name>"``: the URL path of the endpoint, excluding the
+ controller URL path prefix.
+* ``status=200``: set the HTTP status response code
+* ``query_params=[]``: list of method parameter names that correspond to URL
+ query parameters.
+* ``version``: version
+
+How to restrict access to a controller?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+All controllers require authentication by default.
+If you require that the controller can be accessed without authentication,
+then you can add the parameter ``secure=False`` to the controller decorator.
+
+Example:
+
+.. code-block:: python
+
+  import cherrypy
+  from . import ApiController, RESTController
+
+
+  @ApiController('ping', secure=False)
+  class Ping(RESTController):
+      def list(self):
+          return {"msg": "Hello"}
+
+How to create a dedicated UI endpoint which uses the 'public' API?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Sometimes we want to combine multiple calls into one single call
+to save bandwidth or for other performance reasons.
+In order to achieve that, we first have to create a ``@UiApiController``, which
+is used for endpoints consumed by the UI but that are not part of the
+'public' API. Let the UI class inherit from the REST controller class.
+Now you can use all methods from the API controller.
+
+Example:
+
+.. code-block:: python
+
+  import cherrypy
+  from . import UiApiController, ApiController, RESTController
+
+
+  @ApiController('ping', secure=False)  # /api/ping
+  class Ping(RESTController):
+      def list(self):
+          return self._list()
+
+      def _list(self):  # To not get in conflict with the JSON wrapper
+          return [1, 2, 3]
+
+
+  @UiApiController('ping', secure=False)  # /ui-api/ping
+  class PingUi(Ping):
+      def list(self):
+          return self._list() + [4, 5, 6]
+
+How to access the manager module instance from a controller?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We provide the manager module instance as a global variable that can be
+imported in any module.
+
+Example:
+
+.. code-block:: python
+
+  import logging
+  import cherrypy
+  from .. import mgr
+  from ..tools import ApiController, RESTController
+
+  logger = logging.getLogger(__name__)
+
+  @ApiController('servers')
+  class Servers(RESTController):
+      def list(self):
+          logger.debug('Listing available servers')
+          return {'servers': mgr.list_servers()}
+
+
+How to write a unit test for a controller?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We provide a test helper class called ``ControllerTestCase`` to easily create
+unit tests for your controller.
+
+If we want to write a unit test for the above ``Ping`` controller, create a
+``test_ping.py`` file under the ``tests`` directory with the following code:
+
+.. code-block:: python
+
+  from .helper import ControllerTestCase
+  from ..controllers.ping import Ping
+
+
+  class PingTest(ControllerTestCase):
+      @classmethod
+      def setup_test(cls):
+          cp_config = {'tools.authenticate.on': True}
+          cls.setup_controllers([Ping], cp_config=cp_config)
+
+      def test_ping(self):
+          self._get("/api/ping")
+          self.assertStatus(200)
+          self.assertJsonBody({'msg': 'Hello'})
+
+The ``ControllerTestCase`` class starts by initializing a CherryPy webserver.
+Then it will call the ``setup_test()`` class method where we can explicitly
+load the controllers that we want to test. In the above example we are only
+loading the ``Ping`` controller. We can also provide ``cp_config`` in order to
+update the controller's cherrypy config (e.g. enable authentication as shown in the example).
+
+How to update or create new dashboards in grafana?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We are using ``jsonnet`` and ``grafonnet-lib`` to write code for the grafana dashboards.
+All the dashboards are written inside the ``grafana_dashboards.jsonnet`` file in the
+``monitoring/grafana/dashboards/jsonnet`` directory.
+
+We generate the dashboard json files directly from this jsonnet file by running this
+command in the ``grafana/dashboards`` directory:
+``jsonnet -m . jsonnet/grafana_dashboards.jsonnet``.
+(For the above command to succeed we need the ``jsonnet`` package installed and the
+``grafonnet-lib`` directory cloned on our machine. Please refer to
+``https://grafana.github.io/grafonnet-lib/getting-started/`` if you run into trouble.)
+
+To update an existing grafana dashboard or to create a new one, we need to update
+the ``grafana_dashboards.jsonnet`` file and generate the new/updated json files using the
+above mentioned command. Those who are not familiar with grafonnet or jsonnet
+can follow this documentation: ``https://grafana.github.io/grafonnet-lib/``.
+
+Example grafana dashboard in jsonnet format:
+
+To specify the grafana dashboard properties such as title, uid etc. we can create a local function:
+
+::
+
+ local dashboardSchema(title, uid, time_from, refresh, schemaVersion, tags,timezone, timepicker)
+
+To add a graph panel we can specify the graph schema in a local function such as:
+
+::
+
+ local graphPanelSchema(title, nullPointMode, stack, formatY1, formatY2, labelY1, labelY2, min, fill, datasource)
+
+and then use these functions inside the dashboard definition like -
+
+::
+
+ {
+ 'radosgw-sync-overview.json': // JSON file name to be generated
+
+ dashboardSchema(
+ 'RGW Sync Overview', 'rgw-sync-overview', 'now-1h', '15s', .., .., ..
+ )
+
+ .addPanels([
+ graphPanelSchema(
+ 'Replication (throughput) from Source Zone', 'Bps', null, .., .., ..)
+ ])
+ }
+
+The valid grafonnet-lib attributes can be found here - ``https://grafana.github.io/grafonnet-lib/api-docs/``.
+
+
+How to listen for manager notifications in a controller?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The manager notifies the modules of several types of cluster events, such
+as cluster log events.
+
+Each module has a "global" handler function called ``notify`` that the manager
+calls to notify the module. But this handler function must not block or spend
+too much time processing the event notification.
+For this reason we provide a notification queue that controllers can register
+themselves with to receive cluster notifications.
+
+The example below represents a controller that implements a very simple live
+log viewer page:
+
+.. code-block:: python
+
+ import collections
+
+ import cherrypy
+
+ from ..tools import ApiController, BaseController, NotificationQueue
+
+
+ @ApiController('livelog')
+ class LiveLog(BaseController):
+ log_buffer = collections.deque(maxlen=1000)
+
+ def __init__(self):
+ super(LiveLog, self).__init__()
+ NotificationQueue.register(self.log, 'clog')
+
+ def log(self, log_struct):
+ self.log_buffer.appendleft(log_struct)
+
+ @cherrypy.expose
+ def default(self):
+ ret = '<html><meta http-equiv="refresh" content="2" /><body>'
+ for l in self.log_buffer:
+ ret += "{}<br>".format(l)
+ ret += "</body></html>"
+ return ret
+
+As you can see above, the ``NotificationQueue`` class provides a register
+method that receives the function as its first argument, and receives the
+"notification type" as the second argument.
+You can omit the second argument of the ``register`` method, and in that case
+you are registering to listen to all notifications of any type.
+
+Here is a list of notification types (these might change in the future) that
+can be used:
+
+* ``clog``: cluster log notifications
+* ``command``: notification when a command issued by ``MgrModule.send_command``
+ completes
+* ``perf_schema_update``: perf counters schema update
+* ``mon_map``: monitor map update
+* ``fs_map``: cephfs map update
+* ``osd_map``: OSD map update
+* ``service_map``: services (RGW, RBD-Mirror, etc.) map update
+* ``mon_status``: monitor status regular update
+* ``health``: health status regular update
+* ``pg_summary``: regular update of PG status information
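
The registration semantics described above (a listener bound to one notification type, or to all types when the type is omitted) can be sketched with a small stand-in class. ``MiniNotificationQueue`` and its ``notify`` method are hypothetical names used only for illustration. This sketch dispatches synchronously, whereas the real ``NotificationQueue`` exists precisely so that the module's ``notify`` handler does not block.

```python
import collections


class MiniNotificationQueue:
    """Illustrative stand-in, not the dashboard's NotificationQueue."""

    # Maps a notification type (or None for "all types") to its listeners.
    _listeners = collections.defaultdict(set)

    @classmethod
    def register(cls, func, n_type=None):
        # n_type=None means: listen to notifications of any type.
        cls._listeners[n_type].add(func)

    @classmethod
    def notify(cls, n_type, data):
        # Deliver to listeners of this type and to catch-all listeners.
        for func in cls._listeners[n_type] | cls._listeners[None]:
            func(data)


clog_only = []
everything = []
MiniNotificationQueue.register(clog_only.append, 'clog')
MiniNotificationQueue.register(everything.append)
MiniNotificationQueue.notify('clog', 'log line')
MiniNotificationQueue.notify('health', 'HEALTH_OK')
```

After these calls, ``clog_only`` holds only the ``clog`` payload, while ``everything`` holds both payloads.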
+
+
+How to write a unit test when a controller accesses a Ceph module?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Consider the following example that implements a controller that retrieves the
+list of RBD images of the ``rbd`` pool:
+
+.. code-block:: python
+
+ import rbd
+ from .. import mgr
+ from ..tools import ApiController, RESTController
+
+
+ @ApiController('rbdimages')
+ class RbdImages(RESTController):
+ def __init__(self):
+ self.ioctx = mgr.rados.open_ioctx('rbd')
+ self.rbd = rbd.RBD()
+
+ def list(self):
+ return [{'name': n} for n in self.rbd.list(self.ioctx)]
+
+In the example above, we want to mock the return value of the ``rbd.list``
+function, so that we can test the JSON response of the controller.
+
+The unit test code will look like the following:
+
+.. code-block:: python
+
+ import mock
+ from .helper import ControllerTestCase
+
+
+ class RbdImagesTest(ControllerTestCase):
+ @mock.patch('rbd.RBD.list')
+ def test_list(self, rbd_list_mock):
+ rbd_list_mock.return_value = ['img1', 'img2']
+ self._get('/api/rbdimages')
+ self.assertJsonBody([{'name': 'img1'}, {'name': 'img2'}])
+
+
+
+How to add a new configuration setting?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you need to store some configuration setting for a new feature, we already
+provide an easy mechanism for you to specify/use the new config setting.
+
+For instance, if you want to add a new configuration setting to hold the
+email address of the dashboard admin, just add a setting name as a class
+attribute to the ``Options`` class in the ``settings.py`` file::
+
+ # ...
+ class Options(object):
+ # ...
+
+ ADMIN_EMAIL_ADDRESS = ('admin@admin.com', str)
+
+The value of the class attribute is a pair composed of the default value for that
+setting, and the python type of the value.
+
+By declaring the ``ADMIN_EMAIL_ADDRESS`` class attribute, when you restart the
+dashboard module, you will automatically gain two additional CLI commands to
+get and set that setting::
+
+ $ ceph dashboard get-admin-email-address
+ $ ceph dashboard set-admin-email-address <value>
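
The relationship between the attribute name and the two CLI commands shown above can be sketched as a simple name transformation. The helper ``cli_command_names`` is hypothetical and only illustrates the naming pattern visible in this example. The dashboard's actual derivation is not shown in this document and may differ.

```python
def cli_command_names(setting_name):
    """Return the (get, set) command names for a setting (illustrative only)."""
    # ADMIN_EMAIL_ADDRESS -> admin-email-address
    suffix = setting_name.lower().replace('_', '-')
    return ('dashboard get-' + suffix, 'dashboard set-' + suffix)


get_cmd, set_cmd = cli_command_names('ADMIN_EMAIL_ADDRESS')
```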
+
+To access or modify the config setting value from your Python code, either
+inside a controller or anywhere else, you just need to import the ``Settings``
+class and access it like this:
+
+.. code-block:: python
+
+ from settings import Settings
+
+ # ...
+ tmp_var = Settings.ADMIN_EMAIL_ADDRESS
+
+ # ....
+ Settings.ADMIN_EMAIL_ADDRESS = 'myemail@admin.com'
+
+The settings management implementation will make sure that if you change a
+setting value from the Python code you will see that change when accessing
+that setting from the CLI and vice-versa.
+
+
+How to run a controller read-write operation asynchronously?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Some controllers might need to execute operations that alter the state of the
+Ceph cluster. These operations might take some time to execute, so to maintain
+a good user experience in the Web UI, we need to run those operations
+asynchronously and immediately return to the frontend some information that the
+operations are running in the background.
+
+To help in the development of the above scenario we added the support for
+asynchronous tasks. To trigger the execution of an asynchronous task we must
+use the following class method of the ``TaskManager`` class::
+
+ from ..tools import TaskManager
+ # ...
+ TaskManager.run(name, metadata, func, args, kwargs)
+
+* ``name`` is a string that can be used to group tasks. For instance
+ for RBD image creation tasks we could specify ``"rbd/create"`` as the
+ name, or similarly ``"rbd/remove"`` for RBD image removal tasks.
+
+* ``metadata`` is a dictionary where we can store key-value pairs that
+ characterize the task. For instance, when creating a task for creating
+ RBD images we can specify the metadata argument as
+ ``{'pool_name': "rbd", 'image_name': "test-img"}``.
+
+* ``func`` is the python function that implements the operation code, which
+ will be executed asynchronously.
+
+* ``args`` and ``kwargs`` are the positional and named arguments that will be
+ passed to ``func`` when the task manager starts its execution.
+
+The ``TaskManager.run`` method triggers the asynchronous execution of function
+``func`` and returns a ``Task`` object.
+The ``Task`` provides the public method ``Task.wait(timeout)``, which can be
+used to wait for the task to complete up to a timeout defined in seconds and
+provided as an argument. If no argument is provided the ``wait`` method
+blocks until the task is finished.
+
+The ``Task.wait`` is very useful for tasks that usually are fast to execute but
+that sometimes may take a long time to run.
+The return value of the ``Task.wait`` method is a pair ``(state, value)``
+where ``state`` is a string with the following possible values:
+
+* ``VALUE_DONE = "done"``
+* ``VALUE_EXECUTING = "executing"``
+
+The ``value`` will store the result of the execution of function ``func`` if
+``state == VALUE_DONE``. If ``state == VALUE_EXECUTING`` then
+``value == None``.
+
+The pair ``(name, metadata)`` should unequivocally identify the task being
+run, which means that if you try to trigger a new task that matches the same
+``(name, metadata)`` pair of the currently running task, then the new task
+is not created and you get the task object of the current running task.
+
+For instance, consider the following example:
+
+.. code-block:: python
+
+ task1 = TaskManager.run("dummy/task", {'attr': 2}, func)
+ task2 = TaskManager.run("dummy/task", {'attr': 2}, func)
+
+If the second call to ``TaskManager.run`` executes while the first task is
+still executing then it will return the same task object:
+``assert task1 == task2``.
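
The deduplication rule above can be sketched with a registry keyed by the ``(name, metadata)`` pair. ``MiniTaskRegistry`` is a hypothetical class for illustration only, and it assumes the metadata values are hashable. The real ``TaskManager`` implementation is not reproduced here.

```python
class MiniTaskRegistry:
    """Illustrative stand-in for the (name, metadata) deduplication rule."""

    def __init__(self):
        self._running = {}

    @staticmethod
    def _key(name, metadata):
        # A dict is not hashable, so use a sorted tuple of its items.
        return (name, tuple(sorted(metadata.items())))

    def run(self, name, metadata):
        key = self._key(name, metadata)
        if key not in self._running:
            self._running[key] = object()  # placeholder for a Task object
        return self._running[key]

    def finish(self, name, metadata):
        # Once a task has finished, a new one with the same key may start.
        self._running.pop(self._key(name, metadata), None)


registry = MiniTaskRegistry()
task1 = registry.run("dummy/task", {'attr': 2})
task2 = registry.run("dummy/task", {'attr': 2})
task3 = registry.run("dummy/task", {'attr': 3})
```

Here ``task1`` and ``task2`` are the same object, while ``task3`` is a different one because its metadata differs.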
+
+
+How to get the list of executing and finished asynchronous tasks?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The list of executing and finished tasks is included in the ``Summary``
+controller, which is already polled every 5 seconds by the dashboard frontend.
+But we also provide a dedicated controller to get the same list of executing
+and finished tasks.
+
+The ``Task`` controller exposes the ``/api/task`` endpoint that returns the
+list of executing and finished tasks. This endpoint accepts the ``name``
+parameter that accepts a glob expression as its value.
+For instance, an HTTP GET request of the URL ``/api/task?name=rbd/*``
+will return all executing and finished tasks whose name starts with ``rbd/``.
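
As a rough illustration of glob filtering on task names, the sketch below uses the standard library's ``fnmatch`` module. This is an assumption made for illustration: the document does not state how the endpoint matches the glob, and ``filter_tasks`` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def filter_tasks(tasks, name_glob):
    """Return the tasks whose 'name' matches the glob (illustrative only)."""
    return [t for t in tasks if fnmatchcase(t['name'], name_glob)]


tasks = [{'name': 'rbd/create'}, {'name': 'rbd/remove'}, {'name': 'pool/create'}]
rbd_tasks = filter_tasks(tasks, 'rbd/*')
```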
+
+To prevent the finished tasks list from growing unbounded, we will always
+maintain the 10 most recent finished tasks, and the remaining older finished
+tasks will be removed when reaching a TTL of 1 minute. The TTL is calculated
+using the timestamp when the task finished its execution. After a minute, when
+the finished task information is retrieved, either by the summary controller or
+by the task controller, it is automatically deleted from the list and it will
+not be included in further task queries.
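
The retention policy described above can be sketched as follows. The function ``prune_finished`` and the tuple layout are hypothetical. Only the two constants (10 tasks, 1 minute) come from the text, and the real bookkeeping in the dashboard may be organized differently.

```python
KEEP_RECENT = 10     # finished tasks that are always retained
TTL_SECONDS = 60.0   # lifetime of the remaining, older finished tasks


def prune_finished(finished, now):
    """Return the retained tasks from a list of (name, end_time) tuples."""
    # Newest first, so the first KEEP_RECENT entries are the most recent.
    ordered = sorted(finished, key=lambda task: task[1], reverse=True)
    recent, older = ordered[:KEEP_RECENT], ordered[KEEP_RECENT:]
    # Older tasks survive only while their TTL has not expired.
    return recent + [task for task in older if now - task[1] < TTL_SECONDS]


# Fifteen tasks that finished at t = 0, 1, ..., 14 seconds.
finished = [('task%d' % i, float(i)) for i in range(15)]
kept_early = prune_finished(finished, now=62.0)
kept_late = prune_finished(finished, now=70.0)
```

At ``now=62.0`` two of the five older tasks are still within their TTL. At ``now=70.0`` only the ten most recent tasks remain.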
+
+Each executing task is represented by the following dictionary::
+
+ {
+ 'name': "name", # str
+ 'metadata': { }, # dict
+ 'begin_time': "2018-03-14T15:31:38.423605Z", # str (ISO 8601 format)
+ 'progress': 0 # int (percentage)
+ }
+
+Each finished task is represented by the following dictionary::
+
+ {
+ 'name': "name", # str
+ 'metadata': { }, # dict
+ 'begin_time': "2018-03-14T15:31:38.423605Z", # str (ISO 8601 format)
+ 'end_time': "2018-03-14T15:31:39.423605Z", # str (ISO 8601 format)
+ 'duration': 0.0, # float
+ 'progress': 0, # int (percentage)
+ 'success': True, # bool
+ 'ret_value': None, # object, populated only if 'success' == True
+ 'exception': None, # str, populated only if 'success' == False
+ }
+
+
+How to use asynchronous APIs with asynchronous tasks?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``TaskManager.run`` method, as described in a previous section, is well
+suited for calling blocking functions, as it runs the function inside a newly
+created thread. But sometimes we want to call some function of an API that is
+already asynchronous by nature.
+
+For these cases we want to avoid creating a new thread for just running a
+non-blocking function, and want to leverage the asynchronous nature of the
+function. The ``TaskManager.run`` is already prepared to be used with
+non-blocking functions by passing an object of the type ``TaskExecutor`` as an
+additional parameter called ``executor``. The full method signature of
+``TaskManager.run``::
+
+ TaskManager.run(name, metadata, func, args=None, kwargs=None, executor=None)
+
+
+The ``TaskExecutor`` class is responsible for code that executes a given task
+function, and defines three methods that can be overridden by
+subclasses::
+
+ def init(self, task)
+ def start(self)
+ def finish(self, ret_value, exception)
+
+The ``init`` method is called before running the task function, and
+receives the task object (of class ``Task``).
+
+The ``start`` method runs the task function. The default implementation is to
+run the task function in the current thread context.
+
+The ``finish`` method should be called when the task function finishes with
+either the ``ret_value`` populated with the result of the execution, or with
+an exception object in the case that execution raised an exception.
+
+To leverage the asynchronous nature of a non-blocking function, the developer
+should implement a custom executor by creating a subclass of the
+``TaskExecutor`` class, and provide an instance of the custom executor class
+as the ``executor`` parameter of the ``TaskManager.run``.
+
+To better understand the expressive power of executors, we write a full example
+that uses a custom executor to execute the ``MgrModule.send_command`` asynchronous
+function:
+
+.. code-block:: python
+
+ import json
+ from mgr_module import CommandResult
+ from .. import mgr
+ from ..tools import ApiController, RESTController, NotificationQueue, \
+ TaskManager, TaskExecutor
+
+
+ class SendCommandExecutor(TaskExecutor):
+ def __init__(self):
+ super(SendCommandExecutor, self).__init__()
+ self.tag = None
+ self.result = None
+
+ def init(self, task):
+ super(SendCommandExecutor, self).init(task)
+
+ # we need to listen for 'command' events to know when the command
+ # finishes
+ NotificationQueue.register(self._handler, 'command')
+
+ # store the CommandResult object to retrieve the results
+ self.result = self.task.fn_args[0]
+ if len(self.task.fn_args) > 4:
+ # the user specified a tag for the command, so let's use it
+ self.tag = self.task.fn_args[4]
+ else:
+ # let's generate a unique tag for the command
+ self.tag = 'send_command_{}'.format(id(self))
+ self.task.fn_args.append(self.tag)
+
+ def _handler(self, data):
+ if data == self.tag:
+ # the command has finished, notifying the task with the result
+ self.finish(self.result.wait(), None)
+ # deregister listener to avoid memory leaks
+ NotificationQueue.deregister(self._handler, 'command')
+
+
+ @ApiController('test')
+ class Test(RESTController):
+
+ def _run_task(self, osd_id):
+ task = TaskManager.run("test/task", {}, mgr.send_command,
+ [CommandResult(''), 'osd', osd_id,
+ json.dumps({'prefix': 'perf histogram dump'})],
+ executor=SendCommandExecutor())
+ return task.wait(1.0)
+
+ def get(self, osd_id):
+ status, value = self._run_task(osd_id)
+ return {'status': status, 'value': value}
+
+
+The above ``SendCommandExecutor`` executor class can be used for any call to
+``MgrModule.send_command``. This means that we should need just one custom
+executor class implementation for each non-blocking API that we use in our
+controllers.
+
+The default executor, used when no executor object is passed to
+``TaskManager.run``, is the ``ThreadedExecutor``. You can check its
+implementation in the ``tools.py`` file.
+
+
+How to update the execution progress of an asynchronous task?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The asynchronous tasks infrastructure provides support for updating the
+execution progress of an executing task.
+The progress can be updated from within the code the task is executing, which
+usually is the place where we have the progress information available.
+
+To update the progress from within the task code, the ``TaskManager`` class
+provides a method to retrieve the current task object::
+
+ TaskManager.current_task()
+
+The above method is only available when using the default executor
+``ThreadedExecutor`` for executing the task.
+The ``current_task()`` method returns the current ``Task`` object. The
+``Task`` object provides two public methods to update the execution progress
+value: the ``set_progress(percentage)``, and the ``inc_progress(delta)``
+methods.
+
+The ``set_progress`` method receives as argument an integer value representing
+the absolute percentage that we want to set to the task.
+
+The ``inc_progress`` method receives as argument an integer value representing
+the delta we want to increment to the current execution progress percentage.
+
+Take the following example of a controller that triggers a new task and
+updates its progress:
+
+.. code-block:: python
+
+ import random
+ import time
+ import cherrypy
+ from ..tools import TaskManager, ApiController, BaseController
+
+
+ @ApiController('dummy_task')
+ class DummyTask(BaseController):
+ def _dummy(self):
+ top = random.randrange(100)
+ for i in range(top):
+ TaskManager.current_task().set_progress(i * 100 // top)
+ # or TaskManager.current_task().inc_progress(100 // top)
+ time.sleep(1)
+ return "finished"
+
+ @cherrypy.expose
+ @cherrypy.tools.json_out()
+ def default(self):
+ task = TaskManager.run("dummy/task", {}, self._dummy)
+ return task.wait(5) # wait for five seconds
+
+
+How to deal with asynchronous tasks in the front-end?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+All executing and most recently finished asynchronous tasks are displayed on
+"Background-Tasks" and if finished on "Recent-Notifications" in the menu bar.
+For each task, an operation name for three states (running, success and failure),
+a function that tells who is involved, and error descriptions, if any, have to
+be provided. This can be achieved by appending to
+``TaskManagerMessageService.messages``. This has to be done to achieve
+consistency among all tasks and states.
+
+Operation Object
+ Ensures consistency among all tasks. It consists of three verbs, one for each
+ state, e.g.
+ ``{running: 'Creating', failure: 'create', success: 'Created'}``.
+
+#. Put running operations in present participle, e.g. ``'Updating'``.
+#. Failed messages always start with ``'Failed to '`` and should be continued
+ with the operation in present tense, e.g. ``'update'``.
+#. Put successful operations in past tense, e.g. ``'Updated'``.
+
+Involves Function
+ Ensures consistency among all messages of a task. It represents what is
+ involved in the operation. It's a function that takes the metadata from the
+ task and returns a string, e.g.
+ ``"RBD 'somePool/someImage'"``.
+
+Both combined create the following messages:
+
+* Failure => ``"Failed to create RBD 'somePool/someImage'"``
+* Running => ``"Creating RBD 'somePool/someImage'"``
+* Success => ``"Created RBD 'somePool/someImage'"``
+
+For automatic task handling use ``TaskWrapperService.wrapTaskAroundCall``.
+
+If for some reason ``wrapTaskAroundCall`` is not working for you,
+you have to subscribe to your asynchronous task manually through
+``TaskManagerService.subscribe``, and provide it with a callback,
+in case of a success to notify the user. A notification can
+be triggered with ``NotificationService.notifyTask``. It will use
+``TaskManagerMessageService.messages`` to display a message based on the state
+of a task.
+
+Notifications of API errors are handled by ``ApiInterceptorService``.
+
+Usage example:
+
+.. code-block:: javascript
+
+ export class TaskManagerMessageService {
+ // ...
+ messages = {
+ // Messages for task 'rbd/create'
+ 'rbd/create': new TaskManagerMessage(
+ // Message prefixes
+ ['create', 'Creating', 'Created'],
+ // Message suffix
+ (metadata) => `RBD '${metadata.pool_name}/${metadata.image_name}'`,
+ (metadata) => ({
+ // Error code and description
+ '17': `Name is already used by RBD '${metadata.pool_name}/${
+ metadata.image_name}'.`
+ })
+ ),
+ // ...
+ };
+ // ...
+ }
+
+ export class RBDFormComponent {
+ // ...
+ createAction() {
+ const request = this.createRequest();
+ // Subscribes to 'call' with submitted 'task' and handles notifications
+ return this.taskWrapper.wrapTaskAroundCall({
+ task: new FinishedTask('rbd/create', {
+ pool_name: request.pool_name,
+ image_name: request.name
+ }),
+ call: this.rbdService.create(request)
+ });
+ }
+ // ...
+ }
+
+
+REST API documentation
+~~~~~~~~~~~~~~~~~~~~~~
+Ceph-Dashboard provides two types of documentation for the **Ceph RESTful API**:
+
+* **Static documentation**: available at :ref:`mgr ceph api`. This comes from a versioned specification located at ``src/pybind/mgr/dashboard/openapi.yaml``.
+* **Interactive documentation**: available from a running Ceph-Dashboard instance (top-right ``?`` icon > API Docs).
+
+If changes are made to the ``controllers/`` directory, it's very likely that
+they will result in changes to the generated OpenAPI specification. For that
+reason, a checker has been implemented to block unintended changes. This check
+is automatically triggered by the Pull Request CI (``make check``) and can
+also be invoked manually: ``tox -e openapi-check``.
+
+If that checker fails, it means that the current Pull Request is modifying the
+Ceph API and therefore:
+
+#. The versioned OpenAPI specification should be updated explicitly: ``tox -e openapi-fix``.
+#. The team @ceph/api will be requested for reviews (this is automated via GitHub CODEOWNERS), in order to assess the impact of changes.
+
+Additionally, Sphinx documentation can be generated from the OpenAPI
+specification with ``tox -e openapi-doc``.
+
+The Ceph RESTful OpenAPI specification is dynamically generated from the
+``Controllers`` in ``controllers/`` directory. However, by default it is not
+very detailed, so there are two decorators that can and should be used to add
+more information:
+
+* ``@EndpointDoc()`` for documentation of endpoints. It has four optional arguments
+ (explained below): ``description``, ``group``, ``parameters`` and
+ ``responses``.
+* ``@ControllerDoc()`` for documentation of controller or group associated with
+ the endpoints. It only takes the two first arguments: ``description`` and
+ ``group``.
+
+
+``description``: A string with a short (1-2 sentences) description of the object.
+
+
+``group``: By default, an endpoint is grouped together with other endpoints
+within the same controller class. ``group`` is a string that can be used to
+assign an endpoint or all endpoints in a class to another controller or a
+conceived group name.
+
+
+``parameters``: A dict used to describe path, query or request body parameters.
+By default, all parameters for an endpoint are listed on the Swagger UI page,
+including information of whether the parameter is optional/required and default
+values. However, there will be no description of the parameter and the parameter
+type will only be displayed in some cases.
+When adding information, each parameter should be described as in the example
+below. Note that the parameter type should be expressed as a built-in python
+type and not as a string. Allowed values are ``str``, ``int``, ``bool``, ``float``.
+
+.. code-block:: python
+
+ @EndpointDoc(parameters={'my_string': (str, 'Description of my_string')})
+ def method(my_string): pass
+
+For body parameters, more complex cases are possible. If the parameter is a
+dictionary, the type should be replaced with a ``dict`` containing its nested
+parameters. When describing nested parameters, the same format as other
+parameters is used. However, all nested parameters are set as required by default.
+If the nested parameter is optional this must be specified as for ``item2`` in
+the example below. If a nested parameter is set to optional, it is also
+possible to specify the default value (this will not be provided automatically
+for nested parameters).
+
+.. code-block:: python
+
+ @EndpointDoc(parameters={
+ 'my_dictionary': ({
+ 'item1': (str, 'Description of item1'),
+ 'item2': (str, 'Description of item2', True), # item2 is optional
+ 'item3': (str, 'Description of item3', True, 'foo'), # item3 is optional with 'foo' as default value
+ }, 'Description of my_dictionary')})
+ def method(my_dictionary): pass
+
+If the parameter is a ``list`` of primitive types, the type should be
+surrounded with square brackets.
+
+.. code-block:: python
+
+ @EndpointDoc(parameters={'my_list': ([int], 'Description of my_list')})
+ def method(my_list): pass
+
+If the parameter is a ``list`` with nested parameters, the nested parameters
+should be placed in a dictionary and surrounded with square brackets.
+
+.. code-block:: python
+
+ @EndpointDoc(parameters={
+ 'my_list': ([{
+ 'list_item': (str, 'Description of list_item'),
+ 'list_item2': (str, 'Description of list_item2')
+ }], 'Description of my_list')})
+ def method(my_list): pass
+
+
+``responses``: A dict used for describing responses. Rules for describing
+responses are the same as for request body parameters, with one difference:
+responses also need to be assigned to the related response code as in the
+example below:
+
+.. code-block:: python
+
+ @EndpointDoc(responses={
+ '400':{'my_response': (str, 'Description of my_response')}})
+ def method(): pass
+
+
+Error Handling in Python
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Good error handling is a key requirement in creating a good user experience
+and providing a good API.
+
+Dashboard code should not duplicate C++ code. Thus, if error handling in C++
+is sufficient to provide good feedback, a new wrapper to catch these errors
+is not necessary. On the other hand, input validation is the best place to
+catch errors and generate the best error messages. If required, generate
+errors as soon as possible.
+
+The backend provides a few standard ways of returning errors.
+
+First, there is a generic Internal Server Error::
+
+ Status Code: 500
+ {
+ "version": <cherrypy version, e.g. 13.1.0>,
+ "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.",
+ }
+
+
+For errors generated by the backend, we provide a standard error
+format::
+
+ Status Code: 400
+ {
+ "detail": str(e), # E.g. "[errno -42] <some error message>"
+ "component": "rbd", # this can be null to represent a global error code
+ "code": "3", # Or an error name, e.g. "code": "some_error_key"
+ }
+
+
+If the API endpoint uses ``@ViewCache`` to temporarily cache results,
+the error looks like so::
+
+ Status Code: 400
+ {
+ "detail": str(e), # E.g. "[errno -42] <some error message>"
+ "component": "rbd", # this can be null to represent a global error code
+ "code": "3", # Or an error name, e.g. "code": "some_error_key"
+ "status": 3, # Indicating the @ViewCache error status
+ }
+
+If the API endpoint uses a task, the error looks like so::
+
+ Status Code: 400
+ {
+ "detail": str(e), # E.g. "[errno -42] <some error message>"
+ "component": "rbd", # this can be null to represent a global error code
+ "code": "3", # Or an error name, e.g. "code": "some_error_key"
+ "task": { # Information about the task itself
+ "name": "taskname",
+ "metadata": {...}
+ }
+ }
+
+
+Our WebUI should show errors generated by the API to the user, especially
+field-related errors in wizards and dialogs, or show non-intrusive notifications.
+
+Handling exceptions in Python should be an exception. In general, we
+should have few exception handlers in our project. By default, propagate
+errors to the API, as it will take care of all exceptions anyway. In general,
+log the exception by adding ``logger.exception()`` with a description to the
+handler.
+
+We need to distinguish user errors from internal errors and
+programming errors. Using different exception types will ease the
+task for the API layer and for the user interface:
+
+Standard Python errors, like ``SystemError``, ``ValueError`` or ``KeyError``
+will end up as internal server errors in the API.
+
+In general, do not ``return`` error responses in the REST API. They will be
+returned by the error handler. Instead, raise the appropriate exception.
+
+Plug-ins
+~~~~~~~~
+
+New functionality can be provided by means of a plug-in architecture. Among the
+benefits this approach brings in, loosely coupled development is one of the most
+notable. As the Ceph Dashboard grows in feature richness, its code-base becomes
+more and more complex. The hook-based nature of a plug-in architecture allows
+functionality to be extended in a controlled manner, and isolates the scope of
+the changes.
+
+Ceph Dashboard relies on `Pluggy <https://pluggy.readthedocs.io>`_ to provide
+plug-in support. On top of pluggy, an interface-based approach has been
+implemented, with some safety checks (method override and abstract method
+checks).
+
+In order to create a new plugin, the following steps are required:
+
+#. Add a new file under ``src/pybind/mgr/dashboard/plugins``.
+#. Import the ``PLUGIN_MANAGER`` instance and the ``Interfaces``.
+#. Create a class extending the desired interfaces. The plug-in library will
+ check if all the methods of the interfaces have been properly overridden.
+#. Register the plugin in the ``PLUGIN_MANAGER`` instance.
+#. Import the plug-in from within the Ceph Dashboard ``module.py`` (currently no
+ dynamic loading is implemented).
+
+The available Mixins (helpers) are:
+
+- ``CanMgr``: provides the plug-in with access to the ``mgr`` instance under ``self.mgr``.
+
+The available Interfaces are:
+
+- ``Initializable``: requires overriding ``init()`` hook. This method is run at
+ the very beginning of the dashboard module, right after all imports have been
+ performed.
+- ``Setupable``: requires overriding ``setup()`` hook. This method is run in the
+ Ceph Dashboard ``serve()`` method, right after CherryPy has been configured,
+ but before it is started. It's a placeholder for the plug-in initialization
+ logic.
+- ``HasOptions``: requires overriding ``get_options()`` hook by returning a list
+ of ``Options()``. The options returned here are added to the
+ ``MODULE_OPTIONS``.
+- ``HasCommands``: requires overriding ``register_commands()`` hook by defining
+ the commands the plug-in can handle and decorating them with ``@CLICommand``.
+ The commands can be optionally returned, so that they can be invoked
+ externally (which makes unit testing easier).
+- ``HasControllers``: requires overriding ``get_controllers()`` hook by defining
+ and returning the controllers as usual.
+- ``FilterRequest.BeforeHandler``: requires overriding
+ ``filter_request_before_handler()`` hook. This method receives a
+ ``cherrypy.request`` object for processing. A usual implementation of this
+ method will allow some requests to pass or will raise a ``cherrypy.HTTPError``
+ based on the ``request`` metadata and other conditions.
+
+New interfaces and hooks should be added as soon as they are required to
+implement new functionality. The above list only comprises the hooks needed for
+the existing plugins.
+
+A sample plugin implementation would look like this:
+
+.. code-block:: python
+
+ # src/pybind/mgr/dashboard/plugins/mute.py
+
+ from . import PLUGIN_MANAGER as PM
+ from . import interfaces as I
+
+ from mgr_module import CLICommand, Option
+ import cherrypy
+
+ @PM.add_plugin
+ class Mute(I.CanMgr, I.Setupable, I.HasOptions, I.HasCommands,
+ I.FilterRequest.BeforeHandler, I.HasControllers):
+ @PM.add_hook
+ def get_options(self):
+ return [Option('mute', default=False, type='bool')]
+
+ @PM.add_hook
+ def setup(self):
+ self.mute = self.mgr.get_module_option('mute')
+
+ @PM.add_hook
+ def register_commands(self):
+ @CLICommand("dashboard mute")
+ def _(mgr):
+ self.mute = True
+ self.mgr.set_module_option('mute', True)
+ return 0
+
+ @PM.add_hook
+ def filter_request_before_handler(self, request):
+ if self.mute:
+ raise cherrypy.HTTPError(500, "I'm muted :-x")
+
+ @PM.add_hook
+ def get_controllers(self):
+ from ..controllers import ApiController, RESTController
+
+ @ApiController('/mute')
+ class MuteController(RESTController):
+ def get(_):
+ return self.mute
+
+ return [MuteController]
+
+
+Additionally, a helper for creating plugins ``SimplePlugin`` is provided. It
+facilitates the basic tasks (Options, Commands, and common Mixins). The previous
+plugin could be rewritten like this:
+
+.. code-block:: python
+
+  from . import PLUGIN_MANAGER as PM
+  from . import interfaces as I
+  from .plugin import SimplePlugin as SP
+
+  import cherrypy
+
+  @PM.add_plugin
+  class Mute(SP, I.Setupable, I.FilterRequest.BeforeHandler, I.HasControllers):
+      OPTIONS = [
+          SP.Option('mute', default=False, type='bool')
+      ]
+
+      def shut_up(self):
+          self.set_option('mute', True)
+          self.mute = True
+          return 0
+
+      COMMANDS = [
+          SP.Command("dashboard mute", handler=shut_up)
+      ]
+
+      @PM.add_hook
+      def setup(self):
+          self.mute = self.get_option('mute')
+
+      @PM.add_hook
+      def filter_request_before_handler(self, request):
+          if self.mute:
+              raise cherrypy.HTTPError(500, "I'm muted :-x")
+
+      @PM.add_hook
+      def get_controllers(self):
+          from ..controllers import ApiController, RESTController
+
+          @ApiController('/mute')
+          class MuteController(RESTController):
+              def get(_):
+                  return self.mute
+
+          return [MuteController]
diff --git a/doc/dev/developer_guide/debugging-gdb.rst b/doc/dev/developer_guide/debugging-gdb.rst
new file mode 100644
index 000000000..153144431
--- /dev/null
+++ b/doc/dev/developer_guide/debugging-gdb.rst
@@ -0,0 +1,43 @@
+GDB - The GNU Project Debugger
+==============================
+
+`The GNU Project Debugger (GDB) <https://www.sourceware.org/gdb>`_ is
+a powerful tool that allows you to analyze the execution flow
+of a process.
+GDB can help to find bugs, uncover crash errors or track the
+source code during execution of a development cluster.
+It can also be used to debug Teuthology test runs.
+
+GET STARTED WITH GDB
+--------------------
+
+Basic usage with examples can be found `here. <https://geeksforgeeks.org/gdb-command-in-linux-with-examples>`_
+GDB can be attached to a running process. For instance, after deploying a
+development cluster, the process number (PID) of a ``ceph-osd`` daemon can be found in::
+
+ $ cd build
+ $ cat out/osd.0.pid
+
+Attaching gdb to the process::
+
+ $ gdb ./bin/ceph-osd -p <pid>
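+
+The two steps above can be combined. The following is a minimal Python sketch
+and is not part of the Ceph tree: it assumes the ``out/<daemon>.pid`` and
+``bin/<binary>`` layout shown above, and the helper name
+``gdb_attach_command`` is hypothetical.
+
```python
from pathlib import Path


def gdb_attach_command(build_dir, daemon="osd.0", binary="ceph-osd"):
    """Return the argv list for attaching gdb to a vstart daemon.

    Assumes <build_dir>/out/<daemon>.pid holds the PID and that the
    daemon binary lives in <build_dir>/bin/ (the layout shown above).
    """
    build = Path(build_dir)
    pid_file = build / "out" / f"{daemon}.pid"
    # The pid file contains the process number followed by a newline.
    pid = int(pid_file.read_text().strip())
    return ["gdb", str(build / "bin" / binary), "-p", str(pid)]
```
+
+The returned list can be passed to ``subprocess.run()`` from within the
+``build`` directory, which has the same effect as the manual invocation.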
+
+.. note::
+ It is recommended to compile without any optimizations (``-O0`` gcc flag)
+ in order to avoid elimination of intermediate values.
+
+Stopping for breakpoints while debugging may cause timeouts, so the following
+configuration options are suggested::
+
+ [osd]
+ osd_op_thread_timeout = 1500
+ osd_op_thread_suicide_timeout = 1500
+
+Debugging Teuthology Tests
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``src/script/ceph-debug-docker.sh`` can be used to analyze Teuthology failures::
+
+ $ ./ceph-debug-docker.sh <branch-name>
+
+Refer to the script header for more information.
diff --git a/doc/dev/developer_guide/essentials.rst b/doc/dev/developer_guide/essentials.rst
new file mode 100644
index 000000000..5a31e430b
--- /dev/null
+++ b/doc/dev/developer_guide/essentials.rst
@@ -0,0 +1,346 @@
+Essentials (tl;dr)
+==================
+
+This chapter presents essential information that every Ceph developer needs
+to know.
+
+Leads
+-----
+
+The Ceph project was created by Sage Weil and is led by the Ceph Leadership
+Team (CLT). In addition, each major project component has its own lead. The
+following table shows all the leads and their nicks on `GitHub`_:
+
+.. _github: https://github.com/
+
+========= ================ =============
+Scope Lead GitHub nick
+========= ================ =============
+Ceph Sage Weil liewegas
+RADOS Neha Ojha neha-ojha
+RGW Yehuda Sadeh yehudasa
+RGW Matt Benjamin mattbenjamin
+RBD Ilya Dryomov dis
+CephFS Venky Shankar vshankar
+Dashboard Ernesto Puerta epuertat
+MON Joao Luis jecluis
+Build/Ops Ken Dreyer ktdreyer
+Docs Zac Dover zdover23
+========= ================ =============
+
+The Ceph-specific acronyms in the table are explained in
+:doc:`/architecture`.
+
+History
+-------
+
+See the `History chapter of the Wikipedia article`_.
+
+.. _`History chapter of the Wikipedia article`: https://en.wikipedia.org/wiki/Ceph_%28software%29#History
+
+Licensing
+---------
+
+Ceph is free software.
+
+Unless stated otherwise, the Ceph source code is distributed under the
+terms of the LGPL2.1 or LGPL3.0. For full details, see the file
+`COPYING`_ in the top-level directory of the source-code tree.
+
+.. _`COPYING`:
+ https://github.com/ceph/ceph/blob/master/COPYING
+
+Source code repositories
+------------------------
+
+The source code of Ceph lives on `GitHub`_ in a number of repositories below
+the `Ceph "organization"`_.
+
+.. _`Ceph "organization"`: https://github.com/ceph
+
+A working knowledge of git_ is essential to make a meaningful contribution to the project as a developer.
+
+.. _git: https://git-scm.com/doc
+
+Although the `Ceph "organization"`_ includes several software repositories,
+this document covers only one: https://github.com/ceph/ceph.
+
+Redmine issue tracker
+---------------------
+
+Although `GitHub`_ is used for code, Ceph-related issues (Bugs, Features,
+Backports, Documentation, etc.) are tracked at http://tracker.ceph.com,
+which is powered by `Redmine`_.
+
+.. _Redmine: http://www.redmine.org
+
+The tracker has a Ceph project with a number of subprojects loosely
+corresponding to the various architectural components (see
+:doc:`/architecture`).
+
+Mere `registration`_ in the tracker automatically grants permissions
+sufficient to open new issues and comment on existing ones.
+
+.. _registration: http://tracker.ceph.com/account/register
+
+To report a bug or propose a new feature, `jump to the Ceph project`_ and
+click on `New issue`_.
+
+.. _`jump to the Ceph project`: http://tracker.ceph.com/projects/ceph
+.. _`New issue`: http://tracker.ceph.com/projects/ceph/issues/new
+
+Slack
+-----
+
+Ceph's Slack is https://ceph-storage.slack.com/.
+
+.. _mailing-list:
+
+Mailing lists
+-------------
+
+Ceph Development Mailing List
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The ``dev@ceph.io`` list is for discussion about the development of Ceph,
+its interoperability with other technology, and the operations of the
+project itself.
+
+The email discussion list for Ceph development is open to all. Subscribe by
+sending a message to ``dev-request@ceph.io`` with the following line in the
+body of the message::
+
+ subscribe ceph-devel
+
+
+Ceph Client Patch Review Mailing List
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The ``ceph-devel@vger.kernel.org`` list is for discussion and patch review
+for the Linux kernel Ceph client component. Note that this list used to
+be an all-encompassing list for developers. When searching the archives,
+remember that this list contains the generic ceph-devel archives before mid-2018.
+
+Subscribe to the list covering the Linux kernel Ceph client component by sending
+a message to ``majordomo@vger.kernel.org`` with the following line in the body
+of the message::
+
+ subscribe ceph-devel
+
+
+Other Ceph Mailing Lists
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+There are also `other Ceph-related mailing lists`_.
+
+.. _`other Ceph-related mailing lists`: https://ceph.com/irc/
+
+.. _irc:
+
+
+IRC
+---
+
+In addition to mailing lists, the Ceph community also communicates in real time
+using `Internet Relay Chat`_.
+
+.. _`Internet Relay Chat`: http://www.irchelp.org/
+
+The Ceph community gathers in the #ceph channel of the Open and Free Technology
+Community (OFTC) IRC network.
+
+Created in 1988, Internet Relay Chat (IRC) is a relay-based, real-time chat
+protocol. It is mainly designed for group (many-to-many) communication in
+discussion forums called channels, but also allows one-to-one communication via
+private message. On IRC you can talk to many other members using Ceph, on
+topics ranging from idle chit-chat to support questions. Though a channel might
+have many people in it at any one time, they might not always be at their
+keyboard; so if no-one responds, just wait around and someone will hopefully
+answer soon enough.
+
+Registration
+~~~~~~~~~~~~
+
+If you intend to use the IRC service on a continued basis, you are advised to
+register an account. Registering gives you a unique IRC identity and allows you
+to access channels where unregistered users have been locked out for technical
+reasons.
+
+See `the official OFTC (Open and Free Technology Community) documentation's
+registration instructions
+<https://www.oftc.net/Services/#register-your-account>`_ to learn how to
+register your IRC account.
+
+Channels
+~~~~~~~~
+
+To connect to the OFTC IRC network, download an IRC client and configure it to
+connect to ``irc.oftc.net``. Then join one or more of the channels. Discussions
+inside #ceph are logged and archives are available online.
+
+Here are the real-time discussion channels for the Ceph community:
+
+ - #ceph
+ - #ceph-devel
+ - #cephfs
+ - #ceph-dashboard
+ - #ceph-orchestrators
+ - #sepia
+
+
+.. _submitting-patches:
+
+Submitting patches
+------------------
+
+The canonical instructions for submitting patches are contained in the
+file `CONTRIBUTING.rst`_ in the top-level directory of the source-code
+tree. There may be some overlap between this guide and that file.
+
+.. _`CONTRIBUTING.rst`:
+ https://github.com/ceph/ceph/blob/main/CONTRIBUTING.rst
+
+All newcomers are encouraged to read that file carefully.
+
+Building from source
+--------------------
+
+See instructions at :doc:`/install/build-ceph`.
+
+Using ccache to speed up local builds
+-------------------------------------
+`ccache`_ can speed up rebuilds of the Ceph source tree.
+
+Before you use `ccache`_ to speed up your rebuilds of the Ceph source tree,
+make sure that your source tree is clean and builds without failures.
+
+Old build artifacts can cause build failures. You might introduce these
+artifacts unknowingly when switching from one branch to another. If you see
+build errors when you attempt a local build, follow the procedure below to
+clean your source tree.
+
+Cleaning the Source Tree
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. prompt:: bash $
+
+ ninja clean
+
+.. note:: The following commands will remove everything in the source tree
+ that isn't tracked by git. Make sure to back up your log files
+ and configuration options before running these commands.
+
+.. prompt:: bash $
+
+ git clean -fdx; git submodule foreach git clean -fdx
+
+Building Ceph with ccache
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+``ccache`` is available as a package in most distros. To build Ceph with
+ccache, run the following command.
+
+.. prompt:: bash $
+
+ cmake -DWITH_CCACHE=ON ..
+
+Using ccache to Speed Up Build Times
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+``ccache`` can be used for speeding up all builds of the system. For more
+details, refer to the `run modes`_ section of the ccache manual. The default
+settings of ``ccache`` can be displayed with the ``ccache -s`` command.
+
+.. note:: We recommend overriding the ``max_size``. The default is 10G.
+ Use a larger value, like 25G. Refer to the `configuration`_ section
+ of the ccache manual for more information.
+
+To further increase the cache hit rate and reduce compile times in a
+development environment, set the version information and build timestamps to
+fixed values. This makes it unnecessary to rebuild the binaries that contain
+this information.
+
+This can be achieved by adding the following settings to the ``ccache``
+configuration file ``ccache.conf``::
+
+ sloppiness = time_macros
+ run_second_cpp = true
+
+Now, set the environment variable ``SOURCE_DATE_EPOCH`` to a fixed value (a
+UNIX timestamp) and set ``ENABLE_GIT_VERSION`` to ``OFF`` when running
+``cmake``:
+
+.. prompt:: bash $
+
+ export SOURCE_DATE_EPOCH=946684800
+ cmake -DWITH_CCACHE=ON -DENABLE_GIT_VERSION=OFF ..
+
+.. note:: Binaries produced with these build options are not suitable for
+ production or debugging purposes, as they do not contain the correct build
+ time and git version information.
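+
+The value ``946684800`` used above corresponds to midnight UTC on 1 January
+2000, but any fixed timestamp works. The following Python sketch shows one
+way to derive or check such a value. The helper names ``to_epoch`` and
+``from_epoch`` are illustrative and do not come from the Ceph or ccache
+sources.
+
```python
from datetime import datetime, timezone


def to_epoch(year, month, day):
    """Return the UNIX timestamp for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def from_epoch(timestamp):
    """Return the ISO 8601 UTC date and time for a UNIX timestamp."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


print(to_epoch(2000, 1, 1))    # 946684800
print(from_epoch(946684800))   # 2000-01-01T00:00:00+00:00
```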
+
+.. _`ccache`: https://ccache.samba.org/
+.. _`run modes`: https://ccache.samba.org/manual.html#_run_modes
+.. _`configuration`: https://ccache.samba.org/manual.html#_configuration
+
+Development-mode cluster
+------------------------
+
+See :doc:`/dev/quick_guide`.
+
+Kubernetes/Rook development cluster
+-----------------------------------
+
+See :ref:`kubernetes-dev`.
+
+.. _backporting:
+
+Backporting
+-----------
+
+All bugfixes should be merged to the ``main`` branch before being
+backported. To flag a bugfix for backporting, make sure it has a
+`tracker issue`_ associated with it and set the ``Backport`` field to a
+comma-separated list of previous releases (e.g. "hammer,jewel") that you think
+need the backport.
+The rest (including the actual backporting) will be taken care of by the
+`Stable Releases and Backports`_ team.
+
+.. _`tracker issue`: http://tracker.ceph.com/
+.. _`Stable Releases and Backports`: http://tracker.ceph.com/projects/ceph-releases/wiki
+
+Dependabot
+----------
+
+Dependabot is a GitHub bot that scans the dependencies in the repositories for
+security vulnerabilities (CVEs). If a fix is available for a discovered CVE,
+Dependabot creates a pull request to update the dependency.
+
+Dependabot also indicates the compatibility score of the upgrade. This score is
+based on the number of CI failures that occur in other GitHub repositories
+where the fix was applied.
+
+With some configuration, Dependabot can perform non-security updates (for
+example, it can upgrade to the latest minor version or patch version).
+
+Dependabot supports `several languages and package managers
+<https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/about-dependabot-version-updates#supported-repositories-and-ecosystems>`_.
+As of July 2022, the Ceph project receives alerts only from pip (based on the
+``requirements.txt`` files) and npm (``package*.json``). It is possible to extend
+these alerts to git submodules, Golang, and Java. As of July 2022, there is no
+support for C++ package managers such as vcpkg, conan, or C++20 modules.
+
+Many of the dependencies discovered by Dependabot are best updated somewhere
+other than the Ceph GitHub repository (distribution packages, for example,
+are a better place to update some of the dependencies). Nonetheless, the
+list of new and existing vulnerabilities generated by Dependabot is
+useful.
+
+`Here is an example of a Dependabot pull request.
+<https://github.com/ceph/ceph/pull/46998>`_
+
+Guidance for use of cluster log
+-------------------------------
+
+If your patches emit messages to the Ceph cluster log, please consult
+this: :doc:`/dev/logging`.
diff --git a/doc/dev/developer_guide/index.rst b/doc/dev/developer_guide/index.rst
new file mode 100644
index 000000000..e9832bea6
--- /dev/null
+++ b/doc/dev/developer_guide/index.rst
@@ -0,0 +1,25 @@
+============================================
+Contributing to Ceph: A Guide for Developers
+============================================
+
+:Author: Loic Dachary
+:Author: Nathan Cutler
+:License: Creative Commons Attribution Share Alike 3.0 (CC-BY-SA-3.0)
+
+.. note:: You may also be interested in the :doc:`/dev/internals` documentation.
+
+.. toctree::
+ :maxdepth: 1
+
+ Introduction <intro>
+ Essentials <essentials>
+ What is Merged and When <merging>
+ Issue tracker <issue-tracker>
+ Basic workflow <basic-workflow>
+ Tests: Unit Tests <tests-unit-tests>
+ Tests: Integration Tests (Teuthology) <testing_integration_tests/index>
+ Tests: Running Tests (Locally) <running-tests-locally>
+ Ceph Dashboard Developer Documentation (formerly HACKING.rst) <dash-devel>
+ Tracing Developer Documentation <jaegertracing>
+ Cephadm Developer Documentation <../cephadm/index>
+ Debugging with GDB <debugging-gdb>
diff --git a/doc/dev/developer_guide/intro.rst b/doc/dev/developer_guide/intro.rst
new file mode 100644
index 000000000..67b449c55
--- /dev/null
+++ b/doc/dev/developer_guide/intro.rst
@@ -0,0 +1,25 @@
+Introduction
+============
+
+This guide has two aims. First, it should lower the barrier to entry for
+software developers who wish to get involved in the Ceph project. Second,
+it should serve as a reference for Ceph developers.
+
+We assume that readers are already familiar with Ceph (the distributed
+object store and file system designed to provide excellent performance,
+reliability and scalability). If not, please refer to the `project website`_
+and especially the `publications list`_. Another way to learn about what's
+happening in Ceph is to check out our `youtube channel`_ , where we post Tech
+Talks, Code walk-throughs and Ceph Developer Monthly recordings.
+
+.. _`project website`: https://ceph.com
+.. _`publications list`: https://ceph.com/publications/
+.. _`youtube channel`: https://www.youtube.com/c/CephStorage
+
+Since this document is to be consumed by developers, who are assumed to
+have Internet access, topics covered elsewhere, either within the Ceph
+documentation or elsewhere on the web, are treated by linking. If you
+notice that a link is broken or if you know of a better link, please
+`report it as a bug`_.
+
+.. _`report it as a bug`: http://tracker.ceph.com/projects/ceph/issues/new
diff --git a/doc/dev/developer_guide/issue-tracker.rst b/doc/dev/developer_guide/issue-tracker.rst
new file mode 100644
index 000000000..eae68f3f0
--- /dev/null
+++ b/doc/dev/developer_guide/issue-tracker.rst
@@ -0,0 +1,39 @@
+.. _issue-tracker:
+
+Issue Tracker
+=============
+
+See `Redmine Issue Tracker`_ for a brief introduction to the Ceph Issue
+Tracker.
+
+Ceph developers use the issue tracker to:
+
+1. keep track of issues - bugs, fix requests, feature requests, backport
+   requests, etc.
+
+2. communicate with other developers and keep them informed as work
+   on the issues progresses.
+
+Issue tracker conventions
+-------------------------
+
+When you start working on an existing issue, it's nice to let the other
+developers know this - to avoid duplication of labor. Typically, this is
+done by changing the :code:`Assignee` field (to yourself) and changing the
+:code:`Status` to *In progress*. However, newcomers to the Ceph community
+typically do not have sufficient privileges to update these fields; they can
+instead update the issue with a brief note.
+
+.. table:: Meanings of some commonly used statuses
+
+ ================ ===========================================
+ Status Meaning
+ ================ ===========================================
+ New Initial status
+ In Progress Somebody is working on it
+ Need Review Pull request is open with a fix
+ Pending Backport Fix has been merged, backport(s) pending
+ Resolved Fix and backports (if any) have been merged
+ ================ ===========================================
+
+.. _Redmine issue tracker: https://tracker.ceph.com
diff --git a/doc/dev/developer_guide/jaegertracing.rst b/doc/dev/developer_guide/jaegertracing.rst
new file mode 100644
index 000000000..73a48ad83
--- /dev/null
+++ b/doc/dev/developer_guide/jaegertracing.rst
@@ -0,0 +1,63 @@
+JAEGER - DISTRIBUTED TRACING
+============================
+
+Jaeger and OpenTracing provide ready-to-use tracing services for distributed
+systems and are becoming widely used standards because of their simplicity.
+
+We use a modified `jaeger-cpp-client
+<https://github.com/ceph/jaeger-client-cpp>`_ as the backend for the
+OpenTracing API. It is responsible for the collection of spans. These spans
+are created using smart pointers and carry the timestamp, the TraceID, and other
+meta info, such as a specific tag or log associated with the span, that uniquely
+identifies the span across the distributed system.
+
+
+BASIC ARCHITECTURE AND TERMINOLOGY
+----------------------------------
+
+Refer to the `Ceph Tracing documentation <../../../jaegertracing/#basic-architecture-and-terminology>`_.
+
+
+HOW TO GET STARTED USING TRACING?
+---------------------------------
+
+Enabling Jaeger tracing in Ceph requires deploying the Jaeger daemons and
+compiling Ceph with Jaeger support. For developers, this is orchestrated for
+use in a vstart cluster, which uses the Jaeger `all-in-one docker
+<https://www.jaegertracing.io/docs/1.22/getting-started/#all-in-one>`_ image.
+This image is not recommended for production, but it is suitable for testing.
+The steps are as follows:
+
+ 1. Update system with Jaeger dependencies, using install-deps::
+
+ $ WITH_JAEGER=true ./install-deps.sh
+
+ 2. Compile Ceph with Jaeger enabled:
+
+ - for precompiled build::
+
+ $ cd build
+ $ cmake -DWITH_JAEGER=ON ..
+
+ - for fresh compilation using do_cmake.sh::
+
+ $ ./do_cmake.sh -DWITH_JAEGER=ON && ninja vstart
+
+ 3. After compiling successfully, start a vstart cluster with ``--jaeger``, which
+ will deploy `jaeger all-in-one <https://www.jaegertracing.io/docs/1.20/getting-started/#all-in-one>`_
+ using a container deployment service (docker/podman)::
+
+ $ MON=1 MGR=0 OSD=1 ../src/vstart.sh --with-jaeger
+
+ If the deployment is unsuccessful, you can deploy the `all-in-one
+ <https://www.jaegertracing.io/docs/1.20/getting-started/#all-in-one>`_
+ service manually and start the vstart cluster without Jaeger.
+
+
+ 4. Test the traces using rados-bench write::
+
+ $ bin/rados -p test bench 5 write --no-cleanup
+
+.. seealso::
+ `using-jaeger-cpp-client-for-distributed-tracing-in-ceph <https://medium.com/@deepikaupadhyay/using-jaeger-cpp-client-for-distributed-tracing-in-ceph-8b1f4906ca2>`_
\ No newline at end of file
diff --git a/doc/dev/developer_guide/merging.rst b/doc/dev/developer_guide/merging.rst
new file mode 100644
index 000000000..7e41bd483
--- /dev/null
+++ b/doc/dev/developer_guide/merging.rst
@@ -0,0 +1,138 @@
+.. _merging:
+
+Commit merging: scope and cadence
+==================================
+
+Commits are merged into branches according to criteria specific to each phase
+of the Ceph release lifecycle. This chapter codifies these criteria.
+
+Development releases (i.e. x.0.z)
+---------------------------------
+
+What ?
+^^^^^^
+
+* Features
+* Bug fixes
+
+Where ?
+^^^^^^^
+
+Features are merged to the *main* branch. Bug fixes should be merged to the
+corresponding named branch (e.g. *nautilus* for 14.0.z, *pacific* for 16.0.z,
+etc.). However, this is not mandatory - bug fixes and documentation
+enhancements can be merged to the *main* branch as well, since the *main*
+branch is itself occasionally merged to the named branch during the development
+releases phase. In either case, if a bug fix is important it can also be
+flagged for backport to one or more previous stable releases.
+
+When ?
+^^^^^^
+
+After each stable release, candidate branches for previous releases enter
+phase 2 (see below). For example: the *jewel* named branch was created when
+the *infernalis* release candidates entered phase 2. From this point on,
+*main* was no longer associated with *infernalis*. After the named branch of
+the next stable release is created, *main* will be occasionally merged into
+it.
+
+Branch merges
+^^^^^^^^^^^^^
+
+* The latest stable release branch is merged periodically into main.
+* The main branch is merged periodically into the branch of the stable release.
+* The main branch is merged into the stable release branch
+ immediately after each development (x.0.z) release.
+
+Stable release candidates (i.e. x.1.z) phase 1
+----------------------------------------------
+
+What ?
+^^^^^^
+
+* Bug fixes only
+
+Where ?
+^^^^^^^
+
+The stable release branch (e.g. *jewel* for 10.0.z, *luminous*
+for 12.0.z, etc.) or *main*. Bug fixes should be merged to the named
+branch corresponding to the stable release candidate (e.g. *jewel* for
+10.1.z) or to *main*. During this phase, all commits to *main* will be
+merged to the named branch, and vice versa. In other words, it makes
+no difference whether a commit is merged to the named branch or to
+*main* - it will make it into the next release candidate either way.
+
+When ?
+^^^^^^
+
+After the first stable release candidate is published, i.e. after the
+x.1.0 tag is set in the release branch.
+
+Branch merges
+^^^^^^^^^^^^^
+
+* The stable release branch is merged periodically into *main*.
+* The *main* branch is merged periodically into the stable release branch.
+* The *main* branch is merged into the stable release branch
+ immediately after each x.1.z release candidate.
+
+Stable release candidates (i.e. x.1.z) phase 2
+----------------------------------------------
+
+What ?
+^^^^^^
+
+* Bug fixes only
+
+Where ?
+^^^^^^^
+
+The stable release branch (e.g. *mimic* for 13.0.z, *octopus* for 15.0.z,
+etc.). During this phase, all commits to the named branch will be merged into
+*main*. Cherry-picking to the named branch during release candidate phase 2
+is performed manually since the official backporting process begins only when
+the release is pronounced "stable".
+
+When ?
+^^^^^^
+
+After the CLT announces that it is time for phase 2 to happen.
+
+Branch merges
+^^^^^^^^^^^^^
+
+* The stable release branch is occasionally merged into main.
+
+Stable releases (i.e. x.2.z)
+----------------------------
+
+What ?
+^^^^^^
+
+* Bug fixes
+* Features are sometimes accepted
+* Commits should be cherry-picked from *main* when possible
+* Commits that are not cherry-picked from *main* must pertain to a bug unique to
+ the stable release
+* See also the `backport HOWTO`_ document
+
+.. _`backport HOWTO`:
+ http://tracker.ceph.com/projects/ceph-releases/wiki/HOWTO#HOWTO
+
+Where ?
+^^^^^^^
+
+The stable release branch (*hammer* for 0.94.x, *infernalis* for 9.2.x,
+etc.)
+
+When ?
+^^^^^^
+
+After the stable release is published, i.e. after the "vx.2.0" tag is set in
+the release branch.
+
+Branch merges
+^^^^^^^^^^^^^
+
+Never
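+
+The section headings in this chapter use the second component of the version
+number to identify the release phase (x.0.z, x.1.z, x.2.z). The following
+Python sketch illustrates that convention. The function name
+``release_phase`` is hypothetical, and older releases such as *hammer*
+(0.94.x) predate this numbering scheme, so the sketch reports them as
+``unknown``.
+
```python
# Second version component -> release phase, as used in the headings above.
PHASES = {0: "development", 1: "release candidate", 2: "stable"}


def release_phase(version):
    """Map an 'x.y.z' version string to its release phase."""
    # Strip an optional leading "v", as used in tags such as "v18.2.2".
    _major, minor, _patch = (int(p) for p in version.lstrip("v").split("."))
    return PHASES.get(minor, "unknown")


for version in ("16.0.3", "10.1.0", "v18.2.2", "0.94.5"):
    print(version, "->", release_phase(version))
```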
diff --git a/doc/dev/developer_guide/running-tests-locally.rst b/doc/dev/developer_guide/running-tests-locally.rst
new file mode 100644
index 000000000..262683bfb
--- /dev/null
+++ b/doc/dev/developer_guide/running-tests-locally.rst
@@ -0,0 +1,171 @@
+Running Unit Tests
+==================
+
+How to run s3-tests locally
+---------------------------
+
+RGW code can be tested by building Ceph locally from source, starting a vstart
+cluster, and running the "s3-tests" suite against it.
+
+The following instructions should work on jewel and above.
+
+Step 1 - build Ceph
+^^^^^^^^^^^^^^^^^^^
+
+Refer to :doc:`/install/build-ceph`.
+
+You can do step 2 separately while it is building.
+
+Step 2 - vstart
+^^^^^^^^^^^^^^^
+
+When the build completes, and still in the top-level directory of the git
+clone where you built Ceph, do the following, for cmake builds::
+
+ cd build/
+ RGW=1 ../src/vstart.sh -n
+
+This will produce a lot of output as the vstart cluster is started up. At the
+end you should see a message like::
+
+ started. stop.sh to stop. see out/* (e.g. 'tail -f out/????') for debug output.
+
+This means the cluster is running.
+
+
+Step 3 - run s3-tests
+^^^^^^^^^^^^^^^^^^^^^
+
+.. highlight:: console
+
+To run the s3tests suite do the following::
+
+ $ ../qa/workunits/rgw/run-s3tests.sh
+
+
+Running tests using vstart_runner.py
+------------------------------------
+CephFS and Ceph Manager code is tested using `vstart_runner.py`_.
+
+Running your first test
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+The Python tests in the Ceph repository can be executed on your local machine
+using `vstart_runner.py`_. To do that, you need `teuthology`_ installed::
+
+ $ virtualenv --python=python3 venv
+ $ source venv/bin/activate
+ $ pip install 'setuptools >= 12'
+ $ pip install teuthology[test]@git+https://github.com/ceph/teuthology
+ $ deactivate
+
+The above steps install teuthology in a virtual environment. Before running
+a test locally, build Ceph successfully from the source (refer to
+:doc:`/install/build-ceph`) and do::
+
+ $ cd build
+ $ ../src/vstart.sh -n -d -l
+ $ source ~/path/to/teuthology/venv/bin/activate
+
+To run a specific test, say `test_reconnect_timeout`_ from
+`TestClientRecovery`_ in ``qa/tasks/cephfs/test_client_recovery``, you can
+do::
+
+ $ python ../qa/tasks/vstart_runner.py tasks.cephfs.test_client_recovery.TestClientRecovery.test_reconnect_timeout
+
+The above command runs vstart_runner.py and passes the test to be executed as
+an argument to vstart_runner.py. In a similar way, you can also run a group
+of tests in the following manner::
+
+ $ # run all tests in class TestClientRecovery
+ $ python ../qa/tasks/vstart_runner.py tasks.cephfs.test_client_recovery.TestClientRecovery
+ $ # run all tests in test_client_recovery.py
+ $ python ../qa/tasks/vstart_runner.py tasks.cephfs.test_client_recovery
+
+Based on the argument passed, vstart_runner.py collects the matching tests and
+executes them as it would execute a single test.
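+
+The test arguments shown above follow the Python dotted-name form of the test
+file's path below ``qa/``, optionally followed by a class name and a method
+name. The following sketch builds such an argument. The helper name
+``dotted_test_name`` is hypothetical and is not part of vstart_runner.py.
+
```python
from pathlib import PurePosixPath


def dotted_test_name(path, cls=None, method=None):
    """Build a vstart_runner.py test argument from a test file path.

    `path` is relative to the repository root, for example
    "qa/tasks/cephfs/test_client_recovery.py".
    """
    if method and not cls:
        raise ValueError("a method name requires a class name")
    parts = list(PurePosixPath(path).with_suffix("").parts)
    # The dotted name starts at "tasks", so drop the leading "qa" directory.
    if parts and parts[0] == "qa":
        parts = parts[1:]
    parts += [name for name in (cls, method) if name]
    return ".".join(parts)


print(dotted_test_name("qa/tasks/cephfs/test_client_recovery.py",
                       "TestClientRecovery", "test_reconnect_timeout"))
```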
+
+vstart_runner.py can take the following options:
+
+--clear-old-log         deletes old log file before running the test
+--create                create Ceph cluster before running a test
+--create-cluster-only   creates the cluster and quits; tests can be issued
+                        later
+--interactive           drops a Python shell when a test fails
+--log-ps-output         logs ps output; might be useful while debugging
+--teardown              tears Ceph cluster down after test(s) has finished
+                        running
+--kclient               use the kernel cephfs client instead of FUSE
+--brxnet=<net/mask>     specify a new net/mask for the mount clients' network
+                        namespace container (Default: 192.168.0.0/16)
+
+.. note:: If using the FUSE client, ensure that the fuse package is installed
+ and enabled on the system and that ``user_allow_other`` is added
+ to ``/etc/fuse.conf``.
+
+.. note:: If using the kernel client, the user must have the ability to run
+ commands with passwordless sudo access.
+
+.. note:: A failure on the kernel client may crash the host, so it's
+ recommended to use this functionality within a virtual machine.
+
+Internal workings of vstart_runner.py
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+vstart_runner.py primarily does three things:
+
+* collects and runs the tests
+ vstart_runner.py sets up and tears down the cluster and collects and runs the
+ tests. This is implemented using the methods ``scan_tests()``, ``load_tests()``
+ and ``exec_test()``. This is where all the options that vstart_runner.py
+ takes are implemented along with other features like logging and copying
+ the traceback to the bottom of the log.
+
+* provides an interface for issuing and testing shell commands
+ The tests are written assuming that the cluster exists on remote machines.
+ vstart_runner.py provides an interface to run the same tests with the
+ cluster that exists within the local machine. This is done using the class
+ ``LocalRemote``. Class ``LocalRemoteProcess`` can manage the process that
+ executes the commands from ``LocalRemote``, class ``LocalDaemon`` provides
+ an interface to handle Ceph daemons and class ``LocalFuseMount`` can
+ create and handle FUSE mounts.
+
+* provides an interface to operate the Ceph cluster
+ ``LocalCephManager`` provides methods to run Ceph cluster commands with
+ and without admin socket and ``LocalCephCluster`` provides methods to set
+ or clear ``ceph.conf``.
+
+.. note:: vstart_runner.py deletes "adjust-ulimits" and "ceph-coverage" from
+ the command arguments unconditionally since they are not applicable
+ when tests are run on a developer's machine.
+
+.. note:: "omit_sudo" is re-set to False unconditionally in cases of commands
+ "passwd" and "chown".
+
+.. note:: The presence of a binary file named after the first argument is
+ checked in ``<ceph-repo-root>/build/bin/``. If present, the first
+ argument is replaced with the path to the binary file.
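+
+The lookup described in the last note can be pictured with the following
+Python sketch. It only illustrates the described behaviour and is not the
+code used in vstart_runner.py; the function name ``resolve_binary`` and its
+parameters are assumptions made for this example.
+
```python
from pathlib import Path


def resolve_binary(args, bin_dir):
    """Return a copy of `args` with args[0] replaced by its path in `bin_dir`.

    The replacement happens only when a file named after the first argument
    exists in `bin_dir`; otherwise the arguments are returned unchanged.
    """
    if not args:
        return []
    candidate = Path(bin_dir) / args[0]
    if candidate.is_file():
        return [str(candidate), *args[1:]]
    return list(args)
```
+
+With ``bin_dir`` set to ``<ceph-repo-root>/build/bin``, a command such as
+``["ceph", "status"]`` would resolve to the locally built ``ceph`` binary,
+while a command that has no counterpart in that directory is left untouched.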
+
+Running Workunits Using vstart_environment.sh
+---------------------------------------------
+
+Code can be tested by building Ceph locally from source, starting a vstart
+cluster, and running any suite against it.
+As with s3-tests, other workunits can be run against the cluster once your environment is configured.
+
+Set up the environment
+^^^^^^^^^^^^^^^^^^^^^^
+
+Configure your environment::
+
+ $ . ./build/vstart_environment.sh
+
+Running a test
+^^^^^^^^^^^^^^
+
+To run a workunit (e.g. ``mon/osd.sh``) do the following::
+
+ $ ./qa/workunits/mon/osd.sh
+
+.. _test_reconnect_timeout: https://github.com/ceph/ceph/blob/master/qa/tasks/cephfs/test_client_recovery.py#L133
+.. _TestClientRecovery: https://github.com/ceph/ceph/blob/master/qa/tasks/cephfs/test_client_recovery.py#L86
+.. _teuthology: https://github.com/ceph/teuthology
+.. _vstart_runner.py: https://github.com/ceph/ceph/blob/master/qa/tasks/vstart_runner.py
diff --git a/doc/dev/developer_guide/testing_integration_tests/index.rst b/doc/dev/developer_guide/testing_integration_tests/index.rst
new file mode 100644
index 000000000..363e2d212
--- /dev/null
+++ b/doc/dev/developer_guide/testing_integration_tests/index.rst
@@ -0,0 +1,16 @@
+=======================
+Teuthology User Guide
+=======================
+
+.. rubric:: Contents
+
+.. toctree::
+ :glob:
+ :titlesonly:
+
+ Introduction <tests-integration-testing-teuthology-intro>
+ Workflow <tests-integration-testing-teuthology-workflow>
+ Debugging Tips <tests-integration-testing-teuthology-debugging-tips>
+ Kernel Development <tests-integration-testing-teuthology-kernel>
+ Sentry Notes <tests-sentry-developers-guide>
+
diff --git a/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-debugging-tips.rst b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-debugging-tips.rst
new file mode 100644
index 000000000..a959240ba
--- /dev/null
+++ b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-debugging-tips.rst
@@ -0,0 +1,158 @@
+.. _tests-integration-testing-teuthology-debugging-tips:
+
+Analyzing and Debugging A Teuthology Job
+========================================
+
+To learn more about how to schedule an integration test, refer to `Scheduling
+Test Run`_.
+
+Viewing Test Results
+--------------------
+
+When a teuthology run has been completed successfully, use the `pulpito`_ dashboard
+to view the results::
+
+ http://pulpito.front.sepia.ceph.com/<job-name>/<job-id>/
+
+.. _pulpito: https://pulpito.ceph.com
+
+or ssh into the teuthology server to view the results of the integration test:
+
+ .. prompt:: bash $
+
+ ssh <username>@teuthology.front.sepia.ceph.com
+
+and access `teuthology archives`_, as in this example:
+
+ .. prompt:: bash $
+
+ nano /a/teuthology-2021-01-06_07:01:02-rados-master-distro-basic-smithi/
+
+.. note:: This requires you to have access to the Sepia lab. To learn how to
+ request access to the Sepia lab, see:
+ https://ceph.github.io/sepia/adding_users/
+
+Identifying Failed Jobs
+-----------------------
+
+On pulpito, a job in red means either a failed job or a dead job. A job is a
+combination of daemons and configurations defined in the yaml fragments in
+`qa/suites`_. Teuthology uses these configurations and runs the tasks listed
+in `qa/tasks`_, which are commands that set up the test environment and test
+Ceph's components. These tasks cover a large subset of use cases and help to
+expose bugs not exposed by `make check`_ testing.
+
+.. _make check: ../tests-integration-testing-teuthology-intro/#make-check
+
+A job failure might be caused by one or more of the following reasons:
+
+* environment setup (`testing on varied
+ systems <https://github.com/ceph/ceph/tree/master/qa/distros/supported>`_):
+ testing compatibility with stable releases for supported versions.
+
+* permutation of config values: for instance, `qa/suites/rados/thrash
+ <https://github.com/ceph/ceph/tree/master/qa/suites/rados/thrash>`_ ensures
+ that we run thrashing tests against Ceph under stressful workloads so that we
+ can catch corner-case bugs. The final setup config yaml file used for testing
+ can be accessed at::
+
+ /a/<job-name>/<job-id>/orig.config.yaml
+
+More details about config.yaml can be found at `detailed test config`_.
+
+Triaging the cause of failure
+------------------------------
+
+When a job fails, you will need to read its teuthology log in order to triage
+the cause of its failure. Use the job's name and id from pulpito to locate your
+failed job's teuthology log::
+
+ http://qa-proxy.ceph.com/<job-name>/<job-id>/teuthology.log
+
+Open the log file::
+
+ /a/<job-name>/<job-id>/teuthology.log
+
+For example:
+
+ .. prompt:: bash $
+
+ nano /a/teuthology-2021-01-06_07:01:02-rados-master-distro-basic-smithi/5759282/teuthology.log
+
+Every job failure is recorded in the teuthology log as a Traceback and is
+added to the job summary.
+
+Find the ``Traceback`` keyword and search the call stack and the logs for
+issues that caused the failure. Usually the traceback will include the command
+that failed.
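+
+The search described above can be sketched in a few lines of Python. This is
+only an illustration, not part of teuthology: the function name
+``find_tracebacks`` and its ``context`` parameter are hypothetical, and the
+sketch assumes only that a failure shows up as a line containing the word
+``Traceback``.
+
```python
def find_tracebacks(log_lines, context=5):
    """Return (line_number, excerpt) pairs for lines containing 'Traceback'.

    line_number is 1-based. excerpt holds the matching line followed by up
    to `context` of the lines that come after it.
    """
    hits = []
    for index, line in enumerate(log_lines):
        if "Traceback" in line:
            excerpt = log_lines[index:index + context + 1]
            hits.append((index + 1, excerpt))
    return hits


# Made-up log content, used only to exercise the function.
sample_log = [
    "first line of some log",
    "Traceback (most recent call last):",
    "  (stack frames would appear here)",
    "SomeError: a command failed",
]
for number, excerpt in find_tracebacks(sample_log, context=2):
    print(number, excerpt[-1])
```
+
+Given the lines of a downloaded ``teuthology.log``, the function returns the
+line number of every traceback together with the lines that follow it, which
+is usually where the failed command appears.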
+
+.. note:: The teuthology logs are deleted from time to time. If you are unable
+ to access the link in this example, just use any other case from
+ http://pulpito.front.sepia.ceph.com/
+
+Reporting the Issue
+-------------------
+
+In short: first check to see if your job failure was caused by a known issue,
+and if it wasn't, raise a tracker ticket.
+
+After you have triaged the cause of the failure and determined that it was not
+caused by the changes that you made to the code, this might indicate that you
+have encountered a known failure in the upstream branch (in the example we're
+considering in this section, the upstream branch is "octopus"). In that case,
+go to https://tracker.ceph.com and look for tracker issues related to the
+failure by using keywords spotted in the failure under investigation.
+
+If you find a similar issue on https://tracker.ceph.com, leave a comment on
+that issue explaining the failure as you understand it and make sure to
+include a link to your recent test run. If you don't find a similar issue,
+create a new tracker ticket for this issue and explain the cause of your job's
+failure as thoroughly as you can. If you're not sure what caused the job's
+failure, ask one of the team members for help.
+
+Debugging an issue using interactive-on-error
+---------------------------------------------
+
+When you encounter a job failure during testing, you should attempt to
+reproduce it. This is where ``--interactive-on-error`` comes in. This
+section explains how to use ``interactive-on-error`` and what it does.
+
+When you have verified that a job has failed, run the same job again in
+teuthology but add the `interactive-on-error`_ flag::
+
+ ideepika@teuthology:~/teuthology$ ./virtualenv/bin/teuthology -v --lock --block $<your-config-yaml> --interactive-on-error
+
+Use either `custom config.yaml`_ or the yaml file from the failed job. If
+you use the yaml file from the failed job, copy ``orig.config.yaml`` to
+your local directory::
+
+ ideepika@teuthology:~/teuthology$ cp /a/teuthology-2021-01-06_07:01:02-rados-master-distro-basic-smithi/5759282/orig.config.yaml test.yaml
+ ideepika@teuthology:~/teuthology$ ./virtualenv/bin/teuthology -v --lock --block test.yaml --interactive-on-error
+
+If a job fails when the ``interactive-on-error`` flag is used, teuthology
+will lock the machines required by ``config.yaml``. Teuthology will halt
+the testing machines and hold them in the state that they were in at the
+time of the job failure. You will be put into an interactive python
+session. From there, you can ssh into the system to investigate the cause
+of the job failure.
+
+After you have investigated the failure, just terminate the session.
+Teuthology will then clean up the session and unlock the machines.
+
+Suggested Resources
+--------------------
+
+ * `Testing Ceph: Pains & Pleasures <https://www.youtube.com/watch?v=gj1OXrKdSrs>`_
+ * `Teuthology Training <https://www.youtube.com/playlist?list=PLrBUGiINAakNsOwHaIM27OBGKezQbUdM->`_
+ * `Intro to Teuthology <https://www.youtube.com/watch?v=WiEUzoS6Nc4>`_
+
+.. _Scheduling Test Run: ../tests-integration-testing-teuthology-workflow/#scheduling-test-run
+.. _detailed test config: https://docs.ceph.com/projects/teuthology/en/latest/detailed_test_config.html
+.. _teuthology archives: ../tests-integration-testing-teuthology-workflow/#teuthology-archives
+.. _qa/suites: https://github.com/ceph/ceph/tree/master/qa/suites
+.. _qa/tasks: https://github.com/ceph/ceph/tree/master/qa/tasks
+.. _interactive-on-error: https://docs.ceph.com/projects/teuthology/en/latest/detailed_test_config.html#troubleshooting
+.. _custom config.yaml: https://docs.ceph.com/projects/teuthology/en/latest/detailed_test_config.html#test-configuration
+.. _testing priority: ../tests-integration-testing-teuthology-intro/#testing-priority
+.. _thrash: https://github.com/ceph/ceph/tree/master/qa/suites/rados/thrash
diff --git a/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-intro.rst b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-intro.rst
new file mode 100644
index 000000000..3cbe51241
--- /dev/null
+++ b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-intro.rst
@@ -0,0 +1,660 @@
+.. _tests-integration-testing-teuthology-intro:
+
+Testing - Integration Tests - Introduction
+==========================================
+
+Ceph has two types of tests: :ref:`make check <make-check>` tests and
+integration tests. When a test requires multiple machines, root access, or lasts
+for a long time (for example, to simulate a realistic Ceph workload), it is
+deemed to be an integration test. Integration tests are organized into "suites",
+which are defined in the `ceph/qa sub-directory`_ and run with the
+``teuthology-suite`` command.
+
+The ``teuthology-suite`` command is part of the `teuthology framework`_.
+In the sections that follow we attempt to provide a detailed introduction
+to that framework from the perspective of a beginning Ceph developer.
+
+Teuthology consumes packages
+----------------------------
+
+It may take some time to understand the significance of this fact, but it
+is `very` significant. It means that automated tests can be conducted on
+multiple platforms using the same packages (RPM, DEB) that can be
+installed on any machine running those platforms.
+
+Teuthology has a `list of platforms that it supports
+<https://github.com/ceph/ceph/tree/master/qa/distros/supported>`_ (as of
+September 2020 the list consisted of "RHEL/CentOS 8" and "Ubuntu 18.04"). It
+expects to be provided pre-built Ceph packages for these platforms. Teuthology
+deploys these platforms on machines (bare-metal or cloud-provisioned), installs
+the packages on them, and deploys Ceph clusters on them - all as called for by
+the test.
+
+The Nightlies
+-------------
+
+A number of integration tests are run on a regular basis in the `Sepia
+lab`_ against the official Ceph repositories (on the ``master`` development
+branch and the stable branches). Traditionally, these tests are called "the
+nightlies" because the Ceph core developers used to live and work in
+the same time zone and from their perspective the tests were run overnight.
+
+The results of nightly test runs are published at http://pulpito.ceph.com/
+under the user ``teuthology``. The developer nick appears in the URL of the
+test results and in the first column of the Pulpito dashboard. The results are
+also reported on the `ceph-qa mailing list <https://ceph.com/irc/>`_.
+
+Testing Priority
+----------------
+
+In brief: in the ``teuthology-suite`` command option ``-p <N>``, set the value of ``<N>`` to a number lower than 1000. An explanation of why follows.
+
+The ``teuthology-suite`` command includes an option ``-p <N>``. This option specifies the priority of the jobs submitted to the queue. The lower the value of ``N``, the higher the priority.
+
+The default value of ``N`` is ``1000``. This is the same priority value given to the nightly tests (the nightlies). Often, the volume of testing done during the nightly tests is so great that not all of the nightly tests get run during the time allotted for their run.
+
+Set the value of ``N`` lower than ``1000``, or your tests will not have priority over the nightly tests. This means that they might never run.
+
+Select your job's priority (the value of ``N``) in accordance with the following guidelines:
+
+.. list-table::
+ :widths: 30 30
+ :header-rows: 1
+
+ * - Priority
+ - Explanation
+ * - **N < 10**
+ - Use this if the sky is falling and some group of tests must be run ASAP.
+ * - **10 <= N < 50**
+ - Use this if your tests are urgent and blocking other important
+ development.
+ * - **50 <= N < 75**
+ - Use this if you are testing a particular feature/fix and running fewer
+ than about 25 jobs. This range is also used for urgent release testing.
+ * - **75 <= N < 100**
+ - Tech Leads regularly schedule integration tests with this priority to
+ verify pull requests against master.
+ * - **100 <= N < 150**
+ - This priority is used for QE validation of point releases.
+ * - **150 <= N < 200**
+ - Use this priority for 100 jobs or fewer that test a particular feature
+ or fix. Results are available in about 24 hours.
+ * - **200 <= N < 1000**
+ - Use this priority for large test runs. Results are available in about a
+ week.
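+
+The guidelines in the table can also be read as a simple lookup. The sketch
+below encodes them in Python. It is an illustration only: the names
+``PRIORITY_BANDS`` and ``describe_priority`` are hypothetical and do not exist
+in teuthology, and the wording of each band is abbreviated from the table.
+
```python
# Exclusive upper bound of each priority band, taken from the table above.
PRIORITY_BANDS = [
    (10, "emergency: some group of tests must run ASAP"),
    (50, "urgent tests that block other important development"),
    (75, "feature/fix testing with fewer than about 25 jobs, or urgent release testing"),
    (100, "tech lead runs that verify pull requests against master"),
    (150, "QE validation of point releases"),
    (200, "100 jobs or fewer for a feature or fix; results in about 24 hours"),
    (1000, "large test runs; results in about a week"),
]


def describe_priority(n):
    """Return the guideline for priority value n (lower n means higher priority)."""
    for upper_bound, guideline in PRIORITY_BANDS:
        if n < upper_bound:
            return guideline
    # 1000 is the default and is also the priority of the nightlies.
    return "no priority over the nightlies; the run might never start"


for value in (5, 60, 150, 1000):
    print(value, "->", describe_priority(value))
```
+
+The last branch covers the default value of ``1000`` and anything above it,
+which is the case the text above warns against.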
+
+To see how many jobs the ``teuthology-suite`` command will trigger, use the
+``--dry-run`` flag. If you are happy with the number of jobs returned by the
+dry run, issue the ``teuthology-suite`` command again without ``--dry-run`` and
+with ``-p`` and an appropriate number as an argument.
+
+To skip the priority check, use ``--force-priority``. Be considerate of the needs of other developers to run tests, and use ``--force-priority`` only in emergencies.
+
+Suites Inventory
+----------------
+
+The ``suites`` directory of the `ceph/qa sub-directory`_ contains all the
+integration tests for all the Ceph components.
+
+.. list-table:: **Suites**
+
+ * - **Component**
+ - **Function**
+
+ * - `ceph-deploy <https://github.com/ceph/ceph/tree/master/qa/suites/ceph-deploy>`_
+ - install a Ceph cluster with ``ceph-deploy`` (`ceph-deploy man page`_)
+
+ * - `dummy <https://github.com/ceph/ceph/tree/master/qa/suites/dummy>`_
+ - get a machine, do nothing and return success (commonly used to verify
+ that the integration testing infrastructure works as expected)
+
+ * - `fs <https://github.com/ceph/ceph/tree/master/qa/suites/fs>`_
+ - test CephFS mounted using kernel and FUSE clients, also with multiple MDSs.
+
+ * - `krbd <https://github.com/ceph/ceph/tree/master/qa/suites/krbd>`_
+ - test the RBD kernel module
+
+ * - `powercycle <https://github.com/ceph/ceph/tree/master/qa/suites/powercycle>`_
+ - verify the Ceph cluster behaves when machines are powered off and on
+ again
+
+ * - `rados <https://github.com/ceph/ceph/tree/master/qa/suites/rados>`_
+ - run Ceph clusters including OSDs and MONs, under various conditions of
+ stress
+
+ * - `rbd <https://github.com/ceph/ceph/tree/master/qa/suites/rbd>`_
+ - run RBD tests using actual Ceph clusters, with and without qemu
+
+ * - `rgw <https://github.com/ceph/ceph/tree/master/qa/suites/rgw>`_
+ - run RGW tests using actual Ceph clusters
+
+ * - `smoke <https://github.com/ceph/ceph/tree/master/qa/suites/smoke>`_
+ - run tests that exercise the Ceph API with an actual Ceph cluster
+
+ * - `teuthology <https://github.com/ceph/ceph/tree/master/qa/suites/teuthology>`_
+ - verify that teuthology can run integration tests, with and without OpenStack
+
+ * - `upgrade <https://github.com/ceph/ceph/tree/master/qa/suites/upgrade>`_
+ - for various versions of Ceph, verify that upgrades can happen without disrupting an ongoing workload (`Upgrade Testing`_)
+
+teuthology-describe
+-------------------
+
+``teuthology-describe`` was added to the `teuthology framework`_ to facilitate
+documentation and better understanding of integration tests.
+
+Tests can be documented by embedding ``meta:`` annotations in the yaml files
+used to define the tests. The results can be seen in the `teuthology-describe
+usecases`_.
+
+Since this is a new feature, many yaml files have yet to be annotated.
+Developers are encouraged to improve the coverage and the quality of the
+documentation.
+
+How to run integration tests
+----------------------------
+
+Typically, the `Sepia lab`_ is used to run integration tests. But as a new Ceph
+developer, you will probably not have access to the `Sepia lab`_. You might
+however be able to run some integration tests in an environment separate from
+the `Sepia lab`_. Ask members from the relevant team how to do this.
+
+One way to run your own integration tests is to set up a teuthology cluster on
+bare metal. Setting up a teuthology cluster on bare metal is a complex task.
+Here are `some notes
+<https://docs.ceph.com/projects/teuthology/en/latest/LAB_SETUP.html>`_ to get
+you started if you decide that you are interested in undertaking the complex
+task of setting up a teuthology cluster on bare metal.
+
+Running integration tests on your code contributions and publishing the results
+allows reviewers to verify that changes to the code base do not cause
+regressions, and allows reviewers to analyze test failures when they occur.
+
+Every teuthology cluster, whether bare-metal or cloud-provisioned, has a
+so-called "teuthology machine" from which test suites are triggered using the
+``teuthology-suite`` command.
+
+A detailed and up-to-date description of each `teuthology-suite`_ option is
+available by running the following command on the teuthology machine:
+
+.. prompt:: bash $
+
+ teuthology-suite --help
+
+.. _teuthology-suite: https://docs.ceph.com/projects/teuthology/en/latest/commands/teuthology-suite.html
+
+How integration tests are defined
+---------------------------------
+
+Integration tests are defined by yaml files found in the ``suites``
+subdirectory of the `ceph/qa sub-directory`_ and implemented by python
+code found in the ``tasks`` subdirectory. Some tests ("standalone tests")
+are defined in a single yaml file, while other tests are defined by a
+directory tree containing yaml files that are combined, at runtime, into a
+larger yaml file.
+
+
+.. _reading-standalone-test:
+
+Reading a standalone test
+-------------------------
+
+Let us first examine a standalone test, or "singleton".
+
+Here is a commented example using the integration test
+`rados/singleton/all/admin-socket.yaml
+<https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/admin-socket.yaml>`_
+
+.. code-block:: yaml
+
+    roles:
+    - - mon.a
+      - osd.0
+      - osd.1
+    tasks:
+    - install:
+    - ceph:
+    - admin_socket:
+        osd.0:
+          version:
+          git_version:
+          help:
+          config show:
+          config set filestore_dump_file /tmp/foo:
+          perf dump:
+          perf schema:
+
+The ``roles`` array determines the composition of the cluster (how
+many MONs, OSDs, etc.) on which this test is designed to run, as well
+as how these roles will be distributed over the machines in the
+testing cluster. In this case, there is only one element in the
+top-level array: therefore, only one machine is allocated to the
+test. The nested array declares that this machine shall run a MON with
+id ``a`` (that is the ``mon.a`` in the list of roles) and two OSDs
+(``osd.0`` and ``osd.1``).
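+
+To make the reading of ``roles`` concrete, the following Python sketch counts
+machines and daemon types for the example above. It is an illustration only:
+``summarize_roles`` is a hypothetical helper, not a teuthology function, and
+the yaml value is written out by hand as a Python list of lists.
+
```python
from collections import Counter

# The ``roles`` value from the example: one inner list per machine.
roles = [["mon.a", "osd.0", "osd.1"]]


def summarize_roles(roles):
    """Return (number of machines, Counter of daemon types across all machines)."""
    daemon_types = Counter(
        role.split(".", 1)[0]  # "osd.0" -> "osd"
        for machine in roles
        for role in machine
    )
    return len(roles), daemon_types


machines, daemon_types = summarize_roles(roles)
print(machines, dict(daemon_types))  # prints: 1 {'mon': 1, 'osd': 2}
```
+
+Each element of the top-level list stands for one machine, so adding a second
+inner list would request a second machine for the test.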
+
+The body of the test is in the ``tasks`` array: each element is
+evaluated in order, causing the corresponding python file found in the
+``tasks`` subdirectory of the `teuthology repository`_ or
+`ceph/qa sub-directory`_ to be run. "Running" in this case means calling
+the ``task()`` function defined in that file.
+
+In this case, the `install
+<https://github.com/ceph/teuthology/blob/master/teuthology/task/install/__init__.py>`_
+task comes first. It installs the Ceph packages on each machine (as
+defined by the ``roles`` array). A full description of the ``install``
+task is `found in the python file
+<https://github.com/ceph/teuthology/blob/master/teuthology/task/install/__init__.py>`_
+(search for "def task").
+
+The ``ceph`` task, which is documented `here
+<https://github.com/ceph/ceph/blob/master/qa/tasks/ceph.py>`__ (again,
+search for "def task"), starts OSDs and MONs (and possibly MDSs as well)
+as required by the ``roles`` array. In this example, it will start one MON
+(``mon.a``) and two OSDs (``osd.0`` and ``osd.1``), all on the same
+machine. Control moves to the next task when the Ceph cluster reaches
+``HEALTH_OK`` state.
+
+The next task is ``admin_socket`` (`source code
+<https://github.com/ceph/ceph/blob/master/qa/tasks/admin_socket.py>`_).
+The parameter of the ``admin_socket`` task (and any other task) is a
+structure which is interpreted as documented in the task. In this example
+the parameter is a set of commands to be sent to the admin socket of
+``osd.0``. The task verifies that each of them returns success (i.e.
+exit code zero).
+
+This test can be run with
+
+.. prompt:: bash $
+
+ teuthology-suite --machine-type smithi --suite rados/singleton/all/admin-socket.yaml fs/ext4.yaml
+
+Test descriptions
+-----------------
+
+Each test has a "test description", which is similar to a directory path,
+but not the same. In the case of a standalone test, like the one in
+`Reading a standalone test`_, the test description is identical to the
+relative path (starting from the ``suites/`` directory of the
+`ceph/qa sub-directory`_) of the yaml file defining the test.
+
+Much more commonly, tests are defined not by a single yaml file, but by a
+`directory tree of yaml files`. At runtime, the tree is walked and all yaml
+files (facets) are combined into larger yaml "programs" that define the
+tests. A full listing of the yaml defining the test is included at the
+beginning of every test log.
+
+In these cases, the description of each test consists of the
+subdirectory under `suites/
+<https://github.com/ceph/ceph/tree/master/qa/suites>`_ containing the
+yaml facets, followed by an expression in curly braces (``{}``) consisting of
+a list of yaml facets in order of concatenation. For instance the
+test description::
+
+ ceph-deploy/basic/{distros/centos_7.0.yaml tasks/ceph-deploy.yaml}
+
+signifies the concatenation of two files:
+
+* ceph-deploy/basic/distros/centos_7.0.yaml
+* ceph-deploy/basic/tasks/ceph-deploy.yaml
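+
+The mapping from a test description to its yaml files can be sketched in
+Python as follows. This is an illustration only: ``expand_description`` is a
+hypothetical helper, and it assumes the flat form shown above, with a single
+pair of curly braces and facets separated by spaces.
+
```python
def expand_description(description):
    """Split 'subdir/{facet facet ...}' into the list of facet file paths."""
    if "{" not in description:
        # A standalone test: the description is already the relative path.
        return [description]
    prefix, _, rest = description.partition("{")
    facets = rest.rstrip("}").split()
    return [prefix + facet for facet in facets]


description = "ceph-deploy/basic/{distros/centos_7.0.yaml tasks/ceph-deploy.yaml}"
for path in expand_description(description):
    print(path)
```
+
+For a standalone test the function returns the description unchanged, which
+mirrors the statement that its description is identical to its relative path.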
+
+How tests are built from directories
+------------------------------------
+
+As noted in the previous section, most tests are not defined in a single
+yaml file, but rather as a `combination` of files collected from a
+directory tree within the ``suites/`` subdirectory of the `ceph/qa sub-directory`_.
+
+The set of all tests defined by a given subdirectory of ``suites/`` is
+called an "integration test suite", or a "teuthology suite".
+
+Combination of yaml facets is controlled by special files (``%`` and
+``+``) that are placed within the directory tree and can be thought of as
+operators. The ``%`` file is the "convolution" operator and ``+``
+signifies concatenation.
+
+Convolution operator
+^^^^^^^^^^^^^^^^^^^^
+
+The convolution operator, implemented as a (typically empty) file called ``%``,
+tells teuthology to construct a test matrix from yaml facets found in
+subdirectories below the directory containing the operator.
+
+For example, consider the ``basic`` subdirectory of the `ceph-deploy suite
+<https://github.com/ceph/ceph/tree/master/qa/suites/ceph-deploy/>`_, defined
+by a ``suites/ceph-deploy/basic/`` tree with the following structure
+
+.. code-block:: none
+
+    qa/suites/ceph-deploy/basic
+    ├── %
+    ├── distros
+    │   ├── centos_7.0.yaml
+    │   └── ubuntu_16.04.yaml
+    └── tasks
+        └── ceph-deploy.yaml
+
+This is interpreted as a 2x1 matrix consisting of two tests:
+
+1. ceph-deploy/basic/{distros/centos_7.0.yaml tasks/ceph-deploy.yaml}
+2. ceph-deploy/basic/{distros/ubuntu_16.04.yaml tasks/ceph-deploy.yaml}
+
+i.e. the concatenation of centos_7.0.yaml and ceph-deploy.yaml and
+the concatenation of ubuntu_16.04.yaml and ceph-deploy.yaml, respectively.
+In human terms, this means that the task found in ``ceph-deploy.yaml`` is
+intended to run on both CentOS 7.0 and Ubuntu 16.04.
+
+Without the special file percent (``%``), the ``ceph-deploy`` tree would be interpreted as
+three standalone tests:
+
+* ceph-deploy/basic/distros/centos_7.0.yaml
+* ceph-deploy/basic/distros/ubuntu_16.04.yaml
+* ceph-deploy/basic/tasks/ceph-deploy.yaml
+
+(which would of course be wrong in this case).
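+
+The effect of the ``%`` file can be sketched in Python as a Cartesian product
+over the subdirectories. This is an illustration of the idea, not teuthology's
+implementation: ``convolve`` is a hypothetical function, and the directory
+contents are passed in as a plain dictionary instead of being read from disk.
+
```python
from itertools import product


def convolve(suite_dir, facet_dirs):
    """Build test descriptions for a directory that contains a ``%`` file.

    facet_dirs maps each subdirectory name to the yaml files inside it.
    One facet is picked from every subdirectory for each generated test.
    """
    choices = [
        [f"{name}/{yaml_file}" for yaml_file in sorted(facet_dirs[name])]
        for name in sorted(facet_dirs)
    ]
    return [
        f"{suite_dir}/{{{' '.join(combo)}}}" for combo in product(*choices)
    ]


tests = convolve(
    "ceph-deploy/basic",
    {
        "distros": ["centos_7.0.yaml", "ubuntu_16.04.yaml"],
        "tasks": ["ceph-deploy.yaml"],
    },
)
for description in tests:
    print(description)
```
+
+With two distros and one task the product has two entries, which is the 2x1
+matrix described above. Adding a second file under ``tasks`` would double the
+number of tests.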
+
+Referring to the `ceph/qa sub-directory`_, you will notice that the
+``centos_7.0.yaml`` and ``ubuntu_16.04.yaml`` files in the
+``suites/ceph-deploy/basic/distros/`` directory are implemented as symlinks.
+By using symlinks instead of copying, a single file can appear in multiple
+suites. This eases the maintenance of the test framework as a whole.
+
+All the tests generated from the ``suites/ceph-deploy/`` directory tree
+(also known as the "ceph-deploy suite") can be run with
+
+.. prompt:: bash $
+
+ teuthology-suite --machine-type smithi --suite ceph-deploy
+
+An individual test from the `ceph-deploy suite`_ can be run by adding the
+``--filter`` option
+
+.. prompt:: bash $
+
+ teuthology-suite \
+ --machine-type smithi \
+ --suite ceph-deploy/basic \
+ --filter 'ceph-deploy/basic/{distros/ubuntu_16.04.yaml tasks/ceph-deploy.yaml}'
+
+.. note:: To run a standalone test like the one in `Reading a standalone
+ test`_, ``--suite`` alone is sufficient. If you want to run a single
+ test from a suite that is defined as a directory tree, ``--suite`` must
+ be combined with ``--filter``. This is because the ``--suite`` option
+ understands POSIX relative paths only.
+
+Nested Subsets
+^^^^^^^^^^^^^^
+
+Suites can get quite large with the combinatorial explosion of yaml
+configurations. At the time of writing, the ``rados`` suite is more than
+100,000 jobs. For this reason, scheduling often uses the ``--subset`` option to
+only run a subset of the jobs (see also: :ref:`subset`). However, this applies
+only at the top-level of the suite being run (e.g. ``fs``). That may
+incidentally inflate the ratio of jobs for some larger sub-suites (like
+``fs:workload``) vs. smaller but critical suites (like ``fs:volumes``).
+
+It is therefore attractive to automatically subset some sub-suites which are
+never run fully. This is done by providing an integer divisor for the ``%``
+convolution operator file instead of leaving it empty. That divisor
+automatically subsets the resulting matrix. For example, if the convolution
+file ``%`` contains ``2``, the matrix will be divided into two using the same
+logic as the ``--subset`` mechanism.
+
+Note the numerator is not specified as with the ``--subset`` option as there is
+no meaningful way to express this when there could be several layers of
+nesting. Instead, a random subset is selected (1 of 2 in our example). The
+choice is based off the random seed (``--seed``) used for the scheduling.
+Remember that seed is saved in the results so that a ``--rerun`` of failed
+tests will still preserve the correct numerator (subset of subsets).
+
+You can disable nested subsets using the ``--no-nested-subset`` argument to
+``teuthology-suite``.
+
+Concatenation operator
+^^^^^^^^^^^^^^^^^^^^^^
+
+For even greater flexibility in sharing yaml files between suites, the
+special file plus (``+``) can be used to concatenate files within a
+directory. For instance, consider this simplified view of the `suites/rbd/thrash
+<https://github.com/ceph/ceph/tree/master/qa/suites/rbd/thrash>`_
+tree
+
+.. code-block:: none
+
+    qa/suites/rbd/thrash
+    ├── %
+    ├── clusters
+    │   ├── +
+    │   ├── fixed-2.yaml
+    │   └── openstack.yaml
+    └── workloads
+        ├── rbd_api_tests_copy_on_read.yaml
+        └── rbd_api_tests.yaml
+
+This creates two tests:
+
+* rbd/thrash/{clusters/fixed-2.yaml clusters/openstack.yaml workloads/rbd_api_tests_copy_on_read.yaml}
+* rbd/thrash/{clusters/fixed-2.yaml clusters/openstack.yaml workloads/rbd_api_tests.yaml}
+
+Because the ``clusters/`` subdirectory contains the special file plus
+(``+``), all the other files in that subdirectory (``fixed-2.yaml`` and
+``openstack.yaml`` in this case) are concatenated together
+and treated as a single file. Without the special file plus, they would
+have been convolved with the files from the workloads directory to create
+a 2x2 matrix:
+
+* rbd/thrash/{clusters/openstack.yaml workloads/rbd_api_tests_copy_on_read.yaml}
+* rbd/thrash/{clusters/openstack.yaml workloads/rbd_api_tests.yaml}
+* rbd/thrash/{clusters/fixed-2.yaml workloads/rbd_api_tests_copy_on_read.yaml}
+* rbd/thrash/{clusters/fixed-2.yaml workloads/rbd_api_tests.yaml}
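+
+The difference that the ``+`` file makes can be sketched in Python. As before,
+this is an illustration under stated assumptions and not teuthology's code:
+``build_tests`` is a hypothetical function, directory listings are given as a
+dictionary, and the order of the generated tests is not meant to match the
+order teuthology uses.
+
```python
from itertools import product


def build_tests(suite_dir, facet_dirs):
    """Combine facets below a ``%`` directory, honouring ``+`` files.

    facet_dirs maps each subdirectory name to the file names inside it,
    including the special file '+' when it is present.
    """
    choices = []
    for name in sorted(facet_dirs):
        yaml_files = sorted(f for f in facet_dirs[name] if f != "+")
        paths = [f"{name}/{f}" for f in yaml_files]
        if "+" in facet_dirs[name]:
            # Concatenation: all files together form a single choice.
            choices.append([" ".join(paths)])
        else:
            # Convolution: each file is a separate choice.
            choices.append(paths)
    return [f"{suite_dir}/{{{' '.join(c)}}}" for c in product(*choices)]


workloads = ["rbd_api_tests_copy_on_read.yaml", "rbd_api_tests.yaml"]
with_plus = build_tests(
    "rbd/thrash",
    {"clusters": ["+", "fixed-2.yaml", "openstack.yaml"], "workloads": workloads},
)
without_plus = build_tests(
    "rbd/thrash",
    {"clusters": ["fixed-2.yaml", "openstack.yaml"], "workloads": workloads},
)
print(len(with_plus), len(without_plus))  # prints: 2 4
```
+
+The only change between the two calls is the presence of ``+`` in the
+``clusters`` listing, which collapses that directory from two choices into one.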
+
+The ``clusters/fixed-2.yaml`` file is shared among many suites to
+define the following ``roles``
+
+.. code-block:: yaml
+
+ roles:
+ - [mon.a, mon.c, osd.0, osd.1, osd.2, client.0]
+ - [mon.b, osd.3, osd.4, osd.5, client.1]
+
+The ``rbd/thrash`` suite as defined above, consisting of two tests,
+can be run with
+
+.. prompt:: bash $
+
+ teuthology-suite --machine-type smithi --suite rbd/thrash
+
+A single test from the rbd/thrash suite can be run by adding the
+``--filter`` option
+
+.. prompt:: bash $
+
+ teuthology-suite \
+ --machine-type smithi \
+ --suite rbd/thrash \
+ --filter 'rbd/thrash/{clusters/fixed-2.yaml clusters/openstack.yaml workloads/rbd_api_tests_copy_on_read.yaml}'
+
+.. _upgrade-testing:
+
+Upgrade Testing
+^^^^^^^^^^^^^^^
+
+Using the upgrade suite we are able to verify that upgrades from earlier releases can complete
+successfully without disrupting any ongoing workload.
+Each release branch upgrade directory includes 2-x upgrade testing,
+meaning that we are able to test the upgrade from the two preceding releases to the current one.
+The upgrade sequence is done in `parallel <https://github.com/ceph/teuthology/blob/main/teuthology/task/parallel.py>`_
+with other given workloads.
+
+For instance, the upgrade test directory from the Quincy release branch is as follows:
+
+.. code-block:: none
+
+ .
+ ├── octopus-x
+ └── pacific-x
+
+It is possible to test upgrades from Octopus (2-x) or from Pacific (1-x) to Quincy (x).
+A simple upgrade test consists of the following, in order:
+
+.. code-block:: none
+
+ ├── 0-start.yaml
+ ├── 1-tasks.yaml
+ ├── upgrade-sequence.yaml
+ └── workload
+
+After starting the cluster with the older release, we begin running the given ``workload``
+and the ``upgrade-sequence`` in parallel.
+
+.. code-block:: yaml
+
+ - print: "**** done start parallel"
+ - parallel:
+ - workload
+ - upgrade-sequence
+ - print: "**** done end parallel"
+
+While the ``workload`` directory consists of regular yaml files just as in any other suite,
+the ``upgrade-sequence`` is responsible for running the upgrade and awaiting its completion:
+
+.. code-block:: yaml
+
+ - print: "**** done start upgrade, wait"
+ ...
+ mon.a:
+ - ceph orch upgrade start --image quay.ceph.io/ceph-ci/ceph:$sha1
+ - while ceph orch upgrade status | jq '.in_progress' | grep true ; do ceph orch ps ; ceph versions ; sleep 30 ; done
+ ...
+ - print: "**** done end upgrade, wait..."
+
+
+It is also possible to upgrade in stages while running workloads in between those:
+
+.. code-block:: none
+
+ ├── %
+ ├── 0-cluster
+ ├── 1-ceph-install
+ ├── 2-partial-upgrade
+ ├── 3-thrash
+ ├── 4-workload
+ ├── 5-finish-upgrade.yaml
+ ├── 6-quincy.yaml
+ └── 8-final-workload
+
+After starting a cluster we upgrade only 2/3 of the cluster (``2-partial-upgrade``).
+The next stage runs thrash tests and the given workload tests. After that, the
+rest of the cluster is upgraded (``5-finish-upgrade.yaml``).
+The last stage requires the updated release (``ceph require-osd-release quincy``,
+``ceph osd set-require-min-compat-client quincy``) and runs the ``final-workload``.
+
+Position Independent Linking
+----------------------------
+
+Under the ``qa/suites`` directory are ``.qa`` symbolic links in every
+directory. Each link is recursive by always linking to ``../.qa/``. The final
+terminating link is in the ``qa/`` directory itself as ``qa/.qa -> .``. This
+layout of symbolic links allows a suite to be easily copied or moved without
+breaking a number of symbolic links. For example::
+
+ qa/suites/fs/upgrade/nofs/centos_latest.yaml -> .qa/distros/supported/centos_latest.yaml
+
+If we copy the ``nofs`` suite somewhere else, add a parent directory above
+``nofs``, or move the ``centos_latest.yaml`` fragment into a sub-directory, the
+link will not break. Compare to::
+
+ qa/suites/fs/upgrade/nofs/centos_latest.yaml -> ../../../../distros/supported/centos_latest.yaml
+
+If the link is moved, it is very likely it will break because the number of
+parent directories to reach the ``distros`` directory may change.
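+
+The behaviour described above can be reproduced in a scratch directory. The
+Python sketch below builds a miniature ``qa/`` tree, adds a parent directory
+above ``nofs``, moves the suite into it, and checks that the fragment link
+still resolves. It is an illustration only: ``demo_qa_links`` and the
+directory name ``extra`` are hypothetical, and the sketch assumes a platform
+where ``os.symlink`` is available.
+
```python
import os
import tempfile


def demo_qa_links():
    """Return True if the fragment link still resolves after the suite moves."""
    with tempfile.TemporaryDirectory() as root:
        qa = os.path.join(root, "qa")
        nofs = os.path.join(qa, "suites", "fs", "upgrade", "nofs")
        os.makedirs(nofs)
        target = os.path.join(qa, "distros", "supported", "centos_latest.yaml")
        os.makedirs(os.path.dirname(target))
        open(target, "w").close()

        # Terminating link: qa/.qa -> .
        os.symlink(".", os.path.join(qa, ".qa"))
        # Every directory below qa/suites links to its parent's .qa
        for sub in ("suites", "suites/fs", "suites/fs/upgrade",
                    "suites/fs/upgrade/nofs"):
            os.symlink("../.qa", os.path.join(qa, sub, ".qa"))

        # The fragment link goes through .qa instead of counting "../" levels.
        link = os.path.join(nofs, "centos_latest.yaml")
        os.symlink(".qa/distros/supported/centos_latest.yaml", link)

        # Add a parent directory above nofs and move the suite into it.
        extra = os.path.join(qa, "suites", "fs", "upgrade", "extra")
        os.makedirs(extra)
        os.symlink("../.qa", os.path.join(extra, ".qa"))
        moved = os.path.join(extra, "nofs")
        os.rename(nofs, moved)

        moved_link = os.path.join(moved, "centos_latest.yaml")
        return os.path.realpath(moved_link) == os.path.realpath(target)


print(demo_qa_links())
```
+
+The new parent directory needs its own ``.qa`` link for the chain to stay
+unbroken, which is why new directories should always receive one.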
+
+When adding new directories or suites, remember to also add ``.qa`` symbolic
+links. A simple find command can do this for you:
+
+.. prompt:: bash $
+
+ find qa/suites/ -type d -execdir ln -sfT ../.qa/ {}/.qa \;
+
+
+Filtering tests by their description
+------------------------------------
+
+When a few jobs fail and need to be run again, the ``--filter`` option
+can be used to select tests with a matching description. For instance, if the
+``rados`` suite fails the `all/peer.yaml <https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/peer.yaml>`_ test, the following command runs
+only the tests that contain this file:
+
+.. prompt:: bash $
+
+ teuthology-suite --machine-type smithi --suite rados --filter all/peer.yaml
+
+The ``--filter-out`` option does the opposite (it matches tests that do `not`
+contain a given string), and can be combined with the ``--filter`` option.
+
+Both ``--filter`` and ``--filter-out`` take a comma-separated list of strings
+(which means the comma character is implicitly forbidden in filenames found in
+the `ceph/qa sub-directory`_). For instance
+
+.. prompt:: bash $
+
+ teuthology-suite --machine-type smithi --suite rados --filter all/peer.yaml,all/rest-api.yaml
+
+will run tests that contain either
+`all/peer.yaml <https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/peer.yaml>`_
+or
+`all/rest-api.yaml <https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/rest-api.yaml>`_.
+
+Each string is looked up anywhere in the test description and has to
+be an exact match: they are not regular expressions.
+
+
+.. _subset:
+
+Reducing the number of tests
+----------------------------
+
+The ``rados`` suite generates tens or even hundreds of thousands of tests out
+of a few hundred files. This happens because teuthology constructs test
+matrices from subdirectories wherever it encounters a file named ``%``. For
+instance, all tests in the `rados/basic suite
+<https://github.com/ceph/ceph/tree/master/qa/suites/rados/basic>`_ run with
+different messenger types: ``simple``, ``async`` and ``random``, because they
+are combined (via the special file ``%``) with the `msgr directory
+<https://github.com/ceph/ceph/tree/master/qa/suites/rados/basic/msgr>`_.
+
+All integration tests are required to be run before a Ceph release is
+published. When merely verifying whether a contribution can be merged without
+risking a trivial regression, it is enough to run a subset. The ``--subset``
+option can be used to reduce the number of tests that are triggered. For
+instance
+
+.. prompt:: bash $
+
+ teuthology-suite --machine-type smithi --suite rados --subset 0/4000
+
+will run as few tests as possible. The tradeoff in this case is that
+not all combinations of test variations will be run together,
+but no matter how small a ratio is provided in the ``--subset``,
+teuthology will still ensure that all files in the suite are in at
+least one test. Understanding the actual logic that drives this
+requires reading the teuthology source code.
+
+Note: some suites are now using a **nested subset** feature that automatically
+applies a subset to a carefully chosen set of YAML configurations. You may
+disable this behavior (for some custom filtering, perhaps) using the
+``--no-nested-subset`` option.
+
+The ``--limit`` option only runs the first ``N`` tests in the suite:
+this is rarely useful, however, because there is no way to control which
+test will be first.
+
+.. _ceph/qa sub-directory: https://github.com/ceph/ceph/tree/master/qa
+.. _Sepia Lab: https://wiki.sepia.ceph.com/doku.php
+.. _teuthology repository: https://github.com/ceph/teuthology
+.. _teuthology framework: https://github.com/ceph/teuthology
+.. _teuthology-describe usecases: https://gist.github.com/jdurgin/09711d5923b583f60afc
+.. _ceph-deploy man page: ../../../../man/8/ceph-deploy
diff --git a/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-kernel.rst b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-kernel.rst
new file mode 100644
index 000000000..e7c20ee24
--- /dev/null
+++ b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-kernel.rst
@@ -0,0 +1,71 @@
+.. _tests-integration-testing-teuthology-kernel:
+
+Integration Tests for Kernel Development
+========================================
+
+
+.. _kernel-cephfs:
+
+CephFS
+------
+
+The ``fs`` suite runs various kernels as described by the `kernel YAML
+fragments`_. These are symbolically linked by other sub-suites under the ``fs``
+suite.
+
+The matrix of fragments allows for testing the following configurations:
+
+* The "stock" kernel on RHEL 8 (i.e. the kernel that ships with it).
+
+* The `testing branch`_ maintained by the kernel development team, which contains the
+ patches undergoing active testing. These patches may or may not be in the next
+ upstream kernel release and include a mix of CephFS and kRBD changes. For the
+ testing kernel, we test with whatever distributions are specified by the
+ sub-suite. For example, the ``fs:functional`` sub-suite uses a random selection
+ of the `supported random distros`_.
+
+
+
+
+Testing custom kernels
+----------------------
+
+If you have a kernel branch on `ceph-client.git`_ and have built it using
+Shaman, then you can also test it easily by specifying an override for the
+kernel. This is done via a YAML fragment passed to the ``teuthology-suite``
+command:
+
+::
+
+ $ cat custom-kernel.yaml
+ overrides:
+   kernel:
+     branch: for-linus
+
+This specifies an override for the kernel branch specified in the suite's
+matrix. You can also override the ``kernel`` task with a tag or a SHA1 instead
+of a branch. When overriding the kernel, you should reduce the selection of jobs:
+the matrix includes a number of kernel configurations that you won't care to
+test, as mentioned in the :ref:`kernel-cephfs` section, and the override YAML
+applies to all kernel configurations, so it would result in duplicate tests.
+The command to run tests will look like:
+
+.. prompt:: bash $
+
+ teuthology-suite ... --suite fs --filter k-testing custom-kernel.yaml
+
+Where ``...`` indicates other typical options that are normally specified when
+running ``teuthology-suite``. The important filter ``--filter k-testing``
+will limit the selection of jobs to those using the ``testing`` branch of the
+kernel (see the `k-testing.yaml`_ file). So you'll only select jobs using the
+kernel client with the ``testing`` branch. Your custom YAML file,
+``custom-kernel.yaml``, will further override the ``testing`` branch to use
+whatever you specify.
+
+
+
+.. _kernel YAML fragments: https://github.com/ceph/ceph/tree/63f84c50e0851d456fc38b3330945c54162dd544/qa/cephfs/mount/kclient/overrides/distro
+.. _ceph-client.git: https://github.com/ceph/ceph-client/tree/testing
+.. _testing branch: https://github.com/ceph/ceph-client/tree/testing
+.. _supported random distros: https://github.com/ceph/ceph/blob/63f84c50e0851d456fc38b3330945c54162dd544/qa/suites/fs/functional/distro
+.. _k-testing.yaml: https://github.com/ceph/ceph/blob/63f84c50e0851d456fc38b3330945c54162dd544/qa/cephfs/mount/kclient/overrides/distro/testing/k-testing.yaml
diff --git a/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-workflow.rst b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-workflow.rst
new file mode 100644
index 000000000..64b006c57
--- /dev/null
+++ b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-workflow.rst
@@ -0,0 +1,293 @@
+.. _tests-integration-testing-teuthology-workflow:
+
+Integration Tests using Teuthology Workflow
+===========================================
+
+Scheduling Test Run
+-------------------
+
+Getting binaries
+****************
+
+Ceph binaries must be built for your branch before you can use teuthology to run integration tests on them. Follow these steps to build the Ceph binaries:
+
+#. Push the branch to the `ceph-ci`_ repository. This triggers the process of
+ building the binaries on the Jenkins CI.
+
+#. To ensure that the build process has been initiated, confirm that the branch
+ name has appeared in the list of "Latest Builds Available" at `Shaman`_.
+ Soon after you start the build process, the testing infrastructure adds
+ other, similarly-named builds to the list of "Latest Builds Available".
+ The names of these new builds will contain the names of various Linux
+ distributions and will be used to test your build against those
+ Linux distributions.
+
+#. Wait for the packages to be built and uploaded to `Chacra`_, and wait for
+ the repositories offering the packages to be created. The entries for the
+ branch names in the list of "Latest Builds Available" on `Shaman`_ will turn
+ green to indicate that the packages have been uploaded to `Chacra`_ and to
+ indicate that their repositories have been created. Wait until each entry
+ is coloured green. This usually takes between two and three hours depending
+ on the availability of the machines.
+
+ The Chacra URL for a particular build can be queried from `the Chacra site`_.
+
+.. note:: The branch to be pushed on ceph-ci can be any branch. The branch does
+ not have to be a PR branch.
+
+.. note:: If you intend to push master or any other standard branch, check
+ `Shaman`_ beforehand since it might already have completed builds for it.
+
+.. _the Chacra site: https://shaman.ceph.com/api/search/?status=ready&project=ceph
+
+
+Triggering Tests
+****************
+
+After Ceph binaries have been built for your branch, you can run tests using
+teuthology. Follow this procedure:
+
+#. Log in to the teuthology machine:
+
+ .. prompt:: bash $
+
+ ssh <username>@teuthology.front.sepia.ceph.com
+
+ This requires Sepia lab access. To request access to the Sepia lab, see:
+ https://ceph.github.io/sepia/adding_users/
+
+#. Run the ``teuthology-suite`` command:
+
+ .. prompt:: bash $
+
+ teuthology-suite -v \
+ -m smithi \
+ -c wip-devname-feature-x \
+ -s fs \
+ -p 110 \
+ --filter "cephfs-shell" \
+ -e foo@gmail.com
+
+ The options in the above command are defined here:
+
+ ============= =========================================================
+ Option        Meaning
+ ============= =========================================================
+ -v            verbose
+ -m            machine type
+ -c            the name of the branch that was pushed on ceph-ci
+ -s            test-suite name
+ -p            the higher the number, the lower the priority of
+               the job
+ --filter      filter tests in a given suite. The argument
+               passed to this filter specifies which test you
+               want to run
+ -e <email>    When tests finish or time out, send an email to the
+               specified address. Can also be specified in
+               ~/.teuthology.yaml as 'results_email'
+ ============= =========================================================
+
+ .. note:: The priority number present in the command above is a placeholder.
+ Do not use it in your own tests. See `Testing Priority`_ for information
+ about recommended values.
+
+ .. note:: Do not issue a command without a priority number. The default
+ value is 1000, a value so large that your job is unlikely ever to run.
+
+ Run ``teuthology-suite --help`` to read descriptions of these and other
+ available options.
+
+#. Wait for the tests to run. ``teuthology-suite`` prints a link to
+ `Pulpito`_ where the test results can be viewed.
+
+
+
+Other frequently used options are ``-d`` (or ``--distro``),
+``--distroversion``, ``--filter-out``, ``--timeout``, ``--flavor``, ``--rerun``,
+``-l`` (to limit the number of jobs), ``-N`` (to set how many times the job will
+run), and ``--subset`` (to reduce the number of tests that are triggered). Run
+``teuthology-suite --help`` to read descriptions of these and other options.
+
+.. _teuthology_testing_qa_changes:
+
+Testing QA changes (without re-building binaries)
+*************************************************
+
+If you are making changes only in the ``qa/`` directory, you do not have to
+rebuild the binaries before you re-run tests. If you make changes only in
+``qa/``, you can use the binaries built for the ceph-ci branch to re-run tests.
+You just have to make sure to tell the ``teuthology-suite`` command to use a
+separate branch for running the tests.
+
+If you made changes only in ``qa/``
+(https://github.com/ceph/ceph/tree/master/qa), you can also run your test changes
+against the existing binaries that are built periodically for master and other stable branches.
+To test your branch with the qa changes, pass two extra arguments to the ``teuthology-suite`` command: (1) ``--suite-repo``, specifying your ceph repo, and (2) ``--suite-branch``, specifying your branch name.
+
+For example, if you want to make changes in ``qa/`` after testing ``branch-x``
+(for which the ceph-ci branch is ``wip-username-branch-x``), run the following
+command:
+
+.. prompt:: bash $
+
+ teuthology-suite -v \
+ -m smithi \
+ -c wip-username-branch-x \
+ -s fs \
+ -p 50 \
+ --filter cephfs-shell
+
+Then make modifications locally, update the PR branch, and trigger tests from
+your PR branch as follows:
+
+.. prompt:: bash $
+
+ teuthology-suite -v \
+ -m smithi \
+ -c wip-username-branch-x \
+ -s fs -p 50 \
+ --filter cephfs-shell \
+ --suite-repo https://github.com/$username/ceph \
+ --suite-branch branch-x
+
+You can verify that the tests were run using this branch by looking at the
+values for the keys ``suite_branch``, ``suite_repo`` and ``suite_sha1`` in the
+job config printed at the beginning of the teuthology job.
+
+.. note:: If you are making changes that are not in the ``qa/`` directory,
+ you must follow the standard process of triggering builds, waiting
+ for the builds to finish, then triggering tests and waiting for
+ the test results.
+
+About Suites and Filters
+************************
+
+See `Suites Inventory`_ for a list of available suites of integration tests.
+Each directory under ``qa/suites`` in the Ceph repository is an integration
+test suite, and arguments appropriate to follow ``-s`` can be found there.
+
+Keywords for filtering tests can be found in
+``qa/suites/<suite-name>/<subsuite-name>/tasks`` and can be used as arguments
+for ``--filter``. Each YAML file in that directory can trigger tests; using the
+name of the file without its filename extension as an argument to
+``--filter`` triggers those tests.
+
+For example, in the command above in the :ref:`Testing QA Changes
+<teuthology_testing_qa_changes>` section, ``cephfs-shell`` is specified.
+This works because there is a file named ``cephfs-shell.yaml`` in
+``qa/suites/fs/basic_functional/tasks/``.
+
+If the filename doesn't suggest what kind of tests it triggers, search the
+contents of the file for the ``modules`` attribute. For ``cephfs-shell.yaml``
+the ``modules`` attribute is ``tasks.cephfs.test_cephfs_shell``. This means
+that it triggers all tests in ``qa/tasks/cephfs/test_cephfs_shell.py``.
+
+Viewing Test Results
+---------------------
+
+Pulpito Dashboard
+*****************
+
+After the teuthology job is scheduled, the status and results of the test run
+can be checked at https://pulpito.ceph.com/.
+
+Teuthology Archives
+*******************
+
+After the tests have finished running, the log for the job can be obtained by
+clicking on the job ID at the Pulpito page associated with your tests. Because
+these logs can easily be up to 1 GB in size, it is more convenient to download
+a log and view it locally than to view it in a web browser. It is
+easier still to ssh into the teuthology machine (``teuthology.front.sepia.ceph.com``)
+and access the following path::
+
+ /ceph/teuthology-archive/<test-id>/<job-id>/teuthology.log
+
+For example, the path to the log of one particular job is::
+
+ /ceph/teuthology-archive/teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi/4588482/teuthology.log
+
+This method can be used to view the log more quickly than would be possible through a browser.
+
+.. note:: To access archives more conveniently, ``/a/`` has been symbolically
+ linked to ``/ceph/teuthology-archive/``. For instance, to access the previous
+ example, we can use something like::
+
+ /a/teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi/4588482/teuthology.log
+
+Killing Tests
+-------------
+``teuthology-kill`` can be used to kill jobs that have been running
+unexpectedly for several hours, or when developers want to terminate tests
+before they complete.
+
+Here is the command that terminates jobs:
+
+.. prompt:: bash $
+
+ teuthology-kill -r teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi
+
+The argument passed to ``-r`` is called the test ID. It can be found
+easily in the link to the Pulpito page for the tests you triggered. For
+example, for the above test ID, the link is http://pulpito.front.sepia.ceph.com/teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi/
+
+Re-running Tests
+----------------
+
+The ``teuthology-suite`` command has a ``-r`` (or ``--rerun``) option, which
+allows you to re-run tests. This is handy when your tests have failed or ended
+up dead. The ``--rerun`` option takes the name of a teuthology run as an
+argument. The ``-R`` (or ``--rerun-statuses``) option can be passed along with
+``-r`` to choose which kinds of tests are picked from the run. For
+example, you can re-run only those tests from a previous run that ended up
+dead. Here is a practical example:
+
+.. prompt:: bash $
+
+ teuthology-suite -v \
+ -m smithi \
+ -c wip-rishabh-fs-test_cephfs_shell-fix \
+ -p 50 \
+ -r teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi \
+ -R fail,dead,queued \
+ -e $CEPH_QA_MAIL
+
+The options introduced in this section are defined here:
+
+ ======================= ===============================================
+ Option                  Meaning
+ ======================= ===============================================
+ -r, --rerun             Attempt to reschedule a run, selecting only
+                         those jobs whose statuses are listed in
+                         --rerun-statuses.
+ -R, --rerun-statuses    A comma-separated list of statuses to be used
+                         with --rerun. Supported statuses: 'dead',
+                         'fail', 'pass', 'queued', 'running' and
+                         'waiting'. Default value: 'fail,dead'
+ ======================= ===============================================
+
+Naming the ceph-ci branch
+-------------------------
+Prefix your branch name with your name before you push it to ceph-ci. For example,
+a branch named ``feature-x`` should be named ``wip-$yourname-feature-x``, where
+``$yourname`` is replaced with your name. Identifying your branch with your
+name makes your branch easily findable on Shaman and Pulpito.
+
+If you are using one of the stable branches (``quincy``, ``pacific``, etc.), include
+the name of that stable branch in your ceph-ci branch name.
+For example, a ``feature-x`` PR branch that targets ``nautilus`` should be named
+``wip-feature-x-nautilus``. *This is not just a convention. This ensures that your branch is built in the correct environment.*
+
+Delete the branch from ceph-ci when you no longer need it. If you are
+logged in to GitHub, all your branches on ceph-ci can be found here:
+https://github.com/ceph/ceph-ci/branches.
+
+.. _ceph-ci: https://github.com/ceph/ceph-ci
+.. _Chacra: https://github.com/ceph/chacra/blob/master/README.rst
+.. _Pulpito: http://pulpito.front.sepia.ceph.com/
+.. _Running Your First Test: ../../running-tests-locally/#running-your-first-test
+.. _Shaman: https://shaman.ceph.com/builds/ceph/
+.. _Suites Inventory: ../tests-integration-testing-teuthology-intro/#suites-inventory
+.. _Testing Priority: ../tests-integration-testing-teuthology-intro/#testing-priority
+.. _Triggering Tests: ../tests-integration-testing-teuthology-workflow/#triggering-tests
+.. _tests-sentry-developers-guide: ../tests-sentry-developers-guide/
diff --git a/doc/dev/developer_guide/testing_integration_tests/tests-sentry-developers-guide.rst b/doc/dev/developer_guide/testing_integration_tests/tests-sentry-developers-guide.rst
new file mode 100644
index 000000000..94dfae39a
--- /dev/null
+++ b/doc/dev/developer_guide/testing_integration_tests/tests-sentry-developers-guide.rst
@@ -0,0 +1,6 @@
+.. _tests-sentry-developers-guide:
+
+Sentry Notes
+============
+
+To be updated. Feel free to contribute.
diff --git a/doc/dev/developer_guide/tests-unit-tests.rst b/doc/dev/developer_guide/tests-unit-tests.rst
new file mode 100644
index 000000000..72d724d98
--- /dev/null
+++ b/doc/dev/developer_guide/tests-unit-tests.rst
@@ -0,0 +1,177 @@
+Testing - unit tests
+====================
+
+The Ceph GitHub repository has two types of tests: unit tests (also called
+``make check`` tests) and integration tests. Strictly speaking, the
+``make check`` tests are not "unit tests", but rather tests that can be run
+easily on a single build machine after compiling Ceph from source, whereas
+integration tests require package installation and multi-machine clusters to
+run.
+
+.. _make-check:
+
+What does "make check" mean?
+----------------------------
+
+After compiling Ceph, the code can be run through a battery of tests. For
+historical reasons, this is often referred to as ``make check`` even though
+the actual command used to run the tests is now ``ctest``. To be included in
+this group of tests, a test must:
+
+* bind ports that do not conflict with other tests
+* not require root access
+* not require more than one machine to run
+* complete within a few minutes
+
+For the sake of simplicity, this class of tests is referred to as "make
+check tests" or "unit tests". This is meant to distinguish these tests from
+the more complex "integration tests" that are run via the `teuthology
+framework`_.
+
+While it is possible to run ``ctest`` directly, it can be tricky to correctly
+set up your environment for it. Fortunately, there is a script that makes it
+easy to run the unit tests on your code. This script can be run from the
+top-level directory of the Ceph source tree by invoking:
+
+ .. prompt:: bash $
+
+ ./run-make-check.sh
+
+You will need a minimum of 8GB of RAM and 32GB of free drive space for this
+command to complete successfully on x86_64 architectures; other architectures
+may have different requirements. Depending on your hardware, it can take from
+twenty minutes to three hours to complete.
+
+
+How unit tests are declared
+---------------------------
+
+Unit tests are declared in the ``CMakeLists.txt`` file, which is found in the
+``./src`` directory. The ``add_ceph_test`` and ``add_ceph_unittest`` CMake
+functions are used to declare unit tests. ``add_ceph_test`` and
+``add_ceph_unittest`` are themselves defined in
+``./cmake/modules/AddCephTest.cmake``.
+
+Some unit tests are scripts and other unit tests are binaries that are
+compiled during the build process.
+
+* ``add_ceph_test`` function - used to declare unit test scripts
+* ``add_ceph_unittest`` function - used for unit test binaries
+
+Unit testing of CLI tools
+-------------------------
+Some of the CLI tools are tested using special files ending with the extension
+``.t`` and stored under ``./src/test/cli``. These tests are run using a tool
+called `cram`_ via a shell script called ``./src/test/run-cli-tests``.
+`cram`_ tests that are not suitable for ``make check`` can also be run by
+teuthology using the `cram task`_.
+
+.. _`cram`: https://bitheap.org/cram/
+.. _`cram task`: https://github.com/ceph/ceph/blob/master/qa/tasks/cram.py
+
+Tox-based testing of Python modules
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Some of the Python modules in Ceph use `tox <https://tox.readthedocs.io/en/latest/>`_
+to run their unit tests.
+
+Most of these Python modules can be found in the directory ``./src/pybind/``.
+
+Currently (December 2020) the following modules use **tox**:
+
+* Cephadm (``./src/cephadm/tox.ini``)
+* Ceph Manager Python API (``./src/pybind/mgr``)
+
+ * ``./src/pybind/mgr/tox.ini``
+
+ * ``./src/pybind/mgr/dashboard/tox.ini``
+
+ * ``./src/pybind/tox.ini``
+
+* Dashboard (``./src/pybind/mgr/dashboard``)
+* Python common (``./src/python-common/tox.ini``)
+* CephFS (``./src/tools/cephfs/tox.ini``)
+* ceph-volume
+
+ * ``./src/ceph-volume/tox.ini``
+
+ * ``./src/ceph-volume/plugin/zfs/tox.ini``
+
+ * ``./src/ceph-volume/ceph_volume/tests/functional/batch/tox.ini``
+
+ * ``./src/ceph-volume/ceph_volume/tests/functional/simple/tox.ini``
+
+ * ``./src/ceph-volume/ceph_volume/tests/functional/lvm/tox.ini``
+
+Configuring Tox environments and tasks
+""""""""""""""""""""""""""""""""""""""
+Most tox configurations support multiple environments and tasks.
+
+The list of environments and tasks that are supported is in the ``tox.ini``
+file, under ``envlist``. For example, here are the first three lines of
+``./src/cephadm/tox.ini``::
+
+ [tox]
+ envlist = py3, mypy
+ skipsdist=true
+
+In this example, the ``Python 3`` and ``mypy`` environments are specified.
+
+The list of environments can be retrieved with the following command:
+
+ .. prompt:: bash $
+
+ tox --list
+
+Or:
+
+ .. prompt:: bash $
+
+ tox -l
+
+Running Tox
+"""""""""""
+To run **tox**, just execute ``tox`` in the directory containing
+``tox.ini``. If you do not specify any environments (for example, ``-e
+$env1,$env2``), then ``tox`` will run all environments. Jenkins will run
+``tox`` by executing ``./src/script/run_tox.sh``.
+
+Here are some examples from Ceph Dashboard that show how to specify different
+environments and run options::
+
+ ## Run Python 2+3 tests+lint commands:
+ $ tox -e py27,py3,lint,check
+
+ ## Run Python 3 tests+lint commands:
+ $ tox -e py3,lint,check
+
+ ## To run it as Jenkins would:
+ $ ../../../script/run_tox.sh --tox-env py3,lint,check
+
+Manager core unit tests
+"""""""""""""""""""""""
+
+Currently only doctests_ inside ``mgr_util.py`` are run.
+
+To add more files to be tested inside the core of the manager, open the
+``tox.ini`` file and add the files to be tested at the end of the line that
+includes ``mgr_util.py``.
+
+.. _doctests: https://docs.python.org/3/library/doctest.html
+
+Unit test caveats
+-----------------
+
+#. Unlike the various Ceph daemons and ``ceph-fuse``, the unit tests are
+ linked against the default memory allocator (glibc) unless they are
+ explicitly linked against something else. This enables tools such as
+ **valgrind** to be used in the tests.
+
+#. The Google Test unit testing library hides the client output from the shell.
+ To debug the client, set the desired debug level
+ (e.g. ``ceph config set client debug_rbd 20``) and read the debug log file
+ at ``build/out/client.admin.<pid>.log``.
+ This can also be handy when examining failed teuthology unit test
+ jobs; the job's debug level can be set in the relevant YAML file.
+
+.. _make check:
+.. _teuthology framework: https://github.com/ceph/teuthology