Diffstat (limited to 'doc/dev/developer_guide')
 doc/dev/developer_guide/basic-workflow.rst            |  515
 doc/dev/developer_guide/dash-devel.rst                | 2590
 doc/dev/developer_guide/essentials.rst                |  338
 doc/dev/developer_guide/index.rst                     |   25
 doc/dev/developer_guide/intro.rst                     |   25
 doc/dev/developer_guide/issue-tracker.rst             |   39
 doc/dev/developer_guide/merging.rst                   |  138
 doc/dev/developer_guide/running-tests-in-cloud.rst    |  289
 doc/dev/developer_guide/running-tests-locally.rst     |  138
 doc/dev/developer_guide/running-tests-using-teuth.rst |  183
 doc/dev/developer_guide/tests-integration-tests.rst   |  522
 doc/dev/developer_guide/tests-unit-tests.rst          |  177
12 files changed, 4979 insertions, 0 deletions
diff --git a/doc/dev/developer_guide/basic-workflow.rst b/doc/dev/developer_guide/basic-workflow.rst
new file mode 100644
index 000000000..5917b56be
--- /dev/null
+++ b/doc/dev/developer_guide/basic-workflow.rst
@@ -0,0 +1,515 @@

.. _basic workflow dev guide:

Basic Workflow
==============

The following chart illustrates the basic Ceph development workflow:

.. ditaa::

             Upstream Code                        Your Local Environment

           /----------\        git clone           /-------------\
           |   Ceph   | -------------------------> |  ceph/main  |
           \----------/                            \-------------/
                ^                                         |
                |                                         | git branch fix_1
                | git merge                               |
                |                                         v
         /----------------\   git commit --amend   /-------------\
         |   make check   |<----------------------- | ceph/fix_1 |
         | ceph--qa--suite|                         \-------------/
         \----------------/                               |
                ^                                         | fix changes
                |                                         | test changes
                | review                                  | git commit
                |                                         |
                |                                         v
          /--------------\                          /-------------\
          |    github    |<------------------------ | ceph/fix_1 |
          | pull request |         git push         \-------------/
          \--------------/

This page assumes that you are a new contributor with an idea for a bugfix or
an enhancement, but you do not know how to proceed. Watch the `Getting Started
with Ceph Development <https://www.youtube.com/watch?v=t5UIehZ1oLs>`_ video for
a practical summary of this workflow.

Updating the tracker
--------------------

Find the :ref:`issue-tracker` (Redmine) number of the bug you intend to fix. If
no tracker issue exists, create one. There is only one case in which you do not
have to create a Redmine tracker issue: the case of minor documentation changes.

Simple documentation cleanup does not require a corresponding tracker issue.
Major documentation changes do require a tracker issue. Major documentation
changes include adding new documentation chapters or files, and making
substantial changes to the structure or content of the documentation.

A (Redmine) tracker ticket explains the issue (bug) to other Ceph developers to
keep them informed as the bug nears resolution. Provide a useful, clear title
and include detailed information in the description. When composing the title
of the ticket, ask yourself "If I need to search for this ticket two years from
now, which keywords am I likely to search for?" Then include those keywords in
the title.

If your tracker permissions are elevated, assign the bug to yourself by setting
the ``Assignee`` field. If your tracker permissions have not been elevated,
just add a comment with a short message that says "I am working on this issue".

Ceph Workflow Overview
----------------------

Three repositories are involved in the Ceph workflow. They are:

1. The upstream repository (ceph/ceph)
2. Your fork of the upstream repository (your_github_id/ceph)
3. Your local working copy of the repository (on your workstation)

The procedure for making changes to the Ceph repository is as follows:

#. Configure your local environment

   #. :ref:`Create a fork<forking>` of the "upstream Ceph"
      repository.

   #. :ref:`Clone the fork<cloning>` to your local filesystem.

#. Fix the bug

   #. :ref:`Synchronize local main with upstream main<synchronizing>`.

   #. :ref:`Create a bugfix branch<bugfix_branch>` in your local working copy.

   #. :ref:`Make alterations to the local working copy of the repository in your
      local filesystem<fixing_bug_locally>`.

   #. :ref:`Push the changes in your local working copy to your fork<push_changes>`.

#. Create a Pull Request to push the change upstream

   #. Create a Pull Request that asks for your changes to be added into the
      "upstream Ceph" repository.
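For orientation, here is a condensed, hypothetical shell transcript of the
whole procedure, assuming your GitHub username is "mygithubaccount" and a
bugfix branch named "fix_1" (each step is explained in detail below):

.. prompt:: bash $

   git clone https://github.com/mygithubaccount/ceph    # your fork
   cd ceph
   git remote add ceph https://github.com/ceph/ceph.git # upstream repo
   git fetch ceph
   git checkout -b fix_1 ceph/main                      # bugfix branch
   # ...edit, build, test...
   git commit -as                                       # signed-off commit
   git push -u origin fix_1                             # push to your fork
   # then open the pull request on GitHub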
Preparing Your Local Working Copy of the Ceph Repository
---------------------------------------------------------

The procedures in this section, "Preparing Your Local Working Copy of the Ceph
Repository", must be followed only when you are first setting up your local
environment. If this is your first time working with the Ceph project, then
these commands are necessary and are the first commands that you should run.

.. _forking:

Creating a Fork of the Ceph Repository
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

See the `GitHub documentation
<https://help.github.com/articles/fork-a-repo/#platform-linux>`_ for
detailed instructions on forking. In short, if your GitHub username is
"mygithubaccount", your fork of the upstream repo will appear at
``https://github.com/mygithubaccount/ceph``.

.. _cloning:

Cloning Your Fork
^^^^^^^^^^^^^^^^^

After you have created your fork, clone it by running the following command:

.. prompt:: bash $

   git clone https://github.com/mygithubaccount/ceph

You must fork the Ceph repository before you clone it. If you fail to fork,
you cannot open a `GitHub pull request
<https://docs.github.com/en/free-pro-team@latest/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request>`_.

For more information on using GitHub, refer to `GitHub Help
<https://help.github.com/>`_.

Configuring Your Local Environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The commands in this section configure your local git environment so that it
generates "Signed-off-by:" tags. They also set up your local environment so
that it can stay synchronized with the upstream repository.

These commands are necessary only during the initial setup of your local
working copy; in other words, you run them only the first time that you work
with the Ceph repository. They are, however, unavoidable: if you skip them,
you will not be able to work on the Ceph repository.

1. Configure your local git environment with your name and email address.

   .. note::
      These commands will work only from within the ``ceph/`` directory
      that was created when you cloned your fork.

   .. prompt:: bash $

      git config user.name "FIRST_NAME LAST_NAME"
      git config user.email "MY_NAME@example.com"

2. Add the upstream repo as a "remote" and fetch it:

   .. prompt:: bash $

      git remote add ceph https://github.com/ceph/ceph.git
      git fetch ceph

   These commands fetch all the branches and commits from ``ceph/ceph.git`` to
   the local git repo as ``remotes/ceph/$BRANCH_NAME``, which can be referenced
   as ``ceph/$BRANCH_NAME`` in local git commands.

Fixing the Bug
--------------

.. _synchronizing:

Synchronizing Local Main with Upstream Main
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In your local working copy, there is a copy of the ``main`` branch in
``remotes/origin/main``. This is called "local main". This copy of the
main branch (https://github.com/your_github_id/ceph.git) is "frozen in time"
at the moment that you cloned it, but the upstream repo
(https://github.com/ceph/ceph.git, typically abbreviated to ``ceph/ceph.git``)
that it was forked from is not frozen in time: the upstream repo is still being
updated by other contributors.
Because upstream main is continually receiving updates from other
contributors, your fork will drift farther and farther from the current state
of the upstream repo.

Keep your fork's ``main`` branch synchronized with upstream main to reduce
drift between your fork's main branch and the upstream main branch.

Here are the commands for keeping your fork synchronized with the
upstream repository:

.. prompt:: bash $

   git fetch ceph
   git checkout main
   git reset --hard ceph/main
   git push -u origin main

Follow this procedure often to keep your local ``main`` in sync with upstream
``main``.

If the command ``git status`` returns a line that reads "Untracked files", see
:ref:`the procedure on updating submodules <update-submodules>`.

.. _bugfix_branch:

Creating a Bugfix branch
^^^^^^^^^^^^^^^^^^^^^^^^

Create a branch for your bugfix:

.. prompt:: bash $

   git checkout main
   git checkout -b fix_1
   git push -u origin fix_1

The first command (``git checkout main``) makes sure that the bugfix branch
"fix_1" is created from the most recent state of the local main branch, which
you have just synchronized with the main branch of the upstream repository.

The second command (``git checkout -b fix_1``) creates a "bugfix branch" called
"fix_1" in your local working copy of the repository. The changes that you make
in order to fix the bug will be committed to this branch.

The third command (``git push -u origin fix_1``) pushes the bugfix branch from
your local working copy to your fork of the upstream repository.

.. _fixing_bug_locally:

Fixing the bug in the local working copy
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

#. **Updating the tracker**

   In the `Ceph issue tracker <https://tracker.ceph.com>`_, change the status
   of the tracker issue to "In progress". This communicates to other Ceph
   contributors that you have begun working on a fix, which helps to avoid
   duplication of effort. If you don't have permission to change that field,
   just comment that you are working on the issue.

#. **Fixing the bug itself**

   This guide cannot tell you how to fix the bug that you have chosen to fix.
   This guide assumes that you know what required improvement, and that you
   know what to do to provide that improvement.

   It might be that your fix is simple and requires only minimal testing. But
   that's unlikely. It is more likely that the process of fixing your bug will
   be iterative and will involve trial, error, skill, and patience.

   For a detailed discussion of the tools available for validating bugfixes,
   see the chapters on testing.

Pushing the Fix to Your Fork
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You have finished work on the bugfix. You have tested the bugfix, and you
believe that it works.

#. Commit the changes to your local working copy.

   Commit the changes to the ``fix_1`` branch of your local working copy by
   using the ``--signoff`` option (here represented as the ``s`` portion of
   the ``-as`` flag):

   .. prompt:: bash $

      git commit -as

   .. _push_changes:

#. Push the changes to your fork:

   Push the changes from the ``fix_1`` branch of your local working copy to the
   ``fix_1`` branch of your fork of the upstream repository:

   .. prompt:: bash $

      git push origin fix_1

   .. note::

      In the command ``git push origin fix_1``, ``origin`` is the name of your
      fork of the upstream Ceph repository, and can be thought of as a nickname
      for ``git@github.com:username/ceph.git``, where ``username`` is your
      GitHub username.

      It is possible that ``origin`` is not the name of your fork. Discover the
      name of your fork by running ``git remote -v``, as shown here:

      .. code-block:: bash

         $ git remote -v
         ceph https://github.com/ceph/ceph.git (fetch)
         ceph https://github.com/ceph/ceph.git (push)
         origin git@github.com:username/ceph.git (fetch)
         origin git@github.com:username/ceph.git (push)

      The line::

         origin git@github.com:username/ceph.git (fetch)

      and the line::

         origin git@github.com:username/ceph.git (push)

      provide the information that "origin" is the name of your fork of the
      Ceph repository.
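Returning to the commit step above: because ``git commit -as`` adds the
"Signed-off-by:" trailer for you, a finished commit message might look
something like the following sketch (the component prefix, summary line, and
tracker issue number are hypothetical placeholders)::

   mgr/dashboard: fix off-by-one error in pool list pagination

   Fixes: https://tracker.ceph.com/issues/$ISSUE_NUMBER
   Signed-off-by: FIRST_NAME LAST_NAME <MY_NAME@example.com>

This "component: title" convention matches the merge commit format described
later on this page and in `Submitting Patches
<https://github.com/ceph/ceph/blob/main/SubmittingPatches.rst>`_.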
Opening a GitHub pull request
-----------------------------

After you have pushed the bugfix to your fork, open a GitHub pull request
(PR). This makes your bugfix visible to the community of Ceph contributors.
They will review it. They may perform additional testing on your bugfix, and
they might request changes to the bugfix.

Be prepared to receive suggestions and constructive criticism in the form of
comments within the PR.

If you don't know how to create and manage pull requests, read `this GitHub
pull request tutorial`_.

.. _`this GitHub pull request tutorial`:
   https://help.github.com/articles/using-pull-requests/

To learn what constitutes a "good" pull request, see
the `Git Commit Good Practice`_ article at the `OpenStack Project Wiki`_.

.. _`Git Commit Good Practice`: https://wiki.openstack.org/wiki/GitCommitMessages
.. _`OpenStack Project Wiki`: https://wiki.openstack.org/wiki/Main_Page

See also our own `Submitting Patches
<https://github.com/ceph/ceph/blob/main/SubmittingPatches.rst>`_ document.

After your pull request (PR) has been opened, update the :ref:`issue-tracker`
by adding a comment directing other contributors to your PR. The comment can be
as simple as this::

   *PR*: https://github.com/ceph/ceph/pull/$NUMBER_OF_YOUR_PULL_REQUEST

Understanding Automated PR validation
-------------------------------------

When you create or update your PR, the Ceph project's `Continuous Integration
(CI) <https://en.wikipedia.org/wiki/Continuous_integration>`_ infrastructure
automatically tests it. At the time of this writing (May 2022), the automated
CI testing included many tests. These five are among them:

#. a test to check that the commits are properly signed (see :ref:`submitting-patches`)
#. a test to check that the documentation builds
#. a test to check that the submodules are unmodified
#. a test to check that the API is in order
#. a :ref:`make check<make-check>` test

Additional tests may be run, depending on which files your PR modifies.

The :ref:`make check<make-check>` test builds the PR and runs it through a
battery of tests. These tests run on servers that are operated by the Ceph
Continuous Integration (CI) team. When the tests have completed their run, the
result is shown on GitHub in the pull request itself.

Test your modifications before you open a PR. Refer to the chapters
on testing for details.

Notes on PR make check test
^^^^^^^^^^^^^^^^^^^^^^^^^^^

The GitHub :ref:`make check<make-check>` test is driven by a Jenkins instance.

Jenkins merges your PR branch into the latest version of the base branch before
it starts any tests. This means that you don't have to rebase the PR in order
to pick up any fixes.
You can trigger PR tests at any time by adding a comment to the PR. The
comment should contain the string "test this please". Since a human who is
subscribed to the PR might interpret that as a request for him or her to test
the PR, you must address Jenkins directly. For example, write "jenkins retest
this please". If you need to run only one of the tests, you can request it with
a command like "jenkins test signed". A list of these requests is automatically
added to the end of each new PR's description, so check there to find the
single test you need.

If there is a build failure and you aren't sure what caused it, check the
:ref:`make check<make-check>` log. To access the make check log, click the
"details" link (next to the :ref:`make check<make-check>` test in the PR) to
enter the Jenkins web GUI. Then click "Console Output" (on the left).

Jenkins is configured to search logs for strings that are known to have been
associated with :ref:`make check<make-check>` failures in the past. However,
there is no guarantee that these known strings are associated with any given
:ref:`make check<make-check>` failure. You'll have to read through the log to
determine the cause of your specific failure.

Integration tests AKA ceph-qa-suite
-----------------------------------

Since Ceph is complex, it may be necessary to test your fix to
see how it behaves on real clusters running on physical or virtual
hardware. Tests designed for this purpose live in the `ceph/qa
sub-directory`_ and are run via the `teuthology framework`_.

.. _`ceph/qa sub-directory`: https://github.com/ceph/ceph/tree/main/qa/
.. _`teuthology repository`: https://github.com/ceph/teuthology
.. _`teuthology framework`: https://github.com/ceph/teuthology

The Ceph community has access to the `Sepia lab
<https://wiki.sepia.ceph.com/doku.php>`_ where :ref:`testing-integration-tests` can be
run on physical hardware. Other developers may add tags like "needs-qa" to your
PR. This allows PRs that need testing to be merged into a single branch and
tested all at the same time. Since teuthology suites can take hours (even
days in some cases) to run, this can save a lot of time.

To request access to the Sepia lab, start `here <https://wiki.sepia.ceph.com/doku.php?id=vpnaccess>`_.

Integration testing is discussed in more detail in the :ref:`testing-integration-tests`
chapter.

Code review
-----------

Once your bugfix has been thoroughly tested, or even during this process,
it will be subjected to code review by other developers. This typically
takes the form of comments in the PR itself, but can be supplemented
by discussions on :ref:`irc` and the :ref:`mailing-list`.

Amending your PR
----------------

While your PR is going through testing and `Code Review`_, you can
modify it at any time by editing files in your local branch.

After updates are committed locally (to the ``fix_1`` branch in our
example), they need to be pushed to GitHub so they appear in the PR.

Modifying the PR is done by adding commits to the ``fix_1`` branch upon
which it is based, often followed by rebasing to modify the branch's git
history. See `this tutorial
<https://www.atlassian.com/git/tutorials/rewriting-history>`_ for a good
introduction to rebasing. When you are done with your modifications, you
will need to force push your branch with:

.. prompt:: bash $

   git push --force origin fix_1
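For example, a typical (hypothetical) amendment session that folds review
fixups back into the original commit before force pushing might look like
this sketch:

.. prompt:: bash $

   git checkout fix_1
   git commit -as                  # commit the requested changes
   git rebase -i ceph/main         # mark the fixup commits as "squash"
   git push --force origin fix_1   # update the PR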
Why do we take these extra steps instead of simply adding additional commits to
the PR? It is best practice for a PR to consist of a single commit; this
makes for clean history, eases peer review of your changes, and facilitates
merges. In rare circumstances it also makes it easier to cleanly revert
changes.

Merging
-------

The bugfix process completes when a project lead merges your PR.

When this happens, it is a signal for you (or the lead who merged the PR)
to change the :ref:`issue-tracker` status to "Resolved". Some issues may be
flagged for backporting, in which case the status should be changed to
"Pending Backport" (see the :ref:`backporting` chapter for details).

See also :ref:`merging` for more information on merging.

Proper Merge Commit Format
^^^^^^^^^^^^^^^^^^^^^^^^^^

This is the most basic form of a merge commit::

   doc/component: title of the commit

   Reviewed-by: Reviewer Name <rname@example.com>

This consists of two parts:

#. The title of the commit / PR to be merged.
#. The name and email address of the reviewer. Enclose the reviewer's email
   address in angle brackets.

Using .githubmap to Find a Reviewer's Email Address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you cannot find the email address of the reviewer on his or her GitHub
page, you can look it up in the **.githubmap** file, which can be found in
the repository at **/ceph/.githubmap**.

Using "git log" to find a Reviewer's Email Address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you cannot find a reviewer's email address by using the above methods, you
can search the git log for their email address. Reviewers are likely to have
committed something before. If they have made previous contributions, the git
log will probably contain their email address.

Use the following command:

.. prompt:: bash [branch-under-review]$

   git log

Using ptl-tool to Generate Merge Commits
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Another method of generating merge commits involves using Patrick Donnelly's
**ptl-tool** to pull commits. This tool can be found at
**/ceph/src/script/ptl-tool.py**. Merge commits that have been generated by
the **ptl-tool** have the following form::

   Merge PR #36257 into main

   * refs/pull/36257/head:
           client: move client_lock to _unmount()
           client: add timer_lock support

   Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
diff --git a/doc/dev/developer_guide/dash-devel.rst b/doc/dev/developer_guide/dash-devel.rst
new file mode 100644
index 000000000..29974555a
--- /dev/null
+++ b/doc/dev/developer_guide/dash-devel.rst
@@ -0,0 +1,2590 @@

.. _dashdevel:

Ceph Dashboard Developer Documentation
======================================

.. contents:: Table of Contents

Feature Design
--------------

To promote collaboration on new Ceph Dashboard features, the first step is
the definition of a design document. These documents then form the basis of
implementation scope and permit wider participation in the evolution of the
Ceph Dashboard UI.

.. toctree::
   :maxdepth: 1
   :caption: Design Documents:

   UI Design Goals <../dashboard/ui_goals>


Preliminary Steps
-----------------

The following documentation chapters expect a running Ceph cluster and at
least a running ``dashboard`` manager module (with a few exceptions). This
chapter gives an introduction on how to set up such a system for development,
without the need to set up a full-blown production environment. All options
introduced in this chapter are based on a so-called ``vstart`` environment.

.. note::

  Every ``vstart`` environment needs Ceph `to be compiled`_ from its GitHub
  repository, though Docker environments simplify that step by providing a
  shell script that contains those instructions.

  One exception to this rule is the `build-free`_ capability of
  `ceph-dev`_. See below for more information.

.. _to be compiled: https://docs.ceph.com/docs/master/install/build-ceph/

vstart
~~~~~~

"vstart" is actually a shell script in the ``src/`` directory of the Ceph
repository (``src/vstart.sh``). It is used to start a single-node Ceph
cluster on the machine where it is executed. Several required and some
optional Ceph internal services are started automatically when it is used to
start a Ceph cluster. vstart is the basis for the three most commonly used
development environments in Ceph Dashboard.

You can read more about vstart in `Deploying a development cluster`_.
Additional information for developers can also be found in the `Developer
Guide`_.

.. _Deploying a development cluster: https://docs.ceph.com/docs/master/dev/dev_cluster_deployement/
.. _Developer Guide: https://docs.ceph.com/docs/master/dev/quick_guide/

Host-based vs Docker-based Development Environments
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This document introduces you to three different development environments, all
based on vstart. Those are:

- vstart running on your host system

- vstart running in a Docker environment

  * ceph-dev-docker_
  * ceph-dev_

  Besides their independent development branches and sometimes slightly
  different approaches, they also differ with respect to their underlying
  operating systems.

  ========= ====================== ========
  Release   ceph-dev-docker        ceph-dev
  ========= ====================== ========
  Mimic     openSUSE Leap 15       CentOS 7
  Nautilus  openSUSE Leap 15       CentOS 7
  Octopus   openSUSE Leap 15.2     CentOS 8
  --------- ---------------------- --------
  Master    openSUSE Tumbleweed    CentOS 8
  ========= ====================== ========

.. note::

  Independently of which of these environments you choose, you need to
  compile Ceph in that environment. If you compiled Ceph on your host system,
  you would have to recompile it on Docker to be able to switch to a Docker
  based solution. The same is true vice versa. If you previously used a
  Docker development environment and compiled Ceph there and you now want to
  switch to your host system, you will also need to recompile Ceph (or
  compile Ceph using another separate repository).

  `ceph-dev`_ is an exception to this rule, as one of the options it provides
  is `build-free`_. This is accomplished through a Ceph installation using
  RPM system packages. You will still be able to work with a local GitHub
  repository like you are used to.


Development environment on your host system
...........................................

- No need to learn or have experience with Docker; jump in right away.

- Limited number of scripts to support automation (like Ceph compilation).

- No pre-configured easy-to-start services (Prometheus, Grafana, etc).

- Limited number of host operating systems supported, depending on which
  Ceph version is supposed to be used.

- Dependencies need to be installed on your host.

- You might find yourself in the situation where you need to upgrade your
  host operating system (for instance due to a change of the GCC version used
  to compile Ceph).
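For reference, the compile step that every host-based vstart setup assumes
looks roughly like the following sketch (commands taken from the Ceph build
documentation linked above; exact steps can vary by release and distribution):

.. code::

  git clone https://github.com/ceph/ceph.git
  cd ceph
  ./install-deps.sh      # install build dependencies for your distribution
  ./do_cmake.sh          # configure a (debug) build in ./build
  cd build && ninja      # compile; this can take a long time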
Development environments based on Docker
........................................

- Some overhead in learning Docker if you are not used to it yet.

- Both Docker projects provide you with scripts that help you get started
  and automate recurring tasks.

- Both Docker environments come with partly pre-configured external services
  which can be used to attach to or complement Ceph Dashboard features, like

  - Prometheus
  - Grafana
  - Node-Exporter
  - Shibboleth
  - HAProxy

- Works independently of the operating system you use on your host.


.. _build-free: https://github.com/rhcs-dashboard/ceph-dev#quick-install-rpm-based

vstart on your host system
~~~~~~~~~~~~~~~~~~~~~~~~~~

The vstart script is usually called from your ``build/`` directory like so:

.. code::

  ../src/vstart.sh -n -d

In this case ``-n`` ensures that a new vstart cluster is created and that a
possibly previously created cluster isn't re-used. ``-d`` enables debug
messages in log files. There are several more options to choose from. You can
get a list using the ``--help`` argument.

At the end of the output of vstart, there should be information about the
dashboard and its URLs::

  vstart cluster complete. Use stop.sh to stop. See out/* (e.g. 'tail -f out/????') for debug output.

  dashboard urls: https://192.168.178.84:41259, https://192.168.178.84:43259, https://192.168.178.84:45259
  w/ user/pass: admin / admin
  restful urls: https://192.168.178.84:42259, https://192.168.178.84:44259, https://192.168.178.84:46259
  w/ user/pass: admin / 598da51f-8cd1-4161-a970-b2944d5ad200

During development (especially in backend development), you will also
occasionally want to check whether the dashboard manager module is still
running. To do so you can call ``./bin/ceph mgr services`` manually. It will
list all the URLs of successfully enabled services. Only URLs of services
which are available over HTTP(S) will be listed there. Ceph Dashboard is one
of these services. It should look similar to the following output:

.. code::

  $ ./bin/ceph mgr services
  {
      "dashboard": "https://home:41931/",
      "restful": "https://home:42931/"
  }

By default, this environment uses a randomly chosen port for Ceph Dashboard,
and you need to use this command to find out which port that is.

Docker
~~~~~~

Docker development environments usually ship with a lot of useful scripts.
``ceph-dev-docker``, for instance, contains a file called ``start-ceph.sh``,
which cleans up log files, always starts a Rados Gateway service, sets some
Ceph Dashboard configuration options and automatically runs a frontend proxy,
all before or after starting up your vstart cluster.

Instructions on how to use those environments are contained in their
respective repository README files.

- ceph-dev-docker_
- ceph-dev_

.. _ceph-dev-docker: https://github.com/ricardoasmarques/ceph-dev-docker
.. _ceph-dev: https://github.com/rhcs-dashboard/ceph-dev

Frontend Development
--------------------

Before you can start the dashboard from within a development environment, you
will need to generate the frontend code and either use a compiled and running
Ceph cluster (e.g. started by ``vstart.sh``) or the standalone development web
server.

The build process is based on `Node.js <https://nodejs.org/>`_ and requires the
`Node Package Manager <https://www.npmjs.com/>`_ ``npm`` to be installed.

Prerequisites
~~~~~~~~~~~~~

 * Node 12.18.2 or higher
 * NPM 6.13.4 or higher
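A quick way to confirm that your environment satisfies these version
requirements is sketched below (the version numbers are those from the list
above):

.. code::

  node --version   # should print v12.18.2 or newer
  npm --version    # should print 6.13.4 or newer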
nodeenv:
  During Ceph's build we create a virtualenv with ``node`` and ``npm``
  installed, which can be used as an alternative to installing node/npm in your
  system.

  If you want to use the node installed in the virtualenv you just need to
  activate the virtualenv before you run any npm commands. To activate it run
  ``. build/src/pybind/mgr/dashboard/node-env/bin/activate``.

  Once you finish, you can simply run ``deactivate`` and exit the virtualenv.

Angular CLI:
  If you do not have the `Angular CLI <https://github.com/angular/angular-cli>`_
  installed globally, then you need to execute ``ng`` commands with an
  additional ``npm run`` before it.

Package installation
~~~~~~~~~~~~~~~~~~~~

Run ``npm ci`` in directory ``src/pybind/mgr/dashboard/frontend`` to
install the required packages locally.

Adding or updating packages
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Run the following commands to add/update a package::

  npm install <PACKAGE_NAME>
  npm ci

Setting up a Development Server
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Create the ``proxy.conf.json`` file based on ``proxy.conf.json.sample``.

Run ``npm start`` for a dev server.
Navigate to ``http://localhost:4200/``. The app will automatically
reload if you change any of the source files.

Code Scaffolding
~~~~~~~~~~~~~~~~

Run ``ng generate component component-name`` to generate a new
component. You can also use
``ng generate directive|pipe|service|class|guard|interface|enum|module``.

Build the Project
~~~~~~~~~~~~~~~~~

Run ``npm run build`` to build the project. The build artifacts will be
stored in the ``dist/`` directory. Use the ``--prod`` flag for a
production build (``npm run build -- --prod``). Navigate to ``https://localhost:8443``.

Build the Code Documentation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Run ``npm run doc-build`` to generate code docs in the ``documentation/``
directory. To make them accessible locally for a web browser, run
``npm run doc-serve`` and they will become available at ``http://localhost:8444``.
With ``npm run compodoc -- <opts>`` you may
`fully configure it <https://compodoc.app/guides/usage.html>`_.

Code linting and formatting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We use the following tools to lint and format the code in all our TS, SCSS and
HTML files:

- `codelyzer <http://codelyzer.com/>`_
- `html-linter <https://github.com/chinchiheather/html-linter>`_
- `htmllint-cli <https://github.com/htmllint/htmllint-cli>`_
- `Prettier <https://prettier.io/>`_
- `TSLint <https://palantir.github.io/tslint/>`_
- `stylelint <https://stylelint.io/>`_

We added two npm scripts to help run these tools:

- ``npm run lint``, will check frontend files against all linters
- ``npm run fix``, will try to fix all the detected linting errors

Ceph Dashboard and Bootstrap
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently we are using Bootstrap on the Ceph Dashboard as a CSS framework. This
means that most of our SCSS and HTML code can make use of all the utilities and
other advantages Bootstrap offers. In the past we often used our own custom
styles, and this led to more and more single-use and doubly defined variables
that were sometimes forgotten and never removed, or to inconsistent styling
because people forgot to change a color or to adjust a custom SCSS class.
To get the current version of Bootstrap used inside Ceph please refer to the
``package.json`` and search for:

- ``bootstrap``: For the Bootstrap version used.
- ``@ng-bootstrap``: For the version of the Angular bindings which we are using.

So for the future please do the following when visiting a component:

- Does this HTML/SCSS code use custom code? - If yes: Is it needed? --> Clean
  it up before changing the things you want to fix or change.
- If you are creating a new component: Please make use of Bootstrap as much as
  reasonably possible! Don't try to reinvent the wheel.
- If possible please look up if Bootstrap has guidelines on how to extend it
  properly to achieve what you want to achieve.

The more Bootstrap-like our code is, the easier it is to theme and maintain,
and the fewer bugs we will have. Also, since Bootstrap is a framework designed
with usability and user experience in mind, we improve both by sticking to it.
The biggest benefit of all is that there is less code for us to maintain, which
makes it easier to read for beginners and even easier for people who are
already familiar with the code.

Writing Unit Tests
~~~~~~~~~~~~~~~~~~

To write unit tests efficiently, we have a small collection of tools that we
use within test suites.

Those tools can be found under
``src/pybind/mgr/dashboard/frontend/src/testing/``; especially take
a look at ``unit-test-helper.ts``.

There you will be able to find:

``configureTestBed`` that replaces the initial ``TestBed``
methods. It takes the same arguments as ``TestBed.configureTestingModule``.
Using it will run your tests a lot faster in development, as it doesn't
recreate everything from scratch on every test. To use the default behaviour
pass ``true`` as the second argument.

``PermissionHelper`` to help determine if
the correct actions are shown based on the current permissions and selection
in a list.

``FormHelper`` which makes testing a form a lot easier
with a few simple methods. It allows you to set a control or multiple
controls, expect if a control is valid or has an error, or just do both with
one method. Additionally, you can expect a template element or multiple
elements to be visible in the rendered template.

Running Unit Tests
~~~~~~~~~~~~~~~~~~

Run ``npm run test`` to execute the unit tests via `Jest
<https://facebook.github.io/jest/>`_.

If you get errors on all tests, it could be because `Jest
<https://facebook.github.io/jest/>`__ or something else was updated.
There are a few ways you can try to resolve this:

- Remove all modules with ``rm -rf dist node_modules`` and run ``npm install``
  again in order to reinstall them
- Clear the cache of jest by running ``npx jest --clearCache``
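While iterating on a single component, it can be faster to run only one spec
file by calling Jest directly from ``src/pybind/mgr/dashboard/frontend``. A
sketch (the spec file path is a hypothetical example)::

  $ npx jest src/app/shared/services/my.service.spec.ts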
Running End-to-End (E2E) Tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We use `Cypress <https://www.cypress.io/>`__ to run our frontend E2E tests.

E2E Prerequisites
.................

You need to previously build the frontend.

In some environments, depending on your user permissions and the CYPRESS_CACHE_FOLDER,
you might need to run ``npm ci`` with the ``--unsafe-perm`` flag.

You might need to install additional packages to be able to run Cypress.
Please run ``npx cypress verify`` to verify it.

run-frontend-e2e-tests.sh
.........................

Our ``run-frontend-e2e-tests.sh`` script is the go-to solution when you wish
to do a full-scale E2E run.
It will verify if everything needed is installed, start a new vstart cluster
and run the full test suite.

Start all frontend E2E tests by running::

  $ ./run-frontend-e2e-tests.sh

Report:
  You can follow the e2e report on the terminal and you can find the screenshots
  of failed test cases by opening the following directory::

    src/pybind/mgr/dashboard/frontend/cypress/screenshots/

Device:
  You can force the script to use a specific device with the ``-d`` flag::

    $ ./run-frontend-e2e-tests.sh -d <chrome|chromium|electron|docker>

Remote:
  By default this script will stop and start a new vstart cluster.
  If you want to run the tests outside the ceph environment, you will need to
  manually define the dashboard url using ``-r`` and, optionally, credentials
  (``-u``, ``-p``)::

    $ ./run-frontend-e2e-tests.sh -r <DASHBOARD_URL> -u <E2E_LOGIN_USER> -p <E2E_LOGIN_PWD>

Note:
  When using docker as your device, you might need to run the script with sudo
  permissions.

run-cephadm-e2e-tests.sh
.........................

``run-cephadm-e2e-tests.sh`` runs a subset of E2E tests to verify that the Dashboard and cephadm as
Orchestrator backend behave correctly.

Prerequisites: you need to install `KCLI
<https://kcli.readthedocs.io/en/latest/>`_ and Node.js on your local machine.

Configure KCLI plan requirements::

  $ sudo chown -R $(id -un) /var/lib/libvirt/images
  $ mkdir -p /var/lib/libvirt/images/ceph-dashboard dashboard
  $ kcli create pool -p /var/lib/libvirt/images/ceph-dashboard dashboard
  $ kcli create network -c 192.168.100.0/24 dashboard

Note:
  This script is intended to be run as a Jenkins job, so the cleanup is
  triggered only in a Jenkins environment. Locally, the user shuts down the
  cluster when desired (i.e. after debugging).

Start E2E tests by running::

  $ cd <your/ceph/repo/dir>
  $ sudo chown -R $(id -un) src/pybind/mgr/dashboard/frontend/{dist,node_modules,src/environments}
  $ ./src/pybind/mgr/dashboard/ci/cephadm/run-cephadm-e2e-tests.sh

Note:
  In Fedora 35, a permission error can occur when trying to mount the
  shared_folders. This can be fixed by running::

    $ sudo setfacl -R -m u:qemu:rwx <abs-path-to-your-user-home>

  or by setting the appropriate permissions on your ``$HOME`` directory.

You can also start a cluster in development mode (so the frontend build starts in watch mode and you
only have to reload the page for the changes to be reflected) by running::

  $ ./src/pybind/mgr/dashboard/ci/cephadm/start-cluster.sh --dev-mode

Note:
  Add ``--expanded`` if you need a cluster ready to deploy services (one with
  enough monitor daemons spread across different hosts and enough OSDs).

Test your changes by running::

  $ ./src/pybind/mgr/dashboard/ci/cephadm/run-cephadm-e2e-tests.sh

Shut down the cluster by running::

  $ kcli delete plan -y ceph
  $ # In development mode, also kill the npm build watch process (e.g., pkill -f "ng build")

Other running options
.....................

During active development, it is not recommended to run the previous script,
as it is not prepared for constant file changes.
Instead you should use one of the following commands:

- ``npm run e2e`` - This will run ``ng serve`` and open the Cypress Test Runner.
- ``npm run e2e:ci`` - This will run ``ng serve`` and run the Cypress Test Runner once.
- ``npx cypress run`` - This calls cypress directly and will run the Cypress Test Runner.
  You need to have a running frontend server.
- ``npx cypress open`` - This calls cypress directly and will open the Cypress Test Runner.
  You need to have a running frontend server.

Calling Cypress directly has the advantage that you can use any of the available
`flags <https://docs.cypress.io/guides/guides/command-line.html#cypress-run>`__
to customize your test run, and you don't need to start a frontend server each time.

Using one of the ``open`` commands will open the Cypress application, where you
can see all the test files you have and run each individually.
This is going to be run in watch mode, so if you make any changes to test files,
it will retrigger the test run.
This cannot be used inside docker, as it requires an X11 environment to be able to open.

By default Cypress will look for the web page at ``https://localhost:4200/``.
If you are serving it at a different URL, you will need to configure it by
exporting the environment variable CYPRESS_BASE_URL with the new value.
E.g.: ``CYPRESS_BASE_URL=https://localhost:41076/ npx cypress open``

CYPRESS_CACHE_FOLDER
....................

When installing cypress via npm, a binary of the cypress app will also be
downloaded and stored in a cache folder.
This removes the need to download it every time you run ``npm ci`` or even when
using cypress in a separate project.

By default Cypress uses ``~/.cache`` to store the binary.
To prevent changes to the user home directory, we have changed this folder to
``/ceph/build/src/pybind/mgr/dashboard/cypress``, so when you build ceph or run
``run-frontend-e2e-tests.sh`` this is the directory Cypress will use.

When using any other command to install or run cypress,
it will go back to the default directory. It is recommended that you export the
CYPRESS_CACHE_FOLDER environment variable with a fixed directory, so you always
use the same directory no matter which command you use.


Writing End-to-End Tests
~~~~~~~~~~~~~~~~~~~~~~~~

The PageHelper class
....................

The ``PageHelper`` class is supposed to be used for general purpose code that
can be used on various pages or suites.

Examples are:

- ``navigateTo()`` - Navigates to a specific page and waits for it to load
- ``getFirstTableCell()`` - returns the first table cell. You can also pass a
  string with the desired content and it will return the first cell that
  contains it.
- ``getTabsCount()`` - returns the number of tabs

Every method that could be useful on several pages belongs there. Also, methods
which enhance the derived classes of the PageHelper belong there. A good
example of such a case is the ``restrictTo()`` decorator. It ensures that a
method implemented in a subclass of PageHelper is called on the correct page.
It will also show a developer-friendly warning if this is not the case.

Subclasses of PageHelper
........................

Helper Methods
""""""""""""""

In order to make code reusable which is specific for a particular suite, make
sure to put it in a derived class of the ``PageHelper``. For instance, when
talking about the pool suite, such methods would be ``create()``, ``exist()``
and ``delete()``. These methods are specific to a pool but are useful for other
suites.

Methods that return HTML elements which can only be found on a specific page
should be implemented either in the helper methods of the ``PageHelper``
subclass or as methods of that subclass.
Using PageHelpers
"""""""""""""""""

In any suite, an instance of the specific ``Helper`` class should be
instantiated and called directly.

.. code:: TypeScript

  const pools = new PoolPageHelper();

  it('should create a pool', () => {
    pools.exist(poolName, false);
    pools.navigateTo('create');
    pools.create(poolName, 8);
    pools.exist(poolName, true);
  });

Code Style
..........

Please refer to the official `Cypress Core Concepts
<https://docs.cypress.io/guides/core-concepts/introduction-to-cypress.html#Cypress-Can-Be-Simple-Sometimes>`__
for a better insight on how to write and structure tests.

``describe()`` vs ``it()``
""""""""""""""""""""""""""

Both ``describe()`` and ``it()`` are function blocks, meaning that any
executable code necessary for the test can be contained in either block.
However, TypeScript scoping rules still apply; therefore any variables declared
in a ``describe`` are available to the ``it()`` blocks inside of it.

``describe()`` blocks typically are containers for tests, allowing you to break
tests into multiple parts. Likewise, any setup that must be made before your
tests are run can be initialized within the ``describe()`` block. Here is an
example:

.. code:: TypeScript

  describe('create, edit & delete image test', () => {
    const poolName = 'e2e_images_pool';

    before(() => {
      cy.login();
      pools.navigateTo('create');
      pools.create(poolName, 8, 'rbd');
      pools.exist(poolName, true);
    });

    beforeEach(() => {
      cy.login();
      images.navigateTo();
    });

    //...

  });

As shown, we can initialize the variable ``poolName`` as well as run commands
before our test suite begins (creating a pool). ``describe()`` block messages
should state what the test suite is.

``it()`` blocks typically are parts of an overarching test. They contain the
functionality of the test suite, each performing individual roles.
Here is an example:

.. code:: TypeScript

  describe('create, edit & delete image test', () => {
    //...

    it('should create image', () => {
      images.createImage(imageName, poolName, '1');
      images.getFirstTableCell(imageName).should('exist');
    });

    it('should edit image', () => {
      images.editImage(imageName, poolName, newImageName, '2');
      images.getFirstTableCell(newImageName).should('exist');
    });

    //...
  });

As shown in the previous example, our ``describe()`` test suite is to create,
edit and delete an image. Therefore, each ``it()`` completes one of these steps,
one for creating, one for editing, and so on. Likewise, every ``it()`` block's
message should be in lowercase and written so that "it" can be the prefix of
the message. For example, ``it('edits the test image' () => ...)`` vs.
``it('image edit test' () => ...)``. As shown, the first example makes
grammatical sense with ``it()`` as the prefix whereas the second message does
not. ``it()`` should describe what the individual test is doing and what it
expects to happen.

Differences between Frontend Unit Tests and End-to-End (E2E) Tests / FAQ
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This section provides a general introduction to testing and to E2E/unit tests.

What are E2E/unit tests designed for?
.....................................

E2E test:

It requires a fully functional system and tests the interaction of all components
of the application (Ceph, back-end, front-end).
E2E tests are designed to mimic the behavior of the user when interacting with the application
- for example when it comes to workflows like creating/editing/deleting an item.
The tests should also verify that certain items are displayed as a user would
see them when clicking through the UI (for example a menu entry, or a pool
that has been created during a test, whose properties should be displayed in
the table).

Angular Unit Tests:

Unit tests, as the name suggests, are tests for smaller units of the code.
Those tests are designed for testing all kinds of Angular components (e.g. services, pipes etc.).
They do not require a connection to the backend, hence those tests are independent of it.
The expected data of the backend is mocked in the frontend and by using this data
the functionality of the frontend can be tested without having to have real data from the backend.
As previously mentioned, data is either mocked or, in a simple case, contains a static input,
a function call and an expected static output.
More complex examples include the state of a component (attributes of the
component class) that defines how the output changes according to the given input.

Which E2E/unit tests are considered to be valid?
................................................

This is not easy to answer, but new tests that are written in the same way as already existing
dashboard tests should generally be considered valid.
Unit tests should focus on the component to be tested.
This is either an Angular component, directive, service, pipe, etc.

E2E tests should focus on testing the functionality of the whole application.
Approximately a third of the overall E2E tests should verify the correctness
of user-visible elements.

What should an E2E/unit test look like?
.......................................

Unit tests should focus on the described purpose
and shouldn't try to test other things in the same ``it`` block.

E2E tests should contain a description that either verifies
the correctness of a user-visible element or a complete process,
such as the creation/validation/deletion of a pool.

What should an E2E/unit test cover?
...................................

E2E tests should mostly, but not exclusively, cover interaction with the backend.
This way the interaction with the backend is utilized to write integration tests.

A unit test should mostly cover critical or complex functionality
of a component (Angular Components, Services, Pipes, Directives, etc).

What should an E2E/unit test NOT cover?
.......................................

Avoid duplicate testing: do not write E2E tests for what's already
been covered by frontend unit tests and vice versa.
It may not be possible to completely avoid an overlap.

Unit tests should not be used to extensively click through components, and E2E
tests shouldn't be used to extensively test a single Angular component.

Best practices/guideline
........................

As a general guideline we try to follow the 70/20/10 approach - 70% unit tests,
20% integration tests and 10% end-to-end tests.
For further information please refer to `this document
<https://testing.googleblog.com/2015/04/just-say-no-to-more-end-to-end-tests.html>`__
and the included "Testing Pyramid".

Further Help
~~~~~~~~~~~~

To get more help on the Angular CLI use ``ng help`` or check out the
`Angular CLI
README <https://github.com/angular/angular-cli/blob/master/README.md>`__.
Example of a Generator
~~~~~~~~~~~~~~~~~~~~~~

::

  # Create module 'Core'
  src/app> ng generate module core -m=app --routing

  # Create module 'Auth' under module 'Core'
  src/app/core> ng generate module auth -m=core --routing
  or, alternatively:
  src/app> ng generate module core/auth -m=core --routing

  # Create component 'Login' under module 'Auth'
  src/app/core/auth> ng generate component login -m=core/auth
  or, alternatively:
  src/app> ng generate component core/auth/login -m=core/auth

Frontend Typescript Code Style Guide Recommendations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Group the imports based on their source and separate them with a blank
line.

The source groups can be either from Angular, external or internal.

Example:

.. code:: javascript

  import { Component } from '@angular/core';
  import { Router } from '@angular/router';

  import { ToastrManager } from 'ngx-toastr';

  import { Credentials } from '../../../shared/models/credentials.model';
  import { HostService } from './services/host.service';

Frontend components
~~~~~~~~~~~~~~~~~~~

There are several components that can be reused on different pages.
These components are declared in the components module:
`src/pybind/mgr/dashboard/frontend/src/app/shared/components`.

Helper
~~~~~~

This component should be used to provide additional information to the user.

Example:

.. code:: html

  <cd-helper>
    Some <strong>helper</strong> html text
  </cd-helper>

Terminology and wording
~~~~~~~~~~~~~~~~~~~~~~~

Instead of using the Ceph component names, the approach
suggested is to use the logical/generic names (Block over RBD, Filesystem over
CephFS, Object over RGW). Nevertheless, as Ceph-Dashboard cannot completely hide
the Ceph internals, some Ceph-specific names might remain visible.

Regarding the wording for action labels and other textual elements (form titles,
buttons, etc.), the chosen approach is to follow `these guidelines
<https://www.patternfly.org/styles/terminology-and-wording/#terminology-and-wording-for-action-labels>`_.
As a rule of thumb, 'Create' and 'Delete' are the proper wording for most forms,
instead of 'Add' and 'Remove', unless some already created item is either added
or removed to/from a set of items (e.g.: 'Add permission' to a user vs. 'Create
(new) permission').

In order to enforce the use of this wording, a service ``ActionLabelsI18n`` has
been created, which provides translated labels for use in UI elements.

Frontend branding
~~~~~~~~~~~~~~~~~

Every vendor can customize the 'Ceph dashboard' to their needs. Whether it is
the logo, an HTML template or TypeScript, every file inside the frontend folder
can be replaced.

To replace files, open ``./frontend/angular.json`` and scroll to the section
``fileReplacements`` inside the production configuration. Here you can add the
files you wish to brand. We recommend placing the branded version of a file in
the same directory as the original one and adding a ``.brand`` to the file
name, right in front of the file extension. A ``fileReplacement`` could for
example look like this:
.. code:: javascript

  {
    "replace": "src/app/core/auth/login/login.component.html",
    "with": "src/app/core/auth/login/login.component.brand.html"
  }

To serve or build the branded user interface run::

  $ npm run start -- --prod

or::

  $ npm run build -- --prod

Unfortunately it's currently not possible to use multiple configurations when
serving or building the UI at the same time. That means a configuration just
for the branding ``fileReplacements`` is not an option, because you want to use
the production configuration anyway
(https://github.com/angular/angular-cli/issues/10612).
Furthermore it's also not possible to use glob expressions for
``fileReplacements``. As long as the feature hasn't been implemented, you have
to add the file replacements manually to the angular.json file
(https://github.com/angular/angular-cli/issues/12354).

Nevertheless you should stick to the suggested naming scheme because it makes
it easier for you to use glob expressions once they are supported in the
future.

To change the variable defaults, or to add your own, you can overwrite them in
``./frontend/src/styles/vendor/_variables.scss``.
Just reassign the variable you want to change, for example ``$color-primary: teal;``.
To overwrite or extend the default CSS, you can add your own styles in
``./frontend/src/styles/vendor/_style-overrides.scss``.

UI Style Guide
~~~~~~~~~~~~~~

The style guide is created to document Ceph Dashboard standards and maintain
consistency across the project. It is an effort to make it easier for
contributors to design and decide on mockups and designs for the Dashboard.

The development environment for Ceph Dashboard has live reloading enabled, so
any changes made in the UI are reflected in open browser windows. Ceph Dashboard
uses Bootstrap as the main third-party CSS library.

Avoid duplication of code. Be consistent with the existing UI by reusing
existing SCSS declarations as much as possible.

Always check for existing code similar to what you want to write.
You should always try to keep the same look-and-feel as the existing code.

Colors
......

All the colors used in the Ceph Dashboard UI are listed in
`frontend/src/styles/defaults/_bootstrap-defaults.scss`. If using a new color,
always define color variables in the `_bootstrap-defaults.scss` and
use the variable instead of hard-coded color values, so that changes to the
color are reflected in similar UI elements.

The main color for the Ceph Dashboard is `$primary`. The primary color is
used in navigation components and as the `$border-color` for input components of
forms.

The secondary color is `$secondary` and is the background color for the Ceph
Dashboard.

Buttons
.......

Buttons are used for performing actions such as "Submit", "Edit", "Create" and
"Update".

**Forms:** When used to submit forms anywhere in the Dashboard, the main action
button should use the `cd-submit-button` component and the secondary button should
use the `cd-back-button` component. The text on the action button should be the
same as the form title and follow title case. The text on the secondary button
should be `Cancel`. The action button should always be on the right, while the
`Cancel` button should always be on the left.

**Modals**: The main action button should use the `cd-submit-button` component and
the secondary button should use the `cd-back-button` component. The text on the
action button should follow title case and correspond to the action to be
performed.
The text on the secondary button should be `Close`.

**Disclosure Button:** Disclosure buttons should be used to allow users to
display and hide additional content in the interface.

**Action Button**: Use the action button to perform actions such as edit or update
a component. All action buttons should have an icon corresponding to the action
they perform, and the button text should follow title case. The button color
should be the same as the form's main button color.

**Drop Down Buttons:** Use dropdown buttons to display predefined lists of
actions. All dropdown buttons have icons corresponding to the actions they
perform.

Links
.....

Use text hyperlinks as navigation to guide users to a new page in the application
or to anchor users to a section within a page. The color of the hyperlinks
should be `$primary`.

Forms
.....

Mark invalid form fields with a red outline and show a meaningful error message.
Use red as the font color for the message and be as specific as possible.
`This field is required.` should be the exact error message for required fields.
Mark valid form fields with a green outline and a green tick at the end of the
form. Sections should not have a bigger header than the parent.

Modals
......

Blur any interface elements in the background to bring the modal content into
focus. The heading of the modal should reflect the action it can perform and
should be clearly mentioned at the top of the modal. Use the `cd-back-button`
component in the footer for closing the modal.

Icons
.....

We use `Fork Awesome <https://forkaweso.me/Fork-Awesome/>`_ classes for icons.
We have a list of used icons in `src/app/shared/enum/icons.enum.ts`; these
should be referenced in the HTML, so it's easier to change them later. When
icons are next to text, they should be center-aligned horizontally. If icons
are stacked, they should also be center-aligned vertically. Use small icons
with buttons. For notifications use large icons.

Navigation
..........

For local navigation use tabs. For overall navigation use expandable vertical
navigation to collapse and expand items as needed.

Alerts and notifications
........................

Default notifications should have the `text-info` color. Success notifications
should have the `text-success` color. Failure notifications should have the
`text-danger` color.

Error Handling
~~~~~~~~~~~~~~

For handling front-end errors, there is a generic Error Component which can be
found in ``./src/pybind/mgr/dashboard/frontend/src/app/core/error``. For
reporting a new error, you can simply extend the ``DashboardError`` class
in the ``error.ts`` file and add a specific header and message for the new
error. Some generic error classes are already in place, such as
``DashboardNotFoundError`` and ``DashboardForbiddenError``, which can be called
and reused in different scenarios.

For example - ``throw new DashboardNotFoundError()``.

I18N
----

How to extract messages from source code?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To extract the I18N messages from the templates and the TypeScript files just
run the following command in ``src/pybind/mgr/dashboard/frontend``::

  $ npm run i18n:extract

This will extract all marked messages from the HTML templates first and then
add all marked strings from the TypeScript files to the translation template.
+
+Since the extraction from TypeScript files is still not supported by Angular
+itself, we are using the
+`ngx-translator <https://github.com/ngx-translate/i18n-polyfill>`_ extractor
+to parse the TypeScript files.
+
+When the command has run successfully, it should have created or updated the
+file ``src/locale/messages.xlf``.
+
+The file isn't tracked by git. You can use it to start translating offline,
+or to add/update the resource files on Transifex.
+
+Supported languages
+~~~~~~~~~~~~~~~~~~~
+
+All our supported languages should be registered in both exports in
+``supported-languages.enum.ts`` and have a corresponding test in
+``language-selector.component.spec.ts``.
+
+The ``SupportedLanguages`` enum will provide the list for the default
+language selection.
+
+Translating process
+~~~~~~~~~~~~~~~~~~~
+
+To facilitate the translation process of the dashboard, we are using a web
+tool called `transifex <https://www.transifex.com/>`_.
+
+If you wish to help translate into any language, just go to our `transifex
+project page <https://www.transifex.com/ceph/ceph-dashboard/>`_, join the
+project, and you can start translating immediately.
+
+All translations will then be reviewed and later pushed upstream.
+
+Updating translated messages
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Any time there are new messages translated and reviewed in a specific
+language, we should update the translation file upstream.
+
+To do that, check the settings in the i18n config file
+``src/pybind/mgr/dashboard/frontend/i18n.config.json`` and make sure that
+the organization is *ceph*, the project is *ceph-dashboard* and the
+resource is the one you want to pull from and push to, e.g. *Master:master*.
+To find a list of available resources, visit
+`<https://www.transifex.com/ceph/ceph-dashboard/content/>`_.
+
+After you have checked the config, go to the directory
+``src/pybind/mgr/dashboard/frontend`` and run::
+
+    $ npm run i18n
+
+This command will extract all marked messages from the HTML templates and
+TypeScript files. Once the source file has been created, it will push it to
+Transifex and pull the latest translations. It will also fill all the
+untranslated strings with the source string.
+The tool will ask you for an API token, unless you have already added it by
+running::
+
+    $ npm run i18n:token
+
+To create a Transifex API token, visit
+`<https://www.transifex.com/user/settings/api/>`_.
+
+After the command has run successfully, build the UI and check that
+everything is working as expected. You might also want to run the frontend
+tests.
+
+Suggestions
+~~~~~~~~~~~
+
+Strings need to start and end on the same line as the element:
+
+.. code-block:: html
+
+    <!-- avoid -->
+    <span i18n>
+        Foo
+    </span>
+
+    <!-- recommended -->
+    <span i18n>Foo</span>
+
+
+    <!-- avoid -->
+    <span i18n>
+        Foo bar baz.
+        Foo bar baz.
+    </span>
+
+    <!-- recommended -->
+    <span i18n>Foo bar baz.
+    Foo bar baz.</span>
+
+Isolated interpolations should not be translated:
+
+.. code-block:: html
+
+    <!-- avoid -->
+    <span i18n>{{ foo }}</span>
+
+    <!-- recommended -->
+    <span>{{ foo }}</span>
+
+Interpolations used in a sentence should be kept in the translation:
+
+.. code-block:: html
+
+    <!-- recommended -->
+    <span i18n>There are {{ x }} OSDs.</span>
+
+Remove elements that are outside the context of the translation:
+
+.. code-block:: html
+
+    <!-- avoid -->
+    <label i18n>
+        Profile
+        <span class="required"></span>
+    </label>
+
+    <!-- recommended -->
+    <label>
+        <ng-container i18n>Profile</ng-container>
+        <span class="required"></span>
+    </label>
+
+Keep elements that affect the sentence:
+
+.. code-block:: html
+
+    <!-- recommended -->
+    <span i18n>Profile <b>foo</b> will be removed.</span>
+
+Backend Development
+-------------------
+
+The Python backend code of this module requires a number of Python modules
+to be installed. They are listed in the file ``requirements.txt``. Using
+`pip <https://pypi.python.org/pypi/pip>`_ you may install all required
+dependencies by issuing ``pip install -r requirements.txt`` in the directory
+``src/pybind/mgr/dashboard``.
+
+If you're using the `ceph-dev-docker development environment
+<https://github.com/ricardoasmarques/ceph-dev-docker/>`_, simply run
+``./install_deps.sh`` from the toplevel directory to install them.
+
+Unit Testing
+~~~~~~~~~~~~
+
+In dashboard we have two different kinds of backend tests:
+
+1. Unit tests based on ``tox``
+2. API tests based on Teuthology.
+
+Unit tests based on tox
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+We included a ``tox`` configuration file that will run the unit tests under
+Python 3, as well as linting tools to guarantee the uniformity of code.
+
+You need to install ``tox`` and ``coverage`` before running it. To install
+the packages on your system, either use your operating system's package
+management tools, e.g. by running ``dnf install python-tox python-coverage``
+on Fedora Linux, or use Python's native package installation method::
+
+    $ pip install tox
+    $ pip install coverage
+
+To run the tests, run ``src/script/run_tox.sh`` in the dashboard directory
+(where ``tox.ini`` is located)::
+
+    ## Run Python 3 tests+lint commands:
+    $ ../../../script/run_tox.sh --tox-env py3,lint,check
+
+    ## Run Python 3 arbitrary command (e.g. 1 single test):
+    $ ../../../script/run_tox.sh --tox-env py3 "" tests/test_rgw_client.py::RgwClientTest::test_ssl_verify
+
+You can also run ``tox`` directly instead of ``run_tox.sh``::
+
+    ## Run Python 3 tests command:
+    $ tox -e py3
+
+    ## Run Python 3 arbitrary command (e.g. 1 single test):
+    $ tox -e py3 tests/test_rgw_client.py::RgwClientTest::test_ssl_verify
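+
+For instance, a brand-new test module picked up by these commands could look
+like this (a minimal sketch; ``test_example.py`` and its contents are
+illustrative, not part of the dashboard code base):
+
+.. code-block:: python
+
+    # src/pybind/mgr/dashboard/tests/test_example.py (hypothetical)
+    import unittest
+
+
+    class ExampleTest(unittest.TestCase):
+        def test_trivial(self):
+            # a self-contained assertion; a real test would exercise
+            # dashboard code instead
+            self.assertEqual(1 + 1, 2)
+
+Such a module could then be run on its own with
+``../../../script/run_tox.sh --tox-env py3 "" tests/test_example.py`` or
+``tox -e py3 tests/test_example.py``.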
+
+Python files can be automatically fixed and formatted according to PEP8
+standards by using ``run_tox.sh --tox-env fix`` or ``tox -e fix``.
+
+We also collect coverage information from the backend code when you run
+tests. You can check the coverage information provided by the tox output,
+or by running the following command after tox has finished successfully::
+
+    $ coverage html
+
+This command will create a directory ``htmlcov`` with an HTML representation
+of the code coverage of the backend.
+
+API tests based on Teuthology
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+How to run existing API tests:
+  To run the API tests against a real Ceph cluster, we leverage the
+  Teuthology framework. This has the advantage of catching bugs originating
+  from changes in the internal Ceph code.
+
+  Our ``run-backend-api-tests.sh`` script will start a ``vstart`` Ceph
+  cluster before running the Teuthology tests, and then it stops the
+  cluster after the tests are run. Of course this implies that you have
+  built/compiled Ceph previously.
+ + Start all dashboard tests by running:: + + $ ./run-backend-api-tests.sh + + Or, start one or multiple specific tests by specifying the test name:: + + $ ./run-backend-api-tests.sh tasks.mgr.dashboard.test_pool.PoolTest + + Or, ``source`` the script and run the tests manually:: + + $ source run-backend-api-tests.sh + $ run_teuthology_tests [tests]... + $ cleanup_teuthology + +How to write your own tests: + There are two possible ways to write your own API tests: + + The first is by extending one of the existing test classes in the + ``qa/tasks/mgr/dashboard`` directory. + + The second way is by adding your own API test module if you're creating a new + controller for example. To do so you'll just need to add the file containing + your new test class to the ``qa/tasks/mgr/dashboard`` directory and implement + all your tests here. + + .. note:: Don't forget to add the path of the newly created module to + ``modules`` section in ``qa/suites/rados/mgr/tasks/dashboard.yaml``. + + Short example: Let's assume you created a new controller called + ``my_new_controller.py`` and the related test module + ``test_my_new_controller.py``. You'll need to add + ``tasks.mgr.dashboard.test_my_new_controller`` to the ``modules`` section in + the ``dashboard.yaml`` file. + + Also, if you're removing test modules please keep in mind to remove the + related section. Otherwise the Teuthology test run will fail. + + Please run your API tests on your dev environment (as explained above) + before submitting a pull request. Also make sure that a full QA run in + Teuthology/sepia lab (based on your changes) has completed successfully + before it gets merged. You don't need to schedule the QA run yourself, just + add the 'needs-qa' label to your pull request as soon as you think it's ready + for merging (e.g. make check was successful, the pull request is approved and + all comments have been addressed). One of the developers who has access to + Teuthology/the sepia lab will take care of it and report the result back to + you. + + +How to add a new controller? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A controller is a Python class that extends from the ``BaseController`` class +and is decorated with either the ``@Controller``, ``@ApiController`` or +``@UiApiController`` decorators. The Python class must be stored inside a Python +file located under the ``controllers`` directory. The Dashboard module will +automatically load your new controller upon start. + +``@ApiController`` and ``@UiApiController`` are both specializations of the +``@Controller`` decorator. + +The ``@ApiController`` should be used for controllers that provide an API-like +REST interface and the ``@UiApiController`` should be used for endpoints consumed +by the UI but that are not part of the 'public' API. For any other kinds of +controllers the ``@Controller`` decorator should be used. + +A controller has a URL prefix path associated that is specified in the +controller decorator, and all endpoints exposed by the controller will share +the same URL prefix path. + +A controller's endpoint is exposed by implementing a method on the controller +class decorated with the ``@Endpoint`` decorator. + +For example create a file ``ping.py`` under ``controllers`` directory with the +following code: + +.. 
code-block:: python
+
+    from ..tools import Controller, ApiController, UiApiController, BaseController, Endpoint
+
+    @Controller('/ping')
+    class Ping(BaseController):
+        @Endpoint()
+        def hello(self):
+            return {'msg': "Hello"}
+
+    @ApiController('/ping')
+    class ApiPing(BaseController):
+        @Endpoint()
+        def hello(self):
+            return {'msg': "Hello"}
+
+    @UiApiController('/ping')
+    class UiApiPing(BaseController):
+        @Endpoint()
+        def hello(self):
+            return {'msg': "Hello"}
+
+The ``hello`` endpoint of the ``Ping`` controller can be reached by the
+following URL: https://mgr_hostname:8443/ping/hello using HTTP GET requests.
+As you can see, the controller URL path ``/ping`` is concatenated to the
+method name ``hello`` to generate the endpoint's URL.
+
+In the case of the ``ApiPing`` controller, the ``hello`` endpoint can be
+reached by the following URL: https://mgr_hostname:8443/api/ping/hello using
+an HTTP GET request.
+The API controller URL path ``/ping`` is prefixed by the ``/api`` path and
+then concatenated to the method name ``hello`` to generate the endpoint's
+URL. Internally, the ``@ApiController`` is actually calling the
+``@Controller`` decorator by passing an additional decorator parameter
+called ``base_url``::
+
+    @ApiController('/ping') <=> @Controller('/ping', base_url="/api")
+
+``UiApiPing`` works in a similar way to ``ApiPing``, but the URL will be
+prefixed by ``/ui-api``: https://mgr_hostname:8443/ui-api/ping/hello.
+``UiApiPing`` is also a ``@Controller`` extension::
+
+    @UiApiController('/ping') <=> @Controller('/ping', base_url="/ui-api")
+
+The ``@Endpoint`` decorator also supports many parameters to customize the
+endpoint:
+
+* ``method="GET"``: the HTTP method allowed to access this endpoint.
+* ``path="/<method_name>"``: the URL path of the endpoint, excluding the
+  controller URL path prefix.
+* ``path_params=[]``: list of method parameter names that correspond to URL
+  path parameters. Can only be used when ``method in ['POST', 'PUT']``.
+* ``query_params=[]``: list of method parameter names that correspond to URL
+  query parameters.
+* ``json_response=True``: indicates if the endpoint response should be
+  serialized in JSON format.
+* ``proxy=False``: indicates if the endpoint should be used as a proxy.
+
+An endpoint method may have parameters declared. Depending on the HTTP
+method defined for the endpoint, the method parameters might be considered
+either path parameters, query parameters, or body parameters.
+
+For ``GET`` and ``DELETE`` methods, the method's non-optional parameters are
+considered path parameters by default. Optional parameters are considered
+query parameters. By specifying ``query_params`` in the endpoint decorator,
+it is possible to make a non-optional parameter a query parameter.
+
+For ``POST`` and ``PUT`` methods, all method parameters are considered body
+parameters by default. To override this default, one can use the
+``path_params`` and ``query_params`` to specify which method parameters are
+path and query parameters respectively.
+Body parameters are decoded from the request body, either from a form
+format, or from a dictionary in JSON format.
+
+Let's use an example to better understand the possible ways to customize an
+endpoint:
+
+.. code-block:: python
+
+    from ..tools import Controller, BaseController, Endpoint
+
+    @Controller('/ping')
+    class Ping(BaseController):
+
+        # URL: /ping/{key}?opt1=...&opt2=...
+        @Endpoint(path="/", query_params=['opt1'])
+        def index(self, key, opt1, opt2=None):
+            """..."""
+
+        # URL: /ping/{key}?opt1=...&opt2=...
+        @Endpoint(query_params=['opt1'])
+        def __call__(self, key, opt1, opt2=None):
+            """..."""
+
+        # URL: /ping/post/{key1}/{key2}
+        @Endpoint('POST', path_params=['key1', 'key2'])
+        def post(self, key1, key2, data1, data2=None):
+            """..."""
+
+
+In the above example we see how the ``path`` option can be used to override
+the generated endpoint URL, in order to not use the method's name in the
+URL. In the ``index`` method we set the ``path`` to ``"/"`` to generate an
+endpoint that is accessible by the root URL of the controller.
+
+An alternative approach to generate an endpoint that is accessible through
+just the controller's path URL is by using the ``__call__`` method, as we
+show in the above example.
+
+From the third method you can see that the path parameters are collected
+from the URL by parsing the list of values separated by slashes ``/`` that
+come after the URL path ``/ping`` in the case of the ``index`` method, and
+after ``/ping/post`` in the case of the ``post`` method.
+
+Defining path parameters in endpoint URLs using Python method parameters is
+easy, but it is still a bit strict with respect to the position of these
+parameters in the URL structure.
+Sometimes we may want to explicitly define a URL scheme that contains path
+parameters mixed with static parts of the URL.
+Our controller infrastructure also supports the declaration of URL paths
+with explicit path parameters at both the controller level and the method
+level.
+
+Consider the following example:
+
+.. code-block:: python
+
+    from ..tools import Controller, BaseController, Endpoint
+
+    @Controller('/ping/{node}/stats')
+    class Ping(BaseController):
+
+        # URL: /ping/{node}/stats/{date}/latency?unit=...
+        @Endpoint(path="/{date}/latency")
+        def latency(self, node, date, unit="ms"):
+            """ ..."""
+
+In this example we explicitly declare a path parameter ``{node}`` in the
+controller URL path, and a path parameter ``{date}`` in the ``latency``
+method. The endpoint for the ``latency`` method is then accessible through
+the URL: https://mgr_hostname:8443/ping/{node}/stats/{date}/latency .
+
+For a full set of examples on how to use the ``@Endpoint`` decorator,
+please check the unit test file ``tests/test_controllers.py``. There you
+will find many examples of how to customize endpoint methods.
+
+
+Implementing Proxy Controller
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Sometimes you might need to relay some requests from the Dashboard frontend
+directly to an external service.
+For that purpose we provide a decorator called ``@Proxy``.
+(As a concrete example, check the ``controllers/rgw.py`` file, where we
+implemented an RGW Admin Ops proxy.)
+
+
+The ``@Proxy`` decorator is a wrapper of the ``@Endpoint`` decorator that
+already customizes the endpoint for working as a proxy.
+A proxy endpoint works by capturing the URL path that follows the controller
+URL prefix path, and does not do any decoding of the request body.
+
+Example:
+
+.. code-block:: python
+
+    from ..tools import Controller, BaseController, Proxy
+
+    @Controller('/foo/proxy')
+    class FooServiceProxy(BaseController):
+
+        @Proxy()
+        def proxy(self, path, **params):
+            """
+            if requested URL is "/foo/proxy/access/service?opt=1"
+            then path is "access/service" and params is {'opt': '1'}
+            """
+
+
+How does the RESTController work?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We also provide a simple mechanism to create REST-based controllers using
+the ``RESTController`` class. Any class which inherits from
+``RESTController`` will, by default, return JSON.
+
+The ``RESTController`` is basically an additional abstraction layer which
+eases and unifies the work with collections. A collection is just an array
+of objects with a specific type. ``RESTController`` enables some default
+mappings of request types and given parameters to specific method names.
+This may sound complicated at first, but it's fairly easy. Let's have a
+look at the following example:
+
+.. code-block:: python
+
+    import cherrypy
+    from ..tools import ApiController, RESTController
+
+    @ApiController('ping')
+    class Ping(RESTController):
+        def list(self):
+            return {"msg": "Hello"}
+
+        def get(self, id):
+            return self.objects[id]
+
+In this case, the ``list`` method is automatically used for all requests to
+``api/ping`` where no additional argument is given and where the request
+type is ``GET``. If the request is given an additional argument, the ID in
+our case, it won't map to ``list`` anymore but to ``get`` and return the
+element with the given ID (assuming that ``self.objects`` has been filled
+before). The same applies to other request types:
+
++--------------+------------+----------------+-------------+
+| Request type | Arguments  | Method         | Status Code |
++==============+============+================+=============+
+| GET          | No         | list           | 200         |
++--------------+------------+----------------+-------------+
+| PUT          | No         | bulk_set       | 200         |
++--------------+------------+----------------+-------------+
+| POST         | No         | create         | 201         |
++--------------+------------+----------------+-------------+
+| DELETE       | No         | bulk_delete    | 204         |
++--------------+------------+----------------+-------------+
+| GET          | Yes        | get            | 200         |
++--------------+------------+----------------+-------------+
+| PUT          | Yes        | set            | 200         |
++--------------+------------+----------------+-------------+
+| DELETE       | Yes        | delete         | 204         |
++--------------+------------+----------------+-------------+
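+
+For instance, a hypothetical controller that fills in most of these
+mappings might look like this (a sketch; ``Widget`` and its in-memory store
+are illustrative and not part of the Dashboard code base):
+
+.. code-block:: python
+
+    from ..tools import ApiController, RESTController
+
+    @ApiController('widget')
+    class Widget(RESTController):
+        widgets = {}  # hypothetical in-memory store: id -> name
+
+        def list(self):            # GET /api/widget
+            return [{'id': k, 'name': v} for k, v in self.widgets.items()]
+
+        def create(self, name):    # POST /api/widget, returns 201
+            new_id = str(len(self.widgets))
+            self.widgets[new_id] = name
+            return {'id': new_id, 'name': name}
+
+        def get(self, id):         # GET /api/widget/<id>
+            return {'id': id, 'name': self.widgets[id]}
+
+        def set(self, id, name):   # PUT /api/widget/<id>
+            self.widgets[id] = name
+            return {'id': id, 'name': name}
+
+        def delete(self, id):      # DELETE /api/widget/<id>, returns 204
+            del self.widgets[id]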
+
+To use a custom endpoint for the above listed methods, you can use
+``@RESTController.MethodMap``:
+
+.. code-block:: python
+
+    import cherrypy
+    from ..tools import ApiController, RESTController
+
+    @ApiController('ping')
+    class Ping(RESTController):
+        @RESTController.MethodMap(version='0.1')
+        def create(self):
+            return {"msg": "Hello"}
+
+This decorator supports three parameters to customize the endpoint:
+
+* ``resource``: resource id.
+* ``status=200``: set the HTTP status response code
+* ``version``: version
+
+How to use a custom API endpoint in a RESTController?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you don't have any access restriction, you can use ``@Endpoint``. If you
+have set a permission scope to restrict access to your endpoints,
+``@Endpoint`` will fail, as it doesn't know which permission property should
+be used. To use a custom endpoint inside a restricted ``RESTController``,
+use ``@RESTController.Collection`` instead. You can also choose
+``@RESTController.Resource`` if you have set a ``RESOURCE_ID`` in your
+``RESTController`` class.
+
+.. 
code-block:: python + + import cherrypy + from ..tools import ApiController, RESTController + + @ApiController('ping', Scope.Ping) + class Ping(RESTController): + RESOURCE_ID = 'ping' + + @RESTController.Resource('GET') + def some_get_endpoint(self): + return {"msg": "Hello"} + + @RESTController.Collection('POST') + def some_post_endpoint(self, **data): + return {"msg": data} + +Both decorators also support five parameters to customize the +endpoint: + +* ``method="GET"``: the HTTP method allowed to access this endpoint. +* ``path="/<method_name>"``: the URL path of the endpoint, excluding the + controller URL path prefix. +* ``status=200``: set the HTTP status response code +* ``query_params=[]``: list of method parameter names that correspond to URL + query parameters. +* ``version``: version + +How to restrict access to a controller? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +All controllers require authentication by default. +If you require that the controller can be accessed without authentication, +then you can add the parameter ``secure=False`` to the controller decorator. + +Example: + +.. code-block:: python + + import cherrypy + from . import ApiController, RESTController + + + @ApiController('ping', secure=False) + class Ping(RESTController): + def list(self): + return {"msg": "Hello"} + +How to create a dedicated UI endpoint which uses the 'public' API? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Sometimes we want to combine multiple calls into one single call +to save bandwidth or for other performance reasons. +In order to achieve that, we first have to create an ``@UiApiController`` which +is used for endpoints consumed by the UI but that are not part of the +'public' API. Let the ui class inherit from the REST controller class. +Now you can use all methods from the api controller. + +Example: + +.. code-block:: python + + import cherrypy + from . import UiApiController, ApiController, RESTController + + + @ApiController('ping', secure=False) # /api/ping + class Ping(RESTController): + def list(self): + return self._list() + + def _list(self): # To not get in conflict with the JSON wrapper + return [1,2,3] + + + @UiApiController('ping', secure=False) # /ui-api/ping + class PingUi(Ping): + def list(self): + return self._list() + [4, 5, 6] + +How to access the manager module instance from a controller? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +We provide the manager module instance as a global variable that can be +imported in any module. + +Example: + +.. code-block:: python + + import logging + import cherrypy + from .. import mgr + from ..tools import ApiController, RESTController + + logger = logging.getLogger(__name__) + + @ApiController('servers') + class Servers(RESTController): + def list(self): + logger.debug('Listing available servers') + return {'servers': mgr.list_servers()} + + +How to write a unit test for a controller? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +We provide a test helper class called ``ControllerTestCase`` to easily create +unit tests for your controller. + +If we want to write a unit test for the above ``Ping`` controller, create a +``test_ping.py`` file under the ``tests`` directory with the following code: + +.. 
code-block:: python
+
+    from .helper import ControllerTestCase
+    from .controllers.ping import Ping
+
+
+    class PingTest(ControllerTestCase):
+        @classmethod
+        def setup_test(cls):
+            cp_config = {'tools.authenticate.on': True}
+            cls.setup_controllers([Ping], cp_config=cp_config)
+
+        def test_ping(self):
+            self._get("/api/ping")
+            self.assertStatus(200)
+            self.assertJsonBody({'msg': 'Hello'})
+
+The ``ControllerTestCase`` class starts by initializing a CherryPy
+webserver. Then it will call the ``setup_test()`` class method, where we
+can explicitly load the controllers that we want to test. In the above
+example we are only loading the ``Ping`` controller. We can also provide
+``cp_config`` in order to update the controller's cherrypy config (e.g.
+enable authentication as shown in the example).
+
+How to update or create new dashboards in grafana?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We are using ``jsonnet`` and ``grafonnet-lib`` to write code for the
+Grafana dashboards. All the dashboards are written in the
+``grafana_dashboards.jsonnet`` file in the
+``monitoring/grafana/dashboards/jsonnet`` directory.
+
+We generate the dashboard json files directly from this jsonnet file by
+running this command in the ``grafana/dashboards`` directory:
+``jsonnet -m . jsonnet/grafana_dashboards.jsonnet``.
+(For the above command to succeed, the ``jsonnet`` package must be
+installed and the ``grafonnet-lib`` directory cloned on your machine.
+Please refer to `<https://grafana.github.io/grafonnet-lib/getting-started/>`_
+in case you have some trouble.)
+
+To update an existing Grafana dashboard, or to create a new one, update the
+``grafana_dashboards.jsonnet`` file and generate the new/updated json files
+using the above mentioned command. People who are not familiar with
+grafonnet or jsonnet can follow this doc:
+`<https://grafana.github.io/grafonnet-lib/>`_.
+
+Example Grafana dashboard in jsonnet format:
+
+To specify the Grafana dashboard properties such as title, uid, etc., we
+can create a local function::
+
+    local dashboardSchema(title, uid, time_from, refresh, schemaVersion, tags, timezone, timepicker)
+
+To add a graph panel, we can specify the graph schema in a local function
+such as::
+
+    local graphPanelSchema(title, nullPointMode, stack, formatY1, formatY2, labelY1, labelY2, min, fill, datasource)
+
+and then use these functions inside the dashboard definition like this::
+
+    {
+      radosgw-sync-overview.json: //json file name to be generated
+
+      dashboardSchema(
+        'RGW Sync Overview', 'rgw-sync-overview', 'now-1h', '15s', .., .., ..
+      )
+
+      .addPanels([
+        graphPanelSchema(
+          'Replication (throughput) from Source Zone', 'Bps', null, .., .., ..)
+      ])
+    }
+
+The valid grafonnet-lib attributes can be found here:
+`<https://grafana.github.io/grafonnet-lib/api-docs/>`_.
+
+
+How to listen for manager notifications in a controller?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The manager notifies the modules of several types of cluster events, such
+as cluster logging events.
+
+Each module has a "global" handler function called ``notify`` that the
+manager calls to notify the module. But this handler function must not
+block or spend too much time processing the event notification.
+For this reason, we provide a notification queue that controllers can
+register themselves with to receive cluster notifications.
+
+The example below represents a controller that implements a very simple
+live log viewer page:
+
+.. code-block:: python
+
+    from __future__ import absolute_import
+
+    import collections
+
+    import cherrypy
+
+    from ..tools import ApiController, BaseController, NotificationQueue
+
+
+    @ApiController('livelog')
+    class LiveLog(BaseController):
+        log_buffer = collections.deque(maxlen=1000)
+
+        def __init__(self):
+            super(LiveLog, self).__init__()
+            NotificationQueue.register(self.log, 'clog')
+
+        def log(self, log_struct):
+            self.log_buffer.appendleft(log_struct)
+
+        @cherrypy.expose
+        def default(self):
+            ret = '<html><meta http-equiv="refresh" content="2" /><body>'
+            for l in self.log_buffer:
+                ret += "{}<br>".format(l)
+            ret += "</body></html>"
+            return ret
+
+As you can see above, the ``NotificationQueue`` class provides a
+``register`` method that receives the function as its first argument and
+the "notification type" as the second argument.
+You can omit the second argument of the ``register`` method; in that case
+you are registering to listen to all notifications of any type.
+
+Here is a list of notification types (these might change in the future)
+that can be used:
+
+* ``clog``: cluster log notifications
+* ``command``: notification when a command issued by
+  ``MgrModule.send_command`` completes
+* ``perf_schema_update``: perf counters schema update
+* ``mon_map``: monitor map update
+* ``fs_map``: cephfs map update
+* ``osd_map``: OSD map update
+* ``service_map``: services (RGW, RBD-Mirror, etc.) map update
+* ``mon_status``: monitor status regular update
+* ``health``: health status regular update
+* ``pg_summary``: regular update of PG status information
+
+
+How to write a unit test when a controller accesses a Ceph module?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Consider the following example that implements a controller that retrieves
+the list of RBD images of the ``rbd`` pool:
+
+.. code-block:: python
+
+    import rbd
+    from .. import mgr
+    from ..tools import ApiController, RESTController
+
+
+    @ApiController('rbdimages')
+    class RbdImages(RESTController):
+        def __init__(self):
+            self.ioctx = mgr.rados.open_ioctx('rbd')
+            self.rbd = rbd.RBD()
+
+        def list(self):
+            return [{'name': n} for n in self.rbd.list(self.ioctx)]
+
+In the example above, we want to mock the return value of the ``rbd.list``
+function, so that we can test the JSON response of the controller.
+
+The unit test code will look like the following:
+
+.. code-block:: python
+
+    import mock
+    from .helper import ControllerTestCase
+
+
+    class RbdImagesTest(ControllerTestCase):
+        @mock.patch('rbd.RBD.list')
+        def test_list(self, rbd_list_mock):
+            rbd_list_mock.return_value = ['img1', 'img2']
+            self._get('/api/rbdimages')
+            self.assertJsonBody([{'name': 'img1'}, {'name': 'img2'}])
+
+
+
+How to add a new configuration setting?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you need to store some configuration setting for a new feature, we
+already provide an easy mechanism for you to define and use the new config
+setting.
+
+For instance, if you want to add a new configuration setting to hold the
+email address of the dashboard admin, just add a setting name as a class
+attribute to the ``Options`` class in the ``settings.py`` file::
+
+    # ...
+    class Options(object):
+        # ...
+
+        ADMIN_EMAIL_ADDRESS = ('admin@admin.com', str)
+
+The value of the class attribute is a pair composed of the default value
+for that setting and the Python type of the value.
+
+By declaring the ``ADMIN_EMAIL_ADDRESS`` class attribute, when you restart
+the dashboard module, you will automatically gain two additional CLI
+commands to get and set that setting::
+
+    $ ceph dashboard get-admin-email-address
+    $ ceph dashboard set-admin-email-address <value>
+
+To access or modify the config setting value from your Python code, either
+inside a controller or anywhere else, you just need to import the
+``Settings`` class and access it like this:
+
+.. code-block:: python
+
+    from settings import Settings
+
+    # ...
+    tmp_var = Settings.ADMIN_EMAIL_ADDRESS
+
+    # ....
+    Settings.ADMIN_EMAIL_ADDRESS = 'myemail@admin.com'
+
+The settings management implementation will make sure that if you change a
+setting value from the Python code, you will see that change when accessing
+that setting from the CLI, and vice-versa.
+
+
+How to run a controller read-write operation asynchronously?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Some controllers might need to execute operations that alter the state of
+the Ceph cluster. These operations might take some time to execute. To
+maintain a good user experience in the Web UI, we need to run those
+operations asynchronously and immediately return to the frontend some
+information indicating that the operations are running in the background.
+
+To help with the development of the above scenario, we added support for
+asynchronous tasks. To trigger the execution of an asynchronous task, we
+must use the following class method of the ``TaskManager`` class::
+
+    from ..tools import TaskManager
+    # ...
+    TaskManager.run(name, metadata, func, args, kwargs)
+
+* ``name`` is a string that can be used to group tasks. For instance,
+  for RBD image creation tasks we could specify ``"rbd/create"`` as the
+  name, or similarly ``"rbd/remove"`` for RBD image removal tasks.
+
+* ``metadata`` is a dictionary where we can store key-value pairs that
+  characterize the task. For instance, when creating a task for creating
+  RBD images we can specify the metadata argument as
+  ``{'pool_name': "rbd", 'image_name': "test-img"}``.
+
+* ``func`` is the python function that implements the operation code, which
+  will be executed asynchronously.
+
+* ``args`` and ``kwargs`` are the positional and named arguments that will
+  be passed to ``func`` when the task manager starts its execution.
+
+The ``TaskManager.run`` method triggers the asynchronous execution of
+function ``func`` and returns a ``Task`` object.
+The ``Task`` provides the public method ``Task.wait(timeout)``, which can
+be used to wait for the task to complete up to a timeout defined in seconds
+and provided as an argument. If no argument is provided, the ``wait``
+method blocks until the task is finished.
+
+``Task.wait`` is very useful for tasks that are usually fast to execute
+but that sometimes may take a long time to run.
+The return value of the ``Task.wait`` method is a pair ``(state, value)``
+where ``state`` is a string with the following possible values:
+
+* ``VALUE_DONE = "done"``
+* ``VALUE_EXECUTING = "executing"``
+
+The ``value`` will store the result of the execution of function ``func``
+if ``state == VALUE_DONE``. If ``state == VALUE_EXECUTING`` then
+``value == None``.
+
+The pair ``(name, metadata)`` should unequivocally identify the task being
+run, which means that if you try to trigger a new task that matches the
+same ``(name, metadata)`` pair of the currently running task, then the new
+task is not created and you get the task object of the currently running
+task.
+
+For instance, consider the following example:
+
+.. code-block:: python
+
+    task1 = TaskManager.run("dummy/task", {'attr': 2}, func)
+    task2 = TaskManager.run("dummy/task", {'attr': 2}, func)
+
+If the second call to ``TaskManager.run`` executes while the first task is
+still executing, then it will return the same task object:
+``assert task1 == task2``.
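+
+Putting it together, a controller method that triggers a task and waits
+briefly for its result might look like this (a sketch; the ``image/create``
+task name and the ``_create`` helper are illustrative, not part of the
+Dashboard code base):
+
+.. code-block:: python
+
+    from ..tools import ApiController, RESTController, TaskManager
+
+
+    @ApiController('image')
+    class Image(RESTController):
+        def create(self, name):
+            def _create(name):
+                # hypothetical long-running operation
+                return {'image_name': name}
+
+            task = TaskManager.run("image/create", {'image_name': name},
+                                   _create, [name])
+            # wait up to two seconds; returns early if the task finishes sooner
+            state, value = task.wait(2.0)
+            return {'state': state, 'value': value}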
+
+
+How to get the list of executing and finished asynchronous tasks?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The list of executing and finished tasks is included in the ``Summary``
+controller, which is already polled every 5 seconds by the dashboard
+frontend. But we also provide a dedicated controller to get the same list
+of executing and finished tasks.
+
+The ``Task`` controller exposes the ``/api/task`` endpoint that returns the
+list of executing and finished tasks. This endpoint accepts the ``name``
+parameter that accepts a glob expression as its value.
+For instance, an HTTP GET request of the URL ``/api/task?name=rbd/*``
+will return all executing and finished tasks whose name starts with
+``rbd/``.
+
+To prevent the finished tasks list from growing unbounded, we will always
+maintain the 10 most recent finished tasks, and the remaining older
+finished tasks will be removed when reaching a TTL of 1 minute. The TTL is
+calculated using the timestamp when the task finished its execution. After
+a minute, when the finished task information is retrieved, either by the
+summary controller or by the task controller, it is automatically deleted
+from the list and it will not be included in further task queries.
+
+Each executing task is represented by the following dictionary::
+
+    {
+        'name': "name",  # str
+        'metadata': { },  # dict
+        'begin_time': "2018-03-14T15:31:38.423605Z",  # str (ISO 8601 format)
+        'progress': 0  # int (percentage)
+    }
+
+Each finished task is represented by the following dictionary::
+
+    {
+        'name': "name",  # str
+        'metadata': { },  # dict
+        'begin_time': "2018-03-14T15:31:38.423605Z",  # str (ISO 8601 format)
+        'end_time': "2018-03-14T15:31:39.423605Z",  # str (ISO 8601 format)
+        'duration': 0.0,  # float
+        'progress': 0,  # int (percentage)
+        'success': True,  # bool
+        'ret_value': None,  # object, populated only if 'success' == True
+        'exception': None,  # str, populated only if 'success' == False
+    }
+
+
+How to use asynchronous APIs with asynchronous tasks?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``TaskManager.run`` method, as described in a previous section, is well
+suited for calling blocking functions, as it runs the function inside a
+newly created thread. But sometimes we want to call some function of an
+API that is already asynchronous by nature.
+
+For these cases we want to avoid creating a new thread just to run a
+non-blocking function, and want to leverage the asynchronous nature of the
+function. The ``TaskManager.run`` is already prepared to be used with
+non-blocking functions by passing an object of the type ``TaskExecutor``
+as an additional parameter called ``executor``.
+The full method signature of ``TaskManager.run`` is::
+
+    TaskManager.run(name, metadata, func, args=None, kwargs=None, executor=None)
+
+
+The ``TaskExecutor`` class is responsible for executing a given task
+function, and defines three methods that can be overridden by subclasses::
+
+    def init(self, task)
+    def start(self)
+    def finish(self, ret_value, exception)
+
+The ``init`` method is called before running the task function, and
+receives the task object (of class ``Task``).
+
+The ``start`` method runs the task function. The default implementation is
+to run the task function in the current thread context.
+
+The ``finish`` method should be called when the task function finishes,
+either with the ``ret_value`` populated with the result of the execution,
+or with an exception object in the case that execution raised an exception.
+
+To leverage the asynchronous nature of a non-blocking function, the
+developer should implement a custom executor by creating a subclass of the
+``TaskExecutor`` class, and provide an instance of the custom executor
+class as the ``executor`` parameter of ``TaskManager.run``.
+
+To better understand the expressive power of executors, we write a full
+example of using a custom executor to execute the asynchronous
+``MgrModule.send_command`` function:
+
+.. code-block:: python
+
+    import json
+    from mgr_module import CommandResult
+    from .. import mgr
+    from ..tools import ApiController, RESTController, NotificationQueue, \
+                        TaskManager, TaskExecutor
+
+
+    class SendCommandExecutor(TaskExecutor):
+        def __init__(self):
+            super(SendCommandExecutor, self).__init__()
+            self.tag = None
+            self.result = None
+
+        def init(self, task):
+            super(SendCommandExecutor, self).init(task)
+
+            # we need to listen for 'command' events to know when the
+            # command finishes
+            NotificationQueue.register(self._handler, 'command')
+
+            # store the CommandResult object to retrieve the results
+            self.result = self.task.fn_args[0]
+            if len(self.task.fn_args) > 4:
+                # the user specified a tag for the command, so let's use it
+                self.tag = self.task.fn_args[4]
+            else:
+                # let's generate a unique tag for the command
+                self.tag = 'send_command_{}'.format(id(self))
+                self.task.fn_args.append(self.tag)
+
+        def _handler(self, data):
+            if data == self.tag:
+                # the command has finished; notify the task with the result
+                self.finish(self.result.wait(), None)
+                # deregister listener to avoid memory leaks
+                NotificationQueue.deregister(self._handler, 'command')
+
+
+    @ApiController('test')
+    class Test(RESTController):
+
+        def _run_task(self, osd_id):
+            task = TaskManager.run("test/task", {}, mgr.send_command,
+                                   [CommandResult(''), 'osd', osd_id,
+                                    json.dumps({'prefix': 'perf histogram dump'})],
+                                   executor=SendCommandExecutor())
+            return task.wait(1.0)
+
+        def get(self, osd_id):
+            status, value = self._run_task(osd_id)
+            return {'status': status, 'value': value}
+
+
+The above ``SendCommandExecutor`` executor class can be used for any call
+to ``MgrModule.send_command``. This means that we need just one custom
+executor class implementation for each non-blocking API that we use in our
+controllers.
+
+The default executor, used when no executor object is passed to
+``TaskManager.run``, is the ``ThreadedExecutor``. You can check its
+implementation in the ``tools.py`` file.
+
+
+How to update the execution progress of an asynchronous task?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The asynchronous tasks infrastructure provides support for updating the
+execution progress of an executing task.
+The progress can be updated from within the code the task is executing,
+which usually is the place where we have the progress information
+available.
+
+To update the progress from within the task code, the ``TaskManager`` class
+provides a method to retrieve the current task object::
+
+    TaskManager.current_task()
+
+The above method is only available when using the default executor
+``ThreadedExecutor`` for executing the task.
+The ``current_task()`` method returns the current ``Task`` object. The
+``Task`` object provides two public methods to update the execution
+progress value: the ``set_progress(percentage)`` and the
+``inc_progress(delta)`` methods.
+
+The ``set_progress`` method receives as argument an integer value
+representing the absolute percentage that we want to set for the task.
+
+The ``inc_progress`` method receives as argument an integer value
+representing the delta we want to increment to the current execution
+progress percentage.
+
+Take the following example of a controller that triggers a new task and
+updates its progress:
+
+.. code-block:: python
+
+    from __future__ import absolute_import
+    import random
+    import time
+    import cherrypy
+    from ..tools import TaskManager, ApiController, BaseController
+
+
+    @ApiController('dummy_task')
+    class DummyTask(BaseController):
+        def _dummy(self):
+            top = random.randrange(100)
+            for i in range(top):
+                TaskManager.current_task().set_progress(i*100/top)
+                # or TaskManager.current_task().inc_progress(100/top)
+                time.sleep(1)
+            return "finished"
+
+        @cherrypy.expose
+        @cherrypy.tools.json_out()
+        def default(self):
+            task = TaskManager.run("dummy/task", {}, self._dummy)
+            return task.wait(5)  # wait for five seconds
+
+
+How to deal with asynchronous tasks in the front-end?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+All executing and most recently finished asynchronous tasks are displayed
+in "Background-Tasks", and, once finished, in "Recent-Notifications" in the
+menu bar.
+For each task, an operation name for the three states (running, success and
+failure), a function that tells who is involved, and error descriptions, if
+any, have to be provided. This can be achieved by appending to
+``TaskManagerMessageService.messages``. This has to be done to achieve
+consistency among all tasks and states.
+
+Operation Object
+  Ensures consistency among all tasks. It consists of three verbs, one for
+  each different state, e.g.
+  ``{running: 'Creating', failure: 'create', success: 'Created'}``.
+
+#. Put running operations in the present participle, e.g. ``'Updating'``.
+#. Failed messages always start with ``'Failed to '`` and should be
+   continued with the operation in the present tense, e.g. ``'update'``.
+#. Put successful operations in the past tense, e.g. ``'Updated'``.
+
+Involves Function
+  Ensures consistency among all messages of a task; it describes who or
+  what is involved in the operation. It is a function that takes the
+  metadata of the task and returns a string, e.g.
+  ``"RBD 'somePool/someImage'"``.
+
+Both combined create the following messages:
+
+* Failure => ``"Failed to create RBD 'somePool/someImage'"``
+* Running => ``"Creating RBD 'somePool/someImage'"``
+* Success => ``"Created RBD 'somePool/someImage'"``
+
+For automatic task handling use ``TaskWrapperService.wrapTaskAroundCall``.
+
+If for some reason ``wrapTaskAroundCall`` is not working for you, you have
+to subscribe to your asynchronous task manually through
+``TaskManagerService.subscribe``, and provide it with a callback to notify
+the user in case of success. A notification can be triggered with
+``NotificationService.notifyTask``. It will use
+``TaskManagerMessageService.messages`` to display a message based on the
+state of a task.
+
+Notifications of API errors are handled by ``ApiInterceptorService``.
+
+Usage example:
+
+.. code-block:: javascript
+
+    export class TaskManagerMessageService {
+      // ...
+      messages = {
+        // Messages for task 'rbd/create'
+        'rbd/create': new TaskManagerMessage(
+          // Message prefixes
+          ['create', 'Creating', 'Created'],
+          // Message suffix
+          (metadata) => `RBD '${metadata.pool_name}/${metadata.image_name}'`,
+          (metadata) => ({
+            // Error code and description
+            '17': `Name is already used by RBD '${metadata.pool_name}/${
+                   metadata.image_name}'.`
+          })
+        ),
+        // ...
+      };
+      // ...
+    }
+
+    export class RBDFormComponent {
+      // ...
+      createAction() {
+        const request = this.createRequest();
+        // Subscribes to 'call' with submitted 'task' and handles notifications
+        return this.taskWrapper.wrapTaskAroundCall({
+          task: new FinishedTask('rbd/create', {
+            pool_name: request.pool_name,
+            image_name: request.name
+          }),
+          call: this.rbdService.create(request)
+        });
+      }
+      // ...
+    }
+
+
+REST API documentation
+~~~~~~~~~~~~~~~~~~~~~~
+
+Ceph-Dashboard provides two types of documentation for the **Ceph RESTful
+API**:
+
+* **Static documentation**: available at :ref:`mgr-ceph-api`. This comes
+  from a versioned specification located at
+  ``src/pybind/mgr/dashboard/openapi.yaml``.
+* **Interactive documentation**: available from a running Ceph-Dashboard
+  instance (top-right ``?`` icon > API Docs).
+
+If changes are made to the ``controllers/`` directory, it's very likely
+that they will result in changes to the generated OpenAPI specification.
+For that reason, a checker has been implemented to block unintended
+changes. This check is automatically triggered by the Pull Request CI
+(``make check``) and can also be invoked manually: ``tox -e openapi-check``.
+
+If that checker fails, it means that the current Pull Request is modifying
+the Ceph API, and therefore:
+
+#. The versioned OpenAPI specification should be updated explicitly:
+   ``tox -e openapi-fix``.
+#. The team @ceph/api will be requested for reviews (this is automated via
+   Github CODEOWNERS), in order to assess the impact of the changes.
+
+Additionally, Sphinx documentation can be generated from the OpenAPI
+specification with ``tox -e openapi-doc``.
+
+The Ceph RESTful OpenAPI specification is dynamically generated from the
+``Controllers`` in the ``controllers/`` directory. However, by default it
+is not very detailed, so there are two decorators that can and should be
+used to add more information:
+
+* ``@EndpointDoc()`` for documentation of endpoints. It has four optional
+  arguments (explained below): ``description``, ``group``, ``parameters``
+  and ``responses``.
+* ``@ControllerDoc()`` for documentation of the controller or group
+  associated with the endpoints. It only takes the first two arguments:
+  ``description`` and ``group``.
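+
+For instance, both decorators might be combined like this (a sketch; the
+controller, the import location of the decorators, and all argument values
+are illustrative):
+
+.. code-block:: python
+
+    from ..controllers import ApiController, ControllerDoc, EndpointDoc, \
+                              RESTController
+
+
+    @ApiController('/ping')
+    @ControllerDoc(description="Ping management API", group="Ping")
+    class Ping(RESTController):
+        @EndpointDoc(description="Returns a friendly greeting.",
+                     parameters={'name': (str, 'Name to greet')},
+                     responses={'200': {'msg': (str, 'The greeting message')}})
+        def get(self, name):
+            return {'msg': 'Hello {}!'.format(name)}
+
+The individual arguments are explained below.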
+
+``description``: A string with a short (1-2 sentences) description of the
+object.
+
+
+``group``: By default, an endpoint is grouped together with other endpoints
+within the same controller class. ``group`` is a string that can be used to
+assign an endpoint, or all endpoints in a class, to another controller or a
+conceived group name.
+
+
+``parameters``: A dict used to describe path, query or request body
+parameters. By default, all parameters for an endpoint are listed on the
+Swagger UI page, including information on whether the parameter is
+optional/required, and default values. However, there will be no
+description of the parameter, and the parameter type will only be displayed
+in some cases.
+When adding information, each parameter should be described as in the
+example below. Note that the parameter type should be expressed as a
+built-in Python type and not as a string. Allowed values are ``str``,
+``int``, ``bool`` and ``float``.
+
+.. code-block:: python
+
+    @EndpointDoc(parameters={'my_string': (str, 'Description of my_string')})
+    def method(my_string): pass
+
+For body parameters, more complex cases are possible. If the parameter is a
+dictionary, the type should be replaced with a ``dict`` containing its
+nested parameters. When describing nested parameters, the same format as
+for other parameters is used. However, all nested parameters are set as
+required by default. If a nested parameter is optional, this must be
+specified as for ``item2`` in the example below. If a nested parameter is
+set to optional, it is also possible to specify the default value (this
+will not be provided automatically for nested parameters).
+
+.. code-block:: python
+
+    @EndpointDoc(parameters={
+        'my_dictionary': ({
+            'item1': (str, 'Description of item1'),
+            'item2': (str, 'Description of item2', True),  # item2 is optional
+            'item3': (str, 'Description of item3', True, 'foo'),  # item3 is optional with 'foo' as default value
+        }, 'Description of my_dictionary')})
+    def method(my_dictionary): pass
+
+If the parameter is a ``list`` of primitive types, the type should be
+surrounded with square brackets.
+
+.. code-block:: python
+
+    @EndpointDoc(parameters={'my_list': ([int], 'Description of my_list')})
+    def method(my_list): pass
+
+If the parameter is a ``list`` with nested parameters, the nested
+parameters should be placed in a dictionary and surrounded with square
+brackets.
+
+.. code-block:: python
+
+    @EndpointDoc(parameters={
+        'my_list': ([{
+            'list_item': (str, 'Description of list_item'),
+            'list_item2': (str, 'Description of list_item2')
+        }], 'Description of my_list')})
+    def method(my_list): pass
+
+
+``responses``: A dict used for describing responses. Rules for describing
+responses are the same as for request body parameters, with one difference:
+responses also need to be assigned to the related response code, as in the
+example below:
+
+.. code-block:: python
+
+    @EndpointDoc(responses={
+        '400': {'my_response': (str, 'Description of my_response')}})
+    def method(): pass
+
+
+Error Handling in Python
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Good error handling is a key requirement in creating a good user experience
+and providing a good API.
+
+Dashboard code should not duplicate C++ code. Thus, if error handling in
+C++ is sufficient to provide good feedback, a new wrapper to catch these
+errors is not necessary. On the other hand, input validation is the best
+place to catch errors and generate the best error messages. If required,
+generate errors as soon as possible.
+
+The backend provides a few standard ways of returning errors.
+
+First, there is a generic Internal Server Error::
+
+    Status Code: 500
+    {
+        "version": <cherrypy version, e.g. 13.1.0>,
+        "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.",
+    }
+
+
+For errors generated by the backend, we provide a standard error format::
+
+    Status Code: 400
+    {
+        "detail": str(e),    # e.g. "[errno -42] <some error message>"
+        "component": "rbd",  # this can be null to represent a global error code
+        "code": "3",         # or an error name, e.g. "code": "some_error_key"
+    }
+
+
+If the API endpoint uses ``@ViewCache`` to temporarily cache results, the
+error looks like this::
+
+    Status Code 400
+    {
+        "detail": str(e),    # e.g. "[errno -42] <some error message>"
+        "component": "rbd",  # this can be null to represent a global error code
+        "code": "3",         # or an error name, e.g. "code": "some_error_key"
+        "status": 3,         # indicating the @ViewCache error status
+    }
+
+If the API endpoint uses a task, the error looks like this::
+
+    Status Code 400
+    {
+        "detail": str(e),    # e.g. "[errno -42] <some error message>"
+        "component": "rbd",  # this can be null to represent a global error code
+        "code": "3",         # or an error name, e.g. "code": "some_error_key"
+        "task": {            # information about the task itself
+            "name": "taskname",
+            "metadata": {...}
+        }
+    }
+
+
+Our WebUI should show errors generated by the API to the user, especially
+field-related errors in wizards and dialogs, or show non-intrusive
+notifications.
+
+Handling exceptions in Python should be an exception. In general, we
+should have few exception handlers in our project. By default, propagate
+errors to the API, as it will take care of all exceptions anyway. In
+general, log the exception by adding ``logger.exception()`` with a
+description to the handler.
+
+We need to distinguish user errors from internal errors and programming
+errors. Using different exception types will ease the task for the API
+layer and for the user interface:
+
+Standard Python errors, like ``SystemError``, ``ValueError`` or
+``KeyError``, will end up as internal server errors in the API.
+
+In general, do not ``return`` error responses in the REST API. They will be
+returned by the error handler. Instead, raise the appropriate exception.
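+
+For instance, raising instead of returning might look like this (a sketch;
+it assumes a ``DashboardException``-style class such as the one defined in
+the module's ``exceptions.py``, and the ``RbdFeature`` controller is
+illustrative):
+
+.. code-block:: python
+
+    from ..exceptions import DashboardException
+    from ..tools import ApiController, RESTController
+
+
+    @ApiController('rbdfeature')
+    class RbdFeature(RESTController):
+        def list(self, features=None):
+            if features is None:
+                # validate the input early and raise a specific,
+                # user-readable error instead of returning an error dict
+                raise DashboardException(msg='No features specified',
+                                         component='rbd')
+            return {'features': features}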
+
+Plug-ins
+~~~~~~~~
+
+New functionality can be provided by means of a plug-in architecture. Among
+the benefits this approach brings, loosely coupled development is one of
+the most notable. As the Ceph Dashboard grows in feature richness, its
+code-base becomes more and more complex. The hook-based nature of a plug-in
+architecture allows functionality to be extended in a controlled manner,
+and isolates the scope of the changes.
+
+Ceph Dashboard relies on `Pluggy <https://pluggy.readthedocs.io>`_ to
+provide plug-in support. On top of pluggy, an interface-based approach has
+been implemented, with some safety checks (method override and abstract
+method checks).
+
+In order to create a new plugin, the following steps are required:
+
+#. Add a new file under ``src/pybind/mgr/dashboard/plugins``.
+#. Import the ``PLUGIN_MANAGER`` instance and the ``Interfaces``.
+#. Create a class extending the desired interfaces. The plug-in library
+   will check if all the methods of the interfaces have been properly
+   overridden.
+#. Register the plugin in the ``PLUGIN_MANAGER`` instance.
+#. Import the plug-in from within the Ceph Dashboard ``module.py``
+   (currently no dynamic loading is implemented).
+
+The available Mixins (helpers) are:
+
+- ``CanMgr``: provides the plug-in with access to the ``mgr`` instance
+  under ``self.mgr``.
+ +The available Interfaces are: + +- ``Initializable``: requires overriding ``init()`` hook. This method is run at + the very beginning of the dashboard module, right after all imports have been + performed. +- ``Setupable``: requires overriding ``setup()`` hook. This method is run in the + Ceph Dashboard ``serve()`` method, right after CherryPy has been configured, + but before it is started. It's a placeholder for the plug-in initialization + logic. +- ``HasOptions``: requires overriding ``get_options()`` hook by returning a list + of ``Options()``. The options returned here are added to the + ``MODULE_OPTIONS``. +- ``HasCommands``: requires overriding ``register_commands()`` hook by defining + the commands the plug-in can handle and decorating them with ``@CLICommand``. + The commands can be optionally returned, so that they can be invoked + externally (which makes unit testing easier). +- ``HasControllers``: requires overriding ``get_controllers()`` hook by defining + and returning the controllers as usual. +- ``FilterRequest.BeforeHandler``: requires overriding + ``filter_request_before_handler()`` hook. This method receives a + ``cherrypy.request`` object for processing. A usual implementation of this + method will allow some requests to pass or will raise a ``cherrypy.HTTPError`` + based on the ``request`` metadata and other conditions. + +New interfaces and hooks should be added as soon as they are required to +implement new functionality. The above list only comprises the hooks needed for +the existing plugins. + +A sample plugin implementation would look like this: + +.. code-block:: python + + # src/pybind/mgr/dashboard/plugins/mute.py + + from . import PLUGIN_MANAGER as PM + from . import interfaces as I + + from mgr_module import CLICommand, Option + import cherrypy + + @PM.add_plugin + class Mute(I.CanMgr, I.Setupable, I.HasOptions, I.HasCommands, + I.FilterRequest.BeforeHandler, I.HasControllers): + @PM.add_hook + def get_options(self): + return [Option('mute', default=False, type='bool')] + + @PM.add_hook + def setup(self): + self.mute = self.mgr.get_module_option('mute') + + @PM.add_hook + def register_commands(self): + @CLICommand("dashboard mute") + def _(mgr): + self.mute = True + self.mgr.set_module_option('mute', True) + return 0 + + @PM.add_hook + def filter_request_before_handler(self, request): + if self.mute: + raise cherrypy.HTTPError(500, "I'm muted :-x") + + @PM.add_hook + def get_controllers(self): + from ..controllers import ApiController, RESTController + + @ApiController('/mute') + class MuteController(RESTController): + def get(_): + return self.mute + + return [MuteController] + + +Additionally, a helper for creating plugins ``SimplePlugin`` is provided. It +facilitates the basic tasks (Options, Commands, and common Mixins). The previous +plugin could be rewritten like this: + +.. code-block:: python + + from . import PLUGIN_MANAGER as PM + from . 
import interfaces as I + from .plugin import SimplePlugin as SP + + import cherrypy + + @PM.add_plugin + class Mute(SP, I.Setupable, I.FilterRequest.BeforeHandler, I.HasControllers): + OPTIONS = [ + SP.Option('mute', default=False, type='bool') + ] + + def shut_up(self): + self.set_option('mute', True) + self.mute = True + return 0 + + COMMANDS = [ + SP.Command("dashboard mute", handler=shut_up) + ] + + @PM.add_hook + def setup(self): + self.mute = self.get_option('mute') + + @PM.add_hook + def filter_request_before_handler(self, request): + if self.mute: + raise cherrypy.HTTPError(500, "I'm muted :-x") + + @PM.add_hook + def get_controllers(self): + from ..controllers import ApiController, RESTController + + @ApiController('/mute') + class MuteController(RESTController): + def get(_): + return self.mute + + return [MuteController] diff --git a/doc/dev/developer_guide/essentials.rst b/doc/dev/developer_guide/essentials.rst new file mode 100644 index 000000000..2fe7a13cd --- /dev/null +++ b/doc/dev/developer_guide/essentials.rst @@ -0,0 +1,338 @@ +Essentials (tl;dr) +================== + +This chapter presents essential information that every Ceph developer needs +to know. + +Leads +----- + +The Ceph project is led by Sage Weil. In addition, each major project +component has its own lead. The following table shows all the leads and +their nicks on `GitHub`_: + +.. _github: https://github.com/ + +========= ================ ============= +Scope Lead GitHub nick +========= ================ ============= +Ceph Sage Weil liewegas +RADOS Neha Ojha neha-ojha +RGW Yehuda Sadeh yehudasa +RGW Matt Benjamin mattbenjamin +RBD Jason Dillaman dillaman +CephFS Patrick Donnelly batrick +Dashboard Lenz Grimmer LenzGr +MON Joao Luis jecluis +Build/Ops Ken Dreyer ktdreyer +Docs Zac Dover zdover23 +========= ================ ============= + +The Ceph-specific acronyms in the table are explained in +:doc:`/architecture`. + +History +------- + +See the `History chapter of the Wikipedia article`_. + +.. _`History chapter of the Wikipedia article`: https://en.wikipedia.org/wiki/Ceph_%28software%29#History + +Licensing +--------- + +Ceph is free software. + +Unless stated otherwise, the Ceph source code is distributed under the +terms of the LGPL2.1 or LGPL3.0. For full details, see the file +`COPYING`_ in the top-level directory of the source-code tree. + +.. _`COPYING`: + https://github.com/ceph/ceph/blob/master/COPYING + +Source code repositories +------------------------ + +The source code of Ceph lives on `GitHub`_ in a number of repositories below +the `Ceph "organization"`_. + +.. _`Ceph "organization"`: https://github.com/ceph + +A working knowledge of git_ is essential to make a meaningful contribution to the project as a developer. + +.. _git: https://git-scm.com/doc + +Although the `Ceph "organization"`_ includes several software repositories, +this document covers only one: https://github.com/ceph/ceph. + +Redmine issue tracker +--------------------- + +Although `GitHub`_ is used for code, Ceph-related issues (Bugs, Features, +Backports, Documentation, etc.) are tracked at http://tracker.ceph.com, +which is powered by `Redmine`_. + +.. _Redmine: http://www.redmine.org + +The tracker has a Ceph project with a number of subprojects loosely +corresponding to the various architectural components (see +:doc:`/architecture`). + +Mere `registration`_ in the tracker automatically grants permissions +sufficient to open new issues and comment on existing ones. + +.. 
_registration: http://tracker.ceph.com/account/register

To report a bug or propose a new feature, `jump to the Ceph project`_ and
click on `New issue`_.

.. _`jump to the Ceph project`: http://tracker.ceph.com/projects/ceph
.. _`New issue`: http://tracker.ceph.com/projects/ceph/issues/new

.. _mailing-list:

Mailing lists
-------------

Ceph Development Mailing List
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The ``dev@ceph.io`` list is for discussion about the development of Ceph,
its interoperability with other technology, and the operations of the
project itself.

The email discussion list for Ceph development is open to all. Subscribe by
sending a message to ``dev-request@ceph.io`` with the following line in the
body of the message::

    subscribe ceph-devel


Ceph Client Patch Review Mailing List
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The ``ceph-devel@vger.kernel.org`` list is for discussion and patch review
for the Linux kernel Ceph client component. Note that this list used to
be an all-encompassing list for developers. When searching the archives,
remember that this list contains the generic devel-ceph archives before
mid-2018.

Subscribe to the list covering the Linux kernel Ceph client component by
sending a message to ``majordomo@vger.kernel.org`` with the following line
in the body of the message::

    subscribe ceph-devel


Other Ceph Mailing Lists
^^^^^^^^^^^^^^^^^^^^^^^^

There are also `other Ceph-related mailing lists`_.

.. _`other Ceph-related mailing lists`: https://ceph.com/irc/

.. _irc:


IRC
---

In addition to mailing lists, the Ceph community also communicates in real
time using `Internet Relay Chat`_.

.. _`Internet Relay Chat`: http://www.irchelp.org/

The Ceph community gathers in the #ceph channel of the Open and Free
Technology Community (OFTC) IRC network.

Created in 1988, Internet Relay Chat (IRC) is a relay-based, real-time chat
protocol. It is mainly designed for group (many-to-many) communication in
discussion forums called channels, but also allows one-to-one communication
via private message. On IRC you can talk to many other members using Ceph,
on topics ranging from idle chit-chat to support questions. Though a channel
might have many people in it at any one time, they might not always be at
their keyboard; so if no-one responds, just wait around and someone will
hopefully answer soon enough.

Registration
^^^^^^^^^^^^

If you intend to use the IRC service on a continued basis, you are advised
to register an account. Registering gives you a unique IRC identity and
allows you to access channels where unregistered users have been locked out
for technical reasons.

See `the official OFTC (Open and Free Technology Community) documentation's
registration instructions
<https://www.oftc.net/Services/#register-your-account>`_ to learn how to
register your IRC account.

Channels
^^^^^^^^

To connect to the OFTC IRC network, download an IRC client and configure it
to connect to ``irc.oftc.net``. Then join one or more of the channels.
Discussions inside #ceph are logged and archives are available online.

Here are the real-time discussion channels for the Ceph community:

  - #ceph
  - #ceph-devel
  - #cephfs
  - #ceph-dashboard
  - #ceph-orchestrators
  - #sepia

.. _submitting-patches:

Submitting patches
------------------

The canonical instructions for submitting patches are contained in the
file `CONTRIBUTING.rst`_ in the top-level directory of the source-code
tree.
There may be some overlap between this guide and that file. + +.. _`CONTRIBUTING.rst`: + https://github.com/ceph/ceph/blob/main/CONTRIBUTING.rst + +All newcomers are encouraged to read that file carefully. + +Building from source +-------------------- + +See instructions at :doc:`/install/build-ceph`. + +Using ccache to speed up local builds +------------------------------------- +`ccache`_ can make the process of rebuilding the ceph source tree faster. + +Before you use `ccache`_ to speed up your rebuilds of the ceph source tree, +make sure that your source tree is clean and will produce no build failures. +When you have a clean source tree, you can confidently use `ccache`_, secure in +the knowledge that you're not using a dirty tree. + +Old build artifacts can cause build failures. You might introduce these +artifacts unknowingly when switching from one branch to another. If you see +build errors when you attempt a local build, follow the procedure below to +clean your source tree. + +Cleaning the Source Tree +^^^^^^^^^^^^^^^^^^^^^^^^ + +.. prompt:: bash $ + + make clean + +.. note:: The following commands will remove everything in the source tree + that isn't tracked by git. Make sure to back up your log files + and configuration options before running these commands. + +.. prompt:: bash $ + + git clean -fdx; git submodule foreach git clean -fdx + +Building Ceph with ccache +^^^^^^^^^^^^^^^^^^^^^^^^^ +``ccache`` is available as a package in most distros. To build ceph with +ccache, run the following command. + +.. prompt:: bash $ + + cmake -DWITH_CCACHE=ON .. + +Using ccache to Speed Up Build Times +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +``ccache`` can be used for speeding up all builds of the system. For more +details, refer to the `run modes`_ section of the ccache manual. The default +settings of ``ccache`` can be displayed with the ``ccache -s`` command. + +.. note:: We recommend overriding the ``max_size``. The default is 10G. + Use a larger value, like 25G. Refer to the `configuration`_ section + of the ccache manual for more information. + +To further increase the cache hit rate and reduce compile times in a +development environment, set the version information and build timestamps to +fixed values. This makes it unnecessary to rebuild the binaries that contain +this information. + +This can be achieved by adding the following settings to the ``ccache`` +configuration file ``ccache.conf``:: + + sloppiness = time_macros + run_second_cpp = true + +Now, set the environment variable ``SOURCE_DATE_EPOCH`` to a fixed value (a +UNIX timestamp) and set ``ENABLE_GIT_VERSION`` to ``OFF`` when running +``cmake``: + +.. prompt:: bash $ + + export SOURCE_DATE_EPOCH=946684800 + cmake -DWITH_CCACHE=ON -DENABLE_GIT_VERSION=OFF .. + +.. note:: Binaries produced with these build options are not suitable for + production or debugging purposes, as they do not contain the correct build + time and git version information. + +.. _`ccache`: https://ccache.samba.org/ +.. _`run modes`: https://ccache.samba.org/manual.html#_run_modes +.. _`configuration`: https://ccache.samba.org/manual.html#_configuration + +Development-mode cluster +------------------------ + +See :doc:`/dev/quick_guide`. + +Kubernetes/Rook development cluster +----------------------------------- + +See :ref:`kubernetes-dev` + +.. _backporting: + +Backporting +----------- + +All bugfixes should be merged to the ``main`` branch before being +backported. 
To flag a bugfix for backporting, make sure it has a +`tracker issue`_ associated with it and set the ``Backport`` field to a +comma-separated list of previous releases (e.g. "hammer,jewel") that you think +need the backport. +The rest (including the actual backporting) will be taken care of by the +`Stable Releases and Backports`_ team. + +.. _`tracker issue`: http://tracker.ceph.com/ +.. _`Stable Releases and Backports`: http://tracker.ceph.com/projects/ceph-releases/wiki + +Dependabot +---------- + +Dependabot is a GitHub bot that scans the dependencies in the repositories for +security vulnerabilities (CVEs). If a fix is available for a discovered CVE, +Dependabot creates a pull request to update the dependency. + +Dependabot also indicates the compatibility score of the upgrade. This score is +based on the number of CI failures that occur in other GitHub repositories +where the fix was applied. + +With some configuration, Dependabot can perform non-security updates (for +example, it can upgrade to the latest minor version or patch version). + +Dependabot supports `several languages and package managers +<https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/about-dependabot-version-updates#supported-repositories-and-ecosystems>`_. +As of July 2022, the Ceph project receives alerts only from pip (based on the +`requirements.txt` files) and npm (`package*.json`). It is possible to extend +these alerts to git submodules, Golang, and Java. As of July 2022, there is no +support for C++ package managers such as vcpkg, conan, C++20 modules. + +Many of the dependencies discovered by Dependabot will best be updated +elsewhere than the Ceph Github repository (distribution packages, for example, +will be a better place to update some of the dependencies). Nonetheless, the +list of new and existing vulnerabilities generated by Dependabot will be +useful. + +`Here is an example of a Dependabot pull request. +<https://github.com/ceph/ceph/pull/46998>`_ + +Guidance for use of cluster log +------------------------------- + +If your patches emit messages to the Ceph cluster log, please consult +this: :doc:`/dev/logging`. diff --git a/doc/dev/developer_guide/index.rst b/doc/dev/developer_guide/index.rst new file mode 100644 index 000000000..30d5e5e22 --- /dev/null +++ b/doc/dev/developer_guide/index.rst @@ -0,0 +1,25 @@ +============================================ +Contributing to Ceph: A Guide for Developers +============================================ + +:Author: Loic Dachary +:Author: Nathan Cutler +:License: Creative Commons Attribution Share Alike 3.0 (CC-BY-SA-3.0) + +.. note:: You may also be interested in the :doc:`/dev/internals` documentation. + +.. toctree:: + :maxdepth: 1 + + Introduction <intro> + Essentials <essentials> + What is Merged and When <merging> + Issue tracker <issue-tracker> + Basic workflow <basic-workflow> + Tests: Unit Tests <tests-unit-tests> + Tests: Integration Tests <tests-integration-tests> + Running Tests Locally <running-tests-locally> + Running Integration Tests using Teuthology <running-tests-using-teuth> + Running Tests in the Cloud <running-tests-in-cloud> + Ceph Dashboard Developer Documentation (formerly HACKING.rst) <dash-devel> + cephadm Developer Documentation <../cephadm/index> diff --git a/doc/dev/developer_guide/intro.rst b/doc/dev/developer_guide/intro.rst new file mode 100644 index 000000000..67b449c55 --- /dev/null +++ b/doc/dev/developer_guide/intro.rst @@ -0,0 +1,25 @@ +Introduction +============ + +This guide has two aims. 
First, it should lower the barrier to entry for +software developers who wish to get involved in the Ceph project. Second, +it should serve as a reference for Ceph developers. + +We assume that readers are already familiar with Ceph (the distributed +object store and file system designed to provide excellent performance, +reliability and scalability). If not, please refer to the `project website`_ +and especially the `publications list`_. Another way to learn about what's +happening in Ceph is to check out our `youtube channel`_ , where we post Tech +Talks, Code walk-throughs and Ceph Developer Monthly recordings. + +.. _`project website`: https://ceph.com +.. _`publications list`: https://ceph.com/publications/ +.. _`youtube channel`: https://www.youtube.com/c/CephStorage + +Since this document is to be consumed by developers, who are assumed to +have Internet access, topics covered elsewhere, either within the Ceph +documentation or elsewhere on the web, are treated by linking. If you +notice that a link is broken or if you know of a better link, please +`report it as a bug`_. + +.. _`report it as a bug`: http://tracker.ceph.com/projects/ceph/issues/new diff --git a/doc/dev/developer_guide/issue-tracker.rst b/doc/dev/developer_guide/issue-tracker.rst new file mode 100644 index 000000000..eae68f3f0 --- /dev/null +++ b/doc/dev/developer_guide/issue-tracker.rst @@ -0,0 +1,39 @@ +.. _issue-tracker: + +Issue Tracker +============= + +See `Redmine Issue Tracker`_ for a brief introduction to the Ceph Issue +Tracker. + +Ceph developers use the issue tracker to + +1. keep track of issues - bugs, fix requests, feature requests, backport +requests, etc. + +2. communicate with other developers and keep them informed as work +on the issues progresses. + +Issue tracker conventions +------------------------- + +When you start working on an existing issue, it's nice to let the other +developers know this - to avoid duplication of labor. Typically, this is +done by changing the :code:`Assignee` field (to yourself) and changing the +:code:`Status` to *In progress*. Newcomers to the Ceph community typically do +not have sufficient privileges to update these fields, however: they can +simply update the issue with a brief note. + +.. table:: Meanings of some commonly used statuses + + ================ =========================================== + Status Meaning + ================ =========================================== + New Initial status + In Progress Somebody is working on it + Need Review Pull request is open with a fix + Pending Backport Fix has been merged, backport(s) pending + Resolved Fix and backports (if any) have been merged + ================ =========================================== + +.. _Redmine issue tracker: https://tracker.ceph.com diff --git a/doc/dev/developer_guide/merging.rst b/doc/dev/developer_guide/merging.rst new file mode 100644 index 000000000..36e10fc84 --- /dev/null +++ b/doc/dev/developer_guide/merging.rst @@ -0,0 +1,138 @@ +.. _merging: + +Commit merging: scope and cadence +================================== + +Commits are merged into branches according to criteria specific to each phase +of the Ceph release lifecycle. This chapter codifies these criteria. + +Development releases (i.e. x.0.z) +--------------------------------- + +What ? +^^^^^^ + +* Features +* Bug fixes + +Where ? +^^^^^^^ + +Features are merged to the *main* branch. Bug fixes should be merged to the +corresponding named branch (e.g. *nautilus* for 14.0.z, *pacific* for 16.0.z, +etc.). 
However, this is not mandatory - bug fixes and documentation
enhancements can be merged to the *main* branch as well, since the *main*
branch is itself occasionally merged to the named branch during the
development releases phase. In either case, if a bug fix is important it can
also be flagged for backport to one or more previous stable releases.

When ?
^^^^^^

After each stable release, candidate branches for previous releases enter
phase 2 (see below). For example: the *jewel* named branch was created when
the *infernalis* release candidates entered phase 2. From this point on,
*main* was no longer associated with *infernalis*. After the named branch of
the next stable release is created, *main* will be occasionally merged into
it.

Branch merges
^^^^^^^^^^^^^

* The latest stable release branch is merged periodically into *main*.
* The *main* branch is merged periodically into the branch of the stable
  release.
* The *main* branch is merged into the stable release branch
  immediately after each development (x.0.z) release.

Stable release candidates (i.e. x.1.z) phase 1
----------------------------------------------

What ?
^^^^^^

* Bug fixes only

Where ?
^^^^^^^

The stable release branch (e.g. *jewel* for 10.0.z, *luminous*
for 12.0.z, etc.) or *main*. Bug fixes should be merged to the named
branch corresponding to the stable release candidate (e.g. *jewel* for
10.1.z) or to *main*. During this phase, all commits to *main* will be
merged to the named branch, and vice versa. In other words, it makes
no difference whether a commit is merged to the named branch or to
*main* - it will make it into the next release candidate either way.

When ?
^^^^^^

After the first stable release candidate is published, i.e. after the
x.1.0 tag is set in the release branch.

Branch merges
^^^^^^^^^^^^^

* The stable release branch is merged periodically into *main*.
* The *main* branch is merged periodically into the stable release branch.
* The *main* branch is merged into the stable release branch
  immediately after each x.1.z release candidate.

Stable release candidates (i.e. x.1.z) phase 2
----------------------------------------------

What ?
^^^^^^

* Bug fixes only

Where ?
^^^^^^^

The stable release branch (e.g. *mimic* for 13.0.z, *octopus* for 15.0.z,
etc.). During this phase, all commits to the named branch will be merged
into *main*. Cherry-picking to the named branch during release candidate
phase 2 is performed manually since the official backporting process begins
only when the release is pronounced "stable".

When ?
^^^^^^

After Sage Weil announces that it is time for phase 2 to happen.

Branch merges
^^^^^^^^^^^^^

* The stable release branch is occasionally merged into *main*.

Stable releases (i.e. x.2.z)
----------------------------

What ?
^^^^^^

* Bug fixes
* Features are sometimes accepted
* Commits should be cherry-picked from *main* when possible (see the example
  cherry-pick sketch below)
* Commits that are not cherry-picked from *main* must pertain to a bug
  unique to the stable release
* See also the `backport HOWTO`_ document

.. _`backport HOWTO`:
   http://tracker.ceph.com/projects/ceph-releases/wiki/HOWTO#HOWTO

Where ?
^^^^^^^

The stable release branch (*hammer* for 0.94.x, *infernalis* for 9.2.x,
etc.)

When ?
^^^^^^

After the stable release is published, i.e. after the "vx.2.0" tag is set in
the release branch.
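For example, preparing such a cherry-pick might look like the following
sketch. The ``ceph`` remote name, the tracker number ``12345``, the target
release ``nautilus`` and the commit SHA are all hypothetical placeholders:

.. prompt:: bash $

   # start a backport branch from the stable release branch
   git checkout -b wip-12345-nautilus ceph/nautilus
   # -x records the SHA of the original commit in the new commit message
   git cherry-pick -x 1a2b3c4d
   git push -u origin wip-12345-nautilus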
Branch merges
^^^^^^^^^^^^^

Never

diff --git a/doc/dev/developer_guide/running-tests-in-cloud.rst b/doc/dev/developer_guide/running-tests-in-cloud.rst
new file mode 100644
index 000000000..60118aefd
--- /dev/null
+++ b/doc/dev/developer_guide/running-tests-in-cloud.rst
@@ -0,0 +1,289 @@

Running Tests in the Cloud
==========================

In this chapter, we will explain in detail how to use an OpenStack
tenant as an environment for Ceph `integration testing`_.

Assumptions and caveat
----------------------

We assume that:

1. you are the only person using the tenant
2. you have the credentials
3. the tenant supports the ``nova`` and ``cinder`` APIs

Caveat: be aware that, as of this writing (July 2016), testing in
OpenStack clouds is a new feature. Things may not work as advertised.
If you run into trouble, ask for help on `IRC`_ or the `Mailing list`_, or
open a bug report at the `ceph-workbench bug tracker`_.

.. _`ceph-workbench bug tracker`: http://ceph-workbench.dachary.org/root/ceph-workbench/issues

Prepare tenant
--------------

If you have not tried to use ``ceph-workbench`` with this tenant before,
proceed to the next step.

To start with a clean slate, log in to your tenant via the Horizon dashboard
and:

* terminate the ``teuthology`` and ``packages-repository`` instances, if any
* delete the ``teuthology`` and ``teuthology-worker`` security groups, if any
* delete the ``teuthology`` and ``teuthology-myself`` key pairs, if any

Also do the above if you ever get key-related errors ("invalid key", etc.)
when trying to schedule suites.

Getting ceph-workbench
----------------------

Since testing in the cloud is done using the ``ceph-workbench
ceph-qa-suite`` tool, you will need to install that first. It is designed
to be installed via Docker, so if you don't have Docker running on your
development machine, take care of that first; you can follow `the official
tutorial <https://docs.docker.com/engine/installation/>`_ to install it.

Once Docker is up and running, install ``ceph-workbench`` by following the
`Installation instructions in the ceph-workbench documentation
<http://ceph-workbench.readthedocs.io/en/latest/#installation>`_.

Linking ceph-workbench with your OpenStack tenant
-------------------------------------------------

Before you can trigger your first teuthology suite, you will need to link
``ceph-workbench`` with your OpenStack account.

First, download an ``openrc.sh`` file by clicking on the "Download OpenStack
RC File" button, which can be found in the "API Access" tab of the "Access
& Security" dialog of the OpenStack Horizon dashboard.

Second, create a ``~/.ceph-workbench`` directory, set its permissions to
700, and move the ``openrc.sh`` file into it. Make sure that the filename
is exactly ``~/.ceph-workbench/openrc.sh``.

Third, edit the file so it does not ask for your OpenStack password
interactively. Comment out the relevant lines and replace them with
something like this (see the sketch at the end of this section):

.. prompt:: bash $

   export OS_PASSWORD="aiVeth0aejee3eep8rogho3eep7Pha6ek"

When ``ceph-workbench ceph-qa-suite`` connects to your OpenStack tenant for
the first time, it will generate two keypairs: ``teuthology-myself`` and
``teuthology``.

.. If this is not the first time you have tried to use
.. ``ceph-workbench ceph-qa-suite`` with this tenant, make sure to delete any
.. stale keypairs with these names!
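For reference, the credentials portion of the edited
``~/.ceph-workbench/openrc.sh`` might look something like the following
sketch. The variable names are the standard OpenStack RC variables, but
every value shown is a placeholder to be replaced with your tenant's actual
details::

    # hypothetical openrc.sh excerpt - replace all values with your own
    export OS_AUTH_URL="https://auth.cloud.example.com:5000/v2.0"
    export OS_TENANT_NAME="my-teuthology-tenant"
    export OS_USERNAME="myuser"
    # password hard-coded so the file does not prompt interactively
    export OS_PASSWORD="aiVeth0aejee3eep8rogho3eep7Pha6ek"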
Run the dummy suite
-------------------

You are now ready to take your OpenStack teuthology setup for a test
drive:

.. prompt:: bash $

   ceph-workbench ceph-qa-suite --suite dummy

Be forewarned that the first run of ``ceph-workbench ceph-qa-suite`` on a
pristine tenant will take a long time to complete because it downloads a VM
image and during this time the command may not produce any output.

The images are cached in OpenStack, so they are only downloaded once.
Subsequent runs of the same command will complete faster.

Although the ``dummy`` suite does not run any tests, in all other respects
it behaves just like a teuthology suite and produces some of the same
artifacts.

The last bit of output should look something like this::

    pulpito web interface: http://149.202.168.201:8081/
    ssh access           : ssh -i /home/smithfarm/.ceph-workbench/teuthology-myself.pem ubuntu@149.202.168.201 # logs in /usr/share/nginx/html

What this means is that ``ceph-workbench ceph-qa-suite`` triggered the test
suite run. It does not mean that the suite run has completed. To monitor
progress of the run, check the Pulpito web interface URL periodically, or,
if you are impatient, ssh to the teuthology machine using the ssh command
shown and do:

.. prompt:: bash $

   tail -f /var/log/teuthology.*

The ``/usr/share/nginx/html`` directory contains the complete logs of the
test suite. If we had provided the ``--upload`` option to the
``ceph-workbench ceph-qa-suite`` command, these logs would have been
uploaded to http://teuthology-logs.public.ceph.com.

Run a standalone test
---------------------

The standalone test explained in `Reading a standalone test`_ can be run
with the following command:

.. prompt:: bash $

   ceph-workbench ceph-qa-suite --suite rados/singleton/all/admin-socket.yaml

This will run the suite shown on the current ``master`` branch of
``ceph/ceph.git``. You can specify a different branch with the ``--ceph``
option, and even a different git repo with the ``--ceph-git-url`` option.
(Run ``ceph-workbench ceph-qa-suite --help`` for an up-to-date list of
available options.)

The first run of a suite will also take a long time, because Ceph packages
have to be built first. Again, the packages so built are cached and
``ceph-workbench ceph-qa-suite`` will not build identical packages a second
time.

Interrupt a running suite
-------------------------

Teuthology suites take time to run. From time to time one may wish to
interrupt a running suite. One obvious way to do this is:

.. prompt:: bash $

   ceph-workbench ceph-qa-suite --teardown

This destroys all VMs created by ``ceph-workbench ceph-qa-suite`` and
returns the OpenStack tenant to a "clean slate".

Sometimes you may wish to interrupt the running suite, but keep the logs,
the teuthology VM, the packages-repository VM, etc. To do this, you can
``ssh`` to the teuthology VM (using the ``ssh access`` command reported
when you triggered the suite -- see `Run the dummy suite`_) and, once
there, run:

.. prompt:: bash $

   sudo /etc/init.d/teuthology restart

This will keep the teuthology machine, the logs and the packages-repository
instance but nuke everything else.
Upload logs to archive server
-----------------------------

Since the teuthology instance in OpenStack is only semi-permanent, with
limited space for storing logs, ``teuthology-openstack`` provides an
``--upload`` option which, if included in the ``ceph-workbench
ceph-qa-suite`` command, will cause logs from all failed jobs to be uploaded
to the log archive server maintained by the Ceph project. The logs will
appear at the URL::

    http://teuthology-logs.public.ceph.com/$RUN

where ``$RUN`` is the name of the run. It will be a string like this::

    ubuntu-2016-07-23_16:08:12-rados-hammer-backports---basic-openstack

Even if you do not provide the ``--upload`` option, all the logs can still
be found on the teuthology machine in the directory
``/usr/share/nginx/html``.

Provision VMs ad hoc
--------------------

From the teuthology VM, it is possible to provision machines on an "ad hoc"
basis, to use however you like. The magic incantation is:

.. prompt:: bash $

   teuthology-lock --lock-many $NUMBER_OF_MACHINES \
       --os-type $OPERATING_SYSTEM \
       --os-version $OS_VERSION \
       --machine-type openstack \
       --owner $EMAIL_ADDRESS

The command must be issued from the ``~/teuthology`` directory. The possible
values for ``OPERATING_SYSTEM`` and ``OS_VERSION`` can be found by examining
the contents of the directory ``teuthology/openstack/``. For example:

.. prompt:: bash $

   teuthology-lock --lock-many 1 --os-type ubuntu --os-version 16.04 \
       --machine-type openstack --owner foo@example.com

When you are finished with the machine, find it in the list of machines:

.. prompt:: bash $

   openstack server list

to determine the name or ID, and then terminate it with:

.. prompt:: bash $

   openstack server delete $NAME_OR_ID

Deploy a cluster for manual testing
-----------------------------------

The `teuthology framework`_ and ``ceph-workbench ceph-qa-suite`` are
versatile tools that automatically provision Ceph clusters in the cloud and
run various tests on them in an automated fashion. This enables a single
engineer, in a matter of hours, to perform thousands of tests that would
keep dozens of human testers occupied for days or weeks if conducted
manually.

However, there are times when the automated tests do not cover a particular
scenario and manual testing is desired. It turns out that it is simple to
adapt a test to stop and wait after the Ceph installation phase, and the
engineer can then ssh into the running cluster. Simply add the following
snippet in the desired place within the test YAML and schedule a run with
the test::

    tasks:
    - exec:
        client.0:
          - sleep 1000000000 # forever

(Make sure you have a ``client.0`` defined in your ``roles`` stanza or adapt
accordingly; a complete minimal sketch follows below.)

The same effect can be achieved using the ``interactive`` task::

    tasks:
    - interactive

By following the test log, you can determine when the test cluster has
entered the "sleep forever" condition. At that point, you can ssh to the
teuthology machine and from there to one of the target VMs (OpenStack) or
teuthology worker machines (Sepia) where the test cluster is running.
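Putting the pieces together, a complete minimal test for this purpose might
look like the following sketch. The ``roles`` stanza here is illustrative
only and should be adapted to the cluster layout you actually want::

    roles:
    - [mon.a, osd.0, osd.1, client.0]
    tasks:
    - install:
    - ceph:
    - exec:
        client.0:
          - sleep 1000000000 # forever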
The VMs (or "instances" in OpenStack terminology) created by
``ceph-workbench ceph-qa-suite`` are named as follows:

``teuthology`` - the teuthology machine

``packages-repository`` - VM where packages are stored

``ceph-*`` - VM where packages are built

``target*`` - machines where tests are run

The VMs named ``target*`` are used by tests. If you are monitoring the
teuthology log for a given test, the hostnames of these target machines can
be found by searching for the string ``Locked targets``::

    2016-03-20T11:39:06.166 INFO:teuthology.task.internal:Locked targets:
      target149202171058.teuthology: null
      target149202171059.teuthology: null

The IP addresses of the target machines can be found by running ``openstack
server list`` on the teuthology machine, but the target VM hostnames (e.g.
``target149202171058.teuthology``) are resolvable within the teuthology
cluster.

.. _Integration testing: ../tests-integration-tests
.. _IRC: ../essentials/#irc
.. _Mailing List: ../essentials/#mailing-list
.. _Reading A Standalone Test: ../testing-integration-tests/#reading-a-standalone-test
.. _teuthology framework: https://github.com/ceph/teuthology

diff --git a/doc/dev/developer_guide/running-tests-locally.rst b/doc/dev/developer_guide/running-tests-locally.rst
new file mode 100644
index 000000000..b786c12e8
--- /dev/null
+++ b/doc/dev/developer_guide/running-tests-locally.rst
@@ -0,0 +1,138 @@

Running Unit Tests
==================

How to run s3-tests locally
---------------------------

RGW code can be tested by building Ceph locally from source, starting a
vstart cluster, and running the "s3-tests" suite against it.

The following instructions should work on jewel and above.

Step 1 - build Ceph
^^^^^^^^^^^^^^^^^^^

Refer to :doc:`/install/build-ceph`.

You can do step 2 separately while it is building.

Step 2 - vstart
^^^^^^^^^^^^^^^

When the build completes, and still in the top-level directory of the git
clone where you built Ceph, do the following, for cmake builds::

    cd build/
    RGW=1 ../src/vstart.sh -n

This will produce a lot of output as the vstart cluster is started up. At
the end you should see a message like::

    started. stop.sh to stop. see out/* (e.g. 'tail -f out/????') for debug output.

This means the cluster is running.


Step 3 - run s3-tests
^^^^^^^^^^^^^^^^^^^^^

.. highlight:: console

To run the s3-tests suite, do the following::

    $ ../qa/workunits/rgw/run-s3tests.sh


Running tests using vstart_runner.py
------------------------------------
CephFS and Ceph Manager code is tested using `vstart_runner.py`_.

Running your first test
^^^^^^^^^^^^^^^^^^^^^^^
The Python tests in the Ceph repository can be executed on your local
machine using `vstart_runner.py`_. To do that, you'd need `teuthology`_
installed::

    $ virtualenv --python=python3 venv
    $ source venv/bin/activate
    $ pip install 'setuptools >= 12'
    $ pip install git+https://github.com/ceph/teuthology#egg=teuthology[test]
    $ deactivate

The above steps install teuthology in a virtual environment.
Before running a
test locally, build Ceph successfully from the source (refer to
:doc:`/install/build-ceph`) and do::

    $ cd build
    $ ../src/vstart.sh -n -d -l
    $ source ~/path/to/teuthology/venv/bin/activate

To run a specific test, say `test_reconnect_timeout`_ from
`TestClientRecovery`_ in ``qa/tasks/cephfs/test_client_recovery``, you can
do::

    $ python ../qa/tasks/vstart_runner.py tasks.cephfs.test_client_recovery.TestClientRecovery.test_reconnect_timeout

The above command runs vstart_runner.py, passing the test to be executed as
an argument. In a similar way, you can also run a group of tests::

    $ # run all tests in class TestClientRecovery
    $ python ../qa/tasks/vstart_runner.py tasks.cephfs.test_client_recovery.TestClientRecovery
    $ # run all tests in test_client_recovery.py
    $ python ../qa/tasks/vstart_runner.py tasks.cephfs.test_client_recovery

Based on the argument passed, vstart_runner.py collects the tests and
executes them as it would execute a single test.

vstart_runner.py can take the following options:

--clear-old-log        deletes the old log file before running the test
--create               creates the Ceph cluster before running a test
--create-cluster-only  creates the cluster and quits; tests can be issued
                       later
--interactive          drops a Python shell when a test fails
--log-ps-output        logs ps output; might be useful while debugging
--teardown             tears the Ceph cluster down after the test(s) have
                       finished running
--kclient              uses the kernel cephfs client instead of FUSE
--brxnet=<net/mask>    specifies a new net/mask for the mount clients'
                       network namespace container (Default: 192.168.0.0/16)

.. note:: If using the FUSE client, ensure that the fuse package is installed
          and enabled on the system and that ``user_allow_other`` is added
          to ``/etc/fuse.conf``.

.. note:: If using the kernel client, the user must have the ability to run
          commands with passwordless sudo access. A failure on the kernel
          client may crash the host, so it's recommended to use this
          functionality within a virtual machine.

Internal working of vstart_runner.py
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
vstart_runner.py primarily does three things:

* collects and runs the tests
  vstart_runner.py sets up and tears down the cluster, and collects and
  runs the tests. This is implemented using the methods ``scan_tests()``,
  ``load_tests()`` and ``exec_test()``. This is where all the options that
  vstart_runner.py takes are implemented, along with other features like
  logging and copying the traceback to the bottom of the log.

* provides an interface for issuing and testing shell commands
  The tests are written assuming that the cluster exists on remote machines.
  vstart_runner.py provides an interface to run the same tests with the
  cluster that exists within the local machine. This is done using the class
  ``LocalRemote``. Class ``LocalRemoteProcess`` can manage the process that
  executes the commands from ``LocalRemote``, class ``LocalDaemon`` provides
  an interface to handle Ceph daemons and class ``LocalFuseMount`` can
  create and handle FUSE mounts.

* provides an interface to operate the Ceph cluster
  ``LocalCephManager`` provides methods to run Ceph cluster commands with
  and without the admin socket and ``LocalCephCluster`` provides methods to
  set or clear ``ceph.conf``.

.. _test_reconnect_timeout: https://github.com/ceph/ceph/blob/master/qa/tasks/cephfs/test_client_recovery.py#L133
.. _TestClientRecovery: https://github.com/ceph/ceph/blob/master/qa/tasks/cephfs/test_client_recovery.py#L86
.. _teuthology: https://github.com/ceph/teuthology
.. _vstart_runner.py: https://github.com/ceph/ceph/blob/master/qa/tasks/vstart_runner.py

diff --git a/doc/dev/developer_guide/running-tests-using-teuth.rst b/doc/dev/developer_guide/running-tests-using-teuth.rst
new file mode 100644
index 000000000..492b7790e
--- /dev/null
+++ b/doc/dev/developer_guide/running-tests-using-teuth.rst
@@ -0,0 +1,183 @@

Running Integration Tests using Teuthology
==========================================

Getting binaries
----------------
To run integration tests using teuthology, you need to have Ceph binaries
built for your branch. Follow these steps to initiate the build process:

#. Push the branch to the `ceph-ci`_ repository. This triggers the process
   of building the binaries.

#. To confirm that the build process has been initiated, look for the branch
   name on `Shaman`_. Shortly after the build process has been initiated,
   the single entry with your branch name will multiply, with a new entry
   for each combination of distro and flavour.

#. Wait until the packages are built and uploaded, and the repositories
   offering them are created. This is indicated by the entries for the
   branch name turning green. Preferably, wait until every entry is green.
   Usually, it takes around 2-3 hours depending on the availability of the
   machines.

.. note:: The branch pushed to ceph-ci can be any branch; it does not
   necessarily have to be a PR branch.

.. note:: If you are pushing master or any other standard branch, check
   `Shaman`_ beforehand, since it might already have builds ready.

Triggering Tests
----------------
After building is complete, proceed to trigger tests:

#. Log in to the teuthology machine::

    ssh <username>@teuthology.front.sepia.ceph.com

   This requires Sepia lab access. To learn how to request it, see:
   https://ceph.github.io/sepia/adding_users/

#. Next, get teuthology installed. Run the first set of commands in
   `Running Your First Test`_ for that. After that, activate the virtual
   environment in which teuthology is installed.

#. Run the ``teuthology-suite`` command::

    teuthology-suite -v -m smithi -c wip-devname-feature-x -s fs -p 110 --filter "cephfs-shell"

   The options used in the above command have the following meanings:

   -v          verbose
   -m          machine name
   -c          branch name, the branch that was pushed to ceph-ci
   -s          test-suite name
   -p          job priority; the higher the number, the lower the priority
   --filter    filters the tests in the given suite that need to run; the
               argument to filter should be the test you want to run

.. note:: The priority number present in the command above is just a
   placeholder. It might be highly inappropriate for the jobs you may want
   to trigger. See the `Testing Priority`_ section to pick a priority
   number.

.. note:: Don't skip passing a priority number; the default value is 1000,
   which is way too high, and the job might never run.

#. Wait for the tests to run. ``teuthology-suite`` prints a link to the
   `Pulpito`_ page created for the tests triggered.

Other frequently used/useful options are ``-d`` (or ``--distro``),
``--distroversion``, ``--filter-out``, ``--timeout``, ``--flavor``,
``--rerun``, ``-l`` (for limiting the number of jobs), ``-n`` (for how many
times the job would run) and ``-e`` (for email notifications).
Run ``teuthology-suite --help``
to read the descriptions of these and all other available options.

Testing QA changes (without re-building binaries)
-------------------------------------------------
While writing a PR you might need to test your PR repeatedly using
teuthology. If you are making non-QA changes, you need to follow the
standard process of triggering builds, waiting for them to finish, then
triggering tests and waiting for the results. But if the changes you made
are purely in ``qa/``, you don't need to rebuild the binaries. Instead, you
can test the binaries built for the ceph-ci branch and instruct the
``teuthology-suite`` command to use a separate branch for running tests.
The separate branch can be passed to the command by using ``--suite-repo``
and ``--suite-branch``. Pass the link to the GitHub fork where your PR
branch exists to the first option and pass the PR branch name to the second
option.

For example, suppose you have tested ``branch-x`` (whose ceph-ci branch is
``wip-username-branch-x``) by running the following command::

    teuthology-suite -v -m smithi -c wip-username-branch-x -s fs -p 50 --filter cephfs-shell

If you then want to make changes in ``qa/``, you can make the modifications
locally, update the PR branch, and trigger tests from your PR branch as
follows::

    teuthology-suite -v -m smithi -c wip-username-branch-x -s fs -p 50 --filter cephfs-shell --suite-repo https://github.com/username/ceph --suite-branch branch-x

You can verify that the tests were run using this branch by looking at the
values of the keys ``suite_branch``, ``suite_repo`` and ``suite_sha1`` in
the job config printed at the very beginning of the teuthology job.

About Suites and Filters
------------------------
See `Suites Inventory`_ for a list of the suites of integration tests
present right now. Alternatively, each directory under ``qa/suites`` in the
Ceph repository is an integration test suite, so looking within that
directory to decide an appropriate argument for ``-s`` also works.

For picking an argument for ``--filter``, look within
``qa/suites/<suite-name>/<subsuite-name>/tasks`` to get keywords for
filtering tests. Each YAML file there can trigger a bunch of tests; using
the name of the file, without its extension, as an argument to ``--filter``
will trigger those tests. For example, the sample command above uses
``cephfs-shell`` since there's a file named ``cephfs-shell.yaml`` in
``qa/suites/fs/basic_functional/tasks/``. If the file name doesn't hint at
which tests it would trigger, look at the contents of the file for the
``modules`` attribute. For ``cephfs-shell.yaml`` the ``modules`` attribute
is ``tasks.cephfs.test_cephfs_shell``, which means it'll trigger all tests
in ``qa/tasks/cephfs/test_cephfs_shell.py``.

Killing Tests
-------------
Sometimes a teuthology job might keep running for several minutes or even
hours after the tests it triggered have finished, and at other times the
wrong set of tests is triggered because the filter wasn't chosen carefully.
To save resources, it is better to terminate such a job. The following
command terminates a job::

    teuthology-kill -r teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi

Let's call the argument passed to ``-r`` the test ID. It can be found
easily in the link to the Pulpito page for the tests you triggered.
For +example, for the above test ID, the link is - http://pulpito.front.sepia.ceph.com/teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi/ + +Re-running Tests +---------------- +Pass ``--rerun`` option, with test ID as an argument to it, to +``teuthology-suite`` command:: + + teuthology-suite -v -m smithi -c wip-rishabh-fs-test_cephfs_shell-fix -p 50 --rerun teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi + +The meaning of rest of the options is already covered in `Triggering Tests` +section. + +Teuthology Archives +------------------- +Once the tests have finished running, the log for the job can be obtained by +clicking on job ID at the Pulpito page for your tests. It's more convenient to +download the log and then view it rather than viewing it in an internet +browser since these logs can easily be upto size of 1 GB. What's much more +easier is to log in to the teuthology machine again +(``teuthology.front.sepia.ceph.com``), and access the following path:: + + /ceph/teuthology-archive/<test-id>/<job-id>/teuthology.log + +For example, for above test ID path is:: + + /ceph/teuthology-archive/teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi/4588482/teuthology.log + +This way the log remotely can be viewed remotely without having to wait too +much. + +Naming the ceph-ci branch +------------------------- +There are no hard conventions (except for the case of stable branch; see +next paragraph) for how the branch pushed on ceph-ci is named. But, to make +builds and tests easily identitifiable on Shaman and Pulpito respectively, +prepend it with your name. For example branch ``feature-x`` can be named +``wip-yourname-feature-x`` while pushing on ceph-ci. + +In case you are using one of the stable branches (e.g. nautilis, mimic, +etc.), include the name of that stable branch in your ceph-ci branch name. +For example, ``feature-x`` PR branch should be named as +``wip-feature-x-nautilus``. *This is not just a matter of convention but this, +more essentially, builds your branch in the correct environment.* + +Delete the branch from ceph-ci, once it's not required anymore. If you are +logged in at GitHub, all your branches on ceph-ci can be easily found here - +https://github.com/ceph/ceph-ci/branches. + +.. _ceph-ci: https://github.com/ceph/ceph-ci +.. _Pulpito: http://pulpito.front.sepia.ceph.com/ +.. _Running Your First Test: ../running-tests-locally/#running-your-first-test +.. _Shaman: https://shaman.ceph.com/builds/ceph/ +.. _Suites Inventory: ../tests-integration-tests/#suites-inventory +.. _Testing Priority: ../tests-integration-tests/#testing-priority diff --git a/doc/dev/developer_guide/tests-integration-tests.rst b/doc/dev/developer_guide/tests-integration-tests.rst new file mode 100644 index 000000000..c8e6dbcd4 --- /dev/null +++ b/doc/dev/developer_guide/tests-integration-tests.rst @@ -0,0 +1,522 @@ +.. _testing-integration-tests: + +Testing - Integration Tests +=========================== + +Ceph has two types of tests: :ref:`make check <make-check>` tests and integration tests. +When a test requires multiple machines, root access or lasts for a +longer time (for example, to simulate a realistic Ceph deployment), it +is deemed to be an integration test. Integration tests are organized into +"suites", which are defined in the `ceph/qa sub-directory`_ and run with +the ``teuthology-suite`` command. + +The ``teuthology-suite`` command is part of the `teuthology framework`_. 
+In the sections that follow we attempt to provide a detailed introduction +to that framework from the perspective of a beginning Ceph developer. + +Teuthology consumes packages +---------------------------- + +It may take some time to understand the significance of this fact, but it +is `very` significant. It means that automated tests can be conducted on +multiple platforms using the same packages (RPM, DEB) that can be +installed on any machine running those platforms. + +Teuthology has a `list of platforms that it supports +<https://github.com/ceph/ceph/tree/master/qa/distros/supported>`_ (as +of September 2020 the list consisted of "RHEL/CentOS 8" and "Ubuntu 18.04"). It +expects to be provided pre-built Ceph packages for these platforms. +Teuthology deploys these platforms on machines (bare-metal or +cloud-provisioned), installs the packages on them, and deploys Ceph +clusters on them - all as called for by the test. + +The Nightlies +------------- + +A number of integration tests are run on a regular basis in the `Sepia +lab`_ against the official Ceph repositories (on the ``master`` development +branch and the stable branches). Traditionally, these tests are called "the +nightlies" because the Ceph core developers used to live and work in +the same time zone and from their perspective the tests were run overnight. + +The results of the nightlies are published at http://pulpito.ceph.com/. The +developer nick shows in the +test results URL and in the first column of the Pulpito dashboard. The +results are also reported on the `ceph-qa mailing list +<https://ceph.com/irc/>`_ for analysis. + +Testing Priority +---------------- + +The ``teuthology-suite`` command includes an almost mandatory option ``-p <N>`` +which specifies the priority of the jobs submitted to the queue. The lower +the value of ``N``, the higher the priority. The option is almost mandatory +because the default is ``1000`` which matches the priority of the nightlies. +Nightlies are often half-finished and cancelled due to the volume of testing +done so your jobs may never finish. Therefore, it is common to select a +priority less than 1000. + +Job priority should be selected based on the following recommendations: + +* **Priority < 10:** Use this if the sky is falling and some group of tests + must be run ASAP. + +* **10 <= Priority < 50:** Use this if your tests are urgent and blocking + other important development. + +* **50 <= Priority < 75:** Use this if you are testing a particular + feature/fix and running fewer than about 25 jobs. This range can also be + used for urgent release testing. + +* **75 <= Priority < 100:** Tech Leads will regularly schedule integration + tests with this priority to verify pull requests against master. + +* **100 <= Priority < 150:** This priority is to be used for QE validation of + point releases. + +* **150 <= Priority < 200:** Use this priority for 100 jobs or fewer of a + particular feature/fix that you'd like results on in a day or so. + +* **200 <= Priority < 1000:** Use this priority for large test runs that can + be done over the course of a week. + +In case you don't know how many jobs would be triggered by +``teuthology-suite`` command, use ``--dry-run`` to get a count first and then +issue ``teuthology-suite`` command again, this time without ``--dry-run`` and +with ``-p`` and an appropriate number as an argument to it. + +To skip the priority check, use ``--force-priority``. 
In order to be sensitive +to the runs of other developers who also need to do testing, please use it in +emergency only. + +Suites Inventory +---------------- + +The ``suites`` directory of the `ceph/qa sub-directory`_ contains +all the integration tests, for all the Ceph components. + +`ceph-deploy <https://github.com/ceph/ceph/tree/master/qa/suites/ceph-deploy>`_ + install a Ceph cluster with ``ceph-deploy`` (:ref:`ceph-deploy man page <ceph-deploy>`) + +`dummy <https://github.com/ceph/ceph/tree/master/qa/suites/dummy>`_ + get a machine, do nothing and return success (commonly used to + verify the :ref:`testing-integration-tests` infrastructure works as expected) + +`fs <https://github.com/ceph/ceph/tree/master/qa/suites/fs>`_ + test CephFS mounted using kernel and FUSE clients, also with multiple MDSs. + +`krbd <https://github.com/ceph/ceph/tree/master/qa/suites/krbd>`_ + test the RBD kernel module + +`powercycle <https://github.com/ceph/ceph/tree/master/qa/suites/powercycle>`_ + verify the Ceph cluster behaves when machines are powered off + and on again + +`rados <https://github.com/ceph/ceph/tree/master/qa/suites/rados>`_ + run Ceph clusters including OSDs and MONs, under various conditions of + stress + +`rbd <https://github.com/ceph/ceph/tree/master/qa/suites/rbd>`_ + run RBD tests using actual Ceph clusters, with and without qemu + +`rgw <https://github.com/ceph/ceph/tree/master/qa/suites/rgw>`_ + run RGW tests using actual Ceph clusters + +`smoke <https://github.com/ceph/ceph/tree/master/qa/suites/smoke>`_ + run tests that exercise the Ceph API with an actual Ceph cluster + +`teuthology <https://github.com/ceph/ceph/tree/master/qa/suites/teuthology>`_ + verify that teuthology can run integration tests, with and without OpenStack + +`upgrade <https://github.com/ceph/ceph/tree/master/qa/suites/upgrade>`_ + for various versions of Ceph, verify that upgrades can happen + without disrupting an ongoing workload + +.. _`ceph-deploy man page`: ../../man/8/ceph-deploy + +teuthology-describe-tests +------------------------- + +In February 2016, a new feature called ``teuthology-describe-tests`` was +added to the `teuthology framework`_ to facilitate documentation and better +understanding of integration tests (`feature announcement +<http://article.gmane.org/gmane.comp.file-systems.ceph.devel/29287>`_). + +The upshot is that tests can be documented by embedding ``meta:`` +annotations in the yaml files used to define the tests. The results can be +seen in the `ceph-qa-suite wiki +<http://tracker.ceph.com/projects/ceph-qa-suite/wiki/>`_. + +Since this is a new feature, many yaml files have yet to be annotated. +Developers are encouraged to improve the documentation, in terms of both +coverage and quality. + +How integration tests are run +----------------------------- + +Given that - as a new Ceph developer - you will typically not have access +to the `Sepia lab`_, you may rightly ask how you can run the integration +tests in your own environment. + +One option is to set up a teuthology cluster on bare metal. Though this is +a non-trivial task, it `is` possible. Here are `some notes +<http://docs.ceph.com/teuthology/docs/LAB_SETUP.html>`_ to get you started +if you decide to go this route. + +If you have access to an OpenStack tenant, you have another option: the +`teuthology framework`_ has an OpenStack backend, which is documented `here +<https://github.com/dachary/teuthology/tree/openstack#openstack-backend>`__. 
+This OpenStack backend can build packages from a given git commit or +branch, provision VMs, install the packages and run integration tests +on those VMs. This process is controlled using a tool called +``ceph-workbench ceph-qa-suite``. This tool also automates publishing of +test results at http://teuthology-logs.public.ceph.com. + +Running integration tests on your code contributions and publishing the +results allows reviewers to verify that changes to the code base do not +cause regressions, or to analyze test failures when they do occur. + +Every teuthology cluster, whether bare-metal or cloud-provisioned, has a +so-called "teuthology machine" from which tests suites are triggered using the +``teuthology-suite`` command. + +A detailed and up-to-date description of each `teuthology-suite`_ option is +available by running the following command on the teuthology machine + +.. prompt:: bash $ + + teuthology-suite --help + +.. _teuthology-suite: http://docs.ceph.com/teuthology/docs/teuthology.suite.html + +How integration tests are defined +--------------------------------- + +Integration tests are defined by yaml files found in the ``suites`` +subdirectory of the `ceph/qa sub-directory`_ and implemented by python +code found in the ``tasks`` subdirectory. Some tests ("standalone tests") +are defined in a single yaml file, while other tests are defined by a +directory tree containing yaml files that are combined, at runtime, into a +larger yaml file. + +Reading a standalone test +------------------------- + +Let us first examine a standalone test, or "singleton". + +Here is a commented example using the integration test +`rados/singleton/all/admin-socket.yaml +<https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/admin-socket.yaml>`_ + +.. code-block:: yaml + + roles: + - - mon.a + - osd.0 + - osd.1 + tasks: + - install: + - ceph: + - admin_socket: + osd.0: + version: + git_version: + help: + config show: + config set filestore_dump_file /tmp/foo: + perf dump: + perf schema: + +The ``roles`` array determines the composition of the cluster (how +many MONs, OSDs, etc.) on which this test is designed to run, as well +as how these roles will be distributed over the machines in the +testing cluster. In this case, there is only one element in the +top-level array: therefore, only one machine is allocated to the +test. The nested array declares that this machine shall run a MON with +id ``a`` (that is the ``mon.a`` in the list of roles) and two OSDs +(``osd.0`` and ``osd.1``). + +The body of the test is in the ``tasks`` array: each element is +evaluated in order, causing the corresponding python file found in the +``tasks`` subdirectory of the `teuthology repository`_ or +`ceph/qa sub-directory`_ to be run. "Running" in this case means calling +the ``task()`` function defined in that file. + +In this case, the `install +<https://github.com/ceph/teuthology/blob/master/teuthology/task/install/__init__.py>`_ +task comes first. It installs the Ceph packages on each machine (as +defined by the ``roles`` array). A full description of the ``install`` +task is `found in the python file +<https://github.com/ceph/teuthology/blob/master/teuthology/task/install/__init__.py>`_ +(search for "def task"). + +The ``ceph`` task, which is documented `here +<https://github.com/ceph/ceph/blob/master/qa/tasks/ceph.py>`__ (again, +search for "def task"), starts OSDs and MONs (and possibly MDSs as well) +as required by the ``roles`` array. 
+
+The body of the test is in the ``tasks`` array: each element is
+evaluated in order, causing the corresponding python file found in the
+``tasks`` subdirectory of the `teuthology repository`_ or
+`ceph/qa sub-directory`_ to be run. "Running" in this case means calling
+the ``task()`` function defined in that file.
+
+In this case, the `install
+<https://github.com/ceph/teuthology/blob/master/teuthology/task/install/__init__.py>`_
+task comes first. It installs the Ceph packages on each machine (as
+defined by the ``roles`` array). A full description of the ``install``
+task is `found in the python file
+<https://github.com/ceph/teuthology/blob/master/teuthology/task/install/__init__.py>`_
+(search for "def task").
+
+The ``ceph`` task, which is documented `here
+<https://github.com/ceph/ceph/blob/master/qa/tasks/ceph.py>`__ (again,
+search for "def task"), starts OSDs and MONs (and possibly MDSs as well)
+as required by the ``roles`` array. In this example, it will start one MON
+(``mon.a``) and two OSDs (``osd.0`` and ``osd.1``), all on the same
+machine. Control moves to the next task when the Ceph cluster reaches
+``HEALTH_OK`` state.
+
+The next task is ``admin_socket`` (`source code
+<https://github.com/ceph/ceph/blob/master/qa/tasks/admin_socket.py>`_).
+The parameter of the ``admin_socket`` task (and of any other task) is a
+structure which is interpreted as documented in the task. In this example
+the parameter is a set of commands to be sent to the admin socket of
+``osd.0``. The task verifies that each of them succeeds (i.e. returns
+exit code zero).
+
+This test can be run with:
+
+.. prompt:: bash $
+
+   teuthology-suite --machine-type smithi --suite rados/singleton/all/admin-socket.yaml fs/ext4.yaml
+
+Test descriptions
+-----------------
+
+Each test has a "test description", which is similar to a directory path,
+but not the same. In the case of a standalone test, like the one in
+`Reading a standalone test`_, the test description is identical to the
+relative path (starting from the ``suites/`` directory of the
+`ceph/qa sub-directory`_) of the yaml file defining the test.
+
+Much more commonly, tests are defined not by a single yaml file, but by a
+`directory tree of yaml files`. At runtime, the tree is walked and all yaml
+files (facets) are combined into larger yaml "programs" that define the
+tests. A full listing of the yaml defining the test is included at the
+beginning of every test log.
+
+In these cases, the description of each test consists of the
+subdirectory under `suites/
+<https://github.com/ceph/ceph/tree/master/qa/suites>`_ containing the
+yaml facets, followed by an expression in curly braces (``{}``) consisting
+of a list of yaml facets in order of concatenation. For instance the
+test description::
+
+  ceph-deploy/basic/{distros/centos_7.0.yaml tasks/ceph-deploy.yaml}
+
+signifies the concatenation of two files:
+
+* ceph-deploy/basic/distros/centos_7.0.yaml
+* ceph-deploy/basic/tasks/ceph-deploy.yaml
+
+How tests are built from directories
+------------------------------------
+
+As noted in the previous section, most tests are not defined in a single
+yaml file, but rather as a `combination` of files collected from a
+directory tree within the ``suites/`` subdirectory of the
+`ceph/qa sub-directory`_.
+
+The set of all tests defined by a given subdirectory of ``suites/`` is
+called an "integration test suite", or a "teuthology suite".
+
+Combination of yaml facets is controlled by special files (``%`` and
+``+``) that are placed within the directory tree and can be thought of as
+operators. The ``%`` file is the "convolution" operator and ``+``
+signifies concatenation.
+
+Convolution operator
+^^^^^^^^^^^^^^^^^^^^
+
+The convolution operator, implemented as an empty file called ``%``, tells
+teuthology to construct a test matrix from yaml facets found in
+subdirectories below the directory containing the operator.
+
+For example, the `ceph-deploy suite
+<https://github.com/ceph/ceph/tree/master/qa/suites/ceph-deploy/>`_ is
+defined by the ``suites/ceph-deploy/basic`` tree, which consists of the
+files and subdirectories in the following structure:
+
+.. code-block:: none
+
+   qa/suites/ceph-deploy/basic
+   ├── %
+   ├── distros
+   │   ├── centos_7.0.yaml
+   │   └── ubuntu_16.04.yaml
+   └── tasks
+       └── ceph-deploy.yaml
+
+This is interpreted as a 2x1 matrix consisting of two tests:
+
+1. ceph-deploy/basic/{distros/centos_7.0.yaml tasks/ceph-deploy.yaml}
+2. ceph-deploy/basic/{distros/ubuntu_16.04.yaml tasks/ceph-deploy.yaml}
+
+i.e. the concatenation of centos_7.0.yaml and ceph-deploy.yaml and
+the concatenation of ubuntu_16.04.yaml and ceph-deploy.yaml, respectively.
+In human terms, this means that the task found in ``ceph-deploy.yaml`` is
+intended to run on both CentOS 7.0 and Ubuntu 16.04.
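+
+Distro facets are typically tiny. As a sketch (assumed contents, following
+the usual pattern of the distro facets under ``qa/``), ``centos_7.0.yaml``
+contains little more than:
+
+.. code-block:: yaml
+
+   os_type: centos
+   os_version: "7.0"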
+
+Without the ``%`` file, the ``ceph-deploy`` tree would be interpreted as
+three standalone tests:
+
+* ceph-deploy/basic/distros/centos_7.0.yaml
+* ceph-deploy/basic/distros/ubuntu_16.04.yaml
+* ceph-deploy/basic/tasks/ceph-deploy.yaml
+
+(which would of course be wrong in this case).
+
+Referring to the `ceph/qa sub-directory`_, you will notice that the
+``centos_7.0.yaml`` and ``ubuntu_16.04.yaml`` files in the
+``suites/ceph-deploy/basic/distros/`` directory are implemented as symlinks.
+By using symlinks instead of copying, a single file can appear in multiple
+suites. This eases the maintenance of the test framework as a whole.
+
+All the tests generated from the ``suites/ceph-deploy/`` directory tree
+(also known as the "ceph-deploy suite") can be run with:
+
+.. prompt:: bash $
+
+   teuthology-suite --machine-type smithi --suite ceph-deploy
+
+An individual test from the `ceph-deploy suite`_ can be run by adding the
+``--filter`` option:
+
+.. prompt:: bash $
+
+   teuthology-suite \
+       --machine-type smithi \
+       --suite ceph-deploy/basic \
+       --filter 'ceph-deploy/basic/{distros/ubuntu_16.04.yaml tasks/ceph-deploy.yaml}'
+
+.. note:: To run a standalone test like the one in `Reading a standalone
+   test`_, ``--suite`` alone is sufficient. If you want to run a single
+   test from a suite that is defined as a directory tree, ``--suite`` must
+   be combined with ``--filter``. This is because the ``--suite`` option
+   understands POSIX relative paths only.
+
+Concatenation operator
+^^^^^^^^^^^^^^^^^^^^^^
+
+For even greater flexibility in sharing yaml files between suites, the
+special file plus (``+``) can be used to concatenate files within a
+directory. For instance, consider this subset of the `suites/rbd/thrash
+<https://github.com/ceph/ceph/tree/master/qa/suites/rbd/thrash>`_
+tree:
+
+.. code-block:: none
+
+   qa/suites/rbd/thrash
+   ├── %
+   ├── clusters
+   │   ├── +
+   │   ├── fixed-2.yaml
+   │   └── openstack.yaml
+   └── workloads
+       ├── rbd_api_tests_copy_on_read.yaml
+       └── rbd_api_tests.yaml
+
+This creates two tests:
+
+* rbd/thrash/{clusters/fixed-2.yaml clusters/openstack.yaml workloads/rbd_api_tests_copy_on_read.yaml}
+* rbd/thrash/{clusters/fixed-2.yaml clusters/openstack.yaml workloads/rbd_api_tests.yaml}
+
+Because the ``clusters/`` subdirectory contains the special file plus
+(``+``), all the other files in that subdirectory (``fixed-2.yaml`` and
+``openstack.yaml`` in this case) are concatenated together
+and treated as a single file. Without the special file plus, they would
+have been convolved with the files from the workloads directory to create
+a 2x2 matrix:
+
+* rbd/thrash/{clusters/openstack.yaml workloads/rbd_api_tests_copy_on_read.yaml}
+* rbd/thrash/{clusters/openstack.yaml workloads/rbd_api_tests.yaml}
+* rbd/thrash/{clusters/fixed-2.yaml workloads/rbd_api_tests_copy_on_read.yaml}
+* rbd/thrash/{clusters/fixed-2.yaml workloads/rbd_api_tests.yaml}
+
+The ``clusters/fixed-2.yaml`` file is shared among many suites to
+define the following ``roles``:
+
+.. code-block:: yaml
+
+    roles:
+    - [mon.a, mon.c, osd.0, osd.1, osd.2, client.0]
+    - [mon.b, osd.3, osd.4, osd.5, client.1]
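+
+The ``clusters/openstack.yaml`` facet, by contrast, typically carries only
+resource hints for the OpenStack backend. The following is a hypothetical
+sketch of such a file (the exact keys and values vary from suite to suite):
+
+.. code-block:: yaml
+
+    openstack:
+    - volumes: # attached to each instance
+        count: 3
+        size: 30 # GB
+
+Concatenating the two facets therefore yields a single cluster definition
+that specifies both the role layout and the resources the test needs.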
+
+The ``rbd/thrash`` suite as defined above, consisting of two tests,
+can be run with:
+
+.. prompt:: bash $
+
+   teuthology-suite --machine-type smithi --suite rbd/thrash
+
+A single test from the rbd/thrash suite can be run by adding the
+``--filter`` option:
+
+.. prompt:: bash $
+
+   teuthology-suite \
+       --machine-type smithi \
+       --suite rbd/thrash \
+       --filter 'rbd/thrash/{clusters/fixed-2.yaml clusters/openstack.yaml workloads/rbd_api_tests_copy_on_read.yaml}'
+
+Filtering tests by their description
+------------------------------------
+
+When a few jobs fail and need to be run again, the ``--filter`` option
+can be used to select tests with a matching description. For instance, if
+the ``rados`` suite fails the `all/peer.yaml
+<https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/peer.yaml>`_
+test, the following will run only the tests that contain this file:
+
+.. prompt:: bash $
+
+   teuthology-suite --machine-type smithi --suite rados --filter all/peer.yaml
+
+The ``--filter-out`` option does the opposite (it matches tests that do
+`not` contain a given string), and can be combined with the ``--filter``
+option.
+
+Both ``--filter`` and ``--filter-out`` take a comma-separated list of
+strings (which means the comma character is implicitly forbidden in
+filenames found in the `ceph/qa sub-directory`_). For instance:
+
+.. prompt:: bash $
+
+   teuthology-suite --machine-type smithi --suite rados --filter all/peer.yaml,all/rest-api.yaml
+
+will run tests that contain either
+`all/peer.yaml <https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/peer.yaml>`_
+or
+`all/rest-api.yaml <https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/rest-api.yaml>`_.
+
+Each string is matched against the test description anywhere it appears,
+and has to be an exact substring match: filters are not regular
+expressions.
+
+Reducing the number of tests
+----------------------------
+
+The ``rados`` suite generates tens or even hundreds of thousands of tests
+out of a few hundred files. This happens because teuthology constructs test
+matrices from subdirectories wherever it encounters a file named ``%``. For
+instance, all tests in the `rados/basic suite
+<https://github.com/ceph/ceph/tree/master/qa/suites/rados/basic>`_ run with
+different messenger types: ``simple``, ``async`` and ``random``, because
+they are combined (via the special file ``%``) with the `msgr directory
+<https://github.com/ceph/ceph/tree/master/qa/suites/rados/basic/msgr>`_.
+
+All integration tests are required to be run before a Ceph release is
+published. When merely verifying whether a contribution can be merged
+without risking a trivial regression, it is enough to run a subset. The
+``--subset`` option can be used to reduce the number of tests that are
+triggered. For instance:
+
+.. prompt:: bash $
+
+   teuthology-suite --machine-type smithi --suite rados --subset 0/4000
+
+will run as few tests as possible. The tradeoff in this case is that not
+all combinations of test variations will be run, but no matter how small a
+ratio is provided in the ``--subset``, teuthology will still ensure that
+all files in the suite are in at least one test. Understanding the actual
+logic that drives this requires reading the teuthology source code.
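+
+A common pattern (a convenience sketch, not a project requirement) is to
+pick the slice at random, so that repeated runs of the same suite tend to
+exercise different combinations each time:
+
+.. prompt:: bash $
+
+   teuthology-suite --machine-type smithi --suite rados --subset $((RANDOM % 4000))/4000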
+
+The ``--limit`` option runs only the first ``N`` tests in the suite:
+this is rarely useful, however, because there is no way to control which
+test will be first.
+
+.. _ceph/qa sub-directory: https://github.com/ceph/ceph/tree/master/qa
+.. _Sepia Lab: https://wiki.sepia.ceph.com/doku.php
+.. _teuthology repository: https://github.com/ceph/teuthology
+.. _teuthology framework: https://github.com/ceph/teuthology
diff --git a/doc/dev/developer_guide/tests-unit-tests.rst b/doc/dev/developer_guide/tests-unit-tests.rst
new file mode 100644
index 000000000..72d724d98
--- /dev/null
+++ b/doc/dev/developer_guide/tests-unit-tests.rst
@@ -0,0 +1,177 @@
+Testing - unit tests
+====================
+
+The Ceph GitHub repository has two types of tests: unit tests (also called
+``make check`` tests) and integration tests. Strictly speaking, the
+``make check`` tests are not "unit tests", but rather tests that can be run
+easily on a single build machine after compiling Ceph from source, whereas
+integration tests require package installation and multi-machine clusters
+to run.
+
+.. _make-check:
+
+What does "make check" mean?
+----------------------------
+
+After compiling Ceph, the code can be run through a battery of tests. For
+historical reasons, this is often referred to as ``make check`` even though
+the actual command used to run the tests is now ``ctest``. To be included
+in this group of tests, a test must:
+
+* bind ports that do not conflict with other tests
+* not require root access
+* not require more than one machine to run
+* complete within a few minutes
+
+For the sake of simplicity, this class of tests is referred to as "make
+check tests" or "unit tests". This is meant to distinguish these tests from
+the more complex "integration tests" that are run via the `teuthology
+framework`_.
+
+While it is possible to run ``ctest`` directly, it can be tricky to
+correctly set up your environment for it. Fortunately, there is a script
+that makes it easy to run the unit tests on your code. This script can be
+run from the top-level directory of the Ceph source tree by invoking:
+
+  .. prompt:: bash $
+
+     ./run-make-check.sh
+
+You will need a minimum of 8GB of RAM and 32GB of free drive space for this
+command to complete successfully on x86_64 architectures; other
+architectures may have different requirements. Depending on your hardware,
+it can take from twenty minutes to three hours to complete.
+
+
+How unit tests are declared
+---------------------------
+
+Unit tests are declared in the ``CMakeLists.txt`` file, which is found in
+the ``./src`` directory. The ``add_ceph_test`` and ``add_ceph_unittest``
+CMake functions are used to declare unit tests. ``add_ceph_test`` and
+``add_ceph_unittest`` are themselves defined in
+``./cmake/modules/AddCephTest.cmake``.
+
+Some unit tests are scripts, while others are binaries that are compiled
+during the build process.
+
+* ``add_ceph_test`` function - used to declare unit test scripts
+* ``add_ceph_unittest`` function - used for unit test binaries
+
+Unit testing of CLI tools
+-------------------------
+
+Some of the CLI tools are tested using special files ending with the
+extension ``.t`` and stored under ``./src/test/cli``. These tests are run
+using a tool called `cram`_ via a shell script called
+``./src/test/run-cli-tests``. `cram`_ tests that are not suitable for
+``make check`` can also be run by teuthology using the `cram task`_.
+
+.. _`cram`: https://bitheap.org/cram/
+.. _`cram task`: https://github.com/ceph/ceph/blob/master/qa/tasks/cram.py
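+
+A ``.t`` file interleaves shell commands (indented, prefixed with ``$``)
+with the output they are expected to produce. The following is a minimal
+hypothetical sketch, not an actual file from ``./src/test/cli``:
+
+.. code-block:: none
+
+     $ echo hello
+     hello
+     $ false
+     [1]
+
+`cram`_ runs each command and fails the test if the actual output differs;
+a line like ``[1]`` declares the expected exit status of the preceding
+command.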
+
+Tox-based testing of Python modules
+-----------------------------------
+
+Some of the Python modules in Ceph use `tox <https://tox.readthedocs.io/en/latest/>`_
+to run their unit tests.
+
+Most of these Python modules can be found in the directory
+``./src/pybind/``.
+
+Currently (December 2020) the following modules use **tox**:
+
+* Cephadm (``./src/cephadm/tox.ini``)
+* Ceph Manager Python API (``./src/pybind/mgr``)
+
+  * ``./src/pybind/mgr/tox.ini``
+
+  * ``./src/pybind/mgr/dashboard/tox.ini``
+
+  * ``./src/pybind/tox.ini``
+
+* Dashboard (``./src/pybind/mgr/dashboard``)
+* Python common (``./src/python-common/tox.ini``)
+* CephFS (``./src/tools/cephfs/tox.ini``)
+* ceph-volume
+
+  * ``./src/ceph-volume/tox.ini``
+
+  * ``./src/ceph-volume/plugin/zfs/tox.ini``
+
+  * ``./src/ceph-volume/ceph_volume/tests/functional/batch/tox.ini``
+
+  * ``./src/ceph-volume/ceph_volume/tests/functional/simple/tox.ini``
+
+  * ``./src/ceph-volume/ceph_volume/tests/functional/lvm/tox.ini``
+
+Configuring Tox environments and tasks
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Most tox configurations support multiple environments and tasks.
+
+The list of environments and tasks that are supported is in the
+``tox.ini`` file, under ``envlist``. For example, here are the first three
+lines of ``./src/cephadm/tox.ini``::
+
+    [tox]
+    envlist = py3, mypy
+    skipsdist=true
+
+In this example, the ``py3`` and ``mypy`` environments are specified.
+
+The list of environments can be retrieved with the following command:
+
+  .. prompt:: bash $
+
+     tox --list
+
+Or:
+
+  .. prompt:: bash $
+
+     tox -l
+
+Running Tox
+^^^^^^^^^^^
+
+To run **tox**, just execute ``tox`` in the directory containing
+``tox.ini``. If you do not specify any environments (for example, ``-e
+$env1,$env2``), then ``tox`` will run all environments. Jenkins will run
+``tox`` by executing ``./src/script/run_tox.sh``.
+
+Here are some examples from Ceph Dashboard that show how to specify
+different environments and run options::
+
+    ## Run Python 2+3 tests+lint commands:
+    $ tox -e py27,py3,lint,check
+
+    ## Run Python 3 tests+lint commands:
+    $ tox -e py3,lint,check
+
+    ## To run it as Jenkins would:
+    $ ../../../script/run_tox.sh --tox-env py3,lint,check
+
+Manager core unit tests
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Currently only doctests_ inside ``mgr_util.py`` are run.
+
+To add more files to be tested inside the core of the manager, open the
+``tox.ini`` file and add the files to be tested at the end of the line that
+includes ``mgr_util.py``.
+
+.. _doctests: https://docs.python.org/3/library/doctest.html
+
+Unit test caveats
+-----------------
+
+#. Unlike the various Ceph daemons and ``ceph-fuse``, the unit tests are
+   linked against the default memory allocator (glibc) unless they are
+   explicitly linked against something else. This enables tools such as
+   **valgrind** to be used in the tests.
+
+#. The Google Test unit testing library hides client output from the
+   shell. To debug the client after setting the desired debug level
+   (e.g. ``ceph config set client debug_rbd 20``), look for the debug log
+   file at ``build/out/client.admin.<pid>.log``. This can also be handy
+   when examining failed unit test jobs run through teuthology; in that
+   case, the job's debug level can be set in the relevant yaml file.
+
+.. _make check:
+.. _teuthology framework: https://github.com/ceph/teuthology