author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-19 00:47:55 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-19 00:47:55 +0000 |
commit | 26a029d407be480d791972afb5975cf62c9360a6 (patch) | |
tree | f435a8308119effd964b339f76abb83a57c29483 /testing/docs/tests-for-new-config | |
parent | Initial commit. (diff) | |
download | firefox-26a029d407be480d791972afb5975cf62c9360a6.tar.xz firefox-26a029d407be480d791972afb5975cf62c9360a6.zip |
Adding upstream version 124.0.1.upstream/124.0.1
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'testing/docs/tests-for-new-config')
-rw-r--r-- | testing/docs/tests-for-new-config/index.rst | 130 | ||||
-rw-r--r-- | testing/docs/tests-for-new-config/manual.rst | 224 |
2 files changed, 354 insertions, 0 deletions
diff --git a/testing/docs/tests-for-new-config/index.rst b/testing/docs/tests-for-new-config/index.rst
new file mode 100644
index 0000000000..32fdf62b7f
--- /dev/null
+++ b/testing/docs/tests-for-new-config/index.rst
@@ -0,0 +1,130 @@
+Turning on Firefox tests for a new configuration
+================================================
+
+You are ready to turn on Firefox tests for a new config. Once you
+get to this stage, you will have seen a try push with all the tests running
+(many not green) to verify some tests pass and there are enough machines
+available to run tests.
+
+For the purpose of this document, assume you are tasked with upgrading Windows
+10 OS from version 1803 -> 1903. To simplify this we can call this `windows_1903`,
+and we need to:
+
+ * create meta bug
+ * push to try
+ * run skip-fails
+ * repeat 2 more times
+ * land changes and turn on tests
+ * turn on run only failures
+
+If you are running this manually or on configs/tests that are not supported with
+`./mach try --new-test-config`, then please follow the steps `here <manual.html>`__.
+
+
+Create Meta Bug
+---------------
+
+This is a simple step where you create a meta bug to track the failures associated
+with the tests you are greening up. If this is a test suite (e.g. `devtools`), it
+is ok to have a meta bug just for the test suite and the new platform.
+
+All bugs related to tests skipped or failing will be blocking this meta bug.
+
+Push to Try Server
+------------------
+
+Now that you have a configuration set up and machines available via try server, it
+is time to run try. If you are migrating mochitest or xpcshell, then you can do:
+
+  ``./mach try fuzzy --no-artifact --full --rebuild 10 --new-task-config -q 'test-windows10-64-1903 mochitest-browser-chrome !ccov !ship !browsertime !talos !asan'``
+
+This will run many tests (thanks to --full and --rebuild 10), but will give plenty
+of useful data.
+
+In the scenario you are migrating tests such as:
+ * performance
+ * web-platform-tests
+ * reftest / crashtest / jsreftest
+ * mochitest-webgl (has a different process for test skipping)
+ * cppunittest / gtest / junit
+ * marionette / firefox-ui / telemetry
+
+ then please follow the steps `here <manual.html>`__
+
+ If you are migrating to a small machine pool, it is best to avoid `--rebuild 10` and
+ instead do `--rebuild 3`. Likewise please limit your jobs to be the specific test
+ suite and variant. The size of a worker pool is shown at the Workers page of the
+ Taskcluster instance.
+
+Run skip-fails
+--------------
+
+When the try push is completed it is time to run skip-fails. Skip-fails will
+look at all the test results and automatically create a set of local changes
+with skip-if conditions to green up the tests faster.
+
+``./mach manifest skip-fails --b bugzilla.mozilla.org -m <meta_bug_id> --turbo "https://treeherder.mozilla.org/jobs?repo=try&revision=<rev>"``
+
+Please input the proper `meta_bug_id` and `rev` into the above command.
+
+The first time running this, you will need to get a `bugzilla api key <https://bugzilla.mozilla.org/userprefs.cgi?tab=apikey>`__. Copy
+this key and add it to your `~/.config/python-bugzilla/bugzillarc` file:
+
+.. code-block:: none
+
+  cat bugzillarc
+  [DEFAULT]
+  url = https://bugzilla.mozilla.org
+  [bugzilla.mozilla.org]
+  api_key = <key>
+
+When the command finishes, you will have new bugs created that are blocking the
+meta bug. In addition you will have many changes to manifests adding skip-if
+conditions, both for tests that fail 40% of the time and for entire manifests that
+take >20 minutes to run on opt or >40 minutes on debug.
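
As an illustration of the thresholds quoted above (not the actual skip-fails implementation), the decision can be sketched in Python. The names `RunSummary`, `should_skip_test` and `should_skip_manifest` are hypothetical, and treating "fail 40% of the time" as "at least 40%" is an assumption.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; reading 40% as an inclusive lower bound
# is an assumption made for this sketch.
FAILURE_RATE_THRESHOLD = 0.40
MAX_MANIFEST_MINUTES = {"opt": 20, "debug": 40}


@dataclass
class RunSummary:
    """Hypothetical summary of one test across repeated try runs."""
    runs: int
    failures: int


def should_skip_test(summary: RunSummary) -> bool:
    """Return True when the test fails often enough to get a skip-if."""
    if summary.runs == 0:
        return False  # no data, so no decision
    return summary.failures / summary.runs >= FAILURE_RATE_THRESHOLD


def should_skip_manifest(build_type: str, runtime_minutes: float) -> bool:
    """Return True when a whole manifest runs longer than the limit."""
    return runtime_minutes > MAX_MANIFEST_MINUTES[build_type]
```

For example, a test that failed 4 of 10 runs would be skipped under this reading, while a debug manifest that takes 30 minutes would not.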
+
+You will need to create a commit (or `--amend` your previous commit if this is round 2 or 3):
+
+``hg commit -m "Bug <meta_bug_id> - Green up tests for <suite> on <platform>"``
+
+
+Repeat 2 More Times
+-------------------
+
+In 3 rounds this should be complete and ready to submit for review and turn on
+the new tests.
+
+There will be additional failures; those will follow the normal process of
+intermittents.
+
+
+Land Changes and Turn on Tests
+------------------------------
+
+After you have a green test run, it is time to land the patches. There could
+be changes needed to the taskgraph in order to add the new hardware type and
+duplicate tests to run on both the old and the new, or create a new variant and
+denote which tests to run on that variant.
+
+Using our example of ``windows_1903``, this would be a new worker type that
+would require these edits:
+
+ * `transforms/tests.py <https://searchfox.org/mozilla-central/source/taskcluster/taskgraph/transforms/tests.py#97>`__ (duplicate windows 10 entries)
+ * `test-platforms.yml <https://searchfox.org/mozilla-central/source/taskcluster/ci/test/test-platforms.yml#229>`__ (copy windows10 debug/opt/shippable/asan entries and make win10_1903)
+ * `test-sets.yml <https://searchfox.org/mozilla-central/source/taskcluster/ci/test/test-sets.yml#293>`__ (ideally you need nothing, otherwise copy ``windows-tests`` and edit the test list)
+
+In general this should allow you to have tests scheduled with no custom flags
+in try server and all of these will be scheduled by default on
+``mozilla-central``, ``autoland``, and ``release-branches``.
+
+Turn on Run Only Failures
+-------------------------
+
+Now that we have tests running regularly, the next step is to take all the
+disabled tests and run them in the special failures job.
+
+We have a basic framework created, but for every test harness (e.g. xpcshell,
+mochitest-gpu, browser-chrome, devtools, web-platform-tests, crashtest, etc.),
+there will need to be a corresponding tier-3 job that is created.
+
+TODO: point to examples of how to add this after we get our first jobs running.
diff --git a/testing/docs/tests-for-new-config/manual.rst b/testing/docs/tests-for-new-config/manual.rst
new file mode 100644
index 0000000000..cf2485251a
--- /dev/null
+++ b/testing/docs/tests-for-new-config/manual.rst
@@ -0,0 +1,224 @@
+:orphan:
+
+Turning on Firefox tests for a new configuration (manual)
+=========================================================
+
+You are ready to turn on Firefox tests for a new config. Once you
+get to this stage, you will have seen a try push with all the tests running
+(many not green) to verify some tests pass and there are enough machines
+available to run tests.
+
+For the purpose of this document, assume you are tasked with upgrading Windows
+10 OS from 1803 -> 1903. To simplify this we can call this `windows_1903`, and
+we need to:
+
+ * push to try
+ * analyze test failures
+ * disable tests in manifests
+ * repeat try push until no failures
+ * file bugs for test failures
+ * land changes and turn on tests
+ * turn on run only failures
+
+There are many edge cases, and I will outline them inside each step.
+
+
+Push to Try Server
+------------------
+
+As you have new machines (or cloud instances) available with the updated
+OS/config, it is time to push to try.
+
+In order to run all tests, we would need to execute:
+  ``./mach try fuzzy --no-artifact -q 'test-windows !-raptor- !-talos-' --rebuild 10``
+
+There are a few exceptions here:
+
+ * Perf tests don't need to be run (hence the ``!-raptor- !-talos-``)
+ * Need to make sure we are not building with artifact builds (hence the
+   ``--no-artifact``)
+ * There are jobs hidden behind tier-3, some for a good reason (code coverage is
+   a good example, but fission tests might not be green)
+
+ The last piece to sort out is running on the new config. Here are some
+ considerations for new configs:
+
+ * duplicated jobs (e.g. fission, a11y-checks), you can just run those specific
+   tasks: ``./mach try fuzzy --no-artifact -q 'test-windows fission' --rebuild
+   5``
+ * new OS/hardware (e.g. aarch64, os upgrade), you need to reference the new
+   hardware, typically this is with ``--worker-override``: ``./mach try fuzzy
+   --no-artifact -q 'test-windows' --rebuild 10 --worker-override
+   t-win10-64=gecko-t/t-win10-64-1903``
+
+   * the risk here is a scenario where hardware is limited, then ``--rebuild
+     10`` will create too many tasks and some will expire.
+   * in low hardware situations, either run a subset of tests (e.g.
+     web-platform-tests, mochitest), or ``--rebuild 3`` and repeat.
+
+
+Analyze Test Failures
+---------------------
+
+A try push will take many hours. It is best to push when you start work so
+that results will be ready later in your day, or push at the end of your day
+so results will be ready when you come back to work the next day. Please
+make sure some tasks start before walking away, otherwise a small typo can
+delay this process by hours or a full day.
+
+The best way to look at test failures is to use Push Health to avoid misleading
+data. Push Health will bucket failures into possible regressions, known
+regression, etc. When looking at 5 data points (from ``--rebuild 10``), this
+will filter out intermittent failures.
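
To make the idea of filtering by repeated data points concrete, here is a small Python sketch. It is not how Push Health works internally; the `classify` helper, its labels, and the `high_frequency` cutoff of 0.4 are all assumptions chosen for illustration.

```python
from collections import defaultdict


def classify(observations, high_frequency=0.4):
    """Label each test from repeated (test_path, passed) observations.

    `high_frequency` is a hypothetical cutoff for intermittents that fail
    often enough to be worth disabling anyway.
    """
    runs = defaultdict(int)
    fails = defaultdict(int)
    for test, passed in observations:
        runs[test] += 1
        if not passed:
            fails[test] += 1

    labels = {}
    for test, total in runs.items():
        rate = fails[test] / total
        if rate == 0:
            labels[test] = "passing"
        elif rate == 1:
            labels[test] = "consistent"
        elif rate >= high_frequency:
            labels[test] = "high-frequency intermittent"
        else:
            labels[test] = "intermittent"
    return labels
```

A test that fails in every repeated run is labelled `consistent`, while one that fails once in five runs is labelled `intermittent` and can be left to the normal intermittent process.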
+
+There are many reasons you might have invalid or misleading data:
+
+ #. Tests fail intermittently, we need a pattern to know if it is consistent or
+    intermittent.
+ #. We still want to disable high frequency intermittent tests, those are just
+    annoying.
+ #. You could be pushing off a bad base revision (regression or intermittent that
+    comes from the base revision).
+ #. The machines you run on could be bad, skewing the data.
+ #. Infrastructure problems could cause jobs to fail at random places, repeated
+    jobs filter that out.
+ #. Some failures could affect future tests in the same browser session or tasks.
+ #. If a crash occurs, or we time out, it is possible that we will not run all of
+    the tests in the task, therefore believing a test was run 5 times, but maybe it
+    was only run once (and failed), or never run at all.
+ #. Task failures that do not have a test name (leak on shutdown, crash on
+    shutdown, timeout on shutdown, etc.)
+
+That is a long list of reasons to not trust the data. Luckily, most of the time
+using ``--rebuild 10`` will give us enough data to be confident we
+found all failures and can ignore random/intermittent failures.
+
+Knowing the reasons for misleading data, here is a way to use `Push Health
+<https://treeherder.mozilla.org/push-health/push?revision=abaff26f8e084ac719bea0438dba741ace3cf5d8&repo=try&testGroup=pr>`__.
+
+ * Alternatively, you could use the `API
+   <https://treeherder.mozilla.org/api/project/try/push/health/?revision=abaff26f8e084ac719bea0438dba741ace3cf5d8>`__
+   to get raw data and work towards building a tool
+ * If you write a tool, you need to parse the resulting JSON file and keep in
+   mind to build a list of failures and match it with a list of jobnames to find
+   how many times the job ran and failed/passed.
+
+The main goal here is to know what <path>/<filenames> are failing, and having a
+list of those. Ideally you would record some additional information like
+timeout, crash, failure, etc. In the end you might end up with::
+
+    dom/html/test/test_fullscreen-api.html, scrollbar
+    gfx/layers/apz/test/mochitest/test_group_hittest.html, scrollbar
+    image/test/mochitest/test_animSVGImage.html, timeout
+    browser/base/content/test/general/browser_restore_isAppTab.js, crashed
+
+
+
+
+Disable Tests in the Manifest Files
+-----------------------------------
+
+The code sheriffs have been using `this documentation
+<https://wiki.mozilla.org/Auto-tools/Projects/Stockwell/disable-recommended>`__
+for training and reference when they disable intermittents.
+
+First you need to add a keyword to be available in the manifest (e.g. ``skip-if
+= windows_1903``).
+
+There are many exceptions, but the bulk of the work will fall into one of 4
+categories:
+
+ #. `manifestparser <mochitest_xpcshell_manifest_keywords>`_: \*.toml (mochitest*,
+    firefox-ui, marionette, xpcshell) easy to edit by adding a ``skip-if =
+    windows_1903 # <comment>``, a few exceptions here
+ #. `reftest <reftest_manifest_keywords>`_: \*.list (reftest, crashtest) need to
+    add a ``fuzzy-if(windows_1903, A, B)``, this is more specific
+ #. web-platform-test: testing/web-platform/meta/\*\*.ini (wpt, wpt-reftest,
+    etc.) need to edit/add testing/web-platform/meta/<path>/<testname>.ini, and add
+    expected results
+ #. Other (compiled tests, jsreftest, etc.) edit source code, ask for help.
+
+Basically we want to take every non-intermittent failure found from push health
+and edit the manifest, this typically means:
+
+ * Finding the proper manifest.
+ * Adding the right text to the manifest.
+
+To find the proper manifest, it is typically <path>/<harness>.[toml|list].
+There are exceptions and if in doubt use searchfox.org/ to find the manifest
+which contains the testname.
+
+Once you have the manifest, open it in an editor, and search for the exact test
+name (there could be similarly named tests).
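
A failure list in the ``<path>, <reason>`` shape shown earlier is easy to process with a few lines of Python. The sketch below is only an illustration: `parse_failures` and `group_by_directory` are hypothetical helpers, and grouping by directory is just a starting point for locating the manifest, since the text notes that the manifest is only typically found next to the test.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# The sample failure list from the document.
FAILURES = """\
dom/html/test/test_fullscreen-api.html, scrollbar
gfx/layers/apz/test/mochitest/test_group_hittest.html, scrollbar
image/test/mochitest/test_animSVGImage.html, timeout
browser/base/content/test/general/browser_restore_isAppTab.js, crashed
"""


def parse_failures(text):
    """Turn '<path>, <reason>' lines into (path, reason) tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        path, _, reason = line.partition(",")
        records.append((path.strip(), reason.strip()))
    return records


def group_by_directory(records):
    """Group failures by test directory, where the manifest usually lives."""
    grouped = defaultdict(list)
    for path, reason in records:
        test = PurePosixPath(path)
        grouped[str(test.parent)].append((test.name, reason))
    return dict(grouped)
```

Matching on the full file name (for example `test_fullscreen-api.html` rather than `fullscreen`) helps avoid editing the entry of a similarly named test.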
+
+Rerun Try Push, Repeat as Necessary
+-----------------------------------
+
+It is important to test your changes and, for a new platform that will be
+sheriffed, to rerun all the tests at scale.
+
+With your change in a commit, push again to try with ``--rebuild 10`` and come
+back the next day.
+
+As there are so many edge cases, it is quite likely that you will have more
+failures. Mentally plan on 3 iterations of this, where each iteration has fewer
+failures.
+
+Once you get a full push to show no persistent failures, it is time to land
+those changes and turn on the new tests. There is a large risk here that the
+longer you take to find all failures, the greater the chance of:
+
+ * Bitrot of your patch
+ * New tests being added which could fail on your config
+ * Other edits to tests/tools which could affect your new config
+
+Since the new config process is designed to find failures fast and get the
+changes landed fast, we do not need to ask developers for review; that comes
+after the new config is running successfully, when we notify the teams of what
+tests are failing.
+
+File Bugs for Test Failures
+---------------------------
+
+Once the failure jobs are running on mozilla-central, we have full coverage
+and the ability to run tests on try server. There could be >100 tests that are
+marked as ``skip-if`` and that would take a lot of time to file bugs. Instead
+we will file a bug for each manifest that is edited, typically this reduces the
+bugs to about 40% of the total tests (averaging 2.5 test failures per manifest).
+
+When filing the bug, indicate the timeline, how to run the failure, link to the
+bug where we created the config, describe briefly the config change (e.g.
+upgrade windows 10 from version 1803 to 1903), and finally needinfo the triage
+owner indicating this is a heads up and these tests are running regularly on
+mozilla-central for the next 6-7 weeks.
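
The "one bug per manifest" arithmetic can be checked with a short Python sketch: at an average of 2.5 skipped tests per manifest, the number of bugs is 1 / 2.5 = 40% of the number of tests. The helper name `bugs_per_manifest` and the manifest paths in the usage note are made up for illustration.

```python
from collections import defaultdict


def bugs_per_manifest(skipped):
    """Group (manifest_path, test_name) pairs so one bug covers one manifest."""
    by_manifest = defaultdict(list)
    for manifest, test in skipped:
        by_manifest[manifest].append(test)
    return {manifest: sorted(tests) for manifest, tests in by_manifest.items()}


def bug_ratio(skipped):
    """Return bugs filed divided by tests skipped (0.0 when nothing is skipped)."""
    skipped = list(skipped)
    if not skipped:
        return 0.0
    return len(bugs_per_manifest(skipped)) / len(skipped)
```

With five skipped tests spread over two hypothetical manifests, such as `dom/html/test/mochitest.toml` and `image/test/mochitest/mochitest.toml`, this gives two bugs and a ratio of 0.4.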
+
+Land Changes and Turn on Tests
+------------------------------
+
+After you have a green test run, it is time to land the patches. There could
+be changes needed to the taskgraph in order to add the new hardware type and
+duplicate tests to run on both the old and the new, or create a new variant and
+denote which tests to run on that variant.
+
+Using our example of ``windows_1903``, this would be a new worker type that
+would require these edits:
+
+ * `transforms/tests.py <https://searchfox.org/mozilla-central/source/taskcluster/taskgraph/transforms/tests.py#97>`__ (duplicate windows 10 entries)
+ * `test-platforms.yml <https://searchfox.org/mozilla-central/source/taskcluster/ci/test/test-platforms.yml#229>`__ (copy windows10 debug/opt/shippable/asan entries and make win10_1903)
+ * `test-sets.yml <https://searchfox.org/mozilla-central/source/taskcluster/ci/test/test-sets.yml#293>`__ (ideally you need nothing, otherwise copy ``windows-tests`` and edit the test list)
+
+In general this should allow you to have tests scheduled with no custom flags
+in try server and all of these will be scheduled by default on
+``mozilla-central``, ``autoland``, and ``release-branches``.
+
+Turn on Run Only Failures
+-------------------------
+
+Now that we have tests running regularly, the next step is to take all the
+disabled tests and run them in the special failures job.
+
+We have a basic framework created, but for every test harness (e.g. xpcshell,
+mochitest-gpu, browser-chrome, devtools, web-platform-tests, crashtest, etc.),
+there will need to be a corresponding tier-3 job that is created.
+
+TODO: point to examples of how to add this after we get our first jobs running.