authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-05-06 03:01:46 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-05-06 03:01:46 +0000
commitf8fe689a81f906d1b91bb3220acde2a4ecb14c5b (patch)
tree26484e9d7e2c67806c2d1760196ff01aaa858e8c /src/VBox/ValidationKit/docs/AutomaticTestingRevamp.txt
parentInitial commit. (diff)
Adding upstream version 6.0.4-dfsg.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'src/VBox/ValidationKit/docs/AutomaticTestingRevamp.txt')
-rw-r--r--  src/VBox/ValidationKit/docs/AutomaticTestingRevamp.txt  1062
1 files changed, 1062 insertions, 0 deletions
diff --git a/src/VBox/ValidationKit/docs/AutomaticTestingRevamp.txt b/src/VBox/ValidationKit/docs/AutomaticTestingRevamp.txt
new file mode 100644
index 00000000..525384b6
--- /dev/null
+++ b/src/VBox/ValidationKit/docs/AutomaticTestingRevamp.txt
@@ -0,0 +1,1062 @@
+
+Revamp of Automatic VirtualBox Testing
+======================================
+
+
+Introduction
+------------
+
+This is the design document for a revamped automatic testing framework.
+The revamp aims at replacing the current tinderbox based testing by a new
+system that is written from scratch.
+
+The old system is not easy to work with and was never meant to be used for
+managing tests; after all, it is just a simple build manager tailored for
+continuous building. Modifying the existing tinderbox system to do what
+we want would require fundamental changes that would render it useless as
+a build manager, so it would end up as a fork. The amount of work
+required would probably be about the same as writing a new system from
+scratch. Other considerations, such as the license of the tinderbox
+system (MPL) and the language it is realized in (Perl), are also in favor of
+doing it from scratch.
+
+The language envisioned for the new automatic testing framework is Python. This
+is for several reasons:
+
+ - The VirtualBox API has Python bindings.
+ - Python is used quite a bit inside Sun (dunno about Oracle).
+ - Works relatively well with Apache for the server side bits.
+ - It is more difficult to produce write-only code in Python (a.k.a. the
+ we-don't-like-perl argument).
+ - You don't need to compile stuff.
+
+Note that the author of this document has no special training as a test
+engineer and may therefore be using the wrong terms here and there. The
+primary focus is to express what we need to do in order to improve
+testing.
+
+This document is written in reStructuredText (rst) which just happens to
+be used by Python, the primary language for this revamp. For more
+information on reStructuredText: http://docutils.sourceforge.net/rst.html
+
+
+Definitions / Glossary
+======================
+
+sub-test driver
+ A set of test cases that can be used by more than one test driver. Could
+ also be called a test unit, in the Pascal sense of unit, if it wasn't so
+ easily confused with 'unit test'.
+
+test
+ This is somewhat ambiguous and this document tries to avoid using it where
+ possible. When used it normally refers to doing testing by executing one or
+ more testcases.
+
+test case
+ A set of inputs, test programs and expected results. It validates system
+ requirements and generates a pass or fail status. A basic unit of testing.
+ Note that we use the term in a rather broad sense.
+
+test driver
+ A program/script used to execute a test. Also known as a test harness.
+ Generally abbreviated 'td'. It can have sub-test drivers.
+
+test manager
+ Software managing the automatic testing. This is a web application that runs
+ on a dedicated server (tindertux).
+
+test set
+ The output of testing activity. Logs, results, ++. Our usage of this should
+ probably be renamed to 'test run'.
+
+test group
+ A collection of related test cases.
+
+testbox
+ A computer that does testing.
+
+testbox script
+ Script executing orders from the test manager on a testbox. Started
+ automatically upon bootup.
+
+testing
+ todo
+
+TODO: Check that we've got all this right and make them more exact
+ where possible.
+
+See also http://encyclopedia2.thefreedictionary.com/testing%20types
+and http://www.aptest.com/glossary.html .
+
+
+
+Objectives
+==========
+
+ - A scalable test manager (>200 testboxes).
+ - Optimize the web user interface (WUI) for typical workflows and analysis.
+ - Efficient and flexible test configuration.
+ - Import test results from other test systems (logo testing, VDI, ++).
+ - Easy to add lots of new test scripts.
+ - Run tests locally without a manager.
+ - Revamp a bit at a time.
+
+
+
+The Testbox Side
+================
+
+Each testbox has a unique name corresponding to its DNS zone entry. When booted
+a testbox script is started automatically. This script will query the test
+manager for orders and execute them. The core order downloads and executes a
+test driver with parameters (configuration) from the server. The test driver
+does all the necessary work for executing the test. In a typical VirtualBox
+test this means picking a build, installing it, configuring VMs, running the
+test VMs, collecting the results, submitting them to the server, and finally
+cleaning up afterwards.
+
+The testbox environment in which the test drivers are executed will have a
+number of environment variables for determining the location of the source
+images and other test data, scratch space, test set id, server URL, and so on
+and so forth.
+
+On startup, the testbox script will look for crash dumps and similar on
+systems where this is possible. If any sign of a crash is found, it will
+put any dumps and reports in the upload directory and inform the test
+manager before reporting for duty. In order to generate the proper file
+names and report the crash in the right test set as well as prevent
+reporting crashes unrelated to automatic testing, the testbox script will
+keep information (test set id, ++) in a separate scratch directory
+(${TESTBOX_PATH_SCRATCH}/../testbox) and make sure it is synced to the
+disk (both files and directories).
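+
+A sketch of how that state might be persisted (the file name is made up; the
+point is flushing both the file and its directory, POSIX style)::
+
+    import os
+
+    def save_testbox_state(state_dir, test_set_id):
+        """Record the current test set ID so a post-crash boot can report it."""
+        fd = os.open(os.path.join(state_dir, 'testset-id.txt'),
+                     os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
+        try:
+            os.write(fd, str(test_set_id).encode('ascii'))
+            os.fsync(fd)                        # flush the file data
+        finally:
+            os.close(fd)
+        dfd = os.open(state_dir, os.O_RDONLY)   # flush the directory entry too
+        try:
+            os.fsync(dfd)
+        finally:
+            os.close(dfd)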
+
+After checking for crashes, the testbox script will clean up any previous test
+which might be around. This involves first invoking the test script in cleanup
+mode and then wiping the scratch space.
+
+When reporting for duty the script will submit information about the host: OS
+name, OS version, OS bitness, CPU vendor, total number of cores, VT-x support,
+AMD-V support, amount of memory, amount of scratch space, and anything else that
+can be found useful for scheduling tests or filtering test configurations.
+
+
+
+Testbox Script Orders
+---------------------
+
+The orders are kept in a queue on the server and the testbox script will fetch
+them one by one. Orders that cannot be executed at the moment will be masked in
+the query from the testbox.
+
+Execute Test Driver
+ Downloads and executes the specified test driver with the given
+ configuration (arguments). Only one test driver can be executed at a time.
+ The server can specify more than one ZIP file to be downloaded and unpacked
+ before executing the test driver. The testbox script may cache these zip
+ files using http time stamping.
+
+Abort Test Driver
+ Aborts the current test driver. This will drop a hint to the driver and give
+ it 60 seconds to shut down the normal way. If that fails, the testbox script
+ will kill the driver processes (SIGKILL or equivalent), invoke the
+ testdriver in cleanup mode, and finally wipe the scratch area. Should either
+ of the last two steps fail in some way, the testbox will be rebooted.
+
+Idle
+ Ask again in X seconds, where X is specified by the server.
+
+Reboot
+ Reboot the testbox. If a test driver is currently running, an attempt at
+ aborting it (Abort Test Driver) will be made first.
+
+Update
+ Updates the testbox script. The order includes a server relative path to the
+ new testbox script. This can only be executed when no test driver is
+ currently being executed.
+
+
+Testbox Environment: Variables
+------------------------------
+
+COMSPEC
+ This will be set to C:\Windows\System32\cmd.exe on Windows.
+
+PATH
+ This will contain the kBuild binary directory for the host platform.
+
+SHELL
+ This will be set to point to kmk_ash(.exe) on all platforms.
+
+TESTBOX_NAME
+ The testbox name.
+ This is not required by the local reporter.
+
+TESTBOX_PATH_BUILDS
+ The absolute path to where the build repository can be found. This should be
+ a read only mount when possible.
+
+TESTBOX_PATH_RESOURCES
+ The absolute path to where static test resources like ISOs and VDIs can be
+ found. The test drivers know the layout of this. This should be a read only
+ mount when possible.
+
+TESTBOX_PATH_SCRATCH
+ The absolute path to the scratch space. This is the current directory when
+ starting the test driver. It will be wiped automatically after executing the
+ test.
+ (Envisioned as ${TESTBOX_PATH_SCRIPTS}/../scratch and that
+ ${TESTBOX_PATH_SCRATCH}/ will be automatically wiped by the testbox script.)
+
+TESTBOX_PATH_SCRIPTS
+ The absolute path to the test driver and the other files that were unzipped
+ together with it. This is also where the test-driver-abort file will be put.
+ (Envisioned as ${TESTBOX_PATH_SCRATCH}/../driver, see above.)
+
+TESTBOX_PATH_UPLOAD
+ The absolute path to the upload directory for the testbox. This is for
+ putting VOBs, PNGs, core dumps, crash dumps, and such on. The files should be
+ bzipped or zipped if they aren't compressed already. The names should contain
+ the testbox and test set ID.
+
+TESTBOX_REPORTER
+ The name of the test reporter back end. If not present, it will default to
+ the local reporter.
+
+TESTBOX_TEST_SET_ID
+ The test set ID if we're running.
+ This is not required by the local reporter.
+
+TESTBOX_MANAGER_URL
+ The URL to the test manager.
+ This is not required by the local reporter.
+
+TESTBOX_XYZ
+ There will probably be some more of these.
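+
+As a rough illustration only, the driver framework might gather these variables
+into a small configuration object along these lines (class name and defaults
+are made up)::
+
+    import os
+
+    class TestBoxEnvironment(object):
+        """Snapshot of the TESTBOX_* variables handed to a test driver."""
+        def __init__(self, environ=None):
+            env = environ if environ is not None else os.environ
+            self.sName          = env.get('TESTBOX_NAME', 'local')
+            self.sPathBuilds    = env.get('TESTBOX_PATH_BUILDS', '')
+            self.sPathResources = env.get('TESTBOX_PATH_RESOURCES', '')
+            self.sPathScratch   = env.get('TESTBOX_PATH_SCRATCH', os.getcwd())
+            self.sPathScripts   = env.get('TESTBOX_PATH_SCRIPTS', '')
+            self.sPathUpload    = env.get('TESTBOX_PATH_UPLOAD', '')
+            self.sReporter      = env.get('TESTBOX_REPORTER', 'local')
+            self.idTestSet      = env.get('TESTBOX_TEST_SET_ID')
+            self.sManagerUrl    = env.get('TESTBOX_MANAGER_URL')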
+
+
+Testbox Environment: Core Utilities
+-----------------------------------
+
+The testbox will not provide the typical unix /bin and /usr/bin utilities. In
+other words, cygwin will not be used on Windows!
+
+The testbox will provide the unixy utilities that ship with kBuild and possibly
+some additional ones from tools/*.*/bin in the VirtualBox tree (wget, unzip,
+zip, and so on). The test drivers will avoid invoking any of these utilities
+directly and instead rely on generic utility methods in the test driver
+framework. That way we can more easily reimplement the functionality of the
+core utilities and drop the dependency on them. It also allows us to quickly
+work around platform specific oddities and bugs.
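+
+For instance, instead of invoking 'rm' the framework might provide a scratch
+wiping helper along these lines (a sketch only; the real helper will need more
+platform specific workarounds)::
+
+    import os
+    import shutil
+    import stat
+
+    def wipe_directory(path):
+        """Remove everything below path, coping with read-only files (Windows)."""
+        def _on_error(fn, failed_path, exc_info):
+            os.chmod(failed_path, stat.S_IWRITE)    # clear read-only and retry
+            fn(failed_path)
+        for entry in os.listdir(path):
+            full = os.path.join(path, entry)
+            if os.path.isdir(full) and not os.path.islink(full):
+                shutil.rmtree(full, onerror=_on_error)
+            else:
+                os.chmod(full, stat.S_IWRITE)
+                os.remove(full)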
+
+
+Test Drivers
+------------
+
+The test drivers are programs that will do the actual testing. In addition to
+running under the testbox script, they can be executed in the VirtualBox
+development environment. This is important for bug analysis and for simplifying
+local testing by the developers before committing changes. It also means the
+test drivers can be developed locally in the VirtualBox development environment.
+
+The main difference between executing a driver under the testbox script and
+running it manually is that there is no test manager in the latter case. The
+test result reporter will not talk to the server, but report things to a local
+log file and/or standard out/err. When invoked manually, all the necessary
+arguments will need to be specified by hand of course - it should be possible
+to extract them from a test set as well.
+
+For the early implementation stages, an implementation of the reporter interface
+that talks to the tinderbox based test manager will be needed. This will be
+dropped later on when a new test manager is ready.
+
+As hinted at in other sections, there will be a common framework
+(libraries/packages/classes) for taking care of the tedious bits that every
+test driver needs to do. Sharing code is essential to easing test driver
+development as well as reducing their complexity. The framework will contain:
+
+ - A generic way of submitting output. This will be a generic interface with
+ multiple implementations, the TESTBOX_REPORTER environment variable
+ will decide which of them to use. The interface will have very specific
+ methods to allow the reporter to do the best possible job of reporting the
+ results to the test manager. (A rough interface sketch follows after this
+ list.)
+
+ - Helpers for typical tasks, like:
+ - Copying files.
+ - Deleting files, directory trees and scratch space.
+ - Unzipping files.
+ - Creating ISOs.
+ - And such things.
+
+ - Helpers for installing and uninstalling VirtualBox.
+
+ - Helpers for defining VMs. (The VBox API where available.)
+
+ - Helpers for controlling VMs. (The VBox API where available.)
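+
+As a rough sketch, the reporter interface and the back end selection might look
+something like this (method and class names are illustrative, not final)::
+
+    import os
+    import sys
+
+    class ReporterBase(object):
+        """Back end interface for submitting test results."""
+        def testStart(self, sName):                 # a (sub)test begins
+            raise NotImplementedError
+        def testValue(self, sName, lValue, sUnit):  # e.g. a throughput figure
+            raise NotImplementedError
+        def testDone(self, fSkipped=False):         # close the current (sub)test
+            raise NotImplementedError
+        def log(self, sText):                       # free form log output
+            raise NotImplementedError
+
+    class LocalReporter(ReporterBase):
+        """Default back end: writes to standard output / a local log."""
+        def testStart(self, sName):
+            sys.stdout.write("test '%s' start\n" % (sName,))
+        def testValue(self, sName, lValue, sUnit):
+            sys.stdout.write("  %s: %s %s\n" % (sName, lValue, sUnit))
+        def testDone(self, fSkipped=False):
+            sys.stdout.write("test done%s\n" % (" (skipped)" if fSkipped else "",))
+        def log(self, sText):
+            sys.stdout.write(sText + "\n")
+
+    def createReporter():
+        """Pick the back end named by TESTBOX_REPORTER; default is 'local'."""
+        if os.environ.get('TESTBOX_REPORTER', 'local') == 'local':
+            return LocalReporter()
+        raise NotImplementedError('remote reporter not sketched here')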
+
+The VirtualBox bits will be separate from the more generic ones, simply because
+this is cleaner and it will allow us to reuse the system for testing other
+products.
+
+The framework will be packaged in a separate zip file from the test driver so
+we don't waste time and space downloading the same common code repeatedly.
+
+The test driver will poll for the file
+${TESTBOX_PATH_SCRIPTS}/test-driver-abort and abort all testing when it sees it.
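+
+A sketch of that check (the helper name and polling interval are guesses)::
+
+    import os
+    import time
+
+    def sleep_checking_abort(scripts_dir, seconds):
+        """Sleep in small steps, returning False as soon as an abort is requested."""
+        abort_file = os.path.join(scripts_dir, 'test-driver-abort')
+        deadline = time.time() + seconds
+        while time.time() < deadline:
+            if os.path.exists(abort_file):
+                return False                    # abort all testing
+            time.sleep(1)
+        return True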
+
+The test driver can be invoked in three modes: execute, help and cleanup. The
+default is execute mode, the help mode shows a configuration summary, and the
+cleanup mode is for cleaning up after a reboot or an aborted run. The latter is
+done by the testbox script on startup and after an abort - the driver is
+expected to clean up by itself after a normal run.
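+
+A minimal sketch of the mode dispatching (argument handling is simplified and
+the callback names are made up; the real framework will be richer)::
+
+    import sys
+
+    def driver_main(argv, execute, cleanup, show_help):
+        """Pick the driver mode from the command line; execute is the default."""
+        if '--help' in argv or '-h' in argv:
+            show_help()                         # print a configuration summary
+            return 0
+        if 'cleanup' in argv:
+            return cleanup()                    # after a reboot or an aborted run
+        return execute()                        # the normal testing run
+
+    # e.g.: sys.exit(driver_main(sys.argv[1:], myExecute, myCleanup, myHelp))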
+
+
+
+The Server Side
+===============
+
+The server side will be implemented using a webserver (apache), a database
+(postgres) and cgi scripts (Python). In addition a cron job (Python) running
+once a minute will generate static html for frequently used pages and maybe
+execute some other tasks for driving the testing forwards. The order queries
+from the testbox script are the primary driving force in the system. Together
+these make up the test manager.
+
+The test manager can be split up into three rough parts:
+
+ - Configuration (of tests, testgroups and testboxes).
+ - Execution (of tests, collecting and organizing the output).
+ - Analysis (of test output, mostly about presentation).
+
+
+Test Manager: Requirements
+==========================
+
+List of requirements:
+
+ - Two level testing - L1 quick smoke tests and L2 longer tests performed on
+ builds passing L1. (Klaus (IIRC) meant this could be realized using
+ test dependency.)
+ - Black listing builds (by revision or similar) known to be bad.
+ - Distinguish between build types so we can do a portion of the testing with
+ strict builds.
+ - Easy to re-configure build source for testing a different branch or for
+ testing a release candidate. (Directory based is fine.)
+ - Useful to be able to partition testboxes (run specific builds on some
+ boxes, let an engineer have a few boxes for a while).
+ - Interaction with ILOM/...: reset systems.
+ - Be able to suspend testing on selected testboxes when doing maintenance
+ (where automatically resuming testing on reboot is undesired) or similar
+ activity.
+ - Abort testing on selected testboxes.
+ - Scheduling of tests requiring more than one testbox.
+ - Scheduling of tests that cannot be executed concurrently on several
+ machines because of some global resource like an iSCSI target.
+ - Jump the scheduling queue. Scheduling of specified test the next time a
+ testbox is available (optionally specifying which testbox to schedule it
+ on).
+ - Configure tests with variable configuration to get better coverage. Two modes:
+ - TM generates the permutations based on one or more sets of test script arguments.
+ - Each configuration permutation is specified manually.
+ - Test specification needs to be flexible (select tests, disable test, test
+ scheduling (run certain tests nightly), ... ).
+ - Test scheduling by hour+weekday and by priority.
+ - Test dependencies (test A depends on test B being successful).
+ - Historize all configuration data, in particular test configs (permutations
+ included) and testboxes.
+ - Test sets have at a minimum a build reference, a testbox reference and a
+ primary log associated with them.
+ - Test sets store further results as a recursive collection of:
+ - hierarchical subtest name (slash sep)
+ - test parameters / config
+ - bool fail/succ
+ - attributes (typed?)
+ - test time
+ - e.g. throughput
+ - subresults
+ - log
+ - screenshots, video,...
+ - The test sets database structure needs to be designed such that data mining
+ can be done in an efficient manner.
+ - Presentation/analysis: graphs!, categorize bugs, columns reorganizing
+ grouped by test (hierarchical), overviews, result for last day.
+
+
+
+Test Manager: Configuration
+===========================
+
+
+Testboxes
+---------
+
+Configuration of testboxes doesn't involve much work normally. A testbox
+is added manually to the test manager by entering the DNS entry and/or IP
+address (the test manager resolves the missing one when necessary) as well as
+the system UUID (when obtainable - should be displayed by the testbox script
+installer). Queries from unregistered testboxes will be declined as a kind of
+security measure; the incident should be logged in the webserver log if
+possible. In later dealings with the client, the System UUID will be the key
+identifier. It's permissible for the IP address to change when the testbox
+isn't online, but not while testing (just imagine live migration tests and
+network tests). Ideally, the testboxes should not change IP address.
+
+The testbox edit function must allow changing the name and system UUID.
+
+One further idea for the testbox configuration is to indicate what the testbox
+is capable of, so that tests and test configurations that won't work on it can
+be filtered out. To exemplify this, take the ACP2 installation test: if the
+test manager does not make sure the testbox has VT-x or AMD-V capabilities, the
+test is surely going to fail. Other testbox capabilities would be the total
+number of CPU cores, memory size and scratch space. These testbox capabilities
+should be collected automatically on bootup by the testbox script together with
+the OS name, OS version and OS bitness.
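+
+A rough sketch of such a filtering check (the attribute and requirement names
+are made up for this example)::
+
+    def testbox_meets_requirements(box, requirements):
+        """Check a testbox's reported capabilities against a testcase's needs."""
+        if requirements.get('fNeedsHwVirt') and not box.get('fCpuHwVirt'):
+            return False                        # e.g. the ACP2 installation test
+        if box.get('cMbMemory', 0) < requirements.get('cMbMemoryMin', 0):
+            return False
+        if box.get('cMbScratch', 0) < requirements.get('cMbScratchMin', 0):
+            return False
+        if box.get('cCpus', 0) < requirements.get('cCpusMin', 1):
+            return False
+        return True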
+
+A final thought, instead of outright declining all requests from new testboxes,
+we could record the unregistered testboxes with IP, UUID, name, OS info and
+capabilities but mark them as inactive. The test operator can then activate
+them on an activation page or edit the testbox or something.
+
+
+Testcases
+---------
+
+We use the term testcase for a test.
+
+
+Testgroups
+----------
+
+Testcases are organized into groups. A testcase can be a member of more than one
+group. The testcase gets a priority assigned to it in connection with the
+group membership.
+
+Testgroups are picked up by a testbox partition (aka scheduling group) and a
+priority, scheduling time restriction and dependencies on other test groups are
+associated with the assignment. A testgroup can be used by several testbox
+partitions.
+
+(This used to be called 'testsuites' but was renamed to avoid confusion with
+the VBox Test Suite.)
+
+
+Scheduling
+----------
+
+The initial scheduler will be modelled after what we're already doing in the
+tinderbox driven testing. It's best described as a best effort continuous
+integration scheduler. Meaning, it will always use the latest build suitable
+for a testcase. It will schedule on a testcase level, using the combined
+priority of the testcase in the test group and the test group with the testbox
+partition, trying to spread the test case argument variation out accordingly
+over the whole scheduling queue. Which argument variation to start with is
+not defined (random would be best).
+
+Later, we may add other schedulers as needed.
+
+
+
+The Test Manager Database
+=========================
+
+First a general warning:
+
+ The guys working on this design are not database experts, web
+ programming experts or similar, rather we are low level guys
+ whose main job is x86 & AMD64 virtualization. So, please don't
+ be too hard on us. :-)
+
+
+A logical table layout can be found in TestManagerDatabaseMap.png (created by
+Oracle SQL Data Modeler, stored in TestManagerDatabase.dmd). The physical
+database layout can be found in the TestManagerDatabaseInit.pgsql PostgreSQL
+script. The script is commented.
+
+
+Data History
+------------
+
+We need to somehow track configuration changes over time. We also need to
+be able to query the exact configuration a test set was run with so we can
+understand and make better use of the results.
+
+There are different techniques for archiving this, one is tuple-versioning
+( http://en.wikipedia.org/wiki/Tuple-versioning ), another is log trigger
+( http://en.wikipedia.org/wiki/Log_trigger ). We use tuple-versioning in
+this database, with 'effective' as start date field name and 'expire' as
+the end (exclusive).
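+
+For example, fetching the testbox row that was valid at the time a given test
+set started might look roughly like this (a DB-API cursor with psycopg2 style
+%s placeholders is merely assumed here for illustration)::
+
+    def fetch_testbox_at(connection, id_testbox, ts_when):
+        """Return the TestBoxes row effective at ts_when (tuple-versioning)."""
+        cursor = connection.cursor()
+        cursor.execute("""
+            SELECT  *
+            FROM    TestBoxes
+            WHERE   idTestBox    = %s
+                AND tsEffective <= %s
+                AND tsExpire     > %s   -- 'expire' is exclusive
+            """, (id_testbox, ts_when, ts_when))
+        return cursor.fetchone()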
+
+Tuple-versioning has a shortcoming with regard to keys, both primary and foreign.
+The primary key of a table employing tuple-versioning is really
+'id' + 'valid_period', where the latter is expressed using two fields
+([effective...expire-1]). Only, how do you tell the database engine that
+it should not allow overlapping valid_periods? Useful suggestions are
+welcomed. :-)
+
+Foreign key references to a table using tuple-versioning run into
+trouble because of the time axis and the fact that, to our knowledge, foreign
+keys must reference exactly one row in the other table. When time is involved,
+what we wish to tell the database is that at any given time there actually
+is exactly one row we want to match in the other table, only we've no idea
+how to express this. So, many foreign keys are not expressed in the SQL of
+this database.
+
+In some cases, we extend the tuple-versioning with a generation ID so that
+normal foreign key referencing can be used. We only use this for recording
+(references in testset) and scheduling (schedqueue), as using it more widely
+would force updates (gen_id changes) to propagate into all related tables.
+
+See also:
+ - http://en.wikipedia.org/wiki/Slowly_changing_dimension
+ - http://en.wikipedia.org/wiki/Change_data_capture
+ - http://en.wikipedia.org/wiki/Temporal_database
+
+
+
+Test Manager: Execution
+=======================
+
+
+
+Test Manager: Scenarios
+=======================
+
+
+
+#1 - Testbox Signs On (At Bootup)
+---------------------------------
+
+The testbox supplies a number of inputs when reporting for duty:
+ - IP address.
+ - System UUID.
+ - OS name.
+ - OS version.
+ - CPU architecture.
+ - CPU count (= threads).
+ - CPU VT-x/AMD-V capability.
+ - CPU nested paging capability.
+ - Chipset I/O MMU capability.
+ - Memory size.
+ - Scratch space size (for testing).
+ - Testbox Script revision.
+
+Results:
+ - ACK or NACK.
+ - Testbox ID and name on ACK.
+
+After receiving an ACK the testbox will ask for work to do, i.e. continue with
+scenario #2. In the NACK case, it will sleep for 60 seconds and try again.
+
+
+Actions:
+
+1. Validate the testbox by looking the UUID up in the TestBoxes table.
+ If not found, NACK the request. SQL::
+
+ SELECT idTestBox, sName
+ FROM TestBoxes
+ WHERE uuidSystem = :sUuid
+ AND tsExpire = 'infinity'::timestamp;
+
+2. Check if any of the information supplied by the testbox script has changed.
+ The two sizes are normalized first, memory size rounded to the nearest 4 MB
+ and scratch space rounded down to the nearest 64 MB. If anything changed,
+ insert a new row in the testbox table and historize the current one, i.e. set
+ OLD.tsExpire to NEW.tsEffective and get a new value for NEW.idGenTestBox.
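+
+ A rough sketch of that normalization (the helper name is made up; sizes
+ are in MB)::
+
+     def normalize_sizes(cMbMemory, cMbScratch):
+         """Round memory to nearest 4 MB, scratch down to nearest 64 MB."""
+         cMbMemory  = ((cMbMemory + 2) // 4) * 4
+         cMbScratch = (cMbScratch // 64) * 64
+         return (cMbMemory, cMbScratch)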
+
+3. Check with TestBoxStatuses:
+ a) If there is already a row for the testbox, clean up: change it to the
+ 'idle' state and deal with any open test set as described in
+ scenario #9.
+ b) If there is no row, add one with the 'idle' state.
+
+4. ACK the request and pass back the idTestBox.
+
+
+Note! Testbox.enabled is not checked here, that is only relevant when it asks
+ for a new task (scenario #2 and #5).
+
+Note! Should the testbox script detect changes in any of the inputs, it should
+ redo the sign in.
+
+Note! In scenario #8, the box will not sign on until it has done the reboot and
+ cleanup reporting!
+
+
+#2 - Testbox Asks For Work To Do
+---------------------------------
+
+
+Inputs:
+ - The testbox is supplying its IP indirectly.
+ - The testbox should supply its UUID and ID directly.
+
+Results:
+ - IDLE, WAIT, EXEC, REBOOT, UPGRADE, UPGRADE-AND-REBOOT, SPECIAL or DEAD.
+
+Actions:
+
+1. Validate the ID and IP by selecting the currently valid testbox row::
+
+ SELECT idGenTestBox, fEnabled, idSchedGroup, enmPendingCmd
+ FROM TestBoxes
+ WHERE id = :id
+ AND uuidSystem = :sUuid
+ AND ip = :ip
+ AND tsExpire = 'infinity'::timestamp;
+
+ If NOT found return DEAD to the testbox client (it will go back to sign on
+ mode and retry every 60 seconds or so - see scenario #1).
+
+ Note! The WUI will do all necessary clean-ups when deleting a testbox, so
+ contrary to the initial plans, we don't need to do anything more for
+ the DEAD status.
+
+2. Check with TestBoxStatuses (maybe joined with query from 1).
+
+ If enmState is 'gang-gathering': Goto scenario #6 on timeout or pending
+ 'abort' or 'reboot' command. Otherwise, tell the testbox to WAIT [done].
+
+ If enmState is 'gang-testing': The gang has been gathered and execution
+ has been triggered. Goto 5.
+
+ If enmState is not 'idle', change it to 'idle'.
+
+ If idTestSet is not NULL, CALL scenario #9 to clean it up.
+
+ If there is a pending abort command, remove it.
+
+ If there is a pending command and the old state doesn't indicate that it was
+ being executed, GOTO scenario #3.
+
+ Note! There should be a TestBoxStatuses row after executing scenario #1,
+ however should none be found for some funky reason, returning DEAD
+ will fix the problem (see above).
+
+3. If the testbox was marked as disabled, respond with an IDLE command to the
+ testbox [done]. (Note! Must do this after the TestBoxStatuses maintenance from
+ point 2, or abandoned tests won't be cleaned up after a testbox is disabled.)
+
+4. Consider testcases in the scheduling queue, pick the first one which the
+ testbox can execute. There is a concurrency issue here, so we put an
+ exclusive lock on the SchedQueues table while considering its content.
+
+ The cursor we open looks something like this::
+
+ SELECT idItem, idGenTestCaseArgs,
+ idTestSetGangLeader, cMissingGangMembers
+ FROM SchedQueues
+ WHERE idSchedGroup = :idSchedGroup
+ AND ( bmHourlySchedule is NULL
+ OR get_bit(bmHourlySchedule, :iHourOfWeek) = 1 ) --< does this work?
+ ORDER BY idItem ASC;
+
+ If no rows are returned (this can happen because no testgroups are
+ associated with this scheduling group, the scheduling group is disabled,
+ or because the queue is being regenerated), we will tell the testbox to
+ IDLE [done].
+
+ For each returned row we will:
+ a) Check testcase/group dependencies.
+ b) Select a build (and default testsuite) satisfying the dependencies.
+ c) Check the testcase requirements with that build in mind.
+ d) If idTestSetGangLeader is NULL, try allocate the necessary resources.
+ e) If it didn't check out, fetch the next row and redo from (a).
+ f) Tentatively create a new test set row.
+ g) If not gang scheduling:
+ - Next state: 'testing'
+ ElIf we're the last gang participant:
+ - Set idTestSetGangLeader to NULL.
+ - Set cMissingGangMembers to 0.
+ - Next state: 'gang-testing'
+ ElIf we're the first gang member:
+ - Set cMissingGangMembers to TestCaseArgs.cGangMembers - 1.
+ - Set idTestSetGangLeader to our idTestSet.
+ - Next state: 'gang-gathering'
+ Else:
+ - Decrement cMissingGangMembers.
+ - Next state: 'gang-gathering'
+
+ If we're not gang scheduling OR cMissingGangMembers is 0:
+ Move the scheduler queue entry to the end of the queue.
+
+ Update our TestBoxStatuses row with the new state and test set.
+ COMMIT;
+
+5. If state is 'testing' or 'gang-testing':
+ EXEC response.
+
+ The EXEC response for a gang scheduled testcase includes a number of
+ extra arguments so that the script knows the position of the testbox
+ it is running on and of the other members. This means that the
+ TestSet.iGangMemberNo is passed using --gang-member-no and the IP
+ addresses of all gang members using --gang-ipv4-<memb-no> <ip>.
+ Else (state is 'gang-gathering'):
+ WAIT
+
+
+
+#3 - Pending Command When Testbox Asks For Work
+-----------------------------------------------
+
+This is a subfunction of scenario #2 and #5.
+
+As seen in scenario #2, the testbox will send 'abort' commands to /dev/null
+when it finds one while not executing a test. This includes when it reports
+that the test has completed (no need to abort a completed test, wasting a lot
+of effort when standing at the finish line).
+
+The other commands, though, are passed back to the testbox. The testbox
+script will respond with an ACK or NACK as it sees fit. If NACKed, the
+pending command will be removed (pending_cmd set to none) and that's it.
+If ACKed, the state of the testbox will change to that appropriate for the
+command and the pending_cmd set to none. Should the testbox script fail to
+respond, the command will be repeated the next time it asks for work.
+
+
+
+#4 - Testbox Uploads Results During Test
+----------------------------------------
+
+
+TODO
+
+
+#5 - Testbox Completes Test and Asks For Work
+---------------------------------------------
+
+This is very similar to scenario #2.
+
+TODO
+
+
+#6 - Gang Gathering Timeout
+---------------------------
+
+This is a subfunction of scenario #2.
+
+When gathering a gang of testboxes for a testcase, we do not want to wait
+forever and have testboxes doing nothing for hours while waiting for partners.
+So, the gathering has a reasonable timeout (imagine something like 20-30 mins).
+
+Also, we need some way of dealing with 'abort' and 'reboot' commands being
+issued while waiting. The easy way out is to pretend it's a timeout.
+
+When changing the status to 'gang-timeout' we have to be careful. First of all,
+we need to exclusively lock the SchedQueues and TestBoxStatuses (in that order)
+and re-query our status. If it changed, redo the checks in scenario #2 point 2.
+
+If we still want to timeout/abort, change the state from 'gang-gathering' to
+'gang-gathering-timedout' on all the gang members that have gathered so far.
+Then reset the scheduling queue record and move it to the end of the queue.
+
+
+When acting on 'gang-timeout' the TM will fail the testset in a manner similar
+to scenario #9. No need to repeat that.
+
+
+
+#7 - Gang Cleanup
+-----------------
+
+When a testbox completes a gang scheduled test, we will have to serialize
+resource cleanup (both globally and on testboxes) as they stop. More details
+can be found in the documentation of 'gang-cleanup'.
+
+So, the transition from 'gang-testing' is always to 'gang-cleanup'. When we
+can safely leave 'gang-cleanup' is decided by the query::
+
+ SELECT COUNT(*)
+ FROM TestBoxStatuses,
+ TestSets
+ WHERE TestSets.idTestSetGangLeader = :idTestSetGangLeader
+ AND TestSets.idTestBox = TestBoxStatuses.idTestBox
+ AND TestBoxStatuses.enmState = 'gang-running'::TestBoxState_T;
+
+As long as there are testboxes still running, we stay in the 'gang-cleanup'
+state. Once there are none, we continue closing the testset and such.
+
+
+
+#8 - Testbox Reports A Crash During Test Execution
+--------------------------------------------------
+
+TODO
+
+
+#9 - Cleaning Up Abandoned Testcase
+-----------------------------------
+
+This is a subfunction of scenario #1 and #2. The actions taken are the same in
+both situations. The precondition for taking this path is that the row in the
+TestBoxStatuses table is referring to a test set (i.e. idTestSet is not NULL).
+
+
+Actions:
+
+1. If the testset is incomplete, we need to complete it:
+ a) Add a message to the root TestResults row, creating one if necessary,
+ that explains that the test was abandoned. This is done
+ by inserting/finding the string into/in TestResultStrTab and adding
+ a row to TestResultMsgs with idStrMsg set to that string id and
+ enmLevel set to 'failure'.
+ b) Mark the testset as failed.
+
+2. Free any global resources referenced by the test set. This is done by
+ deleting all rows in GlobalResourceStatuses matching the testbox id.
+
+3. Set the idTestSet to NULL in the TestBoxStatuses row.
+
+
+
+#10 - Cleaning Up a Disabled/Dead TestBox
+-----------------------------------------
+
+The UI needs to be able to clean up the remains of a testbox which for some
+reason is out of action. Normal cleaning up of abandoned testcases requires
+that the testbox signs on or asks for work, but if the testbox is dead or
+in some way indisposed, it won't be doing any of that. So, the testbox
+sheriff needs to have a way of cleaning up after it.
+
+It's basically a manual scenario #9 but with some safeguards, like checking
+that the box hasn't been active for the last 1-2 mins (max idle/wait time * 2).
+
+
+Note! When disabling a box that is still executing the testbox script, this
+ cleanup isn't necessary as it will happen automatically. Also, it's
+ probably desirable that the testbox finishes whatever it is doing first
+ before going dormant.
+
+
+
+Test Manager: Analysis
+=======================
+
+One of the testbox sheriff's tasks is to try to figure out the reason why
+something failed. The test manager will provide facilities for doing so from
+very early in its implementation.
+
+
+We need to work out some useful status reports for the early implementation.
+Later there will be more advanced analysis tools, where for instance we can
+create graphs from selected test result values or test execution times.
+
+
+
+Implementation Plan
+===================
+
+This has changed for various reasons. The current plan is to implement the
+infrastructure (TM & testbox script) first and do a small deployment with the
+2-5 test drivers in the Testsuite as a basis. Once the bugs are worked out, we
+will convert the rest of the tests and start adding new ones.
+
+We just need to finally get this done, no point in doing it piecemeal by now!
+
+
+Test Manager Implementation Sub-Tasks
+-------------------------------------
+
+The implementation of the test manager and adjusting/completing of the testbox
+script and the test drivers are tasks which can be done by more than one
+person. Splitting up the TM implementation into smaller tasks should allow
+parallel development of different tasks and get us working code sooner.
+
+
+Milestone #1
+------------
+
+The goal is to get the fundamental test manager engine implemented, debugged
+and working. With the exception of testboxes, the configuration will be done
+via SQL inserts.
+
+Tasks in somewhat prioritized order:
+
+ - Kick off test manager. It will live in testmanager/. Salvage as much as
+ possible from att/testserv. Create basic source and file layout.
+
+ - Adjust the testbox script, part one. There currently is a testbox script
+ in att/testbox; this shall be moved up into testboxscript/. The script
+ needs to be adjusted according to the specification laid down earlier
+ in this document. Installers or installation scripts for all relevant
+ host OSes are required. Left for part two is result reporting beyond the
+ primary log. This task must be 100% feature complete, on all host OSes;
+ there is no room for FIXME, XXX or @todo here.
+
+ - Implement the schedule queue generator.
+
+ - Implement the testbox dispatcher in TM. Support all the testbox script
+ responses implemented above, including upgrading the testbox script.
+
+ - Implement simple testbox management page.
+
+ - Implement some basic activity and result reports so that we can see
+ what's going on.
+
+ - Create a testmanager / testbox test setup. This lives in selftest/.
+
+ 1. Set up something that runs, no fiddly bits. Debug till it works.
+ 2. Create a setup that tests testgroup dependencies, i.e. real tests
+ depending on smoke tests.
+ 3. Create a setup that exercises testcase dependency.
+ 4. Create a setup that exercises global resource allocation.
+ 5. Create a setup that exercises gang scheduling.
+
+ - Check that all features work.
+
+
+Milestone #2
+------------
+
+The goal is getting to VBox testing.
+
+Tasks in somewhat prioritized order:
+
+ - Implement full result reporting in the testbox script and testbox driver.
+ A testbox script specific reporter needs to be implemented for the
+ testdriver framework. The testbox script needs to forward the results to
+ the test manager, or alternatively the testdriver reporter can talk
+ directly to the TM.
+
+ - Implement the test manager side of the test result reporting.
+
+ - Extend the selftest with some setup that report all kinds of test
+ results.
+
+ - Implement script/whatever feeding builds to the test manager from the
+ tinderboxes.
+
+ - The toplevel test driver is a VBox thing that must be derived from the
+ base TestDriver class or maybe the VBox one. It should move from
+ toptestdriver to testdriver and be renamed to vboxtltd or smth.
+
+ - Create a vbox testdriver that boots the t-xppro VM once and that's it.
+
+ - Create a selftest setup which tests booting t-xppro taking builds from
+ the tinderbox.
+
+
+Milestone #3
+------------
+
+The goal for this milestone is configuration and converting current testcases;
+the result will be a minimal test deployment (4-5 new testboxes).
+
+Tasks in somewhat prioritized order:
+
+ - Implement testcase configuration.
+
+ - Implement testgroup configuration.
+
+ - Implement build source configuration.
+
+ - Implement scheduling group configuration.
+
+ - Implement global resource configuration.
+
+ - Re-visit the testbox configuration.
+
+ - Black listing of builds.
+
+ - Implement simple failure analysis and reporting.
+
+ - Implement the initial smoke tests modelled on the current smoke tests.
+
+ - Implement installation tests for Windows guests.
+
+ - Implement installation tests for Linux guests.
+
+ - Implement installation tests for Solaris guests.
+
+ - Implement installation tests for OS/2 guests.
+
+ - Set up a small test deployment.
+
+
+Further work
+------------
+
+After milestone #3 has been reached and issues found by the other team members
+have been addressed, we will probably go for full deployment.
+
+Beyond this point we will need to improve reporting and analysis. There may be
+configuration aspects needing reporting as well.
+
+Once deployed, a golden rule will be that all new features shall have test
+coverage. Preferably, implemented by someone else and prior to the feature
+implementation.
+
+
+
+
+Discussion Logs
+===============
+
+2009-07-21,22,23 Various Discussions with Michal and/or Klaus
+-------------------------------------------------------------
+
+- Scheduling of tests requiring more than one testbox.
+- Scheduling of tests that cannot be executing concurrently on several machines
+ because of some global resource like an iSCSI target.
+- Manually create the test config permutations instead of having the test
+ manager create all possible ones and wasting time.
+- Distinguish between build types so we can run smoke tests on strict builds as
+ well as release ones.
+
+
+2009-07-20 Brief Discussion with Michal
+----------------------------------------
+
+- Installer for the testbox script to make bringing up a new testbox even
+ smoother.
+
+
+2009-07-16 Raw Input
+--------------------
+
+- test set. recursive collection of:
+ - hierarchical subtest name (slash sep)
+ - test parameters / config
+ - bool fail/succ
+ - attributes (typed?)
+ - test time
+ - e.g. throughput
+ - subresults
+ - log
+ - screenshots,....
+
+- client package (zip) dl from server (maybe client caching)
+
+
+- thoughts on bits to do at once.
+ - We *really* need the basic bits ASAP.
+ - client -> support for test driver
+ - server -> controls configs
+ - cleanup on both sides
+
+
+2009-07-15 Raw Input
+--------------------
+
+- testing should start automatically
+- switching to branch too tedious
+- useful to be able to partition testboxes (run specific builds on some boxes, let an engineer have a few boxes for a while).
+- test specification needs to be more flexible (select tests, disable test, test scheduling (run certain tests nightly), ... )
+- testcase dependencies (blacklisting builds, run smoketests on box A before long tests on box B, ...)
+- more testing flexibility, more tests than just install/smoke. For instance unit tests, benchmarks, ...
+- presentation/analysis: graphs!, categorize bugs, columns reorganizing grouped by test (hierarchical), overviews, result for last day.
+- testcase specification, variables (e.g. I/O-APIC, SMP, HWVIRT, SATA...) as sub-tests
+- interaction with ILOM/...: reset systems
+- Changes needs LDAP authentication
+- historize all configuration w/ name
+- ability to run testcase locally (provided the VDI/ISO/whatever extra requirements can be met).
+
+
+-----
+
+.. [1] no such footnote
+
+-----
+
+:Status: $Id: AutomaticTestingRevamp.txt $
+:Copyright: Copyright (C) 2010-2017 Oracle Corporation.
+