To test the new design after making changes, use the Unit-tests. During development, it is useful to run only selected tests, but remember that all tests must pass before the change is proposed for review.
In each case, ensure that your local packages are up to date and rebase your
local development branch against master if git pull fetches new commits. If
your branch needed to be updated, always build and install your local packages.
$ ./manage.py test
lava-server has several components, see the contents of ci-run for the
full list. Each component can be tested separately:
$ ./manage.py test lava_scheduler_app
To run particular tests in a specific file, add e.g. test_device.py to the
command:
$ ./manage.py test lava_scheduler_app.tests.test_device
Note
The tests directory needs to be specified (instead of the test
process discovering all tests), but the filename in the tests directory
is given without the .py suffix.
Add the class name to run all tests within that class in the specified file:
$ ./manage.py test lava_scheduler_app.tests.test_device.DeviceTypeTest
Add a specific test function to run only that one unit test:
$ ./manage.py test lava_scheduler_app.tests.test_device.DeviceTypeTest.test_device_type_templates
Useful options to ./manage.py test include -v2 to follow what is
being done and --noinput to automatically remove a database created by a previous
run of the unit tests which did not complete properly.
Tests on the Jinja2 templates can be run using ./ci-run -t.
All jinja2 templates in lava_scheduler_app/tests/device-types/ will be
tested using a basic check in
lava_scheduler_app.tests.test_base_templates.TestBaseTemplates.test_all_templates
for YAML syntax. This renders the template without a device dictionary and
checks that the output is valid YAML. This test will fail with syntax errors
in variables, jinja2 blocks, inheritance and whitespace indent errors.
$ python3 -m unittest -vcf lava_scheduler_app.tests.test_base_templates.TestBaseTemplates.test_all_templates
Add a new unit test to the TestTemplates class in the same unit test
file when any jinja2 template fails to parse. Change the DEBUG setting
to True to see the rendered output and use the Online YAML Parser to identify the
problems with the YAML output. Once the basic validation passes, add an
initial device dictionary, following the examples and identify specific
values in the output which can be asserted.
See also
Compare the output of the template with a realistic device dictionary, using the unit test, with the test YAML used when developing the underlying support. If this is a fix for an existing template, you can generate the relevant output on the master branch and verify against the changes in the local branch.
Completed templates need to be installed into
/etc/lava-server/dispatcher-config/device-types/ before testjobs can be
submitted through XML-RPC. The lava-master daemon does not need to
be restarted; the next submitted testjob will use the modified template(s).
$ python3 -m unittest discover lava_dispatcher/
To run a single test, use the test class name as output by a failing test,
without the call to discover:
$ python3 -m unittest lava_dispatcher.tests.test_basic.TestPipelineInit.test_pipeline_init
$ python3 -m unittest -v -c -f lava_dispatcher.tests.test_basic.TestPipelineInit.test_pipeline_init
The call references the path to the python module, the class and then the test function within that class. To run all tests in a class, omit the function. To run all tests in a file, omit the class and the function.
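The way a dotted name narrows the selection can be sketched with the standard library loader. This is only an illustration: the class TestPipelineSketch and the module object sketch_module are hypothetical stand-ins and are not part of lava_dispatcher.

```python
import types
import unittest


class TestPipelineSketch(unittest.TestCase):
    """Hypothetical test class standing in for a real dispatcher test."""

    def test_first(self):
        self.assertTrue(True)

    def test_second(self):
        self.assertEqual(1 + 1, 2)


# Hypothetical module object, so that names resolve in the same way as
# "package.module.Class.function" does on the command line.
sketch_module = types.ModuleType("sketch_module")
sketch_module.TestPipelineSketch = TestPipelineSketch

loader = unittest.TestLoader()

# Class and function given: exactly one test is selected.
single = loader.loadTestsFromName(
    "TestPipelineSketch.test_first", module=sketch_module
)

# Function omitted: every test in the class is selected.
whole_class = loader.loadTestsFromName("TestPipelineSketch", module=sketch_module)

# Class omitted as well: every test in the module is selected.
whole_module = loader.loadTestsFromModule(sketch_module)

print(single.countTestCases(), whole_class.countTestCases())
```

Each returned suite can be passed to unittest.TextTestRunner().run() in the usual way.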
See also
Unit-tests for information on running the full set of
unit tests on lava-server and lava-dispatcher.
The structure of any one job will be the same each time it is run (subject to
changes in the developing codebase). Each different job will have a different
pipeline structure. Do not rely on any of the pipeline levels having any specific
labels. When writing unit tests, only use checks based on isinstance or
self.name. (The description and summary fields are subject to change to
make the validation output easier to understand whereas self.name is a
strict class-based label.)
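A minimal sketch of a check based only on isinstance and the name attribute follows. The Action classes and the flattened list of actions are hypothetical simplifications; a real test would build the pipeline from a sample job.

```python
import unittest


class Action:
    """Hypothetical minimal stand-in for a dispatcher Action."""

    name = "action"
    description = "free text, subject to change"


class DownloadAction(Action):
    name = "download_action"
    description = "download with retry"


class MountAction(Action):
    name = "mount_action"
    description = "mount with offset"


class TestPipelineContents(unittest.TestCase):
    def setUp(self):
        # Hypothetical flattened pipeline, standing in for a parsed job.
        self.actions = [DownloadAction(), MountAction()]

    def test_expected_actions(self):
        # Check by class and by name, never by level, description or summary.
        self.assertTrue(
            any(isinstance(action, DownloadAction) for action in self.actions)
        )
        self.assertIn("mount_action", [action.name for action in self.actions])


result = unittest.TextTestRunner(verbosity=0).run(
    unittest.TestLoader().loadTestsFromTestCase(TestPipelineContents)
)
```

Because the assertions never mention a level such as 1.3, the test keeps passing if another action is later inserted ahead of the mount.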
(Actual output is subject to frequent change.)
!!python/object/apply:collections.OrderedDict
- - - device
   - parameters:
       actions:
         boot:
           prompts: ['linaro-test', 'root@debian:~#']
           command:
             amd64: {qemu_binary: qemu-system-x86_64}
           methods: [qemu]
           overrides: [boot_cmds, qemu_options]
           parameters:
             boot_cmds:
             - {root: /dev/sda1}
             - {console: 'ttyS0,115200'}
             machine: accel=kvm:tcg
             net: ['nic,model=virtio', user]
             qemu_options: [-nographic]
         deploy:
           methods: [image]
       architecture: amd64
       device_type: kvm
       hostname: kvm01
       memory: 512
       root_part: 1
 - - job
   - parameters: {action_timeout: 5m, device_type: kvm, job_name: kvm-pipeline, job_timeout: 15m,
       output_dir: /tmp/codehelp, priority: medium, target: kvm01, yaml_line: 3}
 - - '1'
   - content:
       description: deploy image using loopback mounts
       level: '1'
       name: deployimage
       parameters:
         deployment_data: &id001 {TESTER_PS1: 'linaro-test [rc=$(echo \$?)]# ', TESTER_PS1_INCLUDES_RC: true,
           TESTER_PS1_PATTERN: 'linaro-test \[rc=(\d+)\]# ', boot_cmds: boot_cmds,
           distro: debian, lava_test_dir: /lava-%s, lava_test_results_dir: /lava-%s,
           lava_test_results_part_attr: root_part, lava_test_sh_cmd: /bin/bash}
       summary: deploy image
       valid: true
       yaml_line: 12
     description: deploy image using loopback mounts
     summary: deploy image
 - - '1.1'
   - content:
       description: download with retry
       level: '1.1'
       max_retries: 5
       name: download_action
       parameters:
         deployment_data: *id001
       sleep: 1
       summary: download-retry
       valid: true
     description: download with retry
     summary: download-retry
 - - '1.2'
   - content:
       description: md5sum, sha256sum and sha512sum
       level: '1.2'
       name: checksum_action
       parameters:
         deployment_data: *id001
       summary: checksum
       valid: true
     description: md5sum, sha256sum and sha512sum
     summary: checksum
 - - '1.3'
   - content:
       description: mount with offset
       level: '1.3'
       name: mount_action
       parameters:
         deployment_data: *id001
       summary: mount loop
       valid: true
     description: mount with offset
     summary: mount loop
 - - 1.3.1
   - content:
       description: calculate offset of the image
       level: 1.3.1
       name: offset_action
       parameters:
         deployment_data: *id001
       summary: offset calculation
       valid: true
     description: calculate offset of the image
     summary: offset calculation
 - - 1.3.2
   - content:
       description: ensure a loop back mount operation is possible
       level: 1.3.2
       name: loop_check
       parameters:
         deployment_data: *id001
       summary: check available loop back support
       valid: true
     description: ensure a loop back mount operation is possible
     summary: check available loop back support
 - - 1.3.3
   - content:
       description: Mount using a loopback device and offset
       level: 1.3.3
       max_retries: 5
       name: loop_mount
       parameters:
         deployment_data: *id001
       retries: 10
       sleep: 10
       summary: loopback mount
       valid: true
     description: Mount using a loopback device and offset
     summary: loopback mount
 - - '1.4'
   - content:
       description: customize image during deployment
       level: '1.4'
       name: customize
       parameters:
         deployment_data: *id001
       summary: customize image
       valid: true
     description: customize image during deployment
     summary: customize image
 - - '1.5'
   - content:
       description: load test definitions into image
       level: '1.5'
       name: test-definition
       parameters:
         deployment_data: *id001
       summary: loading test definitions
       valid: true
     description: load test definitions into image
     summary: loading test definitions
 - - 1.5.1
   - content:
       description: apply git repository of tests to the test image
       level: 1.5.1
       max_retries: 5
       name: git-repo-action
       parameters:
         deployment_data: *id001
       sleep: 1
       summary: clone git test repo
       uuid: b32dd5ff-fb80-44df-90fb-5fbd5ab35fe5
       valid: true
       vcs_binary: /usr/bin/git
     description: apply git repository of tests to the test image
     summary: clone git test repo
 - - 1.5.2
   - content:
       description: apply git repository of tests to the test image
       level: 1.5.2
       max_retries: 5
       name: git-repo-action
       parameters:
         deployment_data: *id001
       sleep: 1
       summary: clone git test repo
       uuid: 200e83ef-bb74-429e-89c1-05a64a609213
       valid: true
       vcs_binary: /usr/bin/git
     description: apply git repository of tests to the test image
     summary: clone git test repo
 - - 1.5.3
   - content:
       description: overlay test support files onto image
       level: 1.5.3
       name: test-overlay
       parameters:
         deployment_data: *id001
       summary: applying LAVA test overlay
       valid: true
     description: overlay test support files onto image
     summary: applying LAVA test overlay
 - - '1.6'
   - content:
       description: add lava scripts during deployment for test shell use
       lava_test_dir: /usr/lib/python3/dist-packages/lava_dispatcher/lava_test_shell
       level: '1.6'
       name: lava-overlay
       parameters:
         deployment_data: *id001
       runner_dirs: [bin, tests, results]
       summary: overlay the lava support scripts
       valid: true
       xmod: 493
     description: add lava scripts during deployment for test shell use
     summary: overlay the lava support scripts
 - - '1.7'
   - content:
       description: unmount the test image at end of deployment
       level: '1.7'
       max_retries: 5
       name: umount
       parameters:
         deployment_data: *id001
       sleep: 1
       summary: unmount image
       valid: true
     description: unmount the test image at end of deployment
     summary: unmount image
 - - '2'
   - content:
       description: boot image using QEMU command line
       level: '2'
       name: boot_qemu_image
       parameters:
         parameters: {failure_retry: 2, media: tmpfs, method: kvm, yaml_line: 22}
       summary: boot QEMU image
       timeout: {duration: 30, name: boot_qemu_image}
       valid: true
       yaml_line: 22
     description: boot image using QEMU command line
     summary: boot QEMU image
 - - '2.1'
   - content:
       description: Wait for a shell
       level: '2.1'
       name: expect-shell-connection
       parameters:
         parameters: {failure_retry: 2, media: tmpfs, method: kvm, yaml_line: 22}
       summary: Expect a shell prompt
       valid: true
     description: Wait for a shell
     summary: Expect a shell prompt
 - - '3'
   - content:
       level: '3'
       name: test
       parameters:
         parameters:
           definitions:
           - {from: git, name: smoke-tests, path: lava-test-shell/smoke-tests-basic.yaml,
             repository: 'git://git.linaro.org/lava-team/lava-functional-tests.git', yaml_line: 31}
           - {from: git, name: singlenode-basic, path: singlenode01.yaml, repository: 'git://git.linaro.org/people/neilwilliams/multinode-yaml.git',
             yaml_line: 39}
           failure_retry: 3
           name: kvm-basic-singlenode
           yaml_line: 27
       summary: test
       valid: true
     description: null
     summary: test
 - - '4'
   - content:
       level: '4'
       description: finish the process and cleanup
       name: finalize
       parameters:
         parameters: {}
       summary: finalize the job
       valid: true
     description: finish the process and cleanup
     summary: finalize the job
The hacks and workarounds in the old LavaTestShell classes may need to be marked and retained until such time as either the new model replaces the old or the bug can be fixed in both models. Whereas the submission schema, log file structure and result bundle schema have thrown away any backwards compatibility, LavaTestShell will need to at least attempt to retain compatibility while improving the overall design and integrating the test shell operations into the new classes.
Current possible issues include:
testdef.yaml is hardcoded into lava-test-runner when this could be a
parameter fed into the overlay from the VCS handlers.
For a RetryAction to validate, the RetryAction subclass must be a wrapper class around a new pipeline to allow the RetryAction.run() function to handle all of the retry functionality in one place.
An Action which needs to support failure_retry or which wants to use
RetryAction support internally, needs a new class added which derives from
RetryAction, sets a useful name, summary and description and defines a
populate() function which creates the pipeline. The Action with the
customized run() function then gets added to the pipeline of the
RetryAction subclass - without changing the inheritance of the original Action.
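The wrapper arrangement described above might look like the following sketch. All of the classes here are simplified, hypothetical stand-ins (including the internal_pipeline attribute, the max_retries value and the FlakyDownload action); the real dispatcher classes carry far more state.

```python
class JobError(Exception):
    """Hypothetical stand-in for the dispatcher's JobError."""


class Action:
    name = "action"
    summary = ""
    description = ""

    def run(self, connection):
        return connection


class Pipeline:
    def __init__(self):
        self.actions = []

    def add_action(self, action):
        self.actions.append(action)

    def run_actions(self, connection):
        for action in self.actions:
            connection = action.run(connection)
        return connection


class RetryAction(Action):
    max_retries = 3  # assumed value for the sketch

    def __init__(self):
        self.internal_pipeline = Pipeline()
        self.populate()

    def populate(self):
        raise NotImplementedError

    def run(self, connection):
        # All of the retry handling lives here, in one place.
        last_error = None
        for _attempt in range(self.max_retries):
            try:
                return self.internal_pipeline.run_actions(connection)
            except JobError as exc:
                last_error = exc
        message = "%s failed after %d attempts" % (self.name, self.max_retries)
        raise JobError(message) from last_error


class FlakyDownload(Action):
    """Action with a customized run(); its inheritance is left unchanged."""

    name = "flaky-download"

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def run(self, connection):
        self.calls += 1
        if self.calls <= self.failures:
            raise JobError("download failed")
        return connection


class DownloadRetry(RetryAction):
    name = "download-retry"
    summary = "download with retry"
    description = "wrap the download in a retry loop"

    def __init__(self, failures=0):
        self.download = FlakyDownload(failures)
        super().__init__()

    def populate(self):
        # The original Action is added to the pipeline of the wrapper.
        self.internal_pipeline.add_action(self.download)
```

In this sketch, DownloadRetry(failures=2).run(connection) succeeds on the third attempt, while a download which keeps failing ends with a single JobError raised from the wrapper.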
To add Diagnostics, add subclasses of DiagnosticAction to the list of supported Diagnostic classes in the Job class. Each subclass must define a trigger classmethod which is unique across all Diagnostic subclasses. (The trigger string is used as an index in a generator hash of classes.) Trigger strings are only used inside the Diagnostic class. If an Action catches a JobError or InfrastructureError exception and wants to allow a specific Diagnostic class to run, import the relevant Diagnostic subclass and add the trigger to the current job inside the exception handling of the Action:
try:
  self._run_command(cmd_list)
except JobError as exc:
  self.job.triggers.append(DiagnoseNetwork.trigger())
  raise JobError(exc)
return connection
Actions should only append triggers which are relevant to the JobError or InfrastructureError exception about to be raised inside an Action.run() function. Multiple triggers can be appended to a single exception. The exception itself is still raised (so that a RetryAction container will still operate).
Hint
A DownloadAction which fails to download a file could
append a DiagnosticAction class which runs ifconfig or
route just before raising a JobError containing the
404 message.
If the error to be diagnosed does not raise an exception, append the trigger in a conditional block and emit a JobError or InfrastructureError exception with a useful message.
Do not clear failed results of previous attempts when running a Diagnostic class - the fact that a Diagnostic was required is an indication that the job had some kind of problem.
Avoid overloading common Action classes with Diagnostics. Instead, add a new Action subclass and change specific Strategy classes (Deployment, Boot, Test) to use the new Action.
Avoid chaining Diagnostic classes - if a Diagnostic requires a command to exist, it must check that the command does exist. Raise a RuntimeError if a Strategy class leads to a Diagnostic failing to execute.
It is an error to add a Diagnostic class to any Pipeline. Pipeline Actions should be restricted to classes which have an effect on the Test itself, not simply reporting information.
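The trigger mechanism can be sketched as below. This is a simplified illustration under stated assumptions: the Job class, its diagnostics dictionary, the run_diagnostics() helper and the download() function are hypothetical, and only the DiagnoseNetwork name and the trigger classmethod come from the text above.

```python
class JobError(Exception):
    """Hypothetical stand-in for the dispatcher's JobError."""


class DiagnosticAction:
    @classmethod
    def trigger(cls):
        raise NotImplementedError

    def run(self):
        raise NotImplementedError


class DiagnoseNetwork(DiagnosticAction):
    @classmethod
    def trigger(cls):
        # Must be unique across all Diagnostic subclasses.
        return "network"

    def run(self):
        # A real class might run ifconfig or route here.
        return "would report interface and routing state"


class Job:
    def __init__(self):
        self.triggers = []
        # Supported Diagnostic classes, indexed by their trigger string.
        self.diagnostics = {cls.trigger(): cls for cls in (DiagnoseNetwork,)}

    def run_diagnostics(self):
        # Diagnostics are run by the job, never added to a Pipeline.
        return [
            self.diagnostics[name]().run()
            for name in self.triggers
            if name in self.diagnostics
        ]


def download(job, succeed):
    """Hypothetical body of an Action.run() which can fail without an exception."""
    if not succeed:
        # Append the relevant trigger, then raise with a useful message.
        job.triggers.append(DiagnoseNetwork.trigger())
        raise JobError("404 Not Found")
    return "file"
```

The exception still propagates to the caller, so a surrounding RetryAction would continue to operate as usual.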
Sometimes, a particular test image will support the expected command but a subsequent image would need an alternative. Generally, the expectation is that the initial command should work, therefore the fallback or helper action should not be needed. The refactoring offers support for this situation using Adjuvants.
An Adjuvant is a helper action which exists in the normal pipeline but which is normally skipped, unless the preceding Action sets a key in the PipelineContext that the adjuvant is required. A successful operation of the adjuvant clears the key in the context.
One example is the reboot command. Normal user expectation is that a
reboot command as root will successfully reboot the device but LAVA needs
to be sure that a reboot actually does occur, so usually uses a hard reset PDU
command after a timeout. The refactoring allows LAVA to distinguish between a
job where the soft reboot worked and a job where the PDU command became
necessary, without causing the test itself to fail simply because the job
didn’t use a hard reset.
If the ResetDevice Action determines that a reboot happened (by matching a pexpect on the bootloader initialization), then nothing happens and the Adjuvant action (in this case, HardResetDevice) is marked in the results as skipped. If the soft reboot fails, the ResetDevice Action marks this result as failed but also sets a key in the PipelineContext so that the HardResetDevice action then executes.
Unlike Diagnostics, Adjuvants are an integral part of the pipeline and show up in the verification output and the results, whether executed or not. An Adjuvant is not a simple retry, it is a different action, typically a more aggressive or forced action. In an ideal world, the adjuvant would never be required.
A similar situation exists with firmware upgrades. In this case, the adjuvant is skipped if the firmware does not need upgrading. The preceding Action would not be set as a failure in this situation but LAVA would still be able to identify which jobs updated the firmware and which did not.
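The skip-unless-required behavior can be sketched as follows. The context is modelled as a plain dictionary and the key name, the results dictionary and the run_pipeline() helper are assumptions made for the illustration; only the ResetDevice and HardResetDevice names come from the text above.

```python
class Adjuvant:
    """Hypothetical helper action: runs only when its key is set in the context."""

    name = "adjuvant"

    @classmethod
    def key(cls):
        raise NotImplementedError

    def do_work(self):
        raise NotImplementedError

    def run(self, context, results):
        if not context.get(self.key()):
            # Still part of the pipeline and the results, but skipped.
            results[self.name] = "skipped"
            return
        self.do_work()
        context[self.key()] = False  # successful operation clears the key
        results[self.name] = "pass"


class HardResetDevice(Adjuvant):
    name = "hard-reset-device"

    @classmethod
    def key(cls):
        return "hard-reset-required"  # assumed key name

    def do_work(self):
        pass  # a real action would call the PDU command here


class ResetDevice:
    name = "reset-device"

    def __init__(self, soft_reboot_works):
        self.soft_reboot_works = soft_reboot_works

    def run(self, context, results):
        if self.soft_reboot_works:
            results[self.name] = "pass"
        else:
            # Mark this result as failed and request the adjuvant.
            results[self.name] = "fail"
            context[HardResetDevice.key()] = True


def run_pipeline(soft_reboot_works):
    context, results = {}, {}
    for action in (ResetDevice(soft_reboot_works), HardResetDevice()):
        action.run(context, results)
    return context, results
```

Both outcomes leave an entry for hard-reset-device in the results, which is what allows jobs where the soft reboot worked to be told apart from jobs which needed the PDU.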
Most deployment Action classes run without needing a Connection. Once a Connection is established, the Action may need to run commands over that Connection. At this point, the Action delegates the maintenance of the run function to the Connection pexpect, i.e. the Action.run() is blocked, waiting for Connection.run_command() (or similar) to return, and the Connection needs to handle timeouts, signals and other interaction over the connection. This role is taken on by the internal SignalDirector within each Connection. Unlike the old model, Connections have their own directors, which takes the multinode and LMP workload out of the singlenode operations.
Some devices need a sequence of commands to change power state, some may
require a sleep or similar delay. The power commands available in the
device dictionary support two uses:
This is the simplest form and is recommended for the majority of devices.
{% set hard_reset_command = '/usr/bin/pduclient --daemon tweetypie --hostname pdu --command reboot --port 08' %}
It can be useful to have a short list of simple commands, e.g. during device integration. In the final file used in the device dictionary, the entire list must be on a single line.
{% set hard_reset_command = ['/usr/local/lab-scripts/snmp_pdu_control --hostname pdu14 --command reboot --port 5 --delay 20', '/usr/local/lab-scripts/eth008_control -a 10.0.9.2 -r 3 -s onoff'] %}
Note
Extending the list support to more than a simple list of sequential
commands is not supported and there is also no support for shell
operators like && or ||. Any device which needs something more
complex must have custom scripts made available on the worker which
can do all the conditionals and logic. A script will also make the device
dictionary more readable.
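As an illustration of the two supported forms, the sketch below normalizes either a single string or a list of strings and runs the commands strictly in sequence without a shell. It is not the dispatcher's own implementation; the function names and the runner parameter are hypothetical.

```python
import shlex
import subprocess


def as_command_list(value):
    """Accept either a single command string or a list of command strings."""
    if isinstance(value, str):
        return [value]
    return list(value)


def run_power_commands(value, runner=subprocess.run):
    """Run each command in order, without a shell, stopping at the first failure."""
    for command in as_command_list(value):
        # shlex.split produces an argv list; shell operators such as && or ||
        # would be passed through as literal arguments, not interpreted.
        runner(shlex.split(command), check=True)
```

With check=True, subprocess.run raises CalledProcessError on a non-zero exit, so later commands in the list are not attempted. Anything needing conditionals belongs in a script on the worker, as the note above says.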
Construct your pipeline to use Actions in the order:
Note
There may be several Retry actions necessary within these steps.
So, for a U-Boot operation, this results in a pipeline like:
Hit any key prompt in a new connection
Typically, a Connection is started by an Action within the Pipeline. The call to start a Connection must not return until all operations on that Connection are complete or the Pipeline determines that the Connection needs to be terminated.
The refactored dispatcher has a different approach to logging:
target.self.logger.<LEVEL> in an action.
Actual representation of the logs in the UI will change - these examples are the raw content of the output YAML.
- {debug: 'start: 1.4.2.3.7 test-install-overlay (max 300s)', ts: '2015-09-07T09:40:46.720450'}
- {debug: 'test-install-overlay duration: 0.02', ts: '2015-09-07T09:40:46.746036'}
- results:
    test-install-overlay: !!python/object/apply:collections.OrderedDict
    - - [success, a9b2300d-0864-4f9c-ba78-c2594b567fc5]
      - [skipped, a9b2300d-0864-4f9c-ba78-c2594b567fc5]
      - [duration, 0.024679899215698242]
      - [timeout, 300.0]
      - [level, 1.4.2.3.7]
- {debug: 'Received signal: <STARTTC> linux-linaro-ubuntu-pwd'}
- {target: ''}
- {target: ''}
- {target: ''}
- {target: ''}
- {debug: 'test shell timeout: 300 seconds'}
- {target: ''}
- {target: /lava-None/tests/0_smoke-tests}
- {target: <LAVA_SIGNAL_ENDTC linux-linaro-ubuntu-pwd>}
- {target: <LAVA_SIGNAL_TESTCASE TEST_CASE_ID=linux-linaro-ubuntu-pwd RESULT=pass>}
- {target: <LAVA_SIGNAL_STARTTC linux-linaro-ubuntu-uname>}
- {target: ''}
- {debug: 'Received signal: <ENDTC> linux-linaro-ubuntu-pwd'}
- {target: ''}
- {target: ''}
- {target: ''}
- {target: ''}
- {debug: 'test shell timeout: 300 seconds'}
- {debug: 'Received signal: <TESTCASE> TEST_CASE_ID=linux-linaro-ubuntu-pwd RESULT=pass'}
- {debug: 'res: {''test_case_id'': ''linux-linaro-ubuntu-pwd'', ''result'': ''pass''}
    data: {''test_case_id'': ''linux-linaro-ubuntu-pwd'', ''result'': ''pass''}'}
- results: {linux-linaro-ubuntu-pwd: pass, testsuite: smoke-tests-basic}
- {info: 'ok: lava_test_shell seems to have completed'}
- debug: {curl-http: pass, direct-install: pass, direct-update: pass, linux-linaro-ubuntu-ifconfig: pass,
    linux-linaro-ubuntu-ifconfig-dump: pass, linux-linaro-ubuntu-lsb_release: fail,
    linux-linaro-ubuntu-lscpu: pass, linux-linaro-ubuntu-netstat: pass, linux-linaro-ubuntu-pwd: pass,
    linux-linaro-ubuntu-route-dump-a: pass, linux-linaro-ubuntu-route-dump-b: pass,
    linux-linaro-ubuntu-route-ifconfig-up: pass, linux-linaro-ubuntu-route-ifconfig-up-lo: pass,
    linux-linaro-ubuntu-uname: pass, linux-linaro-ubuntu-vmstat: pass, ping-test: pass,
    remove-tgz: pass, tar-tgz: pass}
- {debug: 'lava-test-shell duration: 26.88', ts: '2015-09-07T09:43:14.065956'}
Pipeline jobs are sent to the worker dispatcher over HTTP as fully formatted YAML files but are then deleted when the test job ends.
Equivalent files can be prepared using the lava-server manage
device-dictionary review option to output the device configuration YAML.
To re-run the job on the worker, pass this configuration as the --target
option to lava-dispatch and specify a temporary --output-dir and the
test job definition.
Note
MultiNode test jobs produce a specific test job for each node in the
group. The original MultiNode definition cannot be executed by
lava-dispatch on the command line and the job definition for a single
node within a MultiNode group will also need editing before it can be run
without reference to the other nodes.
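A small sketch of assembling such a command line follows. The --target and --output-dir options are those named above; the position of the job definition as the final argument and the file names kvm01.yaml and job.yaml are assumptions made for the illustration.

```python
import shlex
import tempfile


def build_dispatch_command(device_yaml, job_yaml, output_dir=None):
    """Assemble an argv list for lava-dispatch (argument order is an assumption)."""
    if output_dir is None:
        # Create a temporary directory when the caller does not supply one.
        output_dir = tempfile.mkdtemp(prefix="lava-rerun-")
    return [
        "lava-dispatch",
        "--target", device_yaml,
        "--output-dir", output_dir,
        job_yaml,
    ]


command = build_dispatch_command("kvm01.yaml", "job.yaml", output_dir="/tmp/rerun")
print(" ".join(shlex.quote(part) for part in command))
```

Check lava-dispatch --help on the worker before relying on the argument order shown here.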
See also Mapping deployment actions to the python code:
The expectation is that new tasks for the dispatcher will be created by adding more specialist Actions and organizing the existing Action classes into a new pipeline for the new task.
Adding new behavior is a two step process:
A Strategy class may use conditionals to select between a number of top level
Strategy Action classes, for example DeployImageAction is a top level
Strategy Action class for the DeployImage strategy. If used, this conditional
must only operate on job parameters and the device as the selection
function is a classmethod.
A test Job will consist of multiple strategies, one for each of the listed
actions in the YAML file. Typically, this may include a Deployment strategy,
a Boot strategy and a Test strategy. Jobs can have multiple deployment, boot,
or test actions. Strategies add top level Actions to the main pipeline in the
order specified by the parser. For the parser to select the new strategy, the
strategies.py module for the relevant type of action needs to import the
new subclass. There should be no need to modify the parser itself.
A single top level Strategy Action implements a single strategy for the outer Pipeline. The use of Logical actions can provide sufficient complexity without adding conditionals to a single top level Strategy Action class. Image deployment actions will typically include a conditional to check if a Test action is required later so that the test definitions can be added to the overlay during deployment.
Re-use existing Action classes wherever these can be used without changes.
If two or more Action classes have very similar behavior, re-factor to make a new base class for the common behavior and retain the specialized classes.
Strategy selection via select() must only ever rely on the device and the job parameters. Add new parameters to the job to distinguish strategies, e.g. the boot method or deployment method.
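Selection which depends only on the device and the job parameters might be sketched as below. The class names other than DeployImage, the "to" parameter and the exact layout of the device dictionary are assumptions made for the illustration, not the dispatcher's actual API.

```python
class Deployment:
    """Hypothetical base class for deployment strategies."""

    @classmethod
    def accepts(cls, device, parameters):
        return False

    @classmethod
    def select(cls, device, parameters):
        # Only the device and the job parameters are consulted.
        candidates = [
            strategy
            for strategy in cls.__subclasses__()
            if strategy.accepts(device, parameters)
        ]
        if len(candidates) != 1:
            raise NotImplementedError(
                "expected exactly one strategy, found %d" % len(candidates)
            )
        return candidates[0]


class DeployImage(Deployment):
    @classmethod
    def accepts(cls, device, parameters):
        methods = device["actions"]["deploy"]["methods"]
        return parameters.get("to") == "tmpfs" and "image" in methods


class DeployTftp(Deployment):
    @classmethod
    def accepts(cls, device, parameters):
        methods = device["actions"]["deploy"]["methods"]
        return parameters.get("to") == "tftp" and "tftp" in methods


device = {"actions": {"deploy": {"methods": ["image"]}}}
selected = Deployment.select(device, {"to": "tmpfs"})
print(selected.__name__)
```

Because each accepts() tests a distinct job parameter, adding a new strategy does not change which existing strategy is chosen for existing jobs.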
populate()
run() function which calls run_actions on the internal pipeline.
Ensure that the accepts routine can uniquely identify this strategy
without interfering with other strategies. (Always add unit tests for new classes.)
Wherever a new class is added, that new class can be tested - if only to be
sure that it is correctly initialized and added to the pipeline at the correct
level. Always create a new file in the tests directory for new functionality.
All unit tests need to be in a file with the test_ prefix. Add a new
YAML file to the sample_jobs so that the strategies which select the new code can
be tested. See Basics of the YAML format.
Often the simplest way to understand the available parameters and how new statements in the device configuration or job submission show up inside the classes is to use a unit test. To run a single unit-test, for example test_function in a class called TestExtra in a file called test_extra.py, use:
$ python3 -m unittest -v -c -f lava_dispatcher.tests.test_extra.TestExtra.test_function
Example python code:
import unittest


class TestExtra(unittest.TestCase):  # pylint: disable=too-many-public-methods
    def test_function(self):
        # Placeholder body; print() is a function under python3.
        print("Hello world")
When using a connection to a device, group calls over that connection to calls which are expected to return within a consistent timeout for that class. If the final command from the class starts a longer running process, e.g. boot, set the connection prompt to look for a message which will be seen on that connection within a similar timeframe to all the other calls made by that class. This allows test writers to correctly choose the timeout to extend.
Add to the documentation when adding new classes which implement new dispatcher actions, parameters or behavior.
$ sudo apt install pylint
$ pylint -d line-too-long -d missing-docstring lava_dispatcher/
$ sudo apt install graphviz
$ pyreverse lava_dispatcher/
$ dot -Tpng classes_No_Name.dot > classes.png
(Actual images can be very large.)
$ sudo apt install python-meliae
Add this python snippet to a unit test or part of the code of interest:
from meliae import scanner
scanner.dump_all_objects('filename.json')
Once the test has run, the specified filename will exist. To analyze the results, start up a python interactive shell in the same directory:
$ python
>>> from meliae import loader
>>> om = loader.load('filename.json')
loaded line 64869, 64870 objs,   8.7 /   8.7 MiB read in 0.9s
checked    64869 /    64870 collapsed     5136
set parents    59733 /    59734
collapsed in 0.4s
>>> s = om.summarize(); s
Note
The python interpreter, the setup.py configuration and other
tools may allocate memory as part of the test, so the figures in the output
may be larger than it would seem for a small test. A basic test may give a
summary of 12MB, total size. Figures above 100MB should prompt a check on
what is using the extra memory.
Note
These provisions are under development and are likely to change substantially. For example, it may be possible to do a lot of these tasks using secondary media and secondary connections.
There are several situations where an environment needs to be set up in a contained and tested manner and then used for one or multiple LAVA test operations.
One solution is to use MultiNode and this works well when the device under test supports a secondary connection, e.g. ethernet.
MultiNode has requirements on a POSIX-type command line shell to be able to pass messages, e.g. busybox.
QEMU tests involve downloading a pre-built chroot based on a stable distribution release of a foreign architecture and running tests inside that chroot.
Android tests may involve setting up a VM or a configured chroot to expose USB devices while retaining the ability to use different versions of tools for different tests.