There has been some recent work in this area. It started with @chase-qi submitting the merge request "Add support for converting to lava test definitions" (!1) to OSTC / tools / oh-spread on GitLab, which let us see what this might look like and have a conversation about how to proceed.
I’ve discussed this merge request with Chase in a video call, and then together with @stevanr and Chase in our weekly Linaro sync. I don’t think we were immediately in agreement, mostly because I am stubborn about getting things right even if it takes somewhat longer.
Here’s what we, I hope, agreed to:
- The converter will retain spread semantics, so that developers can expect equivalent behaviour locally, when running spread directly, and in CI, where LAVA executes all the tests.
- The converter will support a subset of spread features. It will actively detect unsupported features and refuse to work if the project uses one.
- Projects opt into using LAVA by declaring a `lava` backend in their `spread.yaml`. The converter exports those jobs implicitly, as if it was invoked with `spread lava:`
The most contentious and complex aspect is the spread prepare/restore logic, and the flexible way it can be defined at nearly every level. There’s no direct equivalent in LAVA, and we debated whether we need to support that feature at all.
For some context, a spread project defines a set of tasks. Apart from the initial deployment logic, when the system is prepared, everything else is a sequence of tasks that execute in (random) order. Spread allows each task to define an execute script, which should be exactly what the developer wants to see tested, as well as prepare and restore scripts, which prepare the system to execute the task and restore it afterwards, respectively. There is no immediate equivalent in LAVA, where everything is just a flat sequence of steps.

In addition, spread has semantics that describe what happens when each of those scripts fails. If the prepare script fails, the execute script does not run and the restore script is started. If the execute script fails, the restore script is started. If the restore script fails, spread assumes the system is now corrupted and stops using it. There is also the debug script, which spread runs if it is defined and anything related to a task has failed. The debug script runs before the restore script.
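The failure rules above can be modelled in a few lines. This is only a sketch of the semantics as described in this post, not spread's actual code: `run_task`, `Outcome` and the callable-based `Script` type are hypothetical names, and running debug after a failed restore is not modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical stand-in for a script: returns True when the script succeeds.
Script = Callable[[], bool]


@dataclass
class Outcome:
    passed: bool          # prepare and execute both succeeded
    system_usable: bool   # False once restore has failed
    steps: List[str] = field(default_factory=list)


def run_task(prepare: Script, execute: Script, restore: Script,
             debug: Optional[Script] = None) -> Outcome:
    steps: List[str] = []

    def step(name: str, script: Script) -> bool:
        steps.append(name)
        return script()

    # Execute only runs when prepare succeeded.
    passed = step("prepare", prepare) and step("execute", execute)

    # Debug, when defined, runs after a failure and before restore.
    if not passed and debug is not None:
        step("debug", debug)

    # Restore always runs; if it fails the system is considered corrupted.
    system_usable = step("restore", restore)
    return Outcome(passed, system_usable, steps)


ok = lambda: True
fail = lambda: False
print(run_task(ok, fail, ok, debug=ok).steps)
# prints ['prepare', 'execute', 'debug', 'restore']
```

Any LAVA mapping has to preserve exactly this ordering for developers to see the same behaviour locally and in CI.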
The second problem with mapping that to LAVA concepts is that, for convenience, spread allows defining prepare-each and restore-each scripts at nearly every level of the project: from project-wide, to backend, to test suite. This lets a developer easily ensure that some piece of code runs before or after every task contained in a given sub-tree. The question is how to map that to LAVA, with its linear sequence of run steps.
My proposal is to do the following:
- Project prepare and restore are converted to synthetic tests, at the very start and very end of the test actions.
- Suite prepare and restore are similarly converted to synthetic tests. As we iterate over the project structure and visit subsequent test suites, we emit the relevant prepare and restore scripts around all the tasks contained in a given suite.
- Now for tasks, this is where the magic happens! We keep track of the current suite as we traverse and emit a LAVA test that contains, together: the entire aggregated prepare-each script, task prepare, task execute, task restore, the entire aggregated restore-each script. (I simplified this by ignoring debug scripts but they are not fundamentally any different).
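To make the aggregation concrete, here is a minimal sketch of how the per-task prepare and restore scripts might be assembled. The function names are hypothetical, and the reverse ordering on restore (task first, project last) is an assumption about how the levels unwind, not something stated above.

```python
from typing import List


def join_scripts(parts: List[str]) -> str:
    """Concatenate the non-empty scripts, one after another."""
    return "\n".join(part.strip() for part in parts if part.strip())


def aggregated_prepare(project_each: str, backend_each: str,
                       suite_each: str, task_prepare: str) -> str:
    # Outermost level first, the task's own prepare last.
    return join_scripts([project_each, backend_each, suite_each, task_prepare])


def aggregated_restore(project_each: str, backend_each: str,
                       suite_each: str, task_restore: str) -> str:
    # Assumption: restore unwinds in the opposite order of prepare.
    return join_scripts([task_restore, suite_each, backend_each, project_each])


print(aggregated_prepare("echo project", "", "echo suite", "echo task"))
```

Levels that define no `-each` script simply drop out, so a task in a bare suite ends up with only its own prepare and restore.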
We use shell if-then-else logic to capture the relation and behaviour of each spread task. It is roughly, in pseudo-code, as follows:
    set -e
    trap aggregated-task-restore.sh EXIT
    trap aggregated-task-debug.sh ERR
    aggregated-task-prepare.sh
    task-execute.sh
- We stop on any error in simple statements.
- On exit, we run restore.
- On error, we run debug.
- We run prepare.
- We run execute.

That’s it.
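As a sketch of how the converter might render that wrapper for one task, the snippet below inlines the scripts instead of calling separate files. `render_task_script` and the `task_restore` / `task_debug` function names are hypothetical. Two details are worth noting: `trap ... ERR` is a bash extension rather than POSIX sh, so the sketch assumes bash on the device, and prepare and execute are kept at the top level because bash does not inherit the ERR trap into shell functions unless `set -E` is used.

```python
def shell_function(name: str, body: str) -> str:
    # The leading ':' keeps the function valid when the body is empty.
    return "\n".join([f"{name}() {{", ":", body.strip("\n"), "}"])


def render_task_script(prepare: str, execute: str, restore: str,
                       debug: str = "") -> str:
    return "\n".join([
        "#!/bin/bash",
        "set -e",                      # stop on the first failing command
        shell_function("task_restore", restore),
        shell_function("task_debug", debug),
        "trap task_restore EXIT",      # restore runs whenever the script exits
        "trap task_debug ERR",         # debug runs when a command fails
        prepare.strip("\n"),           # aggregated prepare, at top level
        execute.strip("\n"),           # the task's execute script
    ]) + "\n"


print(render_task_script("echo prepare", "false", "echo restore", "echo debug"))
```

When a command fails under `set -e`, bash runs the ERR trap first and the EXIT trap afterwards, which gives the debug-before-restore ordering that spread has.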
All the scripts need to have correct environment variables or shell functions. There are three sources:
- shell functions MATCH, NOMATCH and REBOOT - they are constants that come from spread
- intrinsic spread variables that inform the task about the backend, suite and so on
- declared variables that come from the `environment:` section inside tasks, projects and suites.
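A rough sketch of how the three sources could be combined into a preamble placed ahead of every emitted script follows. Everything here is illustrative: `render_environment` is a hypothetical helper, the function definitions are left as a placeholder because the real ones come from spread, the variable names in the usage example are for illustration, and the override order is an assumption.

```python
import shlex
from typing import Dict

# Placeholder only: the real MATCH, NOMATCH and REBOOT definitions come from spread.
SPREAD_FUNCTIONS = "# MATCH, NOMATCH and REBOOT definitions copied from spread go here"


def render_environment(intrinsic: Dict[str, str], *declared: Dict[str, str]) -> str:
    merged: Dict[str, str] = {}
    # Assumption: later layers (project, then suite, then task) override earlier ones.
    for layer in declared:
        merged.update(layer)
    # Assumption: intrinsic variables are applied last so they cannot be shadowed.
    merged.update(intrinsic)
    exports = [f"export {name}={shlex.quote(value)}" for name, value in merged.items()]
    return "\n".join([SPREAD_FUNCTIONS] + exports)


print(render_environment(
    {"SPREAD_BACKEND": "lava", "SPREAD_SUITE": "tests/"},
    {"GREETING": "hello world"},   # declared at project level
    {"GREETING": "hi"},            # declared at task level
))
```

Quoting values with `shlex.quote` keeps declared values containing spaces or shell metacharacters intact when the preamble is sourced.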
Lastly, spread variants, where a single `task.yaml` becomes a set of uniquely named tasks, are the last piece of the puzzle. Spread offers a way to get all the variables with their correct values.
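For illustration, variant expansion could look like the sketch below. It assumes the `NAME/variant` key convention in the `environment:` section, with a comma separating several variants on one key, and treats keys without a slash as shared by all variants. `expand_variants` is a hypothetical helper; the real converter would rely on spread itself for the final values.

```python
from typing import Dict


def expand_variants(environment: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Map each variant name to the environment that variant should see."""
    shared: Dict[str, str] = {}
    per_variant: Dict[str, Dict[str, str]] = {}
    for key, value in environment.items():
        name, slash, variants = key.partition("/")
        if not slash:
            shared[name] = value          # applies to every variant
            continue
        for variant in variants.split(","):
            per_variant.setdefault(variant, {})[name] = value
    if not per_variant:
        return {"": dict(shared)}         # no variants: one unnamed task
    return {variant: {**shared, **values} for variant, values in per_variant.items()}


print(expand_variants({
    "GOPATH": "/home/test",
    "GOVERSION/stable": "1.5",
    "GOVERSION/beta": "1.6",
}))
```

Each entry in the result corresponds to one uniquely named task, and therefore to one emitted test definition.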
In pseudo-code, the project traversal logic is:

    emit project prepare
    for each suite:
        emit suite prepare
        for each task:
            for each variant of current task:
                emit aggregated prepare
                emit task execute
                emit aggregated restore
        emit suite restore
    emit project restore
The word emit means that a test definition, with all its variables, is created.
The word aggregated means that all the `-each` scripts that apply to a given task are combined.
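The traversal above can be sketched as follows. The `Project`, `Suite` and `Task` classes are hypothetical and are not the types from the merge request. Each emitted entry is a plain label standing in for a full test definition, and the three per-variant steps are folded into one entry because they end up together in a single LAVA test.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    name: str
    variants: List[str] = field(default_factory=list)


@dataclass
class Suite:
    name: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Project:
    suites: List[Suite] = field(default_factory=list)


def traverse(project: Project) -> List[str]:
    emitted = ["project prepare"]
    for suite in project.suites:
        emitted.append(f"{suite.name} suite prepare")
        for task in suite.tasks:
            # A task without variants still produces exactly one test.
            for variant in task.variants or [""]:
                job = f"{suite.name}{task.name}" + (f":{variant}" if variant else "")
                emitted.append(f"{job} aggregated prepare, execute, aggregated restore")
        emitted.append(f"{suite.name} suite restore")
    emitted.append("project restore")
    return emitted


project = Project([Suite("tests/", [Task("smoke"), Task("build", ["stable", "beta"])])])
for line in traverse(project):
    print(line)
```

The result is a flat, ordered list, which is the shape LAVA expects, while the suite and project synthetic tests still bracket the tasks they belong to.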
I’ve started down this path with a merge request adding LAVA types: "lava: add types describing LAVA concepts" (!4) in OSTC / tools / oh-spread on GitLab.
I will follow this with the next smallest logical step, working with Chase to review and land all the pieces as we make progress.
That’s it. Let’s get this done.