#11 Support tests that need fresh testbeds
Opened 5 years ago by martinpitt. Modified 3 years ago

Integration tests often change the testbed, sometimes up to the point where they are completely broken. For these it is crucial to be able to give a fresh testbed to each individual test.

With the current schema, the test has zero control over the testbed: it is set up outside of the test invocation, and the test is just passed a location (an Ansible node/group name) to run in. Thus there is no support for letting a test reset or reboot the testbed.

I see two possible approaches to this:

  1. Extend the tests.yaml definition to declare that each test needs a fresh test bed.

    • convenient for tests as they don't have to do any setup, and might be good enough for most use cases
    • does not work well with running tests locally, as the test invocation interface is currently built entirely around Ansible, which has no concept of building or controlling targets
    • not very flexible, as e. g. tests still cannot reboot the test bed
  2. Give the test access to the test subject, i. e. the VM image path, and let the tests do their own VM management.

    • Very flexible, supports arbitrary reboots, or calling test suites with several tests without having to split them up in tests.yaml
    • Current tests already sort of do that: --extra-vars subjects=/mnt/[...]/images/test_subject.qcow2 . Do tests have access to that somehow?
    • This is brittle and hard to support if the initial/parent test environment is already a VM, as that requires nested virtualization. It might work, but is hard to get on the available OpenShift environments. It also means that the parent test environment VM already has to have enough RAM/disk to run nested VMs with sufficient capacity.
    • For these cases it would be much more efficient to let the test run in a container instead, so that they can run their own VMs.
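To make approach (2) concrete, here is a hedged sketch, not the pipeline's actual interface: assume the test is handed the subject image path (here via a hypothetical `TEST_SUBJECTS` variable, mirroring the `subjects` extra-var above) and manages its own VM. The `run()` wrapper echoes commands instead of executing them, so the flow can be read without qemu installed:

```shell
#!/bin/sh
# Sketch only: the TEST_SUBJECTS variable name, ports, and helper script
# are made up for illustration.
set -eu

TEST_SUBJECTS="${TEST_SUBJECTS:-/var/tmp/test_subject.qcow2}"
run() { echo "+ $*"; }          # dry-run wrapper; replace body with "$@" to execute

# Boot a throwaway VM for one test; -snapshot discards all writes,
# so the image stays pristine and the next test gets a fresh testbed.
run qemu-system-x86_64 -m 2048 -snapshot -display none "$TEST_SUBJECTS"
run ssh -p 2222 root@localhost ./run-one-test.sh

# A reboot inside the test is now entirely the test's own business.
run ssh -p 2222 root@localhost reboot
```

The key point is that once the test owns the image path, reboots and resets no longer need any support from the CI system.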

While I believe approach (2) is the more correct approach, it does run afoul of the realities of reliable nested virtualization not being ubiquitous.

Option (1) is accounted for in the STI. A tests/ directory can have multiple tests-*.yml files, each of which should receive its own clean invocation from the testing system. If this is not clear in the standard test interface specification, then we should clarify it.
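In other words, the testing system's contract can be sketched roughly like this (an illustrative stand-in, not the real STI runner: provisioning and the ansible-playbook call are just echoed):

```shell
#!/bin/sh
# Each tests-*.yml file in tests/ gets its own clean invocation.
set -eu

workdir=$(mktemp -d)
mkdir "$workdir/tests"
# two dummy playbooks standing in for a real package's tests
touch "$workdir/tests/tests-smoke.yml" "$workdir/tests/tests-upgrade.yml"

for playbook in "$workdir"/tests/tests-*.yml; do
    # the real system would provision a fresh subject here
    echo "provision a fresh testbed for $(basename "$playbook")"
    echo "ansible-playbook --extra-vars subjects=... $playbook"
done
```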

Is there any way 1 or 2 can be done with today's infrastructure? I recently discovered the existence of tests/provision.fmf to change the RAM; could that perhaps be used?
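For reference, tests/provision.fmf is, as far as I know, a small fmf/YAML snippet that tweaks the qcow2 inventory's QEMU arguments; the RAM change looks roughly like this (illustrative, please check the standard-test-roles documentation for the exact keys):

```yaml
standard-inventory-qcow2:
    qemu:
        m: 4G
```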

We will soon have a sort of (1) downstream, making it possible to run each tests*.yml in its own environment, in parallel. Would that help?

As for (2), I think @astepano or @bgoncalv could answer whether that is somehow possible. I believe it should not be hard to skip the dynamic provisioning; the subjects passed as extra vars should be consumable from your playbook, where you should be able to do whatever you want. But it seems like quite a lot more work for you, of course ...

Currently the Fedora CI already runs each playbook whose name starts with tests in its own environment.

https://github.com/CentOS-PaaS-SIG/ci-pipeline/blob/3fdd4312157cf4d5465cfe6a6c112014dabda02c/config/Dockerfiles/singlehost-test/package-test.sh#L125

I'm not sure what we can do regarding option 2.

@mvadkert : Having some implementation of 1) would definitely help us to be able to run more than one integration test.

It would be very inefficient, though, as setup steps (like downloading and setting up Selenium etc. containers) or installing test dependencies into the VM cannot be shared, and need to be re-done for every test case. In our case, the setup takes pretty much the entire time (on the order of 10 minutes), while the actual test only takes a few seconds. So this inefficiency is certainly not negligible.
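One hypothetical way to amortize that ~10-minute setup: bake it into a base image once, then hand every test a cheap copy-on-write overlay instead of redoing the setup. In this sketch the qemu-img/qemu calls are echoed (dry run), and the image and test names are made up for illustration:

```shell
#!/bin/sh
set -eu

run() { echo "+ $*"; }          # dry-run wrapper; replace body with "$@" to execute

BASE=subject-after-setup.qcow2   # Selenium containers, test deps already installed

for t in check-login check-storage check-network; do
    # the overlay references the base image read-only; creating it is near-instant
    run qemu-img create -f qcow2 -b "$BASE" -F qcow2 "overlay-$t.qcow2"
    run qemu-system-x86_64 -display none "overlay-$t.qcow2"
    run rm -f "overlay-$t.qcow2"   # each test's writes are thrown away
done
```

Each test still gets a fresh testbed, but only the cheap overlay creation is repeated, not the expensive setup.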

@bgoncalv: Where does the test driver run right now, i. e. the thing that prepares the VM and launches QEMU? Is that running in a container? Could a test declare that it should run right in that container instead of the VM?

@martinpitt yes, it runs in a container. It prepares a qcow2 image where the built packages are installed, and then this qcow2 is given as the test subject to standard-test-roles.
standard-test-roles will launch the QEMU and run the tests in the VM.

The pipeline does not support running the tests in a container. @mvadkert is that something we even planned?

@martinpitt yep, that should be possible. We already do something along those lines for QE tests with the help of OpenStack and snapshotting. I hope we will support snapshotting also with qemu-kvm (or libvirt as a provider). To cover this use case and how we would describe it to the CI system, I opened an issue to create an example in fedora-ci/metadata:

https://pagure.io/fedora-ci/metadata/issue/10

@martinpitt, if I understand your use case correctly, you would like to separate the setup part of the test(s) and run it once, then store a snapshot, run a test, restore the snapshot, run the next test, and so on. But because of how the Standard Test Interface is designed, everything is stored in one Ansible playbook: setup and test execution are mixed together, and infrastructure errors are mixed with test failures.

We are working on an extensible CI configuration that will allow a clear separation of the test phases (discover, provision, prepare, execute, report). I believe this concept should work for your use case as well. I will try to prepare an example demonstrating this; your feedback will be welcome. Meanwhile you might want to have a look at the workflow example.
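As a purely illustrative sketch (the exact syntax was still being designed at this point), a plan with the phases separated might look something like:

```yaml
# Illustrative only; key names follow the proposed phases, not a final spec.
summary: Verify the package on a freshly provisioned virtual machine
discover:
    how: fmf
provision:
    how: virtual
prepare:
    how: ansible
    playbook: setup.yml
execute:
    how: tmt
```

With this shape, the expensive prepare step is clearly distinguishable from execute, which is what would let a CI system snapshot between the two.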

@psss : Right, the intention is to run the playbook right in the container, and pass it the path to the image as a variable. Then the test itself could control the VM and do whatever it likes.

However, if I interpret https://pagure.io/fedora-ci/metadata/pull-request/11 correctly, that would suffice as well. Does that already work, or is that "just" a spec/documentation for the future?

Thank you!

The example is just part of the proposal we are currently working on. Good to see that something like that would meet your needs. We will keep it in the list of use cases.

@psss Checking in to see if this is still on our radar or possibly even already live? I see the listed PR was merged.
