#18 openQA version running on BOS does not schedule 32-bit jobs
Closed: Fixed. Opened 8 years ago by adamwill.

So @jsedlak asked me yesterday if I knew why there was a job sitting about in SCHEDULED on BOS. I didn't know then, but I do now.

The openQA running on BOS can't actually schedule 32-bit jobs properly for us. The job_grab function in /usr/share/openqa/lib/OpenQA/Scheduler.pm, which gets run when a worker is 'grabbing' one of the scheduled jobs, has a check to make sure workers only run jobs which are appropriate to them. It does this:

    my $archquery = schema->resultset("JobSettings")->search(
        {
            key => "ARCH",
            value => $cando{$worker->get_property('CPU_ARCH')}
        }
    );

and then uses that query as one of the conditions for filtering the job list. The cando thing is basically a table of arch mappings, but unfortunately, they called their 32-bit arches 'i586' and 'i686', not 'i386', so it doesn't work for us (x86_64 'cando' i586 and i686, according to the table, but not i386).
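To make the failure concrete, here's a rough Python model of that check. It is not the Perl from Scheduler.pm: the `CANDO` dict and `worker_can_run` are made-up names, only the x86_64 row is filled in, and including 'x86_64' itself in that row is my assumption.

```python
# Hypothetical model of the arch table; only what the ticket describes
# for x86_64 is filled in (that x86_64 can run itself is an assumption).
CANDO = {
    "x86_64": ["x86_64", "i586", "i686"],
}


def worker_can_run(worker_arch, job_arch):
    """Return True if a worker with this CPU_ARCH may grab a job with this ARCH."""
    # Unknown worker arches get an empty list, so they match nothing.
    return job_arch in CANDO.get(worker_arch, [])


if __name__ == "__main__":
    for arch in ("x86_64", "i686", "i386"):
        print(arch, worker_can_run("x86_64", arch))
```

With a table like that, a job whose ARCH is 'i386' never matches any worker, so it just sits in SCHEDULED.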

There are a few ways we can go about dealing with this:

  1. We can hack some icky substitutions into the scheduling stuff so it converts 'i386' to 'i686' when scheduling the jobs, and 'i686' back into 'i386' when reading out the results and submitting them (we can't just rename 'i386' to 'i686' everywhere and have done with it, because fedfind is definite that the name is 'i386' and I don't think I want to hack up fedfind's Query stuff to do arch fuzzing, and also report_job_results reads out the arch and uses it as an environment name for some tests, so it has to be i386 not i686 there).

  2. We can update to a newer openQA. Upstream ditched this test with https://github.com/os-autoinst/openQA/commit/8caa3950e2ca0a5f024e53c17b20f1d92f5009bc . I run a newer openQA on happyassassin, which is why I didn't see this problem in testing.

  3. We can hack up openQA in-place on BOS to adjust or drop the test. This is ugly, but doesn't have many downsides because presumably if we ever actually update the package it'll be to a version without the problematic check.

  4. We can do a 'proper' rebuild of the openQA package with the test adjusted.
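For what it's worth, the substitution in option 1 would look roughly like the sketch below. The helper names are hypothetical (they're not from openQA, fedfind or our scripts), and it assumes 'i686' only ever shows up on the openQA side as a stand-in for our 'i386'.

```python
# Hypothetical helpers for option 1; none of these names exist in
# openQA, fedfind or the existing scheduling scripts.
TO_OPENQA = {"i386": "i686"}
FROM_OPENQA = {openqa: ours for ours, openqa in TO_OPENQA.items()}


def arch_for_openqa(arch):
    """Translate our arch name to one the old openQA arch check accepts."""
    return TO_OPENQA.get(arch, arch)


def arch_from_openqa(arch):
    """Translate an openQA arch name back to ours before reporting results."""
    return FROM_OPENQA.get(arch, arch)


if __name__ == "__main__":
    scheduled = arch_for_openqa("i386")
    print(scheduled, arch_from_openqa(scheduled))
```

Every place that hands an arch to openQA or reads one back would have to go through these, which is the icky part.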

Thoughts?


For the record, a related issue would still be kinda lurking after the upstream change. The way SUSE does the arch filtering now is that they set a WORKER_CLASS for the machines in the templates. job_grab now checks the worker's classes (an x86_64 worker has the classes qemu_x86_64, qemu_i686 and qemu_i586) against the job's WORKER_CLASS setting, if it has one.

So they have Intel, ARM and S390 'machines' with appropriate WORKER_CLASS settings, and a test scheduled on an S390 machine will never get run on an ARM or Intel worker because the worker won't have a matching 'class'.

With our current job templates we simply bypass this mechanism entirely, because we don't have any WORKER_CLASS settings for our machines. So current openQA (e.g. running on happyassassin) will happily schedule any job on any worker so far as arch goes - it has no arch filtering.
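As a sketch of how I understand the newer matching, here's a small Python model. It is not upstream's code: `job_matches_worker` is a made-up name, the 's390x' class name is invented for the example, and treating WORKER_CLASS as a single value is a simplification.

```python
def job_matches_worker(job_settings, worker_classes):
    """Model of the newer filter: compare WORKER_CLASS only if the job sets one."""
    wanted = job_settings.get("WORKER_CLASS")
    if wanted is None:
        # No WORKER_CLASS on the job: nothing to filter on, any worker may take it.
        return True
    return wanted in worker_classes


if __name__ == "__main__":
    # Classes the ticket lists for an x86_64 worker.
    x86_64_worker = {"qemu_x86_64", "qemu_i686", "qemu_i586"}
    print(job_matches_worker({"WORKER_CLASS": "qemu_i686"}, x86_64_worker))
    print(job_matches_worker({"WORKER_CLASS": "s390x"}, x86_64_worker))
    print(job_matches_worker({"ARCH": "i386"}, x86_64_worker))
```

The last case is ours: no WORKER_CLASS in the job settings, so the arch never comes into it.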

That's no problem as things stand, because we only have x86_64 workers and x86_64 and i386 tests. But if we ever actually do add arch-incompatible workers/tests (ARM, for example) we'd have to add the WORKER_CLASS settings to the machines in templates so arch filtering would work correctly. I'm noting this down here so we at least have it written somewhere in case we add ARM tests and then wonder why they're getting scheduled on Intel workers :)

We can update OpenQA in BOS (if the fix is in the stable repo anyway), but I'm a little bit worried: the last time I updated OpenQA, the database schema changed :-D.

It does that, yeah. The package should handle updating it, though?

There doesn't appear to be a stable update; I'm using the unstable repo on happyassassin, I just checked BOS and it has no update available. I don't know when upstream plans to update 'stable' again, but we could ask. I haven't really had a lot of trouble using 'unstable' on happyassassin, but I understand if you don't want to change BOS back to it.

I have updated the OpenQA code in BOS so that 'x86_64' (and 'i586', 'i686') can run 'i386'. This is not nice and it's only a temporary fix, but as you said, it will work until OpenQA gets updated, and in the updated version this is fixed, so we shouldn't (the s-word!) get to the state where OpenQA in BOS doesn't work with 32-bit images.

It seems that it works now.
