#44 ARM test does not run correctly in staging (or on f25?)
Closed: Invalid None Opened 7 years ago by adamwill.

The ARM test doesn't seem to run right in staging (with recent git openQA and os-autoinst), on Fedora 25. Not sure which condition is significant yet. Compare the 20161204.n.0 Rawhide ARM tests:

prod - https://openqa.fedoraproject.org/tests/50441
stg - https://openqa.stg.fedoraproject.org/tests/62946

They both fail, but the prod one at least seems to run properly and then time out (I think because of an fsck running, but I'm not sure). The stg one just claims that the screen was showing the 'Guest has not initialized the display (yet)' message throughout the run.

I note that os-autoinst.log shows this partway through the test on stg:

12:43:01.1400 9232 no change 587
12:43:01.6830 9232 considering VNC stalled - turning black

If you compare that to the prod test, we can see the screen only actually starts changing at around 346 in the countdown:

12:46:48.8958 36148 no change 351
12:46:49.8971 36148 no change 350
12:46:50.8982 36148 no change 349
12:46:51.8991 36148 no change 348
12:46:52.9003 36148 no change 347
12:46:53.9008 36148 no change 346
12:46:54.9562 36148 MATCH(console_initial_setup:0.00)
12:46:54.9612 36148 no match 345

(This seems to be because the display doesn't get initialized for a long time.) So I think it's possible that this 'stalled VNC' detection code (which is new in os-autoinst) is messing with things, but I haven't confirmed that yet. The code looks like this:

    if ($self->_framebuffer) {
        if ($self->_last_update_requested - $self->_last_update_received > 4) {
            $self->_last_update_received(0);
            # return black image - screen turned off
            bmwqemu::diag "considering VNC stalled - turning black";
            $self->_framebuffer(tinycv::new($self->width, $self->height));
        }
        if ($time_since_last_update > 2) {
            $self->send_forced_update_request;
        }
    }

I'm asking coolo exactly what the result of that $self->_framebuffer(tinycv::new($self->width, $self->height)); is.
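
To make the prod/stg comparison above less manual, here is a minimal sketch of a log scanner. It assumes os-autoinst.log lines have the shape seen in the excerpts in this ticket (timestamp, PID, message) and that the countdown is the trailing number on 'no change' / 'no match' lines; the function name `summarize` and the sample constants are made up for illustration and are not part of os-autoinst.

```python
import re

# Assumed line shape, based only on the excerpts quoted in this ticket:
#   HH:MM:SS.ffff PID message
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d+)\s+(\d+)\s+(.*)$")
# Countdown lines look like "no change 587" or "no match 345".
COUNTDOWN_RE = re.compile(r"^no (?:change|match) (\d+)$")

# Sample lines copied from the stg and prod logs quoted above.
STG_SAMPLE = """\
12:43:01.1400 9232 no change 587
12:43:01.6830 9232 considering VNC stalled - turning black
"""

PROD_SAMPLE = """\
12:46:52.9003 36148 no change 347
12:46:53.9008 36148 no change 346
12:46:54.9562 36148 MATCH(console_initial_setup:0.00)
12:46:54.9612 36148 no match 345
"""


def summarize(log_text):
    """Report the first stall event and first needle match in a log.

    Each event is stored as (timestamp, last countdown value seen before it).
    """
    summary = {"stalled_at": None, "first_match_at": None, "last_countdown": None}
    countdown = None
    for raw in log_text.splitlines():
        parsed = LINE_RE.match(raw.strip())
        if not parsed:
            continue  # skip lines that do not fit the assumed shape
        timestamp, _pid, message = parsed.groups()
        counted = COUNTDOWN_RE.match(message)
        if counted:
            countdown = int(counted.group(1))
            continue
        if "considering VNC stalled" in message:
            if summary["stalled_at"] is None:
                summary["stalled_at"] = (timestamp, countdown)
        elif message.startswith("MATCH("):
            if summary["first_match_at"] is None:
                summary["first_match_at"] = (timestamp, countdown)
    summary["last_countdown"] = countdown
    return summary
```

Running `summarize` over the full log from each job would show directly whether the stall message appears before the first needle match, and at which countdown value each happens. Whether that ordering matters is exactly the open question here, so treat the output as a comparison aid rather than a diagnosis.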


So I just made the timeout huge on staging to see what happens, and it does actually come up working after a while; the timeout was just too short. I don't know why, with the current timeout, prod tends to at least reach the 'console output' point while staging doesn't. It may simply be related to how heavily loaded the worker host is at the time the test runs (that's probably quasi-deterministic on our deployments).

Anyway, I don't think there's an actual bug here beyond the timeout (we need to significantly lengthen the ARM timeouts when running on Rawhide, since debug kernels make ARM emulation even slower), so I'll close this task. I've now updated prod to the same openQA and os-autoinst as stg.
