dturecek / copr / copr

Forked from copr/copr 6 years ago

b7f71e1 backend: simplify and cover the build worker code

Authored and Committed by praiskup 3 years ago
    Changes in logic:
    
    1. The worker process is no longer restarted upon (unexpected?)
       failure; instead, it retries the build itself multiple times.  When
       the worker process ends, the build is always _finished_ (failed or
       succeeded, nothing else).  So we never expect the backend to start a
       new worker process to retry the task (the exception is an unexpected
       system halt).
    2. The work with SSH is now more robust; we give it multiple attempts
       before we give up.
    3. We can cancel the background process while it is waiting for a VM.
    4. The user-facing backend log was renamed to simply 'backend.log', and
       it is now compressed once the build is finished.  It is also much
       more useful (more informative logs are provided, and we start
       logging ASAP at the beginning of the worker).
    5. Backend checks for a minimal version of the copr-rpmbuild package on
       the VM, and retries on another host if the version isn't sufficient.
    6. The VM is released back to the pool as soon as possible.  So we (a)
       do the result downloads a bit earlier than before, (b) release the VM
       and (c) then do the result directory analysis locally.  Meanwhile,
       the same VM can be taken by another build.
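
The retry behavior described in items 1 and 2 can be illustrated with a minimal sketch. This is not the code from the commit: `RetryableError`, `run_with_retries`, and the parameters `attempts`, `delay`, and `sleep` are hypothetical names chosen for illustration, and the real worker may decide differently which failures are worth retrying.

```python
import time


class RetryableError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a dropped SSH connection)."""


def run_with_retries(action, attempts=3, delay=0.0, sleep=time.sleep):
    """Call action() up to `attempts` times and return its first successful result.

    Only RetryableError triggers another attempt; any other exception
    propagates immediately.  After the last failed attempt the last
    RetryableError is re-raised, so the caller always gets a final
    outcome (success or failure) and never has to restart the worker.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except RetryableError as err:
            last_error = err
            if attempt < attempts:
                # Pause before the next attempt; injectable for testing.
                sleep(delay)
    raise last_error
```

The same helper could wrap both a single SSH command and the whole build attempt, which is one way to keep the "give up after N attempts" policy in a single place.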
    
    Other changes:
    
    1. Drop the PUBSUB_INTERRUPT_BUILDER logic; it is replaced by cancel
       event delivery through the worker manager code.
    2. Move the rsync logic to sshcmd.py, so it follows the same principles
       and is better encapsulated.
    3. Move the announce/message sending code to msgbus.py so it's better
       encapsulated.  The API is still pretty ugly (e.g. hostname in the
       arg list), but fixing that requires some major messaging format
       re-design, TBD.
    4. Remove the old tests for the worker (those were "skipped, doesn't
       work" anyway), and replace them with a new test suite with 100%
       coverage.
    5. The BuildBackgroundWorker was moved to the library path, so it is
       not directly in a %_bindir script (and it is easy to test).
    6. Removed the useless and hard-to-follow "Worker wraps MockRemote which
       wraps Builder" abstraction.  Now there's only BuildBackgroundWorker.
       This also allowed me to drop the weird hierarchy of MockRemote
       exceptions (we only have 3 new module-local exceptions).  Overall,
       it's much easier to follow the code flow (including asynchronous
       exception flow) so we can easily handle the corner cases.
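
As an illustration of the cancel handling mentioned above (canceling a worker that is still waiting for a VM), here is a minimal sketch using a `threading.Event`. It is an assumption about the general technique, not the commit's implementation: `BuildCanceled`, `wait_for_vm`, `try_acquire`, and `poll_interval` are hypothetical names, and the real cancel event is delivered through the worker manager code.

```python
import threading


class BuildCanceled(Exception):
    """Hypothetical module-local exception raised when a cancel request arrives."""


def wait_for_vm(try_acquire, cancel_event, poll_interval=1.0):
    """Poll try_acquire() until it returns a VM, aborting if cancel_event is set.

    try_acquire is a callable returning a VM handle, or None when no VM
    is available yet.  cancel_event is a threading.Event that another
    thread sets to request cancellation.
    """
    while True:
        if cancel_event.is_set():
            raise BuildCanceled("canceled while waiting for a VM")
        vm = try_acquire()
        if vm is not None:
            return vm
        # Event.wait() acts as an interruptible sleep: it returns early
        # as soon as the event is set, so cancellation is noticed quickly.
        cancel_event.wait(poll_interval)
```

Using `Event.wait()` instead of `time.sleep()` means a cancel request does not have to wait for the full polling interval before it takes effect.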
    