78b03a7 qemu_hotplug: delay sending DEVICE_REMOVED event until after *all* teardown

Authored and Committed by Laine Stump 5 years ago
    qemu_hotplug: delay sending DEVICE_REMOVED event until after *all* teardown
    
    The VIR_DOMAIN_EVENT_ID_DEVICE_REMOVED event is sent after qemu has
    responded to a device_del command with a DEVICE_DELETED event. Before
    queuing the event, *some* of the final teardown of the device's
    trappings in libvirt is done, but not *all* of it. As a result, an
    application may receive and process the DEVICE_REMOVED event before
    libvirt has really finished with it.
    
    Usually this doesn't cause a problem, but it can - in the case of the
    bug report referenced below, vdsm is assigning a PCI device to a guest
    with managed='no', using livirt's virNodeDeviceDetachFlags() and
    virNodeDeviceReAttach() APIs. Immediately after receiving a
    DEVICE_REMOVED event from libvirt signalling that the device had been
    successfully unplugged, vdsm would cal virNodeDeviceReAttach() to
    unbind the device from vfio-pci and rebind it to the host driverm but
    because the event was received before libvirt had completely finished
    processing the removal, that device was still on the "activeDevs"
    list, and so virNodeDeviceReAttach() failed.
    
    Experimentation with additional debug logs proved that libvirt would
    always end up dispatching the DEVICE_REMOVED event before it had
    removed the device from activeDevs (with a *much* greater difference
    with managed='yes', since in that case the re-binding of the device
    occurred after queuing the device).
    
    Although the case of hostdev devices is the most extreme (since there
    is so much involved in tearing down the device), *all* device types
    suffer from the same problem - the DEVICE_REMOVED event is queued very
    early in the qemuDomainRemove*Device() function for all of them,
    resulting in a possibility of any application receiving the event
    before libvirt has really finished with the device.
    
    The solution is to save the device's alias (which is the only piece of
    info from the device object that is needed for the event) at the
    beginning of processing the device removal, and then queue the event
    as a final act before returning. Since all of the
    qemuDomainRemove*Device() functions (except
    qemuDomainRemoveChrDevice()) are now called exclusively from
    qemuDomainRemoveDevice() (which selects which of the subordinates to
    call in a switch statement based on the type of device), the shortest
    route to a solution is to doing the saving of alias, and later
    queueing of the event, in the higher level qemuDomainRemoveDevice(),
    and just completely remove the event-related code from all the
    subordinate functions.
    
    The single exception to this, as mentioned before, is
    qemuDomainRemoveChrDevice(), which is still called from somewhere
    other than qemuDomainRemoveDevice() (and has a separate arg used to
    trigger different behavior when the chr device has targetType ==
    GUESTFWD), so it must keep its original behavior intact, and must be
    treated differently by qemuDomainRemoveDevice() (similar to the way
    that qemuDomainDetachDeviceLive() treats chr and lease devices
    differently from all the others).
    
    Resolves: https://bugzilla.redhat.com/1658198
    
    Signed-off-by: Laine Stump <laine@laine.org>
    ACKed-by: Peter Krempa <pkrempa@redhat.com>
    
        
file modified
+65 -76