d360c05 groupd: clean up leaving failed node

Authored and Committed by teigland 14 years ago
    bz 521817
    
    Due to shutdown+failure scenarios that aren't fully understood,
    a node that fails while shutting down can cause the other nodes
    to get stuck trying to restart the clvmd group. (Whether other
    groups could be affected is unknown.)
    
    The other nodes will all show something like this from group_tool -v:
    
    dlm              1     clvmd    00010002 LEAVE_STOP_WAIT 1 100020002 1
    
    and group_tool dump will show things like:
    
    1260396236 1:clvmd waiting for 1 more stopped messages before LEAVE_ALL_STOPPED 1
    1260396236 1:clvmd waiting for 1 more stopped messages before LEAVE_ALL_STOPPED 1
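
For triage, the symptom shown above can be spotted mechanically in saved `group_tool dump` output. The following is only a minimal sketch, not part of groupd: the function name `stuck_leaves` and its `min_repeats` parameter are hypothetical, the pattern is inferred solely from the two sample lines above, and the meaning assigned to the `1:` prefix (group level) and to the trailing number is an assumption.

```python
import re
from collections import Counter

# Pattern inferred only from the sample dump lines in this commit message;
# other groupd versions may format these lines differently.
WAIT_RE = re.compile(
    r"^\s*(?P<ts>\d+)\s+(?P<level>\d+):(?P<group>\S+)\s+waiting for "
    r"(?P<count>\d+) more stopped messages before LEAVE_ALL_STOPPED"
    r"\s+(?P<tail>\d+)\s*$"
)


def stuck_leaves(dump_text, min_repeats=2):
    """Count repeated 'waiting for ... LEAVE_ALL_STOPPED' lines per group.

    Returns {(level, group, tail): repeats} for every group whose wait
    line appears at least min_repeats times. 'level' is assumed to be the
    group level and 'tail' is the trailing number, whose meaning is not
    stated in the commit message.
    """
    seen = Counter()
    for line in dump_text.splitlines():
        m = WAIT_RE.match(line)
        if m:
            key = (int(m.group("level")), m.group("group"), int(m.group("tail")))
            seen[key] += 1
    return {key: n for key, n in seen.items() if n >= min_repeats}
```

Feeding it the captured dump text returns the groups whose leave appears to be waiting repeatedly. A repeated line is only a hint that a node may be stuck, not proof, so the result is best treated as a prompt to inspect `group_tool -v` on the affected nodes.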
    
    The fix watches for this very specific situation, in which a
    node fails while it is leaving, and forcibly cleans things up
    so the other nodes aren't stuck.
    
    Signed-off-by: David Teigland <teigland@redhat.com>
file modified
+41 -0