Commit - linux-cluster/cluster - d360c0537aa734205e49939de92c763696ef477b

`d360c05` groupd: clean up leaving failed node

Authored and Committed by teigland 14 years ago

1 file changed. 41 lines added. 0 lines removed.

    groupd: clean up leaving failed node
    
    bz 521817
    
    Due to shutdown+failure scenarios that aren't fully understood,
    a node that fails while shutting down can cause the other nodes
    to get stuck trying to restart the clvmd group (whether other
    groups can be affected is unknown).
    
    The other nodes will all show something like this from group_tool -v:
    
    dlm              1     clvmd    00010002 LEAVE_STOP_WAIT 1 100020002 1
    
    and group_tool dump will show things like:
    
    1260396236 1:clvmd waiting for 1 more stopped messages before LEAVE_ALL_STOPPED 1
    1260396236 1:clvmd waiting for 1 more stopped messages before LEAVE_ALL_STOPPED 1
    
    This fix watches for this very specific situation and, when it
    occurs, forcibly cleans things up so that the other nodes
    aren't stuck.
    
    Signed-off-by: David Teigland <teigland@redhat.com>
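
The symptom quoted in the commit message can be recognised mechanically in `group_tool dump` output. The following is a minimal sketch of that check, not the code added to `group/daemon/app.c` by this commit: `struct stop_wait` and `parse_stop_wait` are hypothetical names, and reading the number before the colon in `1:clvmd` as the group level is an assumption based on the `group_tool -v` line shown above.

```c
#include <stdio.h>

/* Hypothetical helper, not part of groupd: holds the fields of one
 * "waiting for N more stopped messages" line from `group_tool dump`. */
struct stop_wait {
    long timestamp;   /* first field, e.g. 1260396236 */
    int  level;       /* number before ':' (assumed to be the group level) */
    char name[64];    /* group name, e.g. "clvmd" */
    int  missing;     /* stopped messages still outstanding */
};

/* Returns 1 and fills *out if the line matches the symptom, else 0. */
static int parse_stop_wait(const char *line, struct stop_wait *out)
{
    int end = -1;     /* set by %n only if the whole pattern matched */
    int n = sscanf(line,
                   "%ld %d:%63s waiting for %d more stopped messages%n",
                   &out->timestamp, &out->level, out->name,
                   &out->missing, &end);

    return n == 4 && end > 0;
}
```

Feeding each line of the dump to `parse_stop_wait` and flagging groups whose `missing` count stays unchanged across repeated lines would be one way to spot the stuck state from outside the daemon.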

group/daemon/app.c (file modified, +41 -0)