gfs_controld: retry recovery for withdrawn journal
bz 442451
This is unfortunate, but seems to be the best solution available. The
problem, described more fully in the bz, is that when gfs_controld tries
to do recovery on a journal for a withdraw, the withdrawing node may not
yet have cleared its dlm locks. This means the journal lock may still be
held by the withdrawing node, so every recovering node fails to acquire
it and no one does the recovery. The solution is for all
recovering nodes to retry recovery of a withdrawn journal until they
succeed. Only the first node to get the journal lock will actually
recover it; the others will see it's already recovered and report
success.
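
The retry behaviour described above can be sketched as a small
standalone simulation. This is only an illustrative model, not the
gfs_controld code: struct journal, journal_trylock(), try_recover(),
simulate_recovery() and the round-based timing are all hypothetical
names and assumptions, and a plain integer stands in for the dlm
journal lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one withdrawn journal; not gfs_controld's
 * real data structures. */
enum { NO_HOLDER = -1, WITHDRAWN_NODE = 0, MAX_NODES = 16 };

struct journal {
	int lock_holder;	/* node id holding the lock, or NO_HOLDER */
	bool recovered;		/* set once some node has replayed it */
};

/* Non-blocking lock attempt: fails while another node holds the lock. */
static bool journal_trylock(struct journal *j, int node)
{
	if (j->lock_holder != NO_HOLDER)
		return false;
	j->lock_holder = node;
	return true;
}

static void journal_unlock(struct journal *j, int node)
{
	assert(j->lock_holder == node);
	j->lock_holder = NO_HOLDER;
}

/* One recovery attempt by one node. Returns false if the lock was busy
 * (caller should retry later), true if the journal is now recovered,
 * whether by this node or by an earlier one. */
static bool try_recover(struct journal *j, int node, int *replays)
{
	if (!journal_trylock(j, node))
		return false;
	if (!j->recovered) {
		j->recovered = true;	/* first lock holder does the replay */
		(*replays)++;
	}
	journal_unlock(j, node);
	return true;
}

/* Assumed timing model: the withdrawing node drops its lock at the
 * start of round `delay`; each recovering node retries once per round.
 * Returns the number of rounds until every node reported success, or
 * -1 if max_rounds was not enough. */
static int simulate_recovery(int nodes, int delay, int max_rounds,
			     int *replays)
{
	struct journal j = { .lock_holder = WITHDRAWN_NODE, .recovered = false };
	bool done[MAX_NODES] = { false };
	int remaining = nodes;

	assert(nodes > 0 && nodes <= MAX_NODES);
	*replays = 0;

	for (int round = 0; round < max_rounds; round++) {
		if (round == delay)
			journal_unlock(&j, WITHDRAWN_NODE);
		for (int n = 0; n < nodes; n++) {
			if (done[n])
				continue;
			if (try_recover(&j, n + 1, replays)) {
				done[n] = true;
				remaining--;
			}
		}
		if (remaining == 0)
			return round + 1;
	}
	return -1;
}
```

In this model a single attempt (max_rounds of 1) against a lock that is
still held leaves the journal unrecovered, which corresponds to the
failure described in the bz. With retries, all nodes eventually report
success and the journal is replayed exactly once.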
Signed-off-by: David Teigland <teigland@redhat.com>