Bug 257881: Flush/recovery collision leads to deadlock after leg ...
The procedure for coordinating nominal I/O and recovery I/O was to
either:
1) delay a flush that contains a mark to a region being recovered, or
2) skip over regions that are currently marked when assigning recovery
This bug has to do with the way #1 was implemented.
The following scenario would trigger it:
1) node1 is assigned recovery on region X
2) node1 also does a mark (write) on region Y
3) node2 attempts to mark region X
**) any flush issued here is delayed until recovery completes on X
4) node1 needs to perform the flush before it can get on with completing
recovery - but it can't flush, so everyone is delayed *forever*.
The fix was to allow flushes from nodes that are not attempting to mark
regions that are being recovered. In the example above, node1 should be
allowed to complete the flush because it is not trying to write to the
same region that is being recovered. node2 would be correctly delayed.
Since node1 can complete the flush, it can also complete the recovery -
thus allowing things to proceed.
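
The rule described above can be sketched in C as follows. This is only
an illustrative model, not the code changed by this commit: the struct
layouts and the names node_state, recovery_state, flush_must_wait and
check_example_scenario are hypothetical, and it assumes a single region
is under recovery at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model; these are not the real cluster log structures. */
#define MAX_PENDING 8

struct node_state {
	uint64_t pending_marks[MAX_PENDING]; /* regions marked, not yet flushed */
	size_t nr_pending;
};

struct recovery_state {
	bool recovering;            /* is a region currently being recovered? */
	uint64_t recovering_region; /* only meaningful if recovering is true */
};

/*
 * A node's flush has to wait only if that node itself has a pending
 * mark on the region under recovery. Flushes from all other nodes go
 * through, so the recovering node is never blocked by someone else's
 * mark.
 */
bool flush_must_wait(const struct recovery_state *rs,
		     const struct node_state *node)
{
	if (!rs->recovering)
		return false;

	for (size_t i = 0; i < node->nr_pending; i++)
		if (node->pending_marks[i] == rs->recovering_region)
			return true;

	return false;
}

/* The scenario from this message: X = region 10, Y = region 20. */
void check_example_scenario(void)
{
	struct recovery_state rs = { .recovering = true, .recovering_region = 10 };
	struct node_state node1 = { .pending_marks = { 20 }, .nr_pending = 1 };
	struct node_state node2 = { .pending_marks = { 10 }, .nr_pending = 1 };

	assert(!flush_must_wait(&rs, &node1)); /* node1 may flush, then recover */
	assert(flush_must_wait(&rs, &node2));  /* node2 waits for recovery of X */
}
```

In this model node1's flush is never gated on node2's mark, which is
what breaks the cycle from the scenario.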
This bug only affects mirrors that are not in-sync and are doing I/O.
This bug can occur whether there are device/machine failures or not.
This bug is most easily reproduced with a number of mirrors, but would
be possible with just one.
I've also fixed up some debugging output so that it is more consistent
and the flow of events is easier to follow.