69df018 Ticket 47942: DS hangs during online total update

Authored and Committed by tbordaz 9 years ago
    Ticket 47942: DS hangs during online total update
    
    Bug Description:
    	During an incremental or total update of a consumer, the replica agreement thread may hang.
    	For total update:
    	The replica agreement thread that sends the entries floods the consumer, which is not
    	able to process the entries fast enough. So the TCP connection fills up and
    	the RA.sender sleeps on the connection until it can write the next entries.
    
    	While sleeping in the poll or the write, the RA.sender holds the connection lock.

    	This prevents the replica agreement result thread from reading the results from the
    	network. So the consumer is also halted because it can no longer send the results.
    
    	For incremental update:
    	During incremental update, all updates are sent by the RA.sender.
    	If many updates need to be sent, the supplier may overflow a consumer
    	that is very late. This flow of updates can fill the TCP connection
    	so that the RA.sender hangs when writing the next update.
    	While it hangs, it holds the connection lock, preventing the RA.reader
    	from receiving the acks. And so the consumer can also hang trying to send the
    	acks.
    
    Fix Description:
    	For total update there are two parts of the fix:
    
    	To prevent the RA.sender from sleeping too long in the poll, the fix (conn_is_available)
    	splits the RA.timeout into 1s periods.
    	If it is unable to write for 1s, it releases the connection for a short period of time (100ms).
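
    	As an illustration only, the sliced wait described above can be sketched in Python.
    	This is not the C code of conn_is_available: the function name wait_writable and its
    	parameters are hypothetical, and the lock handling is an assumption made for the sketch.
    	It polls for writability in slices of at most 1s while holding the connection lock,
    	and drops the lock for about 100ms between slices so another thread can use the connection.

```python
import select
import socket
import threading
import time


def wait_writable(sock, conn_lock, timeout_s, slice_s=1.0, release_s=0.1):
    """Wait until sock is writable, holding conn_lock for at most slice_s at a time.

    Hypothetical sketch: returns True as soon as the socket is writable,
    False once timeout_s has elapsed. The lock is never held on return.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Poll for at most one slice while holding the connection lock.
        with conn_lock:
            _, writable, _ = select.select([], [sock], [], min(slice_s, remaining))
        if writable:
            return True
        # Lock is released here: give a reader thread a short window
        # to take the lock and drain results from the connection.
        time.sleep(min(release_s, max(0.0, deadline - time.monotonic())))
```

    	In this sketch a thread that reads results can acquire conn_lock during each 100ms
    	release, which is the property the fix relies on.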
    
    	To prevent the RA.sender from sleeping on the write, the fix (check_flow_control_tot_init)
    	checks how late the consumer is and, if it is too late, pauses (releasing the connection
    	during that time). This second part of the fix is configurable and may need to be
    	tuned according to the observed failures.
    
    	For incremental update:
    	The fix is to implement flow control on the RA.sender.
    	After each sent update, if the window (update.sent - update.acked) crosses the limit,
    	the RA.sender pauses for a configured delay.
    	While the RA.sender pauses, it does not hold the connection lock.
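
    	The window check can be sketched as follows. This is a minimal Python illustration, not
    	the server code: the class FlowControl and its method names are hypothetical, "crosses
    	the limit" is assumed to mean strictly greater than the window, and the pause is assumed
    	to be expressed in milliseconds. The sender is expected to call after_send() without
    	holding the connection lock.

```python
import time


class FlowControl:
    """Hypothetical sender-side flow control based on unacknowledged updates."""

    def __init__(self, window, pause_ms, sleep=time.sleep):
        self.window = window      # max allowed (sent - acked) before pausing
        self.pause_ms = pause_ms  # pause length, assumed to be in milliseconds
        self.sent = 0
        self.acked = 0
        self.pauses = 0           # number of pauses, e.g. for logging at the end
        self._sleep = sleep       # injectable so the sketch can be tested quickly

    def on_ack(self):
        """Record one acknowledgement received from the consumer."""
        self.acked += 1

    def after_send(self):
        """Record one sent update; pause if the consumer is too far behind.

        Returns True if a pause happened.
        """
        self.sent += 1
        if self.sent - self.acked > self.window:
            self.pauses += 1
            self._sleep(self.pause_ms / 1000.0)
            return True
        return False
```

    	Counting the pauses in the same object keeps the information needed for the summary
    	log message at the end of the update.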
    
    	Tuning can be done with nsds5ReplicaFlowControlWindow (how late the consumer is, in terms of
    	number of entries/updates acknowledged) and nsds5ReplicaFlowControlPause (how long the RA.sender will
    	pause if the consumer is too late).
    
    	Logging:
    		For total update, the first time the flow control pauses, it logs a message (FATAL level).
    		If flow control happened, then at the end of the total update, it also logs the number
    		of flow control pauses (FATAL level).
    
    		For incremental update, if flow control happened, it logs the number of pauses (REPL level).
    
    https://fedorahosted.org/389/ticket/47942
    
    Reviewed by: Mark Reynolds, Rich Megginson, Andrey Ivanov, Noriko Hosoi (many many thanks to all of you !)
    
    Platforms tested: RHEL 7.0, CentOS
    
    Flag Day: no
    
    Doc impact: no
    
        
file modified
+3 -1