During a total init, the replica agreement can report a status like:
10 Total update abortedLDAP error: Referral
The status is actually the return code (cb_data.rc) of send_entry/conn_send_extended_operation. These return codes are ConnResult values, not LDAP errors.
For example, rc=CONN_TIMEOUT=10, but it is logged as LDAP error: Referral. This confuses the diagnosis of the total init failure.
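To illustrate the collision, here is a minimal sketch. It assumes only what is stated above (CONN_TIMEOUT is 10) plus the fact that the LDAP result code for a referral is also 10 (RFC 4511). All `sketch_` names are hypothetical stand-ins, not the server's real functions, and the lookup tables are deliberately tiny.

```c
#include <assert.h>
#include <string.h>

/* Illustrative values: CONN_TIMEOUT = 10 comes from the ticket, and
 * referral = 10 is the LDAP result code defined in RFC 4511.
 * Every sketch_ name is hypothetical. */
enum { SKETCH_CONN_TIMEOUT = 10 };
enum { SKETCH_LDAP_SUCCESS = 0, SKETCH_LDAP_REFERRAL = 10 };

/* Stand-in for an LDAP error-to-string lookup. */
const char *sketch_ldap_err2string(int rc)
{
    switch (rc) {
    case SKETCH_LDAP_SUCCESS:  return "Success";
    case SKETCH_LDAP_REFERRAL: return "Referral";
    default:                   return "Unknown LDAP error";
    }
}

/* Stand-in for a connection-result-to-string lookup. */
const char *sketch_conn_result2string(int rc)
{
    switch (rc) {
    case SKETCH_CONN_TIMEOUT: return "time out";
    default:                  return "unknown connection result";
    }
}

/* Returns 1 when decoding a connection result with the LDAP lookup
 * produces a different message than the connection lookup does. */
int sketch_is_mislabelled(int connrc)
{
    const char *as_ldap = sketch_ldap_err2string(connrc);
    const char *as_conn = sketch_conn_result2string(connrc);
    assert(as_ldap != NULL && as_conn != NULL);
    return strcmp(as_ldap, as_conn) != 0;
}
```

Passing `SKETCH_CONN_TIMEOUT` to the LDAP lookup yields "Referral", which is the misleading text seen in the status, while the connection lookup yields "time out".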
thierry bordaz wrote:
This bug appears from time to time during IPA initialization of a replica. Initialization fails, and the reported status does not help to understand the reason for the failure, although it is a configuration issue. There is no risk and the fix should be easy, so it is worth backporting to 1.3.2.
For now, setting the milestone to 1.3.3.
The fix is not easy to test because it requires triggering a network error. To test it, I attached a debugger to the master process and set a breakpoint in see_if_write_available. Start a full update; after the breakpoint in see_if_write_available has been hit 2-3 times, step past the PR_poll call (https://git.fedorahosted.org/cgit/389/ds.git/tree/ldap/servers/plugins/replication/repl5_connection.c#n549), set rc=0 (timeout), and continue.
The full update will fail and it will log:
{{{ nsds5replicaLastInitStatus: 10 connection error: time out - Total update aborted }}}
attachment 0001-Ticket-47901-After-total-init-nsds5replicaLastInitSt.patch
Do you need to log the connection error in case of success, or would
connrc ? " - " : "", connrc ? connmsg : ""
suffice?
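A minimal sketch of that suggestion, for illustration only: the connection text and its separator are emitted only when the connection result is non-zero. The function name, parameter list, and argument order are assumptions; the order is chosen to reproduce the status line quoted earlier in this ticket and may differ from the attached patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter: writes "<connrc> [<connmsg> - ]<message>" into buf.
 * The connection part is skipped when connrc is 0 (success) or when no
 * description is available. Returns the value returned by snprintf. */
int sketch_format_init_status(char *buf, size_t len, int connrc,
                              const char *connmsg, const char *message)
{
    assert(buf != NULL && len > 0 && message != NULL);

    /* Report connection details only for a non-zero result that has text. */
    int has_conn = connrc != 0 && connmsg != NULL && strlen(connmsg) > 0;

    return snprintf(buf, len, "%d %s%s%s", connrc,
                    has_conn ? connmsg : "",
                    has_conn ? " - " : "",
                    message);
}
```

With `connrc` set to 10 and a description of "connection error: time out", this yields the failure status shown earlier. With `connrc` set to 0, the connection part and the separator are omitted.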
attachment 0001-Ticket-47901-2-After-total-init-nsds5replicaLastInitSt.patch
ack - would also like a review from Ludwig
git merge ticket47901
Updating 4e39dbb..c70b88d
Fast-forward
 ldap/servers/plugins/replication/repl5.h | 9 ++++++---
 ldap/servers/plugins/replication/repl5_agmt.c | 30 ++++++++++++++++++++++++------
 ldap/servers/plugins/replication/repl5_agmtlist.c | 2 +-
 ldap/servers/plugins/replication/repl5_connection.c | 31 +++++++++++++++++++++++++++++++
 ldap/servers/plugins/replication/repl5_protocol.c | 2 +-
 ldap/servers/plugins/replication/repl5_tot_protocol.c | 34 +++++++++++++++++++---------------
 ldap/servers/plugins/replication/windows_tot_protocol.c | 12 ++++++------
 7 files changed, 88 insertions(+), 32 deletions(-)

git push origin master
Counting objects: 25, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (13/13), done.
Writing objects: 100% (13/13), 2.75 KiB, done.
Total 13 (delta 11), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
 4e39dbb..c70b88d master -> master

commit c70b88d
Author: Thierry bordaz (tbordaz) tbordaz@redhat.com
Date: Tue Oct 14 10:44:09 2014 +0200
'''push 1.3.3 branch'''
git push origin 389-ds-base-1.3.3
Counting objects: 25, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (13/13), done.
Writing objects: 100% (13/13), 2.76 KiB, done.
Total 13 (delta 11), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
 20888a6..1bf51f8 389-ds-base-1.3.3 -> 389-ds-base-1.3.3

commit 1bf51f8
Author: Thierry bordaz (tbordaz) tbordaz@redhat.com
Date: Tue Oct 14 10:44:09 2014 +0200
Metadata Update from @rmeggins: - Issue assigned to tbordaz - Issue set to the milestone: 1.3.3 backlog
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/1232
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: Fixed)