Testing with
The test case is: Install a ipa-server, install a replica (client then replica).
The logs are:
Supplier [13/May/2016:14:24:43.942901892 +0200] NSMMReplicationPlugin - Beginning total update of replica "agmt="cn=meTovm-fqdn>" (vm:389)". [13/May/2016:14:24:43.946442985 +0200] NSMMReplicationPlugin - Need to create replication keep alive entry <cn=repl keep alive 4,<suffix>> [13/May/2016:14:24:43.948395482 +0200] NSMMReplicationPlugin - add dn: cn=repl keep alive 4,<suffix> objectclass: top objectclass: ldapsubentry objectclass: extensibleObject cn: repl keep alive 4 [13/May/2016:14:24:43.962849222 +0200] NSMMReplicationPlugin - agmt="cn=meTo<vm-fqdn>" (vm:389): Received error 2 (Protocol error): for total update operation [13/May/2016:14:24:43.964056847 +0200] NSMMReplicationPlugin - agmt="cn=meTovm-fqdn>" (vm:389): Received error 2 (Protocol error): for total update operation [13/May/2016:14:24:43.965257002 +0200] NSMMReplicationPlugin - agmt="cn=meTovm-fqdn>" (vm:389): Warning: unable to send endReplication extended operation (Protocol error) Replica ... [13/May/2016:14:24:43.244656584 +0200] ipa-topology-plugin - ipa_topo_be_state_changebackend userRoot is going offline; inactivate plugin [13/May/2016:14:24:43.254646244 +0200] NSMMReplicationPlugin - multimaster_be_state_change: replica <suffix> is going offline; disabling replication [13/May/2016:14:24:43.377911430 +0200] WARNING: Import is running with nsslapd-db-private-import-mem on; No other process is allowed to access the database [13/May/2016:14:24:43.967685367 +0200] ERROR bulk import abandoned [13/May/2016:14:24:43.986159599 +0200] import userRoot: Thread monitoring returned: -23 [13/May/2016:14:24:43.988339717 +0200] import userRoot: Aborting all Import threads... [13/May/2016:14:24:52.000458046 +0200] import userRoot: Import threads aborted. [13/May/2016:14:24:52.003123755 +0200] import userRoot: Closing files... [13/May/2016:14:24:52.007013515 +0200] import userRoot: Import failed. [13/May/2016:14:24:52.008702760 +0200] process_bulk_import_op: NULL target sdn [13/May/2016:14:24:43.227507390 +0200] conn=4 op=0 BIND dn="" method=sasl version=3 mech=GSSAPI ... [13/May/2016:14:24:43.236652974 +0200] conn=4 op=2 RESULT err=0 tag=97 nentries=0 etime=0 dn="cn=ldap/<vm-fqdn>@domain>,cn=config" [13/May/2016:14:24:43.237065786 +0200] conn=4 op=3 SRCH base="" scope=0 filter="(objectClass=*)" attrs="supportedControl supportedExtension" ... [13/May/2016:14:24:43.960003625 +0200] conn=4 op=9 EXT oid="2.16.840.1.113730.3.5.6" name="replication-multimaster-extop" [13/May/2016:14:24:43.963550368 +0200] conn=4 op=9 RESULT err=2 tag=120 nentries=0 etime=0 .. [13/May/2016:14:24:43.967436084 +0200] conn=4 op=17 EXT oid="2.16.840.1.113730.3.5.6" name="replication-multimaster-extop" [13/May/2016:14:24:43.967532226 +0200] conn=4 op=17 fd=66 closed - B4 2016-05-13T12:24:44Z DEBUG [error] RuntimeError: Failed to start replication
This failure seems to be related to https://git.fedorahosted.org/cgit/389/ds.git/commit/?id=2ecc93781abc786be6a8b8443faf2598a6c30f97.
Will confirm with more tests
Not certain yet, but we need to revisit the changes on plugin_call_exop_plugins made by these patches...
commit b57fe64 Author: William Brown firstyear@redhat.com Date: Tue Apr 26 18:07:57 2016 +1000 Ticket 48770 - Improve extended op plugin handling
commit d5589b5 Author: William Brown firstyear@redhat.com Date: Thu Apr 28 10:05:20 2016 +1000 Ticket 48770 - Improve extended op plugin handling
commit 2ecc937 Author: Noriko Hosoi nhosoi@redhat.com Date: Wed May 4 14:44:30 2016 -0700 Ticket #48822 - (389-ds-base-1.3.5) Fixing coverity issues.
Regarding the coverity fix (commit 2ecc937), if it is wrong, then the logic before the coverity fix is wrong, too. The if expression "(lderr == SLAPI_PLUGIN_EXTENDED_NOT_HANDLED || rc != LDAP_SUCCESS)" is not needed at all, and this assignment is always expected "lderr = rc;" If that's the only an issue, we can fix it as such. But let's review the plugin.c diff one more time.
{{{ diff --git a/ldap/servers/slapd/plugin.c b/ldap/servers/slapd/plugin.c index 5b81779..f196d2c 100644 --- a/ldap/servers/slapd/plugin.c +++ b/ldap/servers/slapd/plugin.c @@ -538,7 +538,7 @@ plugin_call_exop_plugins( Slapi_PBlock pb, struct slapdplugin p ) / * simple merge: report last real error / - if ( lderr == SLAPI_PLUGIN_EXTENDED_NOT_HANDLED || rc != LDAP_SUCCESS ) { + if ( rc != LDAP_SUCCESS ) { lderr = rc; } }
}}}
git diff ec3f8da plugin.c plugin.c.diff.txt
I successfully tested master branch + https://fedorahosted.org/389/ticket/48836 - 2ecc937, IMHO 2ecc937 is not neutral.
Reviewing the fix. If for example (*p->plg_exhandler)( pb ) returns LDAP_SUCCESS, then plugin_call_exop_plugins will return SLAPI_PLUGIN_EXTENDED_NOT_HANDLED where it use to return LDAP_SUCCESS.
Replying to [comment:5 tbordaz]:
IMHO 2ecc937 is not neutral. Reviewing the fix. If for example (*p->plg_exhandler)( pb ) returns LDAP_SUCCESS, then plugin_call_exop_plugins will return SLAPI_PLUGIN_EXTENDED_NOT_HANDLED where it use to return LDAP_SUCCESS. you're right. But it could probably be made simpler by just setting lderr = rc in the else branch
IMHO 2ecc937 is not neutral.
you're right. But it could probably be made simpler by just setting lderr = rc in the else branch
Replying to [comment:6 lkrispen]:
Replying to [comment:5 tbordaz]: IMHO 2ecc937 is not neutral. Reviewing the fix. If for example (*p->plg_exhandler)( pb ) returns LDAP_SUCCESS, then plugin_call_exop_plugins will return SLAPI_PLUGIN_EXTENDED_NOT_HANDLED where it use to return LDAP_SUCCESS. you're right. But it could probably be made simpler by just setting lderr = rc in the else branch
Great! Probably, we should build 1.3.5.4 with your fix, Thierry. (maybe with the idea from Ludwig.) Thierry, could you please make a patch and push it since both Ludwig and noriko acked it proactively? ;)
Since Ludwig and Thierry are in the long holiday, I'm revisiting this ticket...
It looks to me plugin_call_exop_plugins should always return the return code rc from (*p->plg_exhandler)( pb )...
My observation: If (p->plg_exhandler)(pb) returns SLAPI_PLUGIN_EXTENDED_SENT_RESULT, the return value rc is returned. If rc is not SLAPI_PLUGIN_EXTENDED_NOT_HANDLED, rc used to be always returned due to the DEADCODE. I changed it only when rc is not LDAP_SUCCESS, then it broke the total update. That is, if rc is not SLAPI_PLUGIN_EXTENDED_NOT_HANDLED, rc should be returned. If rc is SLAPI_PLUGIN_EXTENDED_NOT_HANDLED, SLAPI_PLUGIN_EXTENDED_NOT_HANDLED is returned.
That being said, my proposal is ... {{{ int plugin_call_exop_plugins( Slapi_PBlock pb, struct slapdplugin p ) { int rc; slapi_pblock_set( pb, SLAPI_PLUGIN, p ); set_db_default_result_handlers( pb ); rc = (*p->plg_exhandler)( pb ); return (rc); } }}} Am I missing something?
that was my conclusion as well, coverty was right,there was a lot of dead code, and alwast the result of the plugin call was returned
Well, given that now, that one, and only one exop plugin can handle the action, having a single result from the plugin is probably the right thing to do here. This is a good change.
Unfortunately, still the patch 0001-Ticket-48837-Replication-total-init-aborted.patch​ added has a problem. It issues this message for every single entry to be sent in the total update.
[16/May/2016:11:51:23.720833103 -0700] extendop.c failed with result 0 [16/May/2016:11:51:23.726204429 -0700] extendop.c failed with result 0
If the return value of (p->plg_exhandler)( pb ) is 0 (== LDAP_SUCCESS), should it be converted to SLAPI_PLUGIN_EXTENDED_SENT_RESULT something like this? {{{ diff --git a/ldap/servers/slapd/plugin.c b/ldap/servers/slapd/plugin.c index 3c32c55..a4e4de4 100644 --- a/ldap/servers/slapd/plugin.c +++ b/ldap/servers/slapd/plugin.c @@ -531,7 +531,10 @@ plugin_call_exop_plugins( Slapi_PBlock pb, struct slapdplugin p ) slapi_pblock_set( pb, SLAPI_PLUGIN, p ); set_db_default_result_handlers( pb ); rc = (p->plg_exhandler)( pb ); - return (rc); + if (!rc) { / LDAP_SUCCESS needs to translated to SLAPI_PLUGIN_EXTENDED_SENT_RESULT? / + rc = SLAPI_PLUGIN_EXTENDED_SENT_RESULT; + } + return (rc); } }}} If we do this, then the above "extendop.c failed with result 0" is not logged...
Well, the plugin should be setting SLAPI_PLUGIN_EXTENDED_SENT_RESULT I thought?
dna.c {{{ 4418 ret = SLAPI_PLUGIN_EXTENDED_SENT_RESULT; }}}
So is this actually revealing an issue in the plugin?
Also, that solution, what happens if the rc is SLAPI_PLUGIN_EXTENDED_SENT_RESULT? Won't this now cause the plugin to return SENT_RESULT when the plugin fails and it returns a PLUGIN_NOT_HANDLED?
perhaps this is the solution ....
{{{ if (rc == LDAP_SUCCESS) { rc = SLAPI_PLUGIN_EXTENDED_SENT_RESULT; } }}}
The plugin which is having the issue is the consumer initialization (aka total update, bulk import, etc. :). Should the plugin be modified to return SLAPI_PLUGIN_EXTENDED_SENT_RESULT in case of success? I think there's no problem to do so.
But probably, it's nicer to allow the plugins to return LDAP_SUCCESS, IMO...
I don't really mind. But to make the change to allow LDAP_SUCCESS, your patch needs to explicitly look for LDAP_SUCCESS for the conversion, it cannot check for (!rc).
git patch file (master) -- revised 0001-Ticket-48837-Replication-total-init-aborted.patch
That seems logically okay to me. I have not built or tested it, but code wise I will ack.
Thanks to Ludwig and William for their reviews and comments.
Pushed to master: 9eee3a8..e6ba94f master -> master commit e6ba94f
Fix e6ba94f is successfully tested (description test case) with https://fedorahosted.org/389/ticket/48836 (master+48836)
git patch file (master) 0001-Revert-Ticket-48837-Replication-total-init-aborted.patch
Metadata Update from @nhosoi: - Issue set to the milestone: 1.3.5.4
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/1897
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
subscribe
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: Fixed)
Login to comment on this ticket.