Description of problem: Newly added object cenotaphs are part of the replication conflict feature (bz1274430). They are tombstones and, like normal tombstones, should not prevent the deletion of a parent.
Version-Release number of selected component (if applicable): 389-ds-base-1.3.7.5-13.el7.x86_64
How reproducible: Always
Steps to Reproduce:
1. Create a two-master replication topology
2. Add an entry with the nsContainer objectclass to store future entries
3. Add user1 and user2 entries on master1 and wait for replication to happen
4. Stop replication
5. Rename user1 on master1 to user3
6. Rename user2 on master2 to user3
7. Start replication
8. Remove all user entries
9. Try to remove the nsContainer entry
Actual results: The operation gives 'Operation not allowed on non-leaf'
Expected results: Container entry should be deleted
Metadata Update from @spichugi: - Custom field component adjusted to None - Custom field origin adjusted to None - Custom field reviewstatus adjusted to None - Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1539082 - Custom field type adjusted to None - Custom field version adjusted to None
<img alt="0001-Ticket-49551-correct-handling-of-numsubordinates-for.patch" src="/389-ds-base/issue/raw/files/97dd23502370e543d22cf5bfa6732f9fd80ad2a21481965fcfdb06b233098dc2-0001-Ticket-49551-correct-handling-of-numsubordinates-for.patch" />
The patch corrects the issues with tombstone handling and adds an informational message if entries cannot be deleted because of conflict children.
ldapdelete -r .... "cn=test_container,dc=example,dc=com"
ldap_delete: Operation not allowed on non-leaf (66)
        additional info: Entry has replication conflicts as children
Metadata Update from @lkrispen: - Custom field reviewstatus adjusted to review (was: None)
Still reviewing the patch. Just a quick remark regarding slapi_entry_has_conflict_children: why add it in slapi-plugin.h rather than slapi-private.h? Also it is missing slapi_free_search_results_internal and slapi_pblock_destroy.
> Still reviewing the patch. Just a quick remark regarding slapi_entry_has_conflict_children: why add it in slapi-plugin.h rather than slapi-private.h?
I thought of it as a generic function, like slapi_entry_has_children, that might be used by plugins, but if you prefer to hide it I have no strong opinion.
> Also it is missing slapi_free_search_results_internal and slapi_pblock_destroy
yes, that is missing, sorry
corrected the patch, but testing showed another issue: if the entry to be deleted has cenotaph tombstones as children (and no valid children), deletion fails because the entryrdn index code incorrectly treats the cenotaphs as valid children; it decides based only on the rdn.
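A minimal sketch of the check described above (hypothetical names, not the actual entryrdn index code): since cenotaphs carry a cenotaphid in their RDN, the child check can decide on the RDN alone whether a child is a real entry or a cenotaph tombstone.

```python
# Toy model of an entryrdn-style child check (hypothetical helper names,
# not 389-ds-base code). Cenotaph tombstones have an RDN containing a
# cenotaphid component, so a child list can be filtered on the RDN alone.

def has_valid_children(child_rdns):
    """Naive check: any child at all blocks deletion of the parent."""
    return len(child_rdns) > 0

def has_valid_children_fixed(child_rdns):
    """RDN-aware check: ignore cenotaph tombstones when deciding whether
    the parent still has real children."""
    return any(not rdn.lower().startswith("cenotaphid=") for rdn in child_rdns)

children = ["cenotaphid=1+uid=user1", "cenotaphid=2+uid=user2"]
print(has_valid_children(children))        # naive check blocks the delete
print(has_valid_children_fixed(children))  # cenotaphs alone do not block it
```

With only cenotaph children the fixed check lets the parent delete proceed, while any real child (e.g. `uid=user3`) still blocks it.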
<img alt="0001-Ticket-49551-v2-correct-handling-of-numsubordinates-.patch" src="/389-ds-base/issue/raw/files/e81577a307fc6e719e96776f355181223a7e58470157de7fb12326e705081ed9-0001-Ticket-49551-v2-correct-handling-of-numsubordinates-.patch" />
the latest patch does the cleanup after the search, allows deleting entries with cenotaph children, and uses the right csn in cenotaph purging
<img alt="0001-ticket-49551-additional-corrections.patch" src="/389-ds-base/issue/raw/files/47269f1fd0b744559f9ea1ed83bbed57467f867c5b726e95027c824cd3217a88-0001-ticket-49551-additional-corrections.patch" />
two smaller corrections to go with the patch, will be merged after review changes
The patch looks good. Because this is a complex area I may have naive questions, sorry in advance:
Shouldn't we prevent the creation of an entry with 'cenotaphid'? It could impact the 'entryrdn' index. If yes, I think it can be done in another ticket.
I was under the impression that deleting a tombstone (PARENTUPDATE_DELETE_TOMBSTONE) is pretty much a NOOP operation. Why decrease the tombstone_numsubordinates in parent_update_on_childchange?
in task_fixup_tombstone_thread: in previous versions of DS, conflicts did not have the ldapsubentry value. Won't the task miss them? To retrieve tombstones of conflicts do we need '(objectclass=*)'?
filter = "(&(nstombstonecsn=*)(objectclass=nsTombstone)(|(objectclass=*)(objectclass=ldapsubentry)))"
could it be replaced by
filter = "(&(nstombstonecsn=*)(objectclass=nsTombstone)(objectclass=ldapsubentry))"
Thanks for your comment, here are some answers:
1) yes, if there were an entry with cenotaphid as rdn it would affect the "child" check in the entryrdn index. But I think that check is unnecessary, since if the entry had real children it would have been rejected earlier; I just wanted to pass this check.
2) so you question the use of tombstone numsubordinates? That is a good question, I do not see where we really need them. But right now the code increases tombstonenumsubordinates when a tombstone is created, either by the delete of an entry or by creating a cenotaph. So when deleting a tombstone, directly or during tombstone purging, it should be decreased. Otherwise we could ignore it, and that might work (but not in this ticket).
3) the task is looking for all tombstones and was missing the conflicts. With your suggestion we would get the conflicts only, so I think we need the (|(objectclass=*)).
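The behavior behind answer 3 can be illustrated with a toy model (not server code) of how the server hides ldapsubentry entries: conflict entries carry objectclass ldapsubentry and are only returned when the filter explicitly asserts it, while plain tombstones must still match the filter's objectclass assertions.

```python
# Toy model of subentry visibility (illustrative only, not 389-ds code):
# entries with objectclass ldapsubentry are hidden unless the filter string
# explicitly asserts "objectclass=ldapsubentry".

def search(entries, required_ocs, filter_text):
    """required_ocs: objectclasses the filter requires an entry to have."""
    result = []
    for e in entries:
        matches = required_ocs <= e["objectclass"]
        hidden = ("ldapsubentry" in e["objectclass"]
                  and "objectclass=ldapsubentry" not in filter_text.lower())
        if matches and not hidden:
            result.append(e)
    return result

plain_tombstone    = {"objectclass": {"nstombstone"}}
conflict_tombstone = {"objectclass": {"nstombstone", "ldapsubentry"}}
entries = [plain_tombstone, conflict_tombstone]

old_filter = "(&(nstombstonecsn=*)(objectclass=nsTombstone))"
or_filter  = "(&(nstombstonecsn=*)(objectclass=nsTombstone)(|(objectclass=*)(objectclass=ldapsubentry)))"
sub_only   = "(&(nstombstonecsn=*)(objectclass=nsTombstone)(objectclass=ldapsubentry))"

print(len(search(entries, {"nstombstone"}, old_filter)))                # misses the conflict
print(len(search(entries, {"nstombstone"}, or_filter)))                 # finds both
print(len(search(entries, {"nstombstone", "ldapsubentry"}, sub_only)))  # conflicts only
```

This shows why the filter needs both the `(|(objectclass=*)...)` branch, to keep matching plain tombstones, and the `(objectclass=ldapsubentry)` assertion, to make the conflict subentries visible at all.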
Thanks Ludwig for your explanations. I had a hard time understanding the tombstone numsubordinates computation. I think it is correct, but the PARENTUPDATE_DELETE_TOMBSTONE case looks like a pretty rare corner case. If we have a bug there we will open a new ticket. That is a great patch. ACK
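For readers following the counter bookkeeping discussed above, here is a minimal sketch (hypothetical class and method names, not the server's parent_update_on_childchange): tombstonenumsubordinates goes up when a tombstone is created, by an entry delete or by a cenotaph add, and back down when the tombstone itself is deleted or purged.

```python
# Minimal model of a parent entry's subordinate counters (hypothetical
# names, not 389-ds-base code). numsubordinates counts live children;
# tombstone_numsubordinates counts tombstone children.

class Parent:
    def __init__(self):
        self.numsubordinates = 0
        self.tombstone_numsubordinates = 0

    def add_child(self):
        self.numsubordinates += 1

    def delete_child(self):
        # the deleted entry turns into a tombstone child
        self.numsubordinates -= 1
        self.tombstone_numsubordinates += 1

    def add_cenotaph(self):
        # cenotaphs are created directly as tombstones
        self.tombstone_numsubordinates += 1

    def delete_tombstone(self):
        # direct delete or purge (the PARENTUPDATE_DELETE_TOMBSTONE case)
        self.tombstone_numsubordinates -= 1

p = Parent()
p.add_child()
p.delete_child()       # user entry deleted -> tombstone
p.add_cenotaph()       # rename conflict left a cenotaph
p.delete_tombstone()   # purge both tombstones
p.delete_tombstone()
print(p.numsubordinates, p.tombstone_numsubordinates)  # -> 0 0
```

Without the decrement in the delete-tombstone case, the counter would drift upward over repeated purge cycles instead of returning to zero.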
Metadata Update from @tbordaz: - Custom field reviewstatus adjusted to ack (was: review)
I've just had a look and it seems okay to me too
<img alt="0001-Ticket-49551-v3-correct-handling-of-numsubordinates-.patch" src="/389-ds-base/issue/raw/files/e91b436f2c1f7684776b2ab8d3106468ef37edacd0b6febcd1c0fb9af8cdef0b-0001-Ticket-49551-v3-correct-handling-of-numsubordinates-.patch" />
consolidated patch
commit master: To ssh://pagure.io/389-ds-base.git a65e6da..ad83e55 master -> master
commit 1.3.7 To ssh://pagure.io/389-ds-base.git b68d3cb..dcec43c 389-ds-base-1.3.7 -> 389-ds-base-1.3.7
Metadata Update from @lkrispen: - Issue close_status updated to: fixed - Issue status updated to: Closed (was: Open)
Coverity scan found memory leak with this fix:
389-ds-base-1.3.7.5/ldap/servers/plugins/replication/repl5_replica.c:3060: leaked_storage: Variable "deletion_csn" going out of scope leaks the storage it points to.
Metadata Update from @mreynolds: - Issue status updated to: Open (was: Closed)
<img alt="0001-fix-memory-leaK.patch" src="/389-ds-base/issue/raw/files/cdc2326eee8cba0c4ba7cd4d7cc326607da3901d2e39b7087161d60084c2b784-0001-fix-memory-leaK.patch" />
Great Coverity tool! Ack for the memory leak patch
To ssh://pagure.io/389-ds-base.git ad83e55..d970649 master -> master
To ssh://pagure.io/389-ds-base.git dcec43c..c1183e5 389-ds-base-1.3.7 -> 389-ds-base-1.3.7
compiler warning now
389-ds-base-1.3.7.5/ldap/servers/plugins/replication/repl5_replica.c:3062:9: warning: passing argument 1 of 'csn_free' from incompatible pointer type [enabled by default]
hmm, I did check for warnings, but didn't get this one. Maybe it's the "const" in the declaration of deletion_csn.
Could you try removing the "const" or changing it to csn_free((CSN **)&deletion_csn)?
Yeah it was just a const issue. Harmless really. I still did the build with the warning. When you get a chance just fix it please. Thanks!
FYI, this is my configure command that picks up all the warnings:
CFLAGS='-g -pipe -Wall -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' CXXFLAGS='-g -pipe -Wall -O0 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' ../389-ds-base/configure ... ...
if I do a prefix build I do get the warning, but if I just do "make rpms" I don't; I don't know where the options come from for this.
Here is a patch to make it clean:
<img alt="0001-fix-compiler-warning-for-const-csn.patch" src="/389-ds-base/issue/raw/files/a7b51a0bee5b4ec8c3268679566432399201cb6927d1ad3de8fe745fbd3899dd-0001-fix-compiler-warning-for-const-csn.patch" />
Ahh okay. So "make rpms" uses 389-ds-base/rpm/389-ds-base.spec.in This does an optimized build of DS and does not use any special warning flags.
Patch looks good to me - I no longer see any warnings. Thanks Ludwig!
To ssh://pagure.io/389-ds-base.git d970649..7a54d9b master -> master
To ssh://pagure.io/389-ds-base.git c1183e5..ce3231b 389-ds-base-1.3.7 -> 389-ds-base-1.3.7
Metadata Update from @mreynolds: - Issue set to the milestone: 1.3.7.0
<img alt="0001-Additional-fix-for-ticket-49551-correctly-handle-sub.patch" src="/389-ds-base/issue/raw/files/4b1a6a1352d1b57c6380721d407d4b8979db74a9bfbdfb9285a6c1f15b273c9b-0001-Additional-fix-for-ticket-49551-correctly-handle-sub.patch" />
with this patch the two-master tests pass (after some tweaking)
@spichugi after resuming replication, these tests can take some time to converge. I had to change the timeout for waiting until replication is in sync, and for the group test I even had to add a sleep in between (does the repl check only check one direction?)
```diff
@@ -443,6 +443,8 @@ class TestTwoMasters:
     repl.test_replication_topology(topology_m2)

+    time.sleep(30)
+
     group_dns_m1 = [group.dn for group in test_groups_m1.list()]
     group_dns_m2 = [group.dn for group in test_groups_m2.list()]

@@ -723,7 +725,7 @@ class TestTwoMasters:
     topology_m2.resume_all_replicas()

-    repl.test_replication_topology(topology_m2)
+    repl.test_replication_topology(topology_m2, timeout=60)
```
The three-master test still fails, as a DEL seems to be looping on err=1, but later succeeds. I have not yet found whether it is an issue in DS or in the tests.
test_replication_topology goes through all combinations in the topology. https://pagure.io/389-ds-base/blob/master/f/src/lib389/lib389/replica.py#_1836
Ok, I'll add the timeouts to https://pagure.io/389-ds-base/pull-request/49558
could I get some feedback on the latest patch added? I think it should be applied
@lkrispen LGTM, ack
To ssh://pagure.io/389-ds-base.git 14e413a..ec4a669 master -> master
To ssh://pagure.io/389-ds-base.git 8a3b9f1..4fdae3c 389-ds-base-1.3.7 -> 389-ds-base-1.3.7
@lkrispen is there something missing in this ticket? The associated BZ is now closed/delivered.
well, there are still some artificial scenarios which can lead to inconsistent tombstones/conflicts, but we decided it is not worth the effort to try to resolve these.
if the BZ is closed, we should close this ticket as well and open a new one if one of these scenarios is seen in the wild
That sounds reasonable to me @lkrispen. Have we documented those scenarios somewhere?
somewhere :-( I will find them
ok, they are already in the testcases: suites/replication/conflict_resolve_test.py
there are a couple of pytest.xfail
to see the issues you need to run the tests, keep the instances and run repl-check. The problem cases are all around nested conflict entries combined with delete or modrdn operations.
Great, in that case I'm happy :)
Metadata Update from @mreynolds: - Issue close_status updated to: fixed - Issue status updated to: Closed (was: Open)
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/2610
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
Thank you for understanding. We apologize for any inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: fixed)