#49551 Child entry cenotaphs should not prevent the deletion of the parent
Closed: wontfix 4 years ago by mreynolds. Opened 6 years ago by spichugi.

Description of problem:
Newly added object cenotaphs are part of the replication conflict feature (bz1274430). They are tombstones and, like normal tombstones, should not prevent the deletion of their parent.

Version-Release number of selected component (if applicable):
389-ds-base-1.3.7.5-13.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a two-master replication topology
2. Add an entry with the nsContainer objectclass to store future entries
3. Add user1 and user2 entries on master1 and wait for replication to happen
4. Stop replication
5. Rename user1 on master1 to user3
6. Rename user2 on master2 to user3
7. Start replication
8. Remove all user entries
9. Try to remove nsContainer entry

Actual results:
The operation gives 'Operation not allowed on non-leaf'

Expected results:
The container entry should be deleted


Metadata Update from @spichugi:
- Custom field component adjusted to None
- Custom field origin adjusted to None
- Custom field reviewstatus adjusted to None
- Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1539082
- Custom field type adjusted to None
- Custom field version adjusted to None

6 years ago

The patch corrects the issues with tombstone handling and adds an informational message if entries cannot be deleted because of conflict children:

 ldapdelete -r .... "cn=test_container,dc=example,dc=com"
 ldap_delete: Operation not allowed on non-leaf (66)
      additional info: Entry has replication conflicts as children

Metadata Update from @lkrispen:
- Custom field reviewstatus adjusted to review (was: None)

6 years ago

Still reviewing the patch.
Just a quick remark regarding slapi_entry_has_conflict_children:
why add it in slapi-plugin.h rather than slapi-private.h?
Also, it is missing calls to slapi_free_search_results_internal and slapi_pblock_destroy.

> Still reviewing the patch.
> Just a quick remark regarding slapi_entry_has_conflict_children:
> why add it in slapi-plugin.h rather than slapi-private.h?

I thought it is a generic function, like slapi_entry_has_children, and might be used by plugins, but if you prefer to hide it, I have no strong opinion.

> Also, it is missing calls to slapi_free_search_results_internal and slapi_pblock_destroy.

Yes, that is missing, sorry.

Corrected the patch, but testing showed another issue: if the entry to be deleted has cenotaph tombstones as children (and no valid children), deletion fails because the entryrdn index code incorrectly treats the cenotaphs as valid children; it decides based only on the RDN.

The latest patch does the cleanup after the search, allows deleting entries with cenotaph children, and uses the right CSN in cenotaph purging.
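The entryrdn misdetection described above can be modeled in a few lines. This is an illustrative sketch, not the actual entryrdn index code: the RDN strings and both helper names are invented for the example. The point is only that a child check deciding purely on the presence of child RDN records must explicitly exempt cenotaph RDNs before rejecting the delete:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: cenotaph tombstone RDNs carry a cenotaphid
 * component (the exact internal format may differ). */
static bool rdn_is_cenotaph(const char *rdn)
{
    return strstr(rdn, "cenotaphid=") != NULL;
}

/* A "has valid children" check in the spirit of the fix: child RDN
 * records that are only cenotaphs must not block the delete. */
static bool has_valid_children(const char *child_rdns[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!rdn_is_cenotaph(child_rdns[i])) {
            return true; /* a real child: reject the delete */
        }
    }
    return false; /* only cenotaphs (or none): allow the delete */
}
```

A check that returned true whenever `n > 0` would reproduce the bug: the container with only cenotaph children would be treated as a non-leaf.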

The patch looks good.
Because this is a complex area I may have some dummy questions, sorry in advance:

  • Shouldn't we prevent the creation of an entry with 'cenotaphid'? It could impact the 'entryrdn' index. If yes, I think it can be done in another ticket.

  • I was under the impression that deleting a tombstone (PARENTUPDATE_DELETE_TOMBSTONE) is pretty much a NOOP. Why decrease the tombstone_numsubordinate in parent_update_on_childchange?

  • In task_fixup_tombstone_thread: in previous versions of DS, conflicts did not have the ldapsubentry value. Won't the task miss them?
    To retrieve conflict tombstones, do we need '(objectclass=*)'?

        filter = "(&(nstombstonecsn=*)(objectclass=nsTombstone)(|(objectclass=*)(objectclass=ldapsubentry)))"

Could it be replaced by

        filter = "(&(nstombstonecsn=*)(objectclass=nsTombstone)(objectclass=ldapsubentry))"

Thanks for your comments; here are some answers:

1) Yes, if there were an entry with cenotaphid as its RDN it would affect the "child" check in the entryrdn index. But I think that check is unnecessary: if the entry had real children, it would have been rejected earlier. I just wanted to pass this check.

2) So you question the use of tombstone numsubordinates? That is a good question; I do not see where we really need them. But right now the code increases tombstonenumsubordinates when a tombstone is created, either by the delete of an entry or by creating a cenotaph. So when deleting a tombstone, directly or during tombstone purging, it should be decreased. Otherwise we could ignore it, and that might work (but not in this ticket).

3) The task is looking for all tombstones and was missing the conflicts. With your suggestion we would get only the conflicts, so I think we need the (objectclass=*) alternative in the OR.
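The filter discussion above can be illustrated with a small model. This is a sketch with invented flag and function names, not server code; the point is that the (objectclass=*) alternative keeps ordinary tombstones in the result set, while the ldapsubentry clause is what pulls in the conflict tombstones:

```c
#include <stdbool.h>

/* Minimal model of the entries the fixup task may encounter. */
typedef struct {
    bool has_tombstonecsn;
    bool is_nstombstone;
    bool is_ldapsubentry; /* set on conflict tombstones */
} entry_t;

/* (&(nstombstonecsn=*)(objectclass=nsTombstone)
 *   (|(objectclass=*)(objectclass=ldapsubentry)))
 * The OR branch is satisfied by every entry, so both plain and
 * conflict tombstones match. */
static bool matches_full_filter(const entry_t *e)
{
    return e->has_tombstonecsn && e->is_nstombstone;
}

/* (&(nstombstonecsn=*)(objectclass=nsTombstone)(objectclass=ldapsubentry))
 * Without the OR, only conflict tombstones match and plain
 * tombstones are missed by the task. */
static bool matches_subentry_only_filter(const entry_t *e)
{
    return e->has_tombstonecsn && e->is_nstombstone && e->is_ldapsubentry;
}
```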

Thanks Ludwig for your explanations.
I had a hard time understanding the tombstone numsubordinates computation. I think it is correct, but the PARENTUPDATE_DELETE_TOMBSTONE case looks like a pretty rare corner case. If we have a bug there we will open a new ticket.
That is a great patch. ACK

Metadata Update from @tbordaz:
- Custom field reviewstatus adjusted to ack (was: review)

6 years ago

I've just had a look and it seems okay to me too

Metadata Update from @lkrispen:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

6 years ago

Coverity scan found memory leak with this fix:

389-ds-base-1.3.7.5/ldap/servers/plugins/replication/repl5_replica.c:3060: leaked_storage: Variable "deletion_csn" going out of scope leaks the storage it points to.
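The leak pattern Coverity reports is the usual one: a heap-allocated CSN goes out of scope without being freed. A minimal sketch, using a stand-in CSN type and invented helper names rather than the real 389-ds code:

```c
#include <stdlib.h>

typedef struct { char value[20]; } CSN; /* stand-in for 389-ds's CSN */

/* Allocates a CSN, as e.g. a dup/get helper would. */
static CSN *example_get_deletion_csn(void)
{
    return calloc(1, sizeof(CSN));
}

/* Before the fix, deletion_csn simply went out of scope at the end of
 * the function, leaking the allocation; the fix frees it on every
 * exit path. */
static int example_purge(void)
{
    CSN *deletion_csn = example_get_deletion_csn();
    if (deletion_csn == NULL) {
        return -1;
    }
    /* ... use deletion_csn to pick the purge point ... */
    free(deletion_csn); /* the fix: release before leaving scope */
    return 0;
}
```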

Metadata Update from @mreynolds:
- Issue status updated to: Open (was: Closed)

6 years ago

Great Coverity tool!
Ack for the memory leak patch.

To ssh://pagure.io/389-ds-base.git
ad83e55..d970649 master -> master

To ssh://pagure.io/389-ds-base.git
dcec43c..c1183e5 389-ds-base-1.3.7 -> 389-ds-base-1.3.7

compiler warning now

389-ds-base-1.3.7.5/ldap/servers/plugins/replication/repl5_replica.c:3062:9: warning: passing argument 1 of 'csn_free' from incompatible pointer type [enabled by default]

Hmm, I did check for warnings, but didn't get this one. Maybe it's the "const" in the declaration of deletion_csn.

Could you try to remove the "const", or change the call to cast the const away:
csn_free((CSN **)&deletion_csn)

> Hmm, I did check for warnings, but didn't get this one. Maybe it's the "const" in the declaration of deletion_csn.
> Could you try to remove the "const", or change the call to csn_free((CSN **)&deletion_csn)?

Yeah, it was just a const issue. Harmless, really. I still did the build with the warning. When you get a chance, please fix it. Thanks!
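The warning and the cast fix can be reproduced with a stand-in CSN type. This is a sketch, not the real repl5_replica.c code: csn_free is mimicked here just to show why `&deletion_csn` has type `const CSN **` (incompatible with the `CSN **` parameter) and why casting the address, rather than taking the address of a cast, is the valid C form:

```c
#include <stdlib.h>

typedef struct { int stamp; } CSN; /* stand-in for 389-ds's CSN */

/* Mimics csn_free(CSN **csn): frees *csn and NULLs the caller's pointer. */
static void csn_free(CSN **csn)
{
    if (csn != NULL && *csn != NULL) {
        free(*csn);
        *csn = NULL;
    }
}

static const CSN *example_use_and_free(void)
{
    const CSN *deletion_csn = calloc(1, sizeof(CSN));
    /* csn_free(&deletion_csn) would warn: `const CSN **` is incompatible
     * with the `CSN **` parameter. Casting the address silences it: */
    csn_free((CSN **)&deletion_csn);
    return deletion_csn; /* NULL after the free */
}
```

Note that `&(CSN *)deletion_csn` would not compile: a cast yields an rvalue, so its address cannot be taken. Dropping the `const` from the declaration is the cleaner alternative when the variable is in fact modified.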

FYI, this is my configure command that picks up all the warnings:

CFLAGS='-g -pipe -Wall -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' CXXFLAGS='-g -pipe -Wall -O0 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' ../389-ds-base/configure ... ...

If I do a prefix build I do get the warning, but if I just do "make rpms" I don't; I don't know where the options come from for this.

Here is a patch to make it clean:

0001-fix-compiler-warning-for-const-csn.patch

> If I do a prefix build I do get the warning, but if I just do "make rpms" I don't; I don't know where the options come from for this.

Ahh, okay. "make rpms" uses 389-ds-base/rpm/389-ds-base.spec.in, which does an optimized build of DS and does not use any special warning flags.

Patch looks good to me - I no longer see any warnings. Thanks Ludwig!

To ssh://pagure.io/389-ds-base.git
d970649..7a54d9b master -> master

To ssh://pagure.io/389-ds-base.git
c1183e5..ce3231b 389-ds-base-1.3.7 -> 389-ds-base-1.3.7

Metadata Update from @mreynolds:
- Issue set to the milestone: 1.3.7.0

6 years ago

With this patch the two-master tests pass (after some tweaking).

@spichugi after resuming replication, these tests can take some time to converge. I had to change the timeout for waiting until replication is in sync, and for the group test I even had to add a sleep in between (does the repl check only check one direction?)

 @@ -443,6 +443,8 @@ class TestTwoMasters:

          repl.test_replication_topology(topology_m2)

 +        time.sleep(30)
 +
          group_dns_m1 = [group.dn for group in test_groups_m1.list()]
          group_dns_m2 = [group.dn for group in test_groups_m2.list()]


 @@ -723,7 +725,7 @@ class TestTwoMasters:

          topology_m2.resume_all_replicas()

 -        repl.test_replication_topology(topology_m2)
 +        repl.test_replication_topology(topology_m2, timeout=60)

The three-master test still fails: a DEL seems to loop on err=1, but later succeeds.
I have not yet found whether it is an issue in DS or in the tests.

Could I get some feedback on the latest patch? I think it should be added.

To ssh://pagure.io/389-ds-base.git
14e413a..ec4a669 master -> master

To ssh://pagure.io/389-ds-base.git
8a3b9f1..4fdae3c 389-ds-base-1.3.7 -> 389-ds-base-1.3.7

@lkrispen is there something missing in this ticket? The associated BZ is now closed/delivered.

Well, there are still some artificial scenarios that can lead to inconsistent tombstones/conflicts, but we decided it is not worth the effort to try to resolve them.

If the BZ is closed, we should close this ticket as well, and create a new one if one of these scenarios is ever seen in the wild.

That sounds reasonable to me @lkrispen. Have we documented those scenarios somewhere?

> That sounds reasonable to me @lkrispen. Have we documented those scenarios somewhere?

somewhere :-(
I will find them

OK, they are already in the test cases: suites/replication/conflict_resolve_test.py

There are a couple of pytest.xfail markers there.

To see the issues you need to run the tests, keep the instances, and run repl-check.
The problem cases are all around nested conflict entries combined with DELETE or MODRDN operations.

Great, in that case I'm happy :)

Metadata Update from @mreynolds:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

4 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/2610

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: fixed)

3 years ago
