#47830 usn tombstone entry not properly created
Closed: wontfix. Opened 9 years ago by lkrispen.

When testing IPA with the head of the 1.3.2 branch, strange entries can be created.

There exist entries like:
dn: nsuniqueid=31db6ead-fc5911e3-b087920e-24b83048,fqdn=testhost1.idm.lab.eng.brq.redhat.com,cn=computers,cn=accounts,dc=idm,dc=lab,dc=eng,dc=brq,dc=redhat,dc=com

but they have no objectclass: nsTombstone, so the entries remain visible and cannot be deleted. The error log contains messages like:

[25/Jun/2014:13:10:34 +0200] - tombstone entry nsuniqueid=31db6ead-fc5911e3-b087920e-24b83048,fqdn=testhost1.idm.lab.eng.brq.redhat.com,cn=computers,cn=accounts,dc=idm,dc=lab,dc=eng,dc=brq,dc=redhat,dc=com failed to add to the cache: -1

With version 1.3.2.16 this problem does not exist.

Replication is not enabled; the tombstone entries are created because the USN plugin is enabled.

The deletion fails because a tombstone entry should not be deleted by a client operation, and in this case the detection of a tombstone is based only on the RDN (whether it contains nsuniqueid).
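The mismatch between the two ways of recognizing a tombstone can be illustrated with a small sketch. This is a toy model and not the server's code: the function names `looks_like_tombstone_by_rdn` and `is_tombstone_by_objectclass` are hypothetical, entries are plain dicts, and the object classes of the broken entry are assumed for illustration.

```python
def looks_like_tombstone_by_rdn(dn):
    """RDN-based check: is the attribute type of the leftmost RDN nsuniqueid?

    Naive split on ',' and '='; escaped commas in RDN values are not handled.
    """
    first_rdn = dn.split(",", 1)[0]
    attr_type = first_rdn.split("=", 1)[0].strip().lower()
    return attr_type == "nsuniqueid"


def is_tombstone_by_objectclass(entry):
    """Objectclass-based check: does the entry carry objectClass nsTombstone?"""
    return any(v.lower() == "nstombstone" for v in entry.get("objectclass", []))


# A well-formed tombstone: nsuniqueid RDN and the nsTombstone objectclass.
good = {
    "dn": "nsuniqueid=258b0a82-fd5a11e3-a1c8dc4c-747b35a8,uid=tuser1,o=my.com",
    "objectclass": ["top", "person", "nsTombstone"],
}

# A broken entry as described in this ticket: nsuniqueid RDN, but the
# nsTombstone objectclass is missing (object classes here are illustrative).
broken = {
    "dn": "nsuniqueid=258b0a82-fd5a11e3-a1c8dc4c-747b35a8,uid=tuser1,o=my.com",
    "objectclass": ["top", "person"],
}

for entry in (good, broken):
    print(looks_like_tombstone_by_rdn(entry["dn"]),
          is_tombstone_by_objectclass(entry))
```

For the broken entry the two checks disagree: the RDN check treats it as a tombstone (so a client delete is refused), while the objectclass check does not (so the entry stays visible in ordinary searches).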

Thank you, Ludwig. Let me take a look...

I cannot reproduce the problem... Here's what I did:
Built a debug build from 389-ds-base-1.3.2 branch.
Installed the standalone DS.
Enabled USN.
Added 3 entries:
{{{
adding new entry uid=tuser0, o=my.com
adding new entry uid=tuser1, o=my.com
adding new entry uid=tuser2, o=my.com
}}}
Then I deleted uid=tuser1, o=my.com. The resulting tombstone looks like this (please note that "objectClass: nsTombstone" exists):
{{{
dn: nsuniqueid=258b0a82-fd5a11e3-a1c8dc4c-747b35a8,uid=tuser1,o=my.com
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
objectClass: nsTombstone
cn: A
sn: B
givenName: X
uid: tuser1
userPassword:: e1NTSEF9M3Q1WDMra2hWYTZKci9QU0R1SmlueHRFSEl5ZTROZTYwUnVPdHc9PQ==
nsParentUniqueId: 13f53300-fd5911e3-932af19a-6c36a85b
entryusn: 21
}}}
And there is no problem deleting it:
{{{
$ ldapdelete [...]
nsuniqueid=258b0a82-fd5a11e3-a1c8dc4c-747b35a8,uid=tuser1,o=my.com
$ ldapsearch [...] -b o=my.com '(objectclass=nstombstone)'
$
}}}
There should be some more condition(s) to cause this bug...

Replying to [comment:3 nhosoi]:

> There should be some more condition(s) to cause this bug...

Yes, there is more needed to trigger this:

- The Managed Entry plugin has to be enabled.
- The entry to be deleted has a "managedBy" reference to itself.
- The delete txn has to be retried, so that the preop plugins are called a second time.

And this is not enough:

- The retro changelog has to be enabled.
- There has to be another backend, and changes have to be applied to both backends in parallel.

When I ran the attached scripts in a loop in parallel, I got the error in 2 of 100 deletes.

Hi Ludwig, so far I have had no luck reproducing the problem. I used a build from master as well as from the 389-ds-base-1.3.2 branch, and I went to your test system vm-111 and borrowed the contents, but I could not create a broken USN tombstone.

BTW, I could not find the broken USN tombstone. Has it been wiped out? E.g., the count of tombstone DNs and the count of "objectClass: nsTombstone" lines are identical (I exported the db with db2ldif -r ...):
{{{
egrep -i "dn: nsuniqueid=" /var/lib/dirsrv/slapd-IDM-LAB-ENG-BRQ-REDHAT-COM/ldif/IDM-LAB-ENG-BRQ-REDHAT-COM-userRoot-2014_06_28_020322.ldif | wc
391     782   30889
egrep -i "objectClass: nsTombstone" /var/lib/dirsrv/slapd-IDM-LAB-ENG-BRQ-REDHAT-COM/ldif/IDM-LAB-ENG-BRQ-REDHAT-COM-userRoot-2014_06_28_020322.ldif | wc
391     782    9775
}}}
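Comparing two totals only shows whether the counts match. A per-entry check over the exported LDIF can name the offending DNs directly. The following is a minimal sketch under stated assumptions: `find_broken_tombstones` is a hypothetical helper, the parsing is deliberately simplified (base64-encoded `dn::` values are not decoded), and it is not part of any 389-ds tooling.

```python
import re
from typing import List


def find_broken_tombstones(ldif_text: str) -> List[str]:
    """Return DNs whose leftmost RDN is nsuniqueid but which lack
    objectClass: nsTombstone. Simplified LDIF parsing, for illustration only."""
    # Unfold LDIF continuation lines (a line break followed by one space).
    unfolded = re.sub(r"\r?\n ", "", ldif_text)
    broken = []
    # Entries are separated by one or more blank lines.
    for block in re.split(r"\r?\n\s*\r?\n", unfolded):
        dn = None
        has_tombstone_oc = False
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if not sep:
                continue
            key = key.strip().lower()
            value = value.strip()
            if key == "dn":
                dn = value
            elif key == "objectclass" and value.lower() == "nstombstone":
                has_tombstone_oc = True
        if dn and dn.lower().startswith("nsuniqueid=") and not has_tombstone_oc:
            broken.append(dn)
    return broken


# Tiny made-up sample: one proper tombstone, one broken one, one live entry.
sample = (
    "dn: nsuniqueid=aaa,uid=tuser1,o=my.com\n"
    "objectClass: top\n"
    "objectClass: nsTombstone\n"
    "\n"
    "dn: nsuniqueid=bbb,uid=tuser2,o=my.com\n"
    "objectClass: top\n"
    "\n"
    "dn: uid=tuser0,o=my.com\n"
    "objectClass: top\n"
)
print(find_broken_tombstones(sample))
```

On a real export, the file contents would be read and passed in, for example `find_broken_tombstones(open(path).read())`, where `path` points to the LDIF file produced by the export.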

Replying to [comment:6 nhosoi]:

> BTW, I could not find the broken USN tombstone. Has it been wiped out? E.g., the count of the tombstone DN and the count of the "objectClass: nsTombstone" are identical... (I exported the db with db2ldif -r ...)

That is another piece of the puzzle: the corrupted entry is not in the database, only in the entry cache. That also explains why the tests started to work again after an upgrade/downgrade of 389: it was just due to the restart.

So we know that the entry cache holds an entry with dn=nsuniqueid... and the original entry ID, and any further lookup will find this entry.
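Why a restart hides the problem can be shown with a toy model of a cache sitting in front of a database. Everything here is an assumption made for illustration: `ToyBackend`, its methods, the entry ID 7, and the contents of both the cached and the stored entry are hypothetical and do not reflect the actual 389-ds entry cache code.

```python
class ToyBackend:
    """Toy model: a dict-based 'database' with an in-memory entry cache."""

    def __init__(self, database):
        self.database = database  # stands in for on-disk entries, keyed by entry ID
        self.cache = {}           # stands in for the in-memory entry cache

    def lookup(self, entry_id):
        # The cache is consulted first, so a bad cached entry shadows the database.
        if entry_id in self.cache:
            return self.cache[entry_id]
        entry = self.database.get(entry_id)
        if entry is not None:
            self.cache[entry_id] = entry
        return entry

    def restart(self):
        # A restart starts with an empty cache; the database is untouched.
        self.cache = {}


# Assumed state: the database holds a proper tombstone under entry ID 7 ...
stored = {"dn": "nsuniqueid=aaa,uid=tuser1,o=my.com",
          "objectclass": ["top", "nsTombstone"]}
backend = ToyBackend({7: stored})

# ... while the cache holds a corrupted copy under the same entry ID:
# tombstone DN, but no nsTombstone objectclass.
backend.cache[7] = {"dn": "nsuniqueid=aaa,uid=tuser1,o=my.com",
                    "objectclass": ["top"]}

before = backend.lookup(7)   # served from the cache: the corrupted entry
backend.restart()
after = backend.lookup(7)    # served from the database: the proper tombstone
print("nsTombstone" in before["objectclass"], "nsTombstone" in after["objectclass"])
```

In this model an export of the database never shows the corrupted entry, which is consistent with the identical counts seen in the db2ldif output.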

BTW, I tested the error check relaxation fix, but the result is still the same.

Thank you for testing the patch, Ludwig! And the fact you pointed out, that the problem exists only in memory, could be a very good clue. I'll continue working on it...

Noriko, the attached patch worked for me. Can you have a look and explain why this code was removed?

I removed them while I was debugging the crash in the entry cache, I think.

Yeah, it looks like the code you pointed out is needed... But let me run my test again with the code and check that the crash does not come back...

Thanks, Ludwig!
--noriko

Replying to [comment:9 lkrispen]:

> Noriko, the attached patch worked for me, can you have a look and explain why this code was removed?

Thank you so much, Ludwig. I applied your patch and ran the conflict test, which did not cause the crash. I'm running some more acceptance tests and will then push it to the branches. Since the bug was introduced by 47750, I'd like to reopen 47750 and close this bug as a duplicate of it to make the patch maintenance easier. Is that okay?

Metadata Update from @nhosoi:
- Issue assigned to nhosoi
- Issue set to the milestone: 1.3.2.18

7 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/1161

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Duplicate)

3 years ago
