#49967 entry cache corruption after failed MODRDN
Closed: fixed 9 months ago Opened 9 months ago by lkrispen.

Issue Description

If a MODRDN fails with an operations error because a plugin fails, the MODRDN operation is rejected, but the entry cache is left in a mixed state.

Package Version and Platform

Tested with master and 1.3.6.

Steps to reproduce

It is hard to reproduce; so far I could only do it by manipulating the plugin return code in gdb:

  • attach gdb to slapd
  • set a breakpoint in ldbm_back_modrdn
  • start the modrdn operation
  • set a breakpoint before calling the txn postop plugins

    if ((retval = plugin_call_plugins(pb, SLAPI_PLUGIN_BE_TXN_POST_MODRDN_FN))) {

  • step into plugin_call_plugins

  • set rc=-1
  • jump to return statement

An example of a test is here:

 ldapmodify -h localhost  -p 39001 -x -D "cn=directory manager" -w password
 dn: cn=yyy,cn=sub3,ou=people,dc=example,dc=com
 changetype: modrdn
 newrdn: cn=y3
 deleteoldrdn: 0

 modifying rdn of entry "cn=yyy,cn=sub3,ou=people,dc=example,dc=com"
 ldap_rename: Operations error (1)

Now search for the original DN:

 ldapsearch -LLL -o ldif-wrap=no -h localhost  -p 39001 -x -D "cn=directory manager" -w password -b "cn=yyy,cn=sub3,ou=people,dc=example,dc=com" -s base

No result. Now try the attempted new RDN:

 ldapsearch -LLL -o ldif-wrap=no -h localhost  -p 39001 -x -D "cn=directory manager" -w password -b "cn=y3,cn=sub3,ou=people,dc=example,dc=com" -s base
 dn: cn=y3,cn=sub3,ou=People,dc=example,dc=com
 objectClass: person
 objectClass: top
 sn: yyy,cn=sub3
 description: test-yyy,cn=sub3
 cn: yyy

NOTE: the entry is returned and its DN contains the attempted new value, but the attribute value for cn is still the original one.

Now restart the server: the search for the original DN returns the correct result and the search for the new RDN returns nothing, as expected.
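
The symptoms above can be illustrated with a small toy model. This is only a sketch, not the server's cache code: `ToyServer`, its tables, and the `rollback_cache` flag are hypothetical names, and the assumption that the cache is switched to the new DN before the post-op plugins run is made purely for illustration. It shows how a rename that is rejected without undoing its in-memory changes produces the same search results as in the report, and why a restart (which empties the cache) hides the problem.

```python
class ToyServer:
    """Toy model: committed 'database' tables plus an in-memory entry cache."""

    def __init__(self):
        self.db_by_dn = {}     # committed: normalized DN -> entry id
        self.db_entries = {}   # committed: entry id -> (dn, attrs)
        self.cache_by_dn = {}  # cache: normalized DN -> cached entry
        self.cache_by_id = {}  # cache: entry id -> cached entry
        self.next_id = 1

    def add(self, dn, attrs):
        eid = self.next_id
        self.next_id += 1
        self.db_by_dn[dn.lower()] = eid
        self.db_entries[eid] = (dn, {k: list(v) for k, v in attrs.items()})
        return eid

    def restart(self):
        # A restart drops the cache; only committed data survives.
        self.cache_by_dn.clear()
        self.cache_by_id.clear()

    def _load(self, eid):
        entry = self.cache_by_id.get(eid)
        if entry is None:
            dn, attrs = self.db_entries[eid]
            entry = {"id": eid, "dn": dn,
                     "attrs": {k: list(v) for k, v in attrs.items()}}
            self.cache_by_id[eid] = entry
            self.cache_by_dn[dn.lower()] = entry
        return entry

    def search_base(self, dn):
        key = dn.lower()
        entry = self.cache_by_dn.get(key)
        if entry is None:
            eid = self.db_by_dn.get(key)
            if eid is None:
                return None
            entry = self._load(eid)
        # Only return the entry if its DN matches the requested base.
        return entry if entry["dn"].lower() == key else None

    def modrdn(self, dn, new_rdn, postop_rc=0, rollback_cache=False):
        eid = self.db_by_dn[dn.lower()]
        entry = self._load(eid)
        old_dn = entry["dn"]
        # Naive DN handling: assumes no escaped commas in the RDN.
        new_dn = new_rdn + "," + old_dn.split(",", 1)[1]
        # Assumption: the cache is switched to the new DN early.
        del self.cache_by_dn[old_dn.lower()]
        entry["dn"] = new_dn
        self.cache_by_dn[new_dn.lower()] = entry
        if postop_rc != 0:
            # Post-op plugin failed: nothing is committed to the database.
            if rollback_cache:
                del self.cache_by_dn[new_dn.lower()]
                entry["dn"] = old_dn
                self.cache_by_dn[old_dn.lower()] = entry
            return 1  # LDAP operations error
        # Success path: commit the new DN and the new RDN value.
        del self.db_by_dn[old_dn.lower()]
        self.db_by_dn[new_dn.lower()] = eid
        attr, value = new_rdn.split("=", 1)
        entry["attrs"].setdefault(attr, []).append(value)
        self.db_entries[eid] = (new_dn, {k: list(v) for k, v in entry["attrs"].items()})
        return 0


def reproduce(rollback_cache=False):
    old_dn = "cn=yyy,cn=sub3,ou=people,dc=example,dc=com"
    new_dn = "cn=y3,cn=sub3,ou=people,dc=example,dc=com"
    srv = ToyServer()
    srv.add(old_dn, {"cn": ["yyy"]})
    srv.search_base(old_dn)  # pull the entry into the cache
    result = {"rc": srv.modrdn(old_dn, "cn=y3", postop_rc=-1,
                               rollback_cache=rollback_cache)}
    result["old_before_restart"] = srv.search_base(old_dn)
    result["new_before_restart"] = srv.search_base(new_dn)
    srv.restart()
    result["old_after_restart"] = srv.search_base(old_dn)
    result["new_after_restart"] = srv.search_base(new_dn)
    return result
```

Calling `reproduce()` gives the mixed state (original DN not found, new DN found with the original cn), while `reproduce(rollback_cache=True)` shows one way the cache could be restored on failure. The latter is not a description of the actual patch.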


@lkrispen We could make a plugin that given certain operation conditions triggers failures for us? That way we could create deterministic test failure cases like this.

Metadata Update from @firstyear:
- Custom field component adjusted to None
- Custom field origin adjusted to None
- Custom field reviewstatus adjusted to None
- Custom field type adjusted to None
- Custom field version adjusted to None

9 months ago

Metadata Update from @tbordaz:
- Issue assigned to tbordaz

9 months ago

Metadata Update from @mreynolds:
- Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1518320

9 months ago

This patch fixes an entry cache crash a customer was running into! I have a reproducer script for the crash as well.

It is good news that the patch also fixes the crash (https://pagure.io/389-ds-base/issue/49905).
Just a guess at how 49905 happens: if the MODRDN fails, the target entry should be on the LRU, but if a reference to it remains in the DN cache, there is a chance that a later cache lookup compares the DN of the failed MODRDN's target entry with the lookup DN. By that time the target entry may already have been freed.
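
The guess above can be sketched in a toy model. Everything here is hypothetical (`CachedEntry`, `ToyDnCache`, `evict`, and the `freed` flag that stands in for freed memory); it is not the server's code and only illustrates why a DN-table reference that outlives the entry is dangerous during a later lookup.

```python
class CachedEntry:
    def __init__(self, dn):
        self.dn = dn
        self.freed = False  # stands in for memory that has been released

    def same_dn(self, dn):
        if self.freed:
            # In C this would be a read of freed memory, not a clean error.
            raise RuntimeError("use after free: compared DN of a freed entry")
        return self.dn.lower() == dn.lower()


class ToyDnCache:
    """A single hash bucket: lookups compare the DN of every entry in it."""

    def __init__(self):
        self.bucket = []

    def add(self, entry):
        self.bucket.append(entry)

    def remove(self, entry):
        if entry in self.bucket:
            self.bucket.remove(entry)

    def find(self, dn):
        for entry in self.bucket:
            if entry.same_dn(dn):
                return entry
        return None


def evict(entry, dn_cache, unlink_first):
    # LRU eviction: the entry is freed. If it is not unlinked from the
    # DN table first, the table keeps a dangling reference to it.
    if unlink_first:
        dn_cache.remove(entry)
    entry.freed = True


def lookup_after_eviction(unlink_first):
    dn_cache = ToyDnCache()
    target = CachedEntry("cn=y3,cn=sub3,ou=people,dc=example,dc=com")
    other = CachedEntry("cn=zzz,cn=sub3,ou=people,dc=example,dc=com")
    dn_cache.add(target)  # reference left behind by the failed MODRDN
    dn_cache.add(other)
    evict(target, dn_cache, unlink_first)
    try:
        found = dn_cache.find(other.dn)
    except RuntimeError:
        return "crash"
    return "ok" if found is other else "miss"
```

In this model `lookup_after_eviction(False)` hits the dangling reference while searching for an unrelated DN, and `lookup_after_eviction(True)` finds the unrelated entry normally.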

Metadata Update from @tbordaz:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

9 months ago

e87985c..0a2cc3b 389-ds-base-1.3.8 -> 389-ds-base-1.3.8

cb21dc5..1448415 389-ds-base-1.3.7 -> 389-ds-base-1.3.7

@lkrispen We could make a plugin that given certain operation conditions triggers failures for us? That way we could create deterministic test failure cases like this.

This is a very good idea. I wrote a first draft of such a plugin, see attachment.

In a first test I found that a failing search preop plugin hangs the client; there is probably more to find.
The plugin also still needs handling of internal op entry points.
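
To illustrate the idea of condition-driven failures (this is not the attached draft and does not use the server's plugin API), a minimal sketch could look like the following. The names `Operation`, `FaultRule`, `FaultInjector`, `call_plugins`, and the hook label strings are all hypothetical; the sketch assumes that a plugin chain stops at the first non-zero return code.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    hook: str        # hypothetical hook label, e.g. "be_txn_post_modrdn"
    target_dn: str


@dataclass
class FaultRule:
    hook: str        # which plugin entry point should fail
    dn_suffix: str   # only fail for targets below this suffix
    rc: int = -1     # return code to inject

    def matches(self, op):
        return (op.hook == self.hook
                and op.target_dn.lower().endswith(self.dn_suffix.lower()))


class FaultInjector:
    """A plugin callback that fails only when a configured rule matches."""

    def __init__(self, rules):
        self.rules = list(rules)

    def __call__(self, op):
        for rule in self.rules:
            if rule.matches(op):
                return rule.rc
        return 0


def call_plugins(plugins, op):
    # Assumed chain semantics: the first non-zero rc aborts the chain.
    for plugin in plugins:
        rc = plugin(op)
        if rc != 0:
            return rc
    return 0


injector = FaultInjector([
    FaultRule(hook="be_txn_post_modrdn",
              dn_suffix="cn=sub3,ou=people,dc=example,dc=com"),
])
plugins = [lambda op: 0, injector]

failing = Operation("be_txn_post_modrdn",
                    "cn=yyy,cn=sub3,ou=people,dc=example,dc=com")
passing = Operation("be_txn_post_modrdn",
                    "cn=abc,ou=groups,dc=example,dc=com")
```

Here `call_plugins(plugins, failing)` returns the injected code and `call_plugins(plugins, passing)` returns 0, so a test can pick exactly which operation fails instead of patching the return code in gdb.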

@lkrispen I'm reviewing this now, but it looks good. I think we should open a ticket/PR to accept this because it would be great for testing.

c154889..68e0d58 389-ds-base-1.3.6 -> 389-ds-base-1.3.6

