#49824 Deadlock during db2ldif export
Opened a year ago by telackey. Modified a year ago

Issue Description

It seems to be possible to deadlock during export using db2ldif given a combination of the right database structure (a deep, branchy tree) and bad luck.

Package Version and Platform

Observed with 389DS 1.3.6.14 on CentOS 7

Steps to reproduce

  1. Fill the database with entries that are both deep and leafy (e.g., where each entry under ou=XYZ is the parent of several subentries, each of which has several subentries, each of which has several subentries, and so on).

  2. Run db2ldif

Actual results

If you are unlucky, the export can deadlock (even though it is single-threaded) with a stack similar to:

#0  0x00007fa31ffcb945 in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
#1  0x00007fa318f1b403 in __db_hybrid_mutex_suspend () from /usr/lib64/libdb-5.3.so
#2  0x00007fa318f1ac47 in __db_tas_mutex_readlock () from /usr/lib64/libdb-5.3.so
#3  0x00007fa319034e55 in __memp_fget () from /usr/lib64/libdb-5.3.so
#4  0x00007fa318f38321 in __bam_search () from /usr/lib64/libdb-5.3.so
#5  0x00007fa318f23366 in __bamc_search () from /usr/lib64/libdb-5.3.so
#6  0x00007fa318f24e1f in __bamc_get () from /usr/lib64/libdb-5.3.so
#7  0x00007fa318fdddc6 in __dbc_iget () from /usr/lib64/libdb-5.3.so
#8  0x00007fa318fecc22 in __dbc_get_pp () from /usr/lib64/libdb-5.3.so
#9  0x00007fa3169a1123 in entryrdn_lookup_dn (be=0x56151d9653e0, rdn=<optimized out>, id=<optimized out>, dn=dn@entry=0x7ffef19c1710, psrdn=psrdn@entry=0x0, txn=txn@entry=0x0) at ldap/servers/slapd/back-ldbm/ldbm_entryrdn.c:1191
#10 0x00007fa3169b265f in ldbm_back_ldbm2ldif (pb=<optimized out>) at ldap/servers/slapd/back-ldbm/ldif2ldbm.c:1504
#11 0x000056151d451412 in slapd_exemode_db2ldif (argv=0x7ffef19c1c88, argc=15) at ldap/servers/slapd/main.c:2290
#12 main (argc=15, argv=0x7ffef19c1c88) at ldap/servers/slapd/main.c:878

The suspected issue has to do with the entryrdn changes. In the old days, the export code simply opened up a cursor on id2entry and walked it sequentially from start to finish. It is impossible to deadlock doing that.

However, with the new code, the export process makes sure that all parent entries are exported prior to their children. Also, the full DN has to be constructed from the RDN and the parent. These differences entail opening both entryrdn and id2entry and potentially hopping around each of them a bit.
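
To make that access pattern concrete, here is a rough sketch. The types and helper names (id2entry as a DB handle, be, fp, rdn_of, id_of, write_ldif) are placeholders for illustration, not the actual ldif2ldbm.c code:

/* Sketch of the export read pattern, with simplified types.
 * Every BDB call below is made with txn == NULL, so each one
 * acts as an independent locker as far as BDB is concerned. */
DBC *cur = NULL;
DBT key = {0}, data = {0};
char *dn = NULL;
int rc;

rc = id2entry->cursor(id2entry, NULL /* no txn */, &cur, 0);
while (rc == 0 && cur->c_get(cur, &key, &data, DB_NEXT) == 0) {
    /* Rebuilding the full DN requires extra reads against entryrdn
     * (entryrdn_lookup_dn in the pstack above), again with txn == NULL,
     * and those reads may revisit pages the cursor already has locked. */
    entryrdn_lookup_dn(be, rdn_of(&data), id_of(&key), &dn, NULL, NULL /* txn */);
    write_ldif(fp, dn, &data);
}
if (cur) {
    cur->c_close(cur);
}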

Underneath, BDB is managing all of the page locks. It is possible for one BDB operation to be holding a page lock that a later BDB operation needs. The locks themselves are re-entrant, but without a transaction, BDB does not know that the two operations are logically related. There is an element of luck here, because the data must be laid out such that a page locked by BDB op A1 is also needed by a subsequent BDB op A2.

Again, none of this would have been an issue in the older, sequential export scheme.

If all of the above is correct, the fix would be to open a TXN across the export, so that BDB can recognize that A1 and A2 are both actions owned by transaction A and grant access accordingly.

A proposed patch is attached.
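
For reference, a minimal sketch of what the patch aims at (illustrative only, not the patch verbatim; it assumes the DB_ENV handle dbenv is already open in transacted mode and reuses the same hypothetical helpers as above):

/* One transaction spans the whole export, and the same txn handle is
 * passed to the cursor and to every lookup, so all page locks belong
 * to a single locker. */
DB_TXN *txn = NULL;
DBC *cur = NULL;
DBT key = {0}, data = {0};
char *dn = NULL;
int rc;

rc = dbenv->txn_begin(dbenv, NULL, &txn, 0);
if (rc == 0) {
    rc = id2entry->cursor(id2entry, txn, &cur, 0);
}
while (rc == 0 && cur->c_get(cur, &key, &data, DB_NEXT) == 0) {
    /* In the real code the DB_TXN would travel inside a back_txn
     * wrapper; shown flat here for brevity. */
    entryrdn_lookup_dn(be, rdn_of(&data), id_of(&key), &dn, NULL, txn);
    write_ldif(fp, dn, &data);
}
if (cur) {
    cur->c_close(cur);
}
if (txn) {
    txn->commit(txn, 0); /* read-only, so abort would serve equally well */
}

Whether a single long-lived read transaction is acceptable for an export of this size (lock footprint, interaction with concurrent writers) is part of what would need review.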

Unfortunately the machine where this was initially observed is no longer accessible, so it is not possible to reproduce this at will to confirm.

Expected results

Export should never deadlock.
export_with_txn.patch


I think that the reason we do not use transactions in db2ldif is that it is only reading the database, and each page can have multiple readers. In offline mode db2ldif is the only reader, and in online export the backend is set busy, which prevents writes.

So, before making a big change to the export code I would prefer to have a reproducer and to investigate the deadlock a bit more, e.g. running db_stat -C A -M A to see which pages are really affected.

You describe the reproducer as: "database with entries that are both deep and leafy"; could you be a bit more specific about how deep the tree was and how many entries your database had?
Also, did you perform modrdns and move subtrees before running db2ldif?

Metadata Update from @lkrispen:
- Custom field component adjusted to None
- Custom field origin adjusted to None
- Custom field reviewstatus adjusted to None
- Custom field type adjusted to None
- Custom field version adjusted to None

a year ago

More to follow on the BDB angle, but as regards the other questions: ~27 million entries, structured something like

ou=XYZ
    cn=ABC
        custom0=DEF
            moreCustom=GHI
        custom1=JKL
            moreCustom=MNO
                moreMoreCustom=PQR
        ...
        custom10=TUV
            moreCustom=WXY
...

It can go deeper than that, but that gives an idea. No, no modrdns or subtree moves; nothing had really been done since importing.

I tried to reproduce this. I was using deeply nested trees: starting at the top, each node had 5 child nodes, up to 10 levels.
I tested with 4, 8, 12, and 36 million entries, with and without replication enabled/exported, with modrdns of a subtree and without, on different machines.
But I never did get a deadlock.

So there is something I am missing in the setup.
Are you still able to reproduce? If so, could you provide pstacks of the hang and the output
of db_stat -c
and db_stat -C A

@lkrispen Your comment above suggests to me that you don't have a complete understanding of BDB's behavior with respect to shared access to the database from two processes. Briefly, it is not true that transactions and concurrency control can be ignored just because db2ldif only requires read access. In fact, db2ldif does end up opening the DB environment in transacted mode (which is the right thing to do, given that another process may be concurrently writing). Since the existing code effectively ignores the need to open transactions, it is getting the default transaction (supplied by BDB, per BDB call).

This is why we theorized that the deadlock/hang seen on our customer's machine (pstack above) was caused by db2ldif deadlocking with itself: a transaction gets created by the cursor open, and subsequent get operations done when chasing up the tree to find parents end up hitting the same page, but with a different transaction, hence deadlock.

However, there are other possible causes. For example, the customer may have killed a previous db2ldif or even slapd, leaving a lock leaked in the shared region. Because it is not possible to perform the recovery protocol in db2ldif (since it may be run concurrently with a running slapd), it could block on that leaked lock. I can think of a few other scenarios under which we could get to the end state seen in the pstack.

Unfortunately, our customer deleted the reproduction case before we could perform further analysis. We have been unsuccessful in our attempts to reproduce the problem, but in analyzing the code we found that it is certainly not correct in its use of default transactions. It is also not safe to run concurrently with a running slapd.

I'm not sure what happened historically with db2ldif, btw. Originally (UMich, Netscape 1.x days) it was just not safe at all: no transactions in the DB, and you could potentially end up with a truncated ldif or crash db2ldif if it were run against a concurrently written slapd. Then in Netscape 3.x we added transaction support to the DB and made db2ldif "offline only" to ensure safety. Next, in Netscape 5.x we added the capability for slapd to do its own "db2ldif" in order to get the ability to dump ldif "online" back into the product. The standalone db2ldif was intended henceforth to be run only in cases where slapd was offline (e.g. if it were broken and couldn't start). Somehow we now have a situation where db2ldif is not safe to run against an online slapd, but users are told that it is ok :(

@lkrispen Your comment above suggests to me that you don't have a complete understanding of BDB's behavior with respect to shared access to the database from two processes.

I think I do, but I was definitely sure we would not run db2ldif while the server was online, and was wrong. If we run db2ldif -r to export repl data it is rejected; if not, it is accepted.
We have seen other problems with this approach, e.g. if db2ldif crashes it could keep pages locked.

I would vote for preventing db2ldif while the server is online.

I would vote for preventing db2ldif while the server is online.

I am of two minds about this. On one hand, I agree that it is unsafe, and we should disable this behaviour, allowing only "online export" to be triggered via the cn=config interfaces.

However, I also see a great benefit to being able to trigger this from the server via cron or other backup tasks while the server is online. I think we should consider how our online backup strategy should look.

I think this is a larger issue, and how to approach it should be investigated and discussed.

For now, let's be safe: disable online db2ldif, and we should also check the safety of the other db2* tasks in light of this information.

Thanks,

