#47449 deadlock after adding and deleting entries
Closed: Fixed None Opened 6 years ago by mreynolds.

If you have multiple clients, each adding and deleting users, the server will deadlock. I created 5 ldif files. Each ldif file added and then deleted 200 entries. Using 5 separate ldapmodify processes, the server will deadlock within a minute or so.

Appears to be an issue with an entry cache lock not being unlocked:

Thread 29 (Thread 0x7f7d16bfd700 (LWP 8337)):
#3 0x0000003b56223fe9 in PR_Lock () from /lib64/libnspr4.so
#4 0x0000003b5622410b in PR_EnterMonitor () from /lib64/libnspr4.so
#5 0x00007f7d29eb19bb in dblayer_lock_backend (be=0x2094160) at ../ds/ldap/servers/slapd/back-ldbm/dblayer.c:3942
#6 0x00007f7d29eb102f in dblayer_txn_begin (be=0x2094160, parent_txn=0x0, txn=0x7f7d16bfa860) at ../ds/ldap/servers/slapd/back-ldbm/dblayer.c:3664
#7 0x00007f7d29eeb814 in ldbm_back_delete (pb=0x7f7d16bfcaa0) at ../ds/ldap/servers/slapd/back-ldbm/ldbm_delete.c:257
#8 0x00007f7d2dc8def4 in op_shared_delete (pb=0x7f7d16bfcaa0) at ../ds/ldap/servers/slapd/delete.c:364
#9 0x00007f7d2dc8d6dd in do_delete (pb=0x7f7d16bfcaa0) at ../ds/ldap/servers/slapd/delete.c:128
#10 0x000000000041578e in connection_dispatch_operation (conn=0x7f7d2464f730, op=0x231ee80, pb=0x7f7d16bfcaa0) at ../ds/ldap/servers/slapd/connection.c:643
#11 0x0000000000417524 in connection_threadmain () at ../ds/ldap/servers/slapd/connection.c:2482

Thread 27 (Thread 0x7f7d157fb700 (LWP 8339)):
#3 0x0000003b56223fe9 in PR_Lock () from /lib64/libnspr4.so
#4 0x0000003b5622410b in PR_EnterMonitor () from /lib64/libnspr4.so
#5 0x00007f7d29eb19bb in dblayer_lock_backend (be=0x2094160) at ../ds/ldap/servers/slapd/back-ldbm/dblayer.c:3942
#6 0x00007f7d29eb102f in dblayer_txn_begin (be=0x2094160, parent_txn=0x0, txn=0x7f7d157f6790) at ../ds/ldap/servers/slapd/back-ldbm/dblayer.c:3664
#7 0x00007f7d29ede3f1 in ldbm_back_add (pb=0x7f7d157faaa0) at ../ds/ldap/servers/slapd/back-ldbm/ldbm_add.c:261
#8 0x00007f7d2dc7dc4b in op_shared_add (pb=0x7f7d157faaa0) at ../ds/ldap/servers/slapd/add.c:735
#9 0x00007f7d2dc7cb96 in do_add (pb=0x7f7d157faaa0) at ../ds/ldap/servers/slapd/add.c:258
#10 0x000000000041576c in connection_dispatch_operation (conn=0x7f7d2464f878, op=0x22ec000, pb=0x7f7d157faaa0) at ../ds/ldap/servers/slapd/connection.c:638
#11 0x0000000000417524 in connection_threadmain () at ../ds/ldap/servers/slapd/connection.c:2482

Thread 15 (Thread 0x7f7d09bf5700 (LWP 8351)):
#0 0x000000377560e054 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00000037756093be in _L_lock_995 () from /lib64/libpthread.so.0
#2 0x0000003775609326 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x0000003b56223fe9 in PR_Lock () from /lib64/libnspr4.so
#4 0x0000003b5622410b in PR_EnterMonitor () from /lib64/libnspr4.so
#5 0x00007f7d29eb19bb in dblayer_lock_backend (be=0x2094160) at ../ds/ldap/servers/slapd/back-ldbm/dblayer.c:3942
#6 0x00007f7d29eb102f in dblayer_txn_begin (be=0x2094160, parent_txn=0x0, txn=0x7f7d09bf0790) at ../ds/ldap/servers/slapd/back-ldbm/dblayer.c:3664
#7 0x00007f7d29ede3f1 in ldbm_back_add (pb=0x7f7d09bf4aa0) at ../ds/ldap/servers/slapd/back-ldbm/ldbm_add.c:261
#8 0x00007f7d2dc7dc4b in op_shared_add (pb=0x7f7d09bf4aa0) at ../ds/ldap/servers/slapd/add.c:735
#9 0x00007f7d2dc7cb96 in do_add (pb=0x7f7d09bf4aa0) at ../ds/ldap/servers/slapd/add.c:258
#10 0x000000000041576c in connection_dispatch_operation (conn=0x7f7d2464f4a0, op=0x232c850, pb=0x7f7d09bf4aa0) at ../ds/ldap/servers/slapd/connection.c:638
#11 0x0000000000417524 in connection_threadmain () at ../ds/ldap/servers/slapd/connection.c:2482

---> this thread is causing the deadlock
#3 0x0000003b56223fe9 in PR_Lock () from /lib64/libnspr4.so
#4 0x0000003b5622410b in PR_EnterMonitor () from /lib64/libnspr4.so
#5 0x00007f7d29ea7169 in cache_lock_entry (cache=0x21130b8, e=0x22f67a0) at ../ds/ldap/servers/slapd/back-ldbm/cache.c:1502
#6 0x00007f7d29ebee77 in find_entry_internal_dn (pb=0x7f7d073f0aa0, be=0x2094160, sdn=0x7f7ca400dec0, lock=1, txn=0x7f7d073ee860, flags=0) at ../ds/ldap/servers/slapd/back-ldbm/findentry.c:155
#7 0x00007f7d29ebf446 in find_entry_internal (pb=0x7f7d073f0aa0, be=0x2094160, addr=0x22f4b68, lock=1, txn=0x7f7d073ee860, flags=0) at ../ds/ldap/servers/slapd/back-ldbm/findentry.c:293
#8 0x00007f7d29ebf530 in find_entry2modify (pb=0x7f7d073f0aa0, be=0x2094160, addr=0x22f4b68, txn=0x7f7d073ee860) at ../ds/ldap/servers/slapd/back-ldbm/findentry.c:324
#9 0x00007f7d29eeb8b4 in ldbm_back_delete (pb=0x7f7d073f0aa0) at ../ds/ldap/servers/slapd/back-ldbm/ldbm_delete.c:273
#10 0x00007f7d2dc8def4 in op_shared_delete (pb=0x7f7d073f0aa0) at ../ds/ldap/servers/slapd/delete.c:364
#11 0x00007f7d2dc8d6dd in do_delete (pb=0x7f7d073f0aa0) at ../ds/ldap/servers/slapd/delete.c:128
#12 0x000000000041578e in connection_dispatch_operation (conn=0x7f7d2464f358, op=0x22f4a90, pb=0x7f7d073f0aa0) at ../ds/ldap/servers/slapd/connection.c:643
#13 0x0000000000417524 in connection_threadmain () at ../ds/ldap/servers/slapd/connection.c:2482
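For readers unfamiliar with this failure shape, here is a minimal sketch of how a lock that is taken but not released on an error path leaves every later locker stuck. It is not the 389-ds code and not the patch from this ticket: it uses plain pthread mutexes rather than NSPR monitors, and `demo_entry`, `delete_leaky`, `delete_balanced`, and `entry_still_locked` are hypothetical names invented for illustration. Reading the traces above, the annotated thread appears to be past `dblayer_txn_begin` and waiting in `cache_lock_entry`, while the other threads queue behind it on the backend lock; the sketch only models the leaked entry lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cached entry guarded by its own lock. */
struct demo_entry {
    pthread_mutex_t lock;
    int id;
};

/* Buggy shape: the failure path returns while still holding e->lock. */
static int delete_leaky(struct demo_entry *e, bool fail_midway)
{
    pthread_mutex_lock(&e->lock);
    if (fail_midway) {
        return -1;              /* lock is never released */
    }
    pthread_mutex_unlock(&e->lock);
    return 0;
}

/* Balanced shape: every exit goes through one cleanup label. */
static int delete_balanced(struct demo_entry *e, bool fail_midway)
{
    int retval = -1;

    pthread_mutex_lock(&e->lock);
    if (fail_midway) {
        goto done;
    }
    retval = 0;
done:
    pthread_mutex_unlock(&e->lock);
    return retval;
}

/* Probe without blocking: true if a later locker would have to wait. */
static bool entry_still_locked(struct demo_entry *e)
{
    int rc = pthread_mutex_trylock(&e->lock);
    if (rc == 0) {
        pthread_mutex_unlock(&e->lock);
        return false;
    }
    return rc == EBUSY;
}

/* Small self-check that exercises both shapes on a failing operation. */
static void self_check(void)
{
    struct demo_entry a, b;

    a.id = 1;
    b.id = 2;
    pthread_mutex_init(&a.lock, NULL);
    pthread_mutex_init(&b.lock, NULL);

    assert(delete_leaky(&a, true) == -1);
    assert(entry_still_locked(&a));     /* the next locker would hang */

    assert(delete_balanced(&b, true) == -1);
    assert(!entry_still_locked(&b));    /* lock was released on failure */
}
```

In the leaky variant, a later operation on the same entry would block forever in its lock call, which is the kind of wait the annotated trace shows in `cache_lock_entry`.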

Version: 1.3.0
Is this version info correct? (I guess it could be 1.3.2/master?)

Replying to [comment:2 nhosoi]:

Version: 1.3.0
Is this version info correct? (I guess it could be 1.3.2/master?)

Yes, this is with 1.3.2 (master).

In ldbm_back_delete() I also forced the setting of the error code in case any future code shuffling occurs.

in ldbm_delete - in the first 3 cases, retval = -1 already - it is not necessary to set it, except perhaps to make the assumptions more clear
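The point about `retval` can be illustrated with a small sketch. This is not the code from ldbm_delete.c; `demo_delete` and its parameters are hypothetical, and the sketch only assumes the pattern described in the comments here: a return value that starts at -1 and early failure checks that jump to a common exit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation with three early failure checks. */
static int demo_delete(const char *dn, int have_txn, int entry_found)
{
    int retval = -1;            /* pessimistic default */

    if (dn == NULL) {
        goto error_return;      /* retval is already -1 here */
    }
    if (!have_txn) {
        retval = -1;            /* redundant today, but survives reordering */
        goto error_return;
    }
    if (!entry_found) {
        goto error_return;
    }

    retval = 0;                 /* only the success path clears the default */

error_return:
    return retval;
}

/* Small self-check: every failure case reports -1, success reports 0. */
static void self_check(void)
{
    assert(demo_delete(NULL, 1, 1) == -1);
    assert(demo_delete("uid=a", 0, 1) == -1);
    assert(demo_delete("uid=a", 1, 0) == -1);
    assert(demo_delete("uid=a", 1, 1) == 0);
}
```

With the default in place, the explicit assignment changes nothing; it only matters if a later edit moves a `retval = 0` above one of the checks.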

in ldbm_modrdn - would rather not make a change that is only formatting

Replying to [comment:6 rmeggins]:

in ldbm_delete - in the first 3 cases, retval = -1 already - it is not necessary to set it, except perhaps to make the assumptions more clear

Right, I was under the assumption that this bug might have happened from code being shuffled around, so I hard set it to avoid future mistakes. But this "mistake" is present in all versions of 389 (at least 1.2.11 and up). Anyway, I have removed it.

in ldbm_modrdn - would rather not make a change that is only formatting.

No problem, it was only formatting.

New patch attached.

git merge ticket47449
Updating aa55789..6bd78b3
Fast-forward
ldap/servers/slapd/back-ldbm/ldbm_delete.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)

git push origin master
To ssh://git.fedorahosted.org/git/389/ds.git
aa55789..6bd78b3 master -> master

commit 6bd78b3

1.3.1

024abee..8ea067b 389-ds-base-1.3.1 -> 389-ds-base-1.3.1

1.3.0

3f75400..93bde65 389-ds-base-1.3.0 -> 389-ds-base-1.3.0

1.2.11

c1dcfc6..66fbebc 389-ds-base-1.2.11 -> 389-ds-base-1.2.11
commit 66fbebc

Metadata Update from @nkinder:
- Issue assigned to mreynolds
- Issue set to the milestone: 1.2.11.22

2 years ago
