#5637 Deadlock between slapi-nis/memberof when add/del a segment
Closed: Fixed None Opened 5 years ago by tbordaz.

389-ds-base-libs-1.3.4.6-1.fc23.x86_64
freeipa-server-4.3.90-20160125115244Zjenkins129git6896035.fc23.x86_64

The deadlock occurs because two locks (the slapi-nis map lock and a DB page lock) are taken in opposite orders. During a topology segment DEL, the memberof plugin triggers an internal search on base=<suffix> while holding some DB pages.
If, at the same time, another operation (here an ADD) does a search in the slapi-nis scope, it holds the slapi-nis map lock and then cannot access the pages acquired by the DEL txn.

The deadlock should not be systematic.
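
To illustrate the opposite lock order described above, here is a minimal sketch in Python. It is not code from 389-ds or slapi-nis: `map_lock` and `page_lock` are hypothetical stand-ins for the slapi-nis map lock and a DB page lock, and the two barriers force the unlucky interleaving that in the real server only happens with the right timing. The second lock is requested with a timeout so the demo terminates instead of hanging.

```python
import threading


def run_inverted_order_demo(timeout=0.2):
    """Return {op_name: got_second_lock} for two ops locking in opposite orders."""
    map_lock = threading.Lock()    # stand-in for the slapi-nis map lock
    page_lock = threading.Lock()   # stand-in for a DB page lock
    both_hold_first = threading.Barrier(2)
    both_tried_second = threading.Barrier(2)
    results = {}

    def op(name, first, second):
        with first:
            # Wait until the other operation also holds its first lock.
            both_hold_first.wait()
            # Try the second lock; a real server would block here forever.
            got = second.acquire(timeout=timeout)
            results[name] = got
            # Keep the first lock until both attempts have finished.
            both_tried_second.wait()
            if got:
                second.release()

    threads = [
        # ADD path: map lock first, then a DB page.
        threading.Thread(target=op, args=("ADD", map_lock, page_lock)),
        # DEL path: DB page first, then the map lock.
        threading.Thread(target=op, args=("DEL", page_lock, map_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_inverted_order_demo())
```

Neither operation obtains its second lock in this forced interleaving. If both paths took the locks in the same order, the second attempt would simply wait for the other operation to finish.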


The deadlock occurs between Thread 6 and 9.

Thread 6
       a dd=34 locks held 0    write locks 0    pid/thread 23273/139679617693440 (7F09B1FEB700)flags 0    priority 100
       a READ          1 WAIT    userRoot/objectclass.db   page          5

Bound as DM
ADD: cn=ipa_pwd_extop,cn=plugins,cn=config
    schema-compat triggers an internal search for groups having the added entry as member
    base: "cn=groups,cn=accounts,<SUFFIX>"
    scope: one
    filter: "(member=cn=ipa_pwd_extop,cn=plugins,cn=config)"

    It acquired the schema-compat map lock but requires a DB page held by the Thread 9 DEL txn


Thread 9
80000068 dd= 4 locks held 60   write locks 31   pid/thread 23273/139679642871552 (7F09B37EE700) flags 0    priority 100
80000068 WRITE         5 HELD    userRoot/objectclass.db   page          5
80000068 READ          6 HELD    userRoot/objectclass.db   page          5

Bound as krbprincipalname=ldap/vm-040.idm.lab.eng.brq.redhat.com@dom-040.idm.lab.eng.brq.redhat.com,cn=services,cn=accounts,<SUFFIX>
DEL "cn=vm-102.idm.lab.eng.brq.redhat.com-to-vm-040.idm.lab.eng.brq.redhat.com,cn=domain,cn=topology,cn=ipa,cn=etc,<SUFFIX>"
    memberof is updating all groups the deleted entry belongs to, doing an internal search
    base: <SUFFIX>
    scope: subtree
    filter: "(memberHost=cn=vm-102.idm.lab.eng.brq.redhat.com-to-vm-040.idm.lab.eng.brq.redhat.com,cn=domain,cn=topology,cn=ipa,cn=etc,<SUFFIX>)"

    While holding some DB pages for the DEL txn, it requires access to the schema-compat map lock
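
The two summaries above form a cycle in a wait-for graph. The sketch below is an illustration only, not a tool shipped with 389-ds. It encodes what each thread holds and waits for as described in this ticket, and assumes for simplicity that each resource has a single owner (the real map lock is a rwlock and page 5 carries both a READ and a WRITE lock). The resource labels and the function name `find_wait_cycle` are made up for the example.

```python
# What each thread holds, per the summaries in this ticket.
HOLDS = {
    "Thread 6": ["schema-compat map lock"],
    "Thread 9": ["userRoot/objectclass.db page 5"],
}

# What each thread is blocked on.
WAITS = {
    "Thread 6": "userRoot/objectclass.db page 5",
    "Thread 9": "schema-compat map lock",
}


def find_wait_cycle(holds, waits):
    """Return the list of threads forming a wait-for cycle, or None."""
    # Map each resource to the thread that owns it.
    owner = {res: t for t, resources in holds.items() for res in resources}
    for start in waits:
        path = []
        current = start
        # Follow "waits for the owner of" edges until the chain ends or repeats.
        while current in waits and current not in path:
            path.append(current)
            current = owner.get(waits[current])
        if current == start:
            return path
    return None


if __name__ == "__main__":
    print(find_wait_cycle(HOLDS, WAITS))
```

With the data above the function reports Thread 6 and Thread 9 as the cycle. Removing either edge, for example by keeping the memberof search out of the schema-compat scope, leaves no cycle.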

Details on the threads

Thread 6 (Thread 0x7f09b1feb700 (LWP 23311)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f09dd57e0fb in __db_pthread_mutex_condwait (env=0x559f0b347880, mutex=139680443779782, timespec=0x0, mutexp=<optimized out>) at ../../src/mutex/mut_pthread.c:321
#2  __db_hybrid_mutex_suspend (env=env@entry=0x559f0b347880, mutex=mutex@entry=2159, timespec=timespec@entry=0x0, exclusive=exclusive@entry=1) at ../../src/mutex/mut_pthread.c:577
#3  0x00007f09dd57d52f in __db_tas_mutex_lock_int (nowait=0, timeout=0, mutex=2159, env=<optimized out>) at ../../src/mutex/mut_tas.c:255
#4  __db_tas_mutex_lock (env=env@entry=0x559f0b347880, mutex=2159, timeout=timeout@entry=0) at ../../src/mutex/mut_tas.c:286
#5  0x00007f09dd627c97 in __lock_get_internal (lt=lt@entry=0x559f0b4dbf50, sh_locker=sh_locker@entry=0x7f09d55fcbc0, flags=flags@entry=0, obj=<optimized out>, lock_mode=<optimized out>, timeout=timeout@entry=0, lock=0x7f09b1fd69a0) at ../../src/lock/lock.c:989
#6  0x00007f09dd628727 in __lock_get (env=env@entry=0x559f0b347880, locker=0x7f09d55fcbc0, flags=0, obj=obj@entry=0x559f0b4e3ae0, lock_mode=lock_mode@entry=DB_LOCK_READ, lock=lock@entry=0x7f09b1fd69a0) at ../../src/lock/lock.c:469
#7  0x00007f09dd6544c7 in __db_lget (dbc=dbc@entry=0x559f0b4e39f0, action=action@entry=0, pgno=5, mode=mode@entry=DB_LOCK_READ, lkflags=lkflags@entry=0, lockp=lockp@entry=0x7f09b1fd69a0) at ../../src/db/db_meta.c:1257
#8  0x00007f09dd59b222 in __bam_search (dbc=dbc@entry=0x559f0b4e39f0, root_pgno=1, root_pgno@entry=0, key=key@entry=0x7f09b1fd6c70, flags=flags@entry=1409, slevel=slevel@entry=1, recnop=recnop@entry=0x0, exactp=0x7f09b1fd6ac4) at ../../src/btree/bt_search.c:723
#9  0x00007f09dd585f0c in __bamc_search (dbc=dbc@entry=0x559f0b4e39f0, root_pgno=root_pgno@entry=0, key=0x7f09b1fd6c70, flags=<optimized out>, exactp=<optimized out>) at ../../src/btree/bt_cursor.c:2804
#10 0x00007f09dd58802f in __bamc_get (dbc=0x559f0b4e39f0, key=<optimized out>, data=<optimized out>, flags=<optimized out>, pgnop=0x7f09b1fd6b54) at ../../src/btree/bt_cursor.c:1099
#11 0x00007f09dd640d14 in __dbc_iget (dbc=0x559f0b343d40, key=0x7f09b1fd6c70, data=0x7f09b1fd6ca0, flags=26) at ../../src/db/db_cam.c:952
#12 0x00007f09dd64154d in __dbc_get (dbc=dbc@entry=0x559f0b343d40, key=key@entry=0x7f09b1fd6c70, data=data@entry=0x7f09b1fd6ca0, flags=flags@entry=2074) at ../../src/db/db_cam.c:770
#13 0x00007f09dd64fef2 in __dbc_get_pp (dbc=0x559f0b343d40, key=0x7f09b1fd6c70, data=0x7f09b1fd6ca0, flags=2074) at ../../src/db/db_iface.c:2361
#14 0x00007f09d90d6a4e in idl_new_fetch (be=0x559f0b346910, db=0x559f0b4ee710, inkey=0x7f09b1fd8e00, txn=0x0, a=0x559f0b4d94b0, flag_err=0x7f09b1fdfb4c, allidslimit=4000) at ldap/servers/slapd/back-ldbm/idl_new.c:202
#15 0x00007f09d90d66e5 in idl_fetch_ext (be=be@entry=0x559f0b346910, db=<optimized out>, key=key@entry=0x7f09b1fd8e00, txn=txn@entry=0x0, a=<optimized out>, err=err@entry=0x7f09b1fdfb4c, allidslimit=4000) at ldap/servers/slapd/back-ldbm/idl_shim.c:101
#16 0x00007f09d90e5093 in index_read_ext_allids (pb=pb@entry=0x559f0b6feaa0, be=be@entry=0x559f0b346910, type=type@entry=0x559f0b27ff70 "objectclass", indextype=indextype@entry=0x7f09d912c13f "eq", val=<optimized out>, txn=txn@entry=0x7f09b1fdd0e0, err=0x7f09b1fdfb4c, unindexed=0x7f09b1fdd0d4, allidslimit=4000) at ldap/servers/slapd/back-ldbm/index.c:1028
#17 0x00007f09d90cf5dd in keys2idl (pb=pb@entry=0x559f0b6feaa0, be=be@entry=0x559f0b346910, type=0x559f0b27ff70 "objectclass", indextype=indextype@entry=0x7f09d912c13f "eq", ivals=0x7f09b1fdd1c0, err=err@entry=0x7f09b1fdfb4c, unindexed=0x7f09b1fdd0d4, txn=0x7f09b1fdd0e0, allidslimit=4000) at ldap/servers/slapd/back-ldbm/filterindex.c:977
#18 0x00007f09d90cfdf2 in ava_candidates (pb=pb@entry=0x559f0b6feaa0, be=be@entry=0x559f0b346910, f=f@entry=0x559f0b207780, ftype=<optimized out>, err=0x7f09b1fdfb4c, allidslimit=4000, range=0, nextf=0x0) at ldap/servers/slapd/back-ldbm/filterindex.c:288
#19 0x00007f09d90d03da in filter_candidates_ext (pb=pb@entry=0x559f0b6feaa0, be=be@entry=0x559f0b346910, base=base@entry=0x559f0b285120 "cn=groups,cn=accounts,<SUFFIX>", f=f@entry=0x559f0b207780, nextf=nextf@entry=0x0, range=range@entry=0, err=0x7f09b1fdfb4c, allidslimit=4000) at ldap/servers/slapd/back-ldbm/filterindex.c:111
#20 0x00007f09d90d13df in list_candidates (pb=pb@entry=0x559f0b6feaa0, be=be@entry=0x559f0b346910, base=base@entry=0x559f0b285120 "cn=groups,cn=accounts,<SUFFIX>", flist=flist@entry=0x559f0b263190, ftype=161, err=0x7f09b1fdfb4c, allidslimit=4000) at ldap/servers/slapd/back-ldbm/filterindex.c:808
#21 0x00007f09d90d0342 in filter_candidates_ext (pb=pb@entry=0x559f0b6feaa0, be=be@entry=0x559f0b346910, base=base@entry=0x559f0b285120 "cn=groups,cn=accounts,<SUFFIX>", f=f@entry=0x559f0b263190, nextf=nextf@entry=0x0, range=range@entry=0, err=0x7f09b1fdfb4c, allidslimit=4000) at ldap/servers/slapd/back-ldbm/filterindex.c:144
#22 0x00007f09d90d13df in list_candidates (pb=pb@entry=0x559f0b6feaa0, be=be@entry=0x559f0b346910, base=base@entry=0x559f0b285120 "cn=groups,cn=accounts,<SUFFIX>", flist=flist@entry=0x559f0b243860, ftype=160, err=0x7f09b1fdfb4c, allidslimit=4000) at ldap/servers/slapd/back-ldbm/filterindex.c:808
#23 0x00007f09d90d0342 in filter_candidates_ext (pb=pb@entry=0x559f0b6feaa0, be=be@entry=0x559f0b346910, base=base@entry=0x559f0b285120 "cn=groups,cn=accounts,<SUFFIX>", f=0x559f0b243860, nextf=nextf@entry=0x0, range=range@entry=0, err=0x7f09b1fdfb4c, allidslimit=4000) at ldap/servers/slapd/back-ldbm/filterindex.c:144
#24 0x00007f09d90d19af in filter_candidates (pb=pb@entry=0x559f0b6feaa0, be=be@entry=0x559f0b346910, base=base@entry=0x559f0b285120 "cn=groups,cn=accounts,<SUFFIX>", f=<optimized out>, nextf=nextf@entry=0x0, range=range@entry=0, err=0x7f09b1fdfb4c) at ldap/servers/slapd/back-ldbm/filterindex.c:175
#25 0x00007f09d910d1b4 in onelevel_candidates (err=0x7f09b1fdfb4c, lookup_returned_allidsp=0x7f09b1fdfb3c, managedsait=<optimized out>, filter=<optimized out>, e=0x559f0b4d38c0, base=0x559f0b285120 "cn=groups,cn=accounts,<SUFFIX>", be=0x559f0b346910, pb=0x559f0b6feaa0) at ldap/servers/slapd/back-ldbm/ldbm_search.c:1143
#26 build_candidate_list (candidates=0x7f09b1fdfb78, lookup_returned_allidsp=0x7f09b1fdfb3c, scope=<optimized out>, base=0x559f0b285120 "cn=groups,cn=accounts,<SUFFIX>", e=<optimized out>, be=0x559f0b346910, pb=0x559f0b6feaa0) at ldap/servers/slapd/back-ldbm/ldbm_search.c:1010
#27 ldbm_back_search (pb=0x559f0b6feaa0) at ldap/servers/slapd/back-ldbm/ldbm_search.c:657
#28 0x00007f09e583c270 in op_shared_search (pb=pb@entry=0x559f0b6feaa0, send_result=send_result@entry=1) at ldap/servers/slapd/opshared.c:802
#29 0x00007f09e584cb4e in search_internal_callback_pb (pb=0x559f0b6feaa0, callback_data=<optimized out>, prc=0x0, psec=0x7f09d7338eb0 <backend_shr_note_entry_sdn_cb>, prec=0x0) at ldap/servers/slapd/plugin_internal_op.c:783
#30 0x00007f09d733a6c7 in backend_shr_update_references_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#31 0x00007f09d7346e9f in map_data_foreach_map () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#32 0x00007f09d733889b in backend_shr_update_references () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#33 0x00007f09d7339806 in backend_shr_add_cb.part () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#34 0x00007f09d733be61 in backend_shr_betxn_post_add_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#35 0x00007f09e5847969 in plugin_call_func (list=0x559f0b323620, operation=operation@entry=560, pb=pb@entry=0x7f09b1feab00, call_one=call_one@entry=0) at ldap/servers/slapd/plugin.c:1987
#36 0x00007f09e5847bf5 in plugin_call_list (pb=0x7f09b1feab00, operation=560, list=<optimized out>) at ldap/servers/slapd/plugin.c:1931
#37 plugin_call_plugins (pb=pb@entry=0x7f09b1feab00, whichfunction=whichfunction@entry=560) at ldap/servers/slapd/plugin.c:438
#38 0x00007f09e5804aa2 in dse_add (pb=0x7f09b1feab00) at ldap/servers/slapd/dse.c:2403
#39 0x00007f09e57eddfa in op_shared_add (pb=pb@entry=0x7f09b1feab00) at ldap/servers/slapd/add.c:714
#40 0x00007f09e57ef178 in do_add (pb=pb@entry=0x7f09b1feab00) at ldap/servers/slapd/add.c:226
#41 0x0000559f092c428f in connection_dispatch_operation (pb=0x7f09b1feab00, op=0x559f0b77f1b0, conn=0x559f0b79aa90) at ldap/servers/slapd/connection.c:604
#42 connection_threadmain () at ldap/servers/slapd/connection.c:1735
#43 0x00007f09e3a185cb in _pt_root (arg=0x559f0b7569f0) at ../../../nspr/pr/src/pthreads/ptthread.c:212
#44 0x00007f09e33b760a in start_thread (arg=0x7f09b1feb700) at pthread_create.c:334
#45 0x00007f09e30f1a4d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109


Thread 9 (Thread 0x7f09b37ee700 (LWP 23308)):
#0  0x00007f09e33bb87c in futex_wait (private=<optimized out>, expected=0, futex_word=0x559f0b31e308) at ../sysdeps/unix/sysv/linux/futex-internal.h:61
#1  futex_wait_simple (private=<optimized out>, expected=0, futex_word=0x559f0b31e308) at ../sysdeps/nptl/futex-internal.h:135
#2  __pthread_rwlock_rdlock_slow (rwlock=0x559f0b31e300) at pthread_rwlock_rdlock.c:68
#3  0x00007f09d733698d in backend_search_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#4  0x00007f09e5847969 in plugin_call_func (list=0x559f0b321f60, operation=operation@entry=403, pb=pb@entry=0x7f094800e750, call_one=call_one@entry=0) at ldap/servers/slapd/plugin.c:1987
#5  0x00007f09e5847bf5 in plugin_call_list (pb=0x7f094800e750, operation=403, list=<optimized out>) at ldap/servers/slapd/plugin.c:1931
#6  plugin_call_plugins (pb=pb@entry=0x7f094800e750, whichfunction=whichfunction@entry=403) at ldap/servers/slapd/plugin.c:438
#7  0x00007f09e583bf12 in op_shared_search (pb=pb@entry=0x7f094800e750, send_result=send_result@entry=1) at ldap/servers/slapd/opshared.c:559
#8  0x00007f09e584cb4e in search_internal_callback_pb (pb=pb@entry=0x7f094800e750, callback_data=callback_data@entry=0x7f09b37eb560, prc=prc@entry=0x0, psec=psec@entry=0x7f09d87d8770 <memberof_del_dn_type_callback>, prec=prec@entry=0x0) at ldap/servers/slapd/plugin_internal_op.c:783
#9  0x00007f09e584d0bd in slapi_search_internal_callback_pb (pb=pb@entry=0x7f094800e750, callback_data=callback_data@entry=0x7f09b37eb560, prc=prc@entry=0x0, psec=psec@entry=0x7f09d87d8770 <memberof_del_dn_type_callback>, prec=prec@entry=0x0) at ldap/servers/slapd/plugin_internal_op.c:564
#10 0x00007f09d87d8061 in memberof_call_foreach_dn (sdn=sdn@entry=0x7f09480039c0, config=config@entry=0x7f09b37eb5e0, types=types@entry=0x7f09b37eb570, callback=callback@entry=0x7f09d87d8770 <memberof_del_dn_type_callback>, callback_data=callback_data@entry=0x7f09b37eb560, pb=0x7f09b37edb00) at ldap/servers/plugins/memberof/memberof.c:778
#11 0x00007f09d87d84aa in memberof_del_dn_from_groups (config=config@entry=0x7f09b37eb5e0, sdn=sdn@entry=0x7f09480039c0, pb=0x7f09b37edb00) at ldap/servers/plugins/memberof/memberof.c:597
#12 0x00007f09d87db54b in memberof_postop_del (pb=0x7f09b37edb00) at ldap/servers/plugins/memberof/memberof.c:531
#13 0x00007f09e5847969 in plugin_call_func (list=0x559f0b2f5ac0, operation=operation@entry=563, pb=pb@entry=0x7f09b37edb00, call_one=call_one@entry=0) at ldap/servers/slapd/plugin.c:1987
#14 0x00007f09e5847bf5 in plugin_call_list (pb=0x7f09b37edb00, operation=563, list=<optimized out>) at ldap/servers/slapd/plugin.c:1931
#15 plugin_call_plugins (pb=pb@entry=0x7f09b37edb00, whichfunction=whichfunction@entry=563) at ldap/servers/slapd/plugin.c:438
#16 0x00007f09d90f88e5 in ldbm_back_delete (pb=0x7f09b37edb00) at ldap/servers/slapd/back-ldbm/ldbm_delete.c:1212
#17 0x00007f09e57fb170 in op_shared_delete (pb=pb@entry=0x7f09b37edb00) at ldap/servers/slapd/delete.c:333
#18 0x00007f09e57fb433 in do_delete (pb=pb@entry=0x7f09b37edb00) at ldap/servers/slapd/delete.c:97
#19 0x0000559f092c421b in connection_dispatch_operation (pb=0x7f09b37edb00, op=0x559f0b762a10, conn=0x559f0b79a658) at ldap/servers/slapd/connection.c:609
#20 connection_threadmain () at ldap/servers/slapd/connection.c:1735
#21 0x00007f09e3a185cb in _pt_root (arg=0x559f0b770e80) at ../../../nspr/pr/src/pthreads/ptthread.c:212
#22 0x00007f09e33b760a in start_thread (arg=0x7f09b37ee700) at pthread_create.c:334
#23 0x00007f09e30f1a4d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

The current configuration looks as expected:

dn: cn=computers,cn=Schema Compatibility,cn=plugins,cn=config
schema-compat-restrict-subtree: <SUFFIX>
schema-compat-restrict-subtree: cn=Schema Compatibility,cn=plugins,cn=config
schema-compat-ignore-subtree: cn=dna,cn=ipa,cn=etc,<SUFFIX>

dn: cn=groups,cn=Schema Compatibility,cn=plugins,cn=config
schema-compat-restrict-subtree: <SUFFIX>
schema-compat-restrict-subtree: cn=Schema Compatibility,cn=plugins,cn=config
schema-compat-ignore-subtree: cn=dna,cn=ipa,cn=etc,<SUFFIX>

dn: cn=ng,cn=Schema Compatibility,cn=plugins,cn=config
schema-compat-restrict-subtree: <SUFFIX>
schema-compat-restrict-subtree: cn=Schema Compatibility,cn=plugins,cn=config
schema-compat-ignore-subtree: cn=dna,cn=ipa,cn=etc,<SUFFIX>

dn: cn=sudoers,cn=Schema Compatibility,cn=plugins,cn=config
schema-compat-restrict-subtree: <SUFFIX>
schema-compat-restrict-subtree: cn=Schema Compatibility,cn=plugins,cn=config
schema-compat-ignore-subtree: cn=dna,cn=ipa,cn=etc,<SUFFIX>

dn: cn=users,cn=Schema Compatibility,cn=plugins,cn=config
schema-compat-restrict-subtree: <SUFFIX>
schema-compat-restrict-subtree: cn=Schema Compatibility,cn=plugins,cn=config
schema-compat-ignore-subtree: cn=dna,cn=ipa,cn=etc,<SUFFIX>

dn: cn=MemberOf Plugin,cn=plugins,cn=config
memberofentryscope: <SUFFIX>
memberofentryscopeexcludesubtree: cn=provisioning,<SUFFIX>

But it is not sufficient to prevent the deadlock. A possible solution would be:

dn: cn=MemberOf Plugin,cn=plugins,cn=config
memberofentryscope: <SUFFIX>
memberofentryscopeexcludesubtree: cn=provisioning,<SUFFIX>
memberofentryscopeexcludesubtree: cn=topology,cn=ipa,cn=etc,<SUFFIX>

I think the following configuration would NOT prevent the deadlock

dn: cn=groups,cn=Schema Compatibility,cn=plugins,cn=config
schema-compat-restrict-subtree: <SUFFIX>
schema-compat-restrict-subtree: cn=Schema Compatibility,cn=plugins,cn=config
schema-compat-ignore-subtree: cn=dna,cn=ipa,cn=etc,<SUFFIX>
schema-compat-ignore-subtree: cn=topology,cn=ipa,cn=etc,<SUFFIX>

I think adding cn=compat to the excluded subtrees would also be needed:

dn: cn=MemberOf Plugin,cn=plugins,cn=config
memberofentryscope: <SUFFIX>
memberofentryscopeexcludesubtree: cn=compat,<SUFFIX>
memberofentryscopeexcludesubtree: cn=provisioning,<SUFFIX>
memberofentryscopeexcludesubtree: cn=topology,cn=ipa,cn=etc,<SUFFIX>
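
To make the effect of the extra excluded subtrees concrete, here is a rough sketch of the scope decision. It only approximates the intent of memberofentryscope and memberofentryscopeexcludesubtree and is not the memberof plugin's actual code. The DN handling is naive (split on commas, lowercase, no escaped characters), and `dc=example,dc=com` is a hypothetical value used in place of <SUFFIX>.

```python
def normalize_dn(dn):
    """Naive DN normalization: list of lowercased RDNs (no escaping support)."""
    return [rdn.strip().lower() for rdn in dn.split(",")]


def is_under(dn, base):
    """True if dn equals base or lies below it."""
    d, b = normalize_dn(dn), normalize_dn(base)
    return len(d) >= len(b) and d[len(d) - len(b):] == b


def in_memberof_scope(dn, entry_scopes, exclude_subtrees):
    """True if memberof would consider dn: inside a scope and not excluded."""
    if any(is_under(dn, ex) for ex in exclude_subtrees):
        return False
    return any(is_under(dn, scope) for scope in entry_scopes)


SUFFIX = "dc=example,dc=com"  # hypothetical stand-in for <SUFFIX>
SEGMENT = "cn=a-to-b,cn=domain,cn=topology,cn=ipa,cn=etc," + SUFFIX

CURRENT_EXCLUDES = ["cn=provisioning," + SUFFIX]
PROPOSED_EXCLUDES = CURRENT_EXCLUDES + [
    "cn=compat," + SUFFIX,
    "cn=topology,cn=ipa,cn=etc," + SUFFIX,
]

if __name__ == "__main__":
    print(in_memberof_scope(SEGMENT, [SUFFIX], CURRENT_EXCLUDES))
    print(in_memberof_scope(SEGMENT, [SUFFIX], PROPOSED_EXCLUDES))
```

Under this model a topology segment is in scope with the current excludes and out of scope with the proposed ones, so the DEL of a segment would no longer make memberof run its internal search.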

from triage:

tbordaz: For the fix we may choose:

  • exclude some subtrees from memberof config
  • disable slapi-nis during admin tasks (here the deadlock occurred when ADD 'cn=ipa_pwd_extop,cn=plugins,cn=config')

ipa-4-3:

  • 17873d1 DS deadlock when memberof scopes topology plugin updates

master:

  • e1bbd90 DS deadlock when memberof scopes topology plugin updates

Metadata Update from @tbordaz:
- Issue assigned to tbordaz
- Issue set to the milestone: FreeIPA 4.3.1

4 years ago
