#49286 Poor performance when removing a member to large static group
Closed: wontfix 6 years ago Opened 6 years ago by tbordaz.

Issue Description

Performance issue when deleting a user being part of very large groups (40K users): user-del lasts up to 6 min

The performance hit is due to DS during the LDAP DEL of the user.
Several internal LDAP requests (during LDAP DEL) participated to the overall duration
but by far (99%) the most expensive is triggered by referential integrity plugin that
deletes the deleted user from the groups it was memberof.
More specifically the expensive part is when building a sorted array of the deleted members (a_deleted_values) of the group

Package Version and Platform

freeipa 4.4
DS 1.3.5

Steps to reproduce

Test case would be:
- install freeipa
- install master and a replica (This is important to be in a replicated topology so that urp keeps deleted values).
- Provision 20000 users (better to user https://github.com/freeipa/freeipa-tools/blob/master/create-test-data.py)
- Delete a given user

Actual results

user-del last several minutes

Expected results

user-del should be much more rapid, like few seconds


The layout of the request is: [08/Jun/2017:13:08:14.340684502 -0400] conn=29 op=10 DEL dn="uid=U1,cn=users,cn=accounts,<suffix>" # inexpensive internal MOD [08/Jun/2017:13:08:15.534150619 -0400] conn=Internal op=-1 MOD dn="cn=G1,cn=groups,cn=accounts,<suffix>" [08/Jun/2017:13:08:20.548200003 -0400] conn=Internal op=-1 RESULT err=0 tag=48 nentries=0 etime=0.012000 # First expensive internal MOD where 99% of the time is spent in [1] [08/Jun/2017:13:08:20.548280670 -0400] conn=Internal op=-1 MOD dn="<suffix>" [1] [08/Jun/2017:13:09:54.780515706 -0400] conn=Internal op=-1 SRCH base="cn=G1,cn=groups,cn=accounts,<suffix>" scope=0 filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs=ALL [08/Jun/2017:13:09:54.789216729 -0400] conn=Internal op=-1 RESULT err=0 tag=48 nentries=1 etime=0.009000 [2] ... [08/Jun/2017:13:09:57.819950749 -0400] conn=Internal op=-1 RESULT err=0 tag=48 nentries=0 etime=102.286000 csn=59398480000400040000 # Second expensive internal MOD where 99% of the time is spent in [3] [08/Jun/2017:13:09:57.833759498 -0400] conn=Internal op=-1 MOD dn="cn=ipausers,cn=groups,cn=accounts,<suffix>" [08/Jun/2017:13:09:57.833791633 -0400] conn=Internal op=-1 SRCH base="cn=ipausers,cn=groups,cn=accounts,<suffix>" scope=0 filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs=ALL [08/Jun/2017:13:09:58.431560735 -0400] conn=Internal op=-1 RESULT err=0 tag=48 nentries=1 etime=0.598000 [08/Jun/2017:13:11:41.313486256 -0400] conn=Internal op=-1 SRCH base="cn=ipausers,cn=groups,cn=accounts,<suffix>" scope=0 filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs=ALL [08/Jun/2017:13:11:41.319724880 -0400] conn=Internal op=-1 RESULT err=0 tag=48 nentries=1 etime=0.006000 [3] ... [08/Jun/2017:13:11:44.372613437 -0400] conn=Internal op=-1 RESULT err=0 tag=48 nentries=0 etime=106.539000 csn=593984eb000000040000 [4] # inexpensive DEL of a private group [08/Jun/2017:13:11:44.405304161 -0400] conn=Internal op=-1 DEL dn="cn=U1,cn=groups,cn=accounts,<suffix>" [08/Jun/2017:13:11:44.890214001 -0400] conn=Internal op=-1 RESULT err=0 tag=48 nentries=0 etime=0.485000 csn=59398551000100040000 [08/Jun/2017:13:11:44.893020378 -0400] conn=29 op=10 RESULT err=0 tag=107 nentries=0 etime=210.553000 csn=59398480000300040000 Following is the most expensive part where time is spent to build a sorted array (a_deleted_values). To sort it, it ends comparing values. ** [1] 99s/102s ** Thread 21 (Thread 0x7f592bfef700 (LWP 10586)): #0 0x00007f5bcf26510a in __strlen_sse2_pminub () from /lib64/libc.so.6 #1 0x00007f5bd1982589 in slapi_has8thBit () from /usr/lib64/dirsrv/libslapd.so.0 #2 0x00007f5bd1983156 in slapi_utf8casecmp () from /usr/lib64/dirsrv/libslapd.so.0 #3 0x00007f5bd19884ed in valueset_value_cmp () from /usr/lib64/dirsrv/libslapd.so.0 #4 0x00007f5bd1989163 in valueset_array_to_sorted () from /usr/lib64/dirsrv/libslapd.so.0 #5 0x00007f5bd19894e7 in slapi_valueset_add_attr_valuearray_ext () from /usr/lib64/dirsrv/libslapd.so.0 #6 0x00007f5bd191439f in entry_delete_present_values_wsi.isra.5 () from /usr/lib64/dirsrv/libslapd.so.0 #7 0x00007f5bd1914843 in entry_apply_mod_wsi () from /usr/lib64/dirsrv/libslapd.so.0 #8 0x00007f5bd1914963 in entry_apply_mods_wsi () from /usr/lib64/dirsrv/libslapd.so.0 #9 0x00007f5bc3adaea6 in modify_apply_check_expand.isra.1 () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #10 0x00007f5bc3adc87b in ldbm_back_modify () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #11 0x00007f5bd193c55b in op_shared_modify () from /usr/lib64/dirsrv/libslapd.so.0 #12 0x00007f5bd193d094 in modify_internal_pb () from /usr/lib64/dirsrv/libslapd.so.0 #13 0x00007f5bc2344041 in _do_modify () from /usr/lib64/dirsrv/plugins/libreferint-plugin.so #14 0x00007f5bc234644b in update_integrity () from /usr/lib64/dirsrv/plugins/libreferint-plugin.so #15 0x00007f5bc23471a2 in referint_postop_del () from /usr/lib64/dirsrv/plugins/libreferint-plugin.so #16 0x00007f5bd194edb8 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0 #17 0x00007f5bd194f043 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0 #18 0x00007f5bc3ace277 in ldbm_back_delete () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #19 0x00007f5bd1900d60 in op_shared_delete () from /usr/lib64/dirsrv/libslapd.so.0 #20 0x00007f5bd1901043 in do_delete () from /usr/lib64/dirsrv/libslapd.so.0 #21 0x00007f5bd1e269ed in connection_threadmain () #22 0x00007f5bcfb2996b in _pt_root () from /lib64/libnspr4.so #23 0x00007f5bcf4c9dc5 in start_thread () from /lib64/libpthread.so.0 #24 0x00007f5bcf1f873d in clone () from /lib64/libc.so.6 Following is the final part of the MOD updating database/index but it does not cost a lot ** [2] 3s/102s ** #0 0x00007f5bc3aba11c in bvals_strcasecmp () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #1 0x00007f5bcf138b35 in msort_with_tmp.part.0 () from /lib64/libc.so.6 #2 0x00007f5bcf138a12 in msort_with_tmp.part.0 () from /lib64/libc.so.6 #3 0x00007f5bcf138a28 in msort_with_tmp.part.0 () from /lib64/libc.so.6 #4 0x00007f5bcf138a28 in msort_with_tmp.part.0 () from /lib64/libc.so.6 #5 0x00007f5bcf138f6c in qsort_r () from /lib64/libc.so.6 #6 0x00007f5bc3abd9f7 in index_addordel_values_ext_sv () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #7 0x00007f5bc3abdcf4 in index_addordel_values_sv () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #8 0x00007f5bc3abe724 in index_add_mods () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #9 0x00007f5bc3adb8d9 in ldbm_back_modify () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #10 0x00007f5bd193c55b in op_shared_modify () from /usr/lib64/dirsrv/libslapd.so.0 #11 0x00007f5bd193d094 in modify_internal_pb () from /usr/lib64/dirsrv/libslapd.so.0 #12 0x00007f5bc2344041 in _do_modify () from /usr/lib64/dirsrv/plugins/libreferint-plugin.so #13 0x00007f5bc234644b in update_integrity () from /usr/lib64/dirsrv/plugins/libreferint-plugin.so #14 0x00007f5bc23471a2 in referint_postop_del () from /usr/lib64/dirsrv/plugins/libreferint-plugin.so #15 0x00007f5bd194edb8 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0 #16 0x00007f5bd194f043 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0 #17 0x00007f5bc3ace277 in ldbm_back_delete () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #18 0x00007f5bd1900d60 in op_shared_delete () from /usr/lib64/dirsrv/libslapd.so.0 #19 0x00007f5bd1901043 in do_delete () from /usr/lib64/dirsrv/libslapd.so.0 #20 0x00007f5bd1e269ed in connection_threadmain () #21 0x00007f5bcfb2996b in _pt_root () from /lib64/libnspr4.so #22 0x00007f5bcf4c9dc5 in start_thread () from /lib64/libpthread.so.0 #23 0x00007f5bcf1f873d in clone () from /lib64/libc.so.6 Following is the most expensive part where the time is spent to build a sorted array (a_deleted_values). To sort it, it ends comparing values. ** [3] 103s/106s ** Thread 21 (Thread 0x7f592bfef700 (LWP 10586)): #0 0x00007f5bd19825a0 in slapi_has8thBit () from /usr/lib64/dirsrv/libslapd.so.0 #1 0x00007f5bd1983156 in slapi_utf8casecmp () from /usr/lib64/dirsrv/libslapd.so.0 #2 0x00007f5bd19884ed in valueset_value_cmp () from /usr/lib64/dirsrv/libslapd.so.0 #3 0x00007f5bd1989163 in valueset_array_to_sorted () from /usr/lib64/dirsrv/libslapd.so.0 #4 0x00007f5bd19894e7 in slapi_valueset_add_attr_valuearray_ext () from /usr/lib64/dirsrv/libslapd.so.0 #5 0x00007f5bd191439f in entry_delete_present_values_wsi.isra.5 () from /usr/lib64/dirsrv/libslapd.so.0 #6 0x00007f5bd1914843 in entry_apply_mod_wsi () from /usr/lib64/dirsrv/libslapd.so.0 #7 0x00007f5bd1914963 in entry_apply_mods_wsi () from /usr/lib64/dirsrv/libslapd.so.0 #8 0x00007f5bc3adaea6 in modify_apply_check_expand.isra.1 () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #9 0x00007f5bc3adc87b in ldbm_back_modify () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #10 0x00007f5bd193c55b in op_shared_modify () from /usr/lib64/dirsrv/libslapd.so.0 #11 0x00007f5bd193d094 in modify_internal_pb () from /usr/lib64/dirsrv/libslapd.so.0 #12 0x00007f5bc2344041 in _do_modify () from /usr/lib64/dirsrv/plugins/libreferint-plugin.so #13 0x00007f5bc234644b in update_integrity () from /usr/lib64/dirsrv/plugins/libreferint-plugin.so #14 0x00007f5bc23471a2 in referint_postop_del () from /usr/lib64/dirsrv/plugins/libreferint-plugin.so #15 0x00007f5bd194edb8 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0 #16 0x00007f5bd194f043 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0 #17 0x00007f5bc3ace277 in ldbm_back_delete () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #18 0x00007f5bd1900d60 in op_shared_delete () from /usr/lib64/dirsrv/libslapd.so.0 #19 0x00007f5bd1901043 in do_delete () from /usr/lib64/dirsrv/libslapd.so.0 #20 0x00007f5bd1e269ed in connection_threadmain () #21 0x00007f5bcfb2996b in _pt_root () from /lib64/libnspr4.so #22 0x00007f5bcf4c9dc5 in start_thread () from /lib64/libpthread.so.0 #23 0x00007f5bcf1f873d in clone () from /lib64/libc.so.6 Following is the final part of the MOD updating database/index but it does not cost a lot ** [4] 3s/106s ** Thread 21 (Thread 0x7f592bfef700 (LWP 10586)): #0 0x00007f5bcf18fb79 in __memcpy_sse2 () from /lib64/libc.so.6 #1 0x00007f5bcf138a12 in msort_with_tmp.part.0 () from /lib64/libc.so.6 #2 0x00007f5bcf138f6c in qsort_r () from /lib64/libc.so.6 #3 0x00007f5bc3abd9f7 in index_addordel_values_ext_sv () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #4 0x00007f5bc3abdcf4 in index_addordel_values_sv () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #5 0x00007f5bc3abe724 in index_add_mods () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #6 0x00007f5bc3adb8d9 in ldbm_back_modify () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #7 0x00007f5bd193c55b in op_shared_modify () from /usr/lib64/dirsrv/libslapd.so.0 #8 0x00007f5bd193d094 in modify_internal_pb () from /usr/lib64/dirsrv/libslapd.so.0 #9 0x00007f5bc2344041 in _do_modify () from /usr/lib64/dirsrv/plugins/libreferint-plugin.so #10 0x00007f5bc234644b in update_integrity () from /usr/lib64/dirsrv/plugins/libreferint-plugin.so #11 0x00007f5bc23471a2 in referint_postop_del () from /usr/lib64/dirsrv/plugins/libreferint-plugin.so #12 0x00007f5bd194edb8 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0 #13 0x00007f5bd194f043 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0 #14 0x00007f5bc3ace277 in ldbm_back_delete () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #15 0x00007f5bd1900d60 in op_shared_delete () from /usr/lib64/dirsrv/libslapd.so.0 #16 0x00007f5bd1901043 in do_delete () from /usr/lib64/dirsrv/libslapd.so.0 #17 0x00007f5bd1e269ed in connection_threadmain () #18 0x00007f5bcfb2996b in _pt_root () from /lib64/libnspr4.so #19 0x00007f5bcf4c9dc5 in start_thread () from /lib64/libpthread.so.0 #20 0x00007f5bcf1f873d in clone () from /lib64/libc.so.6

Metadata Update from @tbordaz:
- Custom field type adjusted to defect

6 years ago

Metadata Update from @tbordaz:
- Custom field origin adjusted to IPA

6 years ago

Metadata Update from @tbordaz:
- Issue assigned to tbordaz

6 years ago

In my tests the first removed users are fast. After 4 hours approximately running the task, the bad performance is observed.

ipa -v user-del user1 user2 userN

N >= 13 000 users

If I cancel the task and wait for a few days, the IPA returns to normal process. Even if I restart the IPA and server, the problem persists several days.

Actually it was a known bug https://fedorahosted.org/389/ticket/48894. That was already fixed 1.3.6.1-2.

Would you try this version ?

Metadata Update from @tbordaz:
- Issue close_status updated to: duplicate
- Issue status updated to: Closed (was: Open)

6 years ago

Hello, this version is not available in CentOS 7.3. The current version is 1.3.5.10-18. It is good to know that it has already been resolved.

Thanks.

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/2345

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: duplicate)

3 years ago

Login to comment on this ticket.

Metadata