On 32-bit platforms, deleting a suffix can crash the DS. The server does not crash on 64-bit platforms. One strange aspect is that the crash is only triggered when there is NO backend named "userRoot". This cannot easily be explained, since "userRoot" is not a hard-coded value in the source.
Steps to reproduce:
[1] Create 7 suffixes
[2] Add an entry to each suffix
[3] Restart the server
[4] Delete the last suffix that was added --> crash!
Valgrind output:
==790== Invalid read of size 4
==790==    at 0x64C289: dse_call_callback (dse.c:2386) --> crash
==790==    by 0x64C943: dse_delete (dse.c:2314)
==790==    by 0x645E3B: op_shared_delete (delete.c:364)
==790==    by 0x646251: do_delete (delete.c:128)
==790==    by 0x8059564: connection_threadmain (connection.c:583)
==790==    by 0xDE4271: ??? (in /lib/libnspr4.so)
==790==    by 0xBC7A48: start_thread (in /lib/libpthread-2.12.so)
==790==    by 0xB03AED: clone (in /lib/libc-2.12.so)
==790==  Address 0x6003dbc is 4 bytes inside a block of size 36 free'd
==790==    at 0x4006C7F: free (vg_replace_malloc.c:446)
==790==    by 0x6416EC: slapi_ch_free (ch_malloc.c:363)
==790==    by 0x64B354: dse_callback_delete (dse.c:260)
==790==    by 0x64B463: dse_remove_callback (dse.c:345)
==790==    by 0x64B526: slapi_config_remove_callback (dse.c:2434)
==790==    by 0x6CD1E58: vlv_remove_callbacks (vlv.c:469)
==790==    by 0x6CB3150: ldbm_instance_post_delete_instance_entry_callback (ldbm_instance_config.c:1062)
==790==    by 0x64C32F: dse_call_callback (dse.c:2393)
==790==    by 0x64C943: dse_delete (dse.c:2314)
==790==    by 0x645E3B: op_shared_delete (delete.c:364)
==790==    by 0x646251: do_delete (delete.c:128)
==790==    by 0x8059564: connection_threadmain (connection.c:583)
==790==    by 0xDE4271: ??? (in /lib/libnspr4.so)
==790==    by 0xBC7A48: start_thread (in /lib/libpthread-2.12.so)
==790==    by 0xB03AED: clone (in /lib/libc-2.12.so)
attachment 0001-Ticket-562-Crash-when-deleting-suffix.patch
This fix reduces the risk that p->next is modified before it is used, but cannot completely eliminate it. At the time this thread does p = p->next, another thread could unlink and free that node (p->next = p->next->next; free the old p->next), and at the beginning of your loop it's gone.
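To make the point above concrete, here is a minimal sketch of the "read p->next before calling the callback" pattern. It is not the attached patch: `cb_node`, `cb_call_all`, `cb_remove` and the demo function are hypothetical names for illustration, and the real structures in dse.c differ. The pattern protects against a callback that frees its own node, which is what the Valgrind trace shows, but not against something freeing the saved next node.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical callback node; the real structure in dse.c differs. */
struct cb_node {
    int id;
    void (*fn)(struct cb_node **head, struct cb_node *self);
    struct cb_node *next;
};

/* Unlink one node from the list and free it. */
static void cb_remove(struct cb_node **head, struct cb_node *victim)
{
    for (struct cb_node **pp = head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == victim) {
            *pp = victim->next;
            free(victim);
            return;
        }
    }
}

/* Callback that leaves the list alone. */
static void cb_noop(struct cb_node **head, struct cb_node *self)
{
    (void)head;
    (void)self;
}

/* Callback that removes its own node, like the post-delete callback in the trace. */
static void cb_remove_self(struct cb_node **head, struct cb_node *self)
{
    cb_remove(head, self);
}

/* Push a new node at the front of the list and return the new head. */
static struct cb_node *cb_push(struct cb_node *head, int id,
                               void (*fn)(struct cb_node **, struct cb_node *))
{
    struct cb_node *n = malloc(sizeof(*n));
    if (n == NULL) {
        abort();
    }
    n->id = id;
    n->fn = fn;
    n->next = head;
    return n;
}

/* Call every callback; p->next is read BEFORE fn runs, because fn may free p. */
static int cb_call_all(struct cb_node **head)
{
    int called = 0;
    struct cb_node *p = *head;
    while (p != NULL) {
        struct cb_node *next = p->next; /* saved while p is still valid */
        p->fn(head, p);
        called++;
        p = next; /* still dangles if something freed 'next' in the meantime */
    }
    return called;
}

/* Build the list 1 -> 2 -> 3 where node 2 removes itself; return what is left. */
int demo_save_next(void)
{
    struct cb_node *head = NULL;
    head = cb_push(head, 3, cb_noop);
    head = cb_push(head, 2, cb_remove_self);
    head = cb_push(head, 1, cb_noop);

    int called = cb_call_all(&head);
    assert(called == 3);

    int remaining = 0;
    while (head != NULL) {
        struct cb_node *next = head->next;
        free(head);
        head = next;
        remaining++;
    }
    return remaining;
}
```

Calling `demo_save_next()` from a `main` walks all three callbacks and leaves two nodes. If a callback (or another thread) removed the node after the current one instead, the saved `next` pointer would be stale, which is the remaining risk described above.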
Replying to [comment:2 lkrispen]:
I guess it's possible two threads could be trying to delete the same suffix. I'll look into adding locks, but adding locking around this recursive code is not so easy. Deleting a callback calls callbacks from the same list, which can then delete callbacks from the same list, and then we call the dse_delete function again, and the cycle continues.
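For illustration only, one common way to tolerate this kind of re-entrancy on a single thread is to defer the actual free until the outermost traversal has finished. The sketch below is an assumption about how that could look, not what was committed for this ticket; `cb_list`, `depth`, `dead` and all function names are hypothetical. It does nothing for the two-thread case, which would still need a lock.

```c
#include <assert.h>
#include <stdlib.h>

struct cb_list;
typedef void (*cb_fn)(struct cb_list *list, int id);

/* Hypothetical node: removal during a traversal only sets 'dead'. */
struct cb_node {
    int id;
    int dead;
    cb_fn fn;
    struct cb_node *next;
};

struct cb_list {
    struct cb_node *head;
    int depth; /* number of nested traversals in progress */
    int calls; /* how many callbacks were invoked */
};

/* Push a new callback at the front of the list. */
static void cb_add(struct cb_list *l, int id, cb_fn fn)
{
    struct cb_node *n = calloc(1, sizeof(*n));
    if (n == NULL) {
        abort();
    }
    n->id = id;
    n->fn = fn;
    n->next = l->head;
    l->head = n;
}

/* Unlink and free every node marked dead. */
static void cb_sweep(struct cb_list *l)
{
    struct cb_node **pp = &l->head;
    while (*pp != NULL) {
        struct cb_node *n = *pp;
        if (n->dead) {
            *pp = n->next;
            free(n);
        } else {
            pp = &n->next;
        }
    }
}

/* Mark matching nodes dead; free them now only if no traversal is active. */
static void cb_remove(struct cb_list *l, int id)
{
    for (struct cb_node *n = l->head; n != NULL; n = n->next) {
        if (n->id == id) {
            n->dead = 1;
        }
    }
    if (l->depth == 0) {
        cb_sweep(l);
    }
}

/* Call all live callbacks; nodes stay allocated until the outermost call ends. */
static void cb_call_all(struct cb_list *l)
{
    l->depth++;
    for (struct cb_node *n = l->head; n != NULL; n = n->next) {
        if (!n->dead) {
            l->calls++;
            n->fn(l, n->id);
        }
    }
    if (--l->depth == 0) {
        cb_sweep(l);
    }
}

static void cb_noop(struct cb_list *l, int id)
{
    (void)l;
    (void)id;
}

/* Removes itself and its neighbour, then re-enters the traversal. */
static void cb_cascade(struct cb_list *l, int id)
{
    cb_remove(l, id);
    cb_remove(l, id + 1);
    cb_call_all(l);
}

/* List 1 -> 2 -> 3 -> 4 where node 2 cascades; return the surviving node count. */
int demo_deferred_removal(void)
{
    struct cb_list list = { NULL, 0, 0 };
    cb_add(&list, 4, cb_noop);
    cb_add(&list, 3, cb_noop);
    cb_add(&list, 2, cb_cascade);
    cb_add(&list, 1, cb_noop);

    cb_call_all(&list);
    assert(list.calls == 5); /* 1, 2, then nested 1 and 4, then outer 4 */

    int remaining = 0;
    while (list.head != NULL) {
        struct cb_node *next = list.head->next;
        free(list.head);
        list.head = next;
        remaining++;
    }
    return remaining;
}
```

In this sketch the nested traversal never touches freed memory because nodes 2 and 3 are only unlinked once `depth` drops back to zero. The trade-off is that dead nodes linger until the outermost caller returns.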
Linked to Bugzilla bug: https://bugzilla.redhat.com/show_bug.cgi?id=901507 (Red Hat Enterprise Linux 7)
master
git merge ticket562
Updating 435972e..6c855a8
Fast-forward
 ldap/servers/slapd/dse.c | 32 ++++++++++++++------------------
 1 files changed, 14 insertions(+), 18 deletions(-)

git push origin master
Counting objects: 20, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (11/11), done.
Writing objects: 100% (11/11), 1.66 KiB, done.
Total 11 (delta 7), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   86ceedb..0d41c0b  master -> master
1.3.0
git push origin 389-ds-base-1.3.0
Counting objects: 11, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (6/6), 1.21 KiB, done.
Total 6 (delta 4), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   606cb84..ba4b1c6  389-ds-base-1.3.0 -> 389-ds-base-1.3.0
Metadata Update from @mreynolds: - Issue assigned to mreynolds - Issue set to the milestone: 1.3.0.3
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/562
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: Fixed)