#49233 Crash in persistent search
Closed: wontfix 6 years ago Opened 6 years ago by tbordaz.

Issue Description

Crash in persistent search

Package Version and Platform

master branch

Steps to reproduce

  1. install an instance
  2. add an entry (uid=foo,ou=People,dc=example,dc=com)
  3. run a ldapsearch with persistent search
    /usr/lib64/mozldap/ldapsearch -h localhost -p 38901 -D "cn=directory manager" -C PS:modify -w password -b "dc=example,dc=com" "uid=*"
    4: update the entry (uid=foo,ou=People,dc=example,dc=com)

Actual results

Crash is sytematic

(gdb) where
#0  0x00007fe10f3655df in slapi_sdn_get_ndn (sdn=0x200000)
    at <path>/ldap/servers/slapd/dn.c:2383
#1  0x00007fe10f3659cb in slapi_sdn_compare (sdn1=0x7fe08c006380, sdn2=0x200000)
    at <path>/ldap/servers/slapd/dn.c:2512
#2  0x00007fe10f365d1d in slapi_sdn_scope_test (dn=0x7fe08c006380, base=0x200000, scope=0)
    at <path>/ldap/servers/slapd/dn.c:2645
#3  0x000000000042d07d in ps_service_persistent_searches (e=0x7fe08c006380, eprev=0x0, chgtype=4, chgnum=0)
    at <path>/ldap/servers/slapd/psearch.c:567
#4  0x00007fe10f3ae6d6 in do_ps_service (e=0x7fe08c006380, eprev=0x0, chgtype=4, chgnum=0)
    at <path>/ldap/servers/slapd/opshared.c:98
#5  0x00007fe10f3a846e in op_shared_modify (pb=0x7fe08c0008c0, pw_change=0, old_pw=0x0)
    at <path>/ldap/servers/slapd/modify.c:1083
#6  0x00007fe10f3a68ff in do_modify (pb=0x7fe08c0008c0)
    at <path>/ldap/servers/slapd/modify.c:391
#7  0x00000000004170e8 in connection_dispatch_operation (conn=0x4a83bb8, op=0x7fe0b4007610, pb=0x7fe08c0008c0)
    at <path>/ldap/servers/slapd/connection.c:633
#8  0x0000000000418fe5 in connection_threadmain ()
    at <path>/ldap/servers/slapd/connection.c:1768
#9  0x00007fe10d77268b in _pt_root (arg=0x241df80) at ../../../nspr/pr/src/pthreads/ptthread.c:216
#10 0x00007fe10d11160a in start_thread () from /lib64/libpthread.so.0
#11 0x00007fe10cc02a4d in clone () from /lib64/libc.so.6
(gdb) frame 3
#3  0x000000000042d07d in ps_service_persistent_searches (e=0x7fe08c006380, eprev=0x0, chgtype=4, chgnum=0)
    at <path>/ldap/servers/slapd/psearch.c:567
567         if ( slapi_sdn_scope_test( slapi_entry_get_sdn_const(e), base, scope ) &&
(gdb) print base
$13 = (Slapi_DN *) 0x200000
(gdb) print pb_op
$14 = (Operation *) 0x7fe0b4004590
(gdb) print *pb_op
$15 = {o_ber = 0xf597000a, o_msgid = 0, o_tag = 72058143793676288, o_time = 0, o_interval = 0, o_isroot = 0, o_sdn = {
    flag = 0 '\000', udn = 0x0, dn = 0x0, ndn = 0x0, ndn_len = 0}, o_authtype = 0x0, o_ssf = 0, o_opid = 0, o_connid = 0, 
  o_handler_data = 0x0, o_result_handler = 0x0, o_search_entry_handler = 0xf5, o_search_referral_handler = 0x7fe0b4002d60, 
  o_csngen_handler = 0x7fe0b4004700, o_replica_attr_handler = 0x7fe0b40042f0, o_next = 0x0, o_status = 0, o_searchattrs = 0x0, 
  o_flags = 12884901892, o_extension = 0x9, o_target_spec = 0x0, o_abandoned_op = 1, o_params = {operation_type = 140603069324624, 
    target_address = {udn = 0x666c80 <ids_sasl_callbacks> "\001", 
      uniqueid = 0xffffffff00000000 <error: Cannot access memory at address 0xffffffff00000000>, sdn = 0x200000}, csn = 0x0, 
    request_controls = 0x0, p = {p_add = {target_entry = 0x0, parentuniqueid = 0x0}, p_bind = {bind_method = 0, bind_creds = 0x0, 
        bind_saslmechanism = 0x7fe10d532c00 <_sasl_canon_user_lookup> "AWAVAUATM\211\305USH\211\375A\211\314H\203\354\b\350D\335\377\377\205\300\211\303t.\301\350\037H\205\355\017\225\302!Є\300t\006\211\235`\t", bind_ret_saslcreds = 0x7fe0b4004380}, 
      p_compare = {compare_ava = {ava_type = 0x0, ava_value = {bv_len = 0, 
            bv_val = 0x7fe10d532c00 <_sasl_canon_user_lookup> "AWAVAUATM\211\305USH\211\375A\211\314H\203\354\b\350D\335\377\377\205\300\211\303t.\301\350\037H\205\355\017\225\302!Є\300t\006\211\235`\t"}, ava_private = 0x7fe0b4004380}}, p_modify = {
        modify_mods = 0x0}, p_modrdn = {modrdn_newrdn = 0x0, modrdn_deloldrdn = 0, modrdn_newsuperior_address = {
          udn = 0x7fe10d532c00 <_sasl_canon_user_lookup> "AWAVAUATM\211\305USH\211\375A\211\314H\203\354\b\350D\335\377\377\205\300\211\303t.\301\350\037H\205\355\017\225\302!Є\300t\006\211\235`\t", uniqueid = 0x7fe0b4004380 "\210c", sdn = 0x0}, 
        modrdn_mods = 0x0}, p_search = {search_scope = 0, search_deref = 0, search_sizelimit = 0, search_timelimit = 0, 
        search_filter = 0x7fe10d532c00 <_sasl_canon_user_lookup>, search_strfilter = 0x7fe0b4004380 "\210c", search_attrs = 0x0, 
        search_attrsonly = 0, search_is_and = 0, search_gerattrs = 0x0}, p_abandon = {abandon_targetmsgid = 0}, p_extended = {
        exop_oid = 0x0, exop_value = 0x0}}}, o_results = {operation_type = 0, opreturn = 0, result_controls = 0x0, 
    result_code = 0, result_text = 0x400000000 <error: Cannot access memory at address 0x400000000>, result_matched = 0x0, r = {
      r_bind = {bind_ret_saslcreds = 0x25}, r_search = {search_result_set = 0x25, search_result_entry = 0x7fe000534469, 
        opaque_backend_ptr = 0x7fe0b40000a8, nentries = 0, search_referrals = 0x25, estimate = 0}, r_extended = {
        exop_ret_oid = 0x25 <error: Cannot access memory at address 0x25>, exop_ret_value = 0x7fe000534469}}}, 
  o_pagedresults_sizelimit = -1275068280, o_reverse_search_state = 32736}
(gdb) print pb_op.o_params.target_address.sdn
$16 = (Slapi_DN *) 0x200000

Expected results


Metadata Update from @tbordaz:
- Custom field origin adjusted to IPA
- Custom field type adjusted to defect

6 years ago

Metadata Update from @tbordaz:
- Issue assigned to tbordaz

6 years ago

Metadata Update from @firstyear:
- Issue assigned to firstyear (was: tbordaz)

6 years ago

Given our discussion, we think this is related to #49097, as this part of the code does a pblock clone: it's likely that this is a use-after-free of some kind. I would like to make a lib389 test case for this, and be able to reproduce this.

=================================================================
==9174==ERROR: AddressSanitizer: heap-use-after-free on address 0x6140000802d8 at pc 0x00000042b727 bp 0x7fecedeb6aa0 sp 0x7fecedeb6a90
READ of size 8 at 0x6140000802d8 thread T33
    #0 0x42b726 in connection_add_operation /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:2022:0
    #1 0x42ad1b in connection_activity /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:1883:0
    #2 0x43864a in ns_handle_pr_read_ready /home/william/development/389ds/ds/ldap/servers/slapd/daemon.c:2008:0
    #3 0x7fed31dcdff9 in work_job_execute /home/william/development/389ds/ds/src/nunc-stans/ns/ns_thrpool.c:279:0
    #4 0x7fed31dcfd11 in event_cb /home/william/development/389ds/ds/src/nunc-stans/ns/ns_thrpool.c:619:0
    #5 0x7fed31dd434c in event_cb /home/william/development/389ds/ds/src/nunc-stans/ns/ns_event_fw_event.c:108:0
    #6 0x7fed2f19a9d8 in event_base_loop ??:0:0
    #7 0x7fed31dd4cb9 in ns_event_fw_loop /home/william/development/389ds/ds/src/nunc-stans/ns/ns_event_fw_event.c:310:0
    #8 0x7fed31dcf9f2 in event_loop_thread_func /home/william/development/389ds/ds/src/nunc-stans/ns/ns_thrpool.c:558:0
    #9 0x7fed2f3da36c in start_thread /usr/src/debug/glibc-2.25-80-ga10e9c4/nptl/pthread_create.c:456:0
    #10 0x7fed2ecb35be in __GI___clone /usr/src/debug////////glibc-2.25-80-ga10e9c4/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:97:0

0x6140000802d8 is located 152 bytes inside of 416-byte region [0x614000080240,0x6140000803e0)
freed by thread T35 here:
    #0 0x7fed320b94e8 in __interceptor_free _asan_rtl_:0
    #1 0x7fed316ad6bd in slapi_ch_free /home/william/development/389ds/ds/ldap/servers/slapd/ch_malloc.c:274:0
    #2 0x7fed3177a6e9 in ber_special_free /home/william/development/389ds/ds/ldap/servers/slapd/operation.c:144:0
    #3 0x7fed3177b4eb in operation_free /home/william/development/389ds/ds/ldap/servers/slapd/operation.c:271:0
    #4 0x7fed3178d758 in pblock_done /home/william/development/389ds/ds/ldap/servers/slapd/pblock.c:193:0
    #5 0x7fed3178d6cb in slapi_pblock_init /home/william/development/389ds/ds/ldap/servers/slapd/pblock.c:173:0
    #6 0x42a2db in connection_threadmain /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:1804:0
    #7 0x7fed2f61ae8a in _pt_root /usr/src/debug/nspr-4.13.1/pr/src/pthreads/../../../nspr/pr/src/pthreads/ptthread.c:216:0

previously allocated by thread T33 here:
    #0 0x7fed320b9880 in malloc _asan_rtl_:0
    #1 0x7fed316acdd9 in slapi_ch_malloc /home/william/development/389ds/ds/ldap/servers/slapd/ch_malloc.c:96:0
    #2 0x7fed3177a6b2 in ber_special_alloc /home/william/development/389ds/ds/ldap/servers/slapd/operation.c:135:0
    #3 0x7fed3177abd6 in operation_new /home/william/development/389ds/ds/ldap/servers/slapd/operation.c:206:0
    #4 0x420614 in connection_get_operation /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:108:0
    #5 0x42acdb in connection_activity /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:1882:0
    #6 0x43864a in ns_handle_pr_read_ready /home/william/development/389ds/ds/ldap/servers/slapd/daemon.c:2008:0
    #7 0x7fed31dcdff9 in work_job_execute /home/william/development/389ds/ds/src/nunc-stans/ns/ns_thrpool.c:279:0
    #8 0x7fed31dcfd11 in event_cb /home/william/development/389ds/ds/src/nunc-stans/ns/ns_thrpool.c:619:0
    #9 0x7fed31dd434c in event_cb /home/william/development/389ds/ds/src/nunc-stans/ns/ns_event_fw_event.c:108:0
    #10 0x7fed2f19a9d8 in event_base_loop ??:0:0

Thread T33 created by T0 here:
    #0 0x7fed32012a5f in pthread_create _asan_rtl_:0
    #1 0x7fed31dd31ad in ns_thrpool_new /home/william/development/389ds/ds/src/nunc-stans/ns/ns_thrpool.c:1417:0
    #2 0x433a03 in slapd_daemon /home/william/development/389ds/ds/ldap/servers/slapd/daemon.c:1046:0
    #3 0x449616 in main /home/william/development/389ds/ds/ldap/servers/slapd/main.c:1144:0
    #4 0x7fed2ebc25fd in __libc_start_main /usr/src/debug/glibc-2.25-80-ga10e9c4/csu/../csu/libc-start.c:295:0

Thread T35 created by T0 here:
    #0 0x7fed32012a5f in pthread_create _asan_rtl_:0
    #1 0x7fed2f61ab69 in _PR_CreateThread /usr/src/debug/nspr-4.13.1/pr/src/pthreads/../../../nspr/pr/src/pthreads/ptthread.c:457:0

SUMMARY: AddressSanitizer: heap-use-after-free (/opt/dirsrv/sbin/ns-slapd+0x42b726)
Shadow bytes around the buggy address:
  0x0c2880008000: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c2880008010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2880008020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2880008030: 00 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa
  0x0c2880008040: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
=>0x0c2880008050: fd fd fd fd fd fd fd fd fd fd fd[fd]fd fd fd fd
  0x0c2880008060: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2880008070: fd fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa
  0x0c2880008080: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c2880008090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c28800080a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==9174==ABORTING

Metadata Update from @firstyear:
- Custom field reviewstatus adjusted to review

6 years ago

Hi William,

great finding !! The fix looks good to me.

A couple of questions. Clearing the operation field before slapi_pblock_init prevents to free the operation struct, where will the operation struct be freed ?
During shutdown, the pblock is destroyed and the operation is freed. How is it checked that work_job_execute is not accessing this operation.

Apparently:

108  »       »       »            * ps_send_results will call connection_remove_operation_ext to free it

So I think it's freed there. In my tests (which were done with ASAN+LSAN for leak checking), no memory was leaked, so I think this is working :)

You know, that's an awesome question. I wish I knew the answer. I'll have to re-examine the code, and get back to you about work_job_execute. Previously, this was a stack allocated pblock, so as soon as return was called it was "freed", so moving the pblock_destroy() to before the return has the same effect in essence, IE it should work. But since you have made the comment, I think it's worth checking to be sure.

Metadata Update from @tbordaz:
- Custom field rhbz adjusted to https://pagure.io/389-ds-base/issue/49233

6 years ago

Metadata Update from @mreynolds:
- Custom field reviewstatus adjusted to ack (was: review)

6 years ago

I acked it, but the indentation is off again, please fix it before pushing

Ahhh @tbordaz work job execute is part of nunc stans. What matters there is that the pblock is in ns_handle_pr_read_ready(), which expects the operation to be available. Pre-fix it was freed, and this is where the first access occured to cause heap use after free. The fix prevents the op being freed prematurely, so ns_handle_pr_read_ready() will have the op put back into the pb, so it will work correctly.

Metadata Update from @firstyear:
- Custom field reviewstatus adjusted to review (was: ack)
- Custom field rhbz reset (from https://pagure.io/389-ds-base/issue/49233)

6 years ago

commit 565314c
To ssh://git@pagure.io/389-ds-base.git
37da9e6..565314c master -> master

Metadata Update from @firstyear:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

6 years ago

Just to clarify where this fix applies.
It fixes a regression introduced in #49097 (in connection.c:connectio_threadmain OP_FLAG_PS calls slapi_pblock_init rather that memset the pblock)

The issue was that in pblock_clone(), previous it did a straight struct copy. Then when the next iteration occured source_pb would be memset(0). However, in the #49097, this changed to slapi_pblock_init(), which detects source_pb->pb_op and frees it. Of course, because we copied source_pb->pb_op as a pointer to dst_pb->pb_op, we free it from under dst_pb->pb_op. The memory stays for a "bit", and then is reclaimed (giving my heap use after free trace, or the weird segfault you see later on). So the "fix", is to NULL source_pb->pb_op in this case so that the ownership is transferred to dst_pb. :)

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/2292

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: fixed)

3 years ago

Login to comment on this ticket.

Metadata