#48149 ns-slapd double free or corruption crash
Closed: wontfix. Opened 6 years ago by lkrispen.

ds crashes occasionally when performing a cn=monitor search.

This bug was reported in bz1203338, details there.

The core issue is in libdb, see bz 1211871, but it could eventually be fixed in DS by preventing db_open calls and memp_stat calls from running in parallel.


I wrote a lib389 test and got the following crash:
{{{
Core was generated by `./ns-slapd -D /root/389TEST/install/etc/dirsrv/slapd-standalone -i /root/389TES'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f9bc6bb6887 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install audit-libs-2.4-1.fc20.x86_64 cyrus-sasl-gssapi-2.1.26-14.fc20.x86_64 cyrus-sasl-lib-2.1.26-14.fc20.x86_64 cyrus-sasl-md5-2.1.26-14.fc20.x86_64 glibc-2.18-14.fc20.x86_64 keyutils-libs-1.5.9-1.fc20.x86_64 krb5-libs-1.11.5-11.fc20.x86_64 libcom_err-1.42.8-3.fc20.x86_64 libdb-5.3.28-1.fc20.x86_64 libgcc-4.8.3-7.fc20.x86_64 libicu-50.1.2-11.fc20.x86_64 libselinux-2.2.1-6.fc20.x86_64 libstdc++-4.8.3-7.fc20.x86_64 nspr-4.10.7-1.fc20.x86_64 nss-3.17.0-1.fc20.x86_64 nss-softokn-3.17.0-1.fc20.x86_64 nss-softokn-freebl-3.17.0-1.fc20.x86_64 nss-util-3.17.0-1.fc20.x86_64 openssl-libs-1.0.1e-39.fc20.x86_64 pam-1.1.8-1.fc20.x86_64 pcre-8.33-6.fc20.x86_64 sqlite-3.8.6-2.fc20.x86_64 svrcore-4.0.4-10.fc20.x86_64 xz-libs-5.1.2-12alpha.fc20.x86_64 zlib-1.2.8-3.fc20.x86_64
(gdb) bt
#0  0x00007f9bc6bb6887 in raise () from /lib64/libc.so.6
#1  0x00007f9bc6bb7f78 in abort () from /lib64/libc.so.6
#2  0x00007f9bc6bf6ad4 in __libc_message () from /lib64/libc.so.6
#3  0x00007f9bc6bfddf8 in _int_free () from /lib64/libc.so.6
#4  0x00007f9bc9168eb6 in slapi_ch_free (ptr=ptr@entry=0x7f9bb27f2ac0) at /root/389TEST/workspaces/389-ds-base/ds/ldap/servers/slapd/ch_malloc.c:363
#5  0x00007f9bbf305c3c in ldbm_back_monitor_instance_search (pb=<optimized out>, e=0x7f9b9400fe70, entryAfter=<optimized out>, returncode=0x7f9bb27f4c84, returntext=<optimized out>, arg=0x1272880) at /root/389TEST/workspaces/389-ds-base/ds/ldap/servers/slapd/back-ldbm/monitor.c:260
#6  0x00007f9bc917329b in dse_call_callback (pb=pb@entry=0x7f9bb27fbae0, operation=operation@entry=4, flags=1, entryBefore=entryBefore@entry=0x7f9b9400fe70, entryAfter=entryAfter@entry=0x0, returncode=returncode@entry=0x7f9bb27f4c84, returntext=returntext@entry=0x7f9bb27f4f00 "", pdse=<optimized out>) at /root/389TEST/workspaces/389-ds-base/ds/ldap/servers/slapd/dse.c:2663
#7  0x00007f9bc9174c7d in do_dse_search (attrsonly=<optimized out>, attrs=<optimized out>, filter=<optimized out>, basedn=<optimized out>, scope=<optimized out>, pb=0x7f9bb27fbae0, pdse=0xeeb5b0) at /root/389TEST/workspaces/389-ds-base/ds/ldap/servers/slapd/dse.c:1675
#8  dse_search (pb=0x7f9bb27fbae0) at /root/389TEST/workspaces/389-ds-base/ds/ldap/servers/slapd/dse.c:1789
#9  0x00007f9bc91aafa9 in op_shared_search (pb=pb@entry=0x7f9bb27fbae0, send_result=send_result@entry=1) at /root/389TEST/workspaces/389-ds-base/ds/ldap/servers/slapd/opshared.c:823
#10 0x00000000004283cc in do_search (pb=pb@entry=0x7f9bb27fbae0) at /root/389TEST/workspaces/389-ds-base/ds/ldap/servers/slapd/search.c:378
#11 0x000000000041860e in connection_dispatch_operation (pb=0x7f9bb27fbae0, op=0x15aebe0, conn=0x7f9bc9578560) at /root/389TEST/workspaces/389-ds-base/ds/ldap/servers/slapd/connection.c:684
#12 connection_threadmain () at /root/389TEST/workspaces/389-ds-base/ds/ldap/servers/slapd/connection.c:2534
#13 0x00007f9bc75a6e3b in _pt_root () from /lib64/libnspr4.so
#14 0x00007f9bc6f46f35 in start_thread () from /lib64/libpthread.so.0
#15 0x00007f9bc6c75c3d in clone () from /lib64/libc.so.6
(gdb) f 5
#5  0x00007f9bbf305c3c in ldbm_back_monitor_instance_search (pb=<optimized out>, e=0x7f9b9400fe70, entryAfter=<optimized out>, returncode=0x7f9bb27f4c84, returntext=<optimized out>, arg=0x1272880) at /root/389TEST/workspaces/389-ds-base/ds/ldap/servers/slapd/back-ldbm/monitor.c:260
260         slapi_ch_free((void **)&mpfstat);
}}}
This looks very close to the customer crash.
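Frame 5 hands the address of `mpfstat` to `slapi_ch_free`, so the helper can release the buffer and reset the caller's variable. A minimal sketch of that pointer-to-pointer idiom follows; `demo_ch_free` and `demo_free_twice` are hypothetical names for illustration and are not the server's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper modelled on the pointer-to-pointer free idiom. */
static void demo_ch_free(void **ptr)
{
    if (ptr == NULL || *ptr == NULL)
        return;      /* nothing to release */
    free(*ptr);
    *ptr = NULL;     /* a repeated call through the same variable is now a no-op */
}

/* Returns 1 if the variable is NULL after two frees through it, 0 on allocation failure. */
static int demo_free_twice(void)
{
    void *buf = malloc(16);
    if (buf == NULL)
        return 0;
    demo_ch_free(&buf);
    demo_ch_free(&buf); /* safe: buf was reset by the first call */
    return buf == NULL;
}
```

This only guards against a second free through the same variable. It cannot help when the heap is damaged elsewhere, for example by another thread, which would be consistent with the libdb race described in this ticket.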

With the attached test script I got a crash in 5 out of 10 runs.
I produced 10 crashes with 7 different stack traces, all in malloc and all related to ldbm_back_monitor_instance_search. Seeing different crash locations is quite usual for heap corruption.

{{{
#define DB_OPEN(priv, oflags, db, txnid, file, database, type, flags, mode, rval) \
...
    if ((priv)) slapi_rwlock_rdlock((priv)->dblayer_env_lock); \
    (rval) = ((db)->open)((db), (txnid), (file), (database), (type), (flags)|DB_AUTO_COMMIT, (mode)); \
    if ((priv)) slapi_rwlock_unlock((priv)->dblayer_env_lock); \
}}}
Should this be "env" instead of "priv"?

Well, it is of type 'struct dblayer_private_env *', so I called it priv, but maybe it could be penv.

Replying to [comment:9 lkrispen]:

> well it is of type 'struct dblayer_private_env *', so I called it priv, but maybe it could be penv

Ok. I was just confused because everywhere DB_OPEN is used, the first argument is pENV or mypENV, and here the argument is env instead of priv:
{{{
#define DB_OPEN(env, oflags, db, txnid, file, database, type, flags, mode, rval) \
}}}

If it should be priv for the first definition of DB_OPEN, that's fine.

This issue is taken care of in 1.2.11.
This is not a problem in 1.3.3 and newer since libdb has the fix.

Closing this ticket. Thanks, Ludwig!

Metadata Update from @lkrispen:
- Issue assigned to lkrispen
- Issue set to the milestone: 1.2.11.33

4 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/1480

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

8 months ago
