#47759 Crash in replication when server is under write load
Closed: Fixed None Opened 6 years ago by mreynolds.

If a replica is under a lot of load, the server can crash in the replication connection code:

#0 0x00007f225b287989 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f225b289098 in __GI_abort () at abort.c:90
#2 0x00007f225b2808f6 in __assert_fail_base (fmt=0x7f225b3d03e8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f225ca773d3 "ld != ((void *)0)", file=file@entry=0x7f225ca71e76 "search.c", line=line@entry=95, function=function@entry=0x7f225ca71f50 <__PRETTY_FUNCTION__.8910> "ldap_pvt_search") at assert.c:92
#3 0x00007f225b2809a2 in __GI___assert_fail (assertion=assertion@entry=0x7f225ca773d3 "ld != ((void *)0)", file=file@entry=0x7f225ca71e76 "search.c", line=line@entry=95, function=function@entry=0x7f225ca71f50 <__PRETTY_FUNCTION__.8910> "ldap_pvt_search") at assert.c:101
#4 0x00007f225ca44b2c in ldap_pvt_search (ld=ld@entry=0x0, base=base@entry=0x7f225376240d "", scope=scope@entry=0, filter=filter@entry=0x7f2253759e21 "(objectclass=*)", attrs=attrs@entry=0x7f2177feead0, attrsonly=attrsonly@entry=0, sctrls=sctrls@entry=0x0, cctrls=cctrls@entry=0x0, timeout=timeout@entry=0x7f218c028b38, sizelimit=sizelimit@entry=0, deref=deref@entry=-1, msgidp=msgidp@entry=0x7f2177feea24) at search.c:95
#5 0x00007f225ca44bea in ldap_pvt_search_s (ld=0x0, base=base@entry=0x7f225376240d "", scope=scope@entry=0, filter=filter@entry=0x7f2253759e21 "(objectclass=*)", attrs=attrs@entry=0x7f2177feead0, attrsonly=attrsonly@entry=0, sctrls=sctrls@entry=0x0, cctrls=cctrls@entry=0x0, timeout=timeout@entry=0x7f218c028b38, sizelimit=sizelimit@entry=0, deref=deref@entry=-1, res=res@entry=0x7f2177feeac8) at search.c:174
#6 0x00007f225ca44cc0 in ldap_search_ext_s (ld=<optimized out>, base=base@entry=0x7f225376240d "", scope=scope@entry=0, filter=filter@entry=0x7f2253759e21 "(objectclass=*)", attrs=attrs@entry=0x7f2177feead0, attrsonly=attrsonly@entry=0, sctrls=sctrls@entry=0x0, cctrls=cctrls@entry=0x0, timeout=timeout@entry=0x7f218c028b38, sizelimit=sizelimit@entry=0, res=res@entry=0x7f2177feeac8) at search.c:150
#7 0x00007f2253721da5 in conn_replica_supports_ds5_repl (conn=conn@entry=0x7f218c028ab0) at ldap/servers/plugins/replication/repl5_connection.c:1237
#8 0x00007f225372a451 in acquire_replica (prp=prp@entry=0x7f218c028c10, prot_oid=<optimized out>, prot_oid@entry=0x7f225375b15a "2.16.840.1.113730.3.6.1", ruv=ruv@entry=0x7f2177feec88) at ldap/servers/plugins/replication/repl5_protocol_util.c:184
#9 0x00007f22537246c7 in repl5_inc_run (prp=<optimized out>) at ldap/servers/plugins/replication/repl5_inc_protocol.c:794
#10 0x00007f2253729a9c in prot_thread_main (arg=0x7f218c0289a0) at ldap/servers/plugins/replication/repl5_protocol.c:296

The crash is occurring because the connection LDAP structure (conn->ld) was disconnected. We need locking around these code areas to prevent this race condition.
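
To illustrate the kind of locking described above, here is a minimal sketch in C. It is not the patch that was committed for this ticket. The names demo_conn, fake_ld, demo_conn_init, demo_conn_disconnect and demo_conn_search are hypothetical, and a pthread mutex stands in for whatever lock the replication plugin actually uses. The idea is that the thread that disconnects and the thread that searches both take the same lock, and the searching thread re-checks the handle for NULL after acquiring it, so a NULL ld is never passed down into the LDAP library.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the LDAP handle stored in conn->ld. */
typedef struct fake_ld {
    int searches; /* number of operations performed on this handle */
} fake_ld;

/* Hypothetical, simplified connection object (not the real structure). */
typedef struct demo_conn {
    pthread_mutex_t lock; /* guards ld */
    fake_ld *ld;          /* NULL once the connection is disconnected */
} demo_conn;

/* Initialize the connection with an already-open handle. */
void demo_conn_init(demo_conn *conn, fake_ld *ld)
{
    assert(conn != NULL);
    pthread_mutex_init(&conn->lock, NULL);
    conn->ld = ld;
}

/* Disconnect: clear the handle while holding the lock, so no other
 * thread can be in the middle of using it. */
void demo_conn_disconnect(demo_conn *conn)
{
    assert(conn != NULL);
    pthread_mutex_lock(&conn->lock);
    conn->ld = NULL;
    pthread_mutex_unlock(&conn->lock);
}

/* Search: take the lock, then check the handle before using it.
 * Returns 0 on success, -1 if the connection was already disconnected. */
int demo_conn_search(demo_conn *conn)
{
    int rc = -1;

    assert(conn != NULL);
    pthread_mutex_lock(&conn->lock);
    if (conn->ld != NULL) {
        /* Real code would call ldap_search_ext_s(conn->ld, ...) here. */
        conn->ld->searches++;
        rc = 0;
    }
    pthread_mutex_unlock(&conn->lock);
    return rc;
}
```

With this pattern, a search attempted after a disconnect returns an error instead of tripping the "ld != ((void *)0)" assertion. One trade-off in this sketch is that the lock is held for the duration of the operation, which may not be acceptable for a blocking network call; the actual fix may structure this differently.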

git merge ticket47759
Updating 67ba61b..9940ca2
Fast-forward
.../servers/plugins/replication/repl5_connection.c | 89 +++++++++++---------
ldap/servers/slapd/ldaputil.c | 39 ++++-----
2 files changed, 69 insertions(+), 59 deletions(-)

git push origin master
67ba61b..9940ca2 master -> master

commit 9940ca2
Author: Mark Reynolds <mreynolds@redhat.com>
Date: Mon Mar 31 15:17:59 2014 -0400

Pushed to 389-ds-base-1.3.2 branch:
7a50bc6..0e576c8 389-ds-base-1.3.2 -> 389-ds-base-1.3.2
commit 0e576c8

Pushed to 389-ds-base-1.3.1 branch:
99609ce..2a80b71 389-ds-base-1.3.1 -> 389-ds-base-1.3.1
commit 2a80b7152823ca16628c2da48614166b8d2104a4

Thanks to Mark for the fix! Since it passed the stress test, I'm closing this ticket as fixed.

Metadata Update from @nhosoi:
- Issue assigned to mreynolds
- Issue set to the milestone: 1.3.1.23

3 years ago
