Test case:
freeipa (master branch): freeipa-server-4.3.90
DS version: master branch
Install ipa server:
    ipa-server-install -a Secret123 -p Secret123 --domain <domain> --setup-dns --auto-forwarders --auto-reverse -U --realm <realm>
Prepare env: http://www.freeipa.org/page/Testing
Run the serverroles test:
    ./make-test ipatests/test_ipaserver/test_serverroles.py
The crash is not systematic, so you may need to run the test several times.
The crash stack is:
#0  ipa_topo_cfg_host_find (tconf=tconf@entry=0x7f1228001430, findhost=findhost@entry=0x7f11f0018320 "configured-ca.<fqdn_domain>", lock=lock@entry=0) at topology_cfg.c:455
#1  0x00007f1246e4aacc in ipa_topo_cfg_host_add (replica=replica@entry=0x7f1228001430, newhost=newhost@entry=0x7f11f0018320 "configured-ca.<fqdn_domain>") at topology_cfg.c:489
#2  0x00007f1246e4fcc6 in ipa_topo_util_add_managed_host (suffix=<optimized out>, addhost=0x7f11f0018320 "configured-ca.<fqdn_domain>") at topology_util.c:1479
#3  0x00007f1246e4fd55 in ipa_topo_util_add_host (hostentry=<optimized out>) at topology_util.c:1493
#4  0x00007f1246e4b712 in ipa_topo_post_add (pb=<optimized out>) at topology_post.c:106
#5  0x00007f12537f29c9 in plugin_call_func (list=0x5603c73b1cf0, operation=operation@entry=507, pb=pb@entry=0x7f12267fba30, call_one=call_one@entry=0) at ldap/servers/slapd/plugin.c:2034
#6  0x00007f12537f2c34 in plugin_call_list (pb=0x7f12267fba30, operation=507, list=<optimized out>) at ldap/servers/slapd/plugin.c:1978
#7  plugin_call_plugins (pb=pb@entry=0x7f12267fba30, whichfunction=whichfunction@entry=507) at ldap/servers/slapd/plugin.c:445
#8  0x00007f1253795fd9 in op_shared_add (pb=pb@entry=0x7f12267fba30) at ldap/servers/slapd/add.c:769
#9  0x00007f1253797288 in do_add (pb=pb@entry=0x7f12267fba30) at ldap/servers/slapd/add.c:226
#10 0x00005603c6960180 in connection_dispatch_operation (pb=0x7f12267fba30, op=0x5603c79aab50, conn=0x7f1253b30d90) at ldap/servers/slapd/connection.c:612
#11 connection_threadmain () at ldap/servers/slapd/connection.c:1759
#12 0x00007f12519bd7df in _pt_root (arg=0x5603c79cb320) at ../../../nspr/pr/src/pthreads/ptthread.c:216
#13 0x00007f125135c5ca in start_thread (arg=0x7f12267fc700) at pthread_create.c:333
#14 0x00007f1251095ead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
The crash is similar to #5928, but the fix for #5928 does not prevent the heap corruption.
I failed to get a core dump: the soft limit for 'core file size' was apparently 0, although 'ulimit -c unlimited' was correctly set in /etc/sysconfig/dirsrv, and I was unable to force the value with prlimit. So the only way was to run the test with a debugger attached :-(
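For reference, the effective limits of a process can be checked from inside the process itself (or, on Linux, by reading /proc/<pid>/limits). Below is a minimal sketch, not taken from DS, of how a process can read its RLIMIT_CORE values and raise the soft limit up to the hard limit. The function names core_limit_get, core_limit_raise_soft and core_limit_print are made up for this illustration.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: read the core-file size limits of the calling process.
 * Returns 0 on success, -1 on failure (errno is set by getrlimit). */
int core_limit_get(struct rlimit *out)
{
    return getrlimit(RLIMIT_CORE, out);
}

/* Hypothetical helper: raise the soft core limit to the current hard limit.
 * An unprivileged process is allowed to do this; raising the hard limit
 * itself would need extra privileges. Returns 0 on success, -1 on failure. */
int core_limit_raise_soft(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        return -1;
    }
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}

/* Hypothetical helper: print one limit value, showing RLIM_INFINITY
 * as "unlimited" the way "ulimit -c" does. */
static void core_limit_print_value(const char *label, rlim_t value)
{
    if (value == RLIM_INFINITY) {
        printf("%s: unlimited\n", label);
    } else {
        printf("%s: %llu bytes\n", label, (unsigned long long)value);
    }
}

/* Print both the soft and the hard core-file size limit. */
void core_limit_print(const struct rlimit *rl)
{
    core_limit_print_value("core soft limit", rl->rlim_cur);
    core_limit_print_value("core hard limit", rl->rlim_max);
}
```

If the hard limit is also 0, raising the soft limit this way cannot help, and the limit has to be changed by whatever starts the process.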
Crash analysis. It crashed when adding this entry:
dn: cn=configured-ca.<fqdn_domain>,cn=masters,cn=ipa,cn=etc,dc=<domain_suffix>
objectclass: top
objectclass: nsContainer
objectclass: ipaReplTopoManagedServer
objectclass: ipaSupportedDomainLevelConfig
objectclass: ipaConfigObject
ipaMaxDomainLevel: 1
ipaMinDomainLevel: 0
ipaReplTopoManagedSuffix: dc=<domain_suffix>
ipaReplTopoManagedSuffix: o=ipaca
cn: configured-ca.<fqdn_domain>
creatorsName: uid=admin,cn=users,cn=accounts,dc=<domain_suffix>
modifiersName: uid=admin,cn=users,cn=accounts,dc=<domain_suffix>
createTimestamp: 20160619174814Z
modifyTimestamp: 20160619174814Z
nsUniqueId: e096c9df-364511e6-b2eeddef-9f8b2671

frame 4: entry_type=TOPO_HOST_ENTRY

frame 3:
print addhost
$40 = 0x7f11f0018320 "configured-ca.<fqdn_domain>"

frame 2:
(gdb) print *conf
$47 = {next = 0x0,
  repl_lock = 0x7f1228002fd0,
  shared_config_base = 0x7f1228001330 "cn=domain,cn=topology,cn=ipa,cn=etc,dc=<domain_suffix>",
  repl_root = 0x7f1228001860 "dc=<domain_suffix>",
  strip_attrs = 0x7f1228003080 "modifiersName modifyTimestamp internalModifiersName internalModifyTimestamp",
  total_attrs = 0x7f12280017b0 "(objectclass=*) $ EXCLUDE entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount",
  repl_attrs = 0x7f12280013a0 "(objectclass=*) $ EXCLUDE memberof idnssoaserial entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount",
  repl_segments = 0x0,
  hosts = 0x7f11c4007c80}

frame 0:
(gdb) print *tconf
$55 = {next = 0x0,
  repl_lock = 0x7f1228002fd0,
  shared_config_base = 0x7f1228001330 "cn=domain,cn=topology,cn=ipa,cn=etc,dc=<domain_suffix>",
  repl_root = 0x7f1228001860 "dc=<domain_suffix>",
  strip_attrs = 0x7f1228003080 "modifiersName modifyTimestamp internalModifiersName internalModifyTimestamp",
  total_attrs = 0x7f12280017b0 "(objectclass=*) $ EXCLUDE entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount",
  repl_attrs = 0x7f12280013a0 "(objectclass=*) $ EXCLUDE memberof idnssoaserial entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount",
  repl_segments = 0x0,
  hosts = 0x7f11c4007c80}
(gdb) print *tconf->hosts
$56 = {next = 0x7f11c400a3a0, hostname = 0x0}
(gdb) print *tconf->hosts->next
$58 = {next = 0xe, hostname = 0x7f11c403cf60 "nsuniqueid=e096c93e-364511e6-b2eeddef-9f8b2671,cn=ca-dns-dnssec-keymaster.<fqdn_domain>,cn=masters,cn=ipa,cn=etc,<suffix>"...}
(gdb) print host
$57 = (TopoReplicaHost *) 0xe
So there is the same symptom as https://fedorahosted.org/freeipa/ticket/5928 (null hostname in the config).
Following the host with the NULL hostname, there is a host ("nsuniqueid=e096c93e-36451...") that contains a corrupted next=0xe. This looks like a heap corruption.
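To illustrate why a guard against a NULL hostname would not be enough here, the sketch below walks a singly linked host list and skips entries whose hostname is NULL. It is not the plugin code: TopoHostSketch and host_find_sketch are hypothetical names that only mirror the two fields (next, hostname) visible in the gdb output, and the case-insensitive comparison is an assumption.

```c
#include <stddef.h>
#include <strings.h>

/* Hypothetical mirror of the two fields seen in the gdb output above. */
typedef struct topo_host_sketch {
    struct topo_host_sketch *next;
    const char *hostname;
} TopoHostSketch;

/* Return the first node whose hostname matches findhost, or NULL.
 * Nodes with a NULL hostname are skipped and counted in *skipped
 * (if skipped is not NULL) instead of being passed to the comparison. */
TopoHostSketch *host_find_sketch(TopoHostSketch *head, const char *findhost,
                                 size_t *skipped)
{
    TopoHostSketch *host;

    if (skipped != NULL) {
        *skipped = 0;
    }
    if (findhost == NULL) {
        return NULL;
    }
    for (host = head; host != NULL; host = host->next) {
        if (host->hostname == NULL) {
            /* Same symptom as the first node in the dump: nothing to compare. */
            if (skipped != NULL) {
                (*skipped)++;
            }
            continue;
        }
        /* Assumption: host names are compared case-insensitively. */
        if (strcasecmp(host->hostname, findhost) == 0) {
            return host;
        }
    }
    return NULL;
}
```

The walk only tests next against NULL. A corrupted non-NULL value such as 0xe passes that test and is dereferenced on the next iteration, so a check of this kind cannot protect against the corruption itself; the writer of the bad pointer still has to be found.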
Next step is to run the test under valgrind
Still failing to run DS under valgrind when it is started by systemd...
Plugin logs during the failure show that ipa_topo_post_add is called without a preceding ipa_topo_pre_add:
[21/Jun/2016:15:17:01.474069458 +0200] ipa-topology-plugin - --> ipa_topo_post_add
[21/Jun/2016:15:17:01.475027785 +0200] ipa-topology-plugin - <-- ipa_topo_post_add
[21/Jun/2016:15:17:01.505713546 +0200] ipa-topology-plugin - --> ipa_topo_post_mod
[21/Jun/2016:15:17:01.506944947 +0200] ipa-topology-plugin - <-- ipa_topo_post_mod
[21/Jun/2016:15:17:01.537601371 +0200] ipa-topology-plugin - --> ipa_topo_pre_add
[21/Jun/2016:15:17:01.538704073 +0200] ipa-topology-plugin - <-- ipa_topo_pre_add
[21/Jun/2016:15:17:01.546102621 +0200] ipa-topology-plugin - --> ipa_topo_pre_add
[21/Jun/2016:15:17:01.547306114 +0200] ipa-topology-plugin - <-- ipa_topo_pre_add
[21/Jun/2016:15:17:01.564007633 +0200] ipa-topology-plugin - --> ipa_topo_post_add
[21/Jun/2016:15:17:01.565446674 +0200] ipa-topology-plugin - <-- ipa_topo_post_add
[21/Jun/2016:15:17:01.597961742 +0200] ipa-topology-plugin - --> ipa_topo_post_add
[21/Jun/2016:15:17:01.604449640 +0200] ipa-topology-plugin - ipa_topo_util_update_segments_for_host: no agrements found
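The pairing can be checked mechanically on a larger log. Below is a rough sketch, assuming the "--> ipa_topo_pre_add" and "--> ipa_topo_post_add" markers shown in the log above and assuming that a simple counter is good enough to pair them. The function name count_unpaired_post_add is made up for this illustration. Operations handled by different threads can interleave in the log, so a non-zero result is only a hint, not a proof.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count "--> ipa_topo_post_add" entries that are not
 * preceded by a still-unmatched "--> ipa_topo_pre_add" entry.
 * lines is an array of nlines NUL-terminated log lines. */
size_t count_unpaired_post_add(const char *const *lines, size_t nlines)
{
    size_t pending_pre = 0; /* pre_add entries not yet matched by a post_add */
    size_t unpaired = 0;    /* post_add entries with no pending pre_add */
    size_t i;

    for (i = 0; i < nlines; i++) {
        if (strstr(lines[i], "--> ipa_topo_pre_add") != NULL) {
            pending_pre++;
        } else if (strstr(lines[i], "--> ipa_topo_post_add") != NULL) {
            if (pending_pre > 0) {
                pending_pre--;
            } else {
                unpaired++;
            }
        }
        /* Exit markers ("<--") and other callbacks are ignored. */
    }
    return unpaired;
}
```

Pairing by counter is the simplest choice; a stricter check would need the connection and operation numbers, which the plugin log lines above do not carry.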
The last part of the log (in ipa_topo_post_add):
[21/Jun/2016:15:17:01.597961742 +0200] ipa-topology-plugin - --> ipa_topo_post_add
[21/Jun/2016:15:17:01.598977772 +0200] NSUniqueAttr - ADD begin
[21/Jun/2016:15:17:01.599995353 +0200] NSUniqueAttr - ADD target=cn=KDC,cn=configured-ca.dom-141.abc.idm.lab.eng.brq.redhat.com,cn=masters,cn=ipa,cn=etc,dc=dom-141,dc=abc,dc=idm,dc=lab,dc=eng,dc=brq,dc=redhat,dc=com
[21/Jun/2016:15:17:01.600991405 +0200] NSUniqueAttr - ADD begin
[21/Jun/2016:15:17:01.602244244 +0200] NSUniqueAttr - ADD target=cn=KDC,cn=configured-ca.dom-141.abc.idm.lab.eng.brq.redhat.com,cn=masters,cn=ipa,cn=etc,dc=dom-141,dc=abc,dc=idm,dc=lab,dc=eng,dc=brq,dc=redhat,dc=com
[21/Jun/2016:15:17:01.603310273 +0200] ipa-range-check - Not an ID range object, nothing to do.
[21/Jun/2016:15:17:01.604449640 +0200] ipa-topology-plugin - ipa_topo_util_update_segments_for_host: no agrements found
[21/Jun/2016:15:17:01.605520148 +0200] NSUniqueAttr - ADD begin
It is not possible to correlate this with the access log because the access log is not flushed when the server crashes.
attachment 0001-Topology-plugins-sigsev-heap-corruption-when-adding-.patch
master:
Metadata Update from @tbordaz: - Issue assigned to tbordaz - Issue set to the milestone: FreeIPA 4.4