...which causes krb5kdc to crash as it hits an assert.
#0 0x00007f38f96611d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007f38f96628c8 in __GI_abort () at abort.c:90 #2 0x00007f38f965a146 in __assert_fail_base (fmt=0x7f38f97ab3a8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f38f26506d3 "ld != ((void *)0)", file=file@entry=0x7f38f264acb6 "search.c", line=line@entry=95, function=function@entry=0x7f38f264ad90 <__PRETTY_FUNCTION__.8931> "ldap_pvt_search") at assert.c:92 #3 0x00007f38f965a1f2 in __GI___assert_fail (assertion=assertion@entry=0x7f38f26506d3 "ld != ((void *)0)", file=file@entry=0x7f38f264acb6 "search.c", line=line@entry=95, function=function@entry=0x7f38f264ad90 <__PRETTY_FUNCTION__.8931> "ldap_pvt_search") at assert.c:101 #4 0x00007f38f261d29c in ldap_pvt_search (ld=ld@entry=0x0, base=base@entry=0x7f38fc6ea1e0 "cn=ad,cn=trusts,dc=example,dc=com", scope=scope@entry=2, filter=filter@entry=0x7f38f2a819f0 "(objectclass=ipaNTTrustedDomain)", attrs=attrs@entry=0x7ffd28f71110, attrsonly=attrsonly@entry=0, sctrls=sctrls@entry=0x0, cctrls=cctrls@entry=0x0, timeout=timeout@entry=0x7f38f2c855d0 <std_timeout>, sizelimit=sizelimit@entry=0, deref=deref@entry=-1, msgidp=msgidp@entry=0x7ffd28f71034) at search.c:95 #5 0x00007f38f261d35a in ldap_pvt_search_s (ld=0x0, base=base@entry=0x7f38fc6ea1e0 "cn=ad,cn=trusts,dc=example,dc=com", scope=scope@entry=2, filter=filter@entry=0x7f38f2a819f0 "(objectclass=ipaNTTrustedDomain)", attrs=attrs@entry=0x7ffd28f71110, attrsonly=attrsonly@entry=0, sctrls=sctrls@entry=0x0, cctrls=cctrls@entry=0x0, timeout=timeout@entry=0x7f38f2c855d0 <std_timeout>, sizelimit=sizelimit@entry=0, deref=deref@entry=-1, res=res@entry=0x7ffd28f71120) at search.c:174 #6 0x00007f38f261d430 in ldap_search_ext_s (ld=<optimized out>, base=base@entry=0x7f38fc6ea1e0 "cn=ad,cn=trusts,dc=example,dc=com", scope=scope@entry=2, filter=filter@entry=0x7f38f2a819f0 "(objectclass=ipaNTTrustedDomain)", attrs=attrs@entry=0x7ffd28f71110, attrsonly=attrsonly@entry=0, sctrls=sctrls@entry=0x0, cctrls=cctrls@entry=0x0, timeout=timeout@entry=0x7f38f2c855d0 <std_timeout>, sizelimit=sizelimit@entry=0, res=res@entry=0x7ffd28f71120) at search.c:150 #7 0x00007f38f2a745d3 in ipadb_simple_search (ipactx=ipactx@entry=0x7f38fc7021b0, basedn=0x7f38fc6ea1e0 "cn=ad,cn=trusts,dc=example,dc=com", scope=scope@entry=2, filter=filter@entry=0x7f38f2a819f0 "(objectclass=ipaNTTrustedDomain)", attrs=attrs@entry=0x7ffd28f71110, res=res@entry=0x7ffd28f71120) at ipa_kdb_common.c:189 #8 0x00007f38f2a7b128 in ipadb_mspac_check_trusted_domains (ipactx=ipactx@entry=0x7f38fc7021b0) at ipa_kdb_mspac.c:2378 #9 0x00007f38f2a7b7a8 in ipadb_reinit_mspac (ipactx=ipactx@entry=0x7f38fc7021b0, force_reinit=force_reinit@entry=false) at ipa_kdb_mspac.c:2631 #10 0x00007f38f2a73e61 in ipadb_get_connection (ipactx=0x7f38fc7021b0) at ipa_kdb.c:460 #11 0x00007f38f2a74361 in ipadb_need_retry (ipactx=<optimized out>, error=<optimized out>) at ipa_kdb_common.c:149 #12 0x00007f38f2a7452e in ipadb_simple_search (ipactx=ipactx@entry=0x7f38fc7021b0, basedn=0x7f38fc704c00 "cn=ad,cn=trusts,dc=example,dc=com", scope=scope@entry=2, filter=filter@entry=0x7f38f2a819f0 "(objectclass=ipaNTTrustedDomain)", attrs=attrs@entry=0x7ffd28f71360, res=res@entry=0x7ffd28f71370) at ipa_kdb_common.c:187 #13 0x00007f38f2a7b128 in ipadb_mspac_check_trusted_domains (ipactx=ipactx@entry=0x7f38fc7021b0) at ipa_kdb_mspac.c:2378 #14 0x00007f38f2a7b7a8 in ipadb_reinit_mspac (ipactx=ipactx@entry=0x7f38fc7021b0, force_reinit=force_reinit@entry=false) at ipa_kdb_mspac.c:2631 #15 0x00007f38f2a73e61 in ipadb_get_connection (ipactx=0x7f38fc7021b0) at ipa_kdb.c:460 #16 0x00007f38f2a74361 in ipadb_need_retry (ipactx=<optimized out>, error=<optimized out>) at ipa_kdb_common.c:149 #17 0x00007f38f2a7452e in ipadb_simple_search (ipactx=ipactx@entry=0x7f38fc7021b0, basedn=0x7f38fc6e9fb0 "cn=ad,cn=trusts,dc=example,dc=com", scope=scope@entry=2, filter=filter@entry=0x7f38f2a819f0 "(objectclass=ipaNTTrustedDomain)", attrs=attrs@entry=0x7ffd28f715b0, res=res@entry=0x7ffd28f715c0) at ipa_kdb_common.c:187 #18 0x00007f38f2a7b128 in ipadb_mspac_check_trusted_domains (ipactx=ipactx@entry=0x7f38fc7021b0) at ipa_kdb_mspac.c:2378 #19 0x00007f38f2a7b7a8 in ipadb_reinit_mspac (ipactx=ipactx@entry=0x7f38fc7021b0, force_reinit=force_reinit@entry=false) at ipa_kdb_mspac.c:2631 #20 0x00007f38f2a73e61 in ipadb_get_connection (ipactx=0x7f38fc7021b0) at ipa_kdb.c:460 #21 0x00007f38f2a74361 in ipadb_need_retry (ipactx=<optimized out>, error=<optimized out>) at ipa_kdb_common.c:149 #22 0x00007f38f2a747db in ipadb_simple_modify (ipactx=ipactx@entry=0x7f38fc7021b0, dn=0x7f38fc6e8e60 "fqdn=ccs-10-22-1-26.hosts.example.com,cn=computers,cn=accounts,dc=example,dc=com", mods=0x7f38fc6ea4b0) at ipa_kdb_common.c:252 #23 0x00007f38f2a78a1d in ipadb_modify_principal (entry=0x7f38fc7044b0, kcontext=0x7f38fc701c70) at ipa_kdb_principals.c:2349 #24 ipadb_put_principal (kcontext=kcontext@entry=0x7f38fc701c70, entry=entry@entry=0x7f38fc7044b0, db_args=db_args@entry=0x0) at ipa_kdb_principals.c:2367 #25 0x00007f38f2a7e211 in ipadb_audit_as_req (kcontext=0x7f38fc701c70, request=<optimized out>, client=0x7f38fc7044b0, server=<optimized out>, authtime=1491042316, error_code=0) at ipa_kdb_audit_as.c:132 #26 0x00007f38fb1dd8d0 in krb5_db_audit_as_req (kcontext=kcontext@entry=0x7f38fc701c70, request=request@entry=0x7f38fc703700, client=client@entry=0x7f38fc7044b0, server=server@entry=0x7f38fc704680, authtime=authtime@entry=1491042316, error_code=error_code@entry=0) at kdb5.c:2536 #27 0x00007f38fb84a356 in log_as_req (context=0x7f38fc701c70, from=<optimized out>, request=0x7f38fc703700, reply=reply@entry=0x7f38fc702ae8, client=0x7f38fc7044b0, cname=<optimized out>, server=0x7f38fc704680, sname=0x7f38fc703860 "krbtgt/example.COM@example.COM", authtime=1491042316, status=status@entry=0x0, errcode=errcode@entry=0, emsg=emsg@entry=0x0) at kdc_log.c:92 #28 0x00007f38fb839c27 in finish_process_as_req (state=0x7f38fc702920, errcode=<optimized out>) at do_as_req.c:362 #29 0x00007f38fb844aa5 in enc_ts_verify (context=0x7f38fc701c70, req_pkt=<optimized out>, request=<optimized out>, enc_tkt_reply=0x7f38fc702958, pa=<optimized out>, cb=<optimized out>, rock=0x7f38fc702a90, moddata=0x0, respond=0x7f38fb842ef0 <finish_verify_padata>, arg=0x7f38fc6e9fb0) at kdc_preauth_encts.c:131 #30 0x00007f38fb842e73 in next_padata (state=<optimized out>) at kdc_preauth.c:1178 #31 0x00007f38fb83a7b5 in process_as_req (request=<optimized out>, req_pkt=req_pkt@entry=0x7f38fc703c68, from=from@entry=0x7f38fc702808, kdc_active_realm=0x7f38fc6e9e90, vctx=vctx@entry=0x7f38fc6e9b30, respond=respond@entry=0x7f38fb8389e0 <finish_dispatch_cache>, arg=arg@entry=0x7f38fc6ff360) at do_as_req.c:819 #32 0x00007f38fb838d02 in dispatch (cb=0x7f38fba552c0 <shandle>, local_saddr=<optimized out>, from=0x7f38fc702808, pkt=pkt@entry=0x7f38fc703c68, is_tcp=is_tcp@entry=1, vctx=vctx@entry=0x7f38fc6e9b30, respond=respond@entry=0x7f38fb84bf60 <process_tcp_response>, arg=arg@entry=0x7f38fc703be0) at dispatch.c:190 #33 0x00007f38fb84c240 in process_tcp_connection_read (ctx=0x7f38fc6e9b30, ev=0x7f38fc6e8f10) at net-server.c:1739 #34 0x00007f38f99efcd8 in verto_fire (ev=0x7f38fc6e8f10) at verto.c:947 #35 0x00007f38f1d68d8b in epoll_event_loop (tvalp=0x7ffd28f72050, epoll_ev=0x7f38fc709620) at ../tevent_epoll.c:728 #36 epoll_event_loop_once (ev=<optimized out>, location=<optimized out>) at ../tevent_epoll.c:926 #37 0x00007f38f1d67257 in std_event_loop_once (ev=0x7f38fc709a90, location=0x7f38ec6c7132 "verto-tevent.c:55") at ../tevent_standard.c:114 #38 0x00007f38f1d6340d in _tevent_loop_once (ev=0x7f38fc709a90, location=0x7f38ec6c7132 "verto-tevent.c:55") at ../tevent.c:533 #39 0x00007f38f99ef4af in verto_run (ctx=ctx@entry=0x7f38fc6e9b30) at verto.c:578 #40 0x00007f38fb837997 in main (argc=5, argv=0x7ffd28f722c8) at main.c:1064
This happens at random intervals, as far as I can tell. Also, I think it might be related to the fact that I turned on ipaNTHash generation on IPA user principals, as I don't otherwise have AD set up. Also, since I was asked about it on IRC: I have run ipa-adtrust-install
I think the problem comes from a complexe handling of ldap connection where ipadb_get_connection can called recursively
IMHO the problem comes from those lines:
# # up to here the ldap->lcontext is valid (not NULL) # /* get adtrust options using default refresh interval */ ret = ipadb_reinit_mspac(ipactx, false); if (ret && ret != ENOENT) { /* TODO: log that there is an issue with adtrust settings */ } # But here whatever is the result of ipadb_reinit_mspac we return success so that ldap->lcontext will be used # asserting it is not NULL ret = 0;
So if ipadb_reinit_mspac resets ipactx->lcontext it will not be detected and it will later assert. ipadb_reinit_mspac has the ability to reset ipactx->lcontext. For example in the scenario it calls ipadb_simple_search -> ipadb_need_retry -> ipadb_get_connection -> <any transient failure ldap_initialize or ldap_set_option or ldap_sasl_bind_s will reset ipactx->lcontext and return EIO> But the problem is that the returned EIO/failure is ignored in the lines above.
An additional minor issue is that ipadb_need_retry is called with a 'ret' instead of '&ret', so the return code is not propagated.
I think a fix like could prevent the assert
/* get adtrust options using default refresh interval */ ret = ipadb_reinit_mspac(ipactx, false); if (ret && ret != ENOENT) { /* TODO: log that there is an issue with adtrust settings */ goto done; }
<img alt="0001-Ticket-7017-NULL-LDAP-context-in-call-to-ldap_search.patch" src="/freeipa/issue/raw/5de0c3d32d114d3859ae36ccb563e9a41a30e439a80f77dc00145351fc2b0b8a-0001-Ticket-7017-NULL-LDAP-context-in-call-to-ldap_search.patch" />
a possible scenario is the following one
ipadb_simple_search # Here: ipactx->lcontext = cnx_1 # it searches under "cn=ad,cn=trusts,<suffix>" for "(objectclass=ipaNTTrustedDomain)" entries # Let assum that this search fails for any reason using cnx_1 --> ldap_search_ext_s <-- ldap_search_ext_s = ERROR # Here: ipactx->lcontext = cnx_1 -- > ipadb_need_retry --> ipadb_get_connection # Here: ipactx->lcontext = cnx_2 --> ipadb_reinit_mspac --> ipadb_mspac_check_trusted_domains --> ipadb_simple_search # Here: ipactx->lcontext = cnx_2 # Let assum that this search fails for any reason (second failure) using cnx_2 --> ldap_search_ext_s <-- ldap_search_ext_s = ERROR # Here: ipactx->lcontext = cnx_2 --> ipadb_need_retry --> ipadb_get_connection # Here ldap_initialize fails so # ipactx->lcontext = NULL <-- ipadb_get_connection = EIO + ipactx->lcontext = NULL <-- ipadb_need_retry = FALSE + ipactx->lcontext = NULL <-- ipadb_simple_search = KRB_ERROR + ipactx->lcontext = NULL <-- ipadb_mspac_check_trusted_domains= KRB_ERROR + ipactx->lcontext = NULL <-- ipadb_reinit_mspac = KRB_ERROR + ipactx->lcontext = NULL # # does not detect faillure of ipadb_reinit_mspac and returns 0 # <-- ipadb_get_connection = SUCCESS + ipactx->lcontext = NULL <-- ipadb_need_retry = TRUE + ipactx->lcontext = NULL # Here ipactx->lcontext = NULL and need_retry=TRUE # then KDC issue an internal search with a NULL ldapconnection --> ldap_search_ext_s .... will eventualy abort
The scenario means there are - two consecutive search errors (on different connections) - then a failure to open a new connection
After additional investigation I think the scenario is rather: - two consecutive search errors (on different connections) - then a failure to establish a new connection that can be one of the following 1- ldap_initialize failure 2- or failure to bind 3- or failure to retrieve config/rootdse 4- or failure to read the connection
Metadata Update from @pvoborni: - Issue set to the milestone: FreeIPA 4.5.3
Metadata Update from @pvoborni: - Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1464205
Issue linked to bug 1464205
@blaman can you tell us which distribution and freeipa versions do you run? Would you be able to test the patch @tbordaz produced to see if more changes are needed?
Metadata Update from @tkrizek: - Issue set to the milestone: FreeIPA 4.5.4 (was: FreeIPA 4.5.3)
master:
ipa-4-5:
Metadata Update from @pvomacka: - Issue close_status updated to: fixed - Issue status updated to: Closed (was: Open)
@abbra:
tbordaz's patch worked, haven't seen any more crashes since it was deployed on June 30.
I'm running CentOS 7.3.1611 and IPA 4.4.0 (ipa-server-4.4.0-14.el7.centos.7.x86_64) from repos.
Seeing it's been merged into the 4-5 branch, will the patch be in an (hypothetical, as-yet-unreleased) 4.5.4?
Thanks for the testing. It is good to hear that it was a good fix for your environment.
If it is already in the 4-5 branch, then next 4.5.x release will pick it up.
Log in to comment on this ticket.