Process 3094 (krb5kdc) of user 0 dumped core.
Stack trace of thread 3094: #0 0x00007f3ac842b7ff raise (libc.so.6) #1 0x00007f3ac8415c35 abort (libc.so.6) #2 0x00007f3ac8415b09 __assert_fail_base.cold.0 (libc.so.6) #3 0x00007f3ac8423de6 __assert_fail (libc.so.6) #4 0x00007f3ab9befb52 ldap_get_values_len (libldap_r-2.4.so.2) #5 0x00007f3abab73273 ipadb_parse_ldap_entry (ipadb.so) #6 0x00007f3abab7418d dbget_princ (ipadb.so) #7 0x00007f3abab742a3 ipadb_get_principal (ipadb.so) #8 0x00007f3aca480ef8 krb5_db_get_principal (libkdb5.so.10) #9 0x000055abf52fc8c8 process_as_req (krb5kdc) #10 0x000055abf52fb5c8 dispatch (krb5kdc) #11 0x000055abf5311125 process_packet (krb5kdc) #12 0x00007f3ac87ba5ec verto_fire (libverto.so.1) #13 0x00007f3ab2006ff1 event_process_active_single_queue (libevent-2.1.so.6) #14 0x00007f3ab2007787 event_base_loop (libevent-2.1.so.6) #15 0x000055abf52fa2f9 main (krb5kdc) #16 0x00007f3ac84177b3 __libc_start_main (libc.so.6) #17 0x000055abf52fa68e _start (krb5kdc)
I don't know, this happened on one of our servers, without doing something special.
Distribution is Red Hat Enterprise Linux release 8.3 (Ootpa).
$ rpm -q freeipa-server freeipa-client ipa-server ipa-client 389-ds-base pki-ca krb5-server package freeipa-server is not installed package freeipa-client is not installed ipa-server-4.8.7-13.module+el8.3.0+8376+0bba7131.x86_64 ipa-client-4.8.7-13.module+el8.3.0+8376+0bba7131.x86_64 389-ds-base-1.4.3.8-6.module+el8.3.0+8995+c08169ba.x86_64 pki-ca-10.9.4-1.module+el8.3.0+8058+d5cd4219.noarch krb5-server-1.18.2-5.el8.x86_64
Given the backtrace below and the code, I don't get how we can end up with lcontext != NULL and ipactx->lcontext == NULL in frame 7.
lcontext != NULL
ipactx->lcontext == NULL
#0 0x00007f3ac842b7ff in raise () from /lib64/libc.so.6 No symbol table info available. #1 0x00007f3ac8415c35 in abort () from /lib64/libc.so.6 No symbol table info available. #2 0x00007f3ac8415b09 in __assert_fail_base.cold.0 () from /lib64/libc.so.6 No symbol table info available. #3 0x00007f3ac8423de6 in __assert_fail () from /lib64/libc.so.6 No symbol table info available. #4 0x00007f3ab9befb52 in ldap_get_values_len (ld=<optimized out>, entry=entry@entry=0x55abf8bad780, target=target@entry=0x7f3abab7e525 "ipatokenRadiusConfigLink") at getvalues.c:97 ber = {ber_opts = {lbo_valid = 5, lbo_options = 0, lbo_debug = 0}, ber_tag = 0, ber_len = 140729034795568, ber_usertag = 5066068250598200064, ber_buf = 0x55abf5911e10 "\247~\002\f", ber_ptr = 0x7ffe0820ea68 "", ber_end = 0x2 <error: Cannot access memory at address 0x2>, ber_sos_ptr = 0x7f3abab6f991 <ipadb_need_retry+49> "\205\300\017\224\300H\203\304\b\303\017\037D", ber_rwptr = 0x7ffe0820ea68 "", ber_memctx = 0x7f3abab6fb7e <ipadb_simple_search+142>} attr = 0x3bad83140 <error: Cannot access memory at address 0x3bad83140> vals = 0x0 __PRETTY_FUNCTION__ = "ldap_get_values_len" #5 0x00007f3abab73273 in ipadb_validate_radius (ipactx=0x55abf5911e10, ua=0x7ffe0820ea64, lentry=0x55abf8bad780) at ipa_kdb_principals.c:315 vals = <optimized out> vals = <optimized out> #6 ipadb_get_user_auth (lentry=0x55abf8bad780, ipactx=0x55abf5911e10) at ipa_kdb_principals.c:352 gua = <optimized out> ua = (IPADB_USER_AUTH_PASSWORD | IPADB_USER_AUTH_RADIUS | IPADB_USER_AUTH_OTP) gcfg = <optimized out> gua = <optimized out> ua = <optimized out> gcfg = <optimized out> #7 ipadb_parse_ldap_entry (kcontext=kcontext@entry=0x55abf58f74f0, principal=0x55abf8bb0920 "<redacted>", lentry=0x55abf8bad780, kentry=kentry@entry=0x55abf8bbf728, polmask=polmask@entry=0x7ffe0820ec0c) at ipa_kdb_principals.c:628 rad_string = "otp\000[{\"indicators\": [\"radius\"]}]" otp_string = "otp\000[{\"indicators\": [\"otp\"]}]" ipactx = 0x55abf5911e10 ua = <optimized out> lcontext = 0x55abf8ba75a0 entry = 0x55abf8b3d710 ied = <optimized out> kerr = <optimized out> res_tl_data = <optimized out> res_key_data = 0x0 mkvno = 0 restrlist = 0x55abf596f3f0 restring = 0x600a3cbf <error: Cannot access memory at address 0x600a3cbf> authz_data_list = 0x2f0000001b restime = 32766 resbool = 8 result = 136374896 ret = <optimized out> #8 0x00007f3abab7418d in dbget_princ (kcontext=kcontext@entry=0x55abf58f74f0, ipactx=ipactx@entry=0x55abf5911e10, search_for=search_for@entry=0x55abf8bbf0e0, flags=flags@entry=96, entry=entry@entry=0x55abf8bbf728) at ipa_kdb_principals.c:1336 kerr = 0 principal = 0x55abf8bb0920 "<redacted>" res = 0x55abf8bad780 lentry = 0x55abf8bad780 pol = 0 --Type <RET> for more, q to quit, c to continue without paging-- #9 0x00007f3abab742a3 in ipadb_get_principal (kcontext=0x55abf58f74f0, search_for=0x55abf8bbf0e0, flags=96, entry=0x55abf8bbf728) at ipa_kdb_principals.c:1522 ipactx = 0x55abf5911e10 kerr = <optimized out> #10 0x00007f3aca480ef8 in krb5_db_get_principal () from /lib64/libkdb5.so.10 No symbol table info available. #11 0x000055abf52fc8c8 in lookup_client (req=0x55abf8bb2990, req=0x55abf8bb2990, entry_out=0x55abf8bbf728, flags=96, context=0x55abf58f74f0) at do_as_req.c:148 pa = <optimized out> cert = {magic = 64, length = 32, data = 0x7f3a0000000a <error: Cannot access memory at address 0x7f3a0000000a>} pa = <optimized out> cert = <optimized out> #12 process_as_req (request=0x55abf8bb2990, req_pkt=<optimized out>, local_addr=0x55abf8bd08b8, remote_addr=<optimized out>, kdc_active_realm=0x55abf59120d0, vctx=0x55abf58ab430, respond=0x55abf52fb2b0 <finish_dispatch_cache>, arg=0x55abf8b6f910) at do_as_req.c:606 errcode = <optimized out> s_flags = 0 encoded_req_body = {magic = 205, length = 173, data = 0x55abf8bd0a0b "0\201\252\240\a\003\005"} useenctype = <optimized out> state = 0x55abf8bbf5d0 au_state = 0x55abf8b97e10 #13 0x000055abf52fb5c8 in dispatch (cb=0x55abf5519140 <shandle>, local_addr=local_addr@entry=0x55abf8bd08b8, remote_addr=remote_addr@entry=0x55abf8bd0890, pkt=pkt@entry=0x55abf8bd09d8, is_tcp=is_tcp@entry=0, vctx=vctx@entry=0x55abf58ab430, respond=0x55abf530e9a0 <process_packet_response>, arg=0x55abf8bd0860) at dispatch.c:201 retval = 0 req = 0x55abf8bb2990 response = 0x0 state = 0x55abf8b6f910 handle = <optimized out> kdc_err_context = 0x55abf58f74f0 #14 0x000055abf5311125 in process_packet (ctx=0x55abf58ab430, ev=<optimized out>) at net-server.c:1096 cc = <optimized out> conn = <optimized out> state = 0x55abf8bd0860 __PRETTY_FUNCTION__ = "process_packet" #15 0x00007f3ac87ba5ec in verto_fire () from /lib64/libverto.so.1 No symbol table info available. #16 0x00007f3ab2006ff1 in event_process_active_single_queue () from /lib64/libevent-2.1.so.6 No symbol table info available. #17 0x00007f3ab2007787 in event_base_loop () from /lib64/libevent-2.1.so.6 No symbol table info available. #18 0x000055abf52fa2f9 in main (argc=5, argv=0x7ffe0820f258) at main.c:1064 retval = 0 kcontext = 0x55abf5891610 realm = <optimized out> ctx = 0x55abf58ab430 tcp_listen_backlog = 5 i = <optimized out>
Thanks, this is a known issue (https://bugzilla.redhat.com/show_bug.cgi?id=1835741, most comments are private due to customer details). We have not so far been able to figure out why lcontext suddenly becomes NULL.
lcontext
Thanks for the bugzilla link (I didn't came across it when googling for my bug) and sorry for the noise.
Any ideas on where this might come from are welcomed. ;) It is really a puzzling crash.
I don't have any idea yet (this issue happened only once on one of our 3 servers). Also, I don't get how we can end up in this situation when I read the code. Are there multiple threads acting on the ipactx variable?
I don't think there are multiple threads at all, KDC is single threaded multi-process. @rharwood is that correct?
OK, then I think I'm misunderstanding something. I interpreted from GDB output that lcontext != NULL and ipactx->lcontext == NULL in frame 7. So, my understanding of the code is that we get from https://pagure.io/freeipa/blob/ipa-4-8/f/daemons/ipa-kdb/ipa_kdb_principals.c#_610 straight to https://pagure.io/freeipa/blob/ipa-4-8/f/daemons/ipa-kdb/ipa_kdb_principals.c#_623 (because lcontext != NULL). https://pagure.io/freeipa/blob/ipa-4-8/f/daemons/ipa-kdb/ipa_kdb_principals.c#_623 and https://pagure.io/freeipa/blob/ipa-4-8/f/daemons/ipa-kdb/ipa_kdb_principals.c#_624 do not modify ipactx so how could we end up in https://pagure.io/freeipa/blob/ipa-4-8/f/daemons/ipa-kdb/ipa_kdb_principals.c#_627 with ipactx->lcontext == NULL?
ipactx
If I was interpreting everything correctly, I would say there is a compiler bug but I really think the bug is in my reasoning instead ;-)
On the other hand, if lcontext == NULL, we'd get to ret = ipadb_get_connection(ipactx); on line 614 and then lcontext would be NULL in line 631. This is not the case for the stacktrace but it would be a bug too.
lcontext == NULL
ret = ipadb_get_connection(ipactx);
One thing that might help here is to see the directory server's access logs for the connections from KDB driver around this crash. Say, if we've got down to ipadb_validate_radius(), we must have passed through ipadb_get_global_config() before that -- if that caused to change ipactx->lcontext, we'd see a search for ipaConfigString.
ipadb_validate_radius()
ipadb_get_global_config()
ipactx->lcontext
ipaConfigString
I fear we don't have these logs any moreā¦
The KDC is multiprocess, not multithreaded, yes. But that doesn't mean that there's no concurrency: we use an event library (through libverto) for descheduling.
master:
ipa-4-9:
Metadata Update from @frenaud: - Issue close_status updated to: fixed - Issue status updated to: Closed (was: Open)
Metadata Update from @frenaud: - Custom field affects_doc adjusted to on - Custom field knownissue adjusted to on - Issue status updated to: Open (was: Closed)
re-opening as we are still missing the backports to ipa-4-6 and ipa-4-8 branches
ipa-4-8:
ipa-4-6:
Login to comment on this ticket.