Description of problem:
Ran ipa-client-install to join an IPA domain. Ten minutes later an sssd crash
notification came up.
Version-Release number of selected component:
cmdline: /usr/libexec/sssd/sssd_be --domain ipa.thewalter.lan
var_log_messages: Feb 7 13:31:37 stef-rawhide-thewalter-lan abrt: Saved
core dump of pid 8282 (/usr/libexec/sssd/sssd_be) to
/var/spool/abrt/ccpp-2013-02-07-13:31:36-8282 (18923520 bytes)
Thread no. 1 (10 frames)
#2 talloc_abort at ../talloc.c:317
#3 talloc_abort_access_after_free at ../talloc.c:336
#4 talloc_chunk_from_ptr at ../talloc.c:357
#6 talloc_get_name at ../talloc.c:1153
#7 talloc_check_name at ../talloc.c:1172
#8 ipa_dyndns_child_handler at src/providers/ipa/ipa_dyndns.c:1173
#9 child_invoke_callback at src/util/child_common.c:578
#10 tevent_common_loop_immediate at ../tevent_immediate.c:135
#11 std_event_loop_once at ../tevent_standard.c:556
#12 _tevent_loop_once at ../tevent.c:507
This might shed some light:
#2 0x00007fe2a544a2e6 in talloc_abort (reason=0x7fe2a5450718 "Bad talloc magic value - access after free") at ../talloc.c:317
A use-after-free should be visible in valgrind during normal operation. Please also check in the corefile (perhaps based on the tevent_req return value) whether the event completed successfully or after a timeout.
design_review: => 0
testsupdated: => 0
Moving to 1.9.5 for more investigation. We don't have logs, so we should try to find the issue in the code.
milestone: NEEDS_TRIAGE => SSSD 1.9.5
The problem here is that the fork_nsupdate_send request was finished before nsupdate exited. Thus, when SIGCHLD is received and ipa_dyndns_child_handler() tries to retrieve its private data as a struct tevent_req, it accesses a request that was already freed.
There are only two scenarios when this can happen:
1. We reach IPA_DYNDNS_TIMEOUT (15 seconds) which calls tevent_req_error(req, ETIMEDOUT).
2. We fail to write data to the pipe; then in ipa_dyndns_stdin_done() we get ret != EOK from write_pipe_recv() and we call tevent_req_error(req, ret).
=> In both cases the callback is called and the request is freed, but the SIGCHLD handler is still registered and waiting for the signal. When the handler fires, it accesses already freed data, which causes sssd_be to crash.
Possible solutions:
1. Do not call tevent_req_error() and tevent_req_done() outside the SIGCHLD handler. However, this would make the timeout useless.
2. Provide a way to remove the SIGCHLD handler, and remove the handler before we mark the request as finished.
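The second option might look roughly like the sketch below. It is only an illustration in plain C and does not use the real talloc, tevent, or SSSD APIs: struct request, struct child_handler, child_handler_remove(), deliver_sigchld(), and request_timeout() are all hypothetical names made up for this example. The point is the ordering in the timeout path: the handler is detached first, and only then is the request finished and freed, so a late SIGCHLD has nothing left to dereference.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; NOT the SSSD, talloc, or tevent types. */
struct request {
    bool finished;
    int error;
};

typedef void (*child_cb_fn)(int status, void *pvt);

struct child_handler {
    child_cb_fn cb;
    void *pvt;      /* points at the request while it is alive */
    bool active;
};

static int handler_calls;

/* Stand-in for a child-exit callback that touches the request. */
static void child_handler_cb(int status, void *pvt)
{
    struct request *req = pvt;

    handler_calls++;
    req->finished = true;
    req->error = status;
}

/* Detach the handler so a later SIGCHLD can no longer reach the request. */
static void child_handler_remove(struct child_handler *h)
{
    h->active = false;
    h->cb = NULL;
    h->pvt = NULL;
}

/* Simulated SIGCHLD delivery: call back only if still registered. */
static void deliver_sigchld(struct child_handler *h, int status)
{
    if (h->active && h->cb != NULL) {
        h->cb(status, h->pvt);
    }
}

/* Timeout path: remove the handler FIRST, then finish and free the request. */
static void request_timeout(struct request **req, struct child_handler *h,
                            int err)
{
    child_handler_remove(h);
    (*req)->finished = true;
    (*req)->error = err;
    free(*req);
    *req = NULL;
}

/* Returns how many times the handler ran after the request timed out. */
int demo_timeout_then_sigchld(void)
{
    struct request *req = calloc(1, sizeof(*req));
    if (req == NULL) {
        return -1;
    }

    struct child_handler h = { child_handler_cb, req, true };
    handler_calls = 0;

    request_timeout(&req, &h, ETIMEDOUT);  /* request is freed here */
    assert(req == NULL);
    deliver_sigchld(&h, 0);                /* child exits afterwards */

    return handler_calls;
}
```

In this sketch, the late child exit is simply ignored once the request is gone. Without the child_handler_remove() call in request_timeout(), deliver_sigchld() would pass the freed pointer to the callback, which is the same pattern as the crash in ipa_dyndns_child_handler().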
owner: somebody => lslebodn
Will be fixed along with the AD dyndns enhancement.
milestone: SSSD 1.9.5 => SSSD 1.10 beta
owner: lslebodn => jhrozek
review: => 0
patch: 0 => 1
This access-after-free was fixed as a byproduct of 9cb46bc
resolution: => fixed
status: new => closed
Metadata Update from @jhrozek:
- Issue assigned to jhrozek
- Issue set to the milestone: SSSD 1.10 beta