I ran a dnf update today and was left without a working Kerberos ticket cache. It appears that restarting the service during the update ran into problems, which left it unusable afterwards. A simple systemctl restart sssd-kcm.socket fixed the issue.
Attaching both the system log and the DNF output.
Attachments:
system.log: /SSSD/sssd/issue/raw/files/d2bd4720b5f5115aed99ec634d1e725b607f20d08da453fe5ca2d00f9d0fc932-system.log
dnf_history_info_last: /SSSD/sssd/issue/raw/files/9a3c92c3286adc4d1040594531a0fe2440afd6178ace5366561a7b6c21a71df4-dnf_history_info_last
Can you reproduce the issue? In the system log, I can only see that KCM had some issues, as you say, and that syslog reported the socket was already there.
Could you reproduce the issue again, this time adding:
[kcm]
debug_level=10
debug_microseconds=true

[secrets]
debug_level=10
debug_microseconds=true
to sssd.conf and restarting the sssd service?
Hm, I don't seem to be able to reproduce this right now. Though maybe the order/timing of restarts in the post install scripts is relevant?
Maybe... What did you upgrade from and to?
Hm, the attachment is being slightly mishandled, but I think it should be there.
https://pagure.io/SSSD/sssd/issue/raw/files/9a3c92c3286adc4d1040594531a0fe2440afd6178ace5366561a7b6c21a71df4-dnf_history_info_last
Not sure if this might be related; I just unsuspended the machine, and got stuck with a non-working SSSD-KCM. Though it looked a bit like sssd-secrets was stuck on something and sssd-kcm just refused to work (or even stop) at that point.
Attaching an strace of sssd-kcm; sorry, I don't have anything else at this point. It keeps trying to do a 'sendto(14, "GET /kcm/persistent/1000/ccache/"..., 115, MSG_NOSIGNAL, NULL, 0) = 115'. Running "klist -A" resulted in "klist: Internal credentials cache error while listing ccache collection" (also attaching an strace of that).
I have now enabled the debugging features, so let's hope that something more useful comes out of that.
Attachments:
klist-A-failure-after-suspend-for-more-than-a-day: /SSSD/sssd/issue/raw/files/1719787b38b7bb1d442458538c53e7f6621269d3ee4b5abfdbcd223259a5035f-klist-A-failure-after-suspend-for-more-than-a-day
strace-sssd-kcm-hanging: /SSSD/sssd/issue/raw/files/3cdd419ffda677956297e1a7173e71c427fbaf7a77c78a95b322e70b9fd710d3-strace-sssd-kcm-hanging
Seems to be the same bug as in Fedora ticket https://bugzilla.redhat.com/show_bug.cgi?id=1494843#c4
Debug log files will be more useful than strace output: https://bugzilla.redhat.com/show_bug.cgi?id=1494843#c12
@benzea Do you use GNOME Online Accounts + Kerberos? Or can you reproduce it with plain kinit?
GNOME online accounts is obviously running, but I have always added my kerberos identities by running kinit every time.
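So, I just ran into it, and after a short chat with Patrick Uiterwijk it looks like the race condition is simply that systemd has not yet opened the socket when the sssd service is being activated. That is, what happens is:
* sssd-*.service is started
* sssd-*.socket is also triggered (but the socket is not bound yet)
* the daemon comes up
* the daemon binds to the socket, as systemd has not done so yet
* systemd fails to bind the socket and the sssd-*.socket units fail to start up
There are different possible fixes for this:
* add proper Before=/After= lines
* prevent the service from ever trying to bind to the socket if running under systemd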
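The second fix listed above can be sketched as a small decision function. This is a minimal Python sketch and not SSSD's actual code (SSSD is written in C, and the change that eventually landed is not shown in this ticket). It assumes the documented systemd socket-activation protocol (LISTEN_PID and LISTEN_FDS, with inherited descriptors starting at 3). As a further assumption, it treats the INVOCATION_ID variable that systemd sets for service processes as the hint for "running under systemd". The names inherited_fds and socket_plan, and the returned labels, are hypothetical.

```python
SD_LISTEN_FDS_START = 3  # first descriptor systemd passes to an activated service


def inherited_fds(environ, pid):
    """Return descriptors passed via socket activation, or [] if there are none."""
    try:
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []  # the variables were meant for another process
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []  # variables missing or malformed
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + max(count, 0)))


def socket_plan(environ, pid):
    """Decide how the daemon should obtain its listening socket (hypothetical policy)."""
    fds = inherited_fds(environ, pid)
    if fds:
        return ("adopt", fds)  # systemd already bound the socket, reuse it
    if "INVOCATION_ID" in environ:
        # Under systemd but no socket was passed: binding here would race
        # with the .socket unit, so refuse instead.
        return ("refuse", [])
    return ("bind", [])  # started by hand, so bind the socket ourselves
```

A daemon would call socket_plan(os.environ, os.getpid()) at startup. The "refuse" branch is what keeps the service from binding the path before the matching .socket unit can.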
On (03/11/17 10:17), Benjamin Berg wrote:
So, I just ran into it, and after a short chat with Patrick Uiterwijk it looks like the race condition is simply that systemd has not yet opened the socket when the sssd service is being activated. That is, what happens is:
* sssd-*.service is started
* sssd-*.socket is also triggered (but the socket is not bound yet)
* the daemon comes up
* the daemon binds to the socket, as systemd has not done so yet
* systemd fails to bind the socket and the sssd-*.socket units fail to start up
There are different possible fixes for this:
* add proper Before=/After= lines
* prevent the service from ever trying to bind to the socket if running under systemd
I checked a few other socket-activated services.
Most socket-activated services use "Requires=$name.socket" instead of Before/After, and some of them use Wants+After.
I will check with the systemd guys.
LS
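A survey like the one described above could be automated along these lines. This is a minimal Python sketch, assuming unit files in the usual INI-like syntax. The function name socket_dependencies and the sample unit text are hypothetical and not taken from any real package.

```python
ORDERING_KEYS = ("Requires", "Wants", "After", "Before")


def socket_dependencies(unit_text, socket_name):
    """List which [Unit] directives of a service mention the given socket unit."""
    found = set()
    section = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        if section == "Unit" and sep and socket_name in value.split():
            found.add(key.strip())
    # Report only dependency/ordering directives, in a fixed order.
    return [k for k in ORDERING_KEYS if k in found]


# Hypothetical unit text, for illustration only.
SAMPLE_UNIT = """\
[Unit]
Description=Hypothetical socket-activated daemon
Requires=example.socket
After=example.socket

[Service]
ExecStart=/usr/bin/example-daemon
"""

print(socket_dependencies(SAMPLE_UNIT, "example.socket"))  # ['Requires', 'After']
```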
https://github.com/SSSD/sssd/pull/437
Metadata Update from @lslebodn: - Issue tagged with: PR
master:
Metadata Update from @lslebodn: - Custom field version adjusted to 1.15.3
Metadata Update from @lslebodn: - Issue close_status updated to: Fixed - Issue set to the milestone: SSSD 1.16.1 - Issue status updated to: Closed (was: Open)
Metadata Update from @lslebodn: - Issue assigned to lslebodn
SSSD is moving from Pagure to GitHub. This means that new issues and pull requests will be accepted only in SSSD's GitHub repository.
This issue has been cloned to GitHub and is available here: https://github.com/SSSD/sssd/issues/4555
If you want to receive further updates on the issue, please navigate to the GitHub issue and click the subscribe button.
Thank you for understanding. We apologize for any inconvenience.