#3756 machine credentials renewal with adcli reports "success" even if adcli fails
Closed: wontfix a year ago by pbrezina. Opened 2 years ago by vojamo.

I have been troubleshooting an error renewing machine account credentials for over 6 months. At seemingly random times, weeks to months apart, a seemingly random machine (out of a three-digit total) fails to renew its machine account credentials, and then of course SSSD stops working with AD. I have not yet found out whether the problem is in adcli, SSSD, or something else.

I finally managed to capture a log of the error, and it shows a possible bug in SSSD.
SSSD log says the adcli operation "finished successfully":

(Tue Jun  5 12:07:39 2018) [sssd[be[example.com]]] [be_ptask_done] (0x0400): Task [AD machine account password renewal]: finished successfully

Meanwhile, adcli did not actually finish successfully, it has an error:

<adcli output cut out>
 ! Cannot change computer password: Authentication error
adcli: updating membership with domain example.com failed: Cannot change computer password: Authentication error

So it seems that either SSSD thinks the operation was successful even though it had an error, or the SSSD log message is misleading.

I have no explanation right now for why the "Authentication error" happens, so troubleshooting of the actual problem will continue.


Hi,

I agree that the 'finished successfully' message here is a bit misleading. It only says that SSSD was able to run adcli; it does not reflect the result of the adcli operation.
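To illustrate the distinction, here is a minimal Python sketch (not SSSD's actual code) that separates "the helper could be started" from "the helper exited with status 0". The names `ChildResult` and `run_helper` are hypothetical, and a small Python child process stands in for adcli.

```python
import subprocess
import sys
from typing import NamedTuple, Optional, Sequence


class ChildResult(NamedTuple):
    launched: bool             # the helper process could be started at all
    exit_code: Optional[int]   # None if the helper never started

    @property
    def succeeded(self) -> bool:
        # Only an exit status of 0 counts as success of the operation itself.
        return self.launched and self.exit_code == 0


def run_helper(argv: Sequence[str]) -> ChildResult:
    """Run a helper command and report launch and exit status separately."""
    try:
        proc = subprocess.run(list(argv), capture_output=True, text=True)
    except OSError:
        # The binary is missing or not executable: the task could not even run.
        return ChildResult(launched=False, exit_code=None)
    return ChildResult(launched=True, exit_code=proc.returncode)


if __name__ == "__main__":
    # Stand-in for a helper that starts fine but fails, as adcli did in the report.
    failing = run_helper([sys.executable, "-c", "import sys; sys.exit(1)"])
    print("launched:", failing.launched, "succeeded:", failing.succeeded)
```

A log line based only on `launched` would read as success in the reported case, while one based on `succeeded` would not.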

About the 'Authentication error': this is most probably a timeout issue while using UDP (which is the default). For the password change, libkrb5 sends a UDP packet with the needed data to the AD DC. If there is no reply after 1s (hardcoded), it sends the packet again because libkrb5 assumes the packet was lost. But if the AD DC received the first packet and is still busy processing it, it will reply with KRB5KRB_AP_ERR_REPEAT when the second packet arrives. Since this reply is not specified in the related RFC 3244, libkrb5 assumes the most common issue and returns 'Authentication error'. With this error, adcli has to assume the password change failed and does not update the local keytab with the new keys. The AD DC, however, since it received the first packet, will update the password and won't accept the old one anymore.
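The race described above can be sketched as a toy model in Python. This is not libkrb5 or AD code: `ToyServer`, `client_change`, and the request id are hypothetical, there is no real networking or timing, and the model assumes the client acts on the replay error once it has retransmitted.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class ToyServer:
    processing_time: float            # seconds the server needs for one request
    password: str = "old"
    seen: Set[str] = field(default_factory=set)

    def receive(self, request_id: str, new_password: str) -> str:
        # A request that was already seen is treated as a replay.
        if request_id in self.seen:
            return "REPEAT"
        self.seen.add(request_id)
        self.password = new_password  # the first copy is always applied
        return "OK"


def client_change(server: ToyServer, new_password: str,
                  retransmit_after: float = 1.0) -> Tuple[str, str]:
    """Return (status seen by the client, password stored in the local keytab)."""
    keytab = "old"
    server.receive("req-1", new_password)
    if server.processing_time <= retransmit_after:
        # The reply arrived in time: the client stores the new keys.
        return "success", new_password
    # No reply within the timeout: the same packet is sent again.
    reply = server.receive("req-1", new_password)
    assert reply == "REPEAT"
    # Assumption: the client maps the unexpected reply to a generic error
    # and leaves the keytab untouched.
    return "Authentication error", keytab


if __name__ == "__main__":
    slow = ToyServer(processing_time=2.5)
    status, keytab = client_change(slow, "new")
    print(status, "| keytab:", keytab, "| server:", slow.password)
```

With a slow server, the model ends with the server holding the new password while the keytab still holds the old one, which is the mismatch that breaks the machine account.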

This issue is known upstream as http://krbdev.mit.edu/rt/Ticket/Display.html?id=7905.

To get around this I would recommend effectively disabling UDP by setting:

udp_preference_limit = 0

in the [libdefaults] section of /etc/krb5.conf. This is useful in AD environments in general as well because, due to the PAC, Kerberos tickets are typically larger than a UDP packet, so libkrb5 has to fall back to TCP anyway after trying UDP.
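When rolling this out to many machines, it may help to check that the setting really sits in the right section. The following Python sketch reads krb5.conf text and returns the first `udp_preference_limit` found under `[libdefaults]`. The function name is hypothetical, and the parser is deliberately simple: it ignores `include`/`includedir` directives and assumes a plain `key = value` layout.

```python
from typing import Optional


def udp_preference_limit(conf_text: str) -> Optional[int]:
    """Return the first udp_preference_limit in [libdefaults], or None if absent."""
    section = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()  # entering a new section
            continue
        if section == "libdefaults" and "=" in line:
            key, _, value = line.partition("=")
            if key.strip() == "udp_preference_limit":
                return int(value.strip())
    return None


if __name__ == "__main__":
    sample = """
[libdefaults]
    default_realm = EXAMPLE.COM
    udp_preference_limit = 0

[realms]
    EXAMPLE.COM = {
        kdc = dc.example.com
    }
"""
    print(udp_preference_limit(sample))
```

On a real system the text would come from /etc/krb5.conf; a result of `None` means the setting is missing from `[libdefaults]`, for example because it was placed in another section.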

HTH

bye,
Sumit

Thank you very much Sumit for giving pointers on the machine password change issue.

I've finally had the time to start testing if the problem is related to UDP like you suggest. Will roll this change out in a few phases and should see in a few months if the problems have stopped.

I added "udp_preference_limit = 0" to libdefaults in krb5.conf some time ago, and we've found at least one system where this problem still occurred. If I capture traffic with tcpdump and run "adcli update", I do still see some UDP traffic on port 88, though I can't tell if it's relevant.

Should that setting disable all UDP traffic? If so, it does not seem to be effective.

Hi,

Which version of MIT Kerberos are you using on this system? Maybe you are hitting https://krbdev.mit.edu/rt/Ticket/Display.html?id=8554.

HTH

bye,
Sumit

Metadata Update from @pbrezina:
- Issue tagged with: Canditate to close

a year ago

Thank you for taking time to submit this request for SSSD. Unfortunately this issue was not given priority and the team lacks the capacity to work on it at this time.

Given that we are unable to fulfill this request I am closing the issue as wontfix.

If the issue still persists on recent SSSD, you can request reconsideration of this decision by reopening this issue. Please provide additional technical details about its importance to you.

Thank you for understanding.

Metadata Update from @pbrezina:
- Issue close_status updated to: wontfix
- Issue status updated to: Closed (was: Open)

a year ago

SSSD is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in SSSD's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/SSSD/sssd/issues/4762

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.
