I have encountered a bug in SSSD that seems present in all versions that I have been able to test.
Instabilities in the LDAP/IPA server seem to leave the sssd cache holding gid values for nss, but not the corresponding group names.
For example: uid=3341(user) gid=3341(user) groups=3341(user),3602(file-administrators),3579,12001,6895(vpn-users),12000(admins)
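To spot affected accounts quickly, the output of `id` can be scanned for gids that have no name in parentheses. The following is only a minimal sketch: the function name `unresolved_gids` is hypothetical, and it assumes group names contain no spaces.

```python
import re


def unresolved_gids(id_output):
    """Return the numeric gids in the groups= list that have no (name) suffix."""
    match = re.search(r"groups=(\S+)", id_output)
    if match is None:
        return []
    missing = []
    for entry in match.group(1).split(","):
        # A resolved entry looks like "3602(file-administrators)";
        # an unresolved one is the bare number, e.g. "3579".
        if re.fullmatch(r"\d+", entry):
            missing.append(int(entry))
    return missing


sample = (
    "uid=3341(user) gid=3341(user) "
    "groups=3341(user),3602(file-administrators),3579,12001,"
    "6895(vpn-users),12000(admins)"
)
print(unresolved_gids(sample))  # [3579, 12001]
```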
I have troubleshot this issue with sgallagh and he has recommended that I file this bug.
attachment sssd_nss.log
RH developers: The corresponding support case number is 01369183
Linked to Bugzilla bug: https://bugzilla.redhat.com/show_bug.cgi?id=1196204 (Red Hat Enterprise Linux 6)
rhbz: => [https://bugzilla.redhat.com/show_bug.cgi?id=1196204 1196204]
Fields changed
owner: somebody => sbose
milestone: NEEDS_TRIAGE => SSSD 1.12.5
From the logs you provided, I assume you used an SSSD version from RHEL-6.5. Did you have a chance to try a recent version from RHEL-6.6? If yes, do you have logs from the RHEL-6.6 version as well?
If I remember correctly, JR was compiling his own SSSD builds from source. Last time I was debugging an issue for him, he was running sssd-1-9 on RHEL-5.
That said, trying out the latest version would be good.
I will test both. I am running both RHEL5 and RHEL6; the problem exists on both and I would like it resolved on both, but I can take on the onus of recompiling the RHEL5 source with any applicable patches.
I'm still trying to reproduce this on RHEL5 with a local build of sssd-1.9. So far without luck, but I have a suspicion. Can you try to set 'ldap_deref_threshold = 0' in the domain section of sssd.conf and see if group-memberships are now resolved properly?
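For anyone following along, the option has to sit in the domain section rather than in [sssd]. A small sketch of a check for that is shown here, using Python's configparser on sample text. The domain name `example.com`, the other sample options and the helper name `deref_threshold_by_domain` are assumptions for illustration only.

```python
import configparser

# Hypothetical sssd.conf contents; only ldap_deref_threshold comes from this ticket.
SAMPLE_CONF = """
[sssd]
services = nss, pam
domains = example.com

[domain/example.com]
id_provider = ipa
ldap_deref_threshold = 0
"""


def deref_threshold_by_domain(conf_text):
    """Map each [domain/...] section to its ldap_deref_threshold value (or None)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    result = {}
    for section in parser.sections():
        if section.startswith("domain/"):
            result[section] = parser.get(
                section, "ldap_deref_threshold", fallback=None
            )
    return result


print(deref_threshold_by_domain(SAMPLE_CONF))
```

A value of None for a domain would mean the option is not set in that section.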
Reproducing the problem may be difficult, as it is a symptom of a bug in the LDAP system. I would suggest deliberately attaching GDB to 389 or krb5kdc to stop it from answering client requests mid-stream. It is during these interruptions or outages in the LDAP backend that the caching problems seem to manifest. Specifically, they are soft failures: the traffic isn't rejected outright (as with a port that is not listening); instead the server may accept the connection but send no data, or send unexpected or incomplete data.
It is in these scenarios where SSSD is supposed to be strongest, so that the client system can withstand the outage, but in the case of a soft-failure, the results appear to be mixed.
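One rough way to imitate the "accepts the connection but doesn't send data" case without GDB might be a listener that accepts TCP connections and then stays silent. This is a sketch only: `start_silent_listener` is a hypothetical helper, it speaks no LDAP or Kerberos at all, and whether it triggers the same caching behaviour in SSSD is untested.

```python
import socket
import threading


def start_silent_listener(host="127.0.0.1", port=0):
    """Accept TCP connections but never send a byte.

    Returns (server_socket, bound_port). Port 0 lets the OS pick a free port.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(5)
    held = []  # keep references so accepted sockets stay open

    def accept_loop():
        while True:
            try:
                conn, _ = server.accept()
            except OSError:
                break  # server socket was closed
            held.append(conn)

    threading.Thread(target=accept_loop, daemon=True).start()
    return server, server.getsockname()[1]
```

A client that connects to the returned port will see the connection succeed and then block on reads until its own timeout expires.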
I will try setting ldap_deref_threshold = 0 and report back.
I have found this other similar bug: https://fedorahosted.org/sssd/ticket/2398?cversion=0&cnum_hist=1
I am very concerned that if I do implement the ldap_deref_threshold value, it will prevent my users from successfully ssh'ing into the systems.
Are there a number of bugs between IPA + SSSD that tie this problem together?
I can now confirm that setting ldap_deref_threshold = 0 does seem to resolve / mask the issue. I am able to move forward, though I expect this setting has consequences and potentially load issues that I need to consider.
Thank you for testing, this confirms my suspicion that there is an issue in the deref code-path. I will continue searching for a fix so that you can remove the ldap_deref_threshold option in a patched version of SSSD.
About the consequences of setting 'ldap_deref_threshold = 0': by default the group memberships are looked up one-by-one. Only if there are more than the given threshold value is a deref call used, to hopefully speed things up. So, looking up one-by-one is the default case which is always used, and using the deref is an optimization.
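The rule described above can be sketched as follows. This is an illustration only, not SSSD's actual code: the function name is hypothetical, the threshold of 10 in the usage lines is just an example value, and the exact boundary ("more than" versus "at least") is taken from the wording of this comment.

```python
def choose_lookup_strategy(member_count, deref_threshold):
    """Pick a lookup strategy for a group's members.

    A threshold of 0 disables the deref optimization, so every member
    is looked up individually.
    """
    if deref_threshold > 0 and member_count > deref_threshold:
        return "deref"
    return "one-by-one"


print(choose_lookup_strategy(50, 10))  # deref
print(choose_lookup_strategy(3, 10))   # one-by-one
print(choose_lookup_strategy(50, 0))   # one-by-one
```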
About #2398, I think it turned out that the reason for the issue was the new FreeIPA ACI scheme and not 'ldap_deref_threshold = 0'.
Thank you for your assistance. I'll come back to you when I have a patch. Please let me know if you prefer to build new packages yourself or if I should provide test builds. In the latter case, please let me know which version you need and for which platform.
patch: 0 => 1
resolution: => fixed status: new => closed
Metadata Update from @jraquino: - Issue assigned to sbose - Issue set to the milestone: SSSD 1.12.5
SSSD is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in SSSD's github repository.
This issue has been cloned to Github and is available here: - https://github.com/SSSD/sssd/issues/3632
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
Thank you for understanding. We apologize for all inconvenience.