#2591 sssd nss bug update vs create cache
Closed: Fixed None Opened 5 years ago by jraquino.

I have encountered a bug in SSSD that seems present in all versions that I have been able to test.

Instabilities with the LDAP/IPA server seem to result in the sssd cache holding gid values for nss, but not the corresponding alphabetic group names.

For example: uid=3341(user) gid=3341(user) groups=3341(user),3602(file-administrators),3579,12001,6895(vpn-users),12000(admins)
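A quick way to spot the symptom on a host is to scan `id` output for group entries that carry a gid but no name in parentheses. The following is only a minimal sketch: the helper name `unresolved_gids` is made up for illustration, and it assumes group names contain no spaces or commas.

```python
import re


def unresolved_gids(id_output: str) -> list[int]:
    """Return gids from the groups= list of `id` output that have no (name)."""
    match = re.search(r"groups=(\S+)", id_output)
    if match is None:
        return []
    missing = []
    for entry in match.group(1).split(","):
        # A resolved entry looks like "3341(user)"; an unresolved one is just "3579".
        m = re.fullmatch(r"(\d+)(?:\((.*)\))?", entry)
        if m and m.group(2) is None:
            missing.append(int(m.group(1)))
    return missing


sample = (
    "uid=3341(user) gid=3341(user) "
    "groups=3341(user),3602(file-administrators),3579,12001,"
    "6895(vpn-users),12000(admins)"
)
print(unresolved_gids(sample))
```

For the line quoted above, this reports gids 3579 and 12001 as the entries without a name.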

I have troubleshot this issue with sgallagh and he has recommended that I file this bug.

RH developers: The corresponding support case number is 01369183

Fields changed

owner: somebody => sbose

Fields changed

milestone: NEEDS_TRIAGE => SSSD 1.12.5

From the logs you provided I assume you used an SSSD version from RHEL-6.5. Did you have a chance to try a recent version from RHEL-6.6? If yes, do you have logs from the RHEL-6.6 version as well?

If I remember correctly, JR was compiling his own SSSD builds from source. Last time I was debugging an issue for him, he was running sssd-1-9 on RHEL-5.

That said, trying out the latest version would be good.

I will test both. I am actually running both RHEL5 & RHEL6; the problem exists on both and I would like it resolved on both, but I can manage the onus of recompiling the RHEL5 source with any patches that might be applicable.

I'm still trying to reproduce this on RHEL5 with a local build of sssd-1.9. So far without luck, but I have a suspicion. Can you try to set 'ldap_deref_threshold = 0' in the domain section of sssd.conf and see if group-memberships are now resolved properly?
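For anyone following along, the option goes into the `[domain/...]` section of sssd.conf, not into `[sssd]` or `[nss]`. The sketch below applies it to an in-memory copy of a config using Python's standard `configparser`. The domain name `example.com`, the sample config contents, and the helper name `disable_deref` are placeholders for illustration, not taken from the reporter's setup. Note that `configparser` drops comments when it rewrites a file, so for a real sssd.conf editing by hand may be preferable.

```python
import configparser
import io


def disable_deref(conf_text: str, domain: str) -> str:
    """Return conf_text with ldap_deref_threshold = 0 set in [domain/<domain>]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    section = f"domain/{domain}"
    if not parser.has_section(section):
        raise KeyError(f"no [{section}] section in sssd.conf")
    parser.set(section, "ldap_deref_threshold", "0")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical minimal config, for illustration only.
SAMPLE_CONF = """\
[sssd]
domains = example.com
services = nss, pam

[domain/example.com]
id_provider = ipa
"""

print(disable_deref(SAMPLE_CONF, "example.com"))
```

SSSD has to be restarted after the change for it to take effect.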

Reproducing the problem may be difficult as it is a symptom/result of a bug in the LDAP system... I would suggest doing things like deliberately running GDB on 389 or krb5kdc to stop it from answering requests from the client mid-stream. It is during these interruptions in service or outages in the LDAP backend that these caching problems seem to manifest. Specifically, they are a sort of soft failure: the traffic isn't rejected outright (port not listening, etc.). Instead, the server may accept the connection but not send data, or it may send unexpected or incomplete data.

It is in these scenarios where SSSD is supposed to be strongest, so that the client system can withstand the outage, but in the case of a soft-failure, the results appear to be mixed.
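To make the "accepts the connection but doesn't send data" case concrete, here is a small stand-in for such a server. It is only a sketch of the failure mode at the TCP level, not a reproduction of this bug: it does not speak LDAP, and the function names are made up for illustration. The `probe` function plays the client and shows that the connection succeeds while the read times out.

```python
import socket
import threading


def start_silent_server(host: str = "127.0.0.1", port: int = 0):
    """Listen on host:port, accept connections, and never send a byte."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    held = []  # keep accepted sockets referenced so they stay open

    def accept_loop():
        while True:
            try:
                conn, _addr = listener.accept()
            except OSError:
                return  # listener was closed
            held.append(conn)

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener, held


def probe(address, timeout: float = 1.0) -> str:
    """Connect like a client would and report how the server behaves."""
    with socket.create_connection(address, timeout=timeout) as client:
        client.sendall(b"ping")
        try:
            data = client.recv(1)
        except socket.timeout:
            return "connected, but no reply (soft failure)"
        return "closed" if data == b"" else "replied"


# Port 0 lets the OS pick a free port; getsockname() reveals which one.
listener, _held = start_silent_server()
print(probe(listener.getsockname()))
```

A hard failure (nothing listening on the port) would instead make `create_connection` raise immediately, which is the case clients tend to handle well.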

I will try setting ldap_deref_threshold = 0 and report back.

I have found this other similar bug: https://fedorahosted.org/sssd/ticket/2398?cversion=0&cnum_hist=1

I am very concerned that if I do implement the ldap_deref_threshold value, it will prevent my users from successfully ssh'ing into the systems.

Are there a number of bugs between IPA + SSSD that tie this problem together?

I can now confirm that setting ldap_deref_threshold = 0 does seem to resolve / mask the issue. I am able to move forward, though I expect this setting has consequences and potentially load issues that I need to consider.

Thank you for testing, this confirms my suspicion that there is an issue in the deref code-path. I will continue searching for a fix so that you can remove the ldap_deref_threshold option in a patched version of SSSD.

About the consequences of setting 'ldap_deref_threshold = 0': by default the group memberships are looked up one-by-one. Only if there are more than the given threshold value is a deref call used to hopefully speed things up. So, looking up one-by-one is the default case which is always used, and using the deref is an optimization.
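The decision described above can be pictured roughly as follows. This is a simplified model and not SSSD's actual code: the function name is invented, the "more than" boundary follows the wording of the comment above, and treating 0 as "deref disabled" follows the behavior documented for this option in sssd-ldap(5).

```python
def use_deref(members_to_look_up: int, threshold: int) -> bool:
    """Decide whether a group's members would be fetched with one deref call.

    Simplified model, not SSSD source code.
    """
    if threshold == 0:
        # 0 turns the dereference optimization off entirely.
        return False
    # Boundary handling (> vs >=) is an assumption based on the comment above.
    return members_to_look_up > threshold


# With the option set to 0, even a large group is resolved member by member.
print(use_deref(500, 0))
# With a non-zero threshold, small lookups stay on the one-by-one path.
print(use_deref(3, 10))
print(use_deref(500, 10))
```

This is why setting the option to 0 costs performance on large groups but does not remove any functionality: the one-by-one path is the baseline in every case.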

About #2398, I think it turned out that the reason for the issue was the new FreeIPA ACI scheme and not 'ldap_deref_threshold = 0'.

Thank you for your assistance. I'll come back to you when I have a patch. Please let me know if you prefer to build new packages yourself or if I should provide test builds. In the latter case, please let me know which version you need and for which platform.

Fields changed

patch: 0 => 1

resolution: => fixed
status: new => closed

Metadata Update from @jraquino:
- Issue assigned to sbose
- Issue set to the milestone: SSSD 1.12.5

3 years ago

SSSD is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in SSSD's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/SSSD/sssd/issues/3632

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.
