#3868 getgrgid fails unless getgrnam was called earlier for groups in Active Directory
Opened 24 days ago by mbar. Modified 10 days ago

=== Scenario:

Step 1. Clear cache with sudo sss_cache --everything
Step 2. Run the following commands in Windows PowerShell:

Remove-ADUser -Identity alice -Confirm:$False
Remove-ADGroup -Identity alice_group -Confirm:$False
New-ADUser -Name alice -AccountPassword (ConvertTo-SecureString 'password' -AsPlainText -Force)
Enable-ADAccount -Identity 'alice'
New-ADGroup -Name alice_group -GroupScope DomainLocal
Add-ADGroupMember -Identity alice_group -Members alice

Step 3. Get group database entries for alice:

$ python <<EOF                                                                     
import grp
import os
import pwd

entry = pwd.getpwnam("alice@test.mydomain.com")
gids = os.getgrouplist('alice@test.mydomain.com', entry.pw_gid)
for gid in gids:
    print(grp.getgrgid(gid))
EOF

The loop fails with an exception: "gid not found: 524201262".
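
To see every GID that fails to resolve, rather than stopping at the first exception, the loop can catch the KeyError that grp.getgrgid raises for a missing entry. This is only a minimal sketch; the helper name unresolved_gids and its lookup parameter are illustrative and not part of the original report.

```python
import grp


def unresolved_gids(gids, lookup=grp.getgrgid):
    """Return the GIDs for which the group lookup raises KeyError."""
    missing = []
    for gid in gids:
        try:
            lookup(gid)
        except KeyError:
            # grp.getgrgid raises KeyError when no group entry is found
            missing.append(gid)
    return missing
```

In step 3, replacing the loop with print(unresolved_gids(gids)) would list all GIDs that getgrouplist returned but getgrgid could not resolve.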

=== Workaround:
Repeating the steps above works if step 3 uses the following code instead:

$ python <<EOF                                                                     
import grp
import os
import pwd

grp.getgrnam('alice_group@test.mydomain.com')  # IMPORTANT LINE
entry = pwd.getpwnam("alice@test.mydomain.com")
gids = os.getgrouplist('alice@test.mydomain.com', entry.pw_gid)
for gid in gids:
    print(grp.getgrgid(gid))
EOF
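
The same workaround could be wrapped in a helper that falls back to name lookups only when the GID lookup fails. This is a sketch under assumptions: the function name getgrgid_with_warmup and its parameters are hypothetical, the candidate group names must be known in advance, and whether the retry succeeds on an affected system is only suggested by the observation above.

```python
import grp


def getgrgid_with_warmup(gid, group_names,
                         by_gid=grp.getgrgid, by_name=grp.getgrnam):
    """Look up a group by GID; on failure, resolve candidates by name and retry."""
    try:
        return by_gid(gid)
    except KeyError:
        pass
    for name in group_names:
        try:
            # Same call as the "IMPORTANT LINE" in the workaround
            by_name(name)
        except KeyError:
            continue  # unknown name, nothing to resolve
    # Raises KeyError again if the retry also fails
    return by_gid(gid)
```

For example, getgrgid_with_warmup(gid, ['alice_group@test.mydomain.com']) could replace grp.getgrgid(gid) inside the loop.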

=== Versions:

$ cat /etc/centos-release
CentOS Linux release 7.5.1804 (Core) 

$ uname -a
Linux HOSTNAME 3.10.0-862.14.4.el7.x86_64 #1 SMP Wed Sep 26 15:12:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

$ rpm -qa sssd
sssd-1.16.0-19.el7_5.8.x86_64

Active Directory on Windows Server 2012 R2 Version 6.3 (Build 9600).

=== Config file:

$ cat /etc/sssd/sssd.conf
[sssd]
domains = test.mydomain.com
config_file_version = 2
services = nss, pam
debug_level = 0x4000

[nss]
debug_level = 0x4000

[domain/test.mydomain.com]
ad_domain = test.mydomain.com
krb5_realm = TEST.MYDOMAIN.COM
realmd_tags = manages-system joined-with-samba 
cache_credentials = True
id_provider = ad
krb5_store_password_if_offline = True
default_shell = /bin/bash
ldap_id_mapping = True
use_fully_qualified_names = True
fallback_homedir = /home/%u@%d
access_provider = ad

This looks strange:
(Thu Oct 25 09:31:03 2018) [sssd[nss]] [nss_protocol_done] (0x4000): Sending reply: error [1432158209]: Internal Error

But I can't reproduce anything similar locally. Can you send both the nss log and the sssd domain log with a higher debug_level set in both the domain and nss sections?

It's crucial to delete the group (and the user?) and create them again:

(Fri Oct 26 05:57:41 2018) [sssd[nss]] [cache_req_search_cache] (0x0020): CR #11: Multiple objects were found when only one was expected!

I attached the reproducer script and logs. You need SSH access to the Windows Server to run the script. Note that sss_cache --everything is called each time before the entries are added.

In the logs: user: user_random_Ig4kAT, group: group_random_Ig4kAT.

sssd_nss.extended.log.failure
sssd.extended.log.failure
reproduce.sh
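
When going through the attached logs, a small filter can pull out the most severe messages, such as the 0x0020 line quoted above. The sketch below assumes every log line follows the format of the two lines quoted in this ticket and that lower debug-level values mean more severe messages, as in SSSD's bitmask levels. The names LOG_LINE and severe_messages are illustrative.

```python
import re

# Matches lines such as:
# (Fri Oct 26 05:57:41 2018) [sssd[nss]] [func] (0x0020): message
LOG_LINE = re.compile(
    r"^\((?P<time>[^)]*)\) \[(?P<proc>.+?)\] \[(?P<func>[^\]]+)\] "
    r"\((?P<level>0x[0-9a-fA-F]+)\): (?P<msg>.*)$"
)


def severe_messages(lines, max_level=0x0040):
    """Return (function, message) pairs for lines at or below max_level."""
    found = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and int(match.group("level"), 16) <= max_level:
            found.append((match.group("func"), match.group("msg")))
    return found
```

Passing the lines of sssd_nss.extended.log.failure to severe_messages would return only the entries logged at levels 0x0010 through 0x0040.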

Since you recreate the entries, I suspect it might help to try the sssd version that comes with CentOS 7.6. It includes some patches that were originally meant to help with renamed entries, but since the recreated entries keep the same name and therefore have the same originalDN, I think those patches might help here as well.

RHEL 7.6 was released two days ago. I don't know how long it will take CentOS to release 7.6, but I suspect not long.

Can you wait a bit and comment if you can still reproduce the bug with centos 7.6?

Sure. I assume that CentOS 7.6 will have sssd 2.0, right?
