#2622 id lookup against large no. of posix groups skips a few groups
Closed: Invalid None Opened 5 years ago by jhrozek.

Ticket was cloned from Red Hat Bugzilla (product Red Hat Enterprise Linux 6): Bug 1208484

Description of problem:
Few groups are skipped in a id lookup against large no. of posix groups

Version-Release number of selected component (if applicable):
sssd-1.12.4-25.el6

How reproducible:
Always

Steps to Reproduce:
1. User lg-user1201-010724 belongs to 500 posix groups lg-group{1..500}-010724

2. sssd domain section has:
[domain/sssdad2012r2.com]
id_provider = ad
fallback_homedir = /home/%d/%u
default_shell = /bin/bash
ad_domain = sssdad2012r2.com
use_fully_qualified_names = True
ad_enable_gc = False
ldap_id_mapping = False
debug_level = 0xFFF0

3. id lg-user1201-010724


Actual results:
Automation run shows:
...
...
[   FAIL   ] :: id for lg-user1201-010724@sssdad2012r2.com is missing group
lg-group25-010724@sssdad2012r2.com
[   PASS   ] :: id for lg-user1201-010724@sssdad2012r2.com has group
lg-group26-010724@sssdad2012r2.com
[   PASS   ] :: id for lg-user1201-010724@sssdad2012r2.com has group
lg-group27-010724@sssdad2012r2.com
[   FAIL   ] :: id for lg-user1201-010724@sssdad2012r2.com is missing group
lg-group28-010724@sssdad2012r2.com
[   PASS   ] :: id for lg-user1201-010724@sssdad2012r2.com has group
lg-group29-010724@sssdad2012r2.com
[   FAIL   ] :: id for lg-user1201-010724@sssdad2012r2.com is missing group
lg-group30-010724@sssdad2012r2.com
[   PASS   ] :: id for lg-user1201-010724@sssdad2012r2.com has group
lg-group31-010724@sssdad2012r2.com
[   PASS   ] :: id for lg-user1201-010724@sssdad2012r2.com has group
lg-group32-010724@sssdad2012r2.com
[   PASS   ] :: id for lg-user1201-010724@sssdad2012r2.com has group
lg-group33-010724@sssdad2012r2.com
[   PASS   ] :: id for lg-user1201-010724@sssdad2012r2.com has group
lg-group34-010724@sssdad2012r2.com
[   PASS   ] :: id for lg-user1201-010724@sssdad2012r2.com has group
lg-group35-010724@sssdad2012r2.com
[   FAIL   ] :: id for lg-user1201-010724@sssdad2012r2.com is missing group
lg-group36-010724@sssdad2012r2.com
...
...

Expected results:
All groups should be returned

Additional info:
For a group that is not returned(lg-group36-010724), I see the following in
domain log:
(Thu Apr  2 16:36:18 2015) [sssd[be[sssdad2012r2.com]]]
[sdap_add_incomplete_groups] (0x1000): Group #0
[lg-group36-010724][lg-group36-010724] is not cached, need to add a fake entry
(Thu Apr  2 16:36:18 2015) [sssd[be[sssdad2012r2.com]]] [sdap_get_primary_name]
(0x0400): Processing object lg-group36-010724
(Thu Apr  2 16:36:18 2015) [sssd[be[sssdad2012r2.com]]]
[sdap_add_incomplete_groups] (0x2000): Adding fake group lg-group36-010724 to
sysdb
(Thu Apr  2 16:36:26 2015) [sssd[be[sssdad2012r2.com]]] [cleanup_groups]
(0x1000): Searching with: (&(objectClass=user)(|(memberOf=name=lg-group36-01072
4,cn=groups,cn=sssdad2012r2.com,cn=sysdb)(gidNumber=57040037)))
(Thu Apr  2 16:36:26 2015) [sssd[be[sssdad2012r2.com]]] [cleanup_groups]
(0x2000): About to delete group lg-group36-010724

For a group that is returned in the id lookup(lg-group35-010724), I see the
following in domain log:
(Thu Apr  2 16:36:40 2015) [sssd[be[sssdad2012r2.com]]]
[sdap_add_incomplete_groups] (0x1000): Group #0
[lg-group35-010724][lg-group35-010724] is not cached, need to add a fake entry
(Thu Apr  2 16:36:40 2015) [sssd[be[sssdad2012r2.com]]] [sdap_get_primary_name]
(0x0400): Processing object lg-group35-010724
(Thu Apr  2 16:36:40 2015) [sssd[be[sssdad2012r2.com]]]
[sdap_add_incomplete_groups] (0x2000): Adding fake group lg-group35-010724 to
sysdb
(Thu Apr  2 16:38:36 2015) [sssd[be[sssdad2012r2.com]]] [sdap_parse_entry]
(0x1000): OriginalDN: [CN=lg-group35-010724,CN=Users,DC=sssdad2012r2,DC=com].
(Thu Apr  2 16:38:36 2015) [sssd[be[sssdad2012r2.com]]]
[sdap_nested_group_hash_entry] (0x4000): Inserting
[CN=lg-group35-010724,CN=Users,DC=sssdad2012r2,DC=com] into hash table [groups]

Fields changed

blockedby: =>
blocking: =>
changelog: =>
coverity: =>
design: =>
design_review: => 0
feature_milestone: =>
fedora_test_page: =>
mark: no => 0
owner: somebody => lslebodn
review: True => 0
selected: =>
testsupdated: => 0

Fields changed

milestone: NEEDS_TRIAGE => SSSD 1.12.5

I was not able to reproduce it manually just with automated test. It was caused by race condition between initgroups and cleanup task.

If id-mapping is disabled then POSIX attributes need to be fetched for initgroups from AD. It causes "n + 1" ldap request.

If user authenticate (su/ssh) few operation are performed getpwnam, intgroups and then pam related tasks.

The operation initgroups takes approximately 13 seconds for 500 groups. The user is still not authenticated and therefore groups can be removed by cleanup tasks. This is a reason why test with posix attributes failed.

Log file:

(Thu Apr 16 03:41:42 2015) [sssd[be[sssdad.com]]] [be_get_account_info] (0x0200): Got request for [0x1001][1][name=lg-user1201-301548]
(Thu Apr 16 03:41:42 2015) [sssd[be[sssdad.com]]] [be_get_account_info] (0x0200): Got request for [0x3][1][name=lg-user1201-301548]
(Thu Apr 16 03:41:51 2015) [sssd[be[sssdad.com]]] [be_ptask_execute] (0x0400): Task [Cleanup of sssdad.com]: executing task, timeout 10800 seconds
(Thu Apr 16 03:41:51 2015) [sssd[be[sssdad.com]]] [be_ptask_done] (0x0400): Task [Cleanup of sssdad.com]: finished successfully
(Thu Apr 16 03:41:51 2015) [sssd[be[sssdad.com]]] [be_ptask_schedule] (0x0400): Task [Cleanup of sssdad.com]: scheduling task 10800 seconds from last execution time [1429180911]
(Thu Apr 16 03:41:56 2015) [sssd[be[sssdad.com]]] [be_pam_handler] (0x0100): Got request with the following data
(Thu Apr 16 03:41:56 2015) [sssd[be[sssdad.com]]] [be_pam_handler_callback] (0x0100): Backend returned: (0, 0, <NULL>) [Success]
(Thu Apr 16 03:41:56 2015) [sssd[be[sssdad.com]]] [be_pam_handler_callback] (0x0100): Sending result [0][sssdad.com]
(Thu Apr 16 03:41:56 2015) [sssd[be[sssdad.com]]] [be_pam_handler_callback] (0x0100): Sent result [0][sssdad.com]
(Thu Apr 16 03:41:56 2015) [sssd[be[sssdad.com]]] [be_pam_handler] (0x0100): Got request with the following data
(Thu Apr 16 03:41:56 2015) [sssd[be[sssdad.com]]] [be_target_access_permit] (0x4000): be_target_access_permit called, returning PAM_SUCCESS.
(Thu Apr 16 03:41:56 2015) [sssd[be[sssdad.com]]] [be_pam_handler_callback] (0x0100): Backend returned: (0, 0, <NULL>) [Success]

Downstream ticket was closed as NOT a BUG.

resolution: => invalid
status: new => closed

Metadata Update from @jhrozek:
- Issue assigned to lslebodn
- Issue set to the milestone: SSSD 1.12.5

3 years ago

SSSD is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in SSSD's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/SSSD/sssd/issues/3663

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Login to comment on this ticket.

Metadata