#3696 Sudo is not finding the host in the netgroup
Closed: Invalid 6 years ago Opened 6 years ago by gerases.

sssd version: 1.12.4
environment: ipa

I've replaced the private values in the logs with:

hostA
hostB
HOSTGROUP
UNAME

I have six hosts belonging to HOSTGROUP. A user has a sudo rule associated with that host group in IPA. The issue is that running sudo -l -U UNAME lists the rule on hostB but not on hostA.

I've enabled debugging in both sudo and sssd as shown here: https://docs.pagure.org/SSSD.sssd/users/sudo_troubleshooting.html

Here are the relevant logs on hostA (the non-working host):

sudo log

Apr  5 23:32:31 sudo[36761] -> netgr_matches @ ./match.c:720
Apr  5 23:32:31 sudo[36761] (hostA, *, (none)) NOT found in netgroup HOSTGROUP
Apr  5 23:32:31 sudo[36761] (hostA, *, (none)) NOT found in netgroup HOSTGROUP
Apr  5 23:32:31 sudo[36761] <- netgr_matches @ ./match.c:772 := false
Apr  5 23:32:31 sudo[36761] IPA hostname (hostA) matches +HOSTGROUP => false

sssd domain log (ipa)

(Thu Apr  5 23:32:31 2018) [sssd[be[DOMAIN]]] [ipa_netgr_members_process] (0x2000): Found 6 members in current search base
(Thu Apr  5 23:32:31 2018) [sssd[be[DOMAIN]]] [ipa_netgr_process_all] (0x2000): Extracting netgroup members of netgroup 0
(Thu Apr  5 23:32:31 2018) [sssd[be[DOMAIN]]] [ipa_netgr_process_all] (0x2000): Extracted 0 netgroup members
(Thu Apr  5 23:32:31 2018) [sssd[be[DOMAIN]]] [ipa_netgr_process_all] (0x2000): Extracted 0 user members
(Thu Apr  5 23:32:31 2018) [sssd[be[DOMAIN]]] [ipa_netgr_process_all] (0x2000): Extracted 6 host members
(Thu Apr  5 23:32:31 2018) [sssd[be[DOMAIN]]] [ipa_netgr_process_all] (0x2000): Putting together triples of netgroup 0
...
(Thu Apr  5 23:32:31 2018) [sssd[be[DOMAIN]]] [ipa_save_netgroup] (0x1000): No original members for netgroup [HOSTGROUP]
(Thu Apr  5 23:32:31 2018) [sssd[be[DOMAIN]]] [ipa_save_netgroup] (0x1000): No members for netgroup [HOSTGROUP]
(Thu Apr  5 23:32:31 2018) [sssd[be[DOMAIN]]] [ipa_save_netgroup] (0x0400): Storing info for netgroup HOSTGROUP
(Thu Apr  5 23:32:31 2018) [sssd[be[DOMAIN]]] [sysdb_add_basic_netgroup] (0x0400): Error: 17 (File exists)
(Thu Apr  5 23:32:31 2018) [sssd[be[DOMAIN]]] [acctinfo_callback] (0x0100): Request processed. Returned 0,0,Success
(Thu Apr  5 23:32:31 2018) [sssd[be[DOMAIN]]] [sdap_process_result] (0x2000): Trace: sh[0xd02d90], connected[1], ops[(nil)], ldap[0xcc8060]
(Thu Apr  5 23:32:31 2018) [sssd[be[DOMAIN]]] [sdap_process_result] (0x2000): Trace: ldap_result found nothing!

sssd sudo log

(Thu Apr  5 23:32:30 2018) [sssd[sudo]] [sudosrv_get_sudorules_from_cache] (0x0400): Returning 18 rules for [USER]

I've looked at the db on disk and I did find the 18 rules I expected to see. The only difference between the working host and the non-working host is that the working host (hostB) has this in its sudo log.

Apr  5 21:30:41 sudo[38094] -> sudo_sss_ipa_hostname_matches @ ./sssd.c:557
Apr  5 21:30:41 sudo[38094] -> hostname_matches @ ./match.c:604
Apr  5 21:30:41 sudo[38094] <- hostname_matches @ ./match.c:613 := false
Apr  5 21:30:41 sudo[38094] -> netgr_matches @ ./match.c:720
Apr  5 21:30:41 sudo[38094] (hostB, *, *) found in netgroup HOSTGROUP
Apr  5 21:30:41 sudo[38094] <- netgr_matches @ ./match.c:773 := true
Apr  5 21:30:41 sudo[38094] IPA hostname (hostB) matches +HOSTGROUP => true

So it finds hostB in its hostgroup, but fails to find hostA in the same hostgroup. I've tried clearing the cache, removing the files from /var/lib/sssd/db. The sssd.conf files are identical except for ipa_hostname.

Not sure why sudo can't match the host against HOSTGROUP. I've also executed the ldap searches mentioned in the logs, and what comes back is the expected hosts.

I would really appreciate any ideas on this.

Thanks,
Sergei


One thing I did notice in /var/log/messages:

Apr  5 22:07:53 name102 sssd[nss]: More groups have the same GID [12510] in directory server. SSSD will not work correctly.

That message is not present on hostB (the working host).

1.12 is really ancient; don't run it in production or, really, anywhere else.

Can you resolve the HOSTGROUP as a netgroup (getent netgroup HOSTGROUP)? What does the hostgroup look like on the server?

Thank you for the quick response.

I think I can upgrade to 1.13 on those systems, but since it's 6.5 I don't think I have much choice. But I will look closer shortly.

Yes, I used getent last and got the 6 machines back (including the problem server). Also, I dumped the sssd cache and saw the same result there.

Also, what does (none) mean in this line:

(name102.sj2.hbase.crm.cnvr.net, *, (none)) NOT found in netgroup prd.dtm.hbase.name

Updated to sssd 1.13. Issue still there.

Looking at the source code, I see that unless the domain name is defined, it should be *. So, it appears that in the case of the non-working host, the domain name is literally '(none)', which will never match the domain returned from LDAP? If that's so, it's a mystery why getdomainname would return different values on the two hosts.

Running domainname in bash returns (none) on both.

I have good news to report. After playing with your C code and reading about innetgr, I realized that its third parameter is the domain name. So the facts are:

  • Both systems report (none) for domainname in Bash.
  • The "good" system has * for the last parameter in innetgr, which means the domain that sudo passed to it was NULL.
  • The "bad" system has (none) for the domain name, which means getdomainname returned the literal string (none). In your code (sudo-1.8.6p3-netgrmatchtrace.patch) you have:
<snip>
+     sudo_debug_printf(SUDO_DEBUG_TRACE,
+                       "(%s, %s, %s) NOT found in netgroup %s\n",
+                       lhost ? lhost : "*",
+                       user ? user : "*",
+                       domain ? domain : "*",
+                       netgr);
</snip>

... which supports my theory

Having gathered those facts, I executed domainname OURDOMAIN and re-ran sudo, which worked!

I still don't know why the good system doesn't get fooled by the (none) return.

I've compiled a small C program (found it with a quick Google) to check getdomainname and it returned (none) on both.

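/* gethostn.c:
 *
 * Example of getdomainname(2):
 */
#define _DEFAULT_SOURCE     /* glibc: expose getdomainname() */
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>

int
main(void) {
    int z;
    char buf[256];

    /* Ask the kernel for the NIS/YP domain name of this host */
    z = getdomainname(buf,sizeof buf);

    if ( z == -1 ) {
        fprintf(stderr,"%s: getdomainname(2)\n",
            strerror(errno));
        exit(1);
    }

    /* A truncated name may lack the terminating NUL */
    buf[sizeof buf - 1] = '\0';

    printf("domain name = '%s'\n",buf);

    return 0;
}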

But the good news is that we have a workaround now.

Hmm, I really don't know either. It sounds like you found the issue in sudo, so I would suggest opening an issue against sudo and asking there.

I'm closing this ticket, but thank you very much for debugging it.

Metadata Update from @jhrozek:
- Issue close_status updated to: Invalid
- Issue status updated to: Closed (was: Open)

6 years ago

SSSD is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in SSSD's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/SSSD/sssd/issues/4713

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.
