#3520 Files provider supports only BE_FILTER_ENUM
Closed: Fixed 11 months ago Opened a year ago by lslebodn.

  • enable files provider
  • query local user
    getent passwd -s sss 1001

Actual results:
sh$ getent passwd -s sss 1000
sh$ echo $?
2

Expected results:
The same as with /etc/passwd

sssd_nss.log

sh$ getent passwd -s files 1000
test_user:x:1000:1000::/home/test_user:/bin/bash
[sssd[nss]] [get_client_cred] (0x4000): Client creds: euid[0] egid[0] pid[32141].
[sssd[nss]] [setup_client_idle_timer] (0x4000): Idle timer re-set for client [0x556f04fab0f0][28]
[sssd[nss]] [accept_fd_handler] (0x0400): Client connected!
[sssd[nss]] [sss_cmd_get_version] (0x0200): Received client version [1].
[sssd[nss]] [sss_cmd_get_version] (0x0200): Offered version [1].
[sssd[nss]] [nss_getby_id] (0x0400): Input ID: 1000

[sssd[nss]] [cache_req_set_plugin] (0x2000): CR #0: Setting "User by ID" plugin
[sssd[nss]] [cache_req_send] (0x0400): CR #0: New request 'User by ID'
[sssd[nss]] [cache_req_select_domains] (0x0400): CR #0: Performing a multi-domain search
[sssd[nss]] [cache_req_search_domains] (0x0400): CR #0: Search will check the cache and check the data provider
[sssd[nss]] [cache_req_validate_domain_type] (0x2000): Request type POSIX-only for domain implicit_files type POSIX is valid
[sssd[nss]] [cache_req_set_domain] (0x0400): CR #0: Using domain [implicit_files]
[sssd[nss]] [cache_req_search_send] (0x0400): CR #0: Looking up UID:1000@implicit_files
[sssd[nss]] [cache_req_search_ncache] (0x0400): CR #0: Checking negative cache for [UID:1000@implicit_files]
[sssd[nss]] [sss_ncache_check_str] (0x2000): Checking negative cache for [NCE/UID/1000]
[sssd[nss]] [cache_req_search_ncache] (0x0400): CR #0: [UID:1000@implicit_files] is not present in negative cache
[sssd[nss]] [cache_req_search_cache] (0x0400): CR #0: Looking up [UID:1000@implicit_files] in cache
[sssd[nss]] [cache_req_search_send] (0x0400): CR #0: Object found, but needs to be refreshed.

[sssd[nss]] [cache_req_search_dp] (0x0400): CR #0: Looking up [UID:1000@implicit_files] in data provider
[sssd[nss]] [sss_dp_account_files_params] (0x2000): Domain files is not consistent, issuing update
[sssd[nss]] [sss_dp_issue_request] (0x0400): Issuing request for [0x556f02f74650:1:1000@implicit_files]
[sssd[nss]] [sss_dp_get_account_msg] (0x0400): Creating request for [implicit_files][0x1][BE_REQ_USER][idnumber=1000:-]
[sssd[nss]] [sbus_add_timeout] (0x2000): 0x556f04fa33b0
[sssd[nss]] [sss_dp_internal_get_send] (0x0400): Entering request [0x556f02f74650:1:1000@implicit_files]
[sssd[nss]] [sbus_remove_timeout] (0x2000): 0x556f04fa33b0
[sssd[nss]] [sbus_dispatch] (0x4000): dbus conn: 0x556f04f943f0
[sssd[nss]] [sbus_dispatch] (0x4000): Dispatching.
[sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Fatal]
[sssd[nss]] [cache_req_common_dp_recv] (0x0040): CR #0: Data Provider Error: 3, 5, Failed to get reply from Data Provider
[sssd[nss]] [cache_req_common_dp_recv] (0x0400): CR #0: Due to an error we will return cached data
[sssd[nss]] [cache_req_search_cache] (0x0400): CR #0: Looking up [UID:1000@implicit_files] in cache
[sssd[nss]] [cache_req_search_ncache_filter] (0x0400): CR #0: Filtering out results by negative cache
[sssd[nss]] [sss_ncache_check_str] (0x2000): Checking negative cache for [NCE/USER/implicit_files/test_user@implicit_files]
[sssd[nss]] [cache_req_search_ncache_filter] (0x0400): CR #0: [test_user@implicit_files] filtered out! (negative cache)

domain log

[sssd[be[implicit_files]]] [sbus_dispatch] (0x4000): dbus conn: 0x5625683a3770
[sssd[be[implicit_files]]] [sbus_dispatch] (0x4000): Dispatching.
[sssd[be[implicit_files]]] [sbus_message_handler] (0x2000): Received SBUS method org.freedesktop.sssd.dataprovider.getAccountInfo on path /org/freedesktop/sssd/dataprovider
[sssd[be[implicit_files]]] [sbus_get_sender_id_send] (0x2000): Not a sysbus message, quit
[sssd[be[implicit_files]]] [dp_get_account_info_handler] (0x0200): Got request for [0x1][BE_REQ_USER][idnumber=1000]
[sssd[be[implicit_files]]] [dp_attach_req] (0x0400): DP Request [Account #2568]: New request. Flags [0x0001].
[sssd[be[implicit_files]]] [dp_attach_req] (0x0400): Number of active DP request: 1
[sssd[be[implicit_files]]] [sss_domain_get_state] (0x1000): Domain implicit_files is Active
[sssd[be[implicit_files]]] [files_account_info_handler_send] (0x0020): Unexpected user filter type: 2
[sssd[be[implicit_files]]] [dp_req_done] (0x0400): DP Request [Account #2568]: Request handler finished [22]: Invalid argument
[sssd[be[implicit_files]]] [_dp_req_recv] (0x0400): DP Request [Account #2568]: Receiving request data.
[sssd[be[implicit_files]]] [dp_req_reply_gen_error] (0x0020): DP Request [Account #2568]: Finished. Error [22]: Invalid argument
[sssd[be[implicit_files]]] [dp_table_value_destructor] (0x0400): Removing [0:1:0x0001:1::implicit_files:idnumber=1000] from reply table
[sssd[be[implicit_files]]] [dp_req_destructor] (0x0400): DP Request [Account #2568]: Request removed.
[sssd[be[implicit_files]]] [dp_req_destructor] (0x0400): Number of active DP request: 0

It is a performance degradation with "sss files" in nsswitch because back-end is connected quite often and memory cache cannot be refreshed


Metadata Update from @lslebodn:
- Issue tagged with: bug, performance

a year ago

Metadata Update from @lslebodn:
- Custom field version adjusted to 1.15.1

a year ago

Metadata Update from @lslebodn:
- Issue untagged with: bug

a year ago
sh# getent passwd -s files | wc -l
52

sh# getent passwd -s sss | wc -l
47

Well, according to this:

[sssd[nss]] [sss_dp_account_files_params] (0x2000): Domain files is not consistent, issuing update

The request was done at a time when the back end was updating the cache as a result of something touching /etc/passwd. In that case, we should send another enumeration request so that the DP queues the issued request with the to wait for the update to finish. So as far as I can see, the problem is that we should not send a by-ID request, but an enumeration request.

btw the difference between the number of the enumerates users is the entries with UID or GID 0, which sssd doesn't handle, right?

Hmm, but also in the domain logs there is:

[sssd[be[implicit_files]]] [sss_domain_get_state] (0x1000): Domain implicit_files is Active

So it looks like the provider and responder disagree on the state of the domain. But I can't reproduce this problem..

btw the difference between the number of the enumerates users is the entries with UID or GID 0, which sssd doesn't handle, right?

I have just one user with UID 0 and difference was 5

Well, according to this:
[sssd[nss]] [sss_dp_account_files_params] (0x2000): Domain files is not consistent, issuing update

The request was done at a time when the back end was updating the cache.

No back-end was idle.

btw the difference between the number of the enumerates users is the entries with UID or GID 0, which sssd doesn't handle, right?

I have just one user with UID 0 and difference was 5

And you were right. Iw was because of primary gid =0

sh$ getent passwd -s sss | sort > from_sss
sh$ getent passwd -s files | sort > from_files
sh$ diff -u from_sss from_files | grep ^+
+++ from_files  2017-09-20 21:09:20.016639228 +0200
+halt:x:7:0:halt:/sbin:/sbin/halt
+operator:x:11:0:operator:/root:/sbin/nologin
+root:x:0:0:root:/root:/bin/bash
+shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
+sync:x:5:0:sync:/sbin:/bin/sync

Hmm, but also in the domain logs there is:
[sssd[be[implicit_files]]] [sss_domain_get_state] (0x1000): Domain implicit_files is Active

So it looks like the provider and responder disagree on the state of the domain. But I can't reproduce this problem.

Unfortunately, I did not create backup of cache and only removed problematic entry from cache with ldbedit; restarted sssd and touched /etc/passwd. Next time I'll switch order for modules in nsswitch.conf as a workaround.

I cannot parse the last comment.

What problematic entry?

How can I reproduce the bug that you saw? By removing one entry from ldb, restarting sssd, touching passwd and running getent?

I cannot parse the last comment.
What problematic entry?

The entry with UID 1000. I could not see any issue with UID 1001.

How can I reproduce the bug that you saw? By removing one entry from ldb, restarting sssd, touching passwd and running getent?

I play with sssd a lot on my machine :-) therefore I'm not sure what was a trigger for the bug. I noticed this bug after a while when I noticed that compilation was quite slow in ramdisk due to glibc -> sssd_nss -> (sometime sssd_be) -> and then libnss_files.so

But if we get to the situation that responder try to other request then BE_FILTER_ENUM then we should either support it or translate BE_FILTER_{NAME,IDNUM} to BE_FILTER_ENUM

Yes, we should translate BE_FILTER_{NAME,IDNUM} to BE_FILTER_ENUM. It's a bug we don't.

And homework for me: to find a reasonable reproducer :-)

Metadata Update from @jhrozek:
- Issue set to the milestone: SSSD 1.16.0

a year ago

And homework will not be difficult because I hit it again :-)

Metadata Update from @lslebodn:
- Issue tagged with: bug

a year ago

Another important note. It is a huge performance regression comparing to files.

sh$ cd /tmp/
sh$ mkdir data
sh$ for i in {1..10000}; do touch /tmp/data/file$i; done

#files is 1st in /etc/nsswitch.conf for passwd and groups
sh$ time ls -l /tmp/data/ >/dev/null

real    0m0.007s
user    0m0.000s
sys     0m0.007s

#sss is 1st in /etc/nsswitch.conf for passwd and groups
sh$time ls -l /tmp/data/ >/dev/null

real    0m0.046s
user    0m0.001s
sys     0m0.010s

As you can see it is almost 7 times slower due to missing valid entry in memory cache and bug in nss responder.

Please send me the logs or tell me how to reproduce the problem..

Since we are required to release a new upstream tarball no later than Friday Oct-20, I'm moving tickets that will not be closed by that date to the next milestone, 1.16.1

Metadata Update from @jhrozek:
- Issue set to the milestone: SSSD 1.16.1 (was: SSSD 1.16.0)

a year ago

I'm able to reproduce this bug finally. And since it affects a component enabled and shipped by default in Fedora, I think we can only defer to the next version

Metadata Update from @jhrozek:
- Issue tagged with: postpone-to-1-16-2

a year ago

Metadata Update from @jhrozek:
- Issue priority set to: major

a year ago

Metadata Update from @jhrozek:
- Issue set to the milestone: SSSD 1.16.2 (was: SSSD 1.16.1)

a year ago

Metadata Update from @jhrozek:
- Issue untagged with: postpone-to-1-16-2

a year ago

Metadata Update from @jhrozek:
- Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1540703

a year ago

Metadata Update from @fidencio:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

11 months ago

Login to comment on this ticket.

Metadata