#2865 sssd_nss memory usage keeps growing on sssd-1.12.4-47.el6.x86_64 (RHEL6.7) when trying to retrieve non-existing netgroups
Closed: Fixed None Opened 3 years ago by preichl.

When a non-existant netgroup is requested, the sssd_nss memory usage goes up.


Memory growth is bad and Pavel has a fix. We should put the ticket into 1.13.3 and if the fix gets acked before 1.13.2 is tagged, move it up.

blockedby: =>
blocking: =>
changelog: =>
coverity: =>
design: =>
design_review: => 0
feature_milestone: =>
fedora_test_page: =>
mark: no => 0
review: True => 0
selected: =>
testsupdated: => 0

Fields changed

owner: somebody => preichl
priority: major => critical

Fields changed

milestone: NEEDS_TRIAGE => SSSD 1.13.3

Fields changed

patch: 0 => 1
status: new => assigned

A patch is available, but we're still discussing details. If we push the patch before the 1.13.3 release, I'll move the ticket back up..

milestone: SSSD 1.13.3 => SSSD 1.13.4

Fields changed

description: Ticket was cloned from Red Hat Bugzilla (product ''Red Hat Enterprise Linux 6''): [https://bugzilla.redhat.com/show_bug.cgi?id=1259687 Bug 1259687]

''Please note that this Bug is private and may not be accessible as it contains confidential Red Hat customer information.''

{{{
Description of problem:
I hope the title is accurate, this is my understanding so far.

Component leaking memory: sssd_nss

package version: sssd-1.12.4-47.el6.x86_64 (RHEL6.7)

Domain log:

Initially:

(Sun Aug 30 04:02:07 2015) [sssd[be[bskyb.com]]]
[sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire,
releasing it
(Sun Aug 30 04:10:01 2015) [sssd[be[bskyb.com]]] [sdap_process_group_send]
(0x0040): No Members. Done!
(Sun Aug 30 04:10:01 2015) [sssd[be[bskyb.com]]] [sdap_save_group] (0x0080):
Failed to retrieve UUID [22][Invalid argument].
(Sun Aug 30 04:17:10 2015) [sssd[be[bskyb.com]]]
[sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire,
releasing it
(Sun Aug 30 04:32:10 2015) [sssd[be[bskyb.com]]]
[sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire,
releasing it
(Sun Aug 30 04:34:41 2015) [sssd[be[bskyb.com]]] [sdap_save_user] (0x0080):
Failed to retrieve UUID [22][Invalid argument].
(Sun Aug 30 04:36:56 2015) [sssd[be[bskyb.com]]] [sdap_save_group] (0x0080):
Failed to retrieve UUID [22][Invalid argument].
(Sun Aug 30 04:36:56 2015) [sssd[be[bskyb.com]]] [sdap_save_group] (0x0080):
Failed to retrieve UUID [22][Invalid argument].
(Sun Aug 30 04:36:56 2015) [sssd[be[bskyb.com]]] [sdap_save_group] (0x0080):
Failed to retrieve UUID [22][Invalid argument].
(Sun Aug 30 04:36:56 2015) [sssd[be[bskyb.com]]] [sdap_save_group] (0x0080):
Failed to retrieve UUID [22][Invalid argument].
(Sun Aug 30 04:47:10 2015) [sssd[be[bskyb.com]]]
[sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire,
releasing it
(Sun Aug 30 04:50:34 2015) [sssd[be[bskyb.com]]] [sdap_save_user] (0x0080):
Failed to retrieve UUID [22][Invalid argument].
(Sun Aug 30 04:50:34 2015) [sssd[be[bskyb.com]]] [sdap_process_group_send]
(0x0040): No Members. Done!
(Sun Aug 30 04:50:34 2015) [sssd[be[bskyb.com]]] [sdap_save_group] (0x0080):
Failed to retrieve UUID [22][Invalid argument].
(Sun Aug 30 04:50:34 2015) [sssd[be[bskyb.com]]] [sdap_save_user] (0x0080):
Failed to retrieve UUID [22][Invalid argument].
(Sun Aug 30 04:50:34 2015) [sssd[be[bskyb.com]]] [sdap_process_group_send]
(0x0040): No Members. Done!

This stops at about:

(Tue Sep 1 12:02:49 2015) [sssd[be[bskyb.com]]]
[sdap_id_conn_data_expire_handler] (0x0080): connection is about to expire,
releasing it
(Tue Sep 1 12:09:18 2015) [sssd[be[bskyb.com]]] [sdap_save_user] (0x0080):
Failed to retrieve UUID [22][Invalid argument].
(Tue Sep 1 12:09:18 2015) [sssd[be[bskyb.com]]] [sdap_save_user] (0x0080):
Failed to retrieve UUID [22][Invalid argument].
(Tue Sep 1 12:09:18 2015) [sssd[be[bskyb.com]]] [sdap_save_group] (0x0080):
Failed to retrieve UUID [22][Invalid argument].
(Tue Sep 1 12:09:19 2015) [sssd[be[bskyb.com]]] [sdap_save_user] (0x0080):
Failed to retrieve UUID [22][Invalid argument].
(Tue Sep 1 12:09:19 2015) [sssd[be[bskyb.com]]] [sdap_save_group] (0x0080):
Failed to retrieve UUID [22][Invalid argument].
(Tue Sep 1 12:14:17 2015) [sssd[be[bskyb.com]]] [server_common_rotate_logs]
(0x0010): Debug level changed to 0x77f0
(Tue Sep 1 12:14:24 2015) [sssd[be[bskyb.com]]] [sbus_dispatch] (0x4000): dbus
conn: 0x102e4d0
(Tue Sep 1 12:14:24 2015) [sssd[be[bskyb.com]]] [sbus_dispatch] (0x4000):
Dispatching.
(Tue Sep 1 12:14:24 2015) [sssd[be[bskyb.com]]] [sbus_message_handler]
(0x4000): Received SBUS method [ping]
(Tue Sep 1 12:14:24 2015) [sssd[be[bskyb.com]]] [sbus_get_sender_id_send]
(0x2000): Not a sysbus message, quit
(Tue Sep 1 12:14:24 2015) [sssd[be[bskyb.com]]] [sbus_handler_got_caller_id]
(0x4000): Received SBUS method [ping]
(Tue Sep 1 12:14:34 2015) [sssd[be[bskyb.com]]] [sbus_dispatch] (0x4000): dbus
conn: 0x102e4d0
(Tue Sep 1 12:14:34 2015) [sssd[be[bskyb.com]]] [sbus_dispatch] (0x4000):
Dispatching.

Then we have netgroup searches:

(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_get_netgroups_next_base]
(0x0400): Searching for netgroups with base [ou=netgroup,dc=bskyb,dc=com]
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_print_server] (0x2000):
Searching 10.88.18.10
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_get_generic_ext_step]
(0x0400): calling ldap_search_ext with [(&(cn=sb-actuate-prd-sup-su)(objectclas
s=nisNetgroup))][ou=netgroup,dc=bskyb,dc=com].
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_get_generic_ext_step]
(0x1000): Requesting attrs: [objectClass]
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_get_generic_ext_step]
(0x1000): Requesting attrs: [cn]
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_get_generic_ext_step]
(0x1000): Requesting attrs: [memberNisNetgroup]
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_get_generic_ext_step]
(0x1000): Requesting attrs: [nisNetgroupTriple]
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_get_generic_ext_step]
(0x1000): Requesting attrs: [modifyTimestamp]
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_get_generic_ext_step]
(0x2000): ldap_search_ext called, msgid = 145
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_process_result]
(0x2000): Trace: sh[0x1150750], connected[1], ops[0x117c4c0], ldap[0x10cfe90]
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_process_result]
(0x2000): Trace: ldap_result found nothing!
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_process_result]
(0x2000): Trace: sh[0x1150750], connected[1], ops[0x117c4c0], ldap[0x10cfe90]
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_process_message]
(0x4000): Message type: [LDAP_RES_SEARCH_RESULT]
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_get_generic_op_finished]
(0x0400): Search result: Success(0), no errmsg set
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_get_netgroups_process]
(0x0400): Search for netgroups, returned 0 results.
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sdap_id_op_done] (0x4000):
releasing operation connection

Not sure if this is relevant:

(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [ldb] (0x4000): cancel ldb
transaction (nesting: 1)
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sysdb_add_basic_netgroup]
(0x0400): Error: 17 (File exists)
(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [ldb] (0x4000): start ldb
transaction (nesting: 1)

Or this:

(Tue Sep 1 13:30:56 2015) [sssd[be[bskyb.com]]] [sysdb_search_by_name]
(0x0400): No such entry
(Tue Sep 1 13:30:56 2015) [sssd[be[bskyb.com]]] [sysdb_delete_group] (0x0400):
Error: 2 (No such file or directory)
(Tue Sep 1 13:30:56 2015) [sssd[be[bskyb.com]]] [acctinfo_callback] (0x0100):
Request processed. Returned 0,0,Success
(Tue Sep 1 13:30:56 2015) [sssd[be[bskyb.com]]] [sdap_process_result]
(0x2000): Trace: sh[0x10191c0], connected[1], ops[(nil)], ldap[0x10cfe90]
(Tue Sep 1 13:30:56 2015) [sssd[be[bskyb.com]]] [sdap_process_result]
(0x2000): Trace: ldap_result found nothing!


sssd_nss log:

(Sun Aug 30 03:51:56 2015) [sssd[nss]] [nss_cmd_getgrgid_search] (0x0080): No
matching domain found for [20]
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [nss_cmd_getgrgid_search] (0x0080): No
matching domain found for [10002]
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [nss_cmd_getgrgid_search] (0x0080): No
matching domain found for [70644]
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [nss_cmd_getgrgid_search] (0x0080): No
matching domain found for [70796]
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup filenet_ibm_temp_prd (domain bskyb.com)
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [filenet_ibm_temp_prd], fail!
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup sb-actuate-prd-sup-su (domain bskyb.com)
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [sb-actuate-prd-sup-su], fail!
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup sb-migration-team-ng (domain bskyb.com)
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [sb-migration-team-ng], fail!
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0010): getnetgr
call returned more than one result !?!
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup filenet_ibm_temp (domain bskyb.com)
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [filenet_ibm_temp], fail!
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup oprbatch-prd-sup-ng (domain bskyb.com)
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [oprbatch-prd-sup-ng], fail!
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup kenan-tst-sup-ng (domain bskyb.com)
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [kenan-tst-sup-ng], fail!
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup star-testers-ng (domain bskyb.com)
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [star-testers-ng], fail!
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup conceptwave-ng (domain bskyb.com)
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [conceptwave-ng], fail!
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup dunarbordev-ng (domain bskyb.com)
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [dunarbordev-ng], fail!
(Sun Aug 30 03:51:56 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup star-mig-ng (domain bskyb.com)
(Sun Aug 30 03:51:57 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [star-mig-ng], fail!
(Sun Aug 30 03:51:57 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup chordif-su (domain bskyb.com)
(Sun Aug 30 03:51:57 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [chordif-su], fail!
(Sun Aug 30 03:51:57 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup ProductCatalogue-ng (domain bskyb.com)
(Sun Aug 30 03:51:57 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [ProductCatalogue-ng], fail!
(Sun Aug 30 03:51:57 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup cdmsup-ng (domain bskyb.com)
(Sun Aug 30 03:51:57 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [cdmsup-ng], fail!
(Sun Aug 30 03:51:57 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup darwintmp-ng (domain bskyb.com)
(Sun Aug 30 03:51:57 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [darwintmp-ng], fail!
(Sun Aug 30 03:51:57 2015) [sssd[nss]] [lookup_netgr_step] (0x0040): No results
for netgroup smcora-ng (domain bskyb.com)
(Sun Aug 30 03:51:57 2015) [sssd[nss]] [lookup_netgr_step] (0x0080): No
matching domain found for [smcora-ng], fail!
...
this goes on...

Also, keep getting the following as well:
(Sun Aug 30 04:25:11 2015) [sssd[nss]] [nss_cmd_getpwuid_search] (0x0080): No
matching domain found for [76079]
(Sun Aug 30 04:34:41 2015) [sssd[nss]] [nss_cmd_getpwuid_search] (0x0080): No
matching domain found for [76079]
(Sun Aug 30 04:36:56 2015) [sssd[nss]] [nss_cmd_getgrgid_search] (0x0080): No
matching domain found for [20]
(Sun Aug 30 04:36:56 2015) [sssd[nss]] [nss_cmd_getgrgid_search] (0x0080): No
matching domain found for [10002]
(Sun Aug 30 04:36:56 2015) [sssd[nss]] [nss_cmd_getgrgid_search] (0x0080): No
matching domain found for [70644]
(Sun Aug 30 04:36:56 2015) [sssd[nss]] [nss_cmd_getgrgid_search] (0x0080): No
matching domain found for [70796]

and:
(Tue Sep 1 12:14:45 2015) [sssd[nss]] [nss_cmd_getpwuid_search] (0x0100):
Requesting info for [10187159@bskyb.com]
(Tue Sep 1 12:14:45 2015) [sssd[nss]] [nss_cmd_getpwuid_search] (0x0400):
Returning info for uid [10187159@bskyb.com]
(Tue Sep 1 12:14:45 2015) [sssd[nss]] [nss_cmd_getpwuid_search] (0x0080): No
matching domain found for [10187159]

it looks like:
$ grep nss_cmd_getpwuid_search sssd_nss.log | cut -d ' ' -f 7-15 | sort | uniq
[nss_cmd_getpwuid_search] (0x0080): No matching domain found for [10155853]
[nss_cmd_getpwuid_search] (0x0080): No matching domain found for [25014683]
[nss_cmd_getpwuid_search] (0x0080): No matching domain found for [25015226]
[nss_cmd_getpwuid_search] (0x0080): No matching domain found for [76079]
[sssd[nss]] [nss_cmd_getpwuid_search] (0x0080): No matching domain found for
[10155853]
[sssd[nss]] [nss_cmd_getpwuid_search] (0x0080): No matching domain found for
[10187159]
[sssd[nss]] [nss_cmd_getpwuid_search] (0x0080): No matching domain found for
[25014683]
[sssd[nss]] [nss_cmd_getpwuid_search] (0x0080): No matching domain found for
[76079]
[sssd[nss]] [nss_cmd_getpwuid_search] (0x0100): Requesting info for
[10187159@bskyb.com]
[sssd[nss]] [nss_cmd_getpwuid_search] (0x0100): Requesting info for
[76079@bskyb.com]
[sssd[nss]] [nss_cmd_getpwuid_search] (0x0400): Returning info for uid
[10187159@bskyb.com]
[sssd[nss]] [nss_cmd_getpwuid_search] (0x0400): Returning info for uid
[76079@bskyb.com]

and most of the queries are for 76079:
$ grep nss_cmd_getpwuid_search sssd_nss.log | grep 76079 | wc -l
313

(Tue Sep 1 11:48:41 2015) [sssd[nss]] [nss_cmd_getpwuid_search] (0x0080): No
matching domain found for [76079]
(Tue Sep 1 11:58:38 2015) [sssd[nss]] [nss_cmd_getpwuid_search] (0x0080): No
matching domain found for [76079]
(Tue Sep 1 12:08:26 2015) [sssd[nss]] [nss_cmd_getpwuid_search] (0x0080): No
matching domain found for [76079]
(Tue Sep 1 12:18:19 2015) [sssd[nss]] [nss_cmd_getpwuid_search] (0x0100):
Requesting info for [76079@bskyb.com]
(Tue Sep 1 12:18:19 2015) [sssd[nss]] [nss_cmd_getpwuid_search] (0x0400):
Returning info for uid [76079@bskyb.com]
(Tue Sep 1 12:18:19 2015) [sssd[nss]] [nss_cmd_getpwuid_search] (0x0080): No
matching domain found for [76079]


Will be attaching debug logs.
Per Lukas, might need to check further why it is trying to store info twice:


(Tue Sep 1 12:14:45 2015) [sssd[be[bskyb.com]]] [sysdb_add_basic_netgroup]
(0x0400): Error: 17 (File exists)


Don't think it relates to what Pavel & Pavel were working on.
Thank you for looking into this!
}}} => When a non-existant netgroup is requested, the sssd_nss memory usage goes up.

resolution: => fixed
status: assigned => closed

Metadata Update from @preichl:
- Issue assigned to preichl
- Issue set to the milestone: SSSD 1.13.4

2 years ago

Login to comment on this ticket.

Metadata