#1361 sssd_pam file descriptor leak /var/lib/sss/pipes/private/pam

Created 6 years ago by sgallagh
Modified a year ago

https://bugzilla.redhat.com/show_bug.cgi?id=826192 (Red Hat Enterprise Linux 6)

Description of problem:

Occasionally, sssd_pam leaks file descriptors to /var/lib/sss/pipes/private/pam
which causes logins to fail and cron jobs to not launch.

I noticed that it happens after a user closes their ssh session, then attempts
to log back in using a password. Normally the reconnect works fine, but it
randomly fails.  After sssd_pam gets borked, each failed ssh attempt leads to
another leaked file descriptor.  In our case, it's an automated process that
keeps retrying, and eventually we bump up against the open file descriptor limit.
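
One way to confirm this kind of growth is to sample the descriptor count of the suspect process over time. The following is a minimal sketch, assuming a Linux host with /proc mounted and enough privileges to read the target process's fd directory; the helper name `count_fds` is made up for illustration and is not part of SSSD.

```python
import os


def count_fds(pid="self"):
    """Return (total, sockets) open descriptor counts for a process.

    Reads /proc/<pid>/fd, so this is Linux-only and usually needs root
    when inspecting another user's process.
    """
    fd_dir = f"/proc/{pid}/fd"
    total = sockets = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # Descriptor was closed between listing and readlink.
            continue
        total += 1
        # Sockets (including UNIX pipes such as the PAM one) show up
        # as "socket:[inode]" rather than as a filesystem path.
        if target.startswith("socket:"):
            sockets += 1
    return total, sockets


if __name__ == "__main__":
    print(count_fds())
```

Calling `count_fds(pid)` with the PID of sssd_pam (for example, as reported by `pidof sssd_pam`) after each failed login attempt would show whether the socket count keeps climbing.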

I have a debug sssd_pam log, /var/log/secure log, and an strace of sssd_pam
when this issue occurred, but due to their sensitive nature, I can't post them
until this bug is marked private.

Version-Release number of selected component (if applicable):
Occurs in both sssd-1.5.1-66.el6_2.3.x86_64 and sssd-1.8.0-22.el6.x86_64

How reproducible:
Not sure yet.  It's sporadic, so I haven't been able to create a reproducer.

Steps to Reproduce:

Actual results:

Expected results:

Additional info:

Fields changed

blockedby: =>
blocking: =>
coverity: =>
feature_milestone: =>
milestone: NEEDS_TRIAGE => SSSD 1.9.0
tests: => 0
testsupdated: => 0
upgrade: => 0

The NSS and PAM pipes share common handling code. Because NSS needs to handle enumerations, we leave connections open by default.

However, in the PAM case we do not depend on a client connection to keep client-bound state, so we could avoid leaks by simply closing the connection after each PAM operation.

Given that authentication is usually an inherently slow operation and is performed rarely in most cases, the overhead of re-opening the pipe should be lost in the noise.

If the overhead is not negligible for some special authentication hubs that have a higher rate of PAM requests, we could conceivably provide a way to indicate that connections should be reused through an environment variable.
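
To make the proposal concrete, here is a rough sketch of the client-side behaviour in Python rather than the actual C client code. Everything in it is an assumption for illustration: the class `PipeClient`, the environment variable name `EXAMPLE_PAM_REUSE_CONNECTION`, and the trivial send/receive exchange, which does not reflect the real SSSD wire protocol.

```python
import os
import socket

# Hypothetical variable name; not an existing SSSD setting.
REUSE_ENV = "EXAMPLE_PAM_REUSE_CONNECTION"


class PipeClient:
    """Talks to a UNIX stream socket, closing it after each request
    unless connection reuse was requested via the environment."""

    def __init__(self, path):
        self.path = path
        self.reuse = os.environ.get(REUSE_ENV) == "1"
        self._sock = None

    def _connect(self):
        if self._sock is None:
            sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            try:
                sock.connect(self.path)
            except OSError:
                sock.close()  # do not leak the descriptor on failure
                raise
            self._sock = sock
        return self._sock

    def close(self):
        if self._sock is not None:
            self._sock.close()
            self._sock = None

    def request(self, payload: bytes) -> bytes:
        sock = self._connect()
        try:
            sock.sendall(payload)
            reply = sock.recv(4096)
        except OSError:
            self.close()  # never keep a socket in an unknown state
            raise
        if not self.reuse:
            self.close()  # default: one connection per operation
        return reply
```

With the variable unset, every `request()` call opens and closes its own connection, so a failed or abandoned operation cannot leave a descriptor behind. Setting it to `1` keeps the socket open between calls, which is the trade-off a high-volume authentication hub might prefer.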

Fields changed

owner: somebody => pbrezina
status: new => assigned

Fields changed

patch: 0 => 1

Fields changed

milestone: SSSD 1.9.0 => SSSD 1.9.1

It is unclear whether we still need this enhancement after dd94e9c. Putting back to NEEDS_TRIAGE for discussion.

milestone: SSSD 1.9.1 => NEEDS_TRIAGE
patch: 1 => 0

We discussed the proposal further with Simo and Sumit and decided on a different approach (two, actually). They are being tracked in #1569 and #1570.

Fields changed

resolution: => wontfix
status: assigned => closed

Fields changed

design: =>
design_review: => 0
fedora_test_page: =>
milestone: NEEDS_TRIAGE => SSSD 1.9.2

a year ago

Metadata Update from @sgallagh:
- Issue assigned to pbrezina
- Issue set to the milestone: SSSD 1.9.2
