#1570 Keep track of per-PID/cgroup/process group fds in the PAM responder.
Closed: wontfix a year ago by pbrezina. Opened 8 years ago by jhrozek.

This is another idea that came out of an IRC discussion between me, Sumit and Simo. Simo proposed that we keep track of open fds and when the count of open file descriptors reaches a certain limit, then kill the oldest one (=the one that had been open the longest). The question is, what should be used as the key in such structure - we can group the fds by PID, UID, process group or cgroup. The limit needs to be tunable.

Full discussion follows:

7:13 < simo> jhrozek: ok closing idle connections is ok
17:13 < simo> but does not protect you from malicious or simply badly misbehaving clients
17:14 < simo> jhrozek: I think we need to protect ourselves a bit more
17:14 < simo> and that can be relatively easily done with not too great changes 
17:14 < jhrozek> well, we would reach the fd limit soon with a misbehaving application
17:14 < simo> the way to do it is to keep a per-pid list of open sockets
17:14 < simo> perhaps in a hash table
17:15 < simo> if the same pid tries to open more than X connection we simply go and kill the least used one to make space
17:15 < simo> it also means keeping a timestamp associated with the fd that marks when a connection was used last
17:15 < simo> this way we can have a tunable parm that says something like: no more than 15 connections per pid
17:16 < jhrozek> wouldn't a malicious application simply fork and open a new fd in a child?
17:16 < simo> a very bad app could still fork children though ...
17:16 < simo> actually we could have a limit per process group
17:16 < simo> or even per user
17:16 < simo> but exempt root
17:16 < simo> or maybe we should have a combination of limits
17:17 < jhrozek> I think we should keep the limits simple
17:17 < simo> per-pid + per-cgroup + per-user
17:17 < jhrozek> to make sure we're not denying legitimate access
17:17 < simo> jhrozek: you can, but then sssd_pam can be easily abused and DoSed
17:18 < simo> all it takes is a user running a bash script that forks a child and cat the pam pipe
17:18 < simo> and presto you will use all FDs
17:18 < simo> so some limits that prevents that should be put in place on the server side
17:21 < jhrozek> OK
17:22 < jhrozek> I'll write it up into a ticket, but I still think the limits should be kept simple at least for configurations's sake
17:22 < jhrozek> X allowed connections per process group is much easier to understand than a combination of factors
17:22 < simo> jhrozek: oh yeah we need to try to make them genrally not something you want to actually explicitly configure unless you have so weird setup
17:22 < simo> jhrozek: yeah we can start simple and see if that suffices
17:23 < simo> jhrozek: I would think of using cgroups though
17:23 < simo> but not sure
17:23 < simo> the problem is that it is easy to create a new process group as a user
17:23 < simo> so it won't be effective
17:23 < jhrozek> I know little about cgroups
17:23 < simo> maybe we should simply have a relatively large X per-user
17:23 < simo> like 50 per-uid
17:23 < simo> tunable
17:24 < simo> root exampted
17:24 < simo> *exempted
17:24 < simo> for now
17:24 < jhrozek> and your proposal was to keep a timestamp or keep them ordered based on time and when the limit is reached, then go and kill the last one?
17:26 < simo> jhrozek: the first one (oldest) one
17:26 < simo> I assume that a misbehaving app is simply leaking fds
17:26 < simo> so killing the oldest should be safe
17:27 < jhrozek> simo: right, that's what I meant by "last".
17:27 < simo> for active DoSs I do not care which one is killed, they are all bad connections

I'd do this generic for all responders not just PAM. It should be "easily" hidden inside responder common code.

It seems that hash and collection interfaces from the ding-libs would be good primitives to use. It will be a two level object. The top level object the hash table of users. The key is the user and the value is the collection object. Collection will store the list of pids in the order of creation.

Fields changed

milestone: NEEDS_TRIAGE => SSSD 1.11 beta
rhbz: [https://bugzilla.redhat.com/show_bug.cgi?id=826192 826192] => [https://bugzilla.redhat.com/show_bug.cgi?id=826192 826192] todo

Fields changed

milestone: SSSD 1.11 beta => SSSD 1.12 beta

Fields changed

changelog: =>
design: =>
design_review: => 0
fedora_test_page: =>
mark: => 0
priority: major => minor
review: => 0
selected: =>
sensitive: => 0

Might make sense, but not in 1.14..

milestone: SSSD 1.14 beta => SSSD 1.15 beta

Metadata Update from @jhrozek:
- Issue set to the milestone: SSSD Future releases (no date set yet)

4 years ago

Metadata Update from @thalman:
- Custom field design_review reset (from 0)
- Custom field mark reset (from 0)
- Custom field patch reset (from 0)
- Custom field review reset (from 0)
- Custom field sensitive reset (from 0)
- Custom field testsupdated reset (from 0)
- Issue close_status updated to: None
- Issue tagged with: Canditate to close

a year ago

Metadata Update from @thalman:
- Custom field design_review reset (from false)
- Custom field mark reset (from false)
- Custom field patch reset (from false)
- Custom field review reset (from false)
- Custom field sensitive reset (from false)
- Custom field testsupdated reset (from false)
- Issue untagged with: Canditate to close
- Issue tagged with: Patch is welcome

a year ago

Thank you for taking time to submit this request for SSSD. Unfortunately this issue was not given priority and the team lacks the capacity to work on it at this time.

Given that we are unable to fulfill this request I am closing the issue as wontfix.

If the issue still persist on recent SSSD you can request re-consideration of this decision by reopening this issue. Please provide additional technical details about its importance to you.

Thank you for understanding.

Metadata Update from @pbrezina:
- Issue close_status updated to: wontfix
- Issue status updated to: Closed (was: Open)

a year ago

SSSD is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in SSSD's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/SSSD/sssd/issues/2612

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Login to comment on this ticket.