#6691 LDAP schema cache assumes persistent state between requests
Opened 7 years ago by cheimes. Modified 7 years ago

LDAP schema retrieval and parsing is a rather costly operation. For this reason, {{{ipapython.ipaldap}}} caches schema data. FreeIPA's high-level LDAP library uses a module-global variable {{{schema_cache}}} to cache an LDAP server's schema. The schema is not prefetched at startup but downloaded and cached on demand.
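To make the pattern concrete, here is a minimal sketch of an on-demand, module-global cache. It is not the actual {{{ipapython.ipaldap}}} code: {{{fetch_schema}}}, {{{get_schema}}} and {{{stats}}} are hypothetical names, and the expensive LDAP download is replaced by a stand-in that only counts calls.

```python
# Hypothetical sketch of the caching pattern; not FreeIPA's real code.

# Module-global cache: it lives exactly as long as the Python process does.
schema_cache = {}

# Counts how often the "expensive" fetch runs (illustration only).
stats = {"fetches": 0}


def fetch_schema(url):
    """Stand-in for the costly LDAP schema download and parse."""
    stats["fetches"] += 1
    return {"url": url, "attributes": ["cn", "uid"]}


def get_schema(url):
    """Return the cached schema for url, fetching it on first use."""
    try:
        return schema_cache[url]
    except KeyError:
        schema = schema_cache[url] = fetch_schema(url)
        return schema
```

Calling {{{get_schema}}} twice in the same process fetches once. A freshly started process begins with an empty {{{schema_cache}}} and pays the full cost again.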

While the approach sounds like a good solution at first, it has some drawbacks. Most importantly, the implementation relies on the assumption that a module-global variable persists between WSGI requests. This assumption is dangerous and in general wrong for a {{{wsgi.multiprocess}}} WSGI server. It might be OK for a {{{wsgi.multithread}}} server; however, the FreeIPA framework is not compatible with multithreading. In general, a WSGI application must not assume that any state is retained between WSGI requests.

So far FreeIPA got lucky. We use Apache with the prefork MPM and mod_wsgi with daemon processes. The default mod_wsgi configuration uses two daemon processes with 500 max requests each. The pre-forking MPM does not handle each HTTP request in a separate fork. Instead it pre-forks worker processes, and each worker handles one WSGI request at a time. Other WSGI implementations use a fork approach: load the WSGI app in the main process, then handle each request in a newly forked child process. Werkzeug's WSGI server works that way.
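The fork-per-request case can be illustrated with a small POSIX-only sketch. It assumes {{{os.fork}}} is available, and {{{handle_request_in_child}}} is a hypothetical name, not part of any WSGI server. The child fills a module-global cache, but the parent, which would fork the next request handler, never sees that entry.

```python
import os

# Module-global cache, as in the pattern described in this ticket.
schema_cache = {}


def handle_request_in_child():
    """Fork a child that fills the cache; report whether the child saw it."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: populate its private copy of the cache.
        os.close(read_fd)
        schema_cache["schema"] = "parsed"
        os.write(write_fd, b"1" if "schema" in schema_cache else b"0")
        os._exit(0)
    # Parent process: wait for the child's report.
    os.close(write_fd)
    child_saw_entry = os.read(read_fd, 1) == b"1"
    os.close(read_fd)
    os.waitpid(pid, 0)
    return child_saw_entry


if __name__ == "__main__":
    print("child saw cached schema:", handle_request_in_child())
    print("parent cache after request:", schema_cache)
```

The child's write stays in the child's address space, so every forked request handler would start from the parent's empty cache.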

Problems with current approach:

  • The first request and every 500th request to a WSGI worker daemon are much slower, because a new or recycled worker starts with an empty cache.
  • The schema cache is not synchronized. If the schema changes, one WSGI worker may hold a different copy of the schema than other WSGI workers on the same machine.
  • The schema cache is not shared between WSGI workers. Each worker has its own copy in memory instead of a copy-on-write shared state.
  • ipaldap does not document that it relies on persistent state.

Metadata Update from @cheimes:
- Issue assigned to someone
- Issue set to the milestone: 0.0 NEEDS_TRIAGE

7 years ago

Parsing the LDAP schema is indeed expensive. To monitor LDAP schema changes, you can look up a specific attribute value, 'nsSchemaCSN':

ldapsearch -LLL ... -b "cn=schema" nsSchemaCSN
dn: cn=schema
nsSchemaCSN: 58b83abf000000000000

When the schema is updated (either through replication or by direct changes), this value changes.

Could WSGI workers use this attribute to monitor LDAP schema changes?
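One way a worker might use this attribute is sketched below, under stated assumptions: {{{CSNCheckedSchemaCache}}}, {{{read_csn}}} and {{{fetch_schema}}} are hypothetical names, and the two callables stand in for a cheap {{{nsSchemaCSN}}} lookup and the expensive schema download. Nothing here is existing FreeIPA API.

```python
class CSNCheckedSchemaCache:
    """Cache a parsed schema and refetch it only when the CSN changes.

    read_csn and fetch_schema are caller-supplied callables (hypothetical):
    read_csn does a cheap lookup of nsSchemaCSN on cn=schema, and
    fetch_schema does the expensive download and parse.
    """

    def __init__(self, read_csn, fetch_schema):
        self._read_csn = read_csn
        self._fetch_schema = fetch_schema
        self._csn = None
        self._schema = None

    def get(self):
        # Read the CSN before fetching. If the schema changes in between,
        # the stored CSN is older than the schema, so the next call sees a
        # mismatch and refetches. The cache can be redundant, never stale.
        csn = self._read_csn()
        if self._schema is None or csn != self._csn:
            self._schema = self._fetch_schema()
            self._csn = csn
        return self._schema
```

This trades one small search per request for the full schema parse. It would address the synchronization problem between workers, but not the cold-start cost of a new worker.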

Metadata Update from @pvoborni:
- Issue close_status updated to: None
- Issue set to the milestone: Future Releases (was: 0.0 NEEDS_TRIAGE)

7 years ago
