LDAP schema retrieval and parsing is a rather costly operation. For this reason, {{{ipapython.ipaldap}}} caches schema data. FreeIPA's high-level LDAP library uses a module-global variable, {{{schema_cache}}}, to cache an LDAP server's schema. The schema is not prefetched at startup but downloaded and cached on demand.
While the approach sounds like a good solution at first, it has some drawbacks. Most importantly, the implementation relies on the assumption that a module-global variable persists between WSGI requests. This assumption is dangerous and in general wrong for a {{{wsgi.multiprocess}}} WSGI server. It might be OK for a {{{wsgi.multithread}}} server; however, the FreeIPA framework is not compatible with multithreading. In general, a WSGI application must not assume that any state is retained between WSGI requests.
So far FreeIPA got lucky. We use Apache with MPM prefork and mod_wsgi with daemon processes. The default mod_wsgi configuration uses two daemon processes with 500 max requests each. The pre-forking MPM does not handle each HTTP request in a separate fork. Instead, it pre-forks worker processes, and each worker handles one WSGI request at a time. Other WSGI implementations use a fork approach: load the WSGI app in the main process, then handle each request in a newly forked child process. Werkzeug's WSGI server works that way.
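To make the fork-per-request concern concrete, here is a minimal sketch (POSIX only, since it uses `os.fork`). It is not FreeIPA code: `get_schema`, `handle_request_in_child`, and the string standing in for a parsed schema are hypothetical, and the module-global dict only mimics the idea of `schema_cache`. Each simulated request runs in a freshly forked child, so whatever the child caches is lost when it exits.

```python
import os

# Module-global cache, similar in spirit to schema_cache (simplified stand-in).
schema_cache = {}


def get_schema(url):
    """Return (outcome, schema); outcome is "hit" or "miss"."""
    if url in schema_cache:
        return "hit", schema_cache[url]
    # Placeholder for the expensive download-and-parse step.
    schema_cache[url] = "parsed schema for " + url
    return "miss", schema_cache[url]


def handle_request_in_child(url):
    """Handle one simulated request in a newly forked child process."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: works on its own copy of the parent's memory.
        status = 1
        try:
            os.close(read_fd)
            outcome, _ = get_schema(url)
            os.write(write_fd, outcome.encode())
            status = 0
        finally:
            # Leave immediately without running the parent's cleanup handlers.
            os._exit(status)
    # Parent: collect the child's report and reap the child.
    os.close(write_fd)
    outcome = os.read(read_fd, 16).decode()
    os.close(read_fd)
    os.waitpid(pid, 0)
    return outcome
```

Calling `handle_request_in_child` twice with the same URL reports a miss both times, and the parent's `schema_cache` stays empty. Only if the parent itself calls `get_schema` before forking do the children start with a populated cache.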
Problems with current approach:
Metadata Update from @cheimes: - Issue assigned to someone - Issue set to the milestone: 0.0 NEEDS_TRIAGE
Parsing the LDAP schema is indeed expensive. To monitor LDAP schema changes, you can look up the value of a specific attribute, {{{nsSchemaCSN}}}:
ldapsearch -LLL ... -b "cn=schema" nsSchemaCSN
dn: cn=schema
nsSchemaCSN: 58b83abf000000000000
When the schema is updated (either through replication or direct changes), this value changes.
Could WSGI workers use this attribute to monitor LDAP schema changes?
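One way the suggestion could look is sketched below, under stated assumptions: the class `CsnValidatedSchemaCache` and the callables `read_csn` and `load_schema` are hypothetical, not part of {{{ipapython.ipaldap}}}. In a real worker, `read_csn` would be a cheap search for `nsSchemaCSN` on `cn=schema`, and `load_schema` would be the expensive download and parse. In-memory stand-ins are used here so the sketch runs without a directory server.

```python
class CsnValidatedSchemaCache:
    """Keep a parsed schema and reload it only when the CSN changes."""

    def __init__(self, read_csn, load_schema):
        self._read_csn = read_csn        # cheap: returns the current CSN string
        self._load_schema = load_schema  # expensive: fetches and parses the schema
        self._csn = None
        self._schema = None

    def get(self):
        csn = self._read_csn()
        if self._schema is None or csn != self._csn:
            self._schema = self._load_schema()
            # Remember the CSN read *before* loading. If the schema changed in
            # between, the next call sees a newer CSN and reloads again.
            self._csn = csn
        return self._schema


# In-memory stand-ins for a directory server (assumptions for illustration).
server = {"csn": "58b83abf000000000000", "loads": 0}


def read_csn():
    return server["csn"]


def load_schema():
    server["loads"] += 1
    return {"loaded_at_csn": server["csn"]}


cache = CsnValidatedSchemaCache(read_csn, load_schema)
cache.get()
cache.get()                              # CSN unchanged: no second load
server["csn"] = "58b83ac0000000000000"   # made-up CSN simulating a schema update
cache.get()                              # CSN changed: schema is reloaded
```

Each worker process would still hold its own copy of the cache; the check only keeps a long-lived worker from serving a stale schema, at the cost of one small search per lookup.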
Metadata Update from @pvoborni: - Issue close_status updated to: None - Issue set to the milestone: Future Releases (was: 0.0 NEEDS_TRIAGE)