From 57ff36a3c3b39a386802b0de170fa8197d911536 Mon Sep 17 00:00:00 2001 From: Fabiano FidĂȘncio Date: Jul 14 2017 09:27:19 +0000 Subject: Add "Fast NSS Cache" design page A copy & paste from the old one. Signed-off-by: Fabiano FidĂȘncio --- diff --git a/design_pages/fast_nss_cache.rst b/design_pages/fast_nss_cache.rst new file mode 100644 index 0000000..ed1347e --- /dev/null +++ b/design_pages/fast_nss_cache.rst @@ -0,0 +1,97 @@ +FastNSSCache +============ + +Problem Statement +----------------- + +Currently in SSSD user and group lookups are performed by using a +connection to a named socket. This means each client has to talk to the +sssd\_nss daemon for every lookup it needs to perform. + +Although this behaviour works well for normal machines, it has +scalability limits on busy machines with many processes that need to +query the user/group database frequently. + +There are 2 factors that cause scalability issues: + +#. context switches +#. the responder process is single-threaded (although asynchronous) so + the amount of processing it can do is limited by the speed of 1 cpu + +Each request suffers from at least 2 context switches (and a few copies +of the data in memory) to write() the request in, wait until it is +processed by sssd\_nss, read() the reply back. Because sssd\_nss may be +busy answering many requests a queue may build up and replies be +delayed. + +Allowing the clinets to direcly access SSSD caches is not possible for +various reasons including: + +- sssd uses LDB as caching backend and LDB depends on byte range locks. + Letting a client read access to the cache would allow DoS if the + client lcoks a record and never unlock it. +- sssd stores data not all clients are allowed to get access to + (password hashes for example) and partitioning access to this data + within the LDB cache is not feasible. + +A method to avoid context switches and the sssd\_nss bottleneck without +compromising th security of the system is therefore desirable. + +Overview of FastNSSCache solution +--------------------------------- + +The FastNSSCache feature addresses both issues. + +This is done by creating a specialized cache that have a few properties + +- Contains only public data (the same data available in a public passwd + or group files) +- Read only for clients +- Does not use locking and yet prevents access to inconsistent data +- Cache has a fixed size and uses a FIFO (for now) method to know which + entries to purge +- Fallback to named sockets if entry is not found in the Fast Cache + +Implementation details +---------------------- + +The cache files are opened on the client at the first query and mmaped +in the process memory, all accesses to data are therefore direct access +to memory and do not suffer from any context switch. They also happen in +parallel within each process with synchronization (in order to allow +updates) performed by using memory barriers. + +Cache files can only be used for direct lookups (by name or by uid/gid), +enumerations are \_never\_ handled via fast cache lookups by design, +they always fallback to socket communication. + +The "maps" currently available are the \_passwd\_ and \_group\_ maps, +each map has a file associated in the /var/lib/sss/mc directory which is +accessible read-only by clients. + +Configuration +------------- + +At the moement we plan to provide 3 parameters per map that can control +the caches. + +- Per map enablement parameter that allows to activate/deactiveate maps + invidiually. +- Per map cache size to fine tune the cache sizes in case space is at a + premium or the dataset does not fit the default cache. +- Expiration time for entries. + +Cache entries warrant a shorter expiration time than current LDB caches +because access to these entries is undetectable by sssd\_nss which +cannot decide how much an entry is required and whether a midway refresh +is needed. By shortening the FastNSSCache entries life time we incur in +the penalty of using the pipe from time to time but in turn we allow +ssd\_nss to decide whether it is required to refresh the entry or not. + +Future Improvements +------------------- + +- Better garbage collection on the server side, at the moment a FIFO is + used. +- Handle caching other nsswitch.conf database plugins in order to avoid + slow access to the files db diff --git a/design_pages/index.rst b/design_pages/index.rst index bd1c067..44f4984 100644 --- a/design_pages/index.rst +++ b/design_pages/index.rst @@ -34,6 +34,7 @@ When writing a design page, please start from the blank template. dbus_simple_api dbus_users_and_groups ddns_messages_update + fast_nss_cache matching_and_mapping_certificates subdomain_configuration kcm