#2602 Optimize cache writes to sysdb
Closed: Fixed None Opened 4 years ago by vokac.

Description of problem:
We are trying to replace our old NIS environment with AD and sssd-ad, but sssd is unacceptably slow compared to the NIS environment. After several tests I came to the conclusion that the persistent tdb/ldb backend is very slow when adding or modifying stored records.

Version-Release number of selected component (if applicable):
sssd-1.11.2-68.el7_0.6.x86_64
libldb-1.1.16-4.el7.x86_64
libtdb-1.2.12-3.el7.x86_64

How reproducible:
1. create user1 ... user4000 in AD
2. configure sssd-ad (with minimum number of modified options)
3. clear sssd cache: sss_cache -E
4. try to resolve all accounts: date; python -c 'import pwd; [ pwd.getpwnam(uid) for uid in ["user1", ..., "user4000"]]'; date
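
A slightly more explicit version of step 4, as a minimal sketch: it times the lookups with a monotonic clock and collects names that fail to resolve instead of aborting on the first KeyError. The helper name time_lookups and its lookup parameter are made up for this illustration, and the user1 ... user4000 naming just follows the steps above.

```python
import time


def time_lookups(names, lookup=None):
    """Resolve every name once; return (elapsed_seconds, missing_names).

    `lookup` defaults to pwd.getpwnam, which goes through NSS and
    therefore through sssd when it is configured as a passwd source.
    """
    if lookup is None:
        import pwd  # Unix only
        lookup = pwd.getpwnam

    missing = []
    start = time.monotonic()
    for name in names:
        try:
            lookup(name)
        except KeyError:
            # getpwnam raises KeyError when the account cannot be resolved
            missing.append(name)
    return time.monotonic() - start, missing


# Example run against the accounts from step 1 (hypothetical names):
# names = ["user%d" % i for i in range(1, 4001)]
# elapsed, missing = time_lookups(names)
# print("resolved %d names in %.1fs, %d missing"
#       % (len(names) - len(missing), elapsed, len(missing)))
```

Run it once right after `sss_cache -E` and once again with a warm cache to compare the two cases.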

Additional info:
Calling getpwnam 4000 times when SSSD has an empty cache takes ~ 20 minutes on our system (the real-life scenario is GUI access to /home that hangs for 20 minutes!). This slow behavior is not caused by LDAP queries to AD, because when I run all the LDAP queries directly they finish in less than 10s. Strace on sssd_be revealed that significant time was spent "syncing" data. As a test I recompiled sssd with the LDB_FLG_NOSYNC ldb_connect option, and the same operation that took ~ 20 minutes finished in ~ 3 minutes (still not really satisfactory). It looks like the tdb/ldb persistent storage is not able to provide good performance for sssd.
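
To illustrate why per-record syncing can dominate the runtime, here is a rough analogy in plain Python. It is not ldb/tdb code and says nothing about how sssd actually writes its cache; it only compares appending small records with and without an fsync after each one. The function name timed_writes and the record format are made up for this sketch, and the actual numbers depend heavily on the disk and filesystem.

```python
import os
import tempfile
import time


def timed_writes(path, count, sync):
    """Write `count` small records to `path` and return the elapsed seconds.

    With sync=True every record is flushed and fsync'ed before the next
    one is written, roughly like committing one transaction per record.
    With sync=False the OS is free to batch the writes.
    """
    start = time.monotonic()
    with open(path, "wb") as f:
        for i in range(count):
            f.write(b"record %d\n" % i)
            if sync:
                f.flush()
                os.fsync(f.fileno())
    return time.monotonic() - start


# Example comparison (timings vary by machine, so none are shown here):
# with tempfile.TemporaryDirectory() as d:
#     slow = timed_writes(os.path.join(d, "sync.db"), 4000, sync=True)
#     fast = timed_writes(os.path.join(d, "nosync.db"), 4000, sync=False)
#     print("sync: %.2fs  nosync: %.2fs" % (slow, fast))
```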


We're working on improving the performance in 1.13. In the meantime, did you try enabling enumeration or the background refresh (very new feature, in git only)? For 4000 users, it might be doable, but performance depends on your environment.

Alternatively, you can try symlinking the cache to /dev/shm.

We should take a look at the number of transactions together with #1370

milestone: NEEDS_TRIAGE => SSSD 1.13 backlog

Replying to [comment:1 jhrozek]:

We're working on improving the performance in 1.13. In the meantime, did you try enabling enumeration or the background refresh (very new feature, in git only)? For 4000 users, it might be doable, but performance depends on your environment.

With enumeration enabled, sssd works better in a few specific situations. With a clean cache it took ~ 35s to resolve all uid<->uidNumber mappings (running just the LDAP queries logged by sssd took ~ 10s). Unfortunately this doesn't fix all issues, because after entry_cache_timeout sssd probably no longer resolves all information in one transaction, and that means resolving all names took ~ 20 minutes again.

Even with background refresh (I have not tested that functionality yet), it seems to me that 20 minutes of locking/syncing the persistent cache database backend every entry_cache_timeout is too high a price for obtaining passwd/group data.

Alternatively, you can try symlinking the cache to /dev/shm.

With this configuration I get the same performance (~ 3 minutes) as with the LDB_FLG_NOSYNC ldb_connect option.

Fields changed

rhbz: => 0

Changing the topic based on the feature designs we did with the other sssd developers during face-to-face meetings, over the phone and on IRC. I will post a full design page later, but the tl;dr is:
- we will split the sysdb cache into two, one that stores the entries themselves and one that stores the timestamps
- avoid cache writes unless the entry itself had changed
- use the LDB_FLG_NOSYNC option with the timestamp sysdb to speed up writes there. If this cache is lost, we only lose the timestamps
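
The three points above can be sketched in a few lines of Python. This is only a toy model of the idea, not the sysdb design or its API: two dicts stand in for the two caches, and the class name SplitCache, its methods, and the entry_writes counter are all made up for illustration.

```python
import time


class SplitCache:
    """Toy model: entries and timestamps live in separate stores."""

    def __init__(self):
        self.entries = {}      # stands in for the persistent, synced cache
        self.timestamps = {}   # stands in for the timestamp cache (no sync)
        self.entry_writes = 0  # counts the expensive writes

    def store(self, name, attrs, now=None):
        """Save an entry, touching the entry store only if it changed."""
        now = time.time() if now is None else now
        if self.entries.get(name) != attrs:
            self.entries[name] = dict(attrs)
            self.entry_writes += 1
        # The cheap timestamp update happens on every refresh.
        self.timestamps[name] = now

    def is_expired(self, name, timeout, now=None):
        """An entry with no timestamp (e.g. lost cache) counts as expired."""
        now = time.time() if now is None else now
        stamp = self.timestamps.get(name)
        return stamp is None or now - stamp >= timeout


cache = SplitCache()
cache.store("user1", {"uidNumber": 1001}, now=0)
cache.store("user1", {"uidNumber": 1001}, now=100)  # unchanged: no entry write
cache.store("user1", {"uidNumber": 1002}, now=200)  # changed: one more write
```

In this model, refreshing an unchanged entry costs only the timestamp update, and dropping `timestamps` entirely just makes every entry look expired, so the next lookup refreshes it.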

Other incremental improvements will be tracked in separate tickets. Also moving to 1.14 Alpha, since performance will be one of the main topics in the 1.14 release.

milestone: SSSD 1.13 backlog => SSSD 1.14 alpha
priority: major => blocker
sensitive: => 0
summary: Very slow persistent tdb/ldb storage for sssd => Optimize cache writes to sysdb

Fields changed

owner: somebody => jhrozek
status: new => assigned

Fields changed

patch: 0 => 1

The patches are on review, but I would like to release 1.14 alpha today, therefore moving to 1.14 beta.

milestone: SSSD 1.14 alpha => SSSD 1.14 beta

Fields changed

resolution: => fixed
status: assigned => closed

Metadata Update from @vokac:
- Issue assigned to jhrozek
- Issue set to the milestone: SSSD 1.14 beta

2 years ago
