#490 Slow role performance when using a lot of roles
Closed: Fixed None Opened 7 years ago by mreynolds.

If you are using hundreds or thousands of roles, a search for a single role can take a very long time. Still gathering more information on how to reproduce.


Using test data having 86568 entries in total; 98 nsRoleDefinition entries and 61542 nsRoleDn among them...

Sample command line:
ldapsearch -LLLx -h localhost -p 389 -D 'cn=directory manager' -w password -b "dc=example,dc=com" "(nsrole=cn=CN0,o=O0,dc=example,dc=com)" nsrole
It returns 3291 entries with 8321 nsrole attribute values.

With the patch:
nsslapd-ndn-cache-enabled: on
No entries in cache: 0m49.308s
All entries in cache: 0m0.181s

nsslapd-ndn-cache-enabled: off
No entries in cache: 0m51.792s
All entries in cache: 0m0.210s

Without the patch:
nsslapd-ndn-cache-enabled: on
No entries in cache: 0m50.579s
All entries in cache: 0m9.599s

nsslapd-ndn-cache-enabled: off
No entries in cache: 0m52.727s
All entries in cache: 0m9.857s

The patch has no impact on the elapsed time needed to generate the virtual attributes (No entries in cache). But once the entries have been evaluated and placed in the entry cache, the improvement is visible (All entries in cache): about 0.2s with the patch versus more than 9.5s without it. Please note that if all the entries in the database have virtual attributes, this patch has no effect.

In addition, I tested with nsslapd-ndn-cache-enabled set to both on and off. The difference is not huge, but enabling it gives a steady improvement. I recommend enabling this functionality by default, or at least advertising it more (on 1.3.0 or newer)...
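To make the comparison concrete, here is a small sketch that converts the `time` values quoted above into seconds and computes how much faster the patched build is. The helper names (`parse_time`, `ratio`, `TIMINGS`) are made up for this illustration; only the timing strings come from the test results.

```python
import re


def parse_time(text: str) -> float:
    """Convert a shell `time` value such as '0m9.599s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unexpected time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# (build, nsslapd-ndn-cache-enabled) -> (no entries in cache, all entries in cache)
TIMINGS = {
    ("with patch", "on"): ("0m49.308s", "0m0.181s"),
    ("with patch", "off"): ("0m51.792s", "0m0.210s"),
    ("without patch", "on"): ("0m50.579s", "0m9.599s"),
    ("without patch", "off"): ("0m52.727s", "0m9.857s"),
}


def ratio(ndn_cache: str, column: int) -> float:
    """Unpatched time divided by patched time; column 0 is cold, 1 is warm."""
    before = parse_time(TIMINGS[("without patch", ndn_cache)][column])
    after = parse_time(TIMINGS[("with patch", ndn_cache)][column])
    return before / after


for ndn_cache in ("on", "off"):
    print(
        f"ndn cache {ndn_cache}: "
        f"cold x{ratio(ndn_cache, 0):.2f}, warm x{ratio(ndn_cache, 1):.1f}"
    )
```

With these numbers the warm-cache search comes out roughly 53 times faster with the ndn cache on and roughly 47 times faster with it off, while the cold-cache ratio stays close to 1.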

Bug description: Roles use the virtual attribute framework.
When a search whose filter includes nsrole, or whose return
attribute list contains nsrole, is processed, the virtual
attribute code examines the entry's watermark to check
whether its vattr values are still valid. If they are valid,
the values are used as if they were static. If they are not
valid, the entry is evaluated against the role definitions
and the dynamically generated virtual attributes are stored
in the list (e_virtual_attrs) with the proper watermark.

The current code additionally checks e_virtual_attrs to
determine whether the entry has already been evaluated. If it
is NULL, the code considers the entry not yet evaluated and
returns SLAPI_ENTRY_VATTR_NOT_RESOLVED even if the watermark
is valid. That is, all the entries which do not have virtual
attributes are unnecessarily re-evaluated every time a search
with nsrole is executed.

Fix description: If e_virtual_attrs is NULL and the watermark
is valid, this patch returns SLAPI_ENTRY_VATTR_RESOLVED_ABSENT
instead of SLAPI_ENTRY_VATTR_NOT_RESOLVED. By skipping the
unneeded nsrole evaluation, it speeds up searches on virtual
attributes once the virtual attribute values are placed in
the entries in memory.
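The difference between the two checks can be sketched with a toy model. This is not the server code (which is C): the `Entry` class, the integer watermark compared by equality, and the `count_evaluations` helper are all assumptions made for illustration, and SLAPI_ENTRY_VATTR_RESOLVED_EXISTS is assumed as the name of the third status. Only the NOT_RESOLVED and RESOLVED_ABSENT names and the NULL-plus-valid-watermark rule come from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Status names from the ticket; the values are arbitrary placeholders.
SLAPI_ENTRY_VATTR_NOT_RESOLVED = "not resolved"
SLAPI_ENTRY_VATTR_RESOLVED_ABSENT = "resolved, absent"
SLAPI_ENTRY_VATTR_RESOLVED_EXISTS = "resolved, exists"  # assumed name


@dataclass
class Entry:
    roles: List[str] = field(default_factory=list)  # what evaluation would find
    watermark: int = -1                             # -1: never evaluated
    virtual_attrs: Optional[List[str]] = None       # models e_virtual_attrs (None = NULL)


def status_before_patch(entry: Entry, current: int) -> str:
    # A NULL list is treated as "not evaluated", even with a valid watermark.
    if entry.watermark != current or entry.virtual_attrs is None:
        return SLAPI_ENTRY_VATTR_NOT_RESOLVED
    return SLAPI_ENTRY_VATTR_RESOLVED_EXISTS


def status_after_patch(entry: Entry, current: int) -> str:
    if entry.watermark != current:
        return SLAPI_ENTRY_VATTR_NOT_RESOLVED
    if entry.virtual_attrs is None:
        # Valid watermark and no values: already evaluated, nothing to add.
        return SLAPI_ENTRY_VATTR_RESOLVED_ABSENT
    return SLAPI_ENTRY_VATTR_RESOLVED_EXISTS


def count_evaluations(status: Callable[[Entry, int], str],
                      entries: List[Entry], current: int, searches: int) -> int:
    """Count role evaluations triggered over repeated nsrole searches."""
    evaluations = 0
    for _ in range(searches):
        for entry in entries:
            if status(entry, current) == SLAPI_ENTRY_VATTR_NOT_RESOLVED:
                evaluations += 1
                entry.watermark = current
                entry.virtual_attrs = list(entry.roles) or None
    return evaluations


def make_entries() -> List[Entry]:
    # One entry with a role, two without.
    return [Entry(roles=["cn=CN0,o=O0,dc=example,dc=com"]), Entry(), Entry()]


print("before patch:", count_evaluations(status_before_patch, make_entries(), 1, 3))
print("after patch: ", count_evaluations(status_after_patch, make_entries(), 1, 3))
```

In this model the entries without roles are re-evaluated on every search under the old check, while under the new check each entry is evaluated once per watermark.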

Replying to [comment:4 nhosoi]:

Can you run cn=monitor (on the cn=ldbm database entry), with the ndn cache enabled? I'm curious if the cache was large enough for your testcase. You can adjust the cache size, if it was not large enough, and probably get better results.

Thanks,
Mark

Can you put the str2simple fix in a separate ticket?

Replying to [comment:7 mreynolds]:

When I ran the test, I had set a sufficiently large size, "nsslapd-ndn-cache-max-size: 104857600", and I monitored the cache usage at the time. Judging from the numbers, the cache is fully functioning. 99% hit ratio!

This is for the "With the patch & nsslapd-ndn-cache-enabled: on" case.
normalizeddncachetries: 3613456
normalizeddncachehits: 3587850
normalizeddncachemisses: 25606
normalizeddncachehitratio: 99
currentnormalizeddncachesize: 3578776
maxnormalizeddncachesize: 104857600
currentnormalizeddncachecount: 25606

This is for the "Without the patch & nsslapd-ndn-cache-enabled: on" case.
normalizeddncachetries: 4819910
normalizeddncachehits: 4794304
normalizeddncachemisses: 25606
normalizeddncachehitratio: 99
currentnormalizeddncachesize: 3578776
maxnormalizeddncachesize: 104857600
currentnormalizeddncachecount: 25606
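As a quick cross-check of the two monitor dumps above, the following sketch recomputes the hit ratio and the cache fill level from the raw counters. The function names (`parse_counters`, `hit_ratio`, `fill_fraction`) are hypothetical helpers written for this note; the counter names and values are copied from the monitor output.

```python
WITH_PATCH = """\
normalizeddncachetries: 3613456
normalizeddncachehits: 3587850
normalizeddncachemisses: 25606
currentnormalizeddncachesize: 3578776
maxnormalizeddncachesize: 104857600
"""

WITHOUT_PATCH = """\
normalizeddncachetries: 4819910
normalizeddncachehits: 4794304
normalizeddncachemisses: 25606
currentnormalizeddncachesize: 3578776
maxnormalizeddncachesize: 104857600
"""


def parse_counters(text: str) -> dict:
    """Turn 'name: value' monitor lines into a dict of integers."""
    counters = {}
    for line in text.splitlines():
        name, _, value = line.partition(":")
        counters[name.strip()] = int(value)
    return counters


def hit_ratio(counters: dict) -> float:
    """Cache hits as a percentage of all lookups."""
    return 100.0 * counters["normalizeddncachehits"] / counters["normalizeddncachetries"]


def fill_fraction(counters: dict) -> float:
    """Current cache size as a fraction of the configured maximum."""
    return counters["currentnormalizeddncachesize"] / counters["maxnormalizeddncachesize"]


patched = parse_counters(WITH_PATCH)
unpatched = parse_counters(WITHOUT_PATCH)

for label, counters in (("with patch", patched), ("without patch", unpatched)):
    print(f"{label}: hit ratio {hit_ratio(counters):.2f}%, "
          f"cache {fill_fraction(counters):.1%} full")

extra = unpatched["normalizeddncachetries"] - patched["normalizeddncachetries"]
print("extra lookups without the patch:", extra)
```

Both runs show the same number of misses and a cache that is only a few percent full, so the cache size was not the limiting factor; the unpatched run simply performs more lookups.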

Following the suggestion from Rich, I separated the change on str2filter.c to another ticket/patch (https://fedorahosted.org/389/ticket/603).

Reviewed by Rich (Thank you!!)

Pushed to master: commit ae7e811

Pushed to 389-ds-base-1.3.0: commit 6d45cda


Reopening this ticket.

I can confirm that the commit in #512 has broken the functionality gained from the fix provided by Noriko in ticket #490.

commit 86f8b9f (ticket #512) appears to have broken the nsrole fix implemented in ae7e811 (ticket #490).

Test case:
The exact same data and config on 389-Directory/1.3.1.pre.a1.gitfebd0db and 389-Directory/1.3.2.a1.gita154ecf yield significantly different search times:
role search on 1.3.1 takes 0m2.279s
role search on 1.3.2 takes 2h37m

Re-closing this ticket. The regression mentioned in comment#16 is caused by ticket 512, so that will be dealt with in that ticket instead of here.

Metadata Update from @nhosoi:
- Issue assigned to nhosoi
- Issue set to the milestone: 1.3.0
