Ticket 49372 - filter optimisation improvements for common queries
Bug Description: Due to the way we apply indexes to searches
and the presence of the "filter test threshold", a number of
queries could be made faster if their authors understood the
internals of our idl_set and index mechanisms. Rather than expecting
application authors to know this, the server should apply these
optimisations itself.
Fix Description: In the server we have some cases we want to
achieve, and some to avoid:
* If a union has an unindexed candidate, we throw away all work
and return an ALLIDS idl.
* In an intersection, if we have an idl that is smaller than the
filter test threshold, we immediately return that idl
rather than accessing the others, and perform a filter
test instead.
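
The two properties above can be illustrated with a toy model. This is
only a sketch, not the server's implementation: idls are plain Python
sets, and ALLIDS, FILTER_TEST_THRESHOLD, INDEX, lookup(), union() and
intersection() are hypothetical names and values chosen for
illustration. Each function also reports how many index lookups it
performed, which is the cost the reordering tries to reduce.

```python
# Toy model only: real idls are idl_set structures, not Python sets.
ALLIDS = None               # hypothetical marker: "every entry is a candidate"
FILTER_TEST_THRESHOLD = 10  # assumed value, for illustration only

# Hypothetical index contents keyed by filter component.
INDEX = {
    "cn=a": {1},
    "cn=b": {2},
    "uid=william": {7},
    "objectClass=account": set(range(1000)),
    "objectClass=posixAccount": set(range(1000)),
}


def lookup(component):
    """Return the idl for a component, or ALLIDS if it is unindexed."""
    return INDEX.get(component, ALLIDS)


def union(components):
    """Return (idl, lookups). One unindexed component discards all work."""
    result, opened = set(), 0
    for comp in components:
        opened += 1
        idl = lookup(comp)
        if idl is ALLIDS:
            return ALLIDS, opened
        result |= idl
    return result, opened


def intersection(components):
    """Return (idl, lookups). A small idl short-circuits to a filter test."""
    result, opened = ALLIDS, 0
    for comp in components:
        opened += 1
        idl = lookup(comp)
        if idl is ALLIDS:
            continue  # an unindexed component cannot narrow the result
        if len(idl) < FILTER_TEST_THRESHOLD:
            return idl, opened  # remaining indexes are never opened
        result = idl if result is ALLIDS else result & idl
    return result, opened


# Same filters, different order, different number of lookups.
print(union(["cn=a", "cn=b", "sudoHost=*web*"])[1])   # 3 lookups
print(union(["sudoHost=*web*", "cn=a", "cn=b"])[1])   # 1 lookup
print(intersection(["objectClass=account",
                    "objectClass=posixAccount", "uid=william"])[1])  # 3
print(intersection(["uid=william", "objectClass=account",
                    "objectClass=posixAccount"])[1])                 # 1
```

In both cases the result is identical; only the amount of index work
differs.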
Knowing these two properties, we can now look at improving filters
for queries.
In a common case, SSSD will give us a query which is a union of
host cn and sudoHost rules. However, the sudoHost rules are
substring searches that are not able to be indexed - thus the whole
filter becomes an unindexed search. For example:
(|(cn=a)(cn=b)(cn= ....)(sudoHost=[*]*))
So in this case we want to move the substring component to the front
of the filter so that, if it is unindexed, we fail immediately with
ALLIDS rather than opening the cn index first.
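
A minimal sketch of this reordering, assuming each filter component
has already been parsed into an (attribute, operator, value) tuple.
The tuple layout and the names is_substring() and reorder_union() are
hypothetical and are not the server's API:

```python
def is_substring(component):
    """True for substring components such as (sudoHost=*web*).

    A bare '*' value is a presence test, not a substring search.
    """
    _attr, op, value = component
    return op == "=" and "*" in value and value != "*"


def reorder_union(components):
    """Move substring components to the front of a union.

    sorted() is stable, so the relative order inside each group
    (substrings, everything else) is preserved.
    """
    return sorted(components, key=lambda c: 0 if is_substring(c) else 1)


before = [("cn", "=", "a"), ("cn", "=", "b"), ("sudoHost", "=", "*web*")]
print(reorder_union(before))
```

A stable sort is used so that the only change is the one the rule
asks for; the components are otherwise left in the order the client
sent them.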
For intersection, we often see:
(&(objectClass=account)(objectClass=posixAccount)(uid=william))
The issue here is that the idls for account and posixAccount may
both contain 100,000 items. Even with idl lookthrough limits, we
don't know whether we will exceed them until we start to read the idls.
A better query is:
(&(uid=william)(objectClass=account)(objectClass=posixAccount))
Because the uid=william index will contain a single item, this
puts us below the filter test threshold, and we will not open the
objectClass indexes.
In fact, in an intersection, it is almost always better to perform
simple equalities first:
(&(uid=william)(modifyTimestamp>=...)(sn=br*)(objectClass=posixAccount))
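
The intersection case can be sketched the same way, again assuming
components are (attribute, operator, value) tuples. The names
is_plain_equality() and reorder_intersection() are hypothetical, and
objectClass equalities are deliberately not promoted because their
idls tend to be large:

```python
def is_plain_equality(component):
    """True for simple equalities such as (uid=william).

    Substring and presence values (anything containing '*') and
    range operators such as '>=' do not count.
    """
    _attr, op, value = component
    return op == "=" and "*" not in value


def reorder_intersection(components):
    """Put non-objectClass equality components first in an intersection."""
    def rank(component):
        attr = component[0].lower()
        if is_plain_equality(component) and attr != "objectclass":
            return 0
        return 1
    # Stable sort: order inside each group is left as the client sent it.
    return sorted(components, key=rank)


before = [
    ("objectClass", "=", "account"),
    ("objectClass", "=", "posixAccount"),
    ("uid", "=", "william"),
]
print(reorder_intersection(before))
```

A filter with no simple equality to promote, for example one made up
only of a range, a substring and an objectClass component, comes back
unchanged.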
In most other cases we will not greatly benefit from rearrangement:
the idls involved are large enough that we won't fall below the
filter test threshold. For example:
(&(modifyTimestamp>=...)(sn=br*)(objectClass=posixAccount))
would not be significantly better under any possible arrangement
without knowing the content of sn.
So in summary, our rules for improving queries are:
* unions-with-substrings should have substrings *first*
* intersection-with-equality should have all non-objectclass
equality filters *first*.
https://pagure.io/389-ds-base/issue/49372
Author: wibrown
Review by: lkrispen, mreynolds (Thanks!)