30c4f30 Ticket 49290 - improve idl handling in complex searches

Authored and Committed by William Brown 6 years ago
    Ticket 49290 - improve idl handling in complex searches
    
    Bug Description:  Previously we performed iterative set manipulations
    during searches. For example, (&(a)(b)(c)(d)) would result in:
    
    t1 = intersect(a, b)
    t2 = intersect(t1, c)
    t3 = intersect(t2, d)
    
    This is similar for or with union. The issue is that if the candidate
    sets were large (until d), then we would allocate and duplicate a large
    set of ID's repeatedly, until the set was reduced in the third step.
    
    The same is true for unions, but the set would continue to grow. As well
    due to the sorted nature of the IDList, we would perform an O(m + n)
    merge of the arrays, potentially many times. This would be amplified
    the more terms added.
    
    Especially in the and case, this could result in significant time
    differences if you move smaller candidate sets to the start of the
    query, and for the or case, would result in exponential time
    increases by adding extra terms.
    
    Fix Description:  To resolve this, rather than processing each
    set in an interative fashion, we store each subfilters result in
    the new idl_set structure. When we have collected each candidate
    set we can then perform better intersections. This is done through
    k-way intersection and k-way unions, which prevent intermediate
    allocations during processing.
    
    A benefit of this change, is that future improvements to not/or
    queries are easier to make because we now have "perfect" knowledge
    of all the IDLists involved, rather than just pairs at a time.
    
    https://pagure.io/389-ds-base/issue/49290
    
    Author: wibrown
    
    Review by: tbordaz, lkrispen (Thanks!)
    
        
file modified
+1 -0