#49372 optimise filters for common query types
Closed: wontfix 3 years ago by spichugi. Opened 6 years ago by firstyear.

Issue Description

During filter processing we attempt to minimise the amount of work we do. Some filter elements are likely always to be costly relative to the filter test threshold.

By moving some common known expensive elements to the tail of the filter, we can avoid costly index lookups and improve common search times. A prime example is that most common searches are:

(&(objectClass=person)(uid=william))

Why should we load the entire person index, when uid=william has an idl of size 1, and thus triggers the filter test threshold.

A similar example would be:

(&(!(objectClass=nsTombstone))(uid=william))

Because the not candidate is likely to be expensive if we pushed this last, this would again, benefit from the filter test threshold.


Metadata Update from @mreynolds:
- Custom field component adjusted to None
- Custom field origin adjusted to None
- Custom field reviewstatus adjusted to None
- Custom field type adjusted to None
- Custom field version adjusted to None
- Issue set to the milestone: 1.4 backlog

6 years ago

0001-Ticket-49372-filter-optimisation-improvements-for-co.patch

Improve common filter behaviours. This moves filter foldeing (IE (&(&()) -> (&()) to the filter_optimise step.

Additionally, we re-arrange the location where we apply optimisation to the "last possible" moment, which changes some locations and how we free their filters. But this gives us the best possible behaviour of the idl/idl set components.

To highlight this difference, this is a search where idlistscanlimit can accommodate the entire objectClass idl:

BEFORE:

(&(objectClass=person)(uid=test0010083))  etime=0.0007833143
(&(uid=test0010083)(objectClass=person))  etime=0.0000664115

AFTER:

(&(objectClass=person)(uid=test0010083))  etime=0.0000760933
(&(uid=test0010083)(objectClass=person))  etime=0.0000976630

Note the difference in order of the two sets: for uid= first this is approximately 1/8th of the processing time compared to objectClass=person.

The only way to achieve "near" this is to reduce the idlistscan limit, but even then:

BEFORE with lower idlistscanlimit:

(&(objectClass=person)(uid=test0010083))  etime=0.0001458060

This value still is about twice the value of the after.

Additionally, on more complex queries this would have a greater impact. For example:

BEFORE with lower idlistscanlimit:

(&(objectClass=inetOrgPerson)(objectClass=person)(objectClass=posixAccount)(uid=test0010083))  etime=0.0002360271

AFTER:

(&(objectClass=inetOrgPerson)(objectClass=person)(objectClass=posixAccount)(uid=test0010083))  etime=0.0000616770

This still shows a reasonable improvement in processing time.

Hello.

    # LDAP_FILTER_PRESENT 135
    if f_choice == 0x87:
        print("%s(%s=*)" % (pad, f_un['f_un_type']))
    # LDAP_FILTER_APPROX
    elif f_choice == 0xa8:
        pass
    # LDAP_FILTER_LE
    elif f_choice == 0xa6:
        pass
    ...

Would it be better to get rid of the comments in this part and define constants or enums instead?

The patch causes multiple regressions. Failing tests:

48252
48265 (hangs the server 100% CPU)

I couldn't run any more tests due to the hang in 48265, so there could be more failures/regressions. Please run the full test suite before sending out the next patch. Thanks.

48265 triggers a heap-use-after free it looks like. I'll investigate this more.

one more general comment. I think the optimisation idea is good and should be implemented if the tests pass. What I would like to see is a test for a database with referrals and tests wit/without managed site control since the handling of this is also changed

Metadata Update from @firstyear:
- Issue assigned to firstyear

6 years ago

0001-Ticket-49372-filter-optimisation-improvements-for-co.patch

@lkrispen I've already done a referral test, and that's happy. Pretty much every search is the referral case in fact.

Hew managedsait case however I'm not so sure about. How would I test this?

Anyway, the patch fixes the issues with crash on those tests

25 passed in 36.17 seconds

48252 is still failing, but that's not related to this. I did a dump on the filters:

(|
    (&
        (0x6020003a5070 "objectClass"=0x6020003a5050 "nstombstone")
        (0x6020003a50f0 "uid"=0x6020003a50d0 "test_user1")
    )
    (0x6020003a5130 "objectclass"=0x6020003a5150 "referral")
)

(|
    (&
        (0x6020003a50f0 "uid"=0x6020003a50d0 "test_user1")
        (0x6020003a5070 "objectClass"=0x6020003a5050 "nstombstone")
    )
    (0x6020003a5130 "objectclass"=0x6020003a5150 "referral")
)

So we aren't mangaling these, I think we are hitting a different issue, perhaps https://pagure.io/389-ds-base/issue/49355

48252 test was fixed, did you rebase with master yet?

There's been recent updates to the ci tests, so I'd like to see the latest tests run again with your patch.

I may not have rebased this branch. I'll check again shortly.

Okay, I've rebased and master works, but with this it does not. I wonder if I'm losing a flag on the filter for when oc is set perhaps? I'll investigate some more,

WORKING:
(| flags:2
    (& flags:2
        (0x6020003a5070 "objectClass"=0x6020003a5050 "nstombstone") flags:0
        (0x6020003a50f0 "uid"=0x6020003a50d0 "test_user1") flags:0
    )
    (0x6020003a5130 "objectclass"=0x6020003a5150 "referral") flags:0
)

FAILING:
(| flags:2
    (& flags:2
        (0x6020003a50f0 "uid"=0x6020003a50d0 "test_user1") flags:0
        (0x6020003a5070 "objectClass"=0x6020003a5050 "nstombstone") flags:0
    )
    (0x6020003a5130 "objectclass"=0x6020003a5150 "referral") flags:0
)

I think that this could be because we require nstombstone to be the first filter, because if we do the uid filter first we get 0 in the idl, so we trigger filter_test_threshold and return too early. So this could be the cause,

Really, what causes the problem with this patch, is we re-order the AND/OR components when we prioritise them - Perhaps if flags contains tombstone, we don't optimise?

https://pagure.io/lib389/issue/102 @lkrispen This lib389 patch adds support for smart referral objects, and a test case to show they work with the managedsait control. I tested it with this patch enabled and it passes :)

Metadata Update from @mreynolds:
- Custom field reviewstatus adjusted to ack (was: None)

6 years ago

commit 4cd1a24
To ssh://git@pagure.io/389-ds-base.git
4060848..4cd1a24 master -> master

Thanks Mark and Ludwig!

Metadata Update from @firstyear:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

6 years ago

Look like this also fixes issues with scope one searches in 1.3.7/1.3.8.

Metadata Update from @mreynolds:
- Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1598186
- Issue set to the milestone: 1.3.7.0 (was: 1.4 backlog)

5 years ago

1.3.7:

5ad7f14 Ticket 49372 - filter optimisation improvements for common queries

1.3.8
f7a21ac Ticket 49372 - filter optimisation improvements for common queries

We need to revert this fix in all branches. Its breaking all kinds of things, and its holding up F27. Going to revert for now until we get everything sorted out...

Metadata Update from @mreynolds:
- Issue status updated to: Open (was: Closed)

5 years ago

Reverts...

Master: d06b5bb..14a10a3 master -> master

commit 14a10a3 (HEAD -> ticket49372)
Revert "Ticket 49372 - filter optimisation improvements for common queries
This reverts commit 4cd1a24.

commit 104968b
Revert "Ticket 49432 - filter optimise crash"
This reverts commit 5c89dd8.

1.3.8: 96f334a..3386445 389-ds-base-1.3.8 -> 389-ds-base-1.3.8

commit 3386445 (HEAD -> 389-ds-base-1.3.8)
Revert "Ticket 49372 - filter optimisation improvements for common queries"
This reverts commit f7a21ac.

commit 1c5b9b0
Revert "Ticket 49432 - filter optimise crash"
This reverts commit d8d57c9.

1.3.7: d3cf9f2..79d0b4d 389-ds-base-1.3.7 -> 389-ds-base-1.3.7

commit 79d0b4d (HEAD -> 389-ds-base-1.3.7)
Revert "Ticket 49372 - filter optimisation improvements for common queries"
This reverts commit 5ad7f14.

commit 24b84d1
Revert "Ticket 49432 - filter optimise crash"
This reverts commit 855d78b.

the backout also fixed the issue reported in bz 1616412 (patch for #48275 longer working).

also the patch for #49617 which was not committed is no longer needed.

But if we decide to put the optimization back we need to address these two issues

Reference:

https://bugzilla.redhat.com/show_bug.cgi?id=1616412

This is the bug where this enhancement originally broke things

@lkrispen Yep, there is a debbuging patch in PR to check to see what's going on here. I think it's in re-arrangement of onelevel searches because that code was improved in this patch.

@mreynolds Thanks, I'll have a look, but witohut the extra debug logging, it may not be easy to use this to piece together what went wrong :(

Metadata Update from @mreynolds:
- Issue set to the milestone: 1.4.2 (was: 1.3.7.0)

4 years ago

Metadata Update from @vashirov:
- Issue set to the milestone: 1.4 backlog (was: 1.4.2)

4 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/2431

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix
- Issue status updated to: Closed (was: Open)

3 years ago

Login to comment on this ticket.

Metadata