#50379 Ticket 50349 - filter schema validation
Closed 3 years ago by spichugi. Opened 4 years ago by firstyear.
firstyear/389-ds-base 50349-filter-schema-check  into  master

file modified
+1
@@ -2142,6 +2142,7 @@ 

  	test/libslapd/counters/atomic.c \

  	test/libslapd/pblock/analytics.c \

  	test/libslapd/pblock/v3_compat.c \

+ 	test/libslapd/schema/filter_validate.c \

  	test/libslapd/operation/v3_compat.c \

  	test/libslapd/spal/meminfo.c \

  	test/plugins/test.c \

file modified
+5 -1
@@ -27,15 +27,19 @@ 

      ./configure --enable-debug --with-openldap --enable-cmocka --enable-asan

      make

      make lib389

-     make check

      sudo make install

      sudo make lib389-install

  

  Testing

  -------

  

+     make check

      sudo py.test -s 389-ds-base/dirsrvtests/tests/suites/basic/

  

+ To debug the make check items, you'll need libtool to help:

+ 

+     libtool --mode=execute gdb /home/william/build/ds/test_slapd

+ 
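For example, to stop inside the new cmocka test (a minimal sketch; the binary path is the one shown above, and break/run are ordinary gdb commands):

    libtool --mode=execute gdb /home/william/build/ds/test_slapd
    (gdb) break test_libslapd_schema_filter_validate_simple
    (gdb) run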

  More information

  ----------------

  

empty or binary file added
@@ -0,0 +1,59 @@ 

+ # --- BEGIN COPYRIGHT BLOCK ---

+ # Copyright (C) 2019 William Brown <william@blackhats.net.au>

+ # All rights reserved.

+ #

+ # License: GPL (version 3 or any later version).

+ # See LICENSE for details.

+ # --- END COPYRIGHT BLOCK ---

+ #

+ 

+ from lib389.topologies import topology_st

+ from lib389.dbgen import dbgen

+ from lib389.ldclt import Ldclt

+ from lib389.tasks import ImportTask

+ 

+ from lib389._constants import DEFAULT_SUFFIX

+ 

+ 

+ def test_stress_search_simple(topology_st):

+     """Run a simple stress test of searches against the directory server.

+     

+     :id: 3786d01c-ea03-4655-a4f9-450693c75863

+     :setup: Standalone Instance

+     :steps:

+         1. Create test users

+         2. Import them

+         3. Stress test!

+     :expectedresults:

+         1. Success

+         2. Success

+         3. Results are written to /tmp

+     """

+ 

+     inst = topology_st.standalone

+ 

+     inst.config.set("nsslapd-verify-filter-schema", "off")

+     # Bump idllimit to test OR worst cases.

+     from lib389.config import LDBMConfig

+     lconfig = LDBMConfig(inst)

+     # lconfig.set("nsslapd-idlistscanlimit", '20000')

+     # lconfig.set("nsslapd-lookthroughlimit", '20000')

+ 

+ 

+     ldif_dir = inst.get_ldif_dir()

+     import_ldif = ldif_dir + '/basic_import.ldif'

+     dbgen(inst, 10000, import_ldif, DEFAULT_SUFFIX)

+ 

+     r = ImportTask(inst)

+     r.import_suffix_from_ldif(ldiffile=import_ldif, suffix=DEFAULT_SUFFIX)

+     r.wait()

+ 

+     # Run a small load to warm up the server's caches ...

+     l = Ldclt(inst)

+ 

+     l.search_loadtest(DEFAULT_SUFFIX, "(mail=XXXX@example.com)", rounds=1)

+ 

+     # Now do it for realsies!

+     # l.search_loadtest(DEFAULT_SUFFIX, "(|(mail=XXXX@example.com)(nonexist=foo))", rounds=10)

+     l.search_loadtest(DEFAULT_SUFFIX, "(mail=XXXX@example.com)", rounds=10)

+ 

@@ -0,0 +1,158 @@ 

+ # --- BEGIN COPYRIGHT BLOCK ---

+ # Copyright (C) 2019 William Brown <william@blackhats.net.au>

+ # All rights reserved.

+ #

+ # License: GPL (version 3 or any later version).

+ # See LICENSE for details.

+ # --- END COPYRIGHT BLOCK ---

+ #

+ 

+ import pytest

+ import ldap

+ from lib389.topologies import topology_st

+ from lib389.dirsrv_log import DirsrvAccessLog

+ from lib389._mapped_object import DSLdapObjects

+ from lib389._constants import DEFAULT_SUFFIX

+ 

+ def _check_value(inst_cfg, value):

+     inst_cfg.set('nsslapd-verify-filter-schema', value)

+     assert(inst_cfg.get_attr_val_utf8('nsslapd-verify-filter-schema') == value)

+ 

+ 

+ @pytest.mark.ds50349

+ def test_filter_validation_config(topology_st):

+     """Test that the new on/warn/off setting can be set and read

+     correctly

+ 

+     :id: ac14dad5-5bdf-474f-9936-7ce2d20fb8b6

+     :setup: Standalone instance

+     :steps:

+         1. Check the default value of nsslapd-verify-filter-schema

+         2. Set the value to "on".

+         3. Read the value is "on".

+         4. Set the value to "warn".

+         5. Read the value is "warn".

+         6. Set the value to "off".

+         7. Read the value is "off".

+         8. Delete the value (reset)

+         9. Check the reset value matches 1.

+     :expectedresults:

+         1. Value is "on", "off", or "warn".

+         2. Success

+         3. Value is "on"

+         4. Success

+         5. Value is "warn"

+         6. Success

+         7. Value is "off"

+         8. Success

+         9. Value is same as from 1.

+     """

+     inst_cfg = topology_st.standalone.config

+ 

+     initial_value = inst_cfg.get_attr_val_utf8('nsslapd-verify-filter-schema')

+ 

+     _check_value(inst_cfg, "on")

+     _check_value(inst_cfg, "warn")

+     _check_value(inst_cfg, "off")

+ 

+     # This should fail

+ 

+     with pytest.raises(ldap.OPERATIONS_ERROR):

+         _check_value(inst_cfg, "thnaounaou")

+ 

+     inst_cfg.remove_all('nsslapd-verify-filter-schema')

+     final_value = inst_cfg.get_attr_val_utf8('nsslapd-verify-filter-schema')

+     assert(initial_value == final_value)

+ 

+ 

+ @pytest.mark.ds50349

+ def test_filter_validation_enabled(topology_st):

+     """Test that queries which are invalid are correctly rejected by the server.

+ 

+     :id: 05afdbbd-0d7f-4774-958c-2139827fed70

+     :setup: Standalone instance

+     :steps:

+         1. Search a well formed query

+         2. Search a poorly formed query

+         3. Search a poorly formed complex (and/or) query

+         4. Test the server can be restarted

+     :expectedresults:

+         1. No warnings

+         2. Query is rejected (err)

+         3. Query is rejected (err)

+         4. Server restarts

+     """

+     inst = topology_st.standalone

+ 

+     # In case the default has changed, we explicitly set the value to on.

+     inst.config.set("nsslapd-verify-filter-schema", "on")

+     raw_objects = DSLdapObjects(inst, basedn=DEFAULT_SUFFIX)

+ 

+     # Check a good query has no errors.

+     r = raw_objects.filter("(objectClass=*)")

+ 

+     with pytest.raises(ldap.UNWILLING_TO_PERFORM):

+         # Check a bad one DOES emit an error.

+         r = raw_objects.filter("(a=a)")

+ 

+     with pytest.raises(ldap.UNWILLING_TO_PERFORM):

+         # Check a bad complex one does emit an error.

+         r = raw_objects.filter("(&(a=a)(b=b)(objectClass=*))")

+ 

+     # Does restart work?

+     inst.restart()

+ 

+ 

+ @pytest.mark.ds50349

+ def test_filter_validation_warning(topology_st):

+     """Test that queries which are invalid are correctly marked as "notes=F" in

+     the access log.

+ 

+     :id: 8b2b23fe-d878-435c-bc84-8c298be4ca1f

+     :setup: Standalone instance

+     :steps:

+         1. Search a well formed query

+         2. Search a poorly formed query

+         3. Search a poorly formed complex (and/or) query

+     :expectedresults:

+         1. No warnings

+         2. notes=F is present

+         3. notes=F is present

+     """

+     inst = topology_st.standalone

+ 

+     # In case the default has changed, we set the value to warn.

+     inst.config.set("nsslapd-verify-filter-schema", "warn")

+     # Set the access log to un-buffered so we get it immediately.

+     inst.config.set("nsslapd-accesslog-logbuffering", "off")

+ 

+     # Setup the query object.

+     # Now we don't care if there are any results, we only care about good/bad queries.

+     # To do this we have to bypass some of the lib389 magic, and just emit raw queries

+     # to check them. Turns out lib389 is well designed and this just works as expected

+     # if you use a single DSLdapObjects and filter. :)

+     raw_objects = DSLdapObjects(inst, basedn=DEFAULT_SUFFIX)

+ 

+     # Find any initial notes=F

+     access_log = DirsrvAccessLog(inst)

+     r_init = access_log.match(".*notes=F.*")

+ 

+     # Check a good query has no warnings.

+     r = raw_objects.filter("(objectClass=*)")

+     r_s1 = access_log.match(".*notes=F.*")

+     # Should be the same number of log lines, i.e. no new notes=F entries.

+     assert(len(r_init) == len(r_s1))

+ 

+     # Check a bad one DOES emit a warning.

+     r = raw_objects.filter("(a=a)")

+     r_s2 = access_log.match(".*notes=F.*")

+     # Should be a greater number of log lines, i.e. +1

+     assert(len(r_init) + 1 == len(r_s2))

+ 

+     # Check a bad complex one does emit a warning.

+     r = raw_objects.filter("(&(a=a)(b=b)(objectClass=*))")

+     r_s3 = access_log.match(".*notes=F.*")

+     # Should be a greater number of log lines, i.e. +2

+     assert(len(r_init) + 2 == len(r_s3))

+ 

+ 

@@ -93,8 +93,10 @@ 

  void

  attr_syntax_read_lock(void)

  {

-     if (0 != attr_syntax_init())

+     if (0 != attr_syntax_init()) {

+         PR_ASSERT(0);

          return;

+     }

  

      AS_LOCK_READ(oid2asi_lock);

      AS_LOCK_READ(name2asi_lock);
@@ -103,8 +105,10 @@ 

  void

  attr_syntax_write_lock(void)

  {

-     if (0 != attr_syntax_init())

+     if (0 != attr_syntax_init()) {

+         PR_ASSERT(0);

          return;

+     }

  

      AS_LOCK_WRITE(oid2asi_lock);

      AS_LOCK_WRITE(name2asi_lock);
@@ -383,6 +387,24 @@ 

      return asi;

  }

  

+ /*

+  * This assumes you have taken the attr_syntax read lock. Assert an attribute type

+  * exists by name. 0 is false, 1 is true.

+  *

+  * The main reason to use this over attr_syntax_get_by_name_locking_optional is to

+  * avoid the reference count increment/decrement cycle when we only need a boolean

+  * of existence, rather than retrieving the reference to the attribute itself.

+  */

+ int32_t

+ attr_syntax_exist_by_name_nolock(char *name) {

+     struct asyntaxinfo *asi = NULL;

+     asi = (struct asyntaxinfo *)PL_HashTableLookup_const(name2asi, name);

+     if (asi != NULL) {

+         return 1;

+     }

+     return 0;

+ }

+ 

  

  /*

   * Give up a reference to an asi.

@@ -223,13 +223,33 @@ 

  

      switch (ftype) {

      case LDAP_FILTER_GE:

-         idl = range_candidates(pb, be, type, bval, NULL, err, &sattr, allidslimit);

+         if (f->f_flags & SLAPI_FILTER_INVALID_ATTR) {

+             /*

+              * REMEMBER: this flag is only set on WARN levels. If the filter verify

+              * is on strict, we reject in search.c; if we are off, the flag will NOT

+              * be set on the filter at all!

+              */

+             slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_FILTER_INVALID);

+             idl = idl_alloc(0);

I have a doubt here. A component with an unknown attribute results in idl_allids rather than an empty list.
I have the same doubt in the many places of the patch that contain idl_alloc(0).
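To make the two behaviours concrete, here is a minimal sketch using only names from this patch (which branch is correct is exactly the open question above):

    if (f->f_flags & SLAPI_FILTER_INVALID_ATTR) {
        /* As written in the patch: the attribute is not in schema, so note the
         * operation and return an empty candidate list. */
        slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_FILTER_INVALID);
        idl = idl_alloc(0);
        /* Alternative raised in this review: treat it like any other component
         * that cannot use an index and fall back to the full id list. */
        /* idl = idl_allids(be); */
    }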

+         } else {

+             idl = range_candidates(pb, be, type, bval, NULL, err, &sattr, allidslimit);

+         }

          slapi_log_err(SLAPI_LOG_TRACE, "ava_candidates", "<= %lu\n",

                        (u_long)IDL_NIDS(idl));

          goto done;

          break;

      case LDAP_FILTER_LE:

-         idl = range_candidates(pb, be, type, NULL, bval, err, &sattr, allidslimit);

+         if (f->f_flags & SLAPI_FILTER_INVALID_ATTR) {

+             /*

+              * REMEMBER: this flag is only set on WARN levels. If the filter verify

+              * is on strict, we reject in search.c; if we are off, the flag will NOT

+              * be set on the filter at all!

+              */

+             slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_FILTER_INVALID);

+             idl = idl_alloc(0);

+         } else {

+             idl = range_candidates(pb, be, type, NULL, bval, err, &sattr, allidslimit);

+         }

          slapi_log_err(SLAPI_LOG_TRACE, "ava_candidates", "<= %lu\n",

                        (u_long)IDL_NIDS(idl));

          goto done;
@@ -273,11 +293,21 @@ 

          ptr[1] = NULL;

          ivals = ptr;

  

-         slapi_attr_assertion2keys_ava_sv(&sattr, &tmp, (Slapi_Value ***)&ivals, LDAP_FILTER_EQUALITY_FAST);

-         idl = keys2idl(pb, be, type, indextype, ivals, err, &unindexed, &txn, allidslimit);

+         if (f->f_flags & SLAPI_FILTER_INVALID_ATTR) {

+             /*

+              * REMEMBER: this flag is only set on WARN levels. If the filter verify

+              * is on strict, we reject in search.c; if we are off, the flag will NOT

+              * be set on the filter at all!

+              */

+             slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_FILTER_INVALID);

+             idl = idl_alloc(0);

+         } else {

+             slapi_attr_assertion2keys_ava_sv(&sattr, &tmp, (Slapi_Value ***)&ivals, LDAP_FILTER_EQUALITY_FAST);

+             idl = keys2idl(pb, be, type, indextype, ivals, err, &unindexed, &txn, allidslimit);

+         }

+ 

          if (unindexed) {

-             unsigned int opnote = SLAPI_OP_NOTE_UNINDEXED;

-             slapi_pblock_set(pb, SLAPI_OPERATION_NOTES, &opnote);

+             slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_UNINDEXED);

@firstyear I like this change to use slapi_pblock_set_flag_operation_notes, but it is a different change from checking the attribute against the schema. IMHO it makes review more complex and I would prefer a separate ticket.
In addition I think it is a wrong use here: slapi_pblock_set sets the value, while slapi_pblock_set_flag_operation_notes does an OR.
Better to check the other places where you used slapi_pblock_set_flag_operation_notes to confirm that the OR was intended.

Just realized that you also created slapi_pblock_set_operation_notes that can be used here
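For reference, a short sketch of the difference between the two helpers added later in this patch in pblock.c (plain assignment vs. bitwise OR); whether the OR is intended at each call site is the question above:

    slapi_pblock_set_operation_notes(pb, SLAPI_OP_NOTE_UNINDEXED);         /* notes are now exactly U */
    slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_SIMPLEPAGED);  /* OR: notes are now U,P */
    slapi_pblock_set_operation_notes(pb, SLAPI_OP_NOTE_FULL_UNINDEXED);    /* overwrite: notes are now only A */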

              pagedresults_set_unindexed(pb_conn, pb_op, pr_idx);

          }

  
@@ -296,20 +326,30 @@ 

              slapi_ch_free((void **)&ivals);

          }

      } else {

-         slapi_value_init_berval(&sv, bval);

-         ivals = NULL;

-         slapi_attr_assertion2keys_ava_sv(&sattr, &sv, &ivals, ftype);

-         value_done(&sv);

-         if (ivals == NULL || *ivals == NULL) {

-             slapi_log_err(SLAPI_LOG_TRACE, "ava_candidates",

-                           "<= ALLIDS (no keys)\n");

-             idl = idl_allids(be);

-             goto done;

+         if (f->f_flags & SLAPI_FILTER_INVALID_ATTR) {

+             /*

+              * REMEMBER: this flag is only set on WARN levels. If the filter verify

+              * is on strict, we reject in search.c; if we are off, the flag will NOT

+              * be set on the filter at all!

+              */

+             slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_FILTER_INVALID);

+             idl = idl_alloc(0);

+         } else {

+             slapi_value_init_berval(&sv, bval);

+             ivals = NULL;

+             slapi_attr_assertion2keys_ava_sv(&sattr, &sv, &ivals, ftype);

+             value_done(&sv);

+             if (ivals == NULL || *ivals == NULL) {

+                 slapi_log_err(SLAPI_LOG_TRACE, "ava_candidates",

+                               "<= ALLIDS (no keys)\n");

+                 idl = idl_allids(be);

+                 goto done;

+             }

+             idl = keys2idl(pb, be, type, indextype, ivals, err, &unindexed, &txn, allidslimit);

          }

-         idl = keys2idl(pb, be, type, indextype, ivals, err, &unindexed, &txn, allidslimit);

+ 

          if (unindexed) {

-             unsigned int opnote = SLAPI_OP_NOTE_UNINDEXED;

-             slapi_pblock_set(pb, SLAPI_OPERATION_NOTES, &opnote);

+             slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_UNINDEXED);

              pagedresults_set_unindexed(pb_conn, pb_op, pr_idx);

          }

          valuearray_free(&ivals);
@@ -341,19 +381,30 @@ 

          return (NULL);

      }

      slapi_pblock_get(pb, SLAPI_TXN, &txn.back_txn_txn);

-     idl = index_read_ext_allids(pb, be, type, indextype_PRESENCE,

-                                 NULL, &txn, err, &unindexed, allidslimit);

+ 

+     if (f->f_flags & SLAPI_FILTER_INVALID_ATTR) {

+         /*

+          * REMEMBER: this flag is only set on WARN levels. If the filter verify

+          * is on strict, we reject in search.c; if we are off, the flag will NOT

+          * be set on the filter at all!

+          */

+         slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_FILTER_INVALID);

+         idl = idl_alloc(0);

+     } else {

+         idl = index_read_ext_allids(pb, be, type, indextype_PRESENCE,

+                                     NULL, &txn, err, &unindexed, allidslimit);

+     }

  

      if (unindexed) {

+         slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_UNINDEXED);

+ 

          Operation *pb_op;

          Connection *pb_conn;

  

          int pr_idx = -1;

-         unsigned int opnote = SLAPI_OP_NOTE_UNINDEXED;

          slapi_pblock_get(pb, SLAPI_OPERATION, &pb_op);

          slapi_pblock_get(pb, SLAPI_CONNECTION, &pb_conn);

  

-         slapi_pblock_set(pb, SLAPI_OPERATION_NOTES, &opnote);

          slapi_pblock_get(pb, SLAPI_PAGED_RESULTS_INDEX, &pr_idx);

          pagedresults_set_unindexed(pb_conn, pb_op, pr_idx);

      }
@@ -427,12 +478,21 @@ 

                      struct berval *bvec[2];

                      bvec[0] = *val;

                      bvec[1] = NULL;

+ 

                      if (slapi_pblock_set(pb, SLAPI_PLUGIN_OBJECT, mrOBJECT) ||

                          slapi_pblock_set(pb, SLAPI_PLUGIN_MR_VALUES, bvec) ||

                          mrINDEX(pb) ||

                          slapi_pblock_get(pb, SLAPI_PLUGIN_MR_KEYS, &keys)) {

                          /* something went wrong.  bail. */

                          break;

+                     } else if (f->f_flags & SLAPI_FILTER_INVALID_ATTR) {

+                         /*

+                          * REMEMBER: this flag is only set on WARN levels. If the filter verify

+                          * is on strict, we reject in search.c; if we are off, the flag will NOT

+                          * be set on the filter at all!

+                          */

+                         slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_FILTER_INVALID);

+                         idl = idl_alloc(0);

                      } else if (keys == NULL || keys[0] == NULL) {

                          /* no keys */

                          idl_free(&idl);
@@ -448,14 +508,13 @@ 

                                                                                             *key, NULL, 0, &txn, err, allidslimit);

                              if (unindexed) {

                                  int pr_idx = -1;

-                                 unsigned int opnote = SLAPI_OP_NOTE_UNINDEXED;

+                                 slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_UNINDEXED);

  

                                  Operation *pb_op;

                                  Connection *pb_conn;

                                  slapi_pblock_get(glob_pb, SLAPI_OPERATION, &pb_op);

                                  slapi_pblock_get(glob_pb, SLAPI_CONNECTION, &pb_conn);

  

-                                 slapi_pblock_set(glob_pb, SLAPI_OPERATION_NOTES, &opnote);

                                  slapi_pblock_get(glob_pb, SLAPI_PAGED_RESULTS_INDEX, &pr_idx);

                                  pagedresults_set_unindexed(pb_conn, pb_op, pr_idx);

                              }
@@ -889,7 +948,6 @@ 

      IDList *idl;

      Slapi_Value **ivals;

      int unindexed = 0;

-     unsigned int opnote = SLAPI_OP_NOTE_UNINDEXED;

      Slapi_Attr sattr;

      back_txn txn = {NULL};

      int pr_idx = -1;
@@ -915,7 +973,7 @@ 

      if (ivals == NULL || *ivals == NULL) {

          Operation *pb_op;

          Connection *pb_conn;

-         slapi_pblock_set(pb, SLAPI_OPERATION_NOTES, &opnote);

+         slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_UNINDEXED);

          slapi_pblock_get(pb, SLAPI_OPERATION, &pb_op);

          slapi_pblock_get(pb, SLAPI_CONNECTION, &pb_conn);

          pagedresults_set_unindexed(pb_conn, pb_op, pr_idx);
@@ -928,14 +986,24 @@ 

       * look up each key in the index, ANDing the resulting

       * IDLists together.

       */

-     slapi_pblock_get(pb, SLAPI_TXN, &txn.back_txn_txn);

-     idl = keys2idl(pb, be, type, indextype_SUB, ivals, err, &unindexed, &txn, allidslimit);

+     if (f->f_flags & SLAPI_FILTER_INVALID_ATTR) {

+         /*

+          * REMEMBER: this flag is only set on WARN levels. If the filter verify

+          * is on strict, we reject in search.c; if we are off, the flag will NOT

+          * be set on the filter at all!

+          */

+         slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_FILTER_INVALID);

+         idl = idl_alloc(0);

+     } else {

+         slapi_pblock_get(pb, SLAPI_TXN, &txn.back_txn_txn);

+         idl = keys2idl(pb, be, type, indextype_SUB, ivals, err, &unindexed, &txn, allidslimit);

+     }

      if (unindexed) {

          Operation *pb_op;

          Connection *pb_conn;

          slapi_pblock_get(pb, SLAPI_OPERATION, &pb_op);

          slapi_pblock_get(pb, SLAPI_CONNECTION, &pb_conn);

-         slapi_pblock_set(pb, SLAPI_OPERATION_NOTES, &opnote);

+         slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_UNINDEXED);

          pagedresults_set_unindexed(pb_conn, pb_op, pr_idx);

      }

      valuearray_free(&ivals);

@@ -1453,10 +1453,7 @@ 

      if (!is_indexed(indextype, ai->ai_indexmask, ai->ai_index_rules)) {

  

          /* Mark that the search has an unindexed component */

-         uint32_t opnote = 0;

-         slapi_pblock_get(pb, SLAPI_OPERATION_NOTES, &opnote);

-         opnote |= SLAPI_OP_NOTE_UNINDEXED;

-         slapi_pblock_set(pb, SLAPI_OPERATION_NOTES, &opnote);

+         slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_UNINDEXED);

  

          idl = idl_allids(be);

          slapi_log_err(SLAPI_LOG_TRACE,

@@ -857,11 +857,10 @@ 

              }

          }

  

-         slapi_pblock_get(pb, SLAPI_OPERATION_NOTES, &opnote);

+         opnote = slapi_pblock_get_operation_notes(pb);

          opnote |= SLAPI_OP_NOTE_FULL_UNINDEXED; /* the full filter leads to an unindexed search */

          opnote &= ~SLAPI_OP_NOTE_UNINDEXED;     /* this note is useless because FULL_UNINDEXED includes UNINDEXED */

-         slapi_pblock_set(pb, SLAPI_OPERATION_NOTES, NULL);

-         slapi_pblock_set(pb, SLAPI_OPERATION_NOTES, &opnote);

+         slapi_pblock_set_operation_notes(pb, opnote);

          slapi_pblock_get(pb, SLAPI_PAGED_RESULTS_INDEX, &pr_idx);

          slapi_pblock_get(pb, SLAPI_OPERATION, &pb_op);

          slapi_pblock_get(pb, SLAPI_CONNECTION, &pb_conn);

@@ -1147,15 +1147,14 @@ 

      slapi_pblock_get(pb, SLAPI_SEARCH_STRFILTER, &fstr);

      slapi_rwlock_rdlock(be->vlvSearchList_lock);

      if ((pi = vlv_find_search(be, base, scope, fstr, sort_control)) == NULL) {

-         unsigned int opnote = SLAPI_OP_NOTE_UNINDEXED;

          int pr_idx = -1;

          Connection *pb_conn = NULL;

          Operation *pb_op = NULL;

  

          slapi_pblock_get(pb, SLAPI_PAGED_RESULTS_INDEX, &pr_idx);

          slapi_rwlock_unlock(be->vlvSearchList_lock);

-         slapi_pblock_set(pb, SLAPI_OPERATION_NOTES, &opnote);

  

+         slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_UNINDEXED);

          slapi_pblock_get(pb, SLAPI_OPERATION, &pb_op);

          slapi_pblock_get(pb, SLAPI_CONNECTION, &pb_conn);

  

file modified
+6 -3
@@ -59,12 +59,15 @@ 

                                         &has_tombstone_filter, &has_ruv_filter);

  

      if (0 == return_value) { /* Don't try to re-write if there was an error */

-         if (subentry_dont_rewrite || scope == LDAP_SCOPE_BASE)

+         if (subentry_dont_rewrite || scope == LDAP_SCOPE_BASE) {

              (*filt)->f_flags |= SLAPI_FILTER_LDAPSUBENTRY;

-         if (has_tombstone_filter)

+         }

+         if (has_tombstone_filter) {

              (*filt)->f_flags |= SLAPI_FILTER_TOMBSTONE;

-         if (has_ruv_filter)

+         }

+         if (has_ruv_filter) {

              (*filt)->f_flags |= SLAPI_FILTER_RUV;

+         }

      }

  

      if (loglevel_is_set(LDAP_DEBUG_FILTER) && *filt != NULL && *fstr != NULL) {

file modified
+117 -2
@@ -161,6 +161,7 @@ 

      CONFIG_SPECIAL_VALIDATE_CERT_SWITCH, /* maps strings to an enumeration */

      CONFIG_SPECIAL_UNHASHED_PW_SWITCH,   /* unhashed pw: on/off/nolog */

      CONFIG_SPECIAL_TLS_CHECK_CRL,        /* maps enum tls_check_crl_t to char * */

+     CONFIG_ON_OFF_WARN,                  /* maps to a config on/warn/off enum */

  } ConfigVarType;

  

  static int32_t config_set_onoff(const char *attrname, char *value, int32_t *configvalue, char *errorbuf, int apply);
@@ -250,6 +251,7 @@ 

  slapi_onoff_t init_extract_pem;

  slapi_onoff_t init_ignore_vattrs;

  slapi_onoff_t init_enable_upgrade_hash;

+ slapi_onwarnoff_t init_verify_filter_schema;

  

  static int

  isInt(ConfigVarType type)
@@ -1237,7 +1239,12 @@ 

      {CONFIG_ENABLE_UPGRADE_HASH, config_set_enable_upgrade_hash,

       NULL, 0,

       (void **)&global_slapdFrontendConfig.enable_upgrade_hash,

-      CONFIG_ON_OFF, (ConfigGetFunc)config_get_enable_upgrade_hash, &init_enable_upgrade_hash}

+      CONFIG_ON_OFF, (ConfigGetFunc)config_get_enable_upgrade_hash, &init_enable_upgrade_hash},

+     {CONFIG_VERIFY_FILTER_SCHEMA, config_set_verify_filter_schema,

+      NULL, 0,

+      (void **)&global_slapdFrontendConfig.verify_filter_schema,

+      CONFIG_ON_OFF_WARN, (ConfigGetFunc)config_get_verify_filter_schema,

+      &init_verify_filter_schema},

      /* End config */

      };

  
@@ -1753,7 +1760,6 @@ 

      init_malloc_trim_threshold = cfg->malloc_trim_threshold = DEFAULT_MALLOC_UNSET;

      init_malloc_mmap_threshold = cfg->malloc_mmap_threshold = DEFAULT_MALLOC_UNSET;

  #endif

- 

      init_extract_pem = cfg->extract_pem = LDAP_ON;

      /*

       * Default upgrade hash to on - this is an important security step, meaning that old
@@ -1767,6 +1773,7 @@ 

       * scheme set in cn=config

       */

      init_enable_upgrade_hash = cfg->enable_upgrade_hash = LDAP_ON;

+     init_verify_filter_schema = cfg->verify_filter_schema = SLAPI_WARN;

  

      /* Done, unlock!  */

      CFG_UNLOCK_WRITE(cfg);
@@ -7641,6 +7648,94 @@ 

      return retval;

  }

  

+ static char *

+ config_initvalue_to_onwarnoff(struct config_get_and_set *cgas, char *initvalbuf, size_t initvalbufsize) {

+     char *retval = NULL;

+     if (cgas->config_var_type == CONFIG_ON_OFF_WARN) {

+         slapi_onwarnoff_t *value = (slapi_onwarnoff_t *)(intptr_t)cgas->initvalue;

+         if (value != NULL) {

+             if (*value == SLAPI_ON) {

+                 PR_snprintf(initvalbuf, initvalbufsize, "%s", "on");

+                 retval = initvalbuf;

+             } else if (*value == SLAPI_WARN) {

+                 PR_snprintf(initvalbuf, initvalbufsize, "%s", "warn");

+                 retval = initvalbuf;

+             } else if (*value == SLAPI_OFF) {

+                 PR_snprintf(initvalbuf, initvalbufsize, "%s", "off");

+                 retval = initvalbuf;

+             }

+         }

+     }

+     return retval;

+ }

+ 

+ static int32_t

+ config_set_onoffwarn(slapdFrontendConfig_t *slapdFrontendConfig, slapi_onwarnoff_t *target, const char *attrname, char *value, char *errorbuf, int apply) {

+     if (target == NULL) {

+         return LDAP_OPERATIONS_ERROR;

+     }

+ 

+     if (config_value_is_null(attrname, value, errorbuf, 1)) {

+         return LDAP_OPERATIONS_ERROR;

+     }

+ 

+     slapi_onwarnoff_t p_val = SLAPI_OFF;

+ 

+     if (strcasecmp(value, "on") == 0) {

+         p_val = SLAPI_ON;

+     } else if (strcasecmp(value, "warn") == 0) {

+         p_val = SLAPI_WARN;

+     } else if (strcasecmp(value, "off") == 0) {

+         p_val = SLAPI_OFF;

+     } else {

+         slapi_create_errormsg(errorbuf, SLAPI_DSE_RETURNTEXT_SIZE,

+                               "%s: invalid value \"%s\". Valid values are \"on\", \"warn\" or \"off\".", attrname, value);

+         return LDAP_OPERATIONS_ERROR;

+     }

+ 

+     if (!apply) {

+         return LDAP_SUCCESS;

+     }

+ 

+     CFG_LOCK_WRITE(slapdFrontendConfig);

+     *target = p_val;

+     CFG_UNLOCK_WRITE(slapdFrontendConfig);

+ 

+     return LDAP_SUCCESS;

+ }

+ 

+ 

+ int32_t

+ config_set_verify_filter_schema(const char *attrname, char *value, char *errorbuf, int apply) {

+     slapdFrontendConfig_t *slapdFrontendConfig = getFrontendConfig();

+     slapi_onwarnoff_t *target = &(slapdFrontendConfig->verify_filter_schema);

+     return config_set_onoffwarn(slapdFrontendConfig, target, attrname, value, errorbuf, apply);

+ }

+ 

+ Slapi_Filter_Policy

+ config_get_verify_filter_schema()

+ {

+     slapi_onwarnoff_t retVal;

+     slapdFrontendConfig_t *slapdFrontendConfig = getFrontendConfig();

+     CFG_LOCK_READ(slapdFrontendConfig);

+     retVal = slapdFrontendConfig->verify_filter_schema;

+     CFG_UNLOCK_READ(slapdFrontendConfig);

+ 

+     /* Now map this to a policy that the fns understand. */

+     switch (retVal) {

+     case SLAPI_ON:

+         return FILTER_POLICY_STRICT;

+         break;

+     case SLAPI_WARN:

+         return FILTER_POLICY_WARNING;

+         break;

+     default:

+         return FILTER_POLICY_OFF;

+     }

+     /* Should be unreachable ... */

+     return FILTER_POLICY_OFF;

+ }

+ 

  /*

   * This function is intended to be used from the dse code modify callback.  It

   * is "optimized" for that case because it takes a berval** of values, which is
@@ -7689,6 +7784,8 @@ 

              void *initval = cgas->initvalue;

              if (cgas->config_var_type == CONFIG_ON_OFF) {

                  initval = (void *)config_initvalue_to_onoff(cgas, initvalbuf, sizeof(initvalbuf));

+             } else if (cgas->config_var_type == CONFIG_ON_OFF_WARN) {

+                 initval = (void *)config_initvalue_to_onwarnoff(cgas, initvalbuf, sizeof(initvalbuf));

              }

              if (cgas->setfunc) {

                  retval = (cgas->setfunc)(cgas->attr_name, initval, errorbuf, apply);
@@ -7914,6 +8011,24 @@ 

  

          break;

  

+     case CONFIG_ON_OFF_WARN:

+         /* Is this the right default here? */

+         if (!value) {

+             slapi_entry_attr_set_charptr(e, cgas->attr_name, "off");

+             break;

+         }

+ 

+         if (*((slapi_onwarnoff_t *)value) == SLAPI_ON) {

+             slapi_entry_attr_set_charptr(e, cgas->attr_name, "on");

+         } else if (*((slapi_onwarnoff_t *)value) == SLAPI_WARN) {

+             slapi_entry_attr_set_charptr(e, cgas->attr_name, "warn");

+         } else {

+             slapi_entry_attr_set_charptr(e, cgas->attr_name, "off");

+             /* Default to off. */

+         }

+ 

+         break;

+ 

      default:

          PR_ASSERT(0); /* something went horribly wrong . . . */

          break;

@@ -501,7 +501,6 @@ 

              slapi_pblock_set(pb, SLAPI_PAGED_RESULTS_INDEX, &pr_idx);

              slapi_pblock_set(pb, SLAPI_PAGED_RESULTS_COOKIE, &pr_cookie);

              if ((LDAP_SUCCESS == rc) || (LDAP_CANCELLED == rc) || (0 == pagesize)) {

-                 unsigned int opnote = SLAPI_OP_NOTE_SIMPLEPAGED;

                  op_set_pagedresults(operation);

                  pr_be = pagedresults_get_current_be(pb_conn, pr_idx);

                  if (be_name) {
@@ -522,10 +521,12 @@ 

                  }

                  pr_search_result = pagedresults_get_search_result(pb_conn, operation, 0 /*not locked*/, pr_idx);

                  estimate = pagedresults_get_search_result_set_size_estimate(pb_conn, operation, pr_idx);

+                 /* Set operation note flags as required. */

                  if (pagedresults_get_unindexed(pb_conn, operation, pr_idx)) {

-                     opnote |= SLAPI_OP_NOTE_UNINDEXED;

+                     slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_UNINDEXED);

                  }

-                 slapi_pblock_set(pb, SLAPI_OPERATION_NOTES, &opnote);

+                 slapi_pblock_set_flag_operation_notes(pb, SLAPI_OP_NOTE_SIMPLEPAGED);

+ 

                  if ((LDAP_CANCELLED == rc) || (0 == pagesize)) {

                      /* paged-results-request was abandoned; making an empty cookie. */

                      pagedresults_set_response_control(pb, 0, estimate, -1, pr_idx);

@@ -4331,6 +4331,28 @@ 

      pb->pb_intop->op_stack_elem = stack_elem;

  }

  

+ uint32_t

+ slapi_pblock_get_operation_notes(Slapi_PBlock *pb) {

+     if (pb->pb_intop != NULL) {

+         return pb->pb_intop->pb_operation_notes;

+     }

+     return 0;

+ }

+ 

+ /* overwrite the flag */

+ void

+ slapi_pblock_set_operation_notes(Slapi_PBlock *pb, uint32_t opnotes) {

+     _pblock_assert_pb_intop(pb);

+     pb->pb_intop->pb_operation_notes = opnotes;

+ }

+ 

+ /* ensure flag is set */

+ void

+ slapi_pblock_set_flag_operation_notes(Slapi_PBlock *pb, uint32_t opflag) {

+     _pblock_assert_pb_intop(pb);

+     pb->pb_intop->pb_operation_notes |= opflag;

+ }

+ 

  /*

   * Clear and then set the bind DN and related credentials for the

   * connection `conn'.

@@ -89,6 +89,7 @@ 

  void attr_syntax_unlock_read(void);

  void attr_syntax_unlock_write(void);

  int attr_syntax_exists(const char *attr_name);

+ int32_t attr_syntax_exist_by_name_nolock(char *name);

  void attr_syntax_delete(struct asyntaxinfo *asip, PRUint32 schema_flags);

  #define SLAPI_SYNTAXLENGTH_NONE (-1) /* for syntaxlength parameter */

  int attr_syntax_create(const char *attr_oid, char *const *attr_names, const char *attr_desc, const char *attr_superior, const char *mr_equality, const char *mr_ordering, const char *mr_substring, schemaext *extensions, const char *attr_syntax, int syntaxlength, unsigned long flags, struct asyntaxinfo **asip);
@@ -574,6 +575,9 @@ 

  int config_set_enable_nunc_stans(const char *attrname, char *value, char *errorbuf, int apply);

  int config_set_extract_pem(const char *attrname, char *value, char *errorbuf, int apply);

  

+ int32_t config_set_verify_filter_schema(const char *attrname, char *value, char *errorbuf, int apply);

+ Slapi_Filter_Policy config_get_verify_filter_schema(void);

+ 

  PLHashNumber hashNocaseString(const void *key);

  PRIntn hashNocaseCompare(const void *v1, const void *v2);

  int config_get_ignore_time_skew(void);

file modified
+66 -13
@@ -1844,12 +1844,15 @@ 

  {

      unsigned int snp_noteid;

      char *snp_string;

+     char *snp_detail;

  };

  

  static struct slapi_note_map notemap[] = {

-     {SLAPI_OP_NOTE_UNINDEXED, "U"},

-     {SLAPI_OP_NOTE_SIMPLEPAGED, "P"},

-     {SLAPI_OP_NOTE_FULL_UNINDEXED, "A"}};

+     {SLAPI_OP_NOTE_UNINDEXED, "U", "Partially Unindexed Filter"},

+     {SLAPI_OP_NOTE_SIMPLEPAGED, "P", "Paged Search"},

+     {SLAPI_OP_NOTE_FULL_UNINDEXED, "A", "Fully Unindexed Filter"},

+     {SLAPI_OP_NOTE_FILTER_INVALID, "F", "Filter Element Missing From Schema"},

+ };

  

  #define SLAPI_NOTEMAP_COUNT (sizeof(notemap) / sizeof(struct slapi_note_map))

  
@@ -1880,24 +1883,73 @@ 

  

      p = buf;

      for (i = 0; i < SLAPI_NOTEMAP_COUNT; ++i) {

+         /* Check if the flag is present in the operation notes */

          if ((notemap[i].snp_noteid & notes) != 0) {

-             if (p > buf && buflen > 0) {

+             len = strlen(notemap[i].snp_string);

+             if (p > buf) {

+                 if (buflen < (len + 1)) {

+                     break;

+                 }

+                 /* Check buflen */

                  *p++ = ',';

-                 *p = '\0';

                  --buflen;

              } else {

+                 if (buflen < (len + 6)) {

+                     break;

+                 }

+                 /* Check buflen */

                  strcpy(p, "notes=");

                  p += 6;

                  buflen -= 6;

              }

-             len = strlen(notemap[i].snp_string);

-             if (buflen < len) {

-                 break; /* bail out (result is truncated) */

-             }

              memcpy(p, notemap[i].snp_string, len);

              buflen -= len;

              p += len;

-             *p = '\0';

+         }

+     }

+ 

+     /* Get a pointer to the "start" of where we'll write the details. */

+     char *note_end = p;

+ 

+     /* Now add the details (if possible) */

+     for (i = 0; i < SLAPI_NOTEMAP_COUNT; ++i) {

+         if ((notemap[i].snp_noteid & notes) != 0) {

+             len = strlen(notemap[i].snp_detail);

+             if (p > note_end) {

+                 /*

+                  * len of detail + , + "

+                  */

+                 if (buflen < (len + 2)) {

+                     break;

+                 }

+                 /*

+                  * If the working pointer is past the start

+                  * position, add a comma now before the next term.

+                  */

+                 *p++ = ',';

+                 --buflen;

+             } else {

+                 /*

+                  * len of detail + details=" + "

+                  */

+                 if (buflen < (len + 11)) {

+                     break;

+                 }

+                 /* This is the first iteration, so add " details=\"" */

+                 strcpy(p, " details=\"");

+                 p += 10;

+                 buflen -= 10;

+             }

+             memcpy(p, notemap[i].snp_detail, len);

+             /*

+              * We don't account for the ", because on the next loop we may

+              * backtrack over it, so it doesn't count toward the len calculation.

+              */

+             buflen -= len;

+             p += len;

+             /* Put in the end quote, then step p back so a later detail (if any) overwrites it with a comma. */

+             *p++ = '"';

+             p--;

          }

      }
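As a rough sketch of the intended output, derived from the code above rather than from a captured log: a search that is both partially unindexed and uses an attribute missing from schema would end its RESULT line with something like notes=U,F details="Partially Unindexed Filter,Filter Element Missing From Schema".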

  
@@ -1908,11 +1960,12 @@ 

  static void

  log_result(Slapi_PBlock *pb, Operation *op, int err, ber_tag_t tag, int nentries)

  {

-     char *notes_str, notes_buf[256];

+     char *notes_str = NULL;

+     char notes_buf[256] = {0};

      int internal_op;

      CSN *operationcsn = NULL;

      char csn_str[CSN_STRSIZE + 5];

-     char etime[ETIME_BUFSIZ];

+     char etime[ETIME_BUFSIZ] = {0};

      int pr_idx = -1;

      int pr_cookie = -1;

      uint32_t operation_notes;
@@ -1933,7 +1986,7 @@ 

  

      snprintf(etime, ETIME_BUFSIZ, "%" PRId64 ".%010" PRId64 "", (int64_t)o_hr_time_end.tv_sec, (int64_t)o_hr_time_end.tv_nsec);

  

-     slapi_pblock_get(pb, SLAPI_OPERATION_NOTES, &operation_notes);

+     operation_notes = slapi_pblock_get_operation_notes(pb);

  

      if (0 == operation_notes) {

          notes_str = "";

@@ -697,6 +697,94 @@ 

      return (ret);

  }

  

+ static Slapi_Filter_Result

+ slapi_filter_schema_check_inner(Slapi_Filter *f) {

+     /*

+      * Default response to Ok. If any more severe things happen we

+      * alter this to reflect it. IE we bubble up more severe errors

+      * out.

+      */

+     Slapi_Filter_Result r = FILTER_SCHEMA_SUCCESS;

+ 

+     switch (f->f_choice) {

+     case LDAP_FILTER_EQUALITY:

+     case LDAP_FILTER_GE:

+     case LDAP_FILTER_LE:

+     case LDAP_FILTER_APPROX:

+         if (!attr_syntax_exist_by_name_nolock(f->f_avtype)) {

+             f->f_flags |= SLAPI_FILTER_INVALID_ATTR;

+             r = FILTER_SCHEMA_WARNING;

+         }

+         break;

+     case LDAP_FILTER_PRESENT:

+         if (!attr_syntax_exist_by_name_nolock(f->f_type)) {

+             f->f_flags |= SLAPI_FILTER_INVALID_ATTR;

+             r = FILTER_SCHEMA_WARNING;

+         }

+         break;

+     case LDAP_FILTER_SUBSTRINGS:

+         if (!attr_syntax_exist_by_name_nolock(f->f_sub_type)) {

+             f->f_flags |= SLAPI_FILTER_INVALID_ATTR;

+             r = FILTER_SCHEMA_WARNING;

+         }

+         break;

+     case LDAP_FILTER_EXTENDED:

+         /* I don't have any examples of this, so I'm not 100% on how to check it */

+         if (!attr_syntax_exist_by_name_nolock(f->f_mr_type)) {

+             f->f_flags |= SLAPI_FILTER_INVALID_ATTR;

+             r = FILTER_SCHEMA_WARNING;

+         }

+         break;

+     case LDAP_FILTER_AND:

+     case LDAP_FILTER_OR:

+     case LDAP_FILTER_NOT:

+         /* Recurse and check all elements of the filter */

+         for (Slapi_Filter *f_child = f->f_list; f_child != NULL; f_child = f_child->f_next) {

+             Slapi_Filter_Result ri = slapi_filter_schema_check_inner(f_child);

+             if (ri > r) {

+                 r = ri;

+             }

+         }

+         break;

+     default:

+         slapi_log_err(SLAPI_LOG_ERR, "slapi_filter_schema_check_inner",

+                       "Unknown type 0x%lX\n", f->f_choice);

+         r = FILTER_SCHEMA_FAILURE;

+         break;

+     }

+ 

+     return r;

+ }

+ 

+ /*

+  * Check a (normalised) filter against the attribute types defined in schema,

+  * applying the given policy. The OFF/WARNING/STRICT semantics are documented

+  * with the public prototype of slapi_filter_schema_check.

+  */

+ Slapi_Filter_Result

+ slapi_filter_schema_check(Slapi_Filter *f, Slapi_Filter_Policy fp) {

+     if (f == NULL) {

+         return FILTER_SCHEMA_FAILURE;

+     }

+ 

+     if (fp == FILTER_POLICY_OFF) {

+         return FILTER_SCHEMA_SUCCESS;

+     }

+ 

+     /*

+      * Filters are nested, recursive structures, so we actually have to call an inner

+      * function until we have a result!

+      */

+     attr_syntax_read_lock();

+     Slapi_Filter_Result r = slapi_filter_schema_check_inner(f);

+     attr_syntax_unlock_read();

+ 

+     /* If any warning occurred, escalate it to a failure under the strict policy. */

+     if (fp == FILTER_POLICY_STRICT && r != FILTER_SCHEMA_SUCCESS) {

+         r = FILTER_SCHEMA_FAILURE;

+     }

+     return r;

+ }

+ 

+ 

  /*

   * The caller must obtain a read lock first by calling oc_lock_read().

   */

@@ -198,6 +198,20 @@ 

          goto free_and_return;

      }

  

+     /*

+      * Scan the filter's syntax - depending on settings, this will do nothing, warn

+      * or reject. An open question is the location of this check, and whether we

+      * should try to apply it to internal searches too ...

+      */

+     Slapi_Filter_Result r = slapi_filter_schema_check(filter, config_get_verify_filter_schema());

+     if (r == FILTER_SCHEMA_FAILURE) {

+         char *errtxt = "The filter provided contains invalid attributes not found in schema";

+         err = LDAP_UNWILLING_TO_PERFORM;

+         log_search_access(pb, base, scope, "???", errtxt);

+         send_ldap_result(pb, err, NULL, errtxt, 0, NULL);

+         goto free_and_return;

+     }

+ 

      /* attributes */

      attrs = NULL;

      if (ber_scanf(ber, "{v}}", &attrs) == LBER_ERROR) {

file modified
+15
@@ -459,6 +459,12 @@ 

      TLS_CHECK_ALL = 2,

  } tls_check_crl_t;

  

+ typedef enum _slapi_onwarnoff_t {

+     SLAPI_OFF = 0,

+     SLAPI_WARN = 1,

+     SLAPI_ON = 2,

+ } slapi_onwarnoff_t;

+ 

  

  struct subfilt

  {
@@ -2261,6 +2267,8 @@ 

  #define CONFIG_MALLOC_TRIM_THRESHOLD "nsslapd-malloc-trim-threshold"

  #define CONFIG_MALLOC_MMAP_THRESHOLD "nsslapd-malloc-mmap-threshold"

  

+ #define CONFIG_VERIFY_FILTER_SCHEMA  "nsslapd-verify-filter-schema"

+ 

  #define DEFAULT_MALLOC_UNSET (-10)

  

  /*
@@ -2541,6 +2549,13 @@ 

  #endif

      slapi_onoff_t extract_pem; /* If "on", export key/cert as pem files */

      slapi_onoff_t enable_upgrade_hash; /* If on, upgrade hashes for PW at bind */

+     /*

+      * Do we verify the filters we receive against schema?

+      * on - yes, and reject if attribute not found

+      * warn - yes, and warn that the attribute is unknown and unindexed

+      * off - no, do whatever (old status-quo)

+      */

+     slapi_onwarnoff_t verify_filter_schema;

  } slapdFrontendConfig_t;

  

  /* possible values for slapdFrontendConfig_t.schemareplace */

@@ -1562,6 +1562,39 @@ 

  int slapi_entry_syntax_check(Slapi_PBlock *pb, Slapi_Entry *e, int override);

  

  /**

+  * Filter policy definitions. These define how we should check and treat filters

+  * that have non-conforming attributes in the request.

+  * - OFF - do not check; trust the filter.

+  * - WARNING - check, and flag filter elements that are not found in schema.

+  * - STRICT - check, and reject filters that have elements not found in schema.

+  */

+ typedef enum {

+     FILTER_POLICY_OFF,

+     FILTER_POLICY_WARNING,

+     FILTER_POLICY_STRICT,

+ } Slapi_Filter_Policy;

+ 

+ typedef enum {

+     FILTER_SCHEMA_SUCCESS = 0,

+     FILTER_SCHEMA_WARNING = 1,

+     FILTER_SCHEMA_FAILURE = 2,

+ } Slapi_Filter_Result;

+ 

+ /**

+  * Determine if a filter conforms to schema - specifically, that all attributes it uses

+  * are in the schema.

+  *

+  * We assume that the filter HAS been normalised already so that the attribute names

+  * match the values found in the attrsyntax hashmaps. We also base our return on the

+  * provided policy.

+  *

+  * OFF - return SUCCESS, the filter is valid

+  * WARNING - return SUCCESS, and flag filter elements that are not in schema.

+  * STRICT - return SUCCESS only if all elements are found - else return FAILURE.

+  */

+ Slapi_Filter_Result slapi_filter_schema_check(Slapi_Filter *f, Slapi_Filter_Policy fp);

+ 

+ /**

   * Determines if the DN violates the Distinguished Name syntax rules.

   *

   * \param pb Parameter block.
@@ -7202,9 +7235,15 @@ 

  

  /* Extra notes to be logged within access log RESULT lines */

  #define SLAPI_OPERATION_NOTES        57

- #define SLAPI_OP_NOTE_UNINDEXED      0x01

- #define SLAPI_OP_NOTE_SIMPLEPAGED    0x02

- #define SLAPI_OP_NOTE_FULL_UNINDEXED 0x04

+ 

+ /* Remember these are FLAGS in a bitmask! */

+ typedef enum _slapi_op_note_t {

+     SLAPI_OP_NOTE_UNINDEXED = 0x01,

+     SLAPI_OP_NOTE_SIMPLEPAGED = 0x02,

+     SLAPI_OP_NOTE_FULL_UNINDEXED = 0x04,

+     SLAPI_OP_NOTE_FILTER_INVALID = 0x08,

+ } slapi_op_note_t;

+ 

  

  /* Allows controls to be passed before operation object is created */

  #define SLAPI_CONTROLS_ARG 58

@@ -48,12 +48,17 @@ 

  #define SLAPI_SHUTDOWN_EXIT     3

  

  /* filter */

- #define SLAPI_FILTER_LDAPSUBENTRY 1

- #define SLAPI_FILTER_TOMBSTONE    2

- #define SLAPI_FILTER_RUV          4

- #define SLAPI_ENTRY_LDAPSUBENTRY  2

- #define SLAPI_FILTER_NORMALIZED_TYPE   8

- #define SLAPI_FILTER_NORMALIZED_VALUE 16

+ typedef enum _slapi_filter_flags_t {

+     SLAPI_FILTER_LDAPSUBENTRY = 1,

+     SLAPI_FILTER_TOMBSTONE = 2,

+     SLAPI_FILTER_RUV = 4,

+     SLAPI_FILTER_NORMALIZED_TYPE = 8,

+     SLAPI_FILTER_NORMALIZED_VALUE = 16,

+     SLAPI_FILTER_INVALID_ATTR = 32,

+ } slapi_filter_flags;

+ 

+ #define SLAPI_ENTRY_LDAPSUBENTRY 2

+ 

  

  /*

      Optimized filter path. For example the following code was lifted from int.c (syntaxes plugin):
@@ -1447,6 +1452,9 @@ 

  struct slapi_entry *slapi_pblock_get_pw_entry(Slapi_PBlock *pb);

  void slapi_pblock_set_pw_entry(Slapi_PBlock *pb, struct slapi_entry *entry);

  

+ uint32_t slapi_pblock_get_operation_notes(Slapi_PBlock *pb);

+ void slapi_pblock_set_operation_notes(Slapi_PBlock *pb, uint32_t opnotes);

+ void slapi_pblock_set_flag_operation_notes(Slapi_PBlock *pb, uint32_t opflag);

  

  #ifdef __cplusplus

  }

@@ -945,7 +945,7 @@ 

      :type instance: lib389.DirSrv

      """

  

-     def __init__(self, instance):

+     def __init__(self, instance, basedn=""):

          self._childobject = DSLdapObject

          self._instance = instance

          super(DSLdapObjects, self).__init__(self._instance.verbose)
@@ -953,7 +953,7 @@ 

          self._filterattrs = []

          self._list_attrlist = ['dn']

          # Copy this from the child if we need.

-         self._basedn = ""

+         self._basedn = basedn

          self._scope = ldap.SCOPE_SUBTREE

          self._server_controls = None

          self._client_controls = None

@@ -99,6 +99,7 @@ 

  l: {LOCATION}

  ou: {OU}

  mail: {UID}@example.com

+ mail: {UIDNUMBER}@example.com

  postalAddress: 518,  Dept #851, Room#{OU}

  title: {TITLE}

  usercertificate;binary:: MIIBvjCCASegAwIBAgIBAjANBgkqhkiG9w0BAQQFADAnMQ8wDQYD
@@ -168,6 +169,7 @@ 

              output.write(DBGEN_TEMPLATE.format(

                  DN=dn,

                  UID=uid,

+                 UIDNUMBER=i,

                  FIRST=first,

                  LAST=last,

                  INITIALS=initials,

@@ -40,6 +40,10 @@ 

  

  ds_paths = Paths()

  

+ # We need this to decide if we should remove after a failed install - useful

+ # for tests ONLY, which is why it's still an env flag.

+ DEBUGGING = os.getenv('DEBUGGING', default=False)

+ 

  

  def get_port(port, default_port, secure=False):

      # Get the port number for the interactive installer and validate it
@@ -653,8 +657,11 @@ 

              try:

                  self._install_ds(general, slapd, backends)

              except ValueError as e:

-                 self.log.fatal("Error: " + str(e) + ", removing incomplete installation...")

-                 self._remove_failed_install(slapd['instance_name'])

+                 if DEBUGGING is False:

+                     self.log.fatal("Error: " + str(e) + ", removing incomplete installation...")

+                     self._remove_failed_install(slapd['instance_name'])

+                 else:

+                     self.log.fatal("Error: " + str(e) + ", preserving incomplete installation for analysis...")

                  raise ValueError("Instance creation failed!")

  

              # Call the child api to do anything it needs.

file modified
+40 -13
@@ -66,7 +66,7 @@ 

          digits = len('%s' % max)

  

          cmd = [

-             '%s/bin/ldclt' % self.ds.prefix,

+             '%s/ldclt' % self.ds.get_bin_dir(),

              '-h',

              self.ds.host,

              '-p',
@@ -94,11 +94,25 @@ 

              raise(e)

          self.log.debug(result)

  

+     def _run_ldclt(self, cmd):

+         result = None

+         self.log.debug("ldclt loadtest ...")

+         self.log.debug(format_cmd_list(cmd))

+         try:

+             result = subprocess.check_output(cmd)

+         # If verbose, capture / log the output.

+         except subprocess.CalledProcessError as e:

+             print(format_cmd_list(cmd))

+             print(result)

+             raise(e)

+         self.log.debug(result)

+         return result

+ 

      def bind_loadtest(self, subtree, min=1000, max=9999, rounds=3):

          # The bind users will be uid=userXXXX

          digits = len('%s' % max)

          cmd = [

-             '%s/bin/ldclt' % self.ds.prefix,

+             '%s/ldclt' % self.ds.get_bin_dir(),

              '-h',

              self.ds.host,

              '-p',
@@ -114,14 +128,27 @@ 

              '-e',

              'bindonly',

          ]

-         result = None

-         self.log.debug("ldclt loadtest ...")

-         self.log.debug(format_cmd_list(cmd))

-         try:

-             result = subprocess.check_output(cmd)

-         # If verbose, capture / log the output.

-         except subprocess.CalledProcessError as e:

-             print(format_cmd_list(cmd))

-             print(result)

-             raise(e)

-         self.log.debug(result)

+         self._run_ldclt(cmd)

+ 

+     def search_loadtest(self, subtree, fpattern, min=1000, max=9999, rounds=3):

+         digits = len('%s' % max)

+         cmd = [

+             '%s/ldclt' % self.ds.get_bin_dir(),

+             '-h',

+             self.ds.host,

+             '-p',

+             '%s' % self.ds.port,

+             '-N',

+             '%s' % rounds,

+             '-f',

+             fpattern,

+             '-e',

+             'esearch,random',

+             '-r%s' % min,

+             '-R%s' % max,

+             '-I',

+             '32',

+             '-e',

+             'randomattrlist=cn:uid:ou',

+         ]

+         self._run_ldclt(cmd)

@@ -0,0 +1,120 @@ 

+ /** BEGIN COPYRIGHT BLOCK

+  * Copyright (C) 2019 William Brown <william@blackhats.net.au>

+  * All rights reserved.

+  *

+  * License: GPL (version 3 or any later version).

+  * See LICENSE for details.

+  * END COPYRIGHT BLOCK **/

+ 

+ #include "../../test_slapd.h"

+ 

+ #include <slap.h>

+ #include <proto-slap.h>

+ #include <string.h>

+ 

+ /*

+  * This is a test-only function, that is able to generate test, mock

+  * asyntaxinfo's for us.

+  */

+ 

+ static struct asyntaxinfo *

+ attr_syntax_add_from_name(char *name, char *oid) {

+ 

+     char *names[2] = {0};

+     names[0] = name;

+ 

+     /* This is apparently leaking the allocated struct, but not its members ... huh?  */

+     struct asyntaxinfo *asi = NULL;

+ 

+     attr_syntax_create(

+         oid, // attr_oid

+         names, // attr_names

+         "testing attribute type",

+         NULL, // attr_supe

+         NULL, // attr eq

+         NULL, // attr order

+         NULL, // attr sub

+         NULL, // exten

+         DIRSTRING_SYNTAX_OID, // attr_syntax

+         SLAPI_SYNTAXLENGTH_NONE,// syntaxlen

+         SLAPI_ATTR_FLAG_STD_ATTR | SLAPI_ATTR_FLAG_OPATTR, // flags

+         &asi // syntaxinfo **, where the function returns data to.

+     );

+ 

+     assert_true(attr_syntax_add(asi, 0) == 0);

+ 

+     return asi;

+ }

+ 

+ Slapi_Filter_Result

+ validate_filter(char *fstr, Slapi_Filter_Policy policy) {

+ 

+     char fdup[256] = {0};

+     strcpy(fdup, fstr);

+     struct slapi_filter *f = slapi_str2filter(fdup);

+     assert_true(f != NULL);

+ 

+     Slapi_Filter_Result r = slapi_filter_schema_check(f, policy);

+     // Based on policy, we could assert if flags are set.

+ 

+     slapi_filter_free(f, 1);

+     return r;

+ }

+ 

+ 

+ void

+ test_libslapd_schema_filter_validate_simple(void **state __attribute__((unused)))

+ {

+     /* Setup a fake schema */

+     attr_syntax_write_lock();

+ 

+     /* We'll add two attributes, test_a and test_b */

+     struct asyntaxinfo *a = attr_syntax_add_from_name("test_a", "1.1.0.0.0.0.1");

+     struct asyntaxinfo *b = attr_syntax_add_from_name("test_b", "1.1.0.0.0.0.2");

+ 

+     attr_syntax_unlock_write();

+ 

+     /* Get the read lock ready */

+     attr_syntax_read_lock();

+ 

+     /* Test some simple filters */

+     char *invalid = "(&(non_exist=a)(more_not_real=b))";

+     char *par_valid = "(&(test_a=a)(more_not_real=b))";

+     char *valid = "(&(test_a=a)(test_b=b))";

+     char *valid_case = "(&(Test_A=a)(Test_B=b))";

+ 

+     /* Did they pass given the policy and expectations? */

+ 

+     /* simple error cases */

+     assert_true(slapi_filter_schema_check(NULL, FILTER_POLICY_OFF) == FILTER_SCHEMA_FAILURE);

+ 

+     /* policy off, always success no matter what */

+     assert_true(validate_filter(invalid, FILTER_POLICY_OFF) == FILTER_SCHEMA_SUCCESS);

+     assert_true(validate_filter(par_valid, FILTER_POLICY_OFF) == FILTER_SCHEMA_SUCCESS);

+     assert_true(validate_filter(valid, FILTER_POLICY_OFF) == FILTER_SCHEMA_SUCCESS);

+     /* policy warning */

+     assert_true(validate_filter(invalid, FILTER_POLICY_WARNING) == FILTER_SCHEMA_WARNING);

+     assert_true(validate_filter(par_valid, FILTER_POLICY_WARNING) == FILTER_SCHEMA_WARNING);

+     assert_true(validate_filter(valid, FILTER_POLICY_WARNING) == FILTER_SCHEMA_SUCCESS);

+     /* policy strict */

+     assert_true(validate_filter(invalid, FILTER_POLICY_STRICT) == FILTER_SCHEMA_FAILURE);

+     assert_true(validate_filter(par_valid, FILTER_POLICY_STRICT) == FILTER_SCHEMA_FAILURE);

+     assert_true(validate_filter(valid, FILTER_POLICY_STRICT) == FILTER_SCHEMA_SUCCESS);

+ 

+     /* policy warning, complex filters. Try to exercise all the paths in the parser!! */

+     assert_true(validate_filter("(|(!(b=b))(le<=a)(ge>=a)(sub=*a*)(pres=*))", FILTER_POLICY_WARNING) == FILTER_SCHEMA_WARNING);

+ 

+     /* Check that attribute name matching is case-insensitive */

+     assert_true(validate_filter(valid_case, FILTER_POLICY_WARNING) == FILTER_SCHEMA_SUCCESS);

+ 

+ 

+     attr_syntax_unlock_read();

+ 

+     /* Cleanup */

+     attr_syntax_write_lock();

+     attr_syntax_delete(a, 0);

+     attr_syntax_delete(b, 0);

+     attr_syntax_unlock_write();

+ 

+     return;

+ }

file modified
+2
@@ -1,5 +1,6 @@ 

  /** BEGIN COPYRIGHT BLOCK

   * Copyright (C) 2017 Red Hat, Inc.

+  * Copyright (C) 2019 William Brown <william@blackhats.net.au>

   * All rights reserved.

   *

   * License: GPL (version 3 or any later version).
@@ -25,6 +26,7 @@ 

          cmocka_unit_test(test_libslapd_pblock_v3c_target_sdn),

          cmocka_unit_test(test_libslapd_pblock_v3c_original_target_dn),

          cmocka_unit_test(test_libslapd_pblock_v3c_target_uniqueid),

+         cmocka_unit_test(test_libslapd_schema_filter_validate_simple),

          cmocka_unit_test(test_libslapd_operation_v3c_target_spec),

          cmocka_unit_test(test_libslapd_counters_atomic_usage),

          cmocka_unit_test(test_libslapd_counters_atomic_overflow),

file modified
+4
@@ -1,5 +1,6 @@ 

  /** BEGIN COPYRIGHT BLOCK

   * Copyright (C) 2017 Red Hat, Inc.

+  * Copyright (C) 2019 William Brown <william@blackhats.net.au>

   * All rights reserved.

   *

   * License: GPL (version 3 or any later version).
@@ -35,6 +36,9 @@ 

  void test_libslapd_pblock_v3c_original_target_dn(void **state);

  void test_libslapd_pblock_v3c_target_uniqueid(void **state);

  

+ /* libslapd-schema-filter-validate */

+ void test_libslapd_schema_filter_validate_simple(void **state);

+ 

  /* libslapd-operation-v3_compat */

  void test_libslapd_operation_v3c_target_spec(void **state);

  

Bug Description: 389 should assert that all attributes in a filter
are present and valid in the schema. If a filter contains attributes
that are not in the schema, this can lead to DoS due to the fall-back to
un-indexed scans, and it can also mask and cover up application and
development issues with queries. For example, the referenced case was
caused by IPA mistakenly searching on an attribute that could never be
satisfied by the ACI/filter. If we had warned about or rejected filters
in this case, we would have quickly communicated to the developer that
they had made a mistake - feedback being a vital component of psychology
and usability theory.

It should be possible to disable this check, because some sites use
things like extensibleObject which, by nature, bypass and violate schema
checks.

Fix Description: TODO

This is currently only a partial start: it implements the test framework to
demonstrate that we have a working filter check against our schema.

https://pagure.io/389-ds-base/issue/50349

Author: William Brown william@blackhats.net.au

Review by: ???

The F29 build fails because of the test:

[  ERROR   ] --- validate_filter(invalid, FILTER_POLICY_WARNING) == FILTER_SCHEMA_WARNING
[   LINE   ] --- test/libslapd/schema/filter_validate.c:95: error: Failure!
[  FAILED  ] test_libslapd_schema_filter_validate_simple

@spichugi yes, that's expected; @tbordaz has answered my question (this code is not in a mergeable state yet).

2 new commits added

  • Tests passing!
  • Thanks to Thierry for finding my memory mistake!
4 years ago

@tbordaz As a follow-up, can you check this to see if my approach to checking whether the filter attributes are in the schema seems reasonable and correct? I've tried to make this a cmocka test so that we can be more confident in the code compared to just a lib389 test - it's also a bit easier to debug. I'd like to know your thoughts on whether this seems reasonable so far. :)

Why not use attr_syntax_get_by_name_locking_optional(name, FALSE, 0)?

I have a bit of a concern about locking the syntax tables here. It could increase contention on them.
Do we really need the lock for a lookup?

The patch looks good, but I am still not convinced of the need to check the filter for unknown attributes.
It is valid to have an unknown attribute in a filter. An LDAP client can create an unindexed search, but if it is by mistake, the client can check notes=U/A. If it is to prevent an attack, why not rely on the required-index parameter?

@tbordaz We need the read lock on the lookup, yes. Because it can be dynamically changed by schema, we really do need the read lock. Given it's an rwlock, I don't think there is much risk here as schema tends to be very static as a structure, and we only take the read path. The risk would actually be on linux that we delay the writer to schema because linux RW locks are reader favouring, not writer favouring.
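
To show the pattern I mean with a tiny standalone sketch (plain pthreads here, not our actual schema locking code - the names are made up for illustration): the per-search check only ever takes the read side, so searches don't block each other, and only a rare schema update takes the write side.

    /* Plain pthreads, not the server's schema code: the per-search check
     * only takes the read side of the rwlock, so concurrent searches do
     * not block each other; only a (rare) schema update takes the write
     * side and briefly excludes readers. */
    #include <pthread.h>
    #include <stdio.h>

    static pthread_rwlock_t schema_lock = PTHREAD_RWLOCK_INITIALIZER;
    static int schema_generation = 0;

    /* Called per search - many of these can hold the read lock at once. */
    static void
    check_filter_against_schema(void)
    {
        pthread_rwlock_rdlock(&schema_lock);
        /* ... look up each filter attribute name in the schema map ... */
        printf("checked against schema generation %d\n", schema_generation);
        pthread_rwlock_unlock(&schema_lock);
    }

    /* Called on a schema modify - rare, and the only writer. */
    static void
    update_schema(void)
    {
        pthread_rwlock_wrlock(&schema_lock);
        schema_generation++;
        pthread_rwlock_unlock(&schema_lock);
    }

    int
    main(void)
    {
        check_filter_against_schema();
        update_schema();
        check_filter_against_schema();
        return 0;
    }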

I think I laid out the justification pretty well for why this is needed:

-- the U/A flags don't indicate the cause of the error (a missing index), and FreeIPA have shown they will ignore these flags when going to production.

-- We should provide appropriate feedback when an erroring query is submitted. The RFC is patently wrong to think "silently allowing" it is a reasonable course of action. It violates reasonable human interaction and psychology principles.

-- it absolutely can cause a denial of service on larger sites, and I have seen this in production both at UofA and in working with GSS at RH.

-- FreeIPA has gone to production, with queries on attributes that do not exist, that collapse to full table scans - because we do not tell people when they are making mistakes. I don't think we can keep blaming people for "holding it wrong" when our server is just so difficult to use and get right.

So it really, really is justified to have this IMO, which is why I'm pursuing it so much - because this is a feature I would LOVE to have if I was a sys-admin over again :)

Anyway, I really do appreciate the review, thank you!

Why not use attr_syntax_get_by_name_locking_optional(name, FALSE, 0)?

Because I want to avoid the reference count cycle that's in that function: I only need to check that the char* is in the map, I don't need to take a reference to it. Basically this saves two atomics in the process, because otherwise I'd need to call release as well.

@tbordaz We need the read lock on the lookup, yes. Because it can be dynamically changed by schema, we really do need the read lock. Given it's an rwlock, I don't think there is much risk here as schema tends to be very static as a structure, and we only take the read path. The risk would actually be on linux that we delay the writer to schema because linux RW locks are reader favouring, not writer favouring.

I agree that updates of the schema are rare and most of the time the RWLock will be available for read. Still, an RWLock has a cost even when it is free (see https://pagure.io/389-ds-base/issue/49873). Do you expect the filter validation to be enabled by default?

I think I laid out the justification pretty well for why this is needed:
-- the U/A flags don't indicate the cause of the error (a missing index), and FreeIPA have shown they will ignore these flags when going to production.

When performance comes onto the table, U/A and etime are important. I agree they do not help much in finding the root cause, but they show where to investigate.

-- We should provide appropriate feedback when an erroring query is submitted. The RFC is patently wrong to think "silently allowing" it is a reasonable course of action. It violates reasonable human interaction and psychology principles.
-- it absolutely can cause a denial of service on larger sites, and I have seen this in production both at UofA and in working with GSS at RH.

Using an unknown attribute can create a DoS. But there are many valid ways to create an unindexed filter (having NOT ('!'), an improbable attr (e.g. 'nsServerMigrationClassname'), an improbable MR (e.g. 'userSMIMECertificate=foo'), ...). Requiring the use of an index looks like a more global approach for preventing an unindexed DoS.

IMHO a benefit of the patch is that it helps diagnosis when a client mistakenly uses a wrong attribute name.

-- FreeIPA has gone to production, with queries on attributes that do not exist, that collapse to full table scans - because we do not tell people when they are making mistakes. I don't think we can keep blaming people for "holding it wrong" when our server is just so difficult to use and get right.

Right.

So it really, really is justified to have this IMO, which is why I'm pursuing it so much - because this is a feature I would LOVE to have if I was a sys-admin over again :)
Anyway, I really do appreciate the review, thank you!

The justification makes sense and the patch looks good. My only concern is about potential contention. Have you run some SRCH load to verify that the filter verification causes no contention on the attribute syntax lock?

@tbordaz We need the read lock on the lookup, yes. Because it can be dynamically changed by schema, we really do need the read lock. Given it's an rwlock, I don't think there is much risk here as schema tends to be very static as a structure, and we only take the read path. The risk would actually be on linux that we delay the writer to schema because linux RW locks are reader favouring, not writer favouring.

I agree that updates of the schema are rare and most of the time the RWLock will be available for read. Still, an RWLock has a cost even when it is free (see https://pagure.io/389-ds-base/issue/49873). Do you expect the filter validation to be enabled by default?

I would like it in warning mode by default (it says that there are missing elements, but we don't reject bad filters).

We have plenty of atomics and locking throughout the codebase. We don't blink at adding another lock in a plugin or another atomic in libglobs.c. I think if we have a concern about locking, we should really rethink our concurrency strategy, because it's a bit "un-planned". Saying this, I'm not too worried about the read lock; the read sides of those locks are very low cost. I would be more worried if this was a mutex .... (I'll still test performance though! I just think sometimes we misplace our concerns onto single mutexes/locks when our issues are structural and in how we use combinations of those constructions :) )

I think I laid out the justification pretty well for why this is needed:
-- the U/A flags don't indicate the cause of the error (a missing index), and FreeIPA have shown they will ignore these flags when going to production.

When performance comes onto the table, U/A and etime are important. I agree they do not help much in finding the root cause, but they show where to investigate.

-- We should provide appropriate feedback when an erroring query is submitted. The RFC is patently wrong to think "silently allowing" it is a reasonable course of action. It violates reasonable human interaction and psychology principles.
-- it absolutely can cause a denial of service on larger sites, and I have seen this in production both at UofA and in working with GSS at RH.

Using an unknown attribute can create a DoS. But there are many valid ways to create an unindexed filter (having NOT ('!'), an improbable attr (e.g. 'nsServerMigrationClassname'), an improbable MR (e.g. 'userSMIMECertificate=foo'), ...). Requiring the use of an index looks like a more global approach for preventing an unindexed DoS.
IMHO a benefit of the patch is that it helps diagnosis when a client mistakenly uses a wrong attribute name.

Yep, feedback certainly is a major element of this patch :) Preventing DoS in some cases is a nice side effect (especially if we can replace a missing attribute with an IDL of length 0 rather than ALLIDS).

This is absolutely true for the improbable attr, but the improbable MR would be rare, I would think ... we should think about how to help in these cases too. :)

-- FreeIPA has gone to production, with queries on attributes that do not exist, that collapse to full table scans - because we do not tell people when they are making mistakes. I don't think we can keep blaming people for "holding it wrong" when our server is just so difficult to use and get right.

Right.

So it really, really is justified to have this IMO, which is why I'm pursuing it so much - because this is a feature I would LOVE to have if I was a sys-admin over again :)
Anyway, I really do appreciate the review, thank you!

The justification makes sense and the patch looks good. My only concern is about potential contention. Have you run some SRCH load to verify that the filter verification causes no contention on the attribute syntax lock?

Well, the patch at the moment is only a proof of concept that I'm looking up attrs and verifying filters correctly - it's only cmocka tests! I wanted expert review to be sure I was approaching the problem the correct way before I went too deep into integration. I am still working on integrating it with SRCH. I'll be sure to run some ldclt tests for you with good/bad filters to be sure :)

I expect I'll have it integrated and working today (I wrote the libglobs part yesterday, as well as the lib389 tests), so we can have a better review shortly.

Thank you!

2 new commits added

  • Finish the feature and all tests cases. Likely needs rebase to filter opt which has many filter and search changes and fixes.
  • temp
4 years ago

@tbordaz I have just pushed the rest of the "now integrated" feature, which works as expected and has a test case. I'd love your review of it (I plan to do a self-review soon to make sure I haven't missed anything).

In addition I'd like to wait for https://pagure.io/389-ds-base/pull-request/50252 to be merged before I merge this because I think they both touch similar areas, and I want the tests from 50252 to be merged so I can run them against this PR. So even if you ack this, I'll probably delay merge until the filter opt PR is merged too.

Thanks!

EDIT: PS - I'm about to set up and test the performance of this now :)

Performed on a 6-core i7 with 32GB of RAM + SSD

OFF (mail=xxx@) lt + idlscan 20k

    DEBUG:lib389: ldclt version 4.23
    ldclt[30924]: Starting at Tue May 21 16:15:32 2019
    ldclt[30924]: Average rate: 5569.20/thr (5569.20/sec), total: 55692
    ldclt[30924]: Average rate: 5347.30/thr (5347.30/sec), total: 53473
    ldclt[30924]: Average rate: 5492.10/thr (5492.10/sec), total: 54921
    ldclt[30924]: Average rate: 5397.30/thr (5397.30/sec), total: 53973
    ldclt[30924]: Average rate: 5413.00/thr (5413.00/sec), total: 54130
    ldclt[30924]: Average rate: 5469.10/thr (5469.10/sec), total: 54691
    ldclt[30924]: Average rate: 5461.30/thr (5461.30/sec), total: 54613
    ldclt[30924]: Average rate: 4843.30/thr (4843.30/sec), total: 48433
    ldclt[30924]: Average rate: 5037.10/thr (5037.10/sec), total: 50371
    ldclt[30924]: Average rate: 5248.30/thr (5248.30/sec), total: 52483
    ldclt[30924]: Number of samples achieved. Bye-bye...
    ldclt[30924]: All threads are dead - exit.
    ldclt[30924]: Global average rate: 53278.00/thr (5327.80/sec), total: 532780
    ldclt[30924]: Global number times "no activity" reports: never
    ldclt[30924]: Global no error occurs during this session.
    ldclt[30924]: Ending at Tue May 21 16:17:12 2019
    ldclt[30924]: Exit status 0 - No problem during execution.
    ldclt[30924]: T005: thread is dead.
    ldclt[30924]: T000: thread is dead.
    ldclt[30924]: T004: thread is dead.

WARN (mail=xxx@) lt + idlscan 20k

    DEBUG:lib389: ldclt version 4.23
    ldclt[30777]: Starting at Tue May 21 16:12:52 2019
    ldclt[30777]: Average rate: 5612.70/thr (5612.70/sec), total: 56127
    ldclt[30777]: Average rate: 5586.00/thr (5586.00/sec), total: 55860
    ldclt[30777]: Average rate: 5380.90/thr (5380.90/sec), total: 53809
    ldclt[30777]: Average rate: 5370.50/thr (5370.50/sec), total: 53705
    ldclt[30777]: Average rate: 5100.10/thr (5100.10/sec), total: 51001
    ldclt[30777]: Average rate: 5368.10/thr (5368.10/sec), total: 53681
    ldclt[30777]: Average rate: 5183.30/thr (5183.30/sec), total: 51833
    ldclt[30777]: Average rate: 5373.20/thr (5373.20/sec), total: 53732
    ldclt[30777]: Average rate: 5188.20/thr (5188.20/sec), total: 51882
    ldclt[30777]: Average rate: 5012.70/thr (5012.70/sec), total: 50127
    ldclt[30777]: Number of samples achieved. Bye-bye...
    ldclt[30777]: All threads are dead - exit.
    ldclt[30777]: Global average rate: 53175.70/thr (5317.57/sec), total: 531757
    ldclt[30777]: Global number times "no activity" reports: never
    ldclt[30777]: Global no error occurs during this session.
    ldclt[30777]: T005: thread is dead.
    ldclt[30777]: Ending at Tue May 21 16:14:32 2019
    ldclt[30777]: Exit status 0 - No problem during execution.
    ldclt[30777]: T007: thread is dead.
    ldclt[30777]: T003: thread is dead.
    ldclt[30777]: T004: thread is dead.
    ldclt[30777]: T008: thread is dead.

OFF (|(mail=xxx@)(nonexist=foo)) lt + idlscan 20k

    DEBUG:lib389: ldclt version 4.23
    ldclt[30483]: Starting at Tue May 21 16:07:38 2019
    ldclt[30483]: Average rate:    7.40/thr (   7.40/sec), total:     74
    ldclt[30483]: Average rate:    8.30/thr (   8.30/sec), total:     83
    ldclt[30483]: Average rate:    8.80/thr (   8.80/sec), total:     88
    ldclt[30483]: Average rate:    9.00/thr (   9.00/sec), total:     90
    ldclt[30483]: Average rate:    8.90/thr (   8.90/sec), total:     89
    ldclt[30483]: Average rate:    8.70/thr (   8.70/sec), total:     87
    ldclt[30483]: Average rate:    8.70/thr (   8.70/sec), total:     87
    ldclt[30483]: Average rate:    9.10/thr (   9.10/sec), total:     91
    ldclt[30483]: Average rate:    8.70/thr (   8.70/sec), total:     87
    ldclt[30483]: Average rate:    8.70/thr (   8.70/sec), total:     87
    ldclt[30483]: Number of samples achieved. Bye-bye...
    ldclt[30483]: All threads are dead - exit.
    ldclt[30483]: Global average rate: 86.30/thr (8.63/sec), total: 863
    ldclt[30483]: Global number times "no activity" reports: never
    ldclt[30483]: Global no error occurs during this session.
    ldclt[30483]: Ending at Tue May 21 16:09:18 2019
    ldclt[30483]: Exit status 0 - No problem during execution.

WARN (|(mail=xxx@)(nonexist=foo)) lt + idlscan 20k

    DEBUG:lib389: ldclt version 4.23
    ldclt[30630]: Starting at Tue May 21 16:10:11 2019
    ldclt[30630]: Average rate: 4808.70/thr (4808.70/sec), total: 48087
    ldclt[30630]: Average rate: 4734.90/thr (4734.90/sec), total: 47349
    ldclt[30630]: Average rate: 4846.30/thr (4846.30/sec), total: 48463
    ldclt[30630]: Average rate: 4819.70/thr (4819.70/sec), total: 48197
    ldclt[30630]: Average rate: 4529.80/thr (4529.80/sec), total: 45298
    ldclt[30630]: Average rate: 4489.10/thr (4489.10/sec), total: 44891
    ldclt[30630]: Average rate: 4327.60/thr (4327.60/sec), total: 43276
    ldclt[30630]: Average rate: 4730.40/thr (4730.40/sec), total: 47304
    ldclt[30630]: Average rate: 4686.30/thr (4686.30/sec), total: 46863
    ldclt[30630]: Average rate: 4575.80/thr (4575.80/sec), total: 45758
    ldclt[30630]: Number of samples achieved. Bye-bye...
    ldclt[30630]: All threads are dead - exit.
    ldclt[30630]: Global average rate: 46548.60/thr (4654.86/sec), total: 465486
    ldclt[30630]: Global number times "no activity" reports: never
    ldclt[30630]: Global no error occurs during this session.
    ldclt[30630]: Ending at Tue May 21 16:11:51 2019
    ldclt[30630]: Exit status 0 - No problem during execution.
    ldclt[30630]: T002: thread is dead.
    ldclt[30630]: T007: thread is dead.
    ldclt[30630]: T003: thread is dead.
    ldclt[30630]: T006: thread is dead.
    ldclt[30630]: T000: thread is dead.
    ldclt[30630]: T005: thread is dead.

There are two tests in here:

  • To show that there is no regression with this on warning compared to off.
  • To show the effect of the warning mode which replaces invalid types with idl_alloc(0).

Summary:

  • There is negligible change between off and warning (warning was in fact slightly better, but I think this could be test noise, as it's a low % change).
  • When warn is on and invalid types are used, the server is orders of magnitude better at handling this as a DoS case (this is dependent on sites setting lookthrough to a high number). Even without this, if lookthrough was low, you are still "shortcutting" by returning faster due to the empty OR member, so we aren't doing work and throwing it away before hitting the admin limit.

Hope this helps!

1 new commit added

  • Add stress case
4 years ago

@firstyear I like this change to use slapi_pblock_set_flag_operation_notes, but it is a different change from checking the attribute against the schema. IMHO it makes the review more complex and I would prefer a separate ticket.
In addition, I think it is a wrong use here: slapi_pblock_set sets the value, while slapi_pblock_set_flag_operation_notes does an OR.
Better to check the other places where you used slapi_pblock_set_flag_operation_notes to confirm that the OR was intended.

I have a doubt here. A component with an unknown attribute results in an idl_allids rather than an empty list.
I have the same doubt at many places of the patch that contain idl_alloc(0)

Just realized that you also created slapi_pblock_set_operation_notes that can be used here

@firstyear, thanks for your performance tests. How many threads did you use for the load (10)?
Do you mind rerunning the test (with the call to slapi_filter_schema_check and without) with more threads than workers, and checking (pstack) whether some of them are in contention on the asi lock during slapi_filter_schema_check?

Just forgot to say... it is a very nice patch ;)

I have a doubt here. A component with an unknown attribute results in an idl_allids rather than an empty list.
I have the same doubt at many places of the patch that contain idl_alloc(0)

I can see that this could be a more discussion heavy part of the change. IMO idl_alloc(0) is the correct thing to return because:

-- If the attribute is not in schema, it is not possible to validly index it due to missing equality, so the only reasonable result is to say nothing matches it.

-- It prevents the DOS case. (|(nonexist=foo)(uid=1000)). uid=1000 should return an IDL of length one, but if we did ALLIDS, this would become an ALLIDS search. This then causes us to do a fulltable scan, hitting the lookthrough limit (which could be in the thousands in some sites - admins do not understand this value and how it applies to the search process from what I've seen of deployments ...). By returning idl_alloc(0), we add a security feature which is that the search is never going to satisfy, preventing ALLIDS.

EDIT: This also is a problem for freeipa where a search or feature in development looks like it is functional, but in production is performing poorly or erroring with admin limits. So "fail fast" means the issue would be resolved by IPA in dev rather than in production (with a lot less finger pointing between 389 and freeipa to say who's at fault).

-- Returning idl_alloc(0) here would be RFC compliant (as much as I think the rfc is wrong here ....). From @abbra's comment:

 I don't think it is realistically possible to enforce that all attributes of a filter are present in schema and reject the filter otherwise. RFC 4511 section 4.5.1.7. states that

     Servers MUST NOT return errors if attribute descriptions or matching rule ids are not recognized, assertion values are invalid, or the assertion syntax is not supported. More details of filter processing are given in Clause 7.8 of [X.511].

and Clause 7.8 of https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-X.511-201610-I!!PDF-E&type=items has this language:

    Any assertion about the values of such an attribute is only defined if the AttributeType is known by the evaluating mechanism, the purported AttributeValue(s) conforms to the attribute syntax defined for that attribute type, the implied or indicated matching rule is applicable to that attribute type, and (when used) a presented matchValue conforms to the syntax defined for the indicated matching rules. When these conditions are not met, the FilterItem shall evaluate to the logical value UNDEFINED.
    An assertion which is defined by these conditions additionally evaluates to UNDEFINED if it relates to an attribute value and the attribute type is not present in an attribute against which the assertion is being tested. An assertion which is defined by these conditions and relates to the presence of an attribute type evaluates to FALSE.

The tl;dr being - if the attribute type is not known then it is undefined, and evals to false - idl_alloc(0) in our case because no candidate can satisfy a false requirement.

For sites relying on this behaviour for extensibleObject, they can set this feature to "off", which of course disables the warning and the idl_alloc(0) behaviour. Remember, the "on" behaviour is strict, which rejects the filter outright (default is warn).

IMO the rfc is wrong because the filter should never be accepted in the first place - in William's perfect world I would default to "strict" - but we do not live in my perfect world, so I chose to default to warn, which is RFC compliant.

So I think these are the reasons and justifications for the idl_alloc(0) rather than ALLIDS here.

@firstyear I like this change to use slapi_pblock_set_flag_operation_notes, but it is a different change from checking the attribute against the schema. IMHO it makes the review more complex and I would prefer a separate ticket.
In addition, I think it is a wrong use here: slapi_pblock_set sets the value, while slapi_pblock_set_flag_operation_notes does an OR.
Better to check the other places where you used slapi_pblock_set_flag_operation_notes to confirm that the OR was intended.

This is actually really important to use the set_flag instead - if the filter was:

((mail=foo)(nonexist=bar))

Assume mail is unindexed, and nonexist is not in schema. As a result, after ava_candidates opnotes is OP_NOTE_UNINDEXED due to mail.

If we used opnote set, we would CLEAR the unindexed flag, then replace with OP_NOTE_FILTER_INVALID, losing the notes=U.

By using setflag, we keep both flags in the opnotes, building the opnotes to a set of flags instead. So this is actually a really key change to the api and process here.
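
A tiny standalone sketch of the difference (the flag values here are made up for illustration; the real constants are the SLAPI_OP_NOTE_* defines in the slapi headers):

    /* Illustrative flag values only - not the real SLAPI_OP_NOTE_* constants. */
    #include <assert.h>
    #include <stdint.h>

    #define NOTE_UNINDEXED      0x01u
    #define NOTE_FILTER_INVALID 0x08u

    int
    main(void)
    {
        uint32_t notes = 0;

        /* set_flag style: each stage ORs its note in, so nothing is lost. */
        notes |= NOTE_UNINDEXED;       /* e.g. set while resolving (mail=foo)  */
        notes |= NOTE_FILTER_INVALID;  /* later set for the missing attribute  */
        assert(notes == (NOTE_UNINDEXED | NOTE_FILTER_INVALID));

        /* plain set style: assignment clobbers the earlier notes=U flag. */
        notes = NOTE_FILTER_INVALID;
        assert(!(notes & NOTE_UNINDEXED));
        return 0;
    }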

Maybe it's also my perfectionist nature, but the get/set I wrote is better to avoid a set of dereferences and a case-switch that is 'costly' (it probably isn't costly at all, but even 0.01% of a million is 100. It adds up :)

Thank you for your awesome review, I have addressed your other comments in the code, and I'm going to expand a few code comments to help make some other questions clearer too. Your other points have not been ignored!

1 new commit added

  • thierrys updates
4 years ago

I have a doubt here. A component with an unknown attribute results in an idl_allids rather than an empty list.
I have the same doubt at many places of the patch that contain idl_alloc(0)

I can see that this could be a more discussion heavy part of the change. IMO idl_alloc(0) is the correct thing to return because:
-- If the attribute is not in schema, it is not possible to validly index it due to missing equality, so the only reasonable result is to say nothing matches it.
-- It prevents the DOS case. (|(nonexist=foo)(uid=1000)). uid=1000 should return an IDL of length one, but if we did ALLIDS, this would become an ALLIDS search. This then causes us to do a fulltable scan, hitting the lookthrough limit (which could be in the thousands in some sites - admins do not understand this value and how it applies to the search process from what I've seen of deployments ...). By returning idl_alloc(0), we add a security feature which is that the search is never going to satisfy, preventing ALLIDS.

Okay, but with a filter like (&(nonexist=foo)(uid=1000)), because of idl_alloc(0) this filter will return no candidates, although one exists.

EDIT: This also is a problem for freeipa where a search or feature in development looks like it is functional, but in production is performing poorly or erroring with admin limits. So "fail fast" means the issue would be resolved by IPA in dev rather than in production (with a lot less finger pointing between 389 and freeipa to say who's at fault).
-- Returning idl_alloc(0) here would be RFC compliant (as much as I think the rfc is wrong here ....). From @abbra's comment:
I don't think it is realistically possible to enforce that all attributes of a filter are present in schema and reject the filter otherwise. RFC 4511 section 4.5.1.7. states that

 Servers MUST NOT return errors if attribute descriptions or matching rule ids are not recognized, assertion values are invalid, or the assertion syntax is not supported. More details of filter processing are given in Clause 7.8 of [X.511].

and Clause 7.8 of https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-X.511-201610-I!!PDF-E&type=items has this language:

Any assertion about the values of such an attribute is only defined if the AttributeType is known by the evaluating mechanism, the purported AttributeValue(s) conforms to the attribute syntax defined for that attribute type, the implied or indicated matching rule is applicable to that attribute type, and (when used) a presented matchValue conforms to the syntax defined for the indicated matching rules. When these conditions are not met, the FilterItem shall evaluate to the logical value UNDEFINED.
An assertion which is defined by these conditions additionally evaluates to UNDEFINED if it relates to an attribute value and the attribute type is not present in an attribute against which the assertion is being tested. An assertion which is defined by these conditions and relates to the presence of an attribute type evaluates to FALSE.

The tl;dr being - if the attribute type is not known then it is undefined, and evals to false - idl_alloc(0) in our case because no candidate can satisfy a false requirement.

I think that is where I am confused. The filter is used to build a candidate list and then to match a candidate before returning it.
To build a candidate list there is no 'undefined'; there are only lists to merge/intersect.
Later, during entry matching, I agree that an unknown attribute may land in 'undefined'.

For sites relying on this behaviour for extensibleObject, they can set this feature to "off", which of course disables the warning and the idl_alloc(0) behaviour. Remember, the "on" behaviour is strict, which rejects the filter outright (default is warn).
IMO the rfc is wrong because the filter should never be accepted in the first place - in William's perfect world I would default to "strict" - but we do not live in my perfect world, so I chose to default to warn, which is RFC compliant.
So I think these are the reasons and justifications for the idl_alloc(0) rather than ALLIDS here.

@firstyear I like this change to use slapi_pblock_set_flag_operation_notes, but it is a different change from checking the attribute against the schema. IMHO it makes the review more complex and I would prefer a separate ticket.
In addition, I think it is a wrong use here: slapi_pblock_set sets the value, while slapi_pblock_set_flag_operation_notes does an OR.
Better to check the other places where you used slapi_pblock_set_flag_operation_notes to confirm that the OR was intended.

This is actually really important to use the set_flag instead - if the filter was:
((mail=foo)(nonexist=bar))

Assume mail is unindexed, and nonexist is not in schema. As a result, after ava_candidates opnotes is OP_NOTE_UNINDEXED due to mail.
If we used opnote set, we would CLEAR the unindexed flag, then replace with OP_NOTE_FILTER_INVALID, losing the notes=U.
By using setflag, we keep both flags in the opnotes, building the opnotes to a set of flags instead. So this is actually a really key change to the api and process here.
Maybe it's also my perfectionist nature, but the get/set I wrote is better to avoid a set of dereferences and a case-switch that is 'costly' (it probably isn't costly at all, but even 0.01% of a million is 100. It adds up :)
Thank you for your awesome review, I have addressed your other comments in the code, and I'm going to expand a few code comments to help make some other questions clearer too. Your other points have not been ignored!

Okay, but with a filter like (&(nonexist=foo)(uid=1000)), because of idl_alloc(0) this filter will return no candidates, although one exists.

But it can't exist! nonexist is not in schema so it would never be valid to add to an object, so this query can never match.

And if it is on the object, you must be using extensibleObject, so you turn this setting "off".

The tl;dr being - if the attribute type is not known then it is undefined, and evals to false - idl_alloc(0) in our case because no candidate can satisfy a false requirement.

I think that is where I am confused. The filter is used to build a candidate list and then to match a candidate before returning it.
To build a candidate list there is no 'undefined'; there are only lists to merge/intersect.
Later, during entry matching, I agree that an unknown attribute may land in 'undefined'.

The filter evaluates to undefined if the requested attribute is not in the schema, and undefined causes the filter to evaluate to false. Therefore the candidate list that filter component yields must be idl_alloc(0), because false can never be satisfied. That's what the RFC is saying.

So as a result, (|(nonexist=foo)(uid=william)) would yield only "uid=william", because the condition nonexist=foo always evaluates to false, aka the empty set. If this was an & condition, nothing could possibly match, because it's not possible for uid=william to have nonexist as an attribute, as it's not valid in the schema.

Of course, extensibleObject ruins and trashes all of this, but that's why there is a tunable to disable it for those people who choose to use EO on their main backend.

EDIT: Go back to the freeipa certmap case. Because their filter was:

(&
  (|
    (non-existant-ad-attribute=...)
    (certSubjectDn=something...)
  )
  (objectClass=ipaaccount)
)

Currently, this would ALWAYS evaluate to the candidate set of "objectClass=ipaaccount", which must then all be filter tested. That's because the non-existant-ad-attribute becomes ALLIDS, and an ALLIDS in an OR must return ALLIDS. So then objectClass=ipaaccount is the only matching filter/index! For a large enough site, if the lookthroughlimit was low, this would cause all certmap queries to suddenly fail.

Instead, if you have non-existant-ad return idl_alloc(0), the certSubjectDn query can now be satisfied, which then can be ANDed with the objectClass=ipaaccount. We get the proper result, always, regardless of the database scale. (Throw in filter optimisation/test threshold, we may not even load the objectClass=ipaaccount index at all!).

So I stand by the idl_alloc(0) being the right decision here as part of filter validation, both in practical aspects to resolve active FreeIPA issues, and to conform to the RFC.
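
To make the set arithmetic concrete, here is a tiny standalone toy (not the server's IDL code - it just models candidate list sizes, with SIZE_MAX standing in for ALLIDS) showing why ALLIDS poisons an OR while an empty list does not:

    /* A toy model of candidate-list sizes only, not the server's IDL code.
     * SIZE_MAX stands in for ALLIDS; every other value is a list length. */
    #include <stdint.h>
    #include <stdio.h>

    #define TOY_ALLIDS SIZE_MAX

    static size_t
    toy_or(size_t a, size_t b)
    {
        /* A union containing ALLIDS is ALLIDS; otherwise at most a + b ids. */
        if (a == TOY_ALLIDS || b == TOY_ALLIDS) {
            return TOY_ALLIDS;
        }
        return a + b;
    }

    static size_t
    toy_and(size_t a, size_t b)
    {
        /* An intersection is never larger than its smallest operand. */
        return a < b ? a : b;
    }

    int
    main(void)
    {
        size_t uid_idx = 1; /* (uid=1000) resolves to one candidate via its index */

        /* (|(nonexist=foo)(uid=1000)) */
        printf("%zu\n", toy_or(TOY_ALLIDS, uid_idx)); /* ALLIDS sentinel: full scan */
        printf("%zu\n", toy_or(0, uid_idx));          /* 1: only the real candidate */

        /* (&(nonexist=foo)(uid=1000)) */
        printf("%zu\n", toy_and(0, uid_idx));         /* 0: can never match */
        return 0;
    }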

@firstyear thanks for your explanations. So if I understand correctly, the 'unknown attribute' filter check can fail if some entries are extensibleObject and contain unknown attributes. Correct?

In that case, it looks important to me that the default config value is SLAPI_WARN (like it is currently).

So my last concern is the contention on the attribute syntax lock. If your tests (with many threads) show no contention on it, I am okay with the patch.

thanks

@firstyear thanks for your explanations. So if I understand correctly, the 'unknown attribute' filter check can fail if some entries are extensibleObject and contain unknown attributes. Correct?

Yes - if you have an extensibleObject with attribute "foo=bar", and you did a filter of (foo=bar), this would yield no results as foo is not part of schema, so must evaluate to false (undefined).

In that case, it looks important to me that the default config value is SLAPI_WARN (like it is currently).

To be 100% clear:

  • off -> do nothing
  • warn -> warn that you have a schema-missing filter component AND evaluate that component to undefined
  • strict -> reject the filter if a schema-missing filter component is detected

To be sure there is no confusion about warn vs strict here :)
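
As a rough sketch of how those three modes could map onto the policy values the cmocka test above exercises (the accepted config strings here are illustrative assumptions; the real parsing lives in libglobs.c and may differ):

    /* The enumerators mirror the values passed to slapi_filter_schema_check()
     * in the cmocka test; the typedef name and config strings are assumptions. */
    #include <stdio.h>
    #include <strings.h>

    typedef enum {
        FILTER_POLICY_OFF,
        FILTER_POLICY_WARNING,
        FILTER_POLICY_STRICT
    } example_filter_policy_t;

    static example_filter_policy_t
    policy_from_config(const char *value)
    {
        if (value == NULL || strcasecmp(value, "off") == 0) {
            return FILTER_POLICY_OFF;     /* do nothing */
        }
        if (strcasecmp(value, "strict") == 0) {
            return FILTER_POLICY_STRICT;  /* reject the filter outright */
        }
        return FILTER_POLICY_WARNING;     /* warn and evaluate the component to undefined */
    }

    int
    main(void)
    {
        printf("%d\n", policy_from_config("warn")); /* prints 1 (FILTER_POLICY_WARNING) */
        return 0;
    }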

So my last concern is the contention on the attribute syntax lock. If your tests (with many threads) show no contention on it, I am okay with the patch.
thanks

I need to re-run the test with the pstacks to be 100% sure, but the initial results did not seem to display any contention. But I'll run again with pstacks for you :)

Yes - if you have an extensibleObject with attribute "foo=bar", and you did a filter of (foo=bar), this would yield no results as foo is not part of schema, so must evaluate to false (undefined).

In that case, it looks important to me that the default config value is SLAPI_WARN (like it is currently).

To be 100% clear:

off -> do nothing
warn -> warn that you have a schema-missing filter component AND evaluate that component to undefined

An admin may not be aware that the DB contains extensibleObject entries with invalid attributes.
I see two options in WARN mode:

  • log a warning that the filter contains an unknown attribute and that the set of returned entries may be incomplete
  • log a warning that the filter contains an unknown attribute and that it could be the reason for an unindexed search (perf issue).

So I think you are suggesting:

off -> do nothing
warn -> warn that there is an unknown attr that may cause a perf issue
warn-strict -> warn there is unknown, and also set it to undefined so some results may be missing
strict -> reject the filter

Is that correct?

I still think that warn-strict should be the default to be RFC compliant, and because without "feedback" such as missing results, people won't be inclined to look and work out "why" something is happening.

We agreed that @tbordaz would contact some people internal to RH to ask about the correct default value for this, but after discussion between @tbordaz, @lkrispen and me, we agreed that warn-strict (the current warn) is the correct option, as most current situations work this way and we gain a security benefit. Any disruption to users would be limited to very odd situations.

In a meeting I noted @mreynolds' suggestions:

  • Add logconv.pl support
  • It may be possible to add extended text for notes=F to describe the problem. It could be worth adding extra text for A and U at the same time!

@mreynolds I may need your help for logconv.pl :S I can't seem to work out where the line for notes=U/A/F is matched in it :( my perl skills are basically non-existent. Where should I look in the file?

Okay, and a follow-up - looking at result.c, I can see the note2str and notemap, but the mechanism for injecting extra messages into the access log is not obvious. Is it perhaps the PB_RESULT_TEXT you mean? I think that also may not really be correct to use here ....

Okay, and a follow-up - looking at result.c, I can see the note2str and notemap, but the mechanism for injecting extra messages into the access log is not obvious. Is it perhaps the PB_RESULT_TEXT you mean? I think that also may not really be correct to use here ....

What I was thinking of is what we do in bind.c --> log_bind_access(). So I guess it's not "built" into the logging framework.

@mreynolds Ahhh okay! I see. So in that case, something like

if (notes & SLAPI_OP_NOTE_FILTER_INVALID)
{ log_bind_access(" .... your filter is invalid to schema"); }

Is this what you had in mind?

@mreynolds Ahhh okay! I see. So in that case, something like
if (notes & SLAPI_OP_NOTE_FILTER_INVALID)
{ log_bind_access(" .... your filter is invalid to schema"); }

Is this what you had in mind?

Actually something like:

{ log_bind_access("... notes=F - filter attribute (marksInvalidAttribute) is not in the schema"); }

Not sure how easy it would be to pass that information, but it would be nice :-)

Well, in that case we could just change the notemap to have "F" -> "F - filter attribute not found"

I think it may be hard to pass the filter attribute that wasn't found to this, because they could be 1 to N, so perhaps that's a V2? Or we wait for the richer logging to be available and then log which attributes are invalid inside of the filter processing/application to the idl? I'm just worried it could be a bit hacky to try and get the filter elements that are invalid here, but I also could be overthinking it ... your intent to provide more clarity about what is invalid is an awesome idea of course!

Actually something like:
{ log_bind_access("... notes=F - filter attribute (marksInvalidAttribute) is not in the schema"); }

And/or state that the filter component is ignored?

Would you accept logging to the access log in the filter processing/validation step, and having a unique log line to declare which elements will be ignored, rather than putting it in the results line?

Well, in that case we could just change the notemap to have "F" -> "F - filter attribute not found"
I think it may be hard to pass the filter attribute that wasn't found to this, because they could be 1 to N, so perhaps that's a V2? Or we wait for the richer logging to be available and then log which attributes are invalid inside of the filter processing/application to the idl?

Yeah, it makes sense to do it in the new format. Otherwise you would have to use thread storage or, like you said, something hacky to pass that info along.

@mreynolds I may need your help for logconv.pl :S I can't seem to work out where the line for notes=U/A/F is matched in it :( my perl skills are basically non-existent. Where should I look in the file?

It's there, and it won't be trivial to add. Just look for:

    if (m/ notes=[A-Z,]*A/){

You would need matching code for:

    if (m/ notes=[A-Z,]*F/){

It would require adding new database hashes, and other little things. If you want I can do this for you once we get the logging format agreed upon.

Well, at that point we have the pblock, so we can access the filter, and that has the flags associated with the filter elements. So we could do something like "filter2undefined(Slapi_PBlock *pb, char **buf) {}" and then have it such that:

if (notes & SLAPI_OF_NOTES_FILTER) {
    char *buf = NULL;
    filter2undefined(pb, &buf);
    log_access("undefined by schema -> %s", buf);
    slapi_free(&buf);
}

But I still think I would log the "undefined" elements on a separate log line, rather than reusing the results line. If we used a separate log line, we could easily have calls to slapi_log_access inside filter.c, as we process the filter, to say "warning, op/conn this filter has undefined elements".

Thoughts?

I think I would appreciate the help with logconv, but as you say, let's get the format sorted. :)

What would the new access log lines look like? I think it might be best to add a new access log level to write these fine grained details. So have a more generic message for "notes=F - filter has invalid schema", then we get to the details with the new log level. Thoughts?

I'd say "notes=FilterContainsUndefined", and then the new log level with the details about it?

@tbordaz would like to keep this as notes=F, so maybe instead we can extend notesmap to have a third field like details, and then we can just put those into the buffer as well?

Something like "notes=F,U details="Filter Contains Undefined, Unindexed""

@firstyear, yes I like the idea of details field.

rebased onto c2d0fff8c48137c7bed991e793d1286eecb2531e

4 years ago
[04/Jul/2019:12:04:58.255243133 +1000] conn=1 op=12 RESULT err=0 tag=101 nentries=13 etime=0.0002102208 notes=U details="Partially Unindexed Filter"
[04/Jul/2019:12:05:06.783574304 +1000] conn=1 op=6 RESULT err=0 tag=101 nentries=0 etime=0.0001999146 notes=F details="Filter Element Missing From Schema"

Here is an example of what it looks like. I've pushed the changes to the branch too.

The details are looking good.
Since you managed to add a detail msg for notes=U, would it be possible to also add one for notes=A ("Fully Unindexed Filter")?

The details are looking good.
Since you managed to add a detail msg for notes=U, would it be possible to also add one for notes=A ("Fully Unindexed Filter")?

https://pagure.io/389-ds-base/pull-request/50379#_16__12

See this section of the diff :)

@tbordaz Any other comments? I did put in the notes= and details as requested :)

rebased onto ba330888590f19a01d69d71c56931b68117a2c4b

4 years ago

@firstyear, no other comments on my side. Thanks. You have my ACK.

rebased onto 43f7b99

4 years ago

Thanks @tbordaz for your great help and reviews on this, and the very thoughtful and helpful discussions! I really appreciate it.

Pull-Request has been merged by firstyear

4 years ago

There is a regression: even with the default "warn" we hit this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1759709

The BZ contains an IPA test case and a simple test case in comment #3.
Setting nsslapd-verify-filter-schema: off returns the result.

Okay, so we need to fix this to handle tagged attributes then.

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This pull request has been cloned to Github as issue and is available here:
- https://github.com/389ds/389-ds-base/issues/3438

If you want to continue to work on the PR, please navigate to the github issue,
download the patch from the attachments and file a new pull request.

Thank you for understanding. We apologize for all inconvenience.

Pull-Request has been closed by spichugi

3 years ago