We are using 389ds v1.3.2.24 in a multi-master replication (MMR) triangle topology (3 servers, each replicating to the other 2).
The bug is easily reproduced: take an entry that does not have an attribute ("seeAlso", for example) and apply the following mod:
{{{
version: 1

dn: uid=andrey.ivanov,ou=Personnel,ou=Utilisateurs,dc=id,dc=polytechnique,dc=edu
changetype: modify
replace: seeAlso
-
}}}
Now perform a search with the filter {{{'(&(uid=andrey.ivanov)(seeAlso=*))'}}}. The entry is returned although it should not be: the attribute "seeAlso" does not exist in it.
The search {{{'(seeAlso=*)'}}} does not return the entry, since it uses the seeAlso.db index, which is correctly maintained (the id of the entry is not in seeAlso.db).
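For reference, the two searches can be issued with OpenLDAP's ldapsearch, for example as sketched below (host and bind credentials are placeholders):
{{{
# Returns the entry although seeAlso does not exist (the bug):
ldapsearch -x -H ldap://localhost:389 -D "cn=Directory Manager" -W \
    -b "dc=id,dc=polytechnique,dc=edu" "(&(uid=andrey.ivanov)(seeAlso=*))"

# Does not return the entry, because the seeAlso.db index is consulted:
ldapsearch -x -H ldap://localhost:389 -D "cn=Directory Manager" -W \
    -b "dc=id,dc=polytechnique,dc=edu" "(seeAlso=*)"
}}}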
{{{dbscan -f id2entry.db}}} output for the entry concerned:
{{{
id 7736
rdn: uid=andrey.ivanov
entryusn;adcsn-5472f31e000000010001;vucsn-5472f31e000000010001: 626828
...
seeAlso;deleted:
createTimestamp: 20100315161506Z
...
}}}
The problem does not exist without replication (the line "seeAlso;deleted:" does not appear in that case). If I put some non-empty value into the attribute "seeAlso" and then re-apply the above-mentioned mod, the bug disappears, since the corresponding line in id2entry changes to something like:
{{{
seeAlso;vucsn-5472ea37000000010000;vdcsn-5472ea40000000010000;deletedattribute;deleted: <previous value>
}}}
I've tested it with other attributes (businessCategory, physicalDeliveryOfficeName, etc.), both indexed and unindexed; the behaviour is the same.
Export/import using pure LDIF (i.e. without replication state) does, of course, resolve the problem.
It's a rather critical issue for us, since it gives us false positives for various group updates: entries that should not be returned are returned.
In replication we try to track any delete of an attribute, in order to have the most correct update resolution. So this is done even for a delete of a non-existent attribute, since there could be the following scenario: seeAlso does not exist; on M1 a seeAlso value is added, and concurrently (just before that add is replicated to M2) the non-existent seeAlso attribute is deleted on M2.
If these operations are then replicated, on M1 the now-existent attribute would be removed and on M2 the attribute would exist. So the attempt is to keep track of all operations. Unfortunately, it looks like the deletion of the non-existing attribute is not correctly tracked: a deleted NULL value is recorded instead of a deleted attribute.
As a consequence the attribute is not seen in a search, but if a presence filter is used it incorrectly matches.
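To make the scenario above concrete, here is a minimal sketch of the two concurrent operations as LDIF (the seeAlso value added on M1 is only a placeholder):
{{{
# On M1 (seeAlso does not exist yet): add a value
dn: uid=andrey.ivanov,ou=Personnel,ou=Utilisateurs,dc=id,dc=polytechnique,dc=edu
changetype: modify
add: seeAlso
seeAlso: cn=placeholder-group,dc=id,dc=polytechnique,dc=edu

# On M2, before the add above has replicated: delete the still-absent attribute,
# expressed as the "replace by nothing" mod from this report
dn: uid=andrey.ivanov,ou=Personnel,ou=Utilisateurs,dc=id,dc=polytechnique,dc=edu
changetype: modify
replace: seeAlso
-
}}}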
attachment 0001-Ticket-47964-Incorrect-search-result-after-replacing.patch
Although the fix seems correct, it also seems too simple. This is a very sensitive area of the code and a fix such as this can introduce subtle regressions. How can we thoroughly test this to make sure we don't introduce any regressions?
The code change affects only one scenario: the attribute does not exist and no values are provided. If the attribute exists or is already deleted, it will not get into this code path. If values are provided, then the code was already executed.
The result is a state which we already had to handle in migrations from SunDS (bug 1063990), where we had exactly: {{{svattr;adcsn-XXXX;vdcsn-XXXX;deletedattribute;deleted:}}}
We could not create this state ourselves, but we fixed replication to work in this situation.
Replying to [comment:6 lkrispen]:
Ok.
I've patched the entrywsi.c file and the fix does work. After applying the offending modification, the id2entry of both replicated servers contains
{{{ seeAlso;adcsn-54784a85000200010000;vdcsn-54784a85000200010000;deletedattribute;deleted: }}}
and the corresponding search does not return the entry.
Finally, this patch does fix the original problem, but there is a side effect: crashes of 389ds under certain conditions. I've managed to reproduce the crash with this patch. It happens under the following conditions: replication must be enabled (without replication everything is fine) and a sufficiently large LDIF modify must be performed on an entry, with a lot of "replace by nothing" mods (deleting an already empty attribute). I've checked out the latest git for version 1.3.2, applied this patch (a one-line diff) and ran the test. Without the patch the MOD is OK; with the patch applied, 389ds crashes. If the MOD is small enough, everything is fine; at some critical size (or number of "replace" stanzas in the LDIF) the server starts to crash. I've attached the LDIF MOD generating the crash and the stack trace extracted from the core according to http://www.port389.org/docs/389ds/FAQ/faq.html#sts=Debugging%C2%A0Crashes
The message in the error log:
{{{
[08/Jan/2015:10:12:28 +0100] - entry2str_internal_ext: array boundary wrote: bufsize=17165 wrote=17265
}}}
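For reference, a sketch of the shape of such a mod (the attribute names here are only placeholders; the actual reproducer is the attached large-mod.ldif):
{{{
dn: uid=andrey.ivanov,ou=Personnel,ou=Utilisateurs,dc=id,dc=polytechnique,dc=edu
changetype: modify
replace: seeAlso
-
replace: businessCategory
-
replace: physicalDeliveryOfficeName
-
replace: departmentNumber
-
}}}
... and so on, with many more "replace by nothing" stanzas on attributes the entry does not have.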
LDIF with MODs to crash the patched 389ds: large-mod.ldif
Postmortem stack trace from the patched server's core: stacktrace-crash.txt
I will look into this. To make life easier, could you provide the schema files used to create/modify the entries and a sample entry to apply the mods to? Thanks, Ludwig
Custom schema: X-schema.ldif
Entry to modify in order to reproduce the crash: entry-to-modify.ldif
Ok, it's done. Schema and the entry are attached. I don't think it has anything to do with the schema. It could be related to attribute options (we use them a lot). However, the number of mods in one operation matters: for other users with a smaller number of modifications everything is fine.
I agree, it has nothing to do with the schema; it just lets me apply your test mod without problems. The problem is probably that we now write more data ("deletedattribute;deleted") and the function that calculates the buffer size before writing does not take this into account, so the allocation is too small.
attachment 0001-Ticket-47964-v2-Incorrect-search-result-after-replac.patch
Reproduced and fixed with the attached patch. Could you verify that it works for you?
The reason was that the space for "deletedattribute;deleted" was only allocated if the attribute had values, but the dummy value was added after this calculation.
It looks like the existing space calculation already allocates more than is actually required, so the problem only becomes visible if there are many of these empty attributes.
Yes, it works for me; I've updated our production servers. Thank you, Ludwig!
Committed to master:
{{{
 ldap/servers/slapd/entry.c    | 2 ++
 ldap/servers/slapd/entrywsi.c | 8 ++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)
}}}
New commit: 3182d2e

1.3.3:
{{{
 ldap/servers/slapd/entry.c    | 2 ++
 ldap/servers/slapd/entrywsi.c | 8 ++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)
}}}
New commit: 391539a

1.3.2:
{{{
 ldap/servers/slapd/entry.c    | 2 ++
 ldap/servers/slapd/entrywsi.c | 8 ++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)
}}}
New commit: a379565
Git patch file (1.2.11) -- applying the master patch just to entry2str_internal_size_attrlist (entry.c): 0001-Ticket-47964-1.2.11-Incorrect-search-result-after-re.patch
Pushed to 389-ds-base-1.2.11:
{{{
96b3a5b..d1cf785  389-ds-base-1.2.11 -> 389-ds-base-1.2.11
}}}
commit d1cf785
Raw dump of the entry which crashes the server (size = 1618188 B): huge_entry_snippet.txt
I think the data in comment #5 does not match this bug. For the lastlogintime accumulation we have ticket #48944. We should close this one again as fixed.
Replying to [comment:24 lkrispen]:
I meant comment #22 (not #5).
Ahhh, sorry, Ludwig! It seems I made a mistake (but I don't know why I got confused...) :(
Closing this bug again with the original bug reference.
Metadata Update from @nhosoi:
- Issue assigned to lkrispen
- Issue set to the milestone: 1.2.11.33
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to GitHub and is available here: https://github.com/389ds/389-ds-base/issues/1295
If you want to receive further updates on the issue, please navigate to the GitHub issue and click on the subscribe button.
Thank you for understanding. We apologize for any inconvenience.
Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)