#569 examine replication code to reduce amount of stored state information
Closed: Fixed None Opened 6 years ago by lkrispen.

When frequently updating an attribute, all the deleted values are kept until they eventually are purged. But they "most likely" never will be needed in update resolution.

The task is to review update/conflict resolution to decide if it is "most likely" or "definitely not", so that the deleted values could be immediately purged.
A side effect could be to create a design doc for the current replication protocol and identify other areas for improvement

I looked up old design documents and there is a scenario which requires to keep the deleted values with their deletion csn.
The scenario is quite artificial, but needs to be handled. It involves three masters, where in parrallel on one master a value "v" of the naming attribute is deleted and on the two others there is a modrdn operation on each, the one with the smaller csn makes the value "v" distinguished.
If then the order in which the mods are received are not in csn order there could be an inconsistent state if it is not recorded that "v" was deleted with a csn later than the modrdn operations.

So to handle this situation the deleted values need to be kept until they are purged. Need to look into the calculation of the purge csn and if the purge delay can be reduced

I started to write a doc on entry state resolution, which will explain which data are required, how the current code handles this and what could be optimized.

The current status of the document is here: http://port389.org/wiki/Entry_State_Resolution

The current state of the implementation is in the attached patch file.

What still needs to be done:
- complete the test cases and create test scripts
- verify acceptance test results, some tests fail since they examine deleted values and value csns based on the old algorithm, need to modify these test cases.
- there is one test failing because server seems to crash, need to investigate this
- code also contains fix for ticket #602 by setting subsequence numbers, investigate if this is correct for all usages of csn_compare or if csn_compare needs to be differentiated

verified TET results, failing tests are failing because they expect deleted values which are no longer kept or becasue of the presence of subsequence numberes. If the patch is accepted TET tests should be modified.

1728 valueset_add_string (&a->a_deleted_values, "", CSN_TYPE_VALUE_DELETED, a->a_deletioncsn);
do we need to account for this when calculating the size e.g. for the entry size for the entry cache?

Otherwise, looks good. We will need to test this quite a bit.

This value is added eventually when the entry is written in id2entry_add, it will then added to the cache and its size will be calculated and take this into a account like the other changes to the entry.

yes, testing will go on

git merge ticket569
Updating 79346de..c7f6f16
ldap/servers/slapd/entry.c | 10 ++++
ldap/servers/slapd/entrywsi.c | 553 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------------------------------------------------
ldap/servers/slapd/proto-slap.h | 1 +
ldap/servers/slapd/valueset.c | 9 +++
4 files changed, 378 insertions(+), 195 deletions(-)
$ git push origin master
Counting objects: 17, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (9/9), done.
Writing objects: 100% (9/9), 3.12 KiB, done.
Total 9 (delta 7), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
79346de..c7f6f16 master -> master

Description: commit c7f6f16
made this change:
In case the deleted value list in an attribute is empty:
* this means the entry is deleted and has no more attributes,
* when writing the attr to disk we would loose the AD-csn.
* Add an empty value to the set of deleted values. This will
* never be seen by any client. It will never be moved to the
* present values and is only used to preserve the AD-csn.
The AD-csn size was not counted for the buffer size to allocate.
This patch adds the size.

Reviewed by Nathan (Thank you!!)

Pushed to master: commit ca02529

This breaks single valued attribute updates
dn: cn=test,dc=example,dc=com
objectclass: extensibleObject
employeenumber: value1

then do ldapmodify
changetype: modify
add: employeenumber
employeenumber: value2
delete: employeenumber
employeenumber: value1

The end result is this:
dn: cn=test,dc=example,dc=com
The modify deletes the employeenumber attribute. it should have
employeenumber: value2

$ git merge ticket569-regression
Updating 161243d..b73f1e8
ldap/servers/slapd/entrywsi.c | 2 ++
1 file changed, 2 insertions(+)

$ git push origin master
Counting objects: 11, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (6/6), 625 bytes, done.
Total 6 (delta 4), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
161243d..b73f1e8 master -> master

Metadata Update from @lkrispen:
- Issue assigned to lkrispen
- Issue set to the milestone: 1.3.2 - 05/13 (May)

2 years ago

Login to comment on this ticket.