#602 replication inconsistency if attribute is modified several times in one operation
Closed: wontfix. Opened 11 years ago by lkrispen.

If attribute values are added and deleted in one operation, they get the same csn and the handling is correct. If the attribute is later modified again, the deleted value is incorrectly resurrected.

Here is the scenario:

On M1 add/del values to attr description
dn: cn=Mr00006,ou=People,dc=example,dc=com
changetype: modify
add: description
description: aaa aaa
description: bbb
-
delete: description
description: bbb

SEARCH M1
description: aaa aaa
nscpentrywsi: description;vucsn-512f2fdd000000c80000: aaa aaa
nscpentrywsi: description;vucsn-512f2fdd000000c80000;vdcsn-512f2fdd000000c80000;deleted: bbb
SEARCH M2
description: aaa aaa
nscpentrywsi: description;vucsn-512f2fdd000000c80000: aaa aaa
nscpentrywsi: description;vucsn-512f2fdd000000c80000;vdcsn-512f2fdd000000c80001;deleted: bbb

Both servers have the correct state.
Now add another value on M2:
dn: cn=Mr00006,ou=People,dc=example,dc=com
changetype: modify
add: description
description: bbb bbb

SEARCH M1
description: aaa aaa
description: bbb bbb
description: bbb
nscpentrywsi: description;vucsn-512f2fdd000000c80000: aaa aaa
nscpentrywsi: description;vucsn-512f304f000000640000: bbb bbb
nscpentrywsi: description;vucsn-512f2fdd000000c80000;vdcsn-512f2fdd000000c80000: bbb
SEARCH M2
description: aaa aaa
description: bbb bbb
nscpentrywsi: description;vucsn-512f2fdd000000c80000: aaa aaa
nscpentrywsi: description;vucsn-512f304f000000640000: bbb bbb
nscpentrywsi: description;vucsn-512f2fdd000000c80000;vdcsn-512f2fdd000000c80001;deleted: bbb

On M1 the deleted value "bbb" reappears.

The reason for the different state is the small difference in the csns:
M1: nscpentrywsi: description;vucsn-512f2fdd000000c80000;vdcsn-512f2fdd000000c80000;deleted: bbb
M2: nscpentrywsi: description;vucsn-512f2fdd000000c80000;vdcsn-512f2fdd000000c80001;deleted: bbb

On M2 the vdcsn has a subsequence number, which ensures that the vdcsn is later than the vucsn, so the value stays deleted.

The subsequence numbers probably need to be stored on the primary master as well.
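
To make the comparison concrete, here is a minimal Python sketch. It assumes that a CSN string is 20 hex digits laid out as timestamp (8), sequence number (4), replica ID (4) and subsequence number (4), and that a value stays deleted only when its vdcsn is strictly later than its vucsn. The `parse_csn` and `value_is_deleted` helpers are hypothetical names used for illustration, not server functions.

```python
from typing import NamedTuple


class CSN(NamedTuple):
    # Field order doubles as comparison order (assumed layout).
    timestamp: int
    seqnum: int
    replica_id: int
    subseqnum: int


def parse_csn(text: str) -> CSN:
    """Split a 20-hex-digit CSN string into its four assumed fields."""
    if len(text) != 20:
        raise ValueError(f"expected 20 hex digits, got {text!r}")
    return CSN(
        int(text[0:8], 16),
        int(text[8:12], 16),
        int(text[12:16], 16),
        int(text[16:20], 16),
    )


def value_is_deleted(vucsn: str, vdcsn: str) -> bool:
    """Assumed rule: the delete wins only if it is strictly later than the update."""
    return parse_csn(vdcsn) > parse_csn(vucsn)


# CSNs of the value "bbb" as quoted for each master in this ticket.
m1_deleted = value_is_deleted("512f2fdd000000c80000", "512f2fdd000000c80000")
m2_deleted = value_is_deleted("512f2fdd000000c80000", "512f2fdd000000c80001")
print("M1 keeps bbb deleted:", m1_deleted)
print("M2 keeps bbb deleted:", m2_deleted)
```

Under these assumptions the two CSNs on M1 compare as equal, so nothing forces the delete to win, while on M2 the subsequence number breaks the tie.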

I introduced the change to use the subsequence numbers for ordering the individual mods in the modify operation on the consumer, because there was no other way to guarantee that they would be applied in the same order in which they were applied on the supplier. Unfortunately, I did not foresee this urp case. Ideally, we would use the subsequence numbers only during the urp modify, but not store them with the csn. We need all CSNs for the modify operation to be exactly the same for other urp and replication code to work correctly (e.g. maxcsn in the ruv, other places). Alternatively, we could make the code that uses CSN comparisons smart enough to know when to use the subsequence number and when to ignore it. The problem with that is that it would touch a lot of code and have a good chance of destabilizing the server and breaking replication, especially in mixed environments with older and newer servers.
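
The "know when to ignore the subsequence number" option can be pictured as a comparison key with a switch. This is only a rough Python sketch: it assumes the subsequence number is the last four hex digits of a 20-digit CSN string, `csn_key` is a hypothetical helper, and nothing here reflects how the server's comparison code is actually organized.

```python
def csn_key(text: str, use_subseq: bool = True) -> tuple:
    """Return a sortable key for a CSN string (assumed layout: 8+4+4+4 hex digits)."""
    fields = (int(text[0:8], 16), int(text[8:12], 16), int(text[12:16], 16))
    if use_subseq:
        return fields + (int(text[16:20], 16),)
    return fields


add_csn = "512f2fdd000000c80000"     # first mod of the operation
delete_csn = "512f2fdd000000c80001"  # second mod, subsequence incremented

# Ordering the mods inside one operation needs the subsequence number ...
ordered = csn_key(add_csn) < csn_key(delete_csn)
# ... while operation-level checks (e.g. against a maxcsn) should see a single CSN.
same_operation = csn_key(add_csn, use_subseq=False) == csn_key(delete_csn, use_subseq=False)
print(ordered, same_operation)
```

Every caller would have to pick the right mode, which is the "touches a lot of code" concern raised above.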

This appears to be working correctly in 1.3.2:

Master A:

dn: cn=csnsubseq,dc=example,dc=com
changetype: modify
add: description
description: aaa aaa
description: bbb
-
delete: description
description: bbb

Master A Search:

dn: cn=csnsubseq,dc=example,dc=com
description: aaa aaa
nscpentrywsi: description;vucsn-51faa9780000006f0000: aaa aaa
nscpentrywsi: description;vucsn-51faa9780000006f0000;vdcsn-51faa9780000006f0001;deleted: bbb

Master B search:

dn: cn=csnsubseq,dc=example,dc=com
description: aaa aaa
nscpentrywsi: description;vucsn-51faa9780000006f0000: aaa aaa
nscpentrywsi: description;vucsn-51faa9780000006f0000;vdcsn-51faa9780000006f0001;deleted: bbb

===> subseq number was updated on both masters

Master B Update:

dn: cn=csnsubseq,dc=example,dc=com
changetype: modify
add: description
description: bbb bbb

Master A search:

dn: cn=csnsubseq,dc=example,dc=com
description: aaa aaa
description: bbb bbb
nscpentrywsi: description;vucsn-51faa9780000006f0000: aaa aaa
nscpentrywsi: description;vucsn-51faaa14000000de0000: bbb bbb
nscpentrywsi: description;vucsn-51faa9780000006f0000;vdcsn-51faa9780000006f0001;deleted: bbb

Master B Search:

dn: cn=csnsubseq,dc=example,dc=com
description: aaa aaa
description: bbb bbb
nscpentrywsi: description;vucsn-51faa9780000006f0000: aaa aaa
nscpentrywsi: description;vucsn-51faaa14000000de0000: bbb bbb
nscpentrywsi: description;vucsn-51faa9780000006f0000;vdcsn-51faa9780000006f0001;deleted: bbb

The obvious difference is that the subsequence number was updated on both masters after the initial multi-mod update.

So, has this been "fixed" by another replication fix? Or do I need to modify the test case?

It's possible that one of the other fixes that Ludwig and I did also fixed this issue. I'm not sure, maybe Ludwig knows.

Good news!

I'm curious... How does 1.3.1 behave? It's fixed only in 1.3.2?

Replying to [comment:9 nhosoi]:

1.3.1 is broken! So it's one of the commits that just went into master(1.3.2).

It should be fixed by the changes in ticket #569, which is an enhancement and will not be backported.

You can check if just the part of creating subsequence numbers on the first master fixes the issue, but it will also have the known effect of making the mmr TET tests fail.

I confirmed that this change "fixes" the issue, but I only ran the test case - nothing else:

{{{
diff --git a/ldap/servers/slapd/entrywsi.c b/ldap/servers/slapd/entrywsi.c
index 248a41f..9eed18c 100644
--- a/ldap/servers/slapd/entrywsi.c
+++ b/ldap/servers/slapd/entrywsi.c
@@ -825,7 +825,7 @@ entry_apply_mods_wsi(Slapi_Entry *e, Slapi_Mods *smods, const CSN *csn, int urp)
    and the csn doesn't already have a subsequence
    if the csn already has a subsequence, assume it was generated
    on another replica in the correct order
 */
-if (urp && (csn_get_subseqnum(csn) == 0)) {
+if (csn_get_subseqnum(csn) == 0) {
     csn_increment_subsequence(&localcsn);
 }
 }

}}}
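
The effect of dropping the `urp` test can be modelled outside the server. The Python sketch below is an illustration only, with several assumptions: each mod is stamped with the current local CSN and the subsequence (taken to be the last four hex digits) is incremented afterwards, `urp` is false on the master where the operation originates, and `csns_for_mods` with its `patched` flag is a hypothetical helper, not server code.

```python
def csns_for_mods(op_csn: str, mod_count: int, urp: bool, patched: bool = False) -> list:
    """Return the CSN stamped on each mod of one modify operation.

    Assumption: each mod is applied with the current local CSN, and the
    subsequence (last 4 hex digits) is incremented afterwards when allowed.
    """
    base, subseq = op_csn[:16], int(op_csn[16:], 16)
    # Mirrors the condition in the diff: the unpatched code also requires urp.
    # The check uses the incoming operation CSN, so it is evaluated once.
    may_increment = subseq == 0 and (patched or urp)
    stamped = []
    for _ in range(mod_count):
        stamped.append(f"{base}{subseq:04x}")
        if may_increment:
            subseq += 1
    return stamped


op = "512f2fdd000000c80000"
print(csns_for_mods(op, 2, urp=False))                # originating master, unpatched
print(csns_for_mods(op, 2, urp=True))                 # consumer
print(csns_for_mods(op, 2, urp=False, patched=True))  # originating master, patched
```

In this model the unpatched originating master stamps both mods with the same CSN, while the consumer and the patched master both end the delete on subsequence `0001`.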

Yes, this will fix it. I didn't find any csn comparisons that would need subsequence 0000; we have a mixture anyway, depending on which master the changes are applied on.

But TET will get confused in the mmr and state tests, since it tries to get the operation csn from the modifiers name, and with subsequence numbers this is no longer the same as in the attribute.

I only found this while working on tests for reducing repl meta data, so it does not come from a customer, and it is fixed in master. The remaining question is whether it needs to be backported.

Closing ticket since there is no customer request for this fix on 1.3.1.

Metadata Update from @mreynolds:
- Issue assigned to mreynolds
- Issue set to the milestone: 1.3.2 - 08/13 (August)

7 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/602

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Invalid)

3 years ago
