In MMR, with a specific set of operations, it is possible to create an entry like:
dn: employeeNumber=3.2,ou=distinguished,ou=People,dc=example,dc=com
employeeNumber: 3.1
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
uid: 3
sn: 3
cn: 3
The test case is composed of 3 masters but it is equivalent with 2 masters. The topology is fully meshed, but the replication agreements (RAs) from M1 and M2 are disabled.
1 - On M1: MODRDN(E, rdn=employeeNumber=3.1)
2 - On M2: MODRDN(E, rdn=employeeNumber=3.2)
3 - On M1: MOD(E, [MOD_REPL, 'employeeNumber', '3.1'])
The entry is similar on all servers. The CSN of (3) being higher than that of (2), the MOD removes the value '3.2' without taking into consideration that it is a distinguished value.
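The three operations above could be issued with ldapmodify roughly as follows. This is only a sketch: the original RDN of entry E (here assumed to be employeeNumber=3) and the exact client invocation are assumptions, not taken from the ticket.

```ldif
# On M1: step (1), rename E (original RDN assumed to be employeeNumber=3)
dn: employeeNumber=3,ou=distinguished,ou=People,dc=example,dc=com
changetype: modrdn
newrdn: employeeNumber=3.1
deleteoldrdn: 1

# On M2, concurrently: step (2), rename the same entry
dn: employeeNumber=3,ou=distinguished,ou=People,dc=example,dc=com
changetype: modrdn
newrdn: employeeNumber=3.2
deleteoldrdn: 1

# On M1: step (3), a MOD_REPLACE on the (locally renamed) entry
dn: employeeNumber=3.1,ou=distinguished,ou=People,dc=example,dc=com
changetype: modify
replace: employeeNumber
employeeNumber: 3.1
```

Each operation is valid on the server where it is applied; the invalid state only appears after replication merges them.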
The problem exists in master and possibly in older versions.
Will attach a test case.
The value of the RDN attribute is not present in the entry.
It is not clear what we want to do. On one side we want the distinguished value to be present, but that would mean that update (3) is ignored although it was successful.
<img alt="ticket49859_test.py" src="/389-ds-base/issue/raw/files/197129e038c02288f10a4f4cd3b578e11dd1ec1d07effa0c5f2aff9b32b04e15-ticket49859_test.py" />
Metadata Update from @tbordaz: - Custom field component adjusted to None - Custom field origin adjusted to None - Custom field reviewstatus adjusted to None - Custom field type adjusted to None - Custom field version adjusted to None
The test .py seems to contain all the tests for #49658; which one is the specific test case for this ticket?
Right, I forgot to rename each test case. They are part of the #49658 tests (the 'distinguished' ones). Tests 18, 19, 20 and 21 are failing because the RDN value is not present in the final entry.
Metadata Update from @mreynolds: - Issue set to the milestone: 1.3.8
Metadata Update from @gparente: - Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1647017
The bad news is that the recent fix for #49658 does not fix this problem. But can/should it be fixed :( ?
Looking at the description, 'employeeNumber' should be '3.1' (most recent update), but at the same time, 'employeeNumber' being the RDN attribute, it should be '3.2'. Being a single-valued attribute, we must choose '3.1' or '3.2'. IMHO both approaches are valid, but I prefer to give priority to the value explicitly set by the LDAP client: '3.1'.
The only good news is that this test case has the same behavior with and without the fix for #49658.
I would vote for closing this bug/ticket as wontfix
I think the issue is that, for it to be a valid LDAP object, the RDN component must be an attr:value pair present in the object. So if we have a situation where these can desync, that seems invalid.
I can understand it's a rare case for this to occur, but that doesn't prevent it being an incorrect case.
In my view, this could go two ways. Either close it as wontfix, since it is very hard to produce (provided all servers arrive at the same invalid object state, it can then be corrected by hand, and we could even write a healthcheck in lib389 to detect and alert on these), or we could fix it - but I assume the fix would be complex.
So I think my vote is a lib389 tool to detect and correct these in the dsconf command. Is that reasonable?
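A healthcheck of the kind proposed above could start from a simple invariant check: does the entry actually contain the attribute/value pair that forms its RDN? The sketch below is a hypothetical helper, not part of lib389; it uses a naive RDN parser (no multi-valued RDNs, no escaped characters) purely to illustrate the idea.

```python
def rdn_value_present(dn, entry):
    """Check that the attribute/value pair forming the entry's RDN
    is present among the entry's attribute values.

    Naive parser for illustration only: it ignores multi-valued RDNs
    and DN escaping, which a real healthcheck would have to handle.
    `entry` maps lowercase attribute names to lists of string values.
    """
    rdn = dn.split(",", 1)[0]              # first RDN component
    attr, _, value = rdn.partition("=")
    values = [v.lower() for v in entry.get(attr.strip().lower(), [])]
    return value.strip().lower() in values


# The invalid state from this ticket: the DN says 3.2 but the
# entry holds employeeNumber 3.1 after the replicated MOD.
entry = {"employeenumber": ["3.1"], "uid": ["3"], "sn": ["3"], "cn": ["3"]}
dn = "employeeNumber=3.2,ou=distinguished,ou=People,dc=example,dc=com"
print(rdn_value_present(dn, entry))  # False: the RDN value is missing
```

A dsconf-style tool would iterate over all entries of a backend, apply such a check, and report (or flag) the offenders.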
Thanks @firstyear for your feedback !
Looking at https://tools.ietf.org/html/rfc4511
On the Modify DN operation:

"Attribute values of the new RDN not matching any attribute value of the entry are added to the entry, and an appropriate error is returned if this fails."

On the Modify operation:

"The Modify operation cannot be used to remove from an entry any of its distinguished values, i.e., those values which form the entry's relative distinguished name. An attempt to do so will result in the server returning the notAllowedOnRDN result code."
So from the protocol POV, it is a requirement that an entry conform to the schema and, at the same time, that the distinguished value exist in the entry. The issue appears with replication, where two valid operations can lead to an entry that violates this requirement.
Another option would be to detect the invalid entry and flag it (add 'nsds5ReplConflict: unable to get distinguished value in the entry').
@tbordaz The problem with turning it into a conflict is that the entry then "vanishes". It certainly is a correct action, however, as it certainly is a conflict. Something I want is a "restore" tool for replication to restore conflicts to real entries (probably by reading the conflict and creating a new entry from the values).
@firstyear, I did not mean moving it to a subentry, just flagging it as a conflict. I think a glue entry may also be a conflict entry without being a subentry. Regarding the conflict/restore aspect, I prefer not to open such a box for this specific ticket. This ticket is a corner case, but conflict/restore has a larger impact and would require an RFE. @lkrispen, @gparente, do you think it is acceptable to have conflict entries that are not subentries?
If it's flagged as a conflict, don't we hide it from searches though? Or do we rely on ldapsubentry for this?
An issue with making it a conflict is we'll also no longer replicate any future changes to the object until we remove the conflict from it I think ...
I think the right answer here is conflict and hide, exactly how we would treat a normal conflict entry. Having something vanish - but be recoverable - would certainly alert to the existence of a problem. Having an associated error in the logs would help too. Finally, having a tool in the dsconf healthcheck to detect these is the final piece of the puzzle to make issues like this visible to admins.
I hope this helps :)
I think we agree the entry is a conflict. Now, hiding a conflict is not systematic; for example, a glue entry is not hidden:
ldapsearch -LLL -h ... -b "dc=example,dc=com" 'ou=glue' objectclass nsds5ReplConflict
dn: ou=glue,dc=example,dc=com
objectclass: top
objectclass: organizationalUnit
objectclass: extensibleobject
objectclass: glue
nsds5ReplConflict: deletedEntryHasChildren
IIUC, with @lkrispen's patch, only naming conflicts are hidden (subentry). Here an option would be to keep it visible (not an ldapsubentry) and flag it with something like 'nsds5ReplConflict: RDN attribute/value is missing'.
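To make the proposal concrete: a flagged-but-visible entry could look like the fragment below. The exact nsds5ReplConflict value is hypothetical (taken from the suggestion above, not from any shipped code).

```ldif
# Hypothetical flagged entry: still a regular entry (not an ldapsubentry),
# but carrying a conflict marker for the missing distinguished value.
dn: employeeNumber=3.2,ou=distinguished,ou=People,dc=example,dc=com
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
employeeNumber: 3.1
nsds5ReplConflict: RDN attribute/value is missing
```

Such an entry would remain searchable and writable, like the glue-entry example above.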
Also, AFAIK replication replicates and applies updates on all entries, even conflicts.
replcheck is reporting conflicts; it could be wrapped by dsconf healthcheck.
Yep, replcheck should be part of healthcheck, but that may be a separate issue.
So perhaps this should be a test case to produce the issue and assert that replcheck can detect and report it? Otherwise I'm happy for it to be a conflict like this, provided that future changes to the entry continue to be replicated (this can also be a test extension ...).
Coming late I want to add some comments regarding conflicts.
We have to clearly distinguish two types of conflict:

1] entry level: conflicting entries, as a result of adding the same DN in parallel, or deleting an entry and adding children in parallel

2] attribute level: conflicts inside a regular entry. One existing example: if you have a required attribute with two values and delete them in parallel, each operation is valid, but after replication both values are gone. We keep this incorrect state but add an nsds5ReplConflict tracker.

I think the second type matches what Thierry has in mind, and it is absolutely OK to add the nsds5ReplConflict attr.
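The attribute-level example described above can be sketched as two modify operations, each valid on its own server. The DN, attribute and values below are illustrative, not taken from the ticket.

```ldif
# On M1: delete one value of a required multi-valued attribute
dn: uid=user,ou=People,dc=example,dc=com
changetype: modify
delete: sn
sn: value1

# On M2, in parallel: delete the other value
dn: uid=user,ou=People,dc=example,dc=com
changetype: modify
delete: sn
sn: value2

# After replication both values are gone. The server keeps this
# incorrect state but adds an nsds5ReplConflict tracker to the entry.
```

This is the same shape as the ticket's RDN case: two individually valid updates whose merged result is an invalid entry.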
Okay. I think we are all in agreement then. The attribute level conflict is what we want to show here, we want to make sure repl check can display entries in this invalid conflict state, and we want a test case to show we can correctly detect entries in this state.
Metadata Update from @mreynolds: - Issue set to the milestone: 1.3.10 (was: 1.3.8)
Metadata Update from @mreynolds: - Issue set to the milestone: 1.4.3 (was: 1.3.10)
Metadata Update from @mreynolds: - Issue priority set to: normal
PR https://pagure.io/389-ds-base/pull-request/51149
f75fd1a..2ccd0be master 81cca09..113a104 389-ds-base-1.4.3
Metadata Update from @tbordaz: - Issue close_status updated to: fixed - Issue status updated to: Closed (was: Open)
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/2918
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: fixed)