#49859 A distinguished value of a single valued attribute can be missing in an entry
Opened 2 years ago by tbordaz. Modified 16 days ago

Issue Description

In MMR (multi-master replication), a specific sequence of operations can produce an entry such as:

dn: employeeNumber=3.2,ou=distinguished,ou=People,dc=example,dc=com
employeeNumber: 3.1
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
uid: 3
sn: 3
cn: 3

The test case uses 3 masters, but the behavior is the same with 2 masters.
The topology is fully meshed, but the replication agreements (RA) from M1 and M2 are disabled.

1 - On M1 MODRDN (E, rdn=employeeNumber=3.1)
2 - On M2 MODRDN (E, rdn=employeeNumber=3.2)
3 - On M1 MOD(E, [MOD_REPL, 'employeeNumber', '3.1'])

The resulting entry is the same on all servers. Because the CSN of (3) is higher than the CSN of (2), the MOD removes the value '3.2' without taking into account that it is the distinguished value.
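
The sequence above can be replayed with a toy model. This is only a minimal sketch, not the 389-ds resolution code: the `Entry` class, the integer CSNs, the initial value '3', and the rule "a replace drops every value with a lower CSN" are simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Toy entry tracking one single-valued naming attribute (hypothetical model)."""
    rdn_value: str
    rdn_csn: int = 0
    # attribute value -> CSN of the update that added it
    values: dict = field(default_factory=dict)


def apply_modrdn(entry, new_value, csn):
    # A MODRDN adds the new distinguished value to the entry;
    # the most recent MODRDN decides the RDN.
    entry.values[new_value] = csn
    if csn > entry.rdn_csn:
        entry.rdn_value, entry.rdn_csn = new_value, csn


def apply_mod_replace(entry, new_value, csn):
    # Assumed rule: a replace discards every value older than itself,
    # with no check for distinguished values.
    entry.values = {v: c for v, c in entry.values.items() if c > csn}
    entry.values[new_value] = csn


def replay(updates):
    """Apply updates in CSN order, as every server would see them after convergence."""
    entry = Entry(rdn_value="3", values={"3": 0})
    for csn, op, value in sorted(updates, key=lambda u: u[0]):
        op(entry, value, csn)
    return entry


final = replay([
    (1, apply_modrdn, "3.1"),       # (1) on M1
    (2, apply_modrdn, "3.2"),       # (2) on M2
    (3, apply_mod_replace, "3.1"),  # (3) on M1
])
print(final.rdn_value)                  # 3.2
print(sorted(final.values))             # ['3.1']
print(final.rdn_value in final.values)  # False
```

In this model the RDN ends up as '3.2' while the only remaining value is '3.1', which matches the entry shown in the description.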

Package Version and Platform

The problem exists in master and possibly in older versions.

Steps to reproduce

Will attach a testcase.

Actual results

The value of the RDN attribute is not present in the entry.

Expected results

It is not clear what we want to do.
On one side, we want the distinguished value to be present. But that would mean that update (3) is ignored although it was successful.


Metadata Update from @tbordaz:
- Custom field component adjusted to None
- Custom field origin adjusted to None
- Custom field reviewstatus adjusted to None
- Custom field type adjusted to None
- Custom field version adjusted to None

2 years ago

The test .py file seems to contain all the tests for #49658. Which one is the specific test case for this ticket?

Right, I forgot to rename each testcase. They are the 'distinguished' part of the #49658 tests. Tests 18, 19, 20 and 21 fail because the RDN value is not present in the final entry.

Metadata Update from @mreynolds:
- Issue set to the milestone: 1.3.8

2 years ago

Metadata Update from @gparente:
- Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1647017

a year ago

The bad news is that the recent fix for #49658 does not fix this problem. But can/should it be fixed :( ?

Looking at the description, 'employeeNumber' should be 3.1 (the most recent update), but at the same time, 'employeeNumber' being the RDN attribute, it should be 3.2. Since it is a single-valued attribute, we must choose either '3.1' or '3.2'. IMHO both approaches are valid, but I prefer to give priority to '3.1', the value explicitly set by the LDAP client.

The only good news is that this test case has the same behavior with/without the fix #49658

I would vote for closing this bug/ticket as wontfix

I think the issue is that for it to be a valid ldap object, the rdn component must be an attr:value in the object. So if we have a situation where these can desync, that seems invalid.

I can understand it's a rare case for this to occur, but that doesn't prevent it being an incorrect case.

In my view, this could go two ways. It could be either close wontfix, very hard to produce (provided all servers arrive at the same invalid object state, it can then be corrected by hand, we could even write a healthcheck in lib389 to detect and alert on these), or we could fix it - but I assume the fix would be complex.

So I think my vote is a lib389 tool to detect and correct these in the dsconf command. Is that reasonable?
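
As a rough idea of what such a detection could look like, here is a minimal sketch in plain Python. It does not use lib389 or dsconf: the function names are hypothetical, the DN parsing ignores escaped ',' and '+' characters, and values are compared as exact strings rather than with the schema's matching rules.

```python
def split_rdn(dn):
    """Return [(attr, value), ...] for the leftmost RDN of dn.

    Simplification: escaped ',' or '+' characters are not handled.
    """
    rdn = dn.split(",", 1)[0]
    pairs = []
    for ava in rdn.split("+"):
        attr, _, value = ava.partition("=")
        pairs.append((attr.strip(), value.strip()))
    return pairs


def missing_rdn_values(dn, attrs):
    """List the distinguished (attr, value) pairs that are absent from attrs."""
    lowered = {name.lower(): values for name, values in attrs.items()}
    return [
        (attr, value)
        for attr, value in split_rdn(dn)
        if value not in lowered.get(attr.lower(), [])
    ]


# The entry from the issue description.
dn = "employeeNumber=3.2,ou=distinguished,ou=People,dc=example,dc=com"
attrs = {"employeeNumber": ["3.1"], "uid": ["3"], "sn": ["3"], "cn": ["3"]}
print(missing_rdn_values(dn, attrs))  # [('employeeNumber', '3.2')]
```

A real check would read entries from the server and report each DN for which this list is not empty.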

Thanks @firstyear for your feedback !

I think the issue is that for it to be a valid ldap object, the rdn component must be an attr:value in the object. So if we have a situation where these can desync, that seems invalid.

Looking at https://tools.ietf.org/html/rfc4511

 The MODRDN operation:
 "Attribute values of the new RDN not matching any
 attribute value of the entry are added to the entry, and an
 appropriate error is returned if this fails."

 The Modify operation cannot
 be used to remove from an entry any of its distinguished values,
 i.e., those values which form the entry's relative distinguished
 name.  An attempt to do so will result in the server returning the
 notAllowedOnRDN result code.

So from the protocol point of view, an entry must both conform to the schema and contain its distinguished value. The issue appears with replication, where two valid operations can lead to an entry that does not conform to the schema.
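
To illustrate why each operation is valid on the server that receives it, here is a minimal sketch of the notAllowedOnRDN check for a replace on a single server. The `check_replace` function is hypothetical and only covers a single-valued RDN; 67 is the notAllowedOnRDN result code defined in RFC 4511.

```python
SUCCESS = 0
NOT_ALLOWED_ON_RDN = 67  # LDAP result code notAllowedOnRDN (RFC 4511)


def check_replace(rdn_attr, rdn_value, mod_attr, new_values):
    """Result code a single server would return for a MOD_REPL (simplified)."""
    same_attr = mod_attr.lower() == rdn_attr.lower()
    if same_attr and rdn_value not in new_values:
        # The replace would remove the distinguished value.
        return NOT_ALLOWED_ON_RDN
    return SUCCESS


# On M1 the RDN is already employeeNumber=3.1, so update (3) is accepted.
on_m1 = check_replace("employeeNumber", "3.1", "employeeNumber", ["3.1"])
# Against an entry named employeeNumber=3.2 the same update would be refused.
on_m2 = check_replace("employeeNumber", "3.2", "employeeNumber", ["3.1"])
print(on_m1, on_m2)  # 0 67
```

The check passes on M1 because M1 has not yet seen the MODRDN from M2; the conflict only becomes visible once both updates are replicated.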

I can understand it's a rare case for this to occur, but that doesn't prevent it being an incorrect case.
In my view, this could go two ways. It could be either close wontfix, very hard to produce (provided all servers arrive at the same invalid object state, it can then be corrected by hand, we could even write a healthcheck in lib389 to detect and alert on these), or we could fix it - but I assume the fix would be complex.
So I think my vote is a lib389 tool to detect and correct these in the dsconf command. Is that reasonable?

Another option would be to detect the invalid entry and flag it (add 'nsds5ReplConflict: unable to get distinguished value in the entry').

@tbordaz The problem with turning it into a conflict is that the entry then "vanishes". It is a correct action, however, as it certainly is a conflict. Something I want is a "restore" tool for replication to restore conflicts to real entries (probably by reading the conflict and creating a new entry from the values).

@firstyear, I did not mean moving it to a subentry, just flagging it as a conflict. I think a glue entry may also be a conflict entry without being a subentry. Regarding the conflict/restore aspect, I prefer not to open such a box for this specific ticket. The ticket is a corner case, but conflict/restore has a larger impact and would require an RFE.
@lkrispen, @gparente, do you think it is acceptable to have conflict entries that are not subentries?

If it's flagged as a conflict, don't we hide it from searches though? Or do we rely on ldapsubentry for this?

An issue with making it a conflict is we'll also no longer replicate any future changes to the object until we remove the conflict from it I think ...

I think the right answer here is conflict and hide, exactly how we would treat a normal conflict entry. Having something vanish - but be recoverable - would certainly alert to the existence of a problem. Having an associated error in logs would help too. Finally, having a tool in dsconf healthcheck to detect these is the final piece of the puzzle to make issues like this visible to admins.

I hope this helps :)

I think we agree the entry is a conflict. Now hiding a conflict is not systematic, for example a glue entry is not hidden:

ldapsearch -LLL -h ... -b "dc=example,dc=com" 'ou=glue' objectclass nsds5ReplConflict
dn: ou=glue,dc=example,dc=com
objectclass: top
objectclass: organizationalUnit
objectclass: extensibleobject
objectclass: glue
nsds5ReplConflict: deletedEntryHasChildren

IIUC, with @lkrispen's patch, only naming conflicts are hidden (subentry).
Here an option would be to keep it visible (not an ldapsubentry) and flag it with something like 'nsds5ReplConflict: RDN attribute/value is missing'.

Also, AFAIK, replication replicates and applies updates to all entries, even conflicts.

replcheck reports conflicts; it could be wrapped by dsconf healthcheck.

Yep, replcheck should be part of healthcheck, but that may be a separate issue.

So perhaps this should be a test case to produce the issue and assert that replcheck can detect and report it? Otherwise I'm happy for it to be a conflict like this, provided that future changes to the entry continue to be replicated (this can also be a test extension ...).

Coming late I want to add some comments regarding conflicts.

We have to clearly distinguish two types of conflict:
1] entry level: conflicting entries, as a result of adding the same dn in parallel, or deleting an entry and adding children in parallel
2] attribute level: conflicts inside a regular entry. One existing example is a required attribute with two values that are deleted in parallel: each operation is valid, but after replication both values are gone. We keep this incorrect state but add an nsds5replconflict tracker.
I think the second type matches what Thierry has in mind, and it is absolutely ok to add the nsds5replconflict attr.
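
The attribute-level example in 2] can be sketched as follows. This is only an illustration under assumptions: the set-based model, the `REQUIRED` schema, and the text of the tracker value are placeholders, not what the server actually writes.

```python
REQUIRED = {"cn"}  # hypothetical schema: cn is a required attribute


def apply_deletes(values, deleted):
    """Return the values that remain after deleting the given ones."""
    return set(values) - set(deleted)


entry = {"cn": {"a", "b"}}

# Each master deletes a different value; locally one value remains,
# so each operation is valid on the server that receives it.
local_m1 = apply_deletes(entry["cn"], {"a"})
local_m2 = apply_deletes(entry["cn"], {"b"})

# After replication both deletes are applied on every server.
merged = apply_deletes(entry["cn"], {"a", "b"})
converged = {"cn": merged}

# Keep the invalid state but flag it (placeholder tracker text).
missing = sorted(attr for attr in REQUIRED if not converged.get(attr))
if missing:
    converged["nsds5ReplConflict"] = {
        "required attribute missing: " + ", ".join(missing)
    }

print(local_m1, local_m2)              # {'b'} {'a'}
print(converged["nsds5ReplConflict"])  # {'required attribute missing: cn'}
```

The missing distinguished value in this ticket fits the same pattern: the entry stays in place and visible, and the tracker attribute makes the invalid state detectable.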

Okay. I think we are all in agreement then. The attribute-level conflict is what we want to show here: we want to make sure replcheck can display entries in this invalid conflict state, and we want a test case to show we can correctly detect entries in this state.

Metadata Update from @mreynolds:
- Issue set to the milestone: 1.3.10 (was: 1.3.8)

6 months ago

Metadata Update from @mreynolds:
- Issue set to the milestone: 1.4.3 (was: 1.3.10)

16 days ago

Metadata Update from @mreynolds:
- Issue priority set to: normal

16 days ago

Metadata
Attachments 1
Attached 2 years ago View Comment