There's a race condition in getting/updating the nsDS5ReplicaId in cn=replication,cn=etc that could result in duplicate nsDS5ReplicaId if multiple replicas are installed in parallel (e.g. with Puppet). We should switch to an atomic get-and-increment operation (or remove/add in a loop) instead of doing the increment in Python.
This will fix the case of installing from a single master. (If replicas are installed in parallel from different masters, we need a different solution. This is work for DNA and the topology plugin.)
This is more difficult than it seems, especially when 2 replicas are installed simultaneously against different masters.
DNA or a similar mechanism would need to be introduced and used. The effort could be done together with #4302.
For now, we only require parallel installation against a single master to work. We will thus need to at least start using DEL+ADD operations instead of REPLACE, and allow the installer to retry several times (with a random delay) when there is a conflict.
This is important for the new installer methods, so that a cluster of FreeIPA replicas can be installed in parallel.
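To illustrate the retry part, here is a minimal sketch of a retry loop with a random delay. It is not the installer's code: `Conflict`, `claim_with_retry`, `try_claim` and the default values are hypothetical names and numbers chosen for the example, and `try_claim` stands for whatever performs the DEL+ADD attempt.

```python
import random
import time


class Conflict(Exception):
    """Hypothetical error: another installer changed the value first."""


def claim_with_retry(try_claim, attempts=5, max_delay=2.0, sleep=time.sleep):
    """Call try_claim() until it succeeds or the attempts run out.

    try_claim is a hypothetical callable that returns the claimed replica ID
    or raises Conflict when the stored value was not the expected one.
    """
    for attempt in range(1, attempts + 1):
        try:
            return try_claim()
        except Conflict:
            if attempt == attempts:
                raise  # give up and let the caller report the failure
            # Random delay so parallel installers do not retry in lockstep.
            sleep(random.uniform(0, max_delay))
```

The `sleep` parameter is only there so the delay can be replaced in tests; a real installer would also need to re-read the current value before each new attempt.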
AFAIU, there are two issues reported in that ticket:
- nsDS5ReplicaId values that are identical on different servers in a replication topology
- the need to test and set the nsDS5ReplicaId value in an atomic way
Since nsDS5ReplicaId is single-valued, MOD_DEL+MOD_ADD in the same operation guarantees that a given value is replaced by another value. If the initial value is not what was expected, the operation fails and we can retry.
Keeping nsDS5ReplicaId different on every replica: if the value is selected on several servers, each server needs its own range of values to be sure there is no collision. If the value is selected centrally, a simple counter is enough.
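The test-and-set behaviour described here can be sketched with an in-memory stand-in for the entry. This is only a simulation of the semantics, not an LDAP client: `FakeEntry`, `NoSuchValue` and `swap_replica_id` are hypothetical names, and the two constants just reuse the numeric codes of the LDAP add and delete modify operations.

```python
MOD_ADD, MOD_DELETE = 0, 1  # numeric codes of the LDAP add/delete modify ops


class NoSuchValue(Exception):
    """Hypothetical stand-in for the server rejecting a delete of an absent value."""


class FakeEntry:
    """In-memory stand-in for a directory entry (assumption, not a real client)."""

    def __init__(self, attrs):
        self.attrs = {name: list(values) for name, values in attrs.items()}

    def modify(self, modlist):
        # Apply every change to a copy first, so the operation is all-or-nothing.
        staged = {name: list(values) for name, values in self.attrs.items()}
        for op, attr, values in modlist:
            current = staged.setdefault(attr, [])
            for value in values:
                if op == MOD_DELETE:
                    if value not in current:
                        raise NoSuchValue(attr)
                    current.remove(value)
                elif op == MOD_ADD:
                    current.append(value)
        self.attrs = staged


def swap_replica_id(entry, expected, new, attr="nsDS5ReplicaId"):
    """Replace expected with new in one modify; fails if expected is not stored."""
    entry.modify([
        (MOD_DELETE, attr, [str(expected)]),
        (MOD_ADD, attr, [str(new)]),
    ])
```

If two installers both read 3 and both try to swap 3 for 4, the first one succeeds and the second gets `NoSuchValue`, because the value it tries to delete is no longer there. With REPLACE, both writes would have succeeded silently.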
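For the distributed case, a rough sketch of per-server ranges follows. The function name, the server names and the ID bounds are made up for illustration and do not reflect how DNA or the topology plugin actually assigns ranges.

```python
def split_ranges(first_id, last_id, servers):
    """Give each server a disjoint, contiguous slice of [first_id, last_id]."""
    total = last_id - first_id + 1
    size = total // len(servers)
    if size == 0:
        raise ValueError("not enough IDs for the number of servers")
    ranges = {}
    for index, name in enumerate(servers):
        start = first_id + index * size
        # Slices never overlap, so two servers can never hand out the same ID.
        ranges[name] = range(start, start + size)
    return ranges


# Hypothetical example: two masters sharing IDs 1..100.
ranges = split_ranges(1, 100, ["master1", "master2"])
```

Each server then only needs a local counter inside its own slice, at the cost of having to hand out or split ranges when a new server joins.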
master:
Metadata Update from @pviktori: - Issue assigned to mbabinsk - Issue set to the milestone: FreeIPA 4.2