#48976 After reinit a topology can silently diverge instead of breaking replication
Opened 3 years ago by tbordaz. Modified 3 months ago

A common administrative error after a reinit is to allow updates on the server before it has received all the updates for which it was the originator (see https://fedorahosted.org/389/ticket/47986).

In that case replication should fail; otherwise the topology will silently diverge.


Per scrum, setting the target milestone to 1.3.5...
Thierry, please reset it to the older versions (e.g., 1.3.4, if needed...)

After offline discussion with Thierry we arrived at the following proposal:

  • add an option to set the backend into 'referral on update' after a restore or import.
  • the default could be on, so that the problematic scenario is handled by default

  • handle the reset of this flag, so that the server accepts updates again. This could be done under different conditions:
    1] allow a manual reset by an admin; the admin should ensure replicas are in sync for this replicaID
    2] return to accepting updates after a configured time period, assuming that replicas will get in sync
    3] try to verify that all servers are in sync for the initialized/restored replica ID. This could be done by a method similar to cleanallruv: send out an extended operation requesting the maxCSN for the RID, and start accepting updates once this maxCSN is available on the current replica.
    Of course, 1] .. 3] should not be mutually exclusive. 1] should always be possible, and 3] should stop once the timeout from 2] is reached.
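
To make the interaction of conditions 1] .. 3] concrete, here is a minimal sketch in Python. It is not taken from the server code: the class `UpdateGate` and all of its names are hypothetical, and it assumes CSNs are fixed-width, lowercase hex strings so that plain string comparison matches CSN order for a single replica ID.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class UpdateGate:
    """Hypothetical model of a backend held in 'referral on update' after a reinit."""

    # Highest CSN known in the topology for the local replica ID (condition 3]).
    target_max_csn: str
    # How long to wait before accepting updates anyway (condition 2]).
    timeout_seconds: float
    # Injectable clock so the behaviour can be checked without sleeping.
    clock: Callable[[], float] = time.monotonic
    started_at: float = field(default=0.0, init=False)
    manually_reset: bool = field(default=False, init=False)

    def __post_init__(self) -> None:
        self.started_at = self.clock()

    def manual_reset(self) -> None:
        """Condition 1]: an admin asserts that the replicas are in sync."""
        self.manually_reset = True

    def accepts_updates(self, local_max_csn: Optional[str]) -> bool:
        """Return True once the server may stop sending referrals."""
        if self.manually_reset:
            return True
        # Condition 3]: the local replica has caught up with its own updates.
        # Assumes fixed-width lowercase hex CSNs (string order == CSN order).
        if local_max_csn is not None and local_max_csn >= self.target_max_csn:
            return True
        # Condition 2]: stop waiting once the configured period has elapsed.
        return self.clock() - self.started_at >= self.timeout_seconds
```

In this sketch the manual reset always wins, and the maxCSN check is simply abandoned once the timeout expires, which matches the "not exclusive" note above. How the target maxCSN is collected from the other servers is left out entirely.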

I'm afraid we still have problems with this approach:

For external operations we can send referrals until the state is switched, but what about internal operations, such as those triggered by password policy attribute updates, login recording, or plugins?

The proposed solution would run into the same problems as read-only replicas.

Metadata Update from @lkrispen:
- Issue set to the milestone: 1.3.6.0

2 years ago

Metadata Update from @mreynolds:
- Custom field component reset (from Replication - General)
- Issue close_status updated to: None
- Issue set to the milestone: 1.3.7 backlog (was: 1.3.6.0)

2 years ago

Metadata Update from @mreynolds:
- Custom field reviewstatus adjusted to None
- Issue set to the milestone: 1.4.2 (was: 1.3.7 backlog)

3 months ago

