Some admins may want a way to recover from certain actions or operations. This could be a delete, modrdn, modify, add, etc. The key is that it should work clearly with replication and be easy for admins to access if required.
There are a few scenarios I can easily see:
I think that undoing a modrdn or an add is easy: a modrdn can be undone by another modrdn, and an add can be undone by a delete.
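To make the two easy cases concrete, here is a minimal sketch of computing the inverse operation. The `Op` record and the `inverse` helper are hypothetical names made up for illustration, not part of any server API, and the DN handling is deliberately naive (it splits on the first comma and ignores escaped commas).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical record of a single directory operation."""
    kind: str          # "add", "delete" or "modrdn"
    dn: str            # DN the operation was applied to
    new_rdn: str = ""  # only used by modrdn


def inverse(op):
    """Return the operation that undoes `op`, for the easy cases only."""
    if op.kind == "add":
        # An add is undone by deleting the same DN.
        return Op("delete", op.dn)
    if op.kind == "modrdn":
        # A modrdn is undone by renaming the entry back to its old RDN.
        # Naive split: assumes no escaped commas in the RDN.
        old_rdn, _, parent = op.dn.partition(",")
        renamed_dn = f"{op.new_rdn},{parent}" if parent else op.new_rdn
        return Op("modrdn", renamed_dn, old_rdn)
    # delete and modify need the previous state, which is not in the op itself.
    raise ValueError(f"no simple inverse for {op.kind!r}")
```

Delete and modify raise here on purpose: the operation alone does not carry enough information to reverse it.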
The two difficult ones are undoing a delete, and undoing a modification. There are follow up comments on possible processes we could use.
It was suggested that some environments may want a delay where a change is held for a configurable time before being sent. The example was: you have 4 write masters and X read-only servers. Replication between the 4 write masters is "instant", but each change is delayed by 1 hour before it is sent to the read-only servers. This way, if a mistake occurs, the masters can be isolated and reverted.
Making the delay per-change rather than scheduled avoids a gap in the scheduled approach: if replication ran every hour and an error was introduced one minute before the scheduled run, the error would be replicated 60 seconds later.
In some cases, some changes could be replicated instantly (userPassword). This may be a future idea though.
I can see this being implemented per-agreement, where the agreement would say "send CSNs up to (current time minus delay)". Of course, I expect that there will be some subtle detail that I have missed about this :)
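As a rough sketch of the "time minus delay" rule, the selection could look like the following. This is an illustration only: `Change` and `sendable` are hypothetical names, and a change is reduced to an integer timestamp plus a DN, whereas a real CSN carries more than a timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical, simplified change record."""
    csn_time: int  # seconds since epoch, standing in for the CSN timestamp
    dn: str


def sendable(changes, now, delay_seconds):
    """Return the changes old enough to send on a delayed agreement, oldest first."""
    cutoff = now - delay_seconds
    ordered = sorted(changes, key=lambda c: c.csn_time)
    return [c for c in ordered if c.csn_time <= cutoff]
```

With a one hour delay, a change made one minute ago stays in the changelog and is only picked up once it is at least an hour old, which is the per-change behaviour described above.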
The value proposition of this feature is to catch-and-reset changes that cause issues from propagating. An alternate suggestion of how to handle this could be to implement a method to "undo" changes that have been sent out and applied rather than trying to delay-and-catch.
cc @lkrispen for your ideas and input, and @darix who suggested the idea.
so here are my comments :-)
I understand the intention and think it is a valid request, but I think the real request is "provide a feature to rollback changes" and not a specific way to handle it.
About the suggested delay solution, I can see the difficulties in implementing it, and not much benefit for resolving the problem. If I understand it correctly, you want to have a server which is always 1 hour behind and, when a failure is detected, use this server to reinit the topology. But this is a major interruption, and while you undo the incorrect change you also undo all the correct changes made in the last hour.
You can almost achieve this by doing hourly backups on a dedicated read only consumer.
In my opinion we should look into more specific solutions to "repair" entries. We already have a lot of information to do this. Most of the changes are kept in the entry in the replication metadata, e.g. deleted values together with their timestamps. A deleted entry is available as a tombstone, which can be read, cleaned up and re-added. What is unfortunately missing, and this was done intentionally, is the values of completely deleted attributes (they are not needed for update resolution and would blow up the metadata).
But we could make it optional to keep more data, or extend what is written to the (or a new) change or audit log (e.g. a before state for any update).
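A minimal sketch of the "before state" idea, assuming an entry is modelled as a plain dict of attribute name to list of values and the log is an in-memory list. Both are assumptions for illustration; `apply_with_before_state` and `undo_last` are hypothetical helpers, not existing audit log functionality.

```python
import copy


def apply_with_before_state(entry, changes, log):
    """Apply `changes` to `entry` and append the previous values to `log`.

    `changes` maps attribute -> new list of values, or None to delete
    the attribute completely.
    """
    # Record the old value of every touched attribute (None if it was absent).
    before = {attr: copy.deepcopy(entry.get(attr)) for attr in changes}
    log.append(before)
    for attr, values in changes.items():
        if values is None:
            entry.pop(attr, None)
        else:
            entry[attr] = list(values)


def undo_last(entry, log):
    """Restore the attributes touched by the most recent logged update."""
    before = log.pop()
    for attr, values in before.items():
        if values is None:
            entry.pop(attr, None)  # attribute did not exist before the update
        else:
            entry[attr] = values
```

The point is that a completely deleted attribute (here `None` in `changes`) becomes recoverable once its old values are written out with the update.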
Metadata Update from @lkrispen:
- Custom field origin adjusted to None
- Custom field reviewstatus adjusted to None
@lkrispen has suggested an extra control that allows the restoration of a tombstone to a real entry. We would need to allow this (or provide tooling) and demonstrate how to search for tombstones. A key point is whether tombstones are kept in the single-server case, and for how long.
My suggestion was that user-initiated deletes actually trigger a modrdn to a hidden container (cn=recyclebin), which masks the entries as invisible. To restore you would modrdn them back. A consideration and concern is making sure this does not affect plugins.
@tbordaz has suggested that entries could store "versions" of themselves, i.e. COW/MVCC on the entry. This would allow viewing the entry at a "point in time" and then rolling back to a specific point in time. It would be important to consider how this would work with replication, of course.
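The recycle bin flow can be sketched with a toy in-memory directory. Everything here is hypothetical: the `Directory` class, the way the original DN is remembered, and the visibility filter are assumptions for illustration only. The sketch also ignores RDN collisions inside the bin, which a real design would have to handle.

```python
RECYCLE_BIN = "cn=recyclebin"  # hidden container named in the suggestion above


class Directory:
    """Toy in-memory directory: DN -> attributes."""

    def __init__(self):
        self.entries = {}
        self._origin = {}  # recycled DN -> original DN (assumed bookkeeping)

    def soft_delete(self, dn):
        """Move the entry under the recycle bin instead of deleting it."""
        rdn = dn.split(",", 1)[0]
        binned = f"{rdn},{RECYCLE_BIN}"
        self.entries[binned] = self.entries.pop(dn)
        self._origin[binned] = dn
        return binned

    def restore(self, binned):
        """Move a recycled entry back to its original DN."""
        dn = self._origin.pop(binned)
        self.entries[dn] = self.entries.pop(binned)
        return dn

    def search(self):
        """Return visible DNs; entries under the recycle bin are masked."""
        suffix = "," + RECYCLE_BIN
        return sorted(dn for dn in self.entries if not dn.endswith(suffix))
```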
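A minimal sketch of per-entry versions, assuming each write stores a full copy of the attributes keyed by a strictly increasing timestamp. `VersionedEntry` is a hypothetical class for illustration, and replication is not modelled at all. Rollback is done by writing the old state as a new version, so history is never rewritten.

```python
import bisect


class VersionedEntry:
    """Toy entry that keeps every version it has ever had."""

    def __init__(self):
        self._times = []   # sorted write timestamps
        self._states = []  # attribute snapshot for each timestamp

    def write(self, timestamp, attrs):
        """Store a new version; timestamps must strictly increase."""
        if self._times and timestamp <= self._times[-1]:
            raise ValueError("timestamps must increase")
        self._times.append(timestamp)
        self._states.append(dict(attrs))

    def at(self, timestamp):
        """Return the entry as it looked at `timestamp`, or None if it did not exist yet."""
        i = bisect.bisect_right(self._times, timestamp)
        if i == 0:
            return None
        return dict(self._states[i - 1])

    def rollback_to(self, timestamp, now):
        """Make the state from `timestamp` current again by writing it as a new version."""
        old = self.at(timestamp)
        if old is None:
            raise LookupError("entry did not exist at that time")
        self.write(now, old)
```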
Metadata Update from @mreynolds:
- Issue set to the milestone: FUTURE