#50265 the warning about skew time could last forever
Closed: wontfix 4 years ago by lkrispen. Opened 5 years ago by lkrispen.

Issue Description

from the BZ:
after a problem with time skew for csn generators, the server tries to adjust the delta, but sometimes, instead of reducing it until it disappears, we keep incrementing it. So the WARN can last forever.

Example:

[26/Jan/2019:15:25:33.964643906 +0300] - WARN - csngen_new_csn - Too much time skew (-492 secs). Current seqnum=4
[26/Jan/2019:15:25:33.976372625 +0300] - WARN - csngen_new_csn - Too much time skew (-493 secs). Current seqnum=1
[26/Jan/2019:15:25:33.979472136 +0300] - WARN - csngen_new_csn - Too much time skew (-494 secs). Current seqnum=1
[26/Jan/2019:15:25:34.120723671 +0300] - WARN - csngen_new_csn - Too much time skew (-494 secs). Current seqnum=1

It gets bigger each time. It will never reach 0.


Metadata Update from @lkrispen:
- Custom field origin adjusted to None
- Custom field reviewstatus adjusted to None
- Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1673472

5 years ago

The ultimate goal of csn time management is to ensure that a csn is never generated which is smaller than any existing csn. There are adjustments for local and remote time changes or differences, and over time the timestamps in the csns can differ from the system times on the machines participating. But the csn time consistently moves only forward.

For the problem reported here we can concentrate on a single-master scenario: the problem is caused by a local time change and is neither affected nor healed by remote csns.

In one master we have a csn generator state containing three time components: sampled time, remote_offset, and local_offset (and despite their names, the offsets are triggered by remote time differences). A new csn will get the timestamp: sampled_time + remote_offset + local_offset.
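
As a minimal sketch of that state and the timestamp computation (hypothetical names and types, not the actual csngen structures from the code base):

    #include <stdint.h>
    #include <time.h>

    /* Hypothetical sketch of the generator state described above. */
    typedef struct example_csngen_state {
        time_t sampled_time;   /* last value we determined for the local time */
        time_t remote_offset;  /* triggered by remote time differences */
        time_t local_offset;   /* likewise, despite the "local" name */
        uint16_t seq_num;      /* 16-bit sequence number within one timestamp */
    } example_csngen_state;

    /* The timestamp a new csn gets: sampled_time + remote_offset + local_offset. */
    static time_t
    example_csn_time(const example_csngen_state *s)
    {
        return s->sampled_time + s->remote_offset + s->local_offset;
    }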

The sampled time corresponds to the local system time, and we have to handle time changes, so we always store the value we last determined as the sampled time.
If current_time > sampled_time, this can have two reasons: either some time has passed since the last csn was generated and the sampled time was updated, or the system clock was moved forward. In both cases we adjust the sampled time to the current time and know that the generated csn will not fall behind.

If current_time < sampled_time, this means that the system clock was turned back. The good thing is we do not need to do any adjustment: we use the sampled time and increase the sequence number, and we know that the new csn will be ahead of existing csns (NOTE: the resulting time difference to other systems will be handled in the offsets, not in the sampled time).
So far so good, but there is one scenario where this could go wrong. If the sampled time stays fixed until the system time has caught up, everything is handled in the sequence number; and if the time is set back too far, there could be more updates in that interval than we have sequence numbers (65k), and the sequence number could roll over.
And here the extra adjustment for the case current_time - sampled_time < -300 kicks in. The assumption is that generating 65k mods takes more than 300 seconds, so in that case the sampled time is increased by one second to compensate.
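
Putting the three cases together, a condensed sketch of the behaviour described above (reusing the hypothetical example_csngen_state from the earlier sketch; this paraphrases the description, it is not the actual csngen_new_csn() source):

    static void
    example_sample_time(example_csngen_state *s, time_t now)
    {
        if (now > s->sampled_time) {
            /* Time passed or the clock moved forward: follow the clock. */
            s->sampled_time = now;
            s->seq_num = 0;
        } else if (now - s->sampled_time < -300) {
            /* The questionable adjustment: nudge sampled_time forward to
             * avoid seq_num rollover. Note that this makes the reported
             * skew one second larger on every call until the system clock
             * catches up (resetting seq_num here is an assumption). */
            s->sampled_time++;
            s->seq_num = 0;
        } else {
            /* Clock turned back (by at most 300 secs) or unchanged: keep
             * sampled_time and rely on seq_num to stay ahead of existing csns. */
            s->seq_num++;
        }
    }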

But there are several issues with this approach.
- the method itself increases the time difference, and so it can diverge more and more
- there is no memory that it was already done
- the 300 sec threshold can already be wrong, or become wrong soon

In my opinion a better approach would be to tackle the problem where the seq_num could actually overflow, and only then increase the sampled time (and reset the seqnum).

I agree. As long as the csn clock is continually advancing (and it is, since replication advances the partners' csn clocks), you are correct that the seq_num overflow is the only possible issue. I would naively assume we could handle it via seq_num: when approaching MAX, we just increment the csn clock "ahead of schedule"?
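
A minimal sketch of that idea (hypothetical names, reusing the example state from above; it comes out close to the existing rollover check quoted later in this thread):

    #define EXAMPLE_SEQ_MAX UINT16_MAX

    /* When the sequence number is about to overflow, advance the csn
     * clock one second "ahead of schedule" and restart the sequence. */
    static void
    example_next_seq(example_csngen_state *s)
    {
        if (s->seq_num == EXAMPLE_SEQ_MAX) {
            s->local_offset++; /* move csn time forward by one second */
            s->seq_num = 0;
        } else {
            s->seq_num++;
        }
    }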

The alternative would be to make seq_num a larger type than a 16-bit int, say 32 or 64 bits, but I think that would be very invasive to the replication system.

PS: I think we can remove the clock skew warning altogether. Given the design of the clock, skew doesn't really negatively affect the system, since our csn clock is effectively separate for csn timekeeping purposes. We only care that the system clock is ticking, not that it's accurate.

I second @firstyear. The mechanism is there to protect from seqnum overflow, so there is no reason to use a time-based threshold.
Increasing the timestamp on seqnum rotation looks safe to me.

However, I fail to see how it theoretically addresses two of your concerns (the continuously diverging time (csn vs. current) and the memory of a previous adjustment). From a practical point of view, though, I doubt that the seqnum will rotate frequently.

The drift of csn time and real time is of no concern; it is a consequence of having a distributed system with, at least temporarily, time differences. Even with time management via ntp or chronyd, the seconds on the systems are not exactly aligned. We manage this, and it only becomes visible if someone decodes a csn.
If we do not apply a time adjustment like now (adding a second if the difference is < -300), we do not need the memory, but we could log the adjustment caused by seqnum overflow.

I still think there is little need to log the adjustment though. We are building a Lamport clock, so it's expected that we will continually be adjusting and advancing the clock state based on events and real time. I think we should just remove this warning altogether and deal with the overflow as you have mentioned :)
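
For reference, the Lamport clock rule alluded to here, as a minimal standalone sketch (illustrative only, not 389-ds-base code):

    #include <stdint.h>

    /* On a local event the clock ticks; on receiving a remote timestamp
     * it jumps ahead of that timestamp if necessary. Either way the clock
     * only moves forward, which is the property the csn clock relies on. */
    static uint64_t
    lamport_local_event(uint64_t clock)
    {
        return clock + 1;
    }

    static uint64_t
    lamport_receive(uint64_t clock, uint64_t remote)
    {
        return (clock > remote ? clock : remote) + 1;
    }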

In fact there is already a check for seq_num overflow just after this questionable adjustment:

    if (gen->state.seq_num == CSN_MAX_SEQNUM) {
        slapi_log_err(SLAPI_LOG_INFO, "csngen_new_csn", "Sequence rollover; "
                                                        "local offset updated.\n");
        gen->state.local_offset++;
        gen->state.seq_num = 0;
    }

so it is handled, and logged. Removing the -300 part should be sufficient.
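
For illustration, the earlier example_sample_time() sketch with the -300 branch removed would look like this; rollover is then handled solely by the CSN_MAX_SEQNUM check above (this paraphrases the idea, not the actual patch):

    static void
    example_sample_time_fixed(example_csngen_state *s, time_t now)
    {
        if (now > s->sampled_time) {
            /* Follow the clock forward. */
            s->sampled_time = now;
            s->seq_num = 0;
        } else {
            /* Clock behind or unchanged: rely on seq_num (and on the
             * rollover check) to keep new csns ahead of existing ones. */
            s->seq_num++;
        }
    }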

https://pagure.io/389-ds-base/pull-request/50299
This fix is missing in 1.3.9. @lkrispen, could you please push it there?
Thanks!

Fix is in 1.3.9, please ignore.

Metadata Update from @lkrispen:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

4 years ago

389-ds-base is moving from Pagure to GitHub. This means that new issues and pull requests
will be accepted only in 389-ds-base's GitHub repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/3324

If you want to receive further updates on the issue, please navigate to the GitHub issue
and click on the Subscribe button.

Thank you for understanding. We apologize for any inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: fixed)

3 years ago
