#62 Possible to miss updates
Closed: Fixed 6 years ago Opened 6 years ago by tibbs.

If you're a pre-bitflip mirror, it's possible to miss the bitflip if the file lists aren't regenerated immediately when the flip happens. And it's possible to miss updates to existing non-checksummed files because of file list update delays.

Here's the bitflip case:

  • The bitflip happens at time T, but the file lists aren't immediately updated.
  • Mirror polls at time U; sees no changes.
  • File lists are regenerated at V > U > T. File list now contains change time T for the relevant directory.
  • Mirror polls again at W, gets new file list, but since the last mirror time U > T, it doesn't see any change there and so doesn't include the directory in the transfer list.
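
The sequence above can be replayed with a minimal Python sketch. The function name `naive_needs_transfer`, the constants, and the numeric times are all hypothetical, and the comparison is only an assumption about how a client might decide based on its last mirror time. It is not taken from the actual client code.

```python
def naive_needs_transfer(change_time, last_mirror_time):
    """Hypothetical client check: transfer only what changed after the last mirror run."""
    return change_time > last_mirror_time


# Illustrative times in arbitrary units, chosen so that T < U < V < W.
T, U, V, W = 100, 110, 120, 130
OLD_CHANGE_TIME = 50   # directory change time recorded in the stale file list
PREVIOUS_POLL = 90     # an earlier, successful mirror run before the flip

# Poll at U: the master still serves the stale list, so nothing looks new.
seen_at_u = naive_needs_transfer(OLD_CHANGE_TIME, PREVIOUS_POLL)
last_mirror_time = U

# Poll at W: the list regenerated at V now records change time T, but T < U.
seen_at_w = naive_needs_transfer(T, last_mirror_time)

print(seen_at_u, seen_at_w)
```

Both checks come out false, so under these assumptions the directory never enters the transfer list even though its contents changed at T.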

Plus, even if it does get into the transfer list, our directory times aren't necessarily kept in sync, so rsync might not transfer the changes anyway.

Mirrors which just get unrestricted content will simply see the files appear and, because they don't have them locally, transfer them regardless of timestamp.

In fact, this is a more general problem: it occurs whenever the content of a file changes and the file lists aren't updated before a client next polls. If it takes X minutes for the host to generate a file list after the change, the client must look back at least X minutes before its last poll for updated files, regardless of how often it polls.
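
To make the look-back arithmetic concrete, here is a small Python sketch. The function `needs_transfer_with_lookback` and its `lookback` parameter are hypothetical names introduced for illustration, and the times are arbitrary units rather than measurements.

```python
def needs_transfer_with_lookback(change_time, last_mirror_time, lookback):
    """Hypothetical check that also accepts changes up to `lookback` before the last run."""
    return change_time > last_mirror_time - lookback


# Change at 100, client polled at 110, file list regenerated at 120.
change_time = 100
last_mirror_time = 110
generation_delay = 120 - change_time  # X: how long the host took to publish the change

without_lookback = needs_transfer_with_lookback(change_time, last_mirror_time, 0)
with_lookback = needs_transfer_with_lookback(change_time, last_mirror_time, generation_delay)
```

With no look-back the change is missed; with a window of X it is picked up. The catch is that the client has to know, or guess an upper bound for, X, and everything that changed inside the window is reconsidered on every poll.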

A possible solution is to somehow sort/diff the file lists and unconditionally transfer any changes. In most cases we will have the previous file list handy, so this shouldn't actually be too difficult.

So it comes down to having an extra step:

  • Extract the timestamp/path pair (or timestamp/size/path tuple) from both the old and new file lists, sort both by path, and diff them. In fact, it should be possible to use the existing awk syntax to just pull the [files] section and use that directly. This should already be done for the new file lists.
  • Using sort -t$'\t' -k2 (or -k3 depending) to sort by the path.
  • Using diff --changed-group-format='%>' --unchanged-group-format='' old new makes it possible to avoid postprocessing the diff output too much, though you still need to extract just the path from the set of changes.
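
The same idea can be sketched in Python instead of the sort/diff pipeline. This is only an illustration: it assumes each extracted [files] entry is a tab-separated line with the path in the last field, and the helper names `parse_entries` and `changed_paths` are hypothetical. Keying on the path makes the explicit sort step unnecessary.

```python
def parse_entries(lines):
    """Map path -> metadata tuple, assuming tab-separated lines with the path last."""
    entries = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        *meta, path = line.split("\t")
        entries[path] = tuple(meta)
    return entries


def changed_paths(old_lines, new_lines):
    """Return sorted paths that are new, or whose metadata differs, in the new list."""
    old = parse_entries(old_lines)
    new = parse_entries(new_lines)
    return sorted(path for path, meta in new.items() if old.get(path) != meta)


# Made-up entries: timestamp, size, path.
old_list = ["100\t10\tpub/a.rpm", "100\t20\tpub/b.rpm"]
new_list = ["100\t10\tpub/a.rpm", "150\t20\tpub/b.rpm", "160\t30\tpub/c.rpm"]
to_transfer = changed_paths(old_list, new_list)
```

Here `to_transfer` contains `pub/b.rpm` (timestamp changed) and `pub/c.rpm` (new), regardless of when the client last polled. Like the `%>` group format, the sketch reports only lines from the new list, so paths that disappeared are not included.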

This should catch permission changes whenever they happen, as well. It will in fact basically duplicate some of the existing change detection. But that's OK; we want to be thorough and we will deduplicate the transfer list at the end of the process.


Metadata Update from @tibbs:
- Issue tagged with: semi-bug

6 years ago

Metadata Update from @tibbs:
- Issue untagged with: semi-bug
- Issue tagged with: bug, immediate

6 years ago

Metadata Update from @tibbs:
- Issue set to the milestone: 0.1

6 years ago

I've decided that this is really a blocker, since it can also trigger when rawhide is re-signed with the next key, so any mirror would go partially stale once per release cycle.

I just had this happen again to my tier1 mirror after Tuesday's bitflip. Do you have a recommended way to get it "unstuck" after this happens? I just deleted the fullfiletimelist-* files which seems to work.

I pushed a number of commits which together implement the file list diffing method, though slightly differently than I outlined initially. Test cases have also been added which attempt to duplicate a few of the scenarios which could have caused missed updates, all of which failed before and pass now.

This has soaked on my servers for a bit and at least doesn't appear to cause any problems.

Also, @cra, I unfortunately did not see your message before now but the answer to your question is the -a flag. That causes q-f-m to always do full file list processing even if the lists are unchanged on the master.

Metadata Update from @tibbs:
- Issue untagged with: bug, immediate
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

6 years ago
