Since we know before we call rsync which files we may be transferring, and we have some idea of whether we already have files with matching names/sizes (from the file lists), we can, for each file we expect to transfer which doesn't already exist,
If that ends up being the wrong file, the rsync will simply overwrite it. If it's the right file, then we potentially saved ourselves a transfer. This would help even if the files aren't hardlinked on the master server.
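As a rough illustration of the pre-linking idea, here is a minimal Python sketch. It assumes the intended step is to hardlink an existing file with the same name and size into the expected location before rsync runs. The function names and the `(relative_path, size)` input format are hypothetical and not taken from the mirror script.

```python
import os
import stat
from collections import defaultdict


def index_existing(root):
    """Map (basename, size) -> list of paths of regular files under root."""
    index = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode):
                index[(name, st.st_size)].append(path)
    return index


def prelink(root, expected, index):
    """Hardlink plausible matches into place for files we expect to transfer.

    expected: iterable of (relative_path, size) pairs (hypothetical format,
    standing in for entries derived from the file lists).
    Returns the relative paths that were linked.
    """
    linked = []
    for rel, size in expected:
        dest = os.path.join(root, rel)
        if os.path.lexists(dest):
            continue  # already present; leave it for rsync to verify
        candidates = index.get((os.path.basename(rel), size))
        if not candidates:
            continue  # nothing with a matching name and size
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        os.link(candidates[0], dest)  # a wrong guess gets overwritten by rsync
        linked.append(rel)
    return linked
```

Matching on name and size alone is only a guess at identity, which is why the rsync run that follows still has to be the final authority on content.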
Open questions:
Having a few extra files shouldn't actually hurt anything; it might look weird, and they would be pulled down by anyone who uses plain rsync to download content, but they would be removed automatically the next time the module changes. The files wouldn't be unsafe because they would simply be copied from elsewhere in the repo.
It occurs to me that, depending on how things are set up, a noarch package could end up signed by a different key depending on where it was built, and the two copies could still end up with the same size. Such a file might get linked into place, and then the rsync call could bomb out before this "error" gets corrected. Because the file then looks perfectly correct, a subsequent run wouldn't notice it.
So:
Detection is technically possible because the inode change time (ctime) on the now "bad" file will be newer than the last mirror time (which won't be updated, since the rsync call failed). So we'd need to have our big find over $moduledir/* also output %c and include in the list anything which is newer than the last mirror time. That isn't really that tough and introduces no additional stat calls. Since hardlinking changes the ctime of the shared inode, and therefore of every name linked to it, this would occasionally result in a few extra files being sent to rsync, but this is almost completely harmless as they won't be transferred.
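The ctime check could look roughly like the following Python sketch. It is only an illustration: the proposal above is to add %c to the existing find output, whereas this version does its own directory walk and stat calls. The function name and the epoch-seconds form of the last mirror time are assumptions.

```python
import os
import stat


def changed_since(moduledir, last_mirror_time):
    """Return regular files under moduledir whose inode change time (ctime)
    is newer than last_mirror_time, given in seconds since the epoch.

    Note: st_ctime is the inode change time on Unix; on Windows it means
    something else, so this sketch assumes a Unix mirror host.
    """
    changed = []
    for dirpath, _dirs, files in os.walk(moduledir):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode) and st.st_ctime > last_mirror_time:
                changed.append(path)
    return sorted(changed)
```

Anything this returns would be added to the list handed to rsync, so a file that was linked into place during a failed run gets rechecked on the next one.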
So, the bottom line is that yes, Fedora, it seems, will intentionally change the content of RPM files without changing the name, and the size remains the same as well. This means that, at least for now, any attempt to be this fancy is going to have to take extra care.
The dates do change when this happens, and thus the file list will reflect the change, but it's possible due to bugs and whatnot that something is missed. So if I go forward with an implementation of this, it needs to be very well tested against various failure modes including interrupted transfers.
Some basic explanation as to why this would still be worth the effort:
Metadata Update from @tibbs: - Issue tagged with: feature