#4539 Create file for rsync mirrors indicating last update
Closed: Fixed None Opened 6 years ago by adrian.

My mirror (ftp-stud.hs-esslingen.de) syncs from dl.fedoraproject.org::fedora-buffet0/. Most of the time when rsync runs, there is something new to transfer. There are, however, cases when nothing has changed since the last run, and rsync still runs for about 2500 seconds to transfer 0 files.

The following is an overview of my rsync runs, how long they take and how many files are transferred:

http://ftp-stud.hs-esslingen.de/info/status.php4?details=4

It would be nice if each rsync module contained an (empty) file which is updated every time the content of the rsync module changes. This way I could compare the date of this file with the date of the local copy and only start the complete rsync run if something has actually changed. The existing file DIRECTORY_SIZES.txt is only updated once a day as far as I can see. For fedora-enchilada there is a file like that (fullfilelist) but not for fedora-buffet.
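
The comparison described above could look roughly like the following sketch. It assumes a hypothetical marker file (called `LAST_UPDATE` here, which does not exist in fedora-buffet) that has already been fetched with its modification time preserved. The function name and all paths are illustrative only.

```shell
#!/bin/bash
# Sketch only: marker_is_newer and the file names are hypothetical.

# Return 0 if the freshly fetched marker is newer than the copy kept from
# the previous run (or if no previous copy exists), 1 otherwise.
marker_is_newer() {
    local fetched=$1 previous=$2
    [ ! -e "$previous" ] && return 0
    [ "$fetched" -nt "$previous" ]
}

# Example use (commented out; the path of the marker inside the module
# is an assumption):
# rsync -t dl.fedoraproject.org::fedora-buffet0/LAST_UPDATE /tmp/LAST_UPDATE.new
# if marker_is_newer /tmp/LAST_UPDATE.new /var/lib/mirror/LAST_UPDATE; then
#     # run the full rsync here, then remember the marker:
#     mv /tmp/LAST_UPDATE.new /var/lib/mirror/LAST_UPDATE
# fi
```

Fetching a single small file this way would be much cheaper than letting rsync walk the whole module just to find out that nothing changed.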


We already have something like this... our syncs generate fedmsg messages. So you can query for the last one of those or wait for one.

See: https://fedorahosted.org/fedora-infrastructure/ticket/1133#comment:7

Basically we need this documented/tested and then we can tell mirrors to use it (if they like).

Should we close this ticket in favor of that one? Is this something you might be able to work on? ;)

This would work for most content except "alt/". I guess this is good enough.

I would be willing to work on it/document it. Unfortunately I am not able to get the list of topics from https://apps.fedoraproject.org/datagrepper/. It just times out.

We were having some nasty database problems with datagrepper. ;(

Can you try again now? Hopefully it will be better...

I see the following topics which could mean the sync has completed.

The first two seem to indicate that the Fedora and EPEL updates syncs have completed:

  • org.fedoraproject.prod.bodhi.updates.fedora.sync
  • org.fedoraproject.prod.bodhi.updates.epel.sync

I suppose

  • org.fedoraproject.prod.compose.branched.rsync.start
  • org.fedoraproject.prod.compose.branched.rsync.complete

this would mean that the F21 branch has completed

  • org.fedoraproject.prod.compose.rawhide.rsync.complete
  • org.fedoraproject.prod.compose.rawhide.rsync.start

and this means that rawhide has completed. The start topic is probably irrelevant for mirrors.

Is that correct? Are those messages the ones I am interested in? Do I need to look at other events?

yes, those all look correct. ;)
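
For reference, a mirror admin could look up the newest message on one of these topics with a query along the following lines. This is only a sketch: it assumes datagrepper's `/raw` endpoint with the `topic`, `delta` and `rows_per_page` query parameters (worth checking against the datagrepper documentation), and the function and variable names are made up for illustration.

```shell
#!/bin/bash
# Sketch only: function and variable names are hypothetical.
DATAGREPPER="https://apps.fedoraproject.org/datagrepper"

# Build a datagrepper query URL for the newest message on one topic
# within the last $2 seconds (default: 86400, i.e. one day).
latest_message_url() {
    local topic=$1 delta=${2:-86400}
    printf '%s/raw?topic=%s&delta=%s&rows_per_page=1\n' \
        "$DATAGREPPER" "$topic" "$delta"
}

# Example (requires network access):
# curl -s "$(latest_message_url org.fedoraproject.prod.compose.rawhide.rsync.complete)"
```

If no message shows up in the response for the given delta, no sync on that topic has completed in that window.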

I added an attachment: last-sync

It tries to use the previously mentioned topics to display information about the last sync.

{{{
[adrian@dcbz last-sync]$ ./last-sync -f
updates-19: Mon, 29 Sep 2014 03:47:15 +0000
updates-20: Mon, 29 Sep 2014 03:50:59 +0000
updates-testing-19: Mon, 29 Sep 2014 03:51:49 +0000
updates-testing-20: Mon, 29 Sep 2014 03:52:17 +0000
updates-testing-21: Mon, 29 Sep 2014 03:53:06 +0000
[adrian@dcbz last-sync]$ ./last-sync -e
epel-5: Sun, 28 Sep 2014 18:20:53 +0000
epel-testing-5: Sun, 28 Sep 2014 18:21:19 +0000
epel-6: Sun, 28 Sep 2014 18:28:48 +0000
epel-testing-6: Sun, 28 Sep 2014 18:29:25 +0000
epel-7: Sun, 28 Sep 2014 18:33:04 +0000
epel-testing-7: Sun, 28 Sep 2014 18:33:38 +0000
[adrian@dcbz last-sync]$ ./last-sync -r
rawhide-arm: Mon, 29 Sep 2014 06:50:49 +0000
rawhide-ppc: Mon, 29 Sep 2014 09:00:09 +0000
rawhide-s390: Mon, 29 Sep 2014 09:44:45 +0000
rawhide-primary: Mon, 29 Sep 2014 10:25:55 +0000
[adrian@dcbz last-sync]$ ./last-sync -b
21-s390: Sun, 28 Sep 2014 13:11:45 +0000
21-arm: Mon, 29 Sep 2014 09:02:20 +0000
21-primary: Mon, 29 Sep 2014 10:09:48 +0000
}}}

If a sync was found on the message bus within the given delta, it returns 0; otherwise it returns 1. It can be run with '-q' to suppress output, and the default delta of 86400 seconds (one day) can be changed with '-d'.

It would be nice if someone else could test whether it works and whether the output is sane. I would then document it and post about it on the mirror-list. The script could live in the MirrorManager git, just like report_mirror.

Excellent! ;)

So, how could mirrors easily use this in their sync scripts? I guess they would need to record their last run and check it against the above?

It would be ideal if we had a very simple way to use this information and only rsync if newer pushes have happened. I guess we might need to write the entire sample rsync then though. ;(

For my mirror setup it is easy. For each mirror I record the starttime, endtime and status in a database. So all I have to do is find the starttime of the last successful run and pass the time elapsed since then to my script as a parameter. I am happy now. I will post this script/setup on the mirror-list@redhat.com mailing list to get feedback on how it can best be used or improved.

{{{
# check if mirror needs to be updated

CURDATE=$(date +%s)
# starttime of the last successful run (must be in epoch seconds for the subtraction below)
LASTRUN=$( psql -c 'select starttime from mirror_status where mirror_id=4 and status=0 order by starttime desc limit 1;' ftpadmin | head -3 | tail -1 )
DELTA=$(echo ${CURDATE}-${LASTRUN} | bc)

~ftpadmin/bin/last-sync -d ${DELTA} -q

if [ "$?" -ne "0" ]; then
    # no changes on the master mirror
    # abort
    exit 0
fi
}}}
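
Mirrors that do not keep their run history in a database could record the start time of the last successful run in a plain file instead. The following is a minimal sketch under that assumption; the state file location and the function name are hypothetical, and the fallback of 86400 seconds simply mirrors the default delta of last-sync.

```shell
#!/bin/bash
# Sketch only: the state file and function name are hypothetical.

# Print the number of seconds between "now" ($2, epoch seconds) and the
# start time stored in the state file ($1). Falls back to 86400 if the
# file is missing or does not contain a plain integer.
seconds_since_last_run() {
    local state_file=$1 now=$2 last=''
    if [ -r "$state_file" ]; then
        read -r last < "$state_file" || true
    fi
    case $last in
        ''|*[!0-9]*) echo 86400 ;;
        *) echo $(( now - last )) ;;
    esac
}

# Example use (commented out):
# START=$(date +%s)
# DELTA=$(seconds_since_last_run /var/lib/mirror/last-start "$START")
# if ~ftpadmin/bin/last-sync -d "$DELTA" -q; then
#     # run the full rsync here; on success remember when this run started:
#     echo "$START" > /var/lib/mirror/last-start
# fi
```

Storing the start time rather than the end time means that pushes happening during a long rsync run are still picked up by the next check.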

Could you also document it on the wiki?

Once it's there we can point new mirrors at it, etc and can close this out? Along with 1133 I think.

I will document it and then close this ticket.

Have you had a chance to document this yet?
