#5566 quick-fedora-mirror issues
Closed: Fixed 7 years ago Opened 7 years ago by tibbs.

I just wanted to open an issue to keep track of quick-fedora-mirror things which relate to its implementation in Fedora infra as opposed to upstream or client bugs.

First off is that file list creation has stopped on fedora-secondary.


@pbrobinson can you take a look?

I'm not even sure where the secondary compose/updates runs from... if you let me know I can look at it too if you prefer.


I'm happy to take any output the code is producing. It's conceptually really simple so I don't see how it could be going wrong.

I ran the plain filelist creation script on batcave and it sure looks like everything is OK there, but it's quite possible there's some issue in moving the generated file lists into place. I'll need to dig deeper to see how that's happening, but I don't think I can do much in the way of experimentation with the privileges I have.

Once the freeze is over it would probably be a good idea for me to try and simplify the process. I'm trying to get my head back into this code but right now I just need to get my mirrors to fully synchronize.

So here's the bit from the playbook where it gets called:

ansible/playbooks/groups/secondary.yml:    cron: name="update-fullfiletimelist" hour="*" minute="55" user="root"
ansible/playbooks/groups/secondary.yml:        job="/usr/local/bin/lock-wrapper update-fullfiletimelist '/usr/local/bin/update-fullfiletimelist -l /tmp/update-fullfiletimelist.lock -t /srv/pub alt'"
ansible/playbooks/groups/secondary.yml:        cron_file=update-fullfiletimelist

I'm not sure what lock-wrapper is or why it's being used, since update-fullfiletimelist does locking internally. It's also only updating the alt module, not the fedora-secondary module.

I don't see any other instance of update-fullfiletimelist being called in ansible that also references secondary in any way. But it did run as of a few days ago, so I will dig back through the history a bit.

So at some point yesterday the file lists did update. But somehow the two lists have timestamps ten minutes apart, which is kind of odd, and nothing has happened since 15:45 yesterday (Nov 15).
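For anyone wanting to check this kind of thing without eyeballing directory listings, here is a minimal sketch that reports how old each file list is and flags the stale ones. The function names and the two-hour threshold are my own assumptions, and the paths you pass in are whatever the file lists are actually called on the host; nothing here comes from quick-fedora-mirror itself.

```python
import os
import time


def filelist_ages(paths, now=None):
    """Return {path: age_in_seconds} from each file's mtime; missing files map to None."""
    now = time.time() if now is None else now
    ages = {}
    for path in paths:
        try:
            ages[path] = now - os.stat(path).st_mtime
        except FileNotFoundError:
            ages[path] = None
    return ages


def stale(ages, max_age=2 * 3600):
    """Return the sorted paths that are missing or older than max_age seconds (assumed threshold)."""
    return sorted(p for p, age in ages.items() if age is None or age > max_age)
```

Calling `stale(filelist_ages([...]))` on the per-module file lists would show at a glance which module stopped updating, and comparing the individual ages would expose a gap like the ten minutes mentioned above.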

Since I can't figure out where in the playbooks the fedora-secondary file lists are ever getting generated, I'm at a loss to explain why it happened for a while and then stopped again. Perhaps someone did a manual run, but @nirik didn't do one.

So, we can probably just drop the lock-wrapper; I think it was added before that script had its own locking.

The complexity here comes from a bunch of different machines updating those things at different times. Each alternative arch compose box updates fedora-secondary when it pushes updates (which is mostly manual, I think), and they also update fedora-secondary daily when their branched compose runs. alt content can be updated by humans at any time. Rawhide updates from rawhide-composer, branched from branched-composer. fedora is also updated from bodhi-backend03 for updates.

I think the best thing to do will be:

Short term:

  • Just make the cron job on alt also update fedora-secondary hourly (or every 2 hours or something). That's probably overkill, but it should work until after release.
  • Drop any secondary/alternative stuff that's updating for now; just let the cron job handle things on one host.
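As a rough sketch of the short-term plan, one host could run the update once per module, one after another. The script path, lock file, top directory and the `-l`/`-t` flags below are copied from the cron entry quoted earlier in this ticket; the helper names and the idea of a pluggable `runner` are my own assumptions, and I have not checked whether the script accepts several modules in a single invocation, so this issues one command per module.

```python
# Values copied from the existing cron entry in secondary.yml.
SCRIPT = "/usr/local/bin/update-fullfiletimelist"
LOCKFILE = "/tmp/update-fullfiletimelist.lock"
TOPDIR = "/srv/pub"


def update_commands(modules, script=SCRIPT, lockfile=LOCKFILE, topdir=TOPDIR):
    """Build one argv list per module, in the order given."""
    return [[script, "-l", lockfile, "-t", topdir, module] for module in modules]


def run_all(modules, runner):
    """Run each module's update in turn and return {module: exit_code}.

    A failure in one module does not stop the later ones.
    `runner` is any callable taking an argv list and returning an exit code.
    """
    results = {}
    for module, argv in zip(modules, update_commands(modules)):
        results[module] = runner(argv)
    return results
```

With `runner=lambda argv: subprocess.run(argv).returncode`, calling `run_all(["alt", "fedora-secondary"], runner)` would be the single-host equivalent of the cron job covering both modules.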

Longer term:

  • With f25 out the door, the next branched will hopefully all run from branched-composer, so that's a lot fewer boxes to worry about.
  • Set the script to use a lockfile on the NFS mount, and have the actual machines doing updates always update their stuff with a lockfile the other machines can see.
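To illustrate the shared-lockfile idea, here is a minimal sketch of a lock that several machines could take on a common mount. It relies on exclusive file creation (`O_CREAT | O_EXCL`), which per the Linux open(2) man page is supported on NFS only with NFSv3 or later. This is not how update-fullfiletimelist does its locking; the function name, timeout and polling interval are assumptions, and a lock left behind by a crashed host is not cleaned up here.

```python
import contextlib
import os
import socket
import time


@contextlib.contextmanager
def shared_lock(path, timeout=600.0, poll=5.0):
    """Hold an exclusive lock by atomically creating `path`; remove it on exit."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if another holder already made the file.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {path} within {timeout}s")
            time.sleep(poll)
    try:
        # Record who holds the lock so a human can investigate a stuck one.
        with os.fdopen(fd, "w") as handle:
            handle.write(f"{socket.gethostname()} {os.getpid()}\n")
        yield
    finally:
        os.unlink(path)
```

Each updating machine would wrap its file list regeneration in `with shared_lock("<path on the shared mount>"):` so that only one of them rewrites the lists at a time.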

So, looking more I see possibly where this started happening. We recently pulled aarch64, ppc64, ppc64le into the primary koji and so rawhide composes those as well, but puts them in fedora-secondary.

So, I think the secondary update pushes are updating things OK, but the rawhide compose isn't updating anything, so the files get out of sync until the next updates run.

ok. I think this is fixed now.

The cron job on alt updates alt and can (manually) update archive, but it's the only thing that does so, so it's fine.

rawhide-composer, branched-composer, compose-ppc64-01, and bodhi-backend03 all update fedora and fedora-secondary. They now all use the same lock file in fedora-secondary (it may appear in the list, but oh well, no big deal) and they shouldn't step on each other's updates now.


@kevin changed the status to Closed

7 years ago
