#9717 proposal: "archive" yum repo for Fedora; non-latest packages from the updates repo
Closed: Fixed 9 months ago by pingou. Opened a year ago by dustymabe.

  • Describe the issue

In Fedora CoreOS, Silverblue, and IoT, we sometimes have issues where layering packages on top of the base OSTree layer doesn't work because the base layer is slightly out of date with the latest available package set in Fedora's updates repos.

For Silverblue (not sure about IoT) this can be worked around by updating the base OSTree layer at the same time you attempt to layer packages. For Fedora CoreOS, we have a stable stream that lags behind by a few weeks, so it's not possible to work around the issue so easily. Also, if Silverblue ever changes and starts releasing on a weekly cadence instead of every day, it would be affected too.

The proposed solution to this problem is to have an "archive repo" where old packages from the updates repo are kept and available for situations like this. I have a proof of concept set of code that:

  • downloads builds from koji in a given tag
    • subsequent runs only download what has changed since last time
  • stores all files in s3 (no added infrastructure storage load)
  • runs createrepo_c against the package set
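
The "only download what has changed" step above can be illustrated with a minimal Python sketch. It is not the proof-of-concept code itself: the function name `builds_to_download` and the sample NVRs are hypothetical, and it assumes the list of builds in the koji tag and the list of builds already stored in s3 have been fetched elsewhere.

```python
from typing import Iterable, List


def builds_to_download(tagged_nvrs: Iterable[str],
                       archived_nvrs: Iterable[str]) -> List[str]:
    """Return the builds in the tag that are not yet in the archive.

    Nothing is ever removed from the archive, so the plan is simply
    "tagged minus already archived", sorted for stable output.
    """
    archived = set(archived_nvrs)
    return sorted(nvr for nvr in set(tagged_nvrs) if nvr not in archived)


if __name__ == "__main__":
    # Hypothetical NVRs, for illustration only.
    tagged = ["foo-1.0-1.fc32", "foo-1.0-2.fc32", "bar-2.3-1.fc32"]
    archived = ["foo-1.0-1.fc32"]
    print(builds_to_download(tagged, archived))
```

On a later run, only the builds that were tagged since the previous run show up in the returned list, so repeated runs stay cheap.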

I have demonstrated this proof of concept and it appears to be received well.

Next steps for this would be:

FOR DUSTY

  • placing the code into a releng repo (pagure.io/releng/archive-repo-manager maybe?)
  • updating code to be triggered by fedmsg (end of bodhi run)
  • updating code to apply to more architectures
  • updating code to run in a container (i.e. can run in infra openshift)
  • delivering the repo via a subpackage of fedora-repos rpm.
    • Only installed on OSTree based systems by default.

FOR RELENG

  • determining a fedora project URL to use for the repo (redirect to s3)
  • create an s3 bucket in prod account to be used
    • maybe bucket name of fedora-archive-yum-repo
    • give dustymabe access to the bucket for development purposes
    • create ansible secrets with credentials to access the bucket
    • make objects in the bucket completely public
  • verifying what createrepo arguments we want to use for the repository

There have been discussions in the past about keeping a week's worth of builds, or n-1, or something like that in the repos in general, to address other issues in this context. I feel that coming to an agreement on that would solve most of the issues outlined above.

@pbrobinson would you like to schedule some time to discuss that as an option? Or add the proposal to the mailing list discussion or to this ticket?

Metadata Update from @humaton:
- Issue tagged with: meeting

a year ago

@pbrobinson I think it's better to have the yum repo generated for all the non-latest rpms rather than doing it every week or something like that, since that might miss some builds.

We briefly discussed this topic at the releng meeting today and, again, we don't know the other proposal. Please let us know about it so we can evaluate it as well.

Thanks.

Metadata Update from @mohanboddu:
- Issue tagged with: dev, groomed, high-gain, high-trouble, ops

a year ago

I've got the initial version of the code for the archive-repo-manager up.

Right now it doesn't work in OpenShift because I can't figure out how to get /dev/fuse properly mounted into a container. Worst case, we'll just run it in an AWS instance for now (which makes sense since we're storing most of the content in s3) until /dev/fuse is easier to get running in OpenShift.

I also talked to @jdieter about zchunk and I got it wired up so it's creating zchunk data, which has made incremental metadata updates quite nice.

I've also removed the creation of sqlite metadata files. Between me and @jdieter we don't think they are used any more and creating them means more copying data back and forth between s3. Should we stop creating them for our other repos we create?

I think the next steps are to do the RELENG items from the description.
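
For readers following along, here is a rough Python sketch of what a createrepo_c invocation with zchunk metadata and without sqlite databases might look like. The actual command used by archive-repo-manager is not shown in this ticket, so the helper name `createrepo_command` and the choice to pass `--update` are assumptions; `--zck`, `--no-database`, and `--update` are createrepo_c options as I understand them, and `--zck` requires a createrepo_c build with zchunk support.

```python
from typing import List


def createrepo_command(repo_dir: str, update: bool = True) -> List[str]:
    """Build an argv list for createrepo_c (hypothetical helper).

    --no-database : skip generating the sqlite metadata files
    --zck         : also generate zchunk-compressed metadata
    --update      : reuse existing metadata for unchanged packages
    """
    cmd = ["createrepo_c", "--no-database", "--zck"]
    if update:
        cmd.append("--update")
    cmd.append(repo_dir)
    return cmd


if __name__ == "__main__":
    # The directory path is a placeholder, not the real repo location.
    print(" ".join(createrepo_command("/srv/archive-repo")))
```

The returned list can be handed to `subprocess.run(..., check=True)` on a host where createrepo_c is installed.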

I've also removed the creation of sqlite metadata files. Between me and @jdieter we don't think they are used any more and creating them means more copying data back and forth between s3. Should we stop creating them for our other repos we create?

The sqlite metadata files are used by https://mdapi.fedoraproject.org; before removing them, we need to rewrite that service so that it does not depend on them.

OK, the archive-repo-manager is running, and @kevin set up a URL (https://fedoraproject-updates-archive.fedoraproject.org/) for us to use that is fronted by a CDN.

Here is the PR to add the yum repo as a fedora-repos-archive subpackage of the fedora-repos rpm.

https://src.fedoraproject.org/rpms/fedora-repos/pull-request/79

@dustymabe Is there anything else needed on this ticket?

@dustymabe We are closing this ticket. If you need any more help, please let us know by reopening it.

Metadata Update from @mohanboddu:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

10 months ago

Issue status updated to: Open (was: Closed)

9 months ago

Issue status updated to: Closed (was: Open)
Issue close_status updated to: Fixed

9 months ago


Metadata
Boards 2
Dev Status: Done
Ops Status: Done