When pushing containers to the registry, we should push them first under a tag that is the full compose version (41.20241115.0) and then update the release tag (41) as well.
We probably need to figure out a tag naming scheme for testing builds as well.
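To make the proposal above concrete, here is a minimal sketch of how the two tags could be derived from a compose version. The function name `tags_for_compose` is hypothetical, and the `<release>.<YYYYMMDD>.<respin>` layout is an assumption inferred from the single example (41.20241115.0) given above; this is not how the existing uploader code works.

```python
def tags_for_compose(version: str) -> list[str]:
    """Return the tags to push for one compose, most specific first.

    Assumes the version looks like "<release>.<YYYYMMDD>.<respin>",
    e.g. "41.20241115.0". Anything else is rejected.
    """
    release, day, respin = version.split(".")  # ValueError if not 3 parts
    if not (release.isdigit() and len(day) == 8 and day.isdigit() and respin.isdigit()):
        raise ValueError(f"unexpected compose version: {version!r}")
    # Push the immutable, fully versioned tag first, then move the
    # floating release tag so it never points at a missing image.
    return [version, release]


print(tags_for_compose("41.20241115.0"))
```

Returning the versioned tag first mirrors the ordering suggested in the comment: the floating tag is only moved once the versioned image is already in place.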
What container image specifically are you talking about here? What tag did it get pushed under? The current logic for tag handling intentionally replicates how the shell scripts were doing it; if we want it to do something different, that needs to be planned out.
So if we're talking about atomic desktops containers: yes, AFAICS, this would be new behavior. The old sync-ostree-base-containers.sh did not do this. If you look at https://quay.io/repository/fedora/fedora-silverblue?tab=tags , there are no such tags.
If we're going to do this we would also have to implement garbage collection, wouldn't we? Otherwise we'd rapidly wind up with hundreds of tags, which probably wouldn't make quay very happy.
What's the use-case? We could provide version data as annotations. Either org.opencontainers.image.version or, if we want to set that to the major Fedora release version, org.fedoraproject.image.compose-id or something. Of course, if the reason is you want to be able to check out older builds that won't help you much.
And yeah, if we do this we'll need to decide on a cleanup policy.
Yes, while this is driven by the need for the Atomic Desktops, this would also probably be good for all containers. See an example in https://quay.io/repository/fedora-ostree-desktops/silverblue?tab=tags (where I also include the git shorthash but we probably don't want that).
For garbage collection, the policy should be:
- Keep the last three months of Rawhide
- Keep branched release
- Keep all images for current stable and old stable releases
- Keep only the latest build for EOL releases to let users update older systems.
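The policy listed above could be expressed as a small decision function. This is only a sketch: the `State` enum, the `keep_image` name, and the 90-day approximation of "three months" are assumptions made for illustration, and "keep branched release" is read here as "keep every branched image".

```python
from datetime import date, timedelta
from enum import Enum


class State(Enum):
    RAWHIDE = "rawhide"
    BRANCHED = "branched"
    STABLE = "stable"  # current stable and old stable
    EOL = "eol"


# "Three months" approximated as 90 days (assumption).
RAWHIDE_RETENTION = timedelta(days=90)


def keep_image(state: State, built: date, today: date, is_latest: bool) -> bool:
    """Return True if an image should be kept under the proposed policy."""
    if state is State.RAWHIDE:
        return today - built <= RAWHIDE_RETENTION
    if state in (State.BRANCHED, State.STABLE):
        return True
    # EOL: keep only the latest build so old systems can still update.
    return is_latest


today = date(2025, 1, 1)
print(keep_image(State.RAWHIDE, date(2024, 11, 15), today, False))
print(keep_image(State.EOL, date(2023, 5, 1), today, False))
```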
Note that due to how Bootable Container images are built, a new tag does not necessarily mean a full image size in storage as identical layers are shared between builds. This is why the Silverblue repo for example has a size of 110 GB (https://quay.io/organization/fedora-ostree-desktops) while we have 110 tags in the repo and each image is about 2 GB (https://quay.io/repository/fedora-ostree-desktops/silverblue?tab=tags).
What's the use-case?
Classic ostree repos include the full log and history of all the builds. We want to preserve the same feature for the new format. It's massively useful for development and debugging (for bisecting issues) and it will enable us to implement proper update channels in the future. It's also how we want users to use those containers for derivation: pin to a given release, refresh the pinned release regularly via a CI job.
See:
- https://github.com/coreos/fedora-coreos-tracker/issues/1367
- https://quay.io/repository/fedora/fedora-coreos?tab=tags
Fedora CoreOS is also implementing garbage collection for those images so we could share there: https://github.com/coreos/fedora-coreos-tracker/issues/99
See: https://discussion.fedoraproject.org/t/we-need-to-come-up-with-a-consistent-approach-for-generating-and-publishing-containers-both-traditional-and-atomic-desktop-containers-both-stable-and-unstable-releases/109213/14
Cool, thanks for the details.
I think this would be nice, and can devote some time to the implementation, but I also think I shouldn't be the person to yay/nay this. I'm not, however, certain what the appropriate workflow should be. Maybe we make a releng ticket and get sign-off there?
Beyond just agreeing to a tagging and cleanup policy it would be good to ensure it's uniform and documented. I like the idea of sharing garbage collection with CoreOS. My preference would be for that to be in place before we start pushing tags since otherwise it's possible it won't get done until it's an emergency, but again that feels like something releng people should decide.
I was pointed to this thread and want to add another use case where preventing images from garbage collection would be great.
Let's consider the Dockerfile of a pet project (e.g., https://github.com/vrothberg/fedora-bootc-workstation/blob/main/Dockerfile#L1). I am referencing the fedora-bootc:41 image via a digest. Using the digest allows Renovate Bot to open a PR as soon as the image is updated on the registry while also allowing reproducibility. I am in full control of what goes in, which is exactly what I want for the use case.
However, aggressive GC can easily break the Dockerfile build, because the referenced digest no longer exists on the registry. Having a three month grace period or keeping images around until they go EOL would all work for me. I am mostly looking for a policy we can document and work with.
I kinda feel like we could probably do the garbage collection inline - when pushing images from a new compose, also wipe the corresponding images that are more than X months old (or whatever the relevant heuristic is). Seems like it'd be neatest that way. I can try and find some time to draft an implementation of this, maybe.
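As a rough sketch of the inline approach described above, the push step could pick out versioned tags older than a cutoff before deleting them. Everything here is hypothetical: the helper names, the idea of reading the build date out of the tag, and the assumed `<release>.<YYYYMMDD>.<respin>` tag layout. Floating tags such as "41" are skipped because they carry no date.

```python
from datetime import date, datetime, timedelta
from typing import Optional


def compose_date(tag: str) -> Optional[date]:
    """Extract the build date from a "<release>.<YYYYMMDD>.<respin>" tag.

    Returns None for tags that do not match that (assumed) layout.
    """
    parts = tag.split(".")
    if len(parts) != 3 or len(parts[1]) != 8:
        return None
    try:
        return datetime.strptime(parts[1], "%Y%m%d").date()
    except ValueError:
        return None


def expired_tags(tags: list[str], today: date, max_age: timedelta) -> list[str]:
    """Return the versioned tags whose build date is older than max_age."""
    cutoff = today - max_age
    expired = []
    for tag in tags:
        built = compose_date(tag)
        if built is not None and built < cutoff:
            expired.append(tag)
    return expired


tags = ["41", "41.20241115.0", "41.20240701.0", "latest"]
print(expired_tags(tags, date(2024, 12, 1), timedelta(days=90)))
```

Only the selection is shown; the actual deletion call depends on the registry and is left out on purpose.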
We don't really keep around old content for any other deliverable, I'm not sure why we should do it for containers, especially when it means we could easily fill up various storage quotas doing it.
Atomic Desktop container images are not small, and Fedora isn't paying for space on the various registries. We have no budget for it and no means to support it. Not to mention it creates a support problem as we can get people using old content forever, which we do not support as a project.
[...] we can get people using old content forever [...]
Can you elaborate on that?
The GC policy should prevent images from lingering around forever, which should address the issue of users potentially using outdated content forever. The storage issues may turn into a problem but I would wait and see.
The -bootc images are published on Quay where we can work with Red Hat in case the storage hunger is getting too big. It is a strategic direction for Red Hat, so I am sure we won't get in trouble with Quay. I cannot speak for other registries but I personally care most about the -bootc images at the moment.
I'm not against doing that (we do it for Azure images) and it does have the advantage of not splitting the management across projects. I'd hoped we could reduce the duplication between CoreOS and everything else, but unless @siosm weighs in and can help co-ordinate that then just doing it ourselves is okay.
I don't want us making decisions based on invisible internal decision-making from a provider, because that can change at the drop of a hat in a fairly painful way. Our images are supposed to be available in multiple provider locations (Fedora hosted, Quay hosted, Docker Hub hosted), so it matters how we use other people's resources.
We keep lots and lots of "old" (but not EOL) content on AWS and Azure so I'm not sure what the issue is with containers on Quay/Fedora hosted/Docker Hub.
I haven't seen anyone suggesting we keep content forever and there's plenty of middle ground between "today's image" and "every image ever produced". We can define whatever retention policy we think is a good fit for users and hosts and adjust as necessary when it isn't perfect.
That sounds good to me. This is essentially what the Fedora CoreOS pipeline job does if I'm not mistaken.
We don't really keep around old content for any other deliverable, I'm not sure why we should do it for containers, especially when it means we could easily fill up various storage quotas doing it. Atomic Desktop container images are not small, and Fedora isn't paying for space on the various registries. We have no budget for it and no means to support it. Not to mention it creates a support problem as we can get people using old content forever, which we do not support as a project.
See https://pagure.io/cloud-image-uploader/issue/37#comment-944447 where most of this is addressed and where I make an initial garbage policy suggestion. I'm OK with deleting more images but please make a clear policy suggestion.
I like the policy suggested in https://pagure.io/cloud-image-uploader/issue/37#comment-944447 but I'd add:
This would allow you to, say, test something in f37 or whatever, or work around some issue that showed up in the latest update image for that release.
I don't think we need to keep all the EOL releases, as if we really really need something for some reason, we have them in koji still.
Also, I made a https://quay.io/organization/fedora-testing org, which as time permits we should set up to handle testing/candidate/etc things.
According to https://docs.projectquay.io/use_quay.html#setting-tag-expirations-v2-ui (section: Setting tag expirations by using the API), it is possible to set an expiration date via the API for images uploaded to Quay.io.
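For illustration, a request that sets a tag expiration might be built like this. The endpoint path (`PUT /api/v1/repository/<repo>/tag/<tag>`) and the `expiration` field holding a Unix timestamp are my reading of the Quay documentation linked above and should be checked against it before use; the function name and token handling are hypothetical. The sketch only constructs the request and does not send it.

```python
import json
import urllib.request
from datetime import datetime, timezone


def tag_expiration_request(
    repo: str, tag: str, expires: datetime, token: str, host: str = "quay.io"
) -> urllib.request.Request:
    """Build (but do not send) a request that sets a tag's expiration.

    Assumption: Quay accepts {"expiration": <unix timestamp>} on
    PUT /api/v1/repository/<repo>/tag/<tag>; verify against the docs.
    """
    body = json.dumps({"expiration": int(expires.timestamp())}).encode()
    return urllib.request.Request(
        f"https://{host}/api/v1/repository/{repo}/tag/{tag}",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = tag_expiration_request(
    "fedora/fedora-silverblue",
    "41.20241115.0",
    datetime(2025, 2, 15, tzinfo=timezone.utc),
    token="TOKEN",  # placeholder, not a real credential
)
print(req.get_method(), req.full_url)
```

Setting the expiration at push time would let the registry drop old tags on its own, which could complement or replace a separate cleanup pass.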
I'm planning to take a look at implementing this when I return from vacation in mid-April, if someone doesn't beat me to it.
it's in my backlog too, but didn't get to it yet :/