#376 Deprecate tag-repository for Fedora CI / CentOS Stream CI
Opened 2 months ago by mvadkert. Modified 22 days ago

We are deprecating the tag repository for Testing Farm, which can affect Fedora CI.

What is tag repository? See:

https://docs.testing-farm.io/general/0.1/user-guide/test-environment.html#_tag_repository

What is the advised solution:

  1. run dnf update in each Fedora and CentOS Stream test environment before running tests
  2. if the kernel was updated, reboot the machine so that the running kernel matches the latest installed one
  3. work on automating daily CI image updates (might take some time)
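
The first two steps could look roughly like the sketch below. It is only an illustration, not what Testing Farm runs: the helper names are hypothetical, and it assumes the kernel package of interest is `kernel-core` and that the first entry of `rpm -q --last` (the most recently installed build) is the one the machine should be running.

```python
import os
import subprocess


def latest_installed_kernel(rpm_last_output: str, package: str = "kernel-core") -> str:
    """Return version-release.arch of the first entry in `rpm -q --last` output."""
    first_line = rpm_last_output.strip().splitlines()[0]
    nvra = first_line.split()[0]        # e.g. kernel-core-6.5.6-300.fc39.x86_64
    return nvra[len(package) + 1:]      # drop the "kernel-core-" prefix


def needs_reboot(running_release: str, rpm_last_output: str) -> bool:
    """True when the running kernel differs from the most recently installed one."""
    return running_release != latest_installed_kernel(rpm_last_output)


def update_and_check() -> bool:
    """Run `dnf update`, then report whether a reboot is needed (assumed workflow)."""
    subprocess.run(["dnf", "-y", "update"], check=True)
    listing = subprocess.run(
        ["rpm", "-q", "--last", "kernel-core"],
        check=True, capture_output=True, text=True,
    ).stdout
    return needs_reboot(os.uname().release, listing)
```

A test harness would call `update_and_check()` as root and reboot the guest before running tests if it returns `True`.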

Is this the Koji repo for Fedora? We need it because that is where the package is built and what it needs to be tested with.

Why do you deprecate it?

@churchyard hi, yep, it is the Koji repo for Fedora.

Why do you deprecate it?

It is causing unpleasant surprises in general, and it does not seem to be widely known by the general Fedora development audience.

Maybe we can find a solution for you if you really need it.

So, your package is only available there, not pushed to the Fedora repositories?

My package is available everywhere. But the latest version of libraries it was built against might not be available in the composed repository.

Consider this example: My package links against libfoo. The libfoo version in the composed/mirrored repository is libfoo.so.1, but it has been updated to libfoo.so.2 in the Koji repo.

I submit a pull request against my package. It builds in Koji, so it links against libfoo.so.2. Then, the CI tries to install my package, but the dependency on libfoo.so.2 is not available in the composed/mirrored repo at all and the installation fails.
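
To make the failure concrete, here is a toy model of the dependency check. It is not how dnf resolves dependencies; the repository contents and the `can_install` helper are made up to mirror the libfoo example.

```python
def can_install(requires: set[str], repo_provides: set[str]) -> bool:
    """A package is installable only if the repo provides every requirement."""
    return requires <= repo_provides


# Hypothetical data mirroring the libfoo example above.
koji_repo = {"libfoo.so.2"}        # the buildroot already has the rebased libfoo
composed_repo = {"libfoo.so.1"}    # the last compose still ships the old soname

# Built in Koji, so the PR build links against libfoo.so.2.
mypackage_requires = {"libfoo.so.2"}

installable_from_koji = can_install(mypackage_requires, koji_repo)
installable_from_compose = can_install(mypackage_requires, composed_repo)
```

With only the composed repo enabled, `installable_from_compose` is `False`, which is the installation failure described above.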

Or consider this example: My package interacts with a service. This service was updated in the Koji repo and changed behavior slightly. I submit a pull request against my package. The CI tests this with the old version of the service available in the composed/mirrored repo. Everything is good, so I merge the PR and submit the update. When my update goes to the composed/mirrored repo, it goes together with the new version of the service -- and the integration was untested.

@churchyard thanks for providing the use case, we will talk it through.

That's true. It's based on the fact that after tagging a package, it appears in the Koji repository within a few minutes, but it takes 24 hours to appear in a compose repository. And it can take longer if the compose process crashes. Testing a package against a compose repository means testing it against an up-to-24-hour-old state.

Could be weeks for older releases or EPEL. E.g. when an update is pending but has a buildroot override.

Buildroot overrides are superseded by side tags. I would ignore overrides because relengs don't like them and want to disable them in the long term.

Buildroot overrides are superseded by side tags. I would ignore overrides because relengs don't like them and want to disable them in the long term.

Unfortunately, side tag support in PR CI has been stalled for almost 2 years, and buildroot overrides are the only way to use the CI in pull requests for buildtime-interdependent changes.

From the latest discussion about side tags on the devel list that I recall, the conclusion was that packagers should prefer side tags to ship interdependent updates, but I have never heard about a plan to disable buildroot overrides.

Nevertheless, if side tag support is to be added (pretty please), a Koji repo for the particular side tag would need to be used anyway, as side tag repos are not composed at all.

It is causing unpleasant surprises in general,

What kind of surprises does it cause? I've looked over the mentioned https://pagure.io/fedora-ci/general/issue/364 and https://pagure.io/fedora-ci/general/issue/252 and I don't see the connection.

and it does not seem to be widely known by the general Fedora development audience.

What makes you assume this?

This definitely will cause failures you don't currently see as-is, yes. openQA effectively works this way at the moment for Rawhide updates when the maintainer doesn't group them via a side tag, and this definitely does result in failures, e.g. this one from yesterday.

I'm trying to get maintainers to group updates for Rawhide, but obviously folks are used to doing it the old way and it'll take a while. The issue @churchyard linked obviously doesn't help with getting folks to use side tags either.

I asked @fbo. I had no idea this already works in C9s; if so, I do not think it should be super hard to replicate in Fedora.

@churchyard @ppisar @adamwill thanks for sharing, so what I am hearing is, once we land sidetag testing support via PRs, we can revisit this; otherwise it will cause issues.

Broadly, yeah. But we also need to work to get more buy-in to have maintainers do rebases using side tags - not directly in Rawhide, or using buildroot overrides on stable releases. If you merge this change, both things will potentially cause failures in CI, I think. For openQA, the former is a problem but not the latter.

I suppose you should also consider the case of CI tests run on package builds (as opposed to PRs). With this merged, you'll have a similar problem there.

Say a maintainer wants to bump the soname of libfoo and rebuild a dependent package, barusesfoo. Let's assume the dependent package has a test in CI that checks the package works.

If the maintainer does things the old way, they will send the libfoo build directly to Rawhide, wait for it to appear in the buildroot repo - that's the same "tag repo" this PR is about - then send the barusesfoo build. CI will automatically run tests on each build when it completes.

Right now, the test for barusesfoo will pass, because it will pull in the new libfoo with the soname bump from the "tag repo". But if you merge this change, the test for barusesfoo will fail, because it will run against the libfoo from the rawhide repo, which will be the one from the most recent successful compose, with the old soname.
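
The same scenario as a small timeline model. This is a simplification under stated assumptions: the tag repo is treated as regenerating instantly after tagging (in reality there is a short delay), times are arbitrary hours, and `Build` and `visible_soname` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Build:
    name: str
    soname: str
    tagged_at: float  # hours since an arbitrary start


def visible_soname(builds: list[Build], name: str, now: float,
                   last_compose_at: Optional[float] = None) -> str:
    """Soname of the newest build of `name` visible at time `now`.

    last_compose_at=None models the tag repo (everything tagged so far);
    otherwise only builds tagged before the last successful compose count.
    """
    cutoff = now if last_compose_at is None else min(now, last_compose_at)
    candidates = [b for b in builds if b.name == name and b.tagged_at <= cutoff]
    return max(candidates, key=lambda b: b.tagged_at).soname


builds = [
    Build("libfoo", "libfoo.so.1", tagged_at=0),
    Build("libfoo", "libfoo.so.2", tagged_at=5),   # soname bump sent to Rawhide
]

# barusesfoo is tested at hour 6; the last successful compose was at hour 1.
seen_via_tag_repo = visible_soname(builds, "libfoo", now=6)
seen_via_compose = visible_soname(builds, "libfoo", now=6, last_compose_at=1)
```

The barusesfoo test sees `libfoo.so.2` through the tag repo but only `libfoo.so.1` through the compose, so the rebuilt barusesfoo cannot be installed in the second case.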

I don't know what CI currently does for side tag builds. Does it schedule tests for them at all? Does it automatically pull in the other packages from the same side tag? If so, the solution to this problem is the same as for the PR case: try and get maintainers to use side tags.

Hmm, further thought here: I see a distinction between using the buildroot repo on Rawhide and using it on any other release.

Fundamentally, using the buildroot repo gets you two sets of things compared to using just the normal fedora/updates/updates-testing/rawhide repos:

  1. Packages that have been pushed stable since the last time a compose succeeded (for Rawhide that's a Rawhide compose, for Branched it's a Branched compose, for stable releases it's an updates compose)
  2. Packages that have active buildroot overrides
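
Expressed as set arithmetic, with invented package names and a hypothetical `buildroot_extras` helper, the two groups are:

```python
def buildroot_extras(buildroot: set[str], composed: set[str],
                     overrides: set[str]) -> dict[str, set[str]]:
    """Split what the buildroot repo adds on top of the composed repos."""
    extra = buildroot - composed
    return {
        "pushed_stable_since_compose": extra - overrides,
        "buildroot_overrides": extra & overrides,
    }


# Hypothetical repository contents.
composed = {"libfoo-1.0-1", "bar-2.0-1"}
buildroot = composed | {"libfoo-2.0-1", "baz-3.1-1"}
overrides = {"baz-3.1-1"}

extras = buildroot_extras(buildroot, composed, overrides)
```

Only the first group is unambiguously appropriate to test against; the second is where the cross-pollution concern comes from.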

On releases other than Rawhide, I see pulling in buildroot overrides as potentially problematic, because they can cross-pollute in various ways between builds intended as separate updates. There are various scenarios, but basically, you can't be sure (random package X) really "ought" to be tested along with (random potentially unrelated buildroot override Y). This of course isn't just a test problem - it can screw up the actual builds and updates too, if something gets built against something it wasn't expecting to be built against, and this is a major reason some of us want to get rid of buildroot overrides. But for testing purposes, just because X and Y are both in the buildroot, you don't know whether X was meant to be built against Y, whether it was built against Y, and when each will go stable, which obviously causes all sorts of potential problems.

On Rawhide I think this is less of a problem, because there's very little reason to do a buildroot override on Rawhide, since Rawhide updates can be (and usually are automatically) pushed stable immediately. If we turn on openQA gating for Rawhide this might potentially cause people to start submitting buildroot overrides for it too, I guess; I'll have to think about that. But right now I don't think anybody really does it.

There's also somewhat less benefit to using the buildroot repo on stable releases, because updates composes are very unlikely to fail, so we can be fairly sure things 'pushed stable' appear in the 'updates' repo after maximum 24 hours. Full Branched and Rawhide composes are more likely to fail, so it's more likely that you get a situation (like we have with Rawhide right now...) where there's no compose for three days and everything that's been "pushed stable" since is still not in the main repositories.

Still the ideal world is one where buildroot overrides are not allowed and all rebases must use side tags. That would make the buildroot repo much 'safer' as it should only ever contain things that have been "pushed stable" and thus which it's appropriate to test against. Then I suppose the only problem for CI would be the stuff discussed in https://pagure.io/fedora-ci/general/issue/364 , about ensuring things from the buildroot repo are actually installed while not being too inefficient...

thanks for sharing, so what I am hearing is, once we land sidetag testing support via PRs, we can revisit this; otherwise it will cause issues.

To be clear, once we land side tag testing support via PRs, using Koji repos is still my preferred option for PRs that don't use side tags, especially for rawhide.

buildroot overrides are the only way to use the CI in pull requests for buildtime-interdependent changes.

Altering the buildroot in order to test a pull request is an utterly wrong approach. It's a pity Pagure does not support side tags, but I would never dare to abuse the buildroot like this.

I have never heard about a plan to disable buildroot overrides

I remember relengs articulating it on a mailing list when they announced side tags.

If the maintainer does things the old way, they will send the libfoo build directly to Rawhide, wait for it to appear in the buildroot repo

This is where CI will act. It won't allow you to push a rebased libfoo into the Rawhide buildroot, because that would break a dependency of barusesfoo.

A user will be forced (well, recommended, because the user can waive CI results) to use a side tag and group all builds as a single update.

Then the problem boils down to the delay between merging a side tag and creating a new compose. But that delay already exists now, regardless of whether you build a single package or several.

That's why I don't think side tags play a role here. They make the problem of the delay more visible, but they are not its cause.

thanks for sharing, so what I am hearing is, once we land sidetag testing support via PRs, we can revisit this; otherwise it will cause issues.

To be clear, once we land side tag testing support via PRs, using Koji repos is still my preferred option for PRs that don't use side tags, especially for rawhide.

I am confused. I thought side tags would be auto-created if you link PRs together via Zuul's Depends-on, so ideally nothing else should be needed.
Is that not usable for you, or why do you say you would still not use it?

If the maintainer does things the old way, they will send the libfoo build directly to Rawhide, wait for it to appear in the buildroot repo

This is where CI will act. It won't allow you to push a rebased libfoo into the Rawhide buildroot, because that would break a dependency of barusesfoo.

A user will be forced (well, recommended, because the user can waive CI results) to use a side tag and group all builds as a single update.

Then the problem boils down to the delay between merging a side tag and creating a new compose. But that delay already exists now, regardless of whether you build a single package or several.

Why does a compose need to be created?
Won't the packages be available once they are in the repos?

That's why I don't think side tags play a role here. They make the problem of the delay more visible, but they are not its cause.

Well, it definitely seems to address the problem @churchyard has now?
Or does it not?

Altering the buildroot in order to test a pull request is an utterly wrong approach. It's a pity Pagure does not support side tags, but I would never dare to abuse the buildroot like this.

To clarify, I only do this if the buildroot override does not break dependencies for other packages.

Well, it definitely seems to address the problem @churchyard has now?

What particular problem do you mean?

If the maintainer does things the old way, they will send the libfoo build directly to Rawhide, wait for it to appear in the buildroot repo

This is where CI will act. It won't allow you to push a rebased libfoo into the Rawhide buildroot, because that would break a dependency of barusesfoo.

Well, that'd be great, except we don't have that. AFAIK we do not have that test in Fedora CI or openQA (openQA catches some such cases, but not all). We do not gate normal Rawhide builds on any results, from CI or openQA. I am describing the situation as it exists now, not some ideal situation we would like to have. Right now, people still do these kinds of builds the way I described, which means that right now, if you disable the buildroot repo, failures will start happening.

Then the problem boils down to the delay between merging a side tag and creating a new compose. But that delay already exists now, regardless of whether you build a single package or several.

Why does a compose need to be created?
Won't the packages be available once they are in the repos?

Here, creating the compose = being in the repos. (Either in a buildroot or in a daily compose. Both create a YUM repository with the createrepo_c tool, so I sometimes use the terms interchangeably. I'm sorry for the confusion.)

The thing is that when CI approves an update, the packages are tagged almost immediately, but they appear in the repository only after the createrepo_c invocation finishes. There is a delay between tagging and regenerating the buildroot repository. The delay exists even in Rawhide, where it is a few minutes. Any independent package built during the delay is built against the buildroot repository with the old packages. Any independent update tested during the delay is tested against the old repository. Any independent update built during the delay but tested after it is tested against the new repository. Those are race conditions inherent to the build-CI system. (Some of these conditions, especially the last case, are actually expected to fail in CI.)

Then there are race conditions between updates themselves. You can have two updates, each CI-passing against the old repository, but both combined together produce a failure. To catch this race, CI would have to pause creating a new buildroot repository, group all pending updates into one combined CI run and only when passed together unpause creating the buildroot.
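
A toy illustration of that second kind of race, with invented packages: each update is consistent against the old repository on its own, but the two together are not. The `consistent` check and the `works_with_libfoo` field are hypothetical stand-ins for a real CI test.

```python
def consistent(repo: dict) -> bool:
    """Every package must work with the libfoo major version present in the repo."""
    libfoo = repo["libfoo"]["version"]
    return all(
        libfoo in pkg.get("works_with_libfoo", {libfoo})
        for pkg in repo.values()
    )


# Hypothetical repository state and two independent pending updates.
old_repo = {
    "libfoo": {"version": 1},
    "app": {"version": "1.0", "works_with_libfoo": {1, 2}},
}
update_a = {"app": {"version": "1.1", "works_with_libfoo": {1}}}  # relies on the old API
update_b = {"libfoo": {"version": 2}}                             # rebases libfoo

a_alone = consistent({**old_repo, **update_a})
b_alone = consistent({**old_repo, **update_b})
both = consistent({**old_repo, **update_a, **update_b})
```

Here `a_alone` and `b_alone` are both `True` while `both` is `False`, which is why catching this would require testing all pending updates as one group before regenerating the buildroot.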

That's why I don't think side tags play a role here. They make the problem of the delay more visible, but they are not its cause.

Well, it definitely seems to address the problem @churchyard has now?
Or does it not?

It will mostly address it, but not completely, as I elaborated above. The shorter the delay, the less likely a race is to occur.

@churchyard's use case is that the buildroot contains packages which have not yet been composed, either because the buildroot is contaminated with a buildroot override or because the compose process has a 24-hour delay (or a longer one if it crashes).

Using a buildroot repository is the easiest way (for Koji, not for CI) to address @churchyard's use case. However, there can be different approaches. E.g. packages pending a compose won't be tagged into the buildroot until the compose finishes.

It's basically a fight between two paradigms. In the first, the buildroot is an independent, self-controlled entity from which new packages flow down into CI and later into a compose for users, and a broken CI or a broken compose cannot stop the buildroot from processing. In the second, the buildroot depends on and is controlled by results from CI and the compose. The first approach is more agile and less prone to external failures (failed CI, failed compose); the latter provides a more tested, slowly evolving, and more error-prone environment.

I guess Fedora is closer to the first paradigm: packagers want buildroot overrides ( https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/X2RQQQ76NYDF4Y3L3WSSNW2MSIOI6CHW/ ), and @adamwill reminds us in the previous comment that Bodhi does not enforce gating. I understand the packagers, but not enforcing gating is, in my opinion, a huge failure. We are literally feeding a beast (CI machines) but get no benefit from it (the results are ignored).

Well, it definitely seems to address the problem @churchyard has now?

What particular problem do you mean?

@churchyard I meant the problem you are hitting, that you need the Koji repository?
You have updates that depend on each other, and you need a reasonable way to build and test them together (right now you work around that by landing them sequentially and having them available ASAP for the next package in the "chain"). Or am I lost here?

@ppisar sorry did not see your comment when writing the previous comment :)

Well, it definitely seems to address the problem @churchyard has now?

What particular problem do you mean?

@churchyard I meant the problem you are hitting, that you need the Koji repository?
You have updates that depend on each other, and you need a reasonable way to build and test them together (right now you work around that by landing them sequentially and having them available ASAP for the next package in the "chain"). Or am I lost here?

Yes, I have updates that depend on each other, and I need a reasonable way to build and test them together -- I've opened https://pagure.io/fedora-ci/general/issue/240 to get this problem solved.

But I also need to test my updates against the current buildroot because that is the environment my updates are built in. That currently works nicely, because we use the Koji repo. Not using the Koji repo (as proposed here) will create a problem that I currently don't have. I described that in my first example in https://pagure.io/fedora-ci/general/issue/376#comment-830518

@ppisar there is AFAIK no distro-wide gating for any Fedora CI tests at present. Individual packages can opt-in to gating (at update push-to-testing and/or push-to-stable point, on whatever set of tests they like) by placing a file in their dist-git repo, and can configure pull request gating through Pagure I guess.

Updates for stable and Branched releases are gated distro-wide, on push-to-stable, on the openQA tests.

@churchyard any opinion on CentOS Stream "koji repository"?

@ppisar @churchyard @adamwill so trying to summarize:

For Fedora:

  1. for rawhide, keep the tag repository available, as there is no alternative
  2. for non-rawhide, seems dropping the tag repository should be viable, as @adamwill described in this comment
  3. strive for fixing the PR workflow to support side-tags (need to talk to OSCI & Zuul folks about that)

For CentOS Stream:

  • The benefit of the tag repository is unknown. We seem to have PRs with side tags, so dropping it seems to be the way to go.

Does that match your expectations?

For Fedora:

  1. for rawhide, keep the tag repository available, as there is no alternative

Right, one needs some "buffer" repository in between new builds and what gets composed as "the OS" (i.e. the composes that users actually install), as a place to stage and finish transitions. I.e. if you update a libfoo.so.1 to .2, all the rebuilds need to be staged in either a side-tag or that tag repo, until the transition is complete and the whole set of libfoo and its reverse dependencies can migrate to stable in lockstep.

I think the main concern here is that this tag repository is so obscure, undiscoverable, and magic. It'd be better IMHO to make this a "proper" repo like -updates-testing (and be refreshed very often -- e.g. Ubuntu refreshes its "proposed" repo every 30 minutes), and make it very easy for developers and CI system to turn on (like dnf --enablerepo=...).
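
As a sketch of what such an opt-in repo definition might look like, the snippet below renders a `.repo` file that ships disabled. Everything specific is an assumption for illustration: the repo id `fedora-buildroot`, the placeholder base URL, and the choice of `gpgcheck=0` (on the assumption that freshly built packages may not be signed yet).

```python
import configparser
import io


def render_repo_file(repo_id: str, baseurl: str, enabled: bool = False) -> str:
    """Render a dnf .repo definition; disabled by default so it is strictly opt-in."""
    parser = configparser.ConfigParser(interpolation=None)
    parser[repo_id] = {
        "name": f"Koji tag repository ({repo_id})",
        "baseurl": baseurl,
        "enabled": "1" if enabled else "0",
        "gpgcheck": "0",  # assumption: buildroot packages may be unsigned
    }
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# Hypothetical repo id and placeholder URL.
repo_text = render_repo_file(
    "fedora-buildroot",
    "https://koji.example.org/repos/rawhide/latest/$basearch/",
)
```

With that file under `/etc/yum.repos.d/`, a developer or CI job could opt in for one transaction with `dnf --enablerepo=fedora-buildroot update`.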

  1. for non-rawhide, seems dropping the tag repository should be viable, as @adamwill described in this comment

That's assuming library transitions or other package inter-dependent changes don't happen any more in stable. Is that true?

For CentOS Stream:

  • The benefit of the tag repository is unknown. We seem to have PRs with side tags, so dropping it seems to be the way to go.

Library transitions have to be done in CentOS Stream somehow; they can't land piece by piece. If side tags work well there, that'd be ideal, yes.

Thanks!

I think it's a good step in the right direction. Let's do it and observe the results.

Dropping the repo for stable/Branched releases is still going to result in more failures, in all likelihood, because people still do use buildroot overrides to do rebases in stable/Branched right now. It won't be as many as in Rawhide, most likely, but it will happen.

Also I don't think anybody explained yet exactly what CI does with side tags. Does it run the tests automatically on builds in side tags? If so, does it properly pull in other packages from the same side tag when doing so?

I can't think very well about the pull request workflow as I'm not very familiar with that one.

for rawhide, keep the tag repository available, as there is no alternative

+1

for non-rawhide, seems dropping the tag repository should be viable

IMHO This will still cause more problems than it solves, but it won't be as critical as in Rawhide.

However, please don't do this for branched before "updates-testing activation point". Before that point, branched behaves like rawhide.

strive for fixing the PR workflow to support side-tags

Yes, please.

any opinion on CentOS Stream "koji repository"?

No opinion.

Also I don't think anybody explained yet exactly what CI does with side tags. ... I can't think very well about the pull request workflow...

In c9s, you can specify the side tag in the initial PR comment. All scratch builds are performed in that side tag and the testing has the Koji repository of that side tag enabled.

Unfortunately, it is still impossible to use this to cross-test multiple PRs against each other if there are BuildRequires between them.

However, if you land one change to a side tag, you can test PRs in other packages "in" that side tag.

FWIW, prompted by this thread, I've turned on usage of the buildroot repo for openQA Rawhide update tests, so if you turn it off for non-Rawhide tests but leave it on for Rawhide, we'll at least be in sync. :D
