The duplication is now causing unpleasant load on Koji, and @kevin and @adamwill proposed that we finally get this resolved.
The plan is:
- [ ] move installability to Fedora Zuul
- [ ] deprecate rpm-install test from Zuul
- [ ] disable Fedora CI for pull request testing
infra ticket: https://pagure.io/fedora-infrastructure/issue/11920
for the record, we didn't cover in the initial discussion of this that it's also possible for per-package tests to run on PRs, e.g. https://src.fedoraproject.org/rpms/selinux-policy/pull-request/438 - note the "Fedora CI - dist-git test" result. But at a quick look it seems like zuul does this too - note the "check-for-sti-tests", "check-for-fmf-tests", and "rpm-tmt-test" items in the zuul comment.
Zuul indeed does this.
Firstly, Fedora CI runs for all package PRs, while Zuul CI runs only for selected ones. Are there plans to enable Zuul CI for all package PRs as part of this effort?
Secondly, Fedora CI can be configured to report each test plan result as a separate flag in the Pagure PR; see https://src.fedoraproject.org/rpms/maven/pull-request/54 for an example. Can this way of reporting results be enabled in Zuul CI? It is very useful that results of quick smoke tests are reported quickly, while slow tests are reported eventually as they complete.
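For illustration, the reporting pattern asked for here (each plan reported as soon as it finishes, rather than one combined result at the end) can be sketched as follows. This is only a minimal sketch, not how Fedora CI or Zuul is implemented: `run_plan`, the plan names and the durations are hypothetical stand-ins, and the `print` marks the place where a real service would set a per-plan flag on the PR.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_plan(name: str, duration: float) -> tuple[str, str]:
    """Hypothetical stand-in for running one test plan; it only sleeps."""
    time.sleep(duration)
    return name, "success"


def report_as_completed(plans: dict[str, float]) -> list[str]:
    """Run all plans in parallel and report each one as soon as it finishes.

    Returns the plan names in the order their results were reported.
    """
    reported = []
    with ThreadPoolExecutor(max_workers=len(plans)) as pool:
        futures = [pool.submit(run_plan, name, d) for name, d in plans.items()]
        for future in as_completed(futures):
            name, status = future.result()
            # Assumption: a real service would set one PR flag per plan here.
            print(f"flag {name}: {status}")
            reported.append(name)
    return reported


if __name__ == "__main__":
    # The slow plan is submitted first, but the quick one is reported first.
    report_as_completed({"/plans/full": 0.3, "/plans/smoke": 0.01})
```

The point of the design is that the quick smoke plan does not wait for the slow plan before its result becomes visible.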
> It is very useful that results of quick smoke tests are reported quickly, while slow tests are reported eventually as they complete.
So, https://pagure.io/fedora-ci/general/issue/15 is actually possible? Nice!
@churchyard yeah, completely forgot, yes, with tmt it is possible:
https://docs.fedoraproject.org/en-US/ci/tmt/#_multiple_plans
@mizdebsk thanks for pointing this out, this is something we have to fix. I do believe it is possible, need to talk to @fbo about it.
Following up on Fedora CI chat:
https://matrix.to/#/!cfWVeczGVJbiKSlrwi:fedoraproject.org/$xDbwnuFtvLYkAF87h4HgUNCM-mKfjttJi-H3rCv4DWw?via=fedoraproject.org&via=fedora.im&via=matrix.org
Anyway. I worry that if we deprecate everything and move to Zuul only, we will end up in a situation where we rely on Zuul while CentOS is moving to a different solution. We will have a Fedora-only solution.
Should we perhaps use the same system CentOS is going to use, sans the integration with GitLab?
CentOS isn't going to use Fedora CI either. So continuing to use Fedora CI does nothing to improve that situation.
@churchyard Zuul will be run by Fedora CI. If GitLab materializes for Fedora, we can move to similar GitLab-based pipelines.
Also there is discussion to have Packit involved, which would bring us the experience we have on GitHub, which would be great.
If you know where those discussions are happening, can you please emphasize that any plan to involve packit needs to include fixing https://github.com/packit/packit/issues/1870 ? packit is not viable for Fedora without that and I'm sick of cleaning up bad updates caused by using packit on interdependent packages.
@adamwill yeah, mentioned it there, thanks for pointing out :) definitely they should look at this before those plans materialize :)
[citest]
@vondruch thanks for adding your concerns.
> Fedora CI can be restarted by [citest] comment. I don't know how to restart Zuul.

It is "recheck".
> Zuul runs only x86_64 build, while Fedora CI runs regular scratch build on all platforms.

Ack, adding to requirements.
> Fedora CI results are easy to understand, while Zuul is one big mess.

Can you be more specific, so we understand what exactly you find confusing? In the end you should get to the Testing Farm results page in both cases. I assume you mean some other checks that are not run via tmt / Testing Farm.
> @vondruch thanks for adding your concerns
>
> > Fedora CI can be restarted by [citest] comment. I don't know how to restart Zuul.
>
> it is recheck
If it is documented somewhere, then it is hard to find.
> > Fedora CI results are easy to understand, while Zuul is one big mess.
>
> Can you be more specific so we understand what you exactly find confusing, basically at the end you should get to Testing Farm results page in both cases. I assume you mean some other checks not run tmt / Testing Farm
Take this PR as an example:
https://src.fedoraproject.org/rpms/ruby/pull-request/177
Clicking "Fedora CI - scratch build" brings me to Koji, easy. Clicking on "Fedora CI - installability" brings me to "artifacts". I can unfold "/installability/installability" and the output makes sense (it could be easier, but ...).
But clicking on "Zuul" brings me to "Zuul". OK, there seem to be two issues. Clicking on "eln-rpm-scratch-build" brings me to some details, where I can see the Koji link, fine, but what is the rest? "container 26 OK 15 Changed 1 Failure". How can I see some details? Am I supposed to see details? Why can't I click on that?
What does "Task failed running on host container" even mean?
What are the other tabs?
What are the other results, e.g. "check-for-arches"?
And the order does not look trustworthy. Why would "rpm-install-test" be in front of "rpm-scratch-build"? Or are they independent?
Every time I see these results, I feel intimidated, as if I had not finished elementary school. This is not designed for users.
Zuul runs an x86_64 scratch build plus individual scratch builds for the other architectures. That slightly improves the feedback loop, because a failure to build on s390x does not block the CI tests from reporting their results.
> > Zuul runs only x86_64 build, while Fedora CI runs regular scratch build on all platforms.
>
> Zuul runs an x86_64 scratch build plus individual scratch builds for the other architectures. That slightly improves the feedback loop, because a failure to build on s390x does not block the CI tests from reporting their results.
Where are those builds in my example PR?
Nowhere. I suggest you report that as a bug. Zuul only runs those builds on packages that are not noarch, and I think the noarch rubygems subpackage makes the detection (check-for-arches) fail.
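To illustrate the suspected failure mode: a check that looks for "noarch" anywhere in the spec file will misclassify an arch-specific package that merely has a noarch subpackage. The sketch below is hypothetical and is not the actual check-for-arches job; both function names are made up, and the parsing is deliberately simplified (it ignores macros and conditionals). It only shows the distinction between the main package preamble and subpackages.

```python
import re

# Section markers that end the main package preamble in a spec file.
SECTION = re.compile(
    r"^%(package|description|prep|build|install|check|files|changelog)\b",
    re.MULTILINE,
)
NOARCH = re.compile(
    r"^BuildArch(itectures)?:\s*noarch\s*$", re.MULTILINE | re.IGNORECASE
)


def naive_is_noarch(spec_text: str) -> bool:
    """Hypothetical naive check: any noarch BuildArch line counts."""
    return NOARCH.search(spec_text) is not None


def main_package_is_noarch(spec_text: str) -> bool:
    """Only consider BuildArch lines in the main package preamble."""
    match = SECTION.search(spec_text)
    preamble = spec_text[: match.start()] if match else spec_text
    return NOARCH.search(preamble) is not None


if __name__ == "__main__":
    # Simplified spec: arch-specific main package with a noarch subpackage.
    spec = (
        "Name: ruby\n"
        "%description\nRuby.\n"
        "%package -n rubygems\n"
        "BuildArch: noarch\n"
        "%description -n rubygems\nGems.\n"
    )
    print(naive_is_noarch(spec), main_package_is_noarch(spec))
```

With the preamble-only check, a package like the one in the example PR would still get the per-architecture scratch builds.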
Eh, I failed to understand you on the first read, because that seems unbelievable to me. Now I probably understand, but I still can't believe it. That just supports my first statement about Zuul.
Also, one more thing about Zuul. It builds SRPMs on old Fedoras/RHELs instead of building from SCM: https://pagure.io/fedora-ci/general/issue/461
As a result, if the specfile uses new RPM macros, the CI fails, which manifests e.g. as https://pagure.io/fedora-ci/general/issue/381
As an example, this PR failed on Zuul but the other scratchbuild succeeded: https://src.fedoraproject.org/rpms/python-dask/pull-request/12
I'm starting to feel like this ticket should be called "Deprecate Zuul for Fedora CI for dist-git pull request testing" at this point...:D
@adamwill, thanks for bringing this up. (And sorry for that in the first place!) The good news is that we have been working on this for the last month or two and should have it quite soon.
And regarding the Packit plans:
There are multiple reasons why we are thinking about this:
There are still a lot of things to decide -- opinions/feedback/suggestions are welcome! I don't want to steal this discussion - if there is anyone interested, either let me know directly or here is an issue I've just created to have a place to collect feedback.
František (from the Packit team)
PS: Sorry for being late to the party, I was mostly away from the computer for the last two weeks.
So I'm gonna suggest we broaden this out as a ticket for "rationalize PR testing on Fedora", so it can accommodate the possibilities of "drop the Zuul scratch build pipeline instead", or "replace both Zuul and Fedora CI for this purpose with Packit" (and, I guess, "replace everything with Konflux").
Does that sound sensible? Then we could perhaps close https://github.com/packit/packit-service/issues/2453 to avoid duplication.
It might be a good idea to start by saying "what would happen in an ideal world?" for a few cases (package onboarded to packit, package not onboarded to packit, package with its own tests, package without its own tests, package that is a dep of some other package with important tests...) and then we could look at how big the work would be to get there from where we are, with each of the various possibilities?
@lachmanfrantisek from the TF perspective this would be the best solution, and I am glad your plans are going forward. It also aligns with our internal goals. I agree we should start a proposal in a reasonable place; hopefully I will have some news on that new space next week.
Good point, @adamwill, I've tried to clean up the Packit ticket so that it only tracks Packit's side, links the related tasks and forwards discussion here.
Just to provide a bit of an update from Packit:
Speaking of UX and the "ideal world" (independent of the service of choice) -- these are the questions/subtopics I can see here:
Just to add another choice :D , there's some discussion on the forge switch topic about using the new forge's CI abilities to trigger tests on PRs (either Gitlab CI or Forgejo Actions). tagging @siosm
It would indeed be great if folks working on Fedora CI / Zuul / Testing Farm or any other CI infra in the Fedora / CentOS space could take a look at the Git forge options (from https://discussion.fedoraproject.org/t/inviting-testers-for-git-forge-usecases/129016) and comment on the feasibility of integrating the existing setup with GitLab CI or Forgejo Actions.
We can definitely take a look at these, but so far, we've tried to avoid these custom CI systems in Packit, since they are not compatible across git forges and it is usually hard to work with workflows triggered outside of the git forge. (Both points might be relevant to Fedora as well.)
But I haven't played with Forgejo Actions yet, so I need to check before making any judgements..;)
@siosm Zuul CI supports GitLab https://zuul-ci.org/docs/zuul/latest/drivers/gitlab.html but it does not support Forgejo.