#11737 Use DNF 5 as package manager for Mock
Closed: Fixed a month ago by kevin. Opened 4 months ago by egoode.

  • Describe the issue

We want to change the Mock configuration in Mock (mock-core-configs), Koji, and Copr to use DNF 5 as Mock's package manager instead of DNF 4.

See https://fedoraproject.org/w/index.php?title=Changes/BuildWithDNF5

  • When do you need this? (YYYY/MM/DD)

If/when the change proposal is approved, or at the latest, the F40 Beta Freeze on 2024/02/20

  • When is this no longer needed or useful? (YYYY/MM/DD)

There is no definite end date. If a new package manager replaces DNF 5, or if the switch to DNF 5 is canceled, then this change will no longer be needed.

  • If we cannot complete your request, what is the impact?

DNF 5 would not be as thoroughly tested when the switch to DNF 5 occurs in (likely) Fedora 41.


Metadata Update from @phsmoura:
- Issue tagged with: medium-gain, medium-trouble, ops

4 months ago

Assuming DNF5 5.1.10 gets to Rawhide mirrors today or tomorrow, we plan to "build F40 with DNF5" in Fedora Copr ASAP to test.

There are no known blockers right now. The only suggestion is to update Koji to use Mock 5.3 for better DNF5 log output: with 5.3, the dnf5 process has non-terminal stdin/stdout/stderr, so the logs contain no terminal sequences and are much more readable.
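
The effect of a non-terminal stdout can be illustrated with a minimal Python sketch. It does not use Mock or DNF5 at all; it only shows that a child process whose streams are pipes sees `isatty()` return false, which is the usual condition under which tools stop emitting terminal control sequences. The helper name `child_sees_tty` is made up for this illustration.

```python
import subprocess
import sys

# Child program: report whether its stdout is attached to a terminal.
CHILD = "import sys; print(sys.stdout.isatty())"


def child_sees_tty() -> str:
    """Run the child with non-terminal stdin/stdout/stderr and return its output."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.DEVNULL,   # no terminal on stdin
        stdout=subprocess.PIPE,     # stdout is a pipe, not a TTY
        stderr=subprocess.PIPE,     # stderr is a pipe, not a TTY
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(child_sees_tty())
```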

Can we have a plan for the Koji change before the mass rebuild on Wed 2024-01-17?

This has been deployed to Fedora Copr now.

Seems like there's a problem with the cachedir= option. This breaks the libguestfs build (and perhaps a few other packages that rely on dnf's caching side-effects). This problem appears fixable to me, but I'll let @egoode comment on this. No other problems reported so far.

Considering we'd like to apply this change before the f40 mass rebuild starts, can anyone guess how difficult it is to push this into Koji and how much time we actually have for any related actions? What needs to be done from our side (change owners)?

So, we need to revert / remove 0f4026b2a16 in ansible I think...

dnf5 isn't going to be default in rawhide itself right? or is it? If not, we need to change that config to point to dnf5?

Strange, 0f4026b2a16 seems like a no-op to me with today's Mock defaults. Did we need to deploy this change faster?

I expect that we have to configure the F40+ target with config_opts['package_manager'] = 'dnf5'.

dnf5 isn't going to be default in rawhide itself right? or is it?

Correct, the default package manager for users at runtime is dnf, which points to dnf-3. But the default package manager for building should be switched to dnf5.

If not, we need to change that config to point to dnf5?

Right, via config_opts['package_manager'] we need to tell Mock to start using dnf5_command over dnf_command (and other dnf5_*).
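
A minimal sketch of what that selection amounts to, written in Python because Mock configuration files are Python. The `config_opts` dict below is a stand-in for the one Mock provides to its config files, the command paths are assumed values, and `command_for` is a hypothetical helper for illustration, not Mock's actual lookup code.

```python
# Stand-in for the dict that Mock hands to its configuration files.
config_opts = {}

# The switch discussed in this ticket.
config_opts['package_manager'] = 'dnf5'

# Per-package-manager command options (paths here are assumptions).
config_opts['dnf_command'] = '/usr/bin/dnf-3'
config_opts['dnf5_command'] = '/usr/bin/dnf5'


def command_for(opts):
    """Hypothetical helper: pick the '<package_manager>_command' entry."""
    key = f"{opts['package_manager']}_command"
    return opts[key]


if __name__ == "__main__":
    print(command_for(config_opts))
```

With `package_manager` set to `'dnf'`, the same helper would return the `dnf_command` value instead.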

Seems like there's a problem with the cachedir= option. This breaks the libguestfs build (and perhaps a few other packages that rely on dnf's caching side-effects). This problem appears fixable to me, but I'll let @egoode comment on this.

WRT this issue, we'll probably need to modify DNF's [main] section to have
system_cachedir=/var/cache/dnf (and wait for the next DNF5 release, which
can be done promptly, hopefully).
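
As a rough sketch of that edit, the following Python adds `system_cachedir` to the `[main]` section of a DNF-style configuration text. The starting configuration and the helper name `with_system_cachedir` are assumptions for illustration; only the option name and the `/var/cache/dnf` value come from this ticket.

```python
import configparser
import io

# Assumed, simplified starting point for the DNF configuration.
DNF_CONF = """\
[main]
keepcache=1
gpgcheck=0
"""


def with_system_cachedir(conf_text, path="/var/cache/dnf"):
    """Return conf_text with system_cachedir set in the [main] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    parser["main"]["system_cachedir"] = path
    out = io.StringIO()
    # Write key=value without spaces around '=' to keep the original style.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    print(with_system_cachedir(DNF_CONF))
```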

Strange, 0f4026b2a16 seems like a no-op to me with today's Mock defaults. Did we need to deploy this change faster?

Yes, this predates that default. We had to do it because dnf5 by default just landed in rawhide and broke all builds. ;)

I expect that we have to configure the F40+ target with config_opts['package_manager'] = 'dnf5'.

I am not sure how we can do that.

We can set defaults in site-defaults.cfg for everything.
We can tell koji per tag to use 'dnf' or 'yum'

Right, via config_opts['package_manager'] we need to tell Mock to start using dnf5_command over dnf_command (and other dnf5_*).

And I can do that on a site/global basis, but not per tag, unless I am missing something.

We can set defaults in site-defaults.cfg for everything.
We can tell koji per tag to use 'dnf' or 'yum'

Can't we use mock.package_manager=dnf5 per tag? Or is that a bool config (dnf vs yum)?

https://pagure.io/fedora-infra/ansible/blob/2de76376c5ec7a8005913f74b857ef9d76f102e3/f/playbooks/manual/releng/koji-release-tags.yml#_43

Don't we need a new Koji per-tag option for this? CC @tkopecek

$ koji taginfo f40
Tag: f40 [71268]
Arches: None
Groups: appliance-build, build, livecd-build, livemedia-build, srpm-build
Required permission: 'autosign'
Tag options:
  mock.new_chroot : 1
  mock.package_manager : 'dnf'
Inheritance:

Seems like a string value; but I'm not sure what happens when we simply do s/dnf/dnf5 here. Perhaps we could test on stage?
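
To make the idea concrete, here is a small Python sketch of how `mock.*` tag options could be turned into Mock settings. This is not Koji's code; `mock_options_from_tag_extra` is a hypothetical helper, and the dict simply mirrors the tag options shown in the `koji taginfo f40` output above.

```python
def mock_options_from_tag_extra(extra):
    """Collect 'mock.*' tag options into a dict keyed without the prefix."""
    prefix = "mock."
    return {
        key[len(prefix):]: value
        for key, value in extra.items()
        if key.startswith(prefix)
    }


# Tag options as listed by `koji taginfo f40`.
f40_extra = {"mock.new_chroot": 1, "mock.package_manager": "dnf"}

# The proposed change: s/dnf/dnf5 on the string value.
f40_extra["mock.package_manager"] = "dnf5"

if __name__ == "__main__":
    print(mock_options_from_tag_extra(f40_extra))
```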

Seems like a string value; but I'm not sure what happens when we simply do s/dnf/dnf5 here. Perhaps we could test on stage?

It should work this way.

Ah, I missed where this was passed (and also stupidly did a git grep dnf5 instead of looking for package_manager). Sorry about that.

So, I can edit this whenever then. We should notify maintainers that we are going to change it. When would you like me to change it? And would you like to announce it, or want me to?

So, I can edit this whenever then. We should notify maintainers that we are going to change it.

Glad to hear that!

We still have a day or two until we have a fixed DNF5 (cachedir) in Rawhide; would you mind testing this in the staging environment right now? Just to prove that we don't need to fix something really Koji-specific and obvious.

When would you like me to change it?

In production - preferably as soon as we have the new DNF5 release? I'll update this ticket when we are ready.

And would you like to announce it, or want me to?

I dumped a note into the change discussion that this is going to happen very soon; probably not enough, I bet. So if you could announce it wherever you usually announce such changes, that would be nice. Thank you @kevin.

Note that the remaining /var/cache/dnf "issue" affects only a few packages, so this should be perfectly testable for the majority of packages in Koji Staging.

But DNF5 5.1.11 is getting finalized. As soon as it is available in Rawhide's buildroot, we should be OK to push this change even into Koji production.

ok. Just let me know and I will switch rawhide and announce that it's happened, we can then see what if anything breaks.

We are running pretty low on time before the mass rebuild tho, so I would prefer to do this as soon as we can so we can fix any problems before mass rebuild.

ok. Just let me know and I will switch rawhide and announce that it's happened, we can then see what if anything breaks.

We are running pretty low on time before the mass rebuild tho, so I would prefer to do this as soon as we can so we can fix any problems before mass rebuild.

Finally did some testing in staging today and we hit a snag... Koji calls 'package_manager groupinstall ...' but dnf5 uses 'group install'.
Possibly we can alias this in a Mock update; we are going to try that soon.
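
A minimal sketch of such an alias, assuming it is done by rewriting the argument list before dnf5 is invoked. The function `translate_for_dnf5` is hypothetical and is not taken from Mock or Koji; the `build` group name comes from the f40 tag's group list.

```python
def translate_for_dnf5(args):
    """Rewrite a leading legacy 'groupinstall' into dnf5's 'group install'."""
    legacy = {"groupinstall": ["group", "install"]}
    if args and args[0] in legacy:
        return legacy[args[0]] + list(args[1:])
    # Anything else is passed through unchanged.
    return list(args)


if __name__ == "__main__":
    print(translate_for_dnf5(["groupinstall", "build"]))
```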

I'm checking once more the testing build we did yesterday, and I noticed that the Koji (staging) builders have dnf5 installed on the host. The DNF5 on the host is not under Mock's control, so its version is not predictable. Still, Mock prefers dnf5 for installing the bootstrap if it is installed and available on the host.

It would be nice to ensure that a reasonably new DNF5 is installed on builders, ideally the latest one from updates-testing. Or remove the dnf5 package (which would make Mock use DNF4 for the dnf5-capable bootstrap installation).

Anyway, the bootstrap installation is a much easier task than the buildroot preparation. So we might be good even with an older dnf5? (I am not 100% sure.) I'm mostly writing this comment to avoid surprises.
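
The host-side preference described above can be sketched as follows. This is an illustration only, not Mock's implementation: `bootstrap_installer` is a hypothetical helper, and the fallback name `dnf-3` is an assumption based on the DNF4 command mentioned earlier in this ticket.

```python
import shutil


def bootstrap_installer(which=shutil.which):
    """Hypothetical: prefer dnf5 on the host when present, else fall back to dnf-3.

    `which` is injectable so the choice can be checked without touching the host.
    """
    for candidate in ("dnf5", "dnf-3"):
        if which(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Reports what this particular host would pick.
    print(bootstrap_installer())
```

Removing the dnf5 package from a builder corresponds to the first lookup failing, which makes the sketch fall through to the DNF4 command.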

ok, this should be live now.

I guess we can close this?

I noticed many (but apparently not all) CI scratchbuilds fail with

DEBUG util.py:461:  Unknown argument "groupinstall" for command "dnf5". Add "--help" for more information about the arguments.
DEBUG util.py:461:  It could be a command provided by a plugin, try: dnf install dnf5-command(groupinstall)
DEBUG util.py:610:  Child return code was: 2

E.g. https://koji.fedoraproject.org/koji/taskinfo?taskID=111920315 or https://koji.fedoraproject.org/koji/taskinfo?taskID=111919727

This has impacted loads of my builds for the past few hours:
https://koji.fedoraproject.org/koji/builds?userID=jwakely&order=-build_id&state=3

I'm still seeing the groupinstall one now, long after the hotfix was applied, e.g.
https://koji.fedoraproject.org/koji/taskinfo?taskID=111919581

And loads and loads of the --allowerasing one.

One of the builders, buildvm-x86-11.iad2.fedoraproject.org, was running an old version; it's updated now.

https://koji.fedoraproject.org/koji/taskinfo?taskID=111923623 has

    DEBUG util.py:461:  Unknown argument "--allowerasing" for command "install". Add "--help" for more information about the arguments.
    DEBUG util.py:610:  Child return code was: 2

On buildvm-x86-28.iad2.fedoraproject.org
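
When triaging many failed tasks, it can help to pull these messages out of the logs mechanically. The sketch below assumes the log lines have exactly the shape quoted in this ticket; the function name `unknown_arguments` is made up for illustration.

```python
import re

# Matches the dnf5 error shape quoted above.
PATTERN = re.compile(
    r'Unknown argument "(?P<arg>[^"]+)" for command "(?P<cmd>[^"]+)"'
)


def unknown_arguments(log_text):
    """Return (command, argument) pairs for every 'Unknown argument' message."""
    return [(m["cmd"], m["arg"]) for m in PATTERN.finditer(log_text)]


# Log lines copied from the failures reported in this ticket.
SAMPLE = '''DEBUG util.py:461:  Unknown argument "groupinstall" for command "dnf5". Add "--help" for more information about the arguments.
DEBUG util.py:610:  Child return code was: 2
DEBUG util.py:461:  Unknown argument "--allowerasing" for command "install". Add "--help" for more information about the arguments.
DEBUG util.py:610:  Child return code was: 2
'''

if __name__ == "__main__":
    for cmd, arg in unknown_arguments(SAMPLE):
        print(f"{cmd}: {arg}")
```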

I can backport support for dnf5 group install --allowerasing to the Fedora dnf5 builds, but I first need a build target without the --allowerasing option to be able to build dnf5 packages.

I'm not sure this is the right time to bring --allowerasing to rawhide, with the mass rebuild about to start. We have an updated koji.rpm package for this, but not all builders are updated yet.

No problem for me. If you manage to fix it on Koji side, I will refrain from updating dnf5.

No problem for me. If you manage to fix it on Koji side, I will refrain from updating dnf5.

I am not a releng or infra member... I am commenting here because I see dnf issues on Koji.

All the builders should be fine and ready now:
- the kojid processes were restarted by @humaton
- I double checked koji and koji-builder versions, and these are OK
- the site-defaults.cfg was outdated on one builder (buildvm-x86-28.iad2.fedoraproject.org)

I think we are done here, can anyone close this?

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

a month ago

