#12214 Investigate and untag packages that failed gating but were merged in via mass rebuild
Opened 6 months ago by adamwill. Modified 6 months ago

  • Describe the issue
    We recently merged in the mass rebuild for Fedora 41. However, in at least one case, the merged build is identical to one that failed required tests and had previously been gated from landing in Rawhide. I see several other cases that are likely the same; just about every update more than ~12 hours old (as I write this ticket) in https://bodhi.fedoraproject.org/updates/?search=&critpath=True&status=testing&releases=F41 is suspect, starting around httpd-2.4.62-1.fc41. All those updates are stuck in updates-testing status for F41, meaning they likely failed gating tests. We should check whether the mass rebuild build of each package is the same as the build that failed tests, and consider untagging those that are.

In future we should probably consider some kind of mechanism to block packages with currently-gated updates from the mass rebuild to avoid this kind of issue happening.

I will go through these myself on Wednesday if nobody else does it before then, but I'm on vacation without my laptop ATM so can't easily do it myself.

  • When do you need this? (YYYY/MM/DD)
    ASAP

  • When is this no longer needed or useful? (YYYY/MM/DD)
    When all affected packages have been fixed some other way

  • If we cannot complete your request, what is the impact?
    Breakage that should have been blocked by the gating mechanism will land in Rawhide proper


List of the NVRs of the builds in those updates:

 curl -X GET "https://bodhi.fedoraproject.org/updates/?search=&critpath=True&status=testing&releases=F41" | jq --raw-output '.updates[].builds[].nvr'

Running it in a for loop with koji list-history:

for build in $( curl -X GET "https://bodhi.fedoraproject.org/updates/?search=&critpath=True&status=testing&releases=F41" | jq --raw-output '.updates[].builds[].nvr'); do koji list-history --tag f41 --build $build; done

So I had the curl wrong in my previous comment. Updated now and I don't see packages from the search query tagged into f41.

Metadata Update from @kevin:
- Issue assigned to humaton
- Issue tagged with: medium-gain, medium-trouble, ops

6 months ago

My bad, I misread what was happening; after the weekly meeting I understand it better. Will produce the list of affected NVRs today.

note, the query I posted isn't comprehensive; we should also check unpushed updates...

So this produces a better list.

for build in $( curl -X GET "https://bodhi.fedoraproject.org/updates/?search=&critpath=True&status=testing&releases=F41&gating=failed&per_page=100" | jq --raw-output '.updates[].builds[].nvr'); do koji list-history --after 22-07-2024 --tag f41 --package ${build%-*-*}; done
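For anyone reading along, the `${build%-*-*}` expansion in that loop is what turns each NVR into a package name for `--package`. A minimal sketch of just that step, with a hypothetical helper name `nvr_to_name` (not part of koji or any releng script):

```shell
# Strip the trailing "-version-release" from an NVR to get the package name.
# "%-*-*" removes the shortest suffix of the form "-<something>-<something>",
# so hyphens inside the package name itself are preserved.
# nvr_to_name is a hypothetical helper used only for illustration.
nvr_to_name() {
    printf '%s\n' "${1%-*-*}"
}

nvr_to_name httpd-2.4.61-3.fc41                          # prints httpd
nvr_to_name device-mapper-persistent-data-1.0.12-3.fc41  # prints device-mapper-persistent-data
```

Because the lookup is by package name rather than by NVR, the history shows every build of that package tagged since the given date, not only the build in the gated update.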

I will include unpushed updates as well.

So will this search yield the expected results? Or is there some API magic I can do to get more data from bodhi?

OK, I see what you mean by not comprehensive; the list got a lot longer now:

Mon Jul 22 19:04:03 2024 iproute-6.8.0-5.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 18:58:34 2024 annobin-12.62-2.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 18:59:14 2024 binutils-2.42.50-20.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:03:55 2024 httpd-2.4.61-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:02:01 2024 go-rpm-macros-3.6.0-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:00:22 2024 filesystem-3.18-23.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:07:47 2024 parted-3.6-6.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:00:22 2024 firewalld-2.2.0-2.fc41 tagged into f41 by kevin
Tue Jul 23 06:06:03 2024 firewalld-2.2.0-2.fc41 untagged from f41 by kevin
Mon Jul 22 21:40:26 2024 toolbox-0.0.99.5-14.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 18:59:14 2024 bind-9.18.26-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 18:59:21 2024 chkconfig-1.28-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 18:59:26 2024 coreutils-9.5-7.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 18:59:29 2024 cyrus-sasl-2.1.28-27.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:00:07 2024 device-mapper-multipath-0.9.9-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:00:07 2024 device-mapper-persistent-data-1.0.12-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:00:08 2024 dmidecode-3.6-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:00:10 2024 dosfstools-4.2-13.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:00:12 2024 e2fsprogs-1.47.1-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:00:13 2024 efibootmgr-18-7.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:00:17 2024 esmtp-1.2-26.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:00:18 2024 exim-4.98-2.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:00:22 2024 filesystem-3.18-23.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:00:25 2024 fping-5.2-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:01:57 2024 glusterfs-11.1-6.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:02:00 2024 gnupg2-2.4.5-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:03:53 2024 hdparm-9.65-6.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:03:54 2024 hfsutils-3.2.6-50.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:03:55 2024 httpd-2.4.61-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:04:02 2024 initscripts-10.25-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:04:03 2024 iproute-6.8.0-5.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:04:03 2024 iptables-1.8.10-15.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:04:11 2024 kexec-tools-2.0.28-14.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:04:53 2024 kmod-31-7.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:05:00 2024 libcap-2.70-4.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:05:48 2024 libreswan-4.15-4.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:05:49 2024 libselinux-3.7-5.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:05:55 2024 lm_sensors-3.6.0-20.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:05:59 2024 lvm2-2.03.25-4.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:06:52 2024 msmtp-1.8.25-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:06:52 2024 msr-tools-1.3-26.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 03:58:26 2024 nbdkit-1.40.0-1.fc41 tagged into f41 by bodhi [still active]
Mon Jul 22 19:06:56 2024 net-tools-2.0-0.71.20160912git.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:06:57 2024 nfs-utils-2.6.4-0.rc6.fc41.2 tagged into f41 by kevin [still active]
Mon Jul 22 19:06:57 2024 nilfs-utils-2.2.11-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:06:59 2024 ntpsec-1.2.3-7.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:07:04 2024 ocfs2-tools-1.8.8-5.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:07:43 2024 opensmtpd-7.5.0p0-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:07:48 2024 pciutils-3.13.0-5.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:11:07 2024 policycoreutils-3.7-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:11:08 2024 postfix-3.9.0-6.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:11:09 2024 ppp-2.5.0-13.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:11:09 2024 procps-ng-4.0.4-4.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:11:10 2024 psmisc-23.7-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:22:09 2024 rpcbind-1.2.6-5.rc3.fc41.1 tagged into f41 by kevin [still active]
Mon Jul 22 19:22:09 2024 rpm-4.19.92-5.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 21:17:35 2024 sanlock-3.9.3-4.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 21:40:09 2024 sendmail-8.18.1-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 21:40:10 2024 shadow-utils-4.15.1-8.fc41 tagged into f41 by kevin
Tue Jul 23 09:15:09 2024 shadow-utils-4.15.1-8.fc41 untagged from f41 by jnsamyak
Wed Jul 24 00:57:43 2024 shadow-utils-4.15.1-9.fc41 tagged into f41 by bodhi [still active]
Mon Jul 22 21:40:13 2024 smartmontools-7.4-6.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 21:40:21 2024 systemd-256.2-16.fc41 tagged into f41 by kevin [still active]
Tue Jul 23 21:06:53 2024 systemd-256.3-2.fc41 tagged into f41 by bodhi [still active]
Mon Jul 22 21:40:22 2024 tcpdump-4.99.4-9.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 21:41:00 2024 util-linux-2.40.2-4.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 17:43:21 2024 v4l-utils-1.28.0-1.fc41 tagged into f41 by bodhi [still active]
Tue Jul 23 17:45:53 2024 v4l-utils-1.26.1-2.fc40 untagged from f41 by oscar
Mon Jul 22 21:41:11 2024 xfsprogs-6.8.0-4.fc41 tagged into f41 by kevin [still active]
Tue Jul 23 11:01:06 2024 xfsprogs-6.9.0-1.fc41 tagged into f41 by bodhi [still active]
Mon Jul 22 19:05:10 2024 libqalculate-5.2.0-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:00:07 2024 dhcpcd-10.0.8-2.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:06:55 2024 ncurses-6.5-2.20240629.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:04:04 2024 iscsi-initiator-utils-6.2.1.10-0.gitd0f04ae.fc41.1 tagged into f41 by kevin [still active]
Mon Jul 22 18:59:19 2024 cava-0.10.2-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:04:02 2024 iniparser-4.2.4-2.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:04:04 2024 isomaster-1.3.17-3.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:06:55 2024 ndctl-79-4.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 21:40:16 2024 spindown-0.4.0-39.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 21:40:29 2024 ubridge-0.9.18-13.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 18:59:16 2024 boost-1.83.0-8.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 18:59:15 2024 blivet-gui-2.5.0-4.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 18:58:17 2024 389-ds-base-3.1.0-11.fc41 tagged into f41 by kevin [still active]
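The list above has repeats (the loop visits some packages more than once) and includes builds that have already been untagged. A rough sketch of how it could be reduced to the builds that are still tagged, assuming the line layout stays exactly as shown here (five date fields, then the NVR); `still_active_nvrs` is a made-up helper name:

```shell
# Read `koji list-history` output on stdin and print, once each, every NVR
# that was tagged and is still active. Assumes the layout seen in this ticket:
#   <dow> <mon> <day> <time> <year> <nvr> tagged into <tag> by <user> [still active]
# still_active_nvrs is a hypothetical helper, not an existing tool.
still_active_nvrs() {
    awk '$7 == "tagged" && $NF == "active]" { print $6 }' | sort -u
}

# Demo on lines copied from the output above; prints iproute-6.8.0-5.fc41
still_active_nvrs <<'EOF'
Mon Jul 22 19:04:03 2024 iproute-6.8.0-5.fc41 tagged into f41 by kevin [still active]
Mon Jul 22 19:00:22 2024 firewalld-2.2.0-2.fc41 tagged into f41 by kevin
Tue Jul 23 06:06:03 2024 firewalld-2.2.0-2.fc41 untagged from f41 by kevin
Mon Jul 22 19:04:03 2024 iproute-6.8.0-5.fc41 tagged into f41 by kevin [still active]
EOF
```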

Let's see...

  • iproute - OK (was in bin/sbin merge update, changes are conditionalized)
  • annobin - questionable (previous update CI failure was a timeout, a new update has been created since mass rebuild and it failed a CI test because make build failed due to guile soname issues, https://osci-jenkins-1.ci.fedoraproject.org/job/scratch-build-test/1476/console / https://kojipkgs.fedoraproject.org//work/tasks/5875/120865875/root.log )
  • binutils - questionable (I think CI failed on the rpm 4.20 build directory layout change, https://github.com/teemtee/tmt/issues/2987 - see https://artifacts.dev.testing-farm.io/f8f0704e-86cb-432b-b1c6-96972fa5cd13/ , things like cp /tmp.5MUmvdKQZq/BUILD/binutils-2.42.90/build-x86_64-redhat-linux/binutils/binutils.log /tmp.5MUmvdKQZq/LOGS/binutils-x86_64.log failed)
  • httpd - not sure, test logs have been garbage collected. the f39 and f40 updates passed the tests, though
  • go-rpm-macros - test was broken (it downloads and rebuilds an F38 .src.rpm and was looking in the non-archived F38 path for it). I've fixed that and sent a new build
  • filesystem - uncertain; the gating failure looks like a test system issue, but the fact it happened means we don't know if the update would otherwise have passed the test
  • parted - not sure. fails on a test which tries to rebuild itself, implicitly to get %check done. not sure how useful of a test this really is. the failure is during the rpmbuild part of the test, not sure if that's somehow caused by the rpm 4.20 thing or something else. @bcl ?
  • firewalld - was bad, we unpushed it, a new version has since been sent out which passed tests

will keep looking tomorrow....

So is the solution to this problem to check in the rebuild script for such updates, extract the packages from them, and skip them during the rebuild?

We can create a new report with "skipped because failed gating or something".

Not sure what's up with parted. When I get time I'll probably change that test to something better, but for now I'm not sure what's going on. It's failing to find the source:

https://artifacts.dev.testing-farm.io/45f100f1-92df-4d25-9e03-ff9baa2e620d/

For now I'd just ignore it.

@bcl there was a change in rpm 4.20 which caused some havoc, but I'm not really sure whether it would be affecting this test. see https://github.com/teemtee/tmt/issues/2987 for some background (it has references to upstream tickets and stuff).

> So is the solution to this problem to check in the rebuild script for such updates, extract the packages from them, and skip them during the rebuild?
>
> We can create a new report with "skipped because failed gating or something".

for the future, yeah, though I think it might need to be a semi-automated thing. we can have a script generate a list of possible problem packages, and have a human review it before the merge happens, as we're doing now retrospectively, and decide which to omit from the merge...
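To make the "script generates a list, human reviews it" idea concrete, here is a minimal sketch of the list-generation half. It assumes two input files that something else would have to produce (one of gating-failed package names, one of mass-rebuild NVRs); the function name and both file formats are hypothetical, not anything that exists in the releng scripts today:

```shell
# Print every mass-rebuild NVR whose package currently has a gating-failed update.
#   $1: file with one gating-failed package name per line (assumed input)
#   $2: file with one mass-rebuild NVR per line (assumed input)
# review_candidates is a hypothetical helper for illustration only.
review_candidates() {
    awk '
        FILENAME == ARGV[1] { failed[$0] = 1; next }   # remember failed package names
        {
            name = $0
            sub(/-[^-]+-[^-]+$/, "", name)             # strip "-version-release"
            if (name in failed) print $0               # candidate for human review
        }
    ' "$1" "$2"
}
```

The output is only a candidate list; as the walk-through in this ticket shows, many gating failures turn out to be test-system problems, so the decision to omit a build would still be a human one.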

yeah, the mass tag script would need to grow some kind of '--exclude' or something so we could exclude things.

I think right now, we should probably just leave things tagged unless there's some pretty serious breakage?

I didn't get back to look through the rest of the list today, will try and do it tomorrow and flag up any that look really bad.

...okay, that's enough for today, back to it later.

  • filesystem - unsure. it failed a couple of its package-specific tests, I am not sure how serious those failures were. the build that got tagged had https://src.fedoraproject.org/rpms/filesystem/c/c874ad66ba6bdd27ee6ec044b5b279ab1a8da46c?branch=rawhide in it so it's claimed to be functionally identical to the pre-sbin-merge build, but I don't know if that commit was perfect. we'll just have to hope this is ok, I guess
  • fping - was in the sbin merge update without changes, fine
  • glusterfs - was in the sbin merge, had some changes made but they look like they should be no-op without the sbin merge
  • gnupg2 - was in the sbin merge, changes look safe
  • hdparm - ditto
  • hfsutils - was in sbin merge, had some significant changes, but they look like they should be ok
  • httpd - ditto
  • initscripts - was in sbin merge, no significant changes
  • iproute - was in sbin merge, a later update has since passed its own tests so looks fine
  • iptables - was in sbin merge, has some significant changes but I think they should be OK
  • kexec-tools - was in sbin merge, no significant changes
  • kmod - was in sbin merge, has some significant changes but I think they should be OK
  • libcap - was in sbin merge, no significant changes
  • libreswan - was in sbin merge, has some significant changes but I think they should be OK
  • libselinux - was in sbin merge, no significant changes, passed its own test suite since
  • lm_sensors - was in sbin merge, no significant changes
  • lvm2 - was in sbin merge, no significant changes
  • msmtp - was in sbin merge, no significant changes

whew, getting there...

> So is the solution to this problem to check in the rebuild script for such updates, extract the packages from them, and skip them during the rebuild?
>
> We can create a new report with "skipped because failed gating or something".
>
> for the future, yeah, though I think it might need to be a semi-automated thing. we can have a script generate a list of possible problem packages and have a human review it before the merge happens, as we're doing now retrospectively, and decide which to omit from the merge...

My idea is to create a new report during the rebuild with the list of potentially problematic packages. Could the condition be "gating failed and releng rebuilt the package"? The report would be generated by a script, the same way we do the other reports for the mass rebuild.

Extend mass-tag.py with --exclude [nvr,nvr,nvr] so the person doing the tagging can decide which builds to omit.
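The filtering that such an option implies is small. A sketch of the logic only, in shell for consistency with the one-liners above; mass-tag.py has no --exclude option as of this ticket, and `exclude_nvrs` is a hypothetical name:

```shell
# Drop every NVR named in a comma-separated exclude list.
#   $1:    exclude list, e.g. "nvr1,nvr2" (the proposed --exclude value)
#   stdin: candidate NVRs, one per line
# exclude_nvrs is a hypothetical helper; it only illustrates the proposed
# behaviour and is not how mass-tag.py works today.
exclude_nvrs() {
    awk -v list="$1" '
        BEGIN { n = split(list, item, ","); for (i = 1; i <= n; i++) skip[item[i]] = 1 }
        !($0 in skip)                                  # print only NVRs not excluded
    '
}
```

Matching on the full NVR rather than the package name means a later, fixed build of the same package would still be tagged.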

Yes. (rpmdiff between device-mapper-persistent-data-1.0.12-1.fc41.x86_64.rpm and device-mapper-persistent-data-1.0.12-3.fc41.x86_64.rpm doesn't show any path changes.)

  • msr-tools - was in sbin merge, no significant changes
  • nbdkit - was in sbin merge, no significant changes, passed its own test suite since
  • net-tools - was in sbin merge, no significant changes
  • nfs-utils - was in sbin merge, had some significant changes but they look to still be appropriate, probably OK
  • nilfs-utils - ditto
  • ntpsec - was in sbin merge, no significant changes
  • ocfs2-tools - was in sbin merge, had some significant changes but they look to still be appropriate, probably OK
  • opensmtpd - was in sbin merge, no significant changes
  • pciutils - was in sbin merge, no significant changes, passed its own test suite since
  • policycoreutils - was in sbin merge, no significant changes
  • postfix - has been failing its own tests since 3.9.0, all previous 3.9.0 updates were gated by this, but now 3.9.0 is in because of the mass rebuild. we need @jskarvad to investigate the failures and decide what to do about them. see most recent run on the 3.9.0-8 update, which is gated
  • ppp - was in sbin merge, no significant changes
  • procps-ng - was in sbin merge, no significant changes
  • psmisc - was in sbin merge, no significant changes
  • rpcbind - was in sbin merge, no significant changes, passed its own test suite since
  • rpm - was in sbin merge, had a patch but it was reverted before mass rebuild
  • sanlock - was in sbin merge, no significant changes
  • sendmail - was in sbin merge, looks like it actually introduced a bug but it has been fixed by jskarvad, and since passed its own test suite

...whew, that's the whole list.

I've changed the parted test to use tmt style testing, and to be an actual test of the installed package instead of rerunning the build (the point was to run the parted unit tests which aren't easy to run outside the source tree). it now uses the installed parted to partition a disk image and examines the results. It should now be less sensitive to whatever oddness is going on in the background.

that sounds great! thanks. the unit tests are run by the package build already, so yes, having the CI test do that was kinda redundant; if anything else wants to check it doesn't break parted's test suite, it can always just trigger a scratch build of parted as a reverse dep check I guess.

Metadata
Boards 1
Ops Status: Backlog