#2 Enable gating in Fedora
Closed: Fixed 6 years ago Opened 6 years ago by pingou.

In order for the CI pipeline to have an actual impact we want its results to be used to gate updates in bodhi.

This requires gating to be enabled in bodhi.


The bodhi side of things should be ready, and just needs a configuration change.
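For context, once gating is enabled Bodhi asks Greenwave for a gating decision on each update. Below is a minimal sketch of that kind of query; the Greenwave URL, `decision_context`, and subject values are illustrative assumptions, not the exact production configuration.

```python
# Minimal sketch of the kind of gating query Bodhi makes to Greenwave once
# gating is enabled. The instance URL, decision_context, product_version and
# subject below are illustrative assumptions, not the exact production values.
import requests

GREENWAVE_URL = "https://greenwave-web-greenwave.app.os.fedoraproject.org"  # assumed instance URL

payload = {
    "decision_context": "bodhi_update_push_stable",   # assumed context name
    "product_version": "fedora-26",
    "subject": [{"type": "koji_build", "item": "vim-8.0.1187-1.fc26"}],  # example build
}

resp = requests.post(f"{GREENWAVE_URL}/api/v1.0/decision", json=payload, timeout=30)
resp.raise_for_status()
decision = resp.json()

# Greenwave answers whether the gating policies are satisfied; Bodhi uses
# that answer to decide whether the update may proceed.
print(decision.get("policies_satisfied"), decision.get("summary"))
```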

However, the Fedora infrastructure is currently in freeze, so that change would require the approval of two members of the sysadmin-main group.

Freeze is currently expected to end on 2017-09-26.

One could argue that freeze, with fewer changes landing, might actually be an even better time to enable gating.

There seems to be an issue with greenwave that can potentially make it return inconsistent results.

This will likely be a blocker for enabling gating :(

Let's see if we can help resolve that blocker. I think it's important we enable gating as soon as possible.

Metadata Update from @dperpeet:
- Issue assigned to pingou

6 years ago

I was curious about the effect that the Atomic CI pipeline would have if we turned gating on, so I queried resultsdb for the results it has for the last 3, 7 and 15 days. Here is the outcome:

Over the last 3 days there were:
      30   Builds checked
      20   Builds passed (66.67%)
      10   Builds failed (33.33%)
       0   Builds unstable (0.00%)
       0   Invalid results

Over the last 7 days there were:
      44   Builds checked
      32   Builds passed (72.73%)
      12   Builds failed (27.27%)
       0   Builds unstable (0.00%)
       0   Invalid results

Over the last 15 days there were:
     131   Builds checked
      48   Builds passed (36.64%)
      82   Builds failed (62.60%)
       1   Builds unstable (0.76%)
       0   Invalid results

legend:

  • build passed: the pipeline ran entirely and successfully
  • build failed: the pipeline failed to run the tests (i.e. an error in the pipeline itself)
  • build unstable: the pipeline ran the tests but they failed (i.e. an error in the tests)
  • build invalid: just a check added to make sure we weren't including results from the stage pipeline

So looking at this, over the last 2 weeks more than 60% of the builds would have been gated because of issues in the pipeline itself.
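For reference, here is a rough sketch of how such a query against resultsdb's REST API could look. The instance URL, testcase name, and outcome values are assumptions; the actual script used for these numbers may differ.

```python
# Rough sketch of gathering the numbers above from resultsdb's REST API.
# The instance URL and testcase name are assumptions.
from collections import Counter
from datetime import datetime, timedelta

import requests

RESULTSDB_API = "https://taskotron.fedoraproject.org/resultsdb_api/api/v2.0"  # assumed instance
TESTCASE = "org.centos.prod.ci.pipeline.complete"  # assumed pipeline testcase name


def results_since(days):
    """Yield all results for the pipeline testcase submitted in the last `days` days."""
    since = (datetime.utcnow() - timedelta(days=days)).isoformat()
    url = f"{RESULTSDB_API}/results"
    params = {"testcases": TESTCASE, "since": since, "limit": 50}
    while url:
        page = requests.get(url, params=params, timeout=30).json()
        yield from page["data"]
        url = page.get("next")   # resultsdb paginates; follow the `next` link
        params = None            # the next link already carries the query string


for days in (3, 7, 15):
    outcomes = Counter(r["outcome"] for r in results_since(days))
    total = sum(outcomes.values()) or 1
    print(f"Over the last {days} days:")
    for outcome, count in outcomes.most_common():
        print(f"  {count:5}  {outcome} ({100 * count / total:.2f}%)")
```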

I was also curious to see whether some runs were done against the same commit, so that, for example, a run that fails is restarted and followed by one that succeeds.
This is the outcome:

container/cockpit#19d738deb86f6cf695bf5675b54c5d21b2615cc4 was tested 3 times on 1 branches with the same results: SUCCESS
container/cockpit#4105401fa46f1565cb57352c7a88df8e63fa2f08 was tested 3 times on 1 branches with the same results: SUCCESS
null/null#null was tested 4 times on 1 branches with the same results: FAILURE
rpms/cockpit#2ae3a79c7f15d689aa440794a670b41e35575dc1 was tested 6 times on 1 branches with the same results: FAILURE
rpms/criu#2ec40be859a588a7b14de9a58f0a0e9b297a9d0f was tested 2 times on 1 branches with the same results: SUCCESS
rpms/criu#386bedee49cb887626140f2c60522751ec620f1d was tested 2 times on 1 branches with the same results: SUCCESS
rpms/criu#66f579493ace9223c0634fbd3f46a83f00353b4d was tested 2 times on 1 branches with the same results: SUCCESS
rpms/criu#7d1ac81bb410d66a26a535a71e340d3da529689a was tested 2 times on 1 branches with the same results: SUCCESS
rpms/criu#9b903f6a012c4c04014f509967a0d98c74da34fd was tested 2 times on 1 branches with the same results: SUCCESS
rpms/criu#c84a7ab6b62a4850c4a57ebeab79969a98a24a55 was tested 2 times on 1 branches with the same results: SUCCESS
rpms/criu#db13fc1d36841847d75ebed3e18c9010dfd853f8 was tested 2 times on 1 branches with the same results: SUCCESS
rpms/dnf#1e59b6dabbc7449e8e00dbaa005138d5c919b4cb was tested 3 times on 1 branches with the same results: FAILURE
rpms/dnsmasq#1bca83a5d36c969c4e8c2c2343d592e57094903d was tested 3 times on 1 branches with the same results: SUCCESS
rpms/dnsmasq#4506b269ebdbb49f34e02813f1bf3936fa41a15f was tested 3 times on 1 branches with the same results: SUCCESS
rpms/dnsmasq#4b39bb3db4590e5332e9c67af9d2b94213839996 was tested 3 times on 1 branches with the same results: SUCCESS
rpms/dnsmasq#7bfc213fc8b072f9bbcfb07b1183b9c31bac5ef1 was tested 3 times on 1 branches with the same results: SUCCESS
rpms/dnsmasq#9d8d62978be7bf1853dca2181359cf0ab2a9e6af was tested 3 times on 1 branches with the same results: SUCCESS
rpms/dnsmasq#a9cd1c2e160f9b8913e80a8edbcc85a3ba4976d8 was tested 3 times on 1 branches with the same results: SUCCESS
rpms/dnsmasq#b75bf67ab76b97e7676cf2b28b172e8aff49eb3f was tested 3 times on 1 branches with the same results: SUCCESS
rpms/dnsmasq#b964c11672e89522f110180b06e588d16675dc97 was tested 3 times on 1 branches with the same results: SUCCESS
rpms/dnsmasq#fbec0655c95160d6e5f9d857294ec8dfc4a44354 was tested 3 times on 1 branches with the same results: SUCCESS
rpms/fontconfig#1161c4a337de895b8dfaa4205ea33bec3790e1fa was tested 3 times on 1 branches with the same results: SUCCESS
rpms/fontconfig#74db2e7ea3d186c60fbac8a616beba397d6f9660 was tested 3 times on 1 branches with the same results: FAILURE
rpms/fontconfig#812cb7e365a70347485586260fb27df509ab4a8a was tested 3 times on 1 branches with the same results: FAILURE
rpms/gnupg2#bff0c11537205b0a6ce8cab0a1bb57cf17df57c7 was tested 3 times on 1 branches with the same results: FAILURE
rpms/hwdata#b7958825c778de93d6dbf665064c5bff4ed9007b was tested 3 times on 1 branches with the same results: SUCCESS
rpms/iproute#62d79125de33e17275ffbf3f2cc42d61ba355a98 was tested 3 times on 1 branches with the same results: SUCCESS
rpms/iproute#9cac8d8147c8ae9daf7bc47f3d3da016bc8b3fc4 was tested 3 times on 1 branches with the same results: SUCCESS
rpms/kernel#7f47d72242ab3efa94c76edbe0e0cd8de2f544c5 was tested 2 times on 1 branches with the same results: FAILURE
rpms/kernel#d1e360d978557ebee08ff5f95830c8142fd4c92f was tested 2 times on 1 branches with the same results: FAILURE
rpms/nss-softokn#adde27e55ecda414c684728b31dd4b7a049bac5e was tested 3 times on 1 branches with the same results: FAILURE
rpms/nss-util#41c8edf9342c80d75d088a43b0187922e4de5849 was tested 3 times on 1 branches with the same results: FAILURE
rpms/nss-util#be647d69488bbffcb31d417c2e9f3b813b38b104 was tested 3 times on 1 branches with the same results: FAILURE
rpms/ostree#c27ecad288ca1ec1a3169eea42a7319ab9e86d38 was tested 3 times on 1 branches with the same results: SUCCESS
rpms/python-six#95795c62e5a562d980a51612b5459f22ed413cd8 was tested 4 times on 1 branches with 2 different results: FAILURE, SUCCESS
rpms/python3#8be884894c484066effe23cf8bbbd7b5a7dd9f83 was tested 3 times on 1 branches with the same results: FAILURE
rpms/samba#780c9b1a5005805f36125d3e25866e09d057a9a2 was tested 2 times on 1 branches with 2 different results: FAILURE, SUCCESS
rpms/selinux-policy#1ddd3d3e08f0d0a91ba416fba9428b6ee327a1bc was tested 2 times on 1 branches with the same results: SUCCESS
rpms/selinux-policy#757425578543e25c0f0274f1bb1c161ef6499c6f was tested 2 times on 1 branches with the same results: SUCCESS
rpms/selinux-policy#e54b1de1323f315294e84eebb5ff4d33d686c8cb was tested 2 times on 1 branches with the same results: SUCCESS
rpms/vim#0abe9cad318d180e42bef3b2c1d1442115e13dad was tested 3 times on 1 branches with the same results: SUCCESS
rpms/vim#2de6e6aa8b44757956859dd9be6147bc37bae6b5 was tested 2 times on 1 branches with the same results: SUCCESS
rpms/vim#905a7f266b4b12ef5faf5e8ad78abcdfd65fffbb was tested 3 times on 1 branches with the same results: SUCCESS
rpms/vim#926fadfc42af7a2384ff8bf21550524451d529b0 was tested 3 times on 1 branches with the same results: SUCCESS
rpms/vim#b8d394bfbbba23ed47f254758bd8d69b24af9346 was tested 3 times on 1 branches with the same results: SUCCESS
rpms/vim#c48460c7fa141685754758c5917200bdd3f5ba62 was tested 2 times on 1 branches with the same results: SUCCESS

So amusingly, it seems that all but two of the commits tested more than once produced the same outcome, which is mostly SUCCESS: 11 commits were tested more than once with the same FAILURE outcome, and only 2 commits were tested more than once and got both SUCCESS and FAILURE (I didn't go check in which order).
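For what it's worth, here is a sketch of the kind of grouping behind the list above, assuming each resultsdb result carries the repo, commit, and branch in its `data` field (the exact key names are assumptions):

```python
# Sketch of the grouping behind the list above: bucket results by
# (namespace/repo, commit) and flag commits that got more than one distinct
# outcome. The data key names (repo, rev, branch) are assumptions.
from collections import defaultdict


def summarize_repeated_runs(results):
    """`results` is an iterable of resultsdb result dicts."""
    by_commit = defaultdict(list)
    for res in results:
        data = res.get("data", {})
        repo = (data.get("repo") or [None])[0]      # resultsdb stores extra data values as lists
        rev = (data.get("rev") or [None])[0]
        branch = (data.get("branch") or [None])[0]
        by_commit[(repo, rev)].append((branch, res["outcome"]))

    for (repo, rev), runs in sorted(by_commit.items()):
        if len(runs) < 2:
            continue  # only interested in commits tested more than once
        branches = {b for b, _ in runs}
        outcomes = {o for _, o in runs}
        consistency = ("the same results" if len(outcomes) == 1
                       else f"{len(outcomes)} different results")
        print(f"{repo}#{rev} was tested {len(runs)} times on "
              f"{len(branches)} branches with {consistency}: {', '.join(sorted(outcomes))}")
```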

A note about these numbers: I queried resultsdb, which as we saw in #19 may have missed some messages, so the percentages should be taken with a grain of salt.
However, this remains a much higher rate of failure than we would like to see for something that actually blocks the Fedora packager workflow.

@pingou and I were discussing this morning and found that there seem to be builds happening against, for example, the F26 Atomic branch but with commits from the master (Rawhide) branch in dist-git. Would it be more correct for the CI pipeline to be testing builds on the same branch as the Atomic tree being used (in other words, commits to the F26 dist-git branch tested on the Atomic F26 tree)? @dperpeet @alivigni

AFAIK we are only running the pipeline on f26. We have not turned on f27 or Rawhide.

@alivigni: Understood, thanks. What I'm saying is, commits/PRs on master may not be appropriate to run on the f26 pipeline, as opposed to commits/PRs on the f26 branch. Does that make sense?

@pfrields It wouldn't be on the f26 pipeline; it would be on its own parallel pipeline for master/rawhide. Does that make sense?

Possibly. Here's a specific example:

It's possible that builds don't get submitted by the maintainer on a specific branch because they're known to not work. I'm not sure whether this means builds in the pipeline on a different branch should be ignored for now, or this is a concern for the future.

In other words, is this simply an artifact of the pipeline only doing one branch right now? Is it expected to separate those branches later, watching for commits/PRs from the matching dist-git branch only?

We are only triggering on dist-git messages on f26, so when this was merged
https://src.fedoraproject.org/fork/pingou/rpms/mesa/commits/f26
a dist-git push message went out and we triggered.

The branch it ran against is:

https://jenkins-continuous-infra.apps.ci.centos.org/job/continuous-infra-ci-pipeline-f26/458/parameters/

So there is no logic for us to differentiate: if a message goes out, we trigger.
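If we later want to differentiate, the trigger could filter on the branch carried by the dist-git push message before starting the matching pipeline. A rough sketch follows; the topic name, message fields, and job names here are assumptions.

```python
# Rough sketch of per-branch differentiation for the trigger: only start the
# pipeline whose branch matches the branch in the dist-git push message.
# The topic suffix and message layout (commit.branch, commit.repo) are
# assumptions based on the usual dist-git "git.receive" messages.

PIPELINES = {
    "f26": "continuous-infra-ci-pipeline-f26",
    # "master": "continuous-infra-ci-pipeline-rawhide",  # hypothetical future job
}


def on_distgit_message(topic, msg):
    if not topic.endswith("git.receive"):
        return
    commit = msg.get("commit", {})
    branch = commit.get("branch")
    repo = commit.get("repo")
    job = PIPELINES.get(branch)
    if job is None:
        # No pipeline configured for this branch: ignore the message instead
        # of running everything on the f26 pipeline.
        return
    trigger_jenkins_job(job, repo=repo, branch=branch)  # hypothetical helper


def trigger_jenkins_job(job, **params):
    # Stand-in for whatever actually starts the Jenkins job.
    print(f"would trigger {job} with {params}")
```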

Ah, this makes sense. That commit was indeed merged onto the f26 branch. So as long as we're relying on HEAD on our side, things are OK. I now return this ticket to the main issue on gating.

For this to work properly, we want retriggering to be enabled (to resolve test flakes without having to push to dist-git again). Also, we want a significant number of tests to actually pass: >= 80%.

We'll also need to wait/check for the new duffy (FOSS) release to be out and deployed.

I would like to suggest the duffy release not be a blocker per se, but that we do expect the new release to be on the way. IOW, continue on good faith it's going to be available in the foreseeable future.

@pingou tells me it's unlikely the duffy release will go longer than other blocker issues, so this may be a moot point.

As of January 18th 2018, gating has been enabled in Fedora.

Announce

Metadata Update from @pingou:
- Issue close_status updated to: Fixed

6 years ago
