Looking at: https://jenkins-continuous-infra.apps.ci.centos.org/job/fedora-rawhide-stage-build-pipeline/ the CI pipeline for rawhide in staging doesn't seem to be running anymore.
I've made a few builds and bodhi updates since November 15th but that is the last build there.
There was an issue with the trigger that I've just fixed, but there seems to be another issue.
The current issue appears to be that the default CI parameters are empty when a job gets triggered by CI: https://jenkins-continuous-infra.apps.ci.centos.org/view/Fedora%20All%20Packages%20Pipeline/job/fedora-stage-build-trigger/build?delay=0sec
@jimbair do you think it could be related to plugin upgrade? I think the trigger jobs used in production don't set any parameters so they are not affected by it...
@bgoncalv that seems likely given the timing. I do recall someone was tasked to update the plugin to add some data into the payload, so maybe this was it? I'll try to dig up the specifics for who was assigned the task.
The payload change was requested for the messages we send to Fedora-Messaging, but this case is about the trigger, and we still trigger on FedMsg (here, FedMsg stage).
@bstinson is it possible the upgrade may have affected FedMsg stage? That seems unlikely but not entirely impossible...
ping?
@jimbair @bstinson any news?
@bgoncalv none from my side - I'm not sure what the best method is to sort this out, though I am curious if the payloads generated from before/after the migration can be examined in some way? So we can see what, if anything, changed. We already fixed the unix time, so maybe there's something on the fedmsg side that's also been changed in a similar manner?
I'm not sure how to do that short of rolling back the update to see if things behave again. @bstinson any thoughts?
Or maybe stage Bodhi stopped sending messages to stage FedMsg?
@pingou does the issue still happen?
Trying to figure this out, bodhi had some issues in stg indeed, hopefully it's now fixed.
I'll update this ticket ASAP :)
Ok, so I have a build, an update which was announced by bodhi: https://apps.stg.fedoraproject.org/datagrepper/id?id=2020-59d6de8c-4f48-4468-bc9d-55f845256ab4&is_raw=true&size=extra-large but I'm not seeing the corresponding CI run/messages for it (the NVR is: fedora-gather-easyfix-0.2.1-86.fc32)
Do you have more luck?
The build gets triggered, but it seems that since the plugin upgrade the default parameters we trigger the build with are empty, and this causes it to fail on the stage environment.
https://github.com/CentOS-PaaS-SIG/upstream-fedora-pipeline/blob/master/JenkinsfileStageBuildTrigger#L30
https://jenkins-continuous-infra.apps.ci.centos.org/job/fedora-stage-build-trigger/182/parameters/
Looks like something is still up: https://jenkins-continuous-infra.apps.ci.centos.org/job/fedora-rawhide-stage-build-pipeline/ :(
I think we need @jimbair or @bstinson to check the plugin. Do we know what plugins got updated?
I reached out to @bstinson and he noticed that the staging pipeline is running the build but it's sending production messages:
https://jenkins-continuous-infra.apps.ci.centos.org/job/fedora-rawhide-stage-build-pipeline/176/console
<bstinson> so, the staging pipeline is running the koji build: https://koji.stg.fedoraproject.org/koji/taskinfo?taskID=90003965
<bstinson> but it's sending production messages:
<bstinson> 09:42:45 Message topic: org.centos.prod.ci.pipeline.allpackages-build.package.complete
<bstinson> and trying to download from production koji
<bstinson> 09:42:39 + koji download-task --arch=x86_64 --arch=src --arch=noarch --logs 90003965
<bstinson> 09:42:40 No such task: #90003965
So something, somewhere, is using prod when it should be using stage. Thoughts?
It feels like this should be handled here:
https://github.com/CentOS-PaaS-SIG/upstream-fedora-pipeline/blob/master/src/org/centos/pipeline/PackagePipelineUtils.groovy#L331-L339
And we have "fedora-fedmsg-stage" and "FedoraMessagingStage" configured, so I'm not sure where the disconnect is...
As I mentioned in my first comment, the problem is with the trigger job (https://jenkins-continuous-infra.apps.ci.centos.org/view/Fedora%20All%20Packages%20Pipeline/job/fedora-stage-build-trigger/) and not with the pipeline job (https://jenkins-continuous-infra.apps.ci.centos.org/job/fedora-rawhide-stage-build-pipeline/).
In the trigger job we set parameters that override the default settings for the message provider and koji instance. The defaults are the production ones, and we change them to stage: https://github.com/CentOS-PaaS-SIG/upstream-fedora-pipeline/blob/master/JenkinsfileStageBuildTrigger#L30
But for some reason (apparently since the plugin upgrade) these parameters are not being set in the trigger job when it is triggered by a CI message (https://jenkins-continuous-infra.apps.ci.centos.org/view/Fedora%20All%20Packages%20Pipeline/job/fedora-stage-build-trigger/217/parameters/).
If I manually use "Build with Parameters" it works: https://jenkins-continuous-infra.apps.ci.centos.org/job/fedora-stage-build-trigger/222/ https://jenkins-continuous-infra.apps.ci.centos.org/job/fedora-rawhide-stage-build-pipeline/177/parameters/
Sorry - I missed that part, :( But wouldn’t we hit the unknown provider block if it was empty?
https://github.com/CentOS-PaaS-SIG/upstream-fedora-pipeline/blob/master/src/org/centos/pipeline/PackagePipelineUtils.groovy#L338
But thanks for the follow up - I can look a bit more today as well and ping Brian again. :)
Sorry - I missed that part, :( But wouldn’t we hit the unknown provider block if it was empty? https://github.com/CentOS-PaaS-SIG/upstream-fedora-pipeline/blob/master/src/org/centos/pipeline/PackagePipelineUtils.groovy#L338
Not really, the code above that block sets the provider if it was not set yet: https://github.com/CentOS-PaaS-SIG/upstream-fedora-pipeline/blob/master/src/org/centos/pipeline/PackagePipelineUtils.groovy#L324
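To illustrate why an empty parameter ends up on production rather than in the unknown-provider branch, here is a minimal sketch in Python. It is not the code from PackagePipelineUtils.groovy: the parameter name `MSG_PROVIDER`, the production provider name `fedora-fedmsg`, and the function `resolve_environment` are assumptions made up for this example. Only the two stage provider names are taken from this ticket.

```python
# Illustrative sketch only; not the real pipeline code.
# Hypothetical production default, assumed for this example.
DEFAULT_PROVIDER = "fedora-fedmsg"

# Provider name -> environment. The two stage names are quoted in this ticket;
# the production entry is an assumption.
PROVIDER_ENV = {
    "fedora-fedmsg": "prod",
    "fedora-fedmsg-stage": "stage",
    "FedoraMessagingStage": "stage",
}


def resolve_environment(params):
    """Return (provider, environment) for a dict of build parameters."""
    # An empty or missing value is replaced by the default first...
    provider = params.get("MSG_PROVIDER") or DEFAULT_PROVIDER
    # ...so this branch is only reached for a non-empty, unrecognised name.
    if provider not in PROVIDER_ENV:
        raise ValueError(f"Unknown message provider: {provider}")
    return provider, PROVIDER_ENV[provider]
```

With this shape, a trigger that passes no parameters resolves to production silently, while a manual "Build with Parameters" run that sets a stage provider resolves to stage.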
We had a long discussion in #fedora-ci this morning to work through this, so I'll give a lengthy summary here:
So hopefully the above gets us back to a working staging pipeline.
Pingou, if you can try another stage koji build so we can see if it triggers, that is our next step.
Thanks!
I've created a Jenkinsfile to test the trigger on Prod FedoraMessaging.
It triggered, but it still had the same issue with default parameters not being set.
Also, I've noticed the CI_MESSAGE changed.
example:
{"deliveryTag":17,"msg":{"agent":"bodhi","artifact":{"builds":[{"component":"nano","id":1457844,"issuer":"kdudka","nvr":"nano-4.8-1.fc32","scratch":false,"task_id":41422439,"type":"koji-build"}],"id":"FEDORA-2020-b20bb0ca98-42150e1a8c4da1a9f1ddd1e75b05b21ac33da6f7","release":"f32","repository":"https://bodhi.fedoraproject.org/updates/FEDORA-2020-b20bb0ca98","type":"koji-build-group"},"contact":{"docs":"https://docs.fedoraproject.org/en-US/ci/","email":"admin@fp.o","name":"Bodhi","team":"Fedora CI"},"generated_at":"2020-02-07T12:05:36.924879Z","re-trigger":false,"version":"0.2.2"},"msg_id":"19eaa876-e0a8-49f4-9745-a6d662f84d75","timestamp":1581077137585,"topic":"org.fedoraproject.prod.bodhi.update.status.testing.koji-build-group.build.complete"}
{"version":"0.2.2","re-trigger":false,"agent":"bodhi","contact":{"docs":"https://docs.fedoraproject.org/en-US/ci/","team":"Fedora CI","name":"Bodhi","email":"admin@fp.o"},"artifact":{"release":"f32","type":"koji-build-group","id":"FEDORA-2020-b20bb0ca98-42150e1a8c4da1a9f1ddd1e75b05b21ac33da6f7","repository":"https://bodhi.fedoraproject.org/updates/FEDORA-2020-b20bb0ca98","builds":[{"nvr":"nano-4.8-1.fc32","task_id":41422439,"scratch":false,"component":"nano","type":"koji-build","id":1457844,"issuer":"kdudka"}]},"generated_at":"2020-02-07T12:05:36.924879Z"}
Basically the CI_MESSAGE from FedMsg is just a subset (the content of msg) from the message that FedoraMessaging sends. I'm not sure if this change was intended because I'd expect the CI_MESSAGE to just contain the message, like FedMsg does.
This is the message in datagrepper: https://apps.fedoraproject.org/datagrepper/id?id=2020-19eaa876-e0a8-49f4-9745-a6d662f84d75&is_raw=true&size=extra-large
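Until the format question is settled, a consumer could accept both CI_MESSAGE shapes. The sketch below is one way to do that in Python, assuming the envelope can be recognised by a top-level `topic` key together with a `msg` object, as in the first example above. The function name `extract_ci_payload` is hypothetical, and this is not how the plugin or the pipeline handles it.

```python
import json


def extract_ci_payload(ci_message):
    """Return the message body from either CI_MESSAGE shape.

    Assumption: an enveloped message carries a top-level "topic" key and the
    body under "msg"; a bare message is already the body.
    """
    data = json.loads(ci_message)
    inner = data.get("msg")
    if isinstance(inner, dict) and "topic" in data:
        # Enveloped shape: unwrap the body.
        return inner
    # Bare shape: nothing to unwrap.
    return data
```

Code that reads fields such as `artifact` or `version` can then work on the returned dict regardless of which provider delivered the message.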
Confirmed by my script:
12:18:45 - Retrieving update created [DONE]
Update automatically created : FEDORA-2020-b218270632
12:18:49 - Retrieving koji tags: ['f32-updates-candidate', 'f32-updates-testing-pending'] [DONE]
12:18:50 - bodhi to CI results in datagrepper returned - ran for: 0s [DONE]
12:33:52 - CI (running) results not found in datagrepper - ran for: 901s [FAILED]
Strange. Unless I didn't configure https://jenkins-continuous-infra.apps.ci.centos.org/view/Fedora%20All%20Packages%20Pipeline/job/fedora-stage-build-trigger/ properly, we were not even able to trigger using FedoraMessaging Stage...
So it seems there are 3 problems being discussed here, I opened different issues for them :)
In this issue we track the empty default parameters in the trigger, which cause the stage job not to be triggered properly. This happens no matter which provider we use.
Can't trigger on FedoraMessage Stage: https://pagure.io/fedora-ci/general/issue/94
Problem with CI_MESSAGE format on FedoraMessaging: https://github.com/jenkinsci/jms-messaging-plugin/issues/165
This issue should be fixed now after updating jms-messaging-plugin to version 1.1.12.
@pingou can you confirm it?
Looks to be fine:
15:41:04 - bodhi to CI results in datagrepper returned - ran for: 0s [DONE]
15:41:42 - CI (running) results in datagrepper returned running - ran for: 37s [DONE]
15:48:32 - CI (complete) results in datagrepper returned error - ran for: 410s [DONE]
Now onto why greenwave in stg isn't sending messages :)
Thanks folks! :)
Metadata Update from @pingou: - Issue status updated to: Closed (was: Open)