#15 taskotron-trigger is not scheduling jobs for koji build messages in dev, stg or prod
Closed: Fixed None Opened 9 years ago by tflink.

We're not exactly sure when this started but around the time of the last taskotron-trigger upgrade, the taskotron instances in the Fedora infra stopped scheduling jobs for koji builds. @mkrizek was working on the issue and wasn't getting very far with it because the issue wasn't reproducible on his local systems.

@tflink started digging into it as well for a sanity check and found that he could reproduce the problem locally.

We were able to verify that the systems were capable of receiving the relevant fedmsgs and those fedmsgs were properly formatted - eliminating most bits of the system as causes.

Upon closer examination, it turns out that while the fedmsg is received and the correct callbacks are made, the util function to determine the arches of a build was always returning empty lists. Without arches, no jobs are actually scheduled.

Putting a simple time.sleep(1) before the utils.get_arches() call solves the issue: builds are found, arches are detected and jobs are properly scheduled.
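As a rough sketch of how the delay could be made less blunt than a fixed sleep, the lookup could be retried until it returns something. Everything below is an assumption for illustration: call_with_retry and FakeKoji are hypothetical names, and the real signature of utils.get_arches() is not shown in this ticket, so the fake just takes a build NVR and returns a list.

```python
import time


def call_with_retry(lookup, *args, attempts=5, delay=1.0, sleep=time.sleep):
    """Call lookup(*args) until it returns a non-empty result.

    Sleeps `delay` seconds before each retry. If every attempt comes back
    empty, the last (empty) result is returned so the caller can decide
    what to do.
    """
    result = lookup(*args)
    for _ in range(attempts - 1):
        if result:
            break
        sleep(delay)
        result = lookup(*args)
    return result


class FakeKoji:
    """Hypothetical stand-in for the arch lookup (NOT the real utils module).

    Returns an empty list for the first `empty_responses` queries, imitating
    koji not yet answering for a build that was just announced via fedmsg.
    """

    def __init__(self, empty_responses):
        self.calls = 0
        self.empty_responses = empty_responses

    def get_arches(self, nvr):
        self.calls += 1
        if self.calls <= self.empty_responses:
            return []
        return ["x86_64", "i386"]


if __name__ == "__main__":
    fake = FakeKoji(empty_responses=2)
    arches = call_with_retry(fake.get_arches, "foo-1.0-1.fc21", delay=0.1)
    print(arches, "after", fake.calls, "calls")
```

Compared to an unconditional time.sleep(1), this only waits when the first query comes back empty, and it gives up after a bounded number of attempts instead of blocking the trigger indefinitely.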

At the moment, it looks like there is a delay between fedmsg emission and when koji responds to queries about the data contained in those fedmsgs. If this theory pans out, @mkrizek may not have seen the error on his dev instance due to network lag giving enough delay for koji to be able to respond properly to the queries.

Finish debugging and propose a fix so that the Taskotron instances start scheduling jobs properly again.


This ticket had assigned some Differential requests:
D238
D240

Bah, not sure it's actually koji load.

Looking at stats from resultsdb in dev:
| date | # of rpmlint jobs scheduled |
|------|-----------------------------|
|2014-09-21|417|
|2014-09-22|345|
|2014-09-23|87|
|2014-09-24|29|

stg shows none of the downturn in scheduled rpmlint jobs, taskotron-trigger-0.8.7 was installed in dev on 2014-09-22.

! In #341#4, @tflink wrote:
stg shows none of the downturn in scheduled rpmlint jobs, taskotron-trigger-0.8.7 was installed in dev on 2014-09-22.

I take that back, stg has had no rpmlint jobs scheduled since last Wednesday (unless you count the 1 scheduled on Thursday instead of the 300+/day scheduled previously).

Let's give the delay a shot and see if that works as a workaround.
