#28 Update trigger to support per-image tasks for docker layered images
Closed: Invalid. Opened 7 years ago by tflink.

Update taskotron-trigger such that it listens for the fedmsgs emitted upon docker layered image build completion. Ensure that the proper per-image tasks are triggered when an appropriate message is detected.


This ticket had some Differential requests assigned to it:
D1002

This seems like a better place for discussion than D1002, so I'm starting the conversation here.

In D1002, the description says:

With @kparal, I discussed where we want to query koji, whether to do it in the trigger or in the task via a koji directive. Kamil is for the koji directive; Josef and I are for the trigger. The trigger approach can have a problem when koji is slow, but it produces the item in 'docker pullable' format (a URL to the registry), and compared to the directive approach, koji is queried only once even when there are several tasks for a given image.
When the item is a URL to the registry, we would need to extract cockpit:0.95-0.6 from the URL to get the 'proper' item name to report against, or start passing more arguments to runtask and make sure that the test is reported against the correct NVR.
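
As an illustration of the extraction described above, here is a minimal sketch; the exact layout of the pullable URL (and the helper name) is an assumption for illustration, not something specified in the ticket:

```python
def item_from_pull_url(url):
    """Extract e.g. 'cockpit:0.95-0.6' from a hypothetical registry pull URL
    such as 'registry.example.org/cockpit:0.95-0.6'."""
    # drop the registry host (and any namespace), keep the trailing 'name:version-release'
    return url.rsplit('/', 1)[-1]


print(item_from_pull_url('registry.example.org/cockpit:0.95-0.6'))  # cockpit:0.95-0.6
```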

My instinct is to agree with @kparal on this and put the lookup logic into libtaskotron. I understand your argument about the number of koji queries, but I'm tempted to say that if it's an issue, they need to put more information in the fedmsg that's being emitted during the build process.

Are there any other cases we know of where some form of extra processing and/or network calls are going to be required before the incoming fedmsg can be processed?

If this is going to be a lone case, I'd really rather have the logic in #libtaskotron for the following reasons (in order of what I think is most important):

  1. Doing the processing inside the trigger starts to blur the lines between what the trigger is responsible for and what libtaskotron is responsible for. Koji lookups to find the bits that will actually be poked during task execution are something libtaskotron already does in many tasks.
  2. Trigger logs and failures are not as easily available as failures later in execution are. I don't want the only indication of problems during docker image triggering to be a slow/non-responsive trigger, with only the logs on the triggering machine to indicate what the problem might be.
  3. Slow koji could slow the trigger down, which can cause problems for every other kind of job that needs to be triggered.
  4. I want to keep the trigger dumb and consistent - this will make the eventual web interface easier.

Other thoughts?

Are there any other cases we know of where some form of extra processing and/or network calls are going to be required before the incoming fedmsg can be processed?

For example, the whole task-discovery code does (and will do) git-clone, so, yes, we do have extra network calls.

  1. Doing the processing inside the trigger starts to blur the lines between what the trigger is responsible for and what libtaskotron is responsible for. Koji lookups to find the bits that will actually be poked during task execution are something libtaskotron already does in many tasks.

I kind of believe that this particular kind of information is important even to the trigger - I can easily think of tasks that would only get executed on some of the repositories. Take koji-tag related scheduling - we disregard non-primary instances and pending tags, and we do it in the trigger, instead of having the code in libtaskotron and being smart there.
I know it seems to be a fundamental difference, since the koji-tag code does not query anything, but the information seems to have the same value. It is just not present in the fedmsg now.
So I think we could make a case for the fedmsgs to contain the information we need, but that does not mean we should just offload the work to libtaskotron before that happens.
An integral conceptual part of the new trigger is that it gathers as much relevant information as possible and then does its thing. I even think there is a strong case for the trigger passing more information to runtask than it does now, but that would need changes in libtaskotron, as runtask currently has a very limited set of params.
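
To illustrate the kind of trigger-side filtering mentioned above for koji tags, a rough sketch follows; the message fields, the suffix convention, and the function are hypothetical, not the actual taskotron-trigger code:

```python
PENDING_SUFFIX = '-pending'  # assumption: pending tags end with this suffix


def want_koji_tag_message(msg):
    """Decide in the trigger whether a koji tag fedmsg should schedule any jobs."""
    body = msg.get('msg', {})
    if body.get('instance', 'primary') != 'primary':
        return False  # disregard non-primary koji instances
    if body.get('tag', '').endswith(PENDING_SUFFIX):
        return False  # disregard pending tags
    return True
```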

  2. Trigger logs and failures are not as easily available as failures later in execution are. I don't want the only indication of problems during docker image triggering to be a slow/non-responsive trigger, with only the logs on the triggering machine to indicate what the problem might be.

Then let's make the logs available. We could easily use ExecDB to "report" that koji was unresponsive - we have (want to have) the concept of different "failure levels", and adding a job with "FAILED_TRIGGER" to ExecDB is easy and would really make sense.

  3. Slow koji could slow the trigger down, which can cause problems for every other kind of job that needs to be triggered.

To be honest, I'm not sure how the fedmsg consuming works now (I can/will test), but if the consumer method calls are blocking (I don't think they are, we'll see), then we could easily just execute the _process_docker code in a separate, non-blocking thread.
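
A rough sketch of what offloading the docker processing to a thread could look like; the consumer class layout and the _process_docker signature are assumptions for illustration, not the real trigger code:

```python
import threading

import fedmsg.consumers


class DockerTriggerConsumer(fedmsg.consumers.FedmsgConsumer):
    # topic, config_key, etc. omitted; this is only a sketch

    def consume(self, message):
        # keep the consuming path fast: run the potentially slow koji query
        # in a background thread instead of blocking the hub
        worker = threading.Thread(target=self._process_docker, args=(message,))
        worker.daemon = True
        worker.start()

    def _process_docker(self, message):
        ...  # query koji, assemble job arguments, hand off to runtask
```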

  4. I want to keep the trigger dumb and consistent - this will make the eventual web interface easier.

So do I, but I don't see any tie between grabbing "more information" at trigger time and a web interface, honestly.

Makes sense?

! In #840#11915, @jskladan wrote:
Are there any other cases we know of where some form of extra processing and/or network calls are going to be required before the incoming fedmsg can be processed?

For example, the whole task-discovery code does (and will do) git-clone, so, yes, we do have extra network calls.

Not quite sure how I forgot about that.

  1. Doing the processing inside the trigger starts to blur the lines between what the trigger is responsible for and what libtaskotron is responsible for. Koji lookups to find the bits that will actually be poked during task execution are something libtaskotron already does in many tasks.

I kind of believe that this particular kind of information is important even to the trigger - I can easily think of tasks that would only get executed on some of the repositories. Take koji-tag related scheduling - we disregard non-primary instances and pending tags, and we do it in the trigger, instead of having the code in libtaskotron and being smart there.
I know it seems to be a fundamental difference, since the koji-tag code does not query anything, but the information seems to have the same value. It is just not present in the fedmsg now.

That makes sense.

So I think we could make a case for the fedmsgs to contain the information we need, but that does not mean we should just offload the work to libtaskotron before that happens.

One question is whether we want to file an issue against koji for the fedmsg to contain more information. I wonder how quickly that could be acted upon.

An integral conceptual part of the new trigger is that it gathers as much relevant information as possible and then does its thing. I even think there is a strong case for the trigger passing more information to runtask than it does now, but that would need changes in libtaskotron, as runtask currently has a very limited set of params.

  2. Trigger logs and failures are not as easily available as failures later in execution are. I don't want the only indication of problems during docker image triggering to be a slow/non-responsive trigger, with only the logs on the triggering machine to indicate what the problem might be.

Then let's make the logs available. We could easily use ExecDB to "report" that koji was unresponsive - we have (want to have) the concept of different "failure levels", and adding a job with "FAILED_TRIGGER" to ExecDB is easy and would really make sense.

Yet another case for centralized logging, some day that'll get done :)

How would a FAILED_TRIGGER state in execdb work? Does that mean we'd have a try-except block that catches errors and submits them to execdb?
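
One possible shape for that, sketched here only to make the question concrete; the execdb client call, the helpers, and the FAILED_TRIGGER state are assumptions rather than an existing API:

```python
def trigger_docker_job(message, execdb):
    """Hypothetical triggering path that records trigger-time failures in ExecDB."""
    try:
        image_info = query_koji_for_image(message)  # hypothetical helper
        schedule_runtask(image_info)                # hypothetical helper
    except Exception as err:
        # instead of the failure living only in logs on the triggering machine,
        # record a job in ExecDB with a trigger-level failure state
        execdb.create_job(
            item=message.get('msg', {}).get('name', 'unknown'),
            state='FAILED_TRIGGER',
            note=str(err),
        )
        raise
```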

  3. Slow koji could slow the trigger down, which can cause problems for every other kind of job that needs to be triggered.

To be honest, I'm not sure how the fedmsg consuming works now (I can/will test), but if the consumer method calls are blocking (I don't think they are, we'll see), then we could easily just execute the _process_docker code in a separate, non-blocking thread.

fedmsg-hub uses a twisted reactor IIRC which should be non-blocking to an extent. It'd be worth verifying that and figuring out how it's configured, though.
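
If the hub really is running on a Twisted reactor, another option (again just a sketch, with a hypothetical consumer method) would be to push the blocking koji call onto the reactor's thread pool instead of managing threads by hand:

```python
import logging

from twisted.internet import threads

log = logging.getLogger(__name__)


def consume(self, message):
    """Sketch of a consumer method; 'self' is assumed to expose _process_docker."""
    # deferToThread runs the blocking work in the reactor's thread pool, so
    # other incoming fedmsgs keep being consumed in the meantime
    deferred = threads.deferToThread(self._process_docker, message)
    deferred.addErrback(lambda failure: log.error(failure.getTraceback()))
```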

  4. I want to keep the trigger dumb and consistent - this will make the eventual web interface easier.

So do I, but I don't see any tie between grabbing "more information" at trigger time and a web interface, honestly.

I think the biggest thing here is how to represent the YAML from trigger in a web interface as things get more complicated. Not an insurmountable problem but something that I'd like to keep in mind.

Overall, you've convinced me, though. Objection withdrawn.
