#874 "FINISHED" compose missing some artifacts
Closed: Fixed 5 years ago Opened 6 years ago by mohanboddu.

So, it turns out a "FINISHED" compose is not actually a fully finished compose.

https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20180312.n.0/compose/Container/ is from a "FINISHED" compose, but it's missing the ppc64le and s390x images.

koji failed task - https://koji.fedoraproject.org/koji/taskinfo?taskID=25643870


The parent koji task succeeds because the non-failable components passed, which is probably the problem, right?

I would like these missing artifacts to be accounted for as well.

@dustymabe No, if not all of the failable artifacts are composed, then it should be "FINISHED_INCOMPLETE".

  • FINISHED - All artifacts succeed and are delivered
  • FINISHED_INCOMPLETE - All non-failable artifacts succeed and are delivered.
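
To make the two definitions above concrete, here is a minimal sketch of how the final state could be derived from per-artifact results. The `Artifact` class and `compose_status` function are hypothetical names for illustration only, not pungi's actual code, and the sketch assumes that a failed non-failable artifact aborts the compose.

```python
from dataclasses import dataclass


@dataclass
class Artifact:
    """Hypothetical record of one deliverable in a compose."""
    name: str
    failable: bool   # True if the artifact is allowed to fail
    succeeded: bool  # True if it was built and delivered


def compose_status(artifacts):
    """Return the compose state implied by the individual artifact results."""
    blocking = [a.name for a in artifacts if not a.succeeded and not a.failable]
    if blocking:
        # Assumption: a failed non-failable artifact aborts the compose.
        raise RuntimeError("blocking artifacts failed: " + ", ".join(blocking))
    if all(a.succeeded for a in artifacts):
        return "FINISHED"
    # Only failable artifacts are missing.
    return "FINISHED_INCOMPLETE"
```

With an x86_64 image that succeeded and a failable ppc64le image that failed, this returns "FINISHED_INCOMPLETE", which is the state the compose in this ticket should have reported.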

I agree that if the ppc64le images fail to build, then the state should be FINISHED_INCOMPLETE. What I'm saying is that this is probably happening because the parent koji task succeeds whenever the images marked as not failable succeed, and most likely all pungi looks at is whether the parent koji task succeeded (which only tells us that the non-failable tasks succeeded).

I'm venturing into new territory for me - but here goes.

What seems to be happening is that pungi/phases/image_build.py invokes:

    with failable(compose, self.can_fail, variant, '*', 'image-build', subvariant,
                  logger=self.pool._logger):
        self.worker(num, compose, variant, subvariant, cmd)

failable comes from pungi.util, where the signature looks like:

    def failable(compose, can_fail, variant, arch, deliverable, subvariant=None, logger=None):

can_fail is used to decide whether a failure is blocking, if defined, but the call then passes 'arch' as just '*'.

So it's going to pass if any one of the arches defined for the CONTAINER variant works in the image-build phase, or that's how it appears to me.

@lsedlar what do you think of my comment above?

The failable context manager discards a potential exception that can be thrown when the koji task fails. If the deliverable is not blocking, the exception is logged and nothing else happens. If it is blocking, the exception is re-thrown and aborts the compose.
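
For readers unfamiliar with the pattern, here is a simplified sketch of a context manager that behaves as described. It is not the real pungi.util.failable (whose signature is quoted earlier in this ticket); the name `failable_sketch` and its reduced parameter list are assumptions made for illustration.

```python
import logging
from contextlib import contextmanager


@contextmanager
def failable_sketch(can_fail, deliverable, logger):
    """Simplified, hypothetical stand-in for a 'failable' context manager."""
    try:
        yield
    except Exception as exc:
        if not can_fail:
            # Blocking deliverable: re-raise so the compose is aborted.
            raise
        # Optional deliverable: log the failure and carry on.
        logger.warning("%s failed, but is optional: %s", deliverable, exc)


log = logging.getLogger("compose-sketch")

# An exception inside the block is swallowed when the deliverable may fail.
with failable_sketch(True, "image-build", log):
    raise RuntimeError("koji task failed")

# If the body raises nothing, the context manager has nothing to report.
with failable_sketch(True, "image-build", log):
    pass
```

The second `with` block is the interesting case for this ticket: when the body completes without raising, the wrapper cannot know that anything went wrong.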

What's happening here is that only the optional architectures fail, and therefore the whole koji task is successful, so there is no exception and pungi does not even know there are some images missing. It needs to check individual subtasks and check if they were successful.

The subtasks are already examined to extract paths to the created images. It's just that failed subtasks are silently ignored: https://pagure.io/pungi/blob/master/f/pungi/wrappers/kojiwrapper.py#_365
Instead they should be recorded as failed and properly logged.
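
A rough sketch of that idea follows. It assumes each subtask is available as a plain dict with `arch`, `state` and `paths` keys, and that a successful subtask has the state "CLOSED". The dict shape and the function name `collect_image_results` are hypothetical and do not reflect the actual kojiwrapper code or the fix in #929.

```python
import logging

logger = logging.getLogger("subtask-sketch")


def collect_image_results(subtasks):
    """Split subtasks into image paths per arch and a list of failed arches.

    `subtasks` is a hypothetical list of dicts with the keys
    'arch', 'state' and 'paths'.
    """
    images = {}
    failed = []
    for task in subtasks:
        if task["state"] == "CLOSED":
            # Successful subtask: remember where its images ended up.
            images.setdefault(task["arch"], []).extend(task["paths"])
        else:
            # Failed or canceled subtask: record and log it instead of skipping.
            failed.append(task["arch"])
            logger.warning("image build for %s ended in state %s",
                           task["arch"], task["state"])
    return images, failed


images, failed = collect_image_results([
    {"arch": "x86_64", "state": "CLOSED", "paths": ["/tmp/x86_64.tar.xz"]},
    {"arch": "ppc64le", "state": "FAILED", "paths": []},
])
```

A non-empty `failed` list is the signal that could then be used to mark the compose as FINISHED_INCOMPLETE even though the parent task succeeded.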

This problem was hit again during beta: there are missing 32-bit images, but probably no one was looking at them to try to fix them because the status of the compose is FINISHED. See https://pagure.io/releng/issue/7418#comment-503006

Proposed fix in #929. I'm currently trying to test that in stage.

Metadata Update from @lsedlar:
- Issue tagged with: 4.1.24

5 years ago

Looking forward to this, even though we might not see FINISHED as often as we do now :cry:
