#677 use output of ostree step to determine if later steps should be done
Opened 2 years ago by dustymabe. Modified 2 years ago

In rawhide for the past few days our ostree builds have been failing because of https://bugzilla.redhat.com/show_bug.cgi?id=1472573. But our image builds have been succeeding just fine (which is a little misleading, since they are just building from old content). Is it possible to make a later task only execute if an earlier task succeeded?

While we are on this topic, in the future it would be nice to be able to do this:

  • ostree failed to generate => don't build images
  • ostree generated successfully => build images
  • ostree reported no changes, so there was no reason to generate a new ostree => don't build images, but this is not a failure

Thoughts?


I like the idea, and agree it would make sense to do something like this. No point rebuilding an image if nothing in it changed.

When talking about images though, which image do you mean? The ones produced by ostree_installer phase? The ones in image_build phase?

The most difficult part here I think is mapping the image to the corresponding ostree repo. There is currently no way to tell this on the ostree repo level.

I guess we could do it on variant level. That would mean a variant could only have one ostree repo.

We could then store for each variant the status of ostree (not-configured, updated, no-change, failed). When building an image for the variant, we would first check this status: if it's no-change, skip the images without reporting errors; if it's failed, skip the images but report them as missing. If it's not-configured or updated, the image would be built normally.

If done like this, another consequence would be that all images in a variant with ostree would be affected by this, no matter if they actually use the ostree bits or not.
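
To make the proposal concrete, here is a minimal sketch of the per-variant check. The names (`OstreeStatus`, `image_build_decision`) are made up for illustration and are not existing pungi code; only the four status values come from the description above.

```python
from enum import Enum


class OstreeStatus(Enum):
    """Hypothetical per-variant ostree status (the four values proposed above)."""
    NOT_CONFIGURED = "not-configured"
    UPDATED = "updated"
    NO_CHANGE = "no-change"
    FAILED = "failed"


def image_build_decision(status):
    """Return (build, report_missing) for the images of one variant.

    build          -- True if the image phases should run for this variant
    report_missing -- True if the skipped images should be reported as missing
    """
    if status in (OstreeStatus.NOT_CONFIGURED, OstreeStatus.UPDATED):
        # No ostree in this variant, or a fresh ostree commit: build normally.
        return True, False
    if status is OstreeStatus.NO_CHANGE:
        # Nothing changed: skip the images, but this is not an error.
        return False, False
    if status is OstreeStatus.FAILED:
        # ostree failed: skip the images and report them as missing.
        return False, True
    raise ValueError("unknown ostree status: %r" % (status,))
```

Every image phase for the variant would call this before starting, which is also why all images in the variant would be affected regardless of whether they use the ostree bits.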

I like the idea, and agree it would make sense to do something like this. No point rebuilding an image if nothing in it changed.
When talking about images though, which image do you mean? The ones produced by ostree_installer phase? The ones in image_build phase?

if ostree for x86_64 fails then no need to do ostree_installer nor image_build phase for x86_64

The most difficult part here I think is mapping the image to the corresponding ostree repo. There is currently no way to tell this on the ostree repo level.

what's the question?

I guess we could do it on variant level. That would mean a variant could only have one ostree repo.
We could then store for each variant the status of ostree (not-configured, updated, no-change, failed). When building an image for the variant, we would first check this status: if it's no-change, skip the images without reporting errors; if it's failed, skip the images but report them as missing. If it's not-configured or updated, the image would be built normally.
If done like this, another consequence would be that all images in a variant with ostree would be affected by this, no matter if they actually use the ostree bits or not.

yeah maybe this is complicated. I was just trying to say if ostree phase for Atomic variant for x86_64 fails then no reason to do ostree_installer or image_build.

One issue that's quite related to all of this is - the second we start using rpm-ostree's built in change detection (as we are now), it's going to cause version number desync with the cloud images unless we also avoid building cloud images (as you suggest here).

Except here's where things get tricky - in some cases we might push a change to just the kickstart files which only affect cloud images, and we do want to rebuild there. So pungi would need to gain some logic to do something similar to what rpm-ostree does in recording an "input hash" or git commits.
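
A rough sketch of what such an "input hash" could look like for cloud images. This is an assumption for illustration only: it is not how rpm-ostree computes its own hash, and `input_hash` and `needs_rebuild` are hypothetical names, not pungi functions. The idea is to hash everything that feeds the image (kickstart contents plus the ostree commit) and rebuild only when that hash differs from the one recorded for the previous compose.

```python
import hashlib


def input_hash(kickstart_contents, ostree_commit):
    """Hash the inputs of an image build.

    kickstart_contents -- dict mapping kickstart file name to its bytes
    ostree_commit      -- ostree commit checksum the image is built from (str)
    """
    digest = hashlib.sha256()
    # Sort by name so the result does not depend on dict ordering.
    for name in sorted(kickstart_contents):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(kickstart_contents[name])
        digest.update(b"\0")
    digest.update(ostree_commit.encode("utf-8"))
    return digest.hexdigest()


def needs_rebuild(previous_hash, current_hash):
    """Rebuild when there is no recorded hash or any input changed."""
    return previous_hash != current_hash
```

With this, a kickstart-only change yields a new hash and triggers an image rebuild even when the ostree commit is unchanged.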

Backing up to a higher level problem: For the "unified compose" model where e.g. FAH and FAW come out of the "big Everything nightly" compose, today if e.g. someone pushes a change to a package not in either of those (say KDE, some random game or node.js module or whatever) - the compose will have a different version number from the ostree commits.

I am not sure how to really fix this without really splitting up the streams so they have distinct versions. Which is basically what we in fact do with the two-week composes.

I think my overall feeling on this is: It's critical to make the RPM batching really work and use change detection for the ostree for real release streams. The "pointless updates" problem is bad and embarrassing. But for devel streams like rawhide...eh. (Though obviously things are going to work a lot better if rawhide works like a release stream too...)

One issue that's quite related to all of this is - the second we start using rpm-ostree's built in change detection (as we are now), it's going to cause version number desync with the cloud images unless we also avoid building cloud images (as you suggest here).

Agree.

Except here's where things get tricky - in some cases we might push a change to just the kickstart files which only affect cloud images, and we do want to rebuild there. So pungi would need to gain some logic to do something similar to what rpm-ostree does in recording an "input hash" or git commits.

Yes, I suppose this would be a separate RFE?

Backing up to a higher level problem: For the "unified compose" model where e.g. FAH and FAW come out of the "big Everything nightly" compose, today if e.g. someone pushes a change to a package not in either of those (say KDE, some random game or node.js module or whatever) - the compose will have a different version number from the ostree commits.

Yes, but is that a problem? If we implement the RFE requested by this issue then ostree will kick out and say "no changes", and then any derivative artifacts won't attempt to be created. Which means no ISOs/Cloud images (that had the wrong version #) would get created, right?

I am not sure how to really fix this without really splitting up the streams so they have distinct versions. Which is basically what we in fact do with the two-week composes.

what are "streams" here? FAH and FAW? or are you referring to FAH updates stream and FAH testing stream?

I think my overall feeling on this is: It's critical to make the RPM batching really work and use change detection for the ostree for real release streams. The "pointless updates" problem is bad and embarrassing. But for devel streams like rawhide...eh. (Though obviously things are going to work a lot better if rawhide works like a release stream too...)

Now that FAW can properly not compose when no changes are made and we can force FAH to compose even when no changes are made, what are we missing (other than this RFE, of course, so that we don't have to force-nocache FAH)?

what are "streams" here? FAH and FAW? or are you referring to FAH updates stream and FAH testing stream?

Yes to both. Also rawhide vs f27 for those. Basically the full matrix - which is actually today reified as ostree refspecs as was the original intent!

Now that FAW can properly not compose when no changes are made and we can force FAH to compose even when no changes are made, what are we missing (other than this RFE, of course, so that we don't have to force-nocache FAH)?

Right, using force-nocache always for FAH is probably an OK short term hack. The problem became urgent for FAW because we don't have a batching process there.

But I do think this whole thing is going to keep tripping us up. Another corner case is say when we want to respin an ISO but the ostree content doesn't change (though this should be relatively rare).

Yes to both. Also rawhide vs f27 for those. Basically the full matrix - which is actually today reified as ostree refspecs as was the original intent!

I really like that full matrix and the ostree refspecs btw :)

Right, using force-nocache always for FAH is probably an OK short term hack. The problem became urgent for FAW because we don't have a batching process there.

+1

But I do think this whole thing is going to keep tripping us up. Another corner case is say when we want to respin an ISO but the ostree content doesn't change (though this should be relatively rare).

Yeah that is a problem, but I think we can force things easily enough by cosmetically changing the manifest or treecompose post? It would be a hack, but I think it would work, right?
