#221 Tolerate invalid gating.yaml files in some reasonable way
Closed: Fixed 5 years ago Opened 5 years ago by dcallagh.

It's possible that a package maintainer could accidentally commit a gating.yaml file which is invalid or broken in some way. Greenwave currently does not attempt to handle that situation at all and I expect it will just give a 500 response when asked about a decision in that case.

(A good first step in solving this issue would be to just enumerate all the different possible ways a gating.yaml file might be "invalid" or broken, and confirm how Greenwave handles those situations today.)

But we anticipate there might be situations where the gating.yaml file is broken in some way but the build needs to be shipped anyway. That's what we have Waiverdb (when the automation says it should not pass the gate, but we want to push it through the gate anyway). But Waiverdb is no help to us if Greenwave is returning a 500 response for the decision.

We should harden Greenwave's handling of the gating.yaml so that it can give more specific information in its decision, if the gating.yaml is invalid -- and then we should make sure there is a way to waive the remote checks in that case.


Metadata Update from @gnaponie:
- Issue assigned to gnaponie

5 years ago

Ralph and I discussed about it in our 1x1. And I'm going with this approach:
- let's skip the check when something goes in a wrong way (cases that I have in mind: wrong yaml, RemoteRule inside the gating.yaml, I'll think about something else).
- we are going to add a "fake" result with "outcome" == "NEEDS_INSPECTION" among the results returned in the "decision" API with the information that something was wrong, so that the user will realize that his/her gating.yaml is wrong and he/she can correct it.

How does that sound?
Otherwise instead of the result we can add a key in the response that it will be something like "errors" that will be visible with "verbose==True" and it will explain to the user what was the problem. Not sure which one is better.

In case of RemoteRule configured inside the gating.yaml file, now we raise a RuntimeError. Same behavior for other cases (not related to a misconfigured RemoteRule policy) handled in the "validate_policies" function. Examples: Policies missing some required attribute (ex. subject_type).

For the moment, I'm going to change just the RemoteRule case. Eventually, if we think we should handle also the other cases in the same way, we can submit another PR.

I'm writing an update after the meeting I had with @dcallagh and @lholecek.
We decided that it would be maybe better to just not-return a 500 like it is now, and instead of accepting an invalid gating.yaml and return a fake result (that probably the user will never see) as in the #234 PR, we can return (from the decision API) with policies_satisfied = False and a "summary" that indicates that there was a problem with the gating.yaml.
And in the end, greenwave can check if there is a specific waiver in waiverdb with testcase that can be something like "invalid.gating.yaml" and if there is, it will ignore the problem. Of course, in a second part, we are going to document everything...

How does that sound? I hope I made a good summary.
@ralph do you agree with this approach?

Yes, that's much better. :)

Metadata Update from @dcallagh:
- Issue close_status updated to: Fixed
- Issue set to the milestone: 0.8
- Issue status updated to: Closed (was: Open)

5 years ago

Login to comment on this ticket.

Metadata