Learn more about these different git repos.
Other Git URLs
Running git hooks (to be more precise the dynamic authentication plugin and subclasses of `BaseRunner) in sshd has several disadvantages:
`BaseRunner
The solution that I'd like to propose is that we make hookrunner a very lightweight process that only gathers basic info and triggers an internal endpoint of Pagure server. Running the hooks inside Pagure server will actually be pretty cheap, since it has all the libraries and connections initialized and should be able to execute the hooks very fast.
hookrunner
After some poking into this, I'd split this into two parts:
writer
print
The advantage of the two-step approach is that even if we later figure out that the approach is generally not viable, we can still use the writer object to better control what/when/how gets printed to user instead of letting hooks print stuff randomly (if we ever do need to control this).
Metadata Update from @pingou: - Issue tagged with: RFE
Would we need one or two writers then? (stdout vs stderr)
Also, I'm wondering if the first step isn't decoupled from the second (and thus we could separate both issues)
@pingou good point with 2 writers. It'd be possible to just pass a writer object that would have out() and err() methods, but I think this might be a problem, We'd need to record when every line was emitted to be able to print them to user in the correct order. Since user can't really tell which lines are printed to stdout and which to stderr, I'd suggest that we create just writer.out() method and leave the space open for implementing the writer.err() method if we ever see a usecase for it (or we make err an alias of out in our default implementation).
out()
err()
writer.out()
writer.err()
err
out
And yes, it was exactly my point to separate the two issues, even though the second one depends on the first.
Metadata Update from @bkabrda: - Issue untagged with: RFE
I'm not quite sure the second depends on the first, the second could just print to the user the output of the API call
Well if the API endpoint has no means of gathering the output of hooks it runs, it can't return the output to hookrunner and that can't display it to user. Technically the tasks don't depend on each other in terms of hook execution, but they do in terms of getting proper feedback to user.
Well if the API endpoint has no means of gathering the output of hooks it runs, it can't return the output to hookrunner and that can't display it to user.
Duh! You are of course right :)
The problem I can see with this is a matter of scaling. You'd bottleneck on the central pagure server, which you can only have one instance of. How do we deal with needing the handle more webhook runs?
@ngompa I've previously successfully scaled Pagure server to more replicas (it's possible if you use redis-based sessions). Is there any new issue with scaling up server that I don't know about?
Metadata Update from @pingou: - Issue set to the milestone: 5.3
Fixing this ticket will also fix: https://pagure.io/pagure/issue/2874
Metadata Update from @karsten: - Issue assigned to karsten
So I'm going to try to explain why after discussing this with @puiterwijk we believe this is not going to be something that we will be able to fix.
Basically, with repoSpanner the git hook does not have access to the git objects being pushed at the time they run, which means we are not able to rely on them existing, if we create the API endpoint to accept enough information to act, then we end up with basically sending the entire git objects to the API endpoint.
In addition, from the discussion we had it turns out that most of the hooks we have should be moved from update to post-receive and most of the pre-receive should become update (but we can fix this in another commit/PR later).
So I'm going to close this ticket as Won't Fix since it's not something we can fix :(
Metadata Update from @pingou: - Issue close_status updated to: Won't Fix - Issue status updated to: Closed (was: Open)
Login to comment on this ticket.