#3555 Better support for microservicing
Opened 2 years ago by karsten. Modified 2 years ago

One of the problems when microservicing Pagure is that a lot of parts are interconnected in a way that makes it very hard to decouple them and run them as separate services/pods. Even though this will be a major effort, we should put some work into this. Specifically, it would be great to have these points fixed:

It'd be ideal if the git hooks didn't import pagure directly, but rather used (newly created) API endpoints. This way, the sshd service could be just a simple sshd and wouldn't have to carry all Pagure dependencies (this would also mean we'd be able to have a separate image for it, which we could upgrade separately).
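To illustrate the idea, here is a minimal sketch of a post-receive hook that only uses the Python standard library and reports the push over HTTP instead of importing pagure. No such endpoint exists yet, so the URL, the `PAGURE_HOOK_URL` / `PAGURE_HOOK_TOKEN` environment variables, the `Authorization` scheme and the JSON layout are all assumptions made up for this example. The only part that comes from git itself is the `<old> <new> <refname>` line format that post-receive gets on stdin.

```python
#!/usr/bin/env python3
"""Sketch of a post-receive hook that reports pushes over HTTP.

Everything Pagure-specific here is hypothetical: the endpoint, the
PAGURE_HOOK_URL / PAGURE_HOOK_TOKEN variables and the payload layout.
"""
import json
import os
import sys
import urllib.request


def parse_ref_updates(lines):
    """Parse the '<old> <new> <refname>' lines git feeds to post-receive."""
    updates = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # ignore blank or malformed lines
        old, new, ref = parts
        updates.append({"old": old, "new": new, "ref": ref})
    return updates


def build_request(url, token, repo, updates):
    """Build the POST request; the JSON layout is an assumption."""
    body = json.dumps({"repo": repo, "updates": updates}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "token " + token,  # hypothetical auth scheme
        },
    )


def main():
    url = os.environ["PAGURE_HOOK_URL"]      # hypothetical variable
    token = os.environ["PAGURE_HOOK_TOKEN"]  # hypothetical variable
    repo = os.path.basename(os.getcwd())     # hooks run inside the repository
    updates = parse_ref_updates(sys.stdin)
    if not updates:
        return 0
    request = build_request(url, token, repo, updates)
    # urlopen raises an exception on network errors and on HTTP 4xx/5xx.
    with urllib.request.urlopen(request, timeout=10) as response:
        return 0 if response.status < 300 else 1


# Only run when the hook has been configured with an endpoint.
if __name__ == "__main__" and "PAGURE_HOOK_URL" in os.environ:
    sys.exit(main())
```

With something along these lines, the sshd image would only need git and a Python interpreter; all the logic currently run by the hooks would live behind the API.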

Another problem that we have right now is that we have 3 kinds of services that need to see the disk volume with repositories: server, workers and sshd. In a microservices world, there'd ideally be just one microservice actually touching the data and the rest would be using its API ([1]). This would probably require a huge effort, but we should at least try to see if it's possible for some of the services.
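As a rough sketch of what "one microservice touching the data" could look like, the snippet below is a tiny read-only HTTP service that is the only process mounting the repository volume; other services would ask it for the list of repositories instead of walking the disk themselves. This is not how Pagure works today: the `/repos` route, the `REPO_ROOT` variable and its default path are invented for the example, and a real service would need to cover writes, authentication and far more than listing.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical location of the repository volume.
REPO_ROOT = os.environ.get("REPO_ROOT", "/srv/git/repositories")


def list_repos(root):
    """Return directories ending in '.git' under root, as sorted relative paths."""
    found = []
    for dirpath, dirnames, _files in os.walk(root):
        for name in list(dirnames):
            if name.endswith(".git"):
                found.append(os.path.relpath(os.path.join(dirpath, name), root))
                dirnames.remove(name)  # do not descend into the repository itself
    return sorted(found)


class RepoHandler(BaseHTTPRequestHandler):
    """Serves GET /repos (a made-up route) as a JSON document."""

    def do_GET(self):
        if self.path != "/repos":
            self.send_error(404)
            return
        body = json.dumps({"repos": list_repos(REPO_ROOT)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the service until interrupted."""
    HTTPServer(("0.0.0.0", port), RepoHandler).serve_forever()
```

Calling `serve()` in the pod that mounts the volume would be enough to try it out; server, workers and sshd would then only need network access to that pod rather than the shared volume.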

[1] Our problems with this made us run all these services on the same physical node in UpShift, which is problematic - if the node fails, we don't have a deployment. The reason we need to run them on the same node is that these services immediately use data written by each other - when they run on different nodes, NFS doesn't sync fast enough, which results in all kinds of failures.
(Bohuslav Kabrda)

I wonder if the repoSpanner integration that @puiterwijk is working on could help with this.

Because short of re-writing git/pygit2 I'm not quite sure how this would be achievable otherwise.

I like the idea of having the hooks use the REST APIs instead of having a dependency on pagure. That would also make setting up the dev environment in a container easier.

I'm also +1 on this, also for security reasons (I even thought we already had a ticket for this)

With the repoSpanner integration, the local filesystem assumption/requirement for repos should go away.
So with that, the only remaining challenge for adding multiple pods would be attachments (which I've got a patch for).
With repoSpanner, the hooks are also moved out of the ssh pod, into the repoSpanner one (still less than ideal, but inside there it's severely limited).

Metadata Update from @karsten:
- Issue assigned to karsten

2 years ago

I don't quite have a time estimate for this ticket, given the changes that repoSpanner already brings in and the different options available:
- Do we want to move the entire logic into one API? Multiple?
- Do we want to allow the sshd pod to query the database (R/O)?
- Do we want to split the pagure source code to have something more lightweight on the sshd pod, where only the code used in the hooks would be present?

As said earlier this is likely not going to be a small task.
