OpenVPN appears to be taking a long time to route traffic from phx to tummy.com. There might be some way we can optimize this.
One symptom of the time it's taking is that hitting either of these two pages in the packagedb yields a proxy error when the request is served from app5:
https://admin.fedoraproject.org/pkgdb/acls/vcs?tg_format=json
https://admin.fedoraproject.org/pkgdb/acls/bugzilla?tg_format=json
Note that these URLs have to retrieve a large amount of data from the database which could be part of the problem.
The problem can be reproduced by logging into app5 and running:
wget http://localhost:8086/pkgdb/acls/bugzilla?tg_format=json
Note that when run on app5, this will eventually retrieve the page.
I'm not sure whether this is an openvpn problem, but we do need to run some benchmarks to see exactly how much of a slowdown to expect for our apps when running over a non-LAN link. Interestingly, the above command took about 2:40 to run over the vpn and almost 6:30 to run via an ssh tunnel.
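For the benchmarks mentioned above, something along these lines could be used to time the same request repeatedly over each link. This is only a rough sketch: the helper names `time_call` and `fetch` are made up for illustration, and the 600 second timeout is an arbitrary assumption chosen to outlast the multi-minute runs seen so far.

```python
import time
import urllib.request


def time_call(fn, repeats=3):
    """Call fn() `repeats` times and return the wall-clock duration of each call in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def fetch(url, timeout=600):
    """Download the whole response body and return its size in bytes.

    The timeout is an assumed value, not a measured requirement.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return len(resp.read())
```

Running `time_call(lambda: fetch("http://localhost:8086/pkgdb/acls/bugzilla?tg_format=json"))` once with the database reached over the vpn and once over the ssh tunnel would give comparable numbers for both paths.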
The particular problem with pkgdb was solved by adding a method to fas that retrieves information about all of the users in the db at once, instead of retrieving the information user by user. Using this new method in the acls pages cut the time to:
5s  | wget http://localhost:8086/pkgdb/acls/bugzilla?tg_format=json
14s | wget http://localhost:8086/pkgdb/acls/vcs?tg_format=json
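To illustrate why the bulk method helps so much over a high-latency link, here is a toy model that counts round trips instead of touching the network. Everything in it is hypothetical: `FakeAccountService`, `get_user`, `get_all_users`, and the two `acls_*` functions are stand-ins and not the real fas or pkgdb API, and any round-trip time passed in is an assumed figure, not a measurement.

```python
class FakeAccountService:
    """Stand-in for a remote account system; counts round trips, no real network."""

    def __init__(self, users, rtt_seconds):
        self._users = users
        self.rtt_seconds = rtt_seconds  # assumed per-request latency
        self.round_trips = 0

    def get_user(self, username):
        """One round trip per user (the old behaviour)."""
        self.round_trips += 1
        return self._users[username]

    def get_all_users(self):
        """One round trip for every user (the new bulk method)."""
        self.round_trips += 1
        return dict(self._users)

    def estimated_latency(self):
        """Time spent purely waiting on the link, ignoring transfer and query time."""
        return self.round_trips * self.rtt_seconds


def acls_per_user(service, owners):
    # Look up each owner separately: len(owners) round trips.
    return {name: service.get_user(name)["email"] for name in owners}


def acls_bulk(service, owners):
    # Fetch everyone once, then filter locally: a single round trip.
    everyone = service.get_all_users()
    return {name: everyone[name]["email"] for name in owners}
```

With 1000 owners and an assumed 50 ms round trip, the per-user version spends about 50 seconds just waiting on the link, while the bulk version waits for one round trip. The real numbers depend on the actual latency and the number of users, neither of which is recorded in this ticket.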
So the slowdown appears to come from latency between the two colos. This is probably to be expected; we'll just have to remember that any app we develop in the same colo as the database server also needs to be tested from a different colo to catch such issues.
Another note: after seeing the way SQLObject behaves in mirrormanager, I'm guessing that SQLObject-based apps will suffer from this more than SQLAlchemy (SA) apps.
Do you want to leave this open for benchmarks or should we close this?
Slightly related note: vpn1 is built and running openvpn - maybe we can look at a switchover at some point.
I'm no longer actively working on this but do not want to close it.
Just rebuilt vpn1 (it was lost in the Incident).
This is all a network thing, not an openvpn thing. We've moved the remote servers to be backups of the phx local servers.