#12291 pagure returning 500s due to error in flask_oidc
Closed: Fixed 18 days ago by abompard. Opened 23 days ago by adamwill.

Describe what you would like us to do:

Fix a problem causing Pagure to return 500 Internal Server Error for logged-in users sometimes.

This is affecting me and others in https://pagure.io/fedora-infrastructure/issue/12260 (but that issue is for something different, so I'm filing this). For me, I get a 500 any time I access pagure.io in Firefox, but it works in epiphany and when using curl. Looking on the server end, I found this in the logs:

[Fri Nov 15 08:51:03.418011 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058] 2024-11-15 08:51:03,417 [INFO] flask_oidc: Could not refresh token {'access_token': '[UUID-like string of 70 chars]', 'expires_at': 1731635156, 'expires_in': 3600, 'refresh_token': '[UUID-like string of 168 chars]', 'token_type': 'Bearer'}: invalid_grant: Something went wrong refreshing
[Fri Nov 15 08:51:03.419763 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058] mod_wsgi (pid=830440): Exception occurred processing WSGI script '/var/www/pagure.wsgi'.
[Fri Nov 15 08:51:03.420899 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058] Traceback (most recent call last):
[Fri Nov 15 08:51:03.420950 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1997, in __call__
[Fri Nov 15 08:51:03.420957 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]     return self.wsgi_app(environ, start_response)
[Fri Nov 15 08:51:03.420966 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]   File "/usr/lib/python3.6/site-packages/whitenoise/base.py", line 85, in __call__
[Fri Nov 15 08:51:03.420971 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]     return self.application(environ, start_response)
[Fri Nov 15 08:51:03.420980 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]   File "/usr/lib/python3.6/site-packages/pagure/proxy.py", line 74, in __call__
[Fri Nov 15 08:51:03.420985 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]     return self.app(environ, start_response)
[Fri Nov 15 08:51:03.420993 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1993, in wsgi_app
[Fri Nov 15 08:51:03.420998 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]     ctx.auto_pop(error)
[Fri Nov 15 08:51:03.421007 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]   File "/usr/lib/python3.6/site-packages/flask/ctx.py", line 387, in auto_pop
[Fri Nov 15 08:51:03.421012 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]     self.pop(exc)
[Fri Nov 15 08:51:03.421020 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]   File "/usr/lib/python3.6/site-packages/flask/ctx.py", line 353, in pop
[Fri Nov 15 08:51:03.421025 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]     self.app.do_teardown_request(exc)
[Fri Nov 15 08:51:03.421033 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1879, in do_teardown_request
[Fri Nov 15 08:51:03.421039 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]     func(exc)
[Fri Nov 15 08:51:03.421047 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]   File "/usr/lib/python3.6/site-packages/pagure/flask_app.py", line 536, in end_request
[Fri Nov 15 08:51:03.421053 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]     flask.g.session.remove()
[Fri Nov 15 08:51:03.421069 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]   File "/usr/lib/python3.6/site-packages/werkzeug/local.py", line 347, in __getattr__
[Fri Nov 15 08:51:03.421074 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058]     return getattr(self._get_current_object(), name)
[Fri Nov 15 08:51:03.421095 2024] [wsgi:error] [pid 830440:tid 140652618757888] [remote 2001:569:70c8:c900:351f:a8d1:3133:6620:49058] AttributeError: '_AppCtxGlobals' object has no attribute 'session'

I have edited the actual tokens for obvious reasons, you can see the real values in /var/log/httpd/error_log on pagure02 if you have access.

When do you need this to be done by? (YYYY/MM/DD)

ASAP.


So we could work around this easily enough:

diff --git a/pagure/flask_app.py b/pagure/flask_app.py
index 3481e957..8d5c6208 100644
--- a/pagure/flask_app.py
+++ b/pagure/flask_app.py
@@ -529,7 +529,8 @@ def end_request(exception=None):
         Details: https://pagure.io/pagure/issue/2302

     """
-    flask.g.session.remove()
+    if hasattr(flask.g, "session"):
+        flask.g.session.remove()
     gc.collect()

but is that right? Do we need to figure out why flask.g doesn't have session here? Google finds only this suspiciously spammy page: https://www.programmersought.com/article/436010733176/

So I think what's happening here is that through flask's authentication stuff and the flask-oidc plugin, we actually wind up hitting end_request without ever (in this instance of the app) having gone through set_request, so there just is no flask.g.session. Given that, I think https://pagure.io/pagure/pull-request/5510 is probably an acceptable fix.

Clearing pagure.io cookies does solve this for an individual user, as expected.

I was directed here after I reported an error 500 from https://src.fedoraproject.org/fork/music/rpms/python-setuptools-gettext/diff/rawhide..pyproject (or any attempt to open a PR on that project) in https://pagure.io/fedora-infrastructure/issue/12260#comment-944421, but I think this is not the same issue I am seeing either, because clearing Pagure cookies does not help and even curl https://src.fedoraproject.org/fork/music/rpms/python-setuptools-gettext/diff/rawhide..pyproject demonstrates the problem.

Yeah, that's not the same thing. Please open another issue. I'll try and find the failure on the server end.

Yeah, that's not the same thing. Please open another issue. I'll try and find the failure on the server end.

https://pagure.io/fedora-infrastructure/issue/12292

Oof, tricky thing about this: it comes back. It's not just a one-time thing. I just started getting 500s again - presumably my token expired and needed to be refreshed. Had to clear pagure.io cookies again to get rid of the 500s, and re-login.

Metadata Update from @phsmoura:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: medium-gain, medium-trouble, ops

19 days ago

Metadata Update from @phsmoura:
- Issue tagged with: outage

19 days ago

Metadata Update from @phsmoura:
- Issue untagged with: outage

19 days ago

Metadata Update from @abompard:
- Issue assigned to abompard

18 days ago

I've hotfixed it in prod until we deploy a new version with you PR merged in.

Metadata Update from @abompard:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

18 days ago

Log in to comment on this ticket.

Metadata
Boards 1
ops Status: Backlog