#11887 504 when authenticating through id.fedoraproject.org
Closed: Fixed with Explanation a month ago by kevin. Opened a month ago by zlopez.

Describe what you would like us to do:


Today I'm getting 504 - Gateway Timeout when trying to log in to different Fedora services.
I also noticed plenty of http-accounts errors in nagios.

When do you need this to be done by? (YYYY/MM/DD)


asap


In the sssd log on ipsilon01 I found plenty of:

(2024-04-19 11:39:44): [pam] [cache_req_common_process_dp_reply] (0x3f7c0): [CID#5330] CR #36198: Could not get account info [1432158212]: SSSD is offline

But when I check the sssd service, it seems to be running.
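
To see when the "SSSD is offline" entries cluster, a rough sketch like the one below could bucket them per minute. The regular expression is modelled only on the single log line quoted above, so other sssd versions or responders may need a different pattern. The function name `count_offline_per_minute` is made up for illustration, and the log file path is not given in this ticket, so the sketch takes an iterable of lines instead of opening a file.

```python
import re
from collections import Counter

# Assumption: every relevant entry looks like the line quoted in this ticket,
# i.e. "(YYYY-MM-DD HH:MM:SS): [responder] ... SSSD is offline".
OFFLINE_RE = re.compile(
    r"^\((?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\): "
    r"\[(?P<responder>\w+)\].*SSSD is offline\s*$"
)


def count_offline_per_minute(lines):
    """Count 'SSSD is offline' entries, keyed by 'YYYY-MM-DD HH:MM'."""
    counts = Counter()
    for line in lines:
        match = OFFLINE_RE.match(line)
        if match:
            # Drop the seconds so entries are grouped per minute.
            counts[match.group("ts")[:16]] += 1
    return counts


sample = [
    "(2024-04-19 11:39:44): [pam] [cache_req_common_process_dp_reply] "
    "(0x3f7c0): [CID#5330] CR #36198: Could not get account info "
    "[1432158212]: SSSD is offline",
    "(2024-04-19 11:39:45): [pam] some unrelated message",
]
offline_counts = count_offline_per_minute(sample)
print(offline_counts)
```

A running sssd service and an "offline" sssd are not contradictory: as far as I understand it, "offline" describes the backend's view of its identity provider, not the state of the daemon itself.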

I found this in the httpd error_log on ipsilon01 when tracking the transaction_id:

[Fri Apr 19 12:07:35.906225 2024] [wsgi:error] [pid 2370894:tid 2371055] [client 10.3.163.74:45548] Timeout when reading response headers from daemon process 'ipsilon': /usr/libexec/ipsilon/ipsilon, referer: https://ipa01.iad2.fedoraproject.org/login/gssapi/negotiate?ipsilon_transaction_id=18c74230-a5f3-4b08-8fec-7fc3bc047ab3
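
To collect every transaction that hit this timeout, a small sketch along these lines could pull the `ipsilon_transaction_id` values out of the error_log. It assumes the timeout entries all carry the same message text and a UUID-style ID in the referer, as in the line above. The helper name `timed_out_transactions` is hypothetical, and the function takes lines rather than a path because the log location is not stated here.

```python
import re

# Assumption: both patterns are taken from the single error_log line quoted
# above; other mod_wsgi or Ipsilon versions may word things differently.
TIMEOUT_MARKER = "Timeout when reading response headers from daemon process"
TXN_RE = re.compile(r"ipsilon_transaction_id=([0-9a-f-]{36})")


def timed_out_transactions(lines):
    """Return the transaction IDs found on wsgi timeout lines, in order."""
    ids = []
    for line in lines:
        if TIMEOUT_MARKER not in line:
            continue
        match = TXN_RE.search(line)
        if match:
            ids.append(match.group(1))
    return ids


sample = [
    "[Fri Apr 19 12:07:35.906225 2024] [wsgi:error] [pid 2370894:tid 2371055] "
    "[client 10.3.163.74:45548] Timeout when reading response headers from "
    "daemon process 'ipsilon': /usr/libexec/ipsilon/ipsilon, referer: "
    "https://ipa01.iad2.fedoraproject.org/login/gssapi/negotiate"
    "?ipsilon_transaction_id=18c74230-a5f3-4b08-8fec-7fc3bc047ab3",
]
timed_out = timed_out_transactions(sample)
print(timed_out)
```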

Metadata Update from @zlopez:
- Issue untagged with: high-gain
- Issue tagged with: low-gain

a month ago

I'm also seeing this repeatedly. It's not letting me log in to Fedora services here, so this is a little urgent, I guess.

Metadata Update from @zlopez:
- Issue untagged with: low-gain
- Issue tagged with: high-gain

a month ago

There was an update of the access switches in the IAD2 datacenter, which finished at 13:18 UTC, but the issue persists, so the update probably wasn't the root cause.

I also noticed that login requests sometimes finish correctly, but in most cases I'm getting a 504.

This is causing my new-repository requests to be closed as invalid because the releng bot cannot find me in FAS: https://pagure.io/releng/fedora-scm-requests/issue/61861

Can anyone who was seeing this please check again now and see if anything like it is still happening?

If it is, please list the app you were trying to login to/auth against and time/date.

I can log in to copr now using FAS.

I think everything is back to normal now.

I am still not fully sure what the cause was; it seemed to be several issues at once. The database server for accounts was under heavy load, and the ipa cluster seemed to be in an odd state.

Please report any further issues...

Metadata Update from @kevin:
- Issue close_status updated to: Fixed with Explanation
- Issue status updated to: Closed (was: Open)

a month ago