#10430 Site down: https://copr.fedorainfracloud.org/
Closed: Fixed with Explanation 2 years ago by kevin. Opened 2 years ago by jlvillal.

The site https://copr.fedorainfracloud.org/ is down at the moment.


It seems systemd-oomd killed httpd. ;(

...
Dec 18 21:39:53 copr-fe.aws.fedoraproject.org systemd[1]: httpd.service: Killing process 2521997 (httpd) with signal SIGKILL.
Dec 18 21:39:53 copr-fe.aws.fedoraproject.org systemd[1]: httpd.service: systemd-oomd killed 563 process(es) in this unit.
Dec 18 21:39:56 copr-fe.aws.fedoraproject.org systemd[1]: httpd.service: Failed with result 'signal'.
Dec 18 21:39:56 copr-fe.aws.fedoraproject.org systemd[1]: httpd.service: Consumed 4d 3h 22min 29.629s CPU time.

I restarted it.

CC: @praiskup

I don't know if we need to add more memory or more swap, or adjust systemd-oomd. I will leave that to the copr folks to decide. :)

Metadata Update from @kevin:
- Issue close_status updated to: Fixed with Explanation
- Issue status updated to: Closed (was: Open)


Thanks for reporting this, and bringing it up again quickly!

@jorton, FTR, setting OOMPolicy=continue did not help here (1947475).

systemd-oomd killed 563 process(es) in this unit.

Hmm, even now:

$ pstree 2522571 -a -p -c | wc -l
512

This sounds like too much; we should take a look at what is going on here.

OK, pstree also shows threads by default. That is about 2x as many threads as I would expect, but it is not that bad. With pstree -T (which hides threads) it is just "31" processes, which sounds OK.
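
For anyone who wants to double-check the process vs. thread split without pstree, here is a minimal sketch that reads /proc directly. The function name tree_summary is made up for illustration, it only looks at the given PID and its direct children (not deeper descendants, unlike pstree), and it assumes a Linux /proc layout.

```shell
#!/bin/sh
# Hypothetical helper, not part of any Copr tooling.
# Counts processes and threads for a parent PID plus its direct children.
tree_summary() {
    parent="$1"
    procs=0
    threads=0
    for dir in /proc/[0-9]*; do
        pid="${dir#/proc/}"
        # PPid: line in /proc/<pid>/status holds the parent PID.
        ppid=$(awk '/^PPid:/ {print $2}' "$dir/status" 2>/dev/null)
        if [ "$pid" = "$parent" ] || [ "$ppid" = "$parent" ]; then
            # Each entry under /proc/<pid>/task is one thread of that process.
            n=$(ls "$dir/task" 2>/dev/null | wc -l)
            procs=$((procs + 1))
            threads=$((threads + n))
        fi
    done
    echo "processes=$procs threads=$threads"
}

# Example call, using the main httpd PID from the pstree command above:
#   tree_summary 2522571
```

The numbers will not match pstree line counts exactly, since pstree formats its output differently, but the ratio of threads to processes should tell the same story.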

Moved to our TODO list to resolve the automatic restart, at least.
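
One possible way to get the automatic restart is a systemd drop-in for httpd.service. This is only a sketch and not necessarily what ends up deployed: the helper name write_restart_dropin, the file name restart.conf, and the 5 second delay are all assumptions. Restart= and RestartSec= are standard systemd.service options, and Restart=on-failure covers a unit that ends with an unclean signal, as in the log above.

```shell
#!/bin/sh
# Hypothetical helper: writes a drop-in that asks systemd to restart httpd
# after a failure. The target directory defaults to the usual drop-in
# location for httpd.service; pass another directory to try it out safely.
write_restart_dropin() {
    dir="${1:-/etc/systemd/system/httpd.service.d}"
    mkdir -p "$dir"
    cat > "$dir/restart.conf" <<'EOF'
[Service]
Restart=on-failure
RestartSec=5s
EOF
}

# After writing to the real location (as root), systemd needs to reload:
#   write_restart_dropin
#   systemctl daemon-reload
```

This only shortens the outage; it does not address why memory pressure got high enough for systemd-oomd to act in the first place.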
