#2125 PHX2 outage

Created 6 years ago by mmcgrath

= phenomenon =
PHX2 was completely unavailable. This took the buildsystem offline, along with all apps that require our common storage data layer.

= reason =
Power outage. Hosts came back online but the network was unavailable. Most services were back online within two and a half hours. pkgdb took longer because bugzilla was offline. Some services, like mirrormanager, were only down for a couple of minutes while we put temporary fixes in place to keep them online. docs, the fedoraproject.org main site, start, etc. all remained online. Fedorahosted.org was fine except that logging in to the trac instances was down.

= recommendation =

Wait for the final RFO on this. There is no final word yet. I think having a secondary offsite VPN would have helped keep downtime to a minimum, but outbound UDP is still in question.

Had another outage just now that lasted about 1 hour. Network related; root cause not completely clear yet.
