This Sunday, we got Zabbix/monitoring alerts about some services being down. All affected services/virtual machines were running on the same hypervisor. We'll need to contact the hardware vendor: despite all storage being configured as a RAID 6 array on a hardware RAID controller with read/write cache, the firmware/BIOS message on reboot confirms some corruption and data loss.
Impacted services:
Mail thread: https://lists.centos.org/pipermail/centos-devel/2024-March/165559.html
Everything should be restored now, but keeping the ticket open in case someone finds a related issue after we restored the services.
INC2889880 (for hardware issue follow-up)
Status update: still waiting for the hardware vendor to fix the issue.
Hardware controller replaced; the server is now being reprovisioned and added back in Ansible.
Metadata Update from @arrfab: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)