debuginfod01 went unresponsive and stopped processing, so I force reset it. It came back up and is processing, but there's a bunch of ext4 errors in dmesg. ;(
Additionally, it alerted about being low on disk space.
I think we should do a clean install of f41 on it when we can. (It's currently a f39 install anyhow)
CC: @fche
ext4 errors, yikes. I can't explain the filesystem-full indication. debuginfod is keeping about 13GB in open deleted filehandles, and du -x shows about 200G used in normal files ... so where's the other 650GB going ?!
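For anyone wanting to repeat that accounting, here is a rough sketch of the three numbers being compared: what the filesystem reports as used, what a `du -x` style walk can see, and what is held by deleted-but-still-open files. It is an illustration under assumptions, not what was actually run on the box: the function names are made up, it relies on Linux `/proc`, and it needs root to see every process's descriptors. Whatever is left over after subtracting the last two from the first is space no file accounts for, which on a filesystem already logging ext4 errors is a hint that fsck is worth running.

```python
import os
import stat


def fs_used_bytes(path):
    """Bytes the filesystem reports as used (roughly the 'Used' column of df)."""
    st = os.statvfs(path)
    return (st.f_blocks - st.f_bfree) * st.f_frsize


def tree_used_bytes(root):
    """Rough `du -x`: allocated bytes of files under root, staying on one device.

    Directory blocks themselves are not counted, so this reads slightly low.
    """
    root_dev = os.lstat(root).st_dev
    seen = set()  # inode numbers already counted (hard links count once)
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        kept = []
        for d in dirnames:
            try:
                if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev:
                    kept.append(d)
            except OSError:
                pass
        dirnames[:] = kept  # prune other mounts, like du -x
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue
            if st.st_dev != root_dev or st.st_ino in seen:
                continue
            seen.add(st.st_ino)
            total += st.st_blocks * 512  # st_blocks is in 512-byte units
    return total


def deleted_open_bytes(dev, proc="/proc"):
    """Allocated bytes of unlinked regular files on `dev` still held open."""
    seen = set()
    total = 0
    for pid in os.listdir(proc):
        if not pid.isdigit():
            continue
        fd_dir = os.path.join(proc, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                st = os.stat(os.path.join(fd_dir, fd))  # follows to the open file
            except OSError:
                continue
            if st.st_dev != dev or not stat.S_ISREG(st.st_mode):
                continue
            if st.st_nlink == 0 and st.st_ino not in seen:
                seen.add(st.st_ino)
                total += st.st_blocks * 512
    return total
```

Usage would be along the lines of `fs_used_bytes("/") - tree_used_bytes("/") - deleted_open_bytes(os.stat("/").st_dev)`. Files hidden underneath another mount point are not visible to the walk and will also show up in that remainder.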
wdyt about rebooting with a forced fsck?
The VM is actually up-to-date F40 (ansible trailing).
The debuginfod SOP mentions that the sqlite index file may be copied between servers. So when/if the main server here is fixed or rebuilt, the /var/cache/debuginfod/debuginfod.sqlite file from the .stg. server can be copied over for pretty quick service restoration. (Halt the staging server debuginfod for file integrity during the copy)
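A minimal sketch of how the copied index could be sanity-checked before it is put in place, assuming the file has already been transferred to a local path with the staging debuginfod halted. The helper names (`sha256_of`, `sqlite_ok`, `install_index`) are hypothetical and not part of the SOP. It uses SQLite's `PRAGMA quick_check`, which may take a while on a large index, and it does not handle any `-wal` or `-shm` companion files if the database is in WAL mode.

```python
import hashlib
import os
import shutil
import sqlite3


def sha256_of(path, chunk=1 << 20):
    """Hex SHA-256 of a file, read in chunks so large indexes fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()


def sqlite_ok(path):
    """True if SQLite's quick_check reports the file as a healthy database."""
    try:
        con = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    except sqlite3.Error:
        return False
    try:
        rows = con.execute("PRAGMA quick_check").fetchall()
    except sqlite3.DatabaseError:
        return False  # e.g. "file is not a database"
    finally:
        con.close()
    return rows == [("ok",)]


def install_index(src, dst):
    """Copy src next to dst, verify it, then atomically rename into place."""
    tmp = dst + ".new"
    shutil.copyfile(src, tmp)
    if sha256_of(tmp) != sha256_of(src) or not sqlite_ok(tmp):
        os.remove(tmp)
        raise RuntimeError("copied index failed verification; dst left untouched")
    os.replace(tmp, dst)  # atomic on the same filesystem
```

The rename at the end means a failed or interrupted copy never replaces an existing index, so debuginfod on the target should be started only after `install_index` returns.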
We could try a reboot if you like... shall I?
I rebooted it and it's running a fsck now... but it's been spewing for quite a while. not sure it's going to get to a good state. will keep an eye on it
Surprisingly, looks great after the fsck/reboot !
Metadata Update from @zlopez: - Issue assigned to kevin - Issue priority set to: Waiting on Assignee (was: Needs Review) - Issue tagged with: medium-gain, medium-trouble, ops
Yeah. :)
So, shall we close this now I guess? Or do you still want to reinstall?
Metadata Update from @kevin: - Issue untagged with: medium-gain, medium-trouble, ops - Issue priority set to: Needs Review (was: Waiting on Assignee)
Metadata Update from @kevin: - Issue priority set to: Waiting on Assignee (was: Needs Review) - Issue tagged with: medium-gain, medium-trouble, ops
Let's close.
easy peasy.
Metadata Update from @kevin: - Issue close_status updated to: Fixed with Explanation - Issue status updated to: Closed (was: Open)