#12352 bvmhost-p09-03 ends in emergency mode
Opened a month ago by zlopez. Modified 2 days ago

Describe what you would like us to do:


Today I tried to restart bvmhost-p09-03 and ended up in emergency mode:

[  272.173282] dracut-initqueue[1684]: Warning: dracut-initqueue: timeout, still waiting for following initqueue hooks:
[  272.174729] dracut-initqueue[1684]: Warning: /lib/dracut/hooks/initqueue/finished/90-crypt.sh: "[ -e /dev/disk/by-id/dm-uuid-CRYPT-LUKS?-*457151a4f4044232a873cea57ec14ca5*-* ] || exit 1"
[  272.176463] dracut-initqueue[1684]: Warning: /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2fdisk\x2fby-id\x2fmd-uuid-bc65d529:55a1eab2:b79d716e:e29d799a.sh: "[ -e "/dev/disk/by-id/md-uuid-bc65d529:55a1eab2:b79d716e:e29d799a" ]"
[  272.178777] dracut-initqueue[1684]: Warning: /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2fmapper\x2fvg_guests-LogVol00.sh: "if ! grep -q After=remote-fs-pre.target /run/systemd/generator/systemd-cryptsetup@*.service 2>/dev/null; then
[  272.179304] dracut-initqueue[1684]:     [ -e "/dev/mapper/vg_guests-LogVol00" ]
[  272.180298] dracut-initqueue[1684]: fi"
[  272.181092] dracut-initqueue[1684]: Warning: /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2fvg_guests\x2fLogVol00.sh: "[ -e "/dev/vg_guests/LogVol00" ]"
[  272.183395] dracut-initqueue[1684]: Warning: /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2fvg_guests\x2fLogVol01.sh: "[ -e "/dev/vg_guests/LogVol01" ]"
[  272.185709] dracut-initqueue[1684]: Warning: dracut-initqueue: starting timeout scripts
[  272.186575] dracut-initqueue[1684]: Warning: Could not boot.
         Starting dracut-emergency.service - Dracut Emergency Shell...
Warning: /dev/disk/by-id/md-uuid-bc65d529:55a1eab2:b79d716e:e29d799a does not exist
Warning: /dev/mapper/vg_guests-LogVol00 does not exist
Warning: /dev/vg_guests/LogVol00 does not exist
Warning: /dev/vg_guests/LogVol01 does not exist
Warning: crypto LUKS UUID 457151a4-f404-4232-a873-cea57ec14ca5 not found

When do you need this to be done by? (YYYY/MM/DD)



Metadata Update from @zlopez:
- Issue tagged with: Needs investigation, ops

a month ago

yeah.

So the problem is this:

There's a 'sorta bad' disk. It's not completely dead: it still shows up, but it throws errors, drops out, resets, and then comes back. I think we need to replace this disk.

In the meantime the 'fix' is to:

  • log in on the IPMI console with the root password
  • mdadm -S /dev/md2 to stop the incomplete array
  • mdadm -A /dev/md2 --run --force /dev/sde3 /dev/sdb3 /dev/sdh3 /dev/sdg3 /dev/sda3 /dev/sdd3 /dev/sdc3 /dev/sdf3
    to restart the array without the bad disk
  • reboot again to bring it back up.
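
Before running the steps above, it can help to confirm which array is the incomplete one. A minimal sketch, assuming a POSIX shell on the console; `inactive_md_arrays` is a hypothetical helper name, not something shipped with mdadm. It reads /proc/mdstat and prints the arrays the kernel lists as inactive:

```shell
# Print the names of md arrays that /proc/mdstat reports as "inactive".
# Hypothetical helper for illustration. Reads /proc/mdstat by default;
# pass a file name to run it against saved output instead.
inactive_md_arrays() {
    while read -r name sep state rest; do
        # Array lines look like: "md2 : inactive sda3[0](S) sdb3[1](S) ..."
        if [ "$sep" = ":" ] && [ "$state" = "inactive" ]; then
            printf '%s\n' "$name"
        fi
    done < "${1:-/proc/mdstat}"
}
```

Each name it prints can then be inspected with `mdadm --detail /dev/<name>` before stopping and force-assembling the array. It uses only shell builtins, so it does not depend on awk or grep being present in the emergency environment.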

This machine is a loaner from IBM. The last time this happened we just bought our own replacement drive; I'll look into whether we can do that again.

Metadata Update from @zlopez:
- Issue untagged with: Needs investigation
- Issue tagged with: low-gain, low-trouble

a month ago

Thanks for the explanation, @kevin. Do you want to keep the ticket open or can we close it?

Let's keep it open to track replacing the disk?

Metadata Update from @phsmoura:
- Issue untagged with: low-gain, low-trouble
- Issue assigned to kevin
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: medium-gain, medium-trouble

a month ago

Metadata Update from @kevin:
- Assignee reset

2 days ago

Hey @dkirwan would you be willing to do this one?

It's a bit different from the others in that this is an IBM loaner box, so we have to pay for the new drive ourselves (and expense it).


Metadata
Boards 1
ops Status: Backlog