#12253 bvmhost-x86-03.stg.iad2.fedoraproject.org has failed disk
Opened 2 months ago by kevin. Modified 11 hours ago

bvmhost-x86-03.stg needs a disk replaced. ;(

CC: @dkirwan if you want to take it.


Metadata Update from @phsmoura:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: hardware, low-gain, low-trouble, ops

2 months ago

Metadata Update from @dkirwan:
- Issue assigned to dkirwan

a month ago

I can see the machine running again now. Closing this as fixed.

Metadata Update from @zlopez:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

18 days ago

Not so fast. :)

The machine has been running the entire time; it's just functioning with one disk gone (that we still need to replace).

# ssh bvmhost-x86-03.stg.iad2.fedoraproject.org cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] 
md1 : active raid1 sdh2[4] sdd2[5] sdc2[7] sdi2[3] sdf2[2] sdg2[9] sdb2[0] sde2[8] sda2[1]
      488384 blocks super 1.0 [10/9] [UUUUUU_UUU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md2 : active raid6 sdf3[2] sdd3[5] sdg3[9] sda3[1] sdi3[3] sdh3[4] sdb3[0] sdc3[7] sde3[8]
      4670816256 blocks super 1.2 level 6, 512k chunk, algorithm 2 [10/9] [UUUUUU_UUU]
      bitmap: 2/5 pages [8KB], 65536KB chunk

md0 : active raid1 sdg1[9] sda1[1] sdf1[2] sdd1[5] sdi1[3] sdb1[0] sdh1[4] sde1[8] sdc1[7]
      1022976 blocks super 1.2 [10/9] [UUUUUU_UUU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>

The _ there indicates a failed/missing drive. ;)
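
For anyone who wants to check this without eyeballing the output, here is a minimal sketch that flags degraded arrays in `/proc/mdstat`-style text. The function name `degraded_arrays` and the trimmed `SAMPLE` are illustrative assumptions, not part of any existing tooling; it only relies on the `[expected/active] [UU_U...]` status pattern visible in the output above.

```python
import re


def degraded_arrays(mdstat_text):
    """Return {array: (expected, active, flags)} for arrays missing a member."""
    results = {}
    current = None
    for line in mdstat_text.splitlines():
        # Array header, e.g. "md1 : active raid1 sdh2[4] ..."
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status line, e.g. "... [10/9] [UUUUUU_UUU]"
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if current and status:
            expected = int(status.group(1))
            active = int(status.group(2))
            flags = status.group(3)
            # An "_" in the flags marks a failed or missing member.
            if active < expected or "_" in flags:
                results[current] = (expected, active, flags)
            current = None
    return results


# Trimmed copy of the output pasted above (hypothetical test input).
SAMPLE = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md1 : active raid1 sdh2[4] sdd2[5] sdc2[7] sdi2[3] sdf2[2] sdg2[9] sdb2[0] sde2[8] sda2[1]
      488384 blocks super 1.0 [10/9] [UUUUUU_UUU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md0 : active raid1 sdg1[9] sda1[1] sdf1[2] sdd1[5] sdi1[3] sdb1[0] sdh1[4] sde1[8] sdc1[7]
      1022976 blocks super 1.2 [10/9] [UUUUUU_UUU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>
"""

if __name__ == "__main__":
    for name, (expected, active, flags) in degraded_arrays(SAMPLE).items():
        print(f"{name}: {active}/{expected} members active, flags {flags}")
```

On the host itself, the same function could be fed the contents of `/proc/mdstat` instead of `SAMPLE`. An empty result would mean every array reports all of its members.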

Metadata Update from @kevin:
- Issue status updated to: Open (was: Closed)

17 days ago

Oops, my assumption was wrong. Let's keep it open then.

Looks like there is a power issue on this machine also:

Power supply redundancy is lost.    Tue Oct 15 2024 12:37:59
The Power Supply Unit (PSU) 1 is not receiving input power because of issues in PSU or cable connections.   Tue Oct 15 2024 12:37:55

Oops, never mind, that's the prod box. Might need to look at that too.

Working with pcole internally to arrange datacenter access for Dell with the replacement HD.

Dell shipped a new HD; pcole will install it himself when it arrives.

Ah, pcole is in rdu2 today and Dell is actually sending an engineer today, so that engineer will fit the HD.

Disk replaced and showing up in iDRAC. Added a virtual disk; will sort out the remaining steps in the morning.
