#8365 openqa-ppc64le-01.qa needs firmware update to run qemu (I think)
Closed: Fixed 4 years ago by adamwill. Opened 4 years ago by adamwill.

I updated the openQA staging boxes to Fedora 31 this week, and I think this has caused qemu to stop working on openqa-ppc64le-01.qa entirely. My thesis is that this qemu commit:

https://github.com/qemu/qemu/commit/2782ad4c4102d57f7f8e135dce0c1adb0149de77

means it won't work unless it has some kind of firmware update (note the "By now machine firmware should have been upgraded to allow these settings" line). However, openqa-ppc64le-01.qa seems to be on pretty old firmware:

[root@openqa-ppc64le-01 adamwill][PROD]# lsmcode
Version of System Firmware : 
 Product Name          : OpenPOWER Firmware
 Product Version       : IBM-habanero-ibm-OP8_v1.7_1.62
 Product Extra         :    hostboot-bc98d0b-1a29dff
 Product Extra         :    occ-0362706-16fdfa7
 Product Extra         :    skiboot-5.1.13
 Product Extra         :    hostboot-binaries-43d5a59
 Product Extra         :    habanero-xml-a71550e-cdd3b31
 Product Extra         :    capp-ucode-105cb8f

best as I can tell the 'v1.7_1.62' there indicates it's seriously outdated... https://delivery04.dhe.ibm.com/sar/CMA/SFA/061tg/1/8348_810.1603.20160310b.html#__RefHeading___Toc1250_1053759979 suggests it's from 2016. Looking this stuff up is difficult as I don't know what the actual model number of the server is, but best I can tell, a current version should be something like IBM-habanero-ibm-OP8_v1.8_1.6, per https://delivery04.dhe.ibm.com/sar/CMA/SFA/07fch/0/8348_810.1806.20180213a.html#__RefHeading___Toc1210_1053759979 .
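
For comparing boxes, a minimal sketch that pulls just the version string out of `lsmcode` output. It assumes the "Product Version : value" layout shown above; `firmware_version` is a hypothetical helper name, not part of any tool.

```shell
#!/usr/bin/env bash
# Print the "Product Version" field from lsmcode-style output read on stdin.
# Assumes the "Product Version : <value>" layout pasted in this ticket.
firmware_version() {
    awk -F' *: *' '/Product Version/ {
        sub(/[ \t]+$/, "", $2)   # drop trailing whitespace, if any
        print $2
        exit
    }'
}
```

Usage would be something like `lsmcode | firmware_version`, run on the box itself.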

I tested that I can run qemu if I pass -M pseries-3.1 to set the machine type to pseries-3.1 - which, as you can see from the commit, follows the older code path and sets the caps to SPAPR_CAP_BROKEN not SPAPR_CAP_WORKAROUND. So that does seem to support my theory.
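
A rough sketch of that workaround as a wrapper, in case it helps anyone reproduce it. The binary name `qemu-system-ppc64`, the `QEMU_MACHINE` variable and the `qemu_cmd` helper are assumptions made for illustration; the helper only prints the command line rather than running it.

```shell
#!/usr/bin/env bash
# Print a qemu command line with an explicit machine type.
# QEMU_MACHINE is a hypothetical override; the default is the machine type
# that worked on the old firmware in this ticket.
qemu_cmd() {
    printf '%s\n' "qemu-system-ppc64 -M ${QEMU_MACHINE:-pseries-3.1}${*:+ $*}"
}
```

For example, `qemu_cmd -nographic` prints the command with `-M pseries-3.1` inserted ahead of the other arguments.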

The documentation I've found - that last link, mainly - seems fairly terrifying, and I don't want to just start downloading things that might be right and firing off ipmitool commands in case I brick the thing. But not being able to run qemu without a custom argument is a bit of a problem: I would have to do custom builds of a couple of things and keep them installed in order for the box to be usable. It would be great if we could get the firmware updated.


@michelmno @smooge says he can't find any details on the hardware, do you have any or any advice here? This is the box you loaned us back in 2017, not the more recent power9 boxes.

I also have a Habanero P8 locally with the same lsmcode output. I am trying to upgrade its firmware to the 820.30 firmware "OP820 1923D" released 07/03/2019 (I do not know the URL of the external link).
FYI, I am having some problems executing the proper ipmi commands, but I expect to solve that soon.

OK, well if you can help us do ours after you figure out yours, that'd be great :)

Metadata Update from @kevin:
- Issue priority set to: Waiting on External (was: Needs Review)

4 years ago

(In reply to Adam Williamson from https://bugzilla.redhat.com/show_bug.cgi?id=1768551#c9)

Thanks! Any chance you can help us upgrade ours, now?

I'm not sure it will be easy for me to get access from outside your infra network to issue the ipmitool commands for the FW upgrade.
The upgrade is relatively easy following the doc at https://delivery04.dhe.ibm.com/sar/CMA/SFA/08ct3/0/8348_820.1923.20190613n.html, specifically these paragraphs:
* 6 for the firmware download, at https://delivery04.dhe.ibm.com/sar/CMA/SFA/08ct3/0/8348_820.1923.20190613n.html#__RefHeading___Toc1208_1053759979
- that refers to Fix Central: https://www-945.ibm.com/support/fixcentral/ (search for FW OP820 for 8348-21C)
- it should point to an "OP8_v1.12_2.96_H" URL used to download the 8348_820.1923.20190613n_update.hpm file
* 7.2 for the set of commands that use this hpm file for the upgrade, at https://delivery04.dhe.ibm.com/sar/CMA/SFA/08ct3/0/8348_820.1923.20190613n.html#__RefHeading___Toc1214_1053759979

thanks! I'll try and get it done with those pointers, if we can't, we can just add you to the qa-admin group :)

hum, actually, as this involves using IPMI I think maybe @smooge or @kevin will have to do it. Can one of you take a look? I already downloaded the firmware file and will stick it in /root on the box so you don't have to jump through all the hoops to download it.

ok, gave this a go... but ran into issues. ;(

➜  ~ ipmitool -I lanplus -H 10.5.130.32 -P xxxx -U ADMIN chassis power off      
Chassis Power Control: Down/Off
➜  ~ ipmitool -I lanplus -H 10.5.130.32 -P xxxx -U ADMIN mc reset cold          
Sent cold reset command to MC
➜  ~ ipmitool -I lanplus -H 10.5.130.32 -P xxxx -U ADMIN -z 25000 hpm upgrade 8348_820.1923.20190613n_update.hpm force
Error: Unable to establish IPMI v2 / RMCP+ session
➜  ~ ipmitool -I lanplus -H 10.5.130.32 -P xxxx -U ADMIN raw 0x32 0xba 0x18 0x00
Error: Unable to establish IPMI v2 / RMCP+ session
➜  ~ ipmitool -I lanplus -H 10.5.130.32 -P xxxx -U ADMIN raw 0x32 0xba 0x18 0x00

➜  ~ ipmitool -I lanplus -H 10.5.130.32 -P xxxx -U ADMIN -z 25000 hpm upgrade 8348_820.1923.20190613n_update.hpm force
Setting large buffer to 25000

PICMG HPM.1 Upgrade Agent 1.0.9: 

Validating firmware image integrity...OK
Performing preparation stage...
Services may be affected during upgrade. Do you wish to continue? (y/n): y
OK

Performing upgrade stage:

-------------------------------------------------------------------------------
|ID  | Name        |                     Versions                        | %  |
|    |             |      Active     |      Backup     |      File       |    |
|----|-------------|-----------------|-----------------|-----------------|----|
|*  2|BIOS         |   0.00 00000000 | ---.-- -------- |   1.12 96000000 |  0%|Skip|
-------------------------------------------------------------------------------
(*) Component requires Payload Cold Reset
Firmware upgrade procedure failed

First, after the mc reset cold, it stops responding for a while... then of course the second problem is the 'payload cold reset error' :(

Any ideas?

@kevin in your flow of commands, it is normal to get an error after the "mc reset cold" if you issue another ipmitool command too soon after it.
I ran the hpm upgrade command with only the -z 15000 parameter and it worked.
In the same command I specified the hpm file with its full path; I do not know if that makes a difference.
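
Since the BMC stops answering for a while after the cold reset, one way to avoid firing the next command too early is to poll until it responds again. This is only a sketch: `wait_until` is a hypothetical helper, and the attempt count and delay in the usage note are guesses, not values from the IBM doc.

```shell
#!/usr/bin/env bash
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
wait_until() {
    local attempts="$1" delay="$2"
    shift 2
    while [ "$attempts" -gt 0 ]; do
        # Success of the probe command means the BMC is answering again.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    return 1
}
```

After the "mc reset cold", something like `wait_until 30 10 ipmitool -I lanplus -H 10.5.130.32 -U ADMIN -P xxxx mc info` would block until the BMC accepts a session again (or give up after about five minutes), and the hpm upgrade could be started once it returns successfully.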

Had to go down to 5000 to get it working...

➜  ~ ipmitool -I lanplus -H 10.5.130.32 -P xxxx -U ADMIN -z 5000 hpm upgrade 8348_820.1923.20190613n_update.hpm force 
Setting large buffer to 5000

PICMG HPM.1 Upgrade Agent 1.0.9: 

Validating firmware image integrity...OK
Performing preparation stage...
Services may be affected during upgrade. Do you wish to continue? (y/n): y
OK

Performing upgrade stage:

-------------------------------------------------------------------------------
|ID  | Name        |                     Versions                        | %  |
|    |             |      Active     |      Backup     |      File       |    |
|----|-------------|-----------------|-----------------|-----------------|----|
|*  2|BIOS         |   0.00 00000000 | ---.-- -------- |   1.12 96000000 |100%|
|    |Upload Time: 09:49             | Image Size: 33554584 bytes              |
|*  0|BOOT         |   2.16 AA660100 | ---.-- -------- |   2.16 51220300 |100%|
|    |Upload Time: 00:03             | Image Size:  262296 bytes              |
|*  1|APP          |   2.16 AA660100 | ---.-- -------- |   2.16 51220300 |100%|
|    |Upload Time: 06:49             | Image Size: 33292440 bytes              |
-------------------------------------------------------------------------------
(*) Component requires Payload Cold Reset

Firmware upgrade procedure successful
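
For the record, the trial and error over -z values (25000, then 15000, then 5000) could be sketched as a loop like the one below. `try_buffer_sizes`, `hpm_upgrade` and the `BMC_HOST`, `BMC_USER`, `BMC_PASS` and `HPM_FILE` variables are hypothetical names for illustration. Blindly retrying a firmware flash is not something to do lightly; here the earlier failure happened at 0%, before anything was uploaded.

```shell
#!/usr/bin/env bash
# Call a runner with each buffer size in turn until one succeeds.
# Usage: try_buffer_sizes <runner> <size> [size...]
# Prints the size that worked; returns non-zero if none did.
try_buffer_sizes() {
    local runner="$1" size
    shift
    for size in "$@"; do
        if "$runner" "$size"; then
            echo "$size"
            return 0
        fi
    done
    return 1
}

# Hypothetical runner wrapping the upgrade command used in this ticket.
hpm_upgrade() {
    ipmitool -I lanplus -H "$BMC_HOST" -U "$BMC_USER" -P "$BMC_PASS" \
        -z "$1" hpm upgrade "$HPM_FILE" force
}
```

With the variables set, `try_buffer_sizes hpm_upgrade 25000 15000 5000` would stop at the first size that completes.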

but then things didn't go well. I powered back on and got:

 11.74411|System shutting down with error status 0x90FF0002


--== Welcome to Hostboot hostboot-p8-c893515-pd6f049d/hbicore.bin ==--

  5.53008|System shutting down with error status 0x90FF0003

then I power cycled it again and now it doesn't output anything. ;(

On the web interface I see:

Firmware Revision: 2.16.65371
Firmware Build Time: Jun 13 2019 11:15:14 CDT
BIOS Version: 1.12.96

and the "OCC Status" light is red. ;(

any ideas to bring it back to life? ;(

...and... it came up? @adamwill can you check that it's the right version for what you need now?

yup! looks good now. thanks!

Metadata Update from @adamwill:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

4 years ago
