There is something wrong with the Cloud-Base-Generic qcow2 image on ppc64le in Rawhide (and F-40). It won't boot, and after converting the qcow2 to a raw image, fdisk reports a different partition layout compared to an F-39 qcow2 image. Without the PReP partition the image can't be booted.
    [dan@talos tmp]$ fdisk -l cloud-f41.raw
    GPT PMBR size mismatch (1310719 != 10485759) will be corrected by write.
    Disk cloud-f41.raw: 5 GiB, 5368709120 bytes, 10485760 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disklabel type: dos
    Disk identifier: 0xb41e4af5

    Device          Boot Start      End  Sectors Size Id Type
    cloud-f41.raw1           1 10485759 10485759   5G ee GPT
    [dan@talos tmp]$ fdisk -l cloud-f39.raw
    Disk cloud-f39.raw: 5 GiB, 5368709120 bytes, 10485760 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disklabel type: gpt
    Disk identifier: B47DE384-32B5-4A47-A4D1-882C70A42AA5

    Device           Start      End Sectors  Size Type
    cloud-f39.raw1    2048    10239    8192    4M PowerPC PReP boot
    cloud-f39.raw2   10240  2058239 2048000 1000M Linux filesystem
    cloud-f39.raw3 2058240  2263039  204800  100M EFI System
    cloud-f39.raw4 2263040  2265087    2048    1M BIOS boot
    cloud-f39.raw5 2265088 10483711 8218624  3.9G Linux filesystem
I have checked that this issue persists in Fedora 40 as well. I tested "Fedora-Cloud-Base-Generic.ppc64le-40-1.14.qcow2".
Right, firmware="ofw" should be specified in the image description so that the PReP partition gets created.
That is present: https://pagure.io/fedora-kiwi-descriptions/blob/rawhide/f/teams/cloud/cloud.xml#_117
Hmm, that's strange. The only trigger for PReP is the firmware setting; kiwi has this:
    if self.firmware.ofw_mode():
        log.info('--> creating PReP partition')
        partition_mbsize = self.firmware.get_prep_partition_size()
        disk.create_prep_partition(
            partition_mbsize
        )
        disksize_used_mbytes += partition_mbsize
Can I see the build log somewhere?
Checked our integration here:
That one has
    fdisk -l kiwi-test-image-disk.ppc64le-1.15.1-PhysicalBSZ_512-Build99.1.raw
    Disk kiwi-test-image-disk.ppc64le-1.15.1-PhysicalBSZ_512-Build99.1.raw: 1.2 GiB, 1284505600 bytes, 2508800 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disklabel type: gpt
    Disk identifier: B0434928-8532-42EF-BDB6-F1B2F4457893

    Device                                                                 Start     End Sectors Size Type
    kiwi-test-image-disk.ppc64le-1.15.1-PhysicalBSZ_512-Build99.1.raw1      2048   18431   16384   8M PowerPC PReP boot
    kiwi-test-image-disk.ppc64le-1.15.1-PhysicalBSZ_512-Build99.1.raw2     18432 2508766 2490335 1.2G unknown
https://koji.fedoraproject.org/koji/taskinfo?taskID=118436863 is the last Rawhide run with all the logs
Thanks, I'll take a look
The log says:
    [ DEBUG ]: 08:10:28 | EXEC: [losetup --sector-size 4096 -f --show /builddir/result/image/Fedora-Cloud-Base-Generic.ppc64le-Rawhide.raw]
    [ INFO ]: 08:10:30 | --> creating PReP partition
    [ DEBUG ]: 08:10:30 | EXEC: [sgdisk -n 1:2048:+8M -c 1:p.prep /dev/loop0]
So the image is a 4k block size image. Is that intentional?
That means if you want to look at the partition table, you also need to set up the loop device as a 4k disk:
    losetup --sector-size 4096 -f --show Fedora-Cloud-Base-Generic.ppc64le-Rawhide.raw
    gdisk -l /dev/loop0
which shows the PReP partition. But I doubt you can boot a 4k disk as a virtual system; it is the same issue as with the s390 DASD image.
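A quick way to tell which sector size a GPT image was written with, without setting up a loop device: the GPT header lives at LBA 1 and starts with the signature `EFI PART`, so it sits at byte 512 on a 512-byte-sector disk and at byte 4096 on a 4k one. The following is a minimal sketch of that check; the function name `guess_gpt_sector_size` is made up for illustration, and the demo runs on synthetic bytes rather than on the real image.

```python
GPT_SIGNATURE = b"EFI PART"  # first 8 bytes of a GPT header (LBA 1)


def guess_gpt_sector_size(image_bytes, candidates=(512, 4096)):
    """Return the sector size whose LBA 1 starts with the GPT signature.

    Hypothetical helper for illustration: returns None if no candidate
    offset carries the signature.
    """
    for size in candidates:
        if image_bytes[size:size + len(GPT_SIGNATURE)] == GPT_SIGNATURE:
            return size
    return None


# Synthetic stand-ins for the first 8 KiB of a disk image.
fake_4k = bytearray(8192)
fake_4k[4096:4104] = GPT_SIGNATURE

fake_512 = bytearray(8192)
fake_512[512:520] = GPT_SIGNATURE

print(guess_gpt_sector_size(bytes(fake_4k)))
print(guess_gpt_sector_size(bytes(fake_512)))
```

For a real image, something like `guess_gpt_sector_size(open(path, "rb").read(8192))` should be enough, since only the first two candidate offsets are inspected.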
Yeah, a 4k block size sounds wrong, and it explains the GPT PMBR size mismatch (1310719 != 10485759). This should be a simple fix.
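The two numbers in the fdisk warning fall straight out of the image size when it is divided by the two sector sizes. As a small worked example (the helper name `pmbr_size_field` is made up; the image size is the one fdisk reports in this ticket): the protective MBR entry covers the whole disk minus LBA 0, so its size field is the sector count minus one.

```python
IMAGE_BYTES = 5368709120  # image size as reported by fdisk in this ticket


def pmbr_size_field(image_bytes, sector_size):
    """Sectors covered by the protective MBR entry: whole disk minus LBA 0."""
    return image_bytes // sector_size - 1


written_with_4k = pmbr_size_field(IMAGE_BYTES, 4096)
read_with_512 = pmbr_size_field(IMAGE_BYTES, 512)

print(written_with_4k)  # 1310719
print(read_with_512)    # 10485759
```

So the value stored in the PMBR matches a disk with 4096-byte sectors, while fdisk on the plain file assumes 512-byte sectors and expects the larger number.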
https://pagure.io/fedora-kiwi-descriptions/pull-request/61 makes grub start, but then it fails to load the kernel to boot (hardcoded /var/tmp/ path?)
captured output of virt-install --name localcloud-41 --memory 4096 --nographics --noreboot --os-variant detect=on,name=fedora-unknown --cloud-init user-data="/home/dan/cloudinit-user-data.yaml" --disk=size=8,backing_store="/var/lib/libvirt/images/Fedora.ppc64le-Rawhide.qcow2"
    ...
    Welcome to Open Firmware

      Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
      This program and the accompanying materials are made available
      under the terms of the BSD License available at
      http://www.opensource.org/licenses/bsd-license.php

    Trying to load: from: /pci@800000020000000/scsi@4 ... Successfully loaded
    Welcome to GRUB!

    error: ../../grub-core/term/serial.c:217:serial port `com0' isn't found.
    error: ../../grub-core/commands/terminal.c:138:terminal `serial' isn't found.
    error: ../../grub-core/commands/terminal.c:138:terminal `serial' isn't found.

      Booting `Fedora Linux (6.10.0-0.rc1.20240531git4a4be1ad3a6e.21.fc41.ppc64le) 41 (Cloud Edition Prerelease)'

    error: ../../grub-core/fs/fshelp.c:257:file `/var/tmp/work/build/image-root/boot/vmlinuz-6.10.0-0.rc1.20240531git4a4be1ad3a6e.21.fc41.ppc64le' not found.
    error: ../../grub-core/loader/powerpc/ieee1275/linux.c:333:you need to load the kernel first.

    Press any key to continue...
Seems the grub or BLS config files use the full host paths for the kernel files.
It looks like it, and we know that grub behaves really weirdly if the grub tools run on a system that is not the later target system. In kiwi there are some _fix methods that try to repair the broken files that tools like grub-mkconfig produce. There is for example:
_fix_grub_loader_entries_linux_and_initrd_paths
https://github.com/OSInside/kiwi/blob/main/kiwi/bootloader/config/grub2.py#L319
Maybe this code doesn't do the right thing on ppc64le.
Hmm, I see in the log that the _fix adaptations got applied:
    [ DEBUG ]: 08:14:18 | Existing loader entry: linux /root/var/lib/mock/f41-kiwi-build-51316245-6132131/root/builddir/result/image/build/image-root/boot/vmlinuz-6.10.0-0.rc1.20240531git4a4be1ad3a6e.21.fc41.ppc64le
    [ DEBUG ]: 08:14:18 | Updated loader entry: linux /vmlinuz-6.10.0-0.rc1.20240531git4a4be1ad3a6e.21.fc41.ppc64le
    [ DEBUG ]: 08:14:18 | Existing loader entry: initrd /root/var/lib/mock/f41-kiwi-build-51316245-6132131/root/builddir/result/image/build/image-root/boot/initramfs-6.10.0-0.rc1.20240531git4a4be1ad3a6e.21.fc41.ppc64le.img
    [ DEBUG ]: 08:14:18 | Updated loader entry: initrd /initramfs-6.10.0-0.rc1.20240531git4a4be1ad3a6e.21.fc41.ppc64le.img
    [ DEBUG ]: 08:14:18 | custom arguments for bootloader installation {'boot_device': '/dev/loop0p2', 'root_device': '/dev/loop0p3', 'write_device': '/dev/loop0p3', 'firmware': <kiwi.firmware.FirmWare object at 0x7fffaab49f70>, 'target_removable': None, 'install_options': [], 'shim_options': [], 'prep_device': '/dev/loop0p1', 'system_volumes': {'home': {'volume_options': 'subvol=home,compress=zstd:1', 'volume_device': '/dev/loop0p3'}, 'var': {'volume_options': 'subvol=var,compress=zstd:1', 'volume_device': '/dev/loop0p3'}}, 'system_root_volume': 'root'}
    [ INFO ]: 08:14:18 | Installing grub2 on disk /dev/loop0
    [ DEBUG ]: 08:14:18 | EXEC: [mountpoint -q /var/tmp/kiwi_mount_manager.lelv27wb]
But I also see that the mountpoint was not unmounted prior to grub2-install, which means the actual change could not have been written at the time grub2-install was called.
I need to check if the theory is correct
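To make the "Existing loader entry" to "Updated loader entry" rewrite from the log concrete, here is a minimal sketch of that kind of path fix. It is not kiwi's implementation; the function `strip_build_root` and its `boot_is_separate_partition` parameter are invented for illustration, and it only reproduces the visible effect: the build-host prefix is dropped so the path is relative to the filesystem grub will read at boot.

```python
import os


def strip_build_root(entry_line, boot_is_separate_partition=True):
    """Rewrite a 'linux' or 'initrd' loader line to a target-relative path.

    Hypothetical helper, not kiwi code. With a separate /boot partition the
    kernel is addressed as '/<file>', otherwise as '/boot/<file>'.
    """
    parts = entry_line.split(None, 1)
    if len(parts) != 2 or parts[0] not in ("linux", "initrd"):
        return entry_line  # leave every other line untouched
    keyword, rest = parts
    path, _, args = rest.partition(" ")
    name = os.path.basename(path)
    new_path = f"/{name}" if boot_is_separate_partition else f"/boot/{name}"
    return f"{keyword} {new_path}" + (f" {args}" if args else "")


before = ("linux /root/var/lib/mock/f41-kiwi-build-51316245-6132131/root"
          "/builddir/result/image/build/image-root/boot"
          "/vmlinuz-6.10.0-0.rc1.20240531git4a4be1ad3a6e.21.fc41.ppc64le")
print(strip_build_root(before))
```

The point of the theory above is that a rewrite like this only helps if it reaches the files grub actually reads, which depends on when the boot filesystem is written relative to grub2-install.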
I think the "root" cause is in the BLS snippets, if those are wrong, then the generated grub2.cfg will be wrong too. I have also noticed the BLS snippet contains kernel parameters from the host, not the ones from kiwi.
yes and all that "crap" we correct with the _fix methods. I'm currently testing a fix in kiwi ...
I really think we are hitting a bug in kiwi and proposed the following patch:
I don't have access to a ppc64le system. Maybe you can locally patch your build system and check if that fixes it?
Thanks
no change with PR 2561 applied :-(
Hrm, thanks much for testing. I'm running out of ideas for this one and have no ppc64le system for debugging. Do you think you could arrange some sort of ssh access for me on your system ?
I will figure out something tomorrow about a remote access to some of our systems.
Hey, I have found a working image, which is Fedora 41 Rawhide only: "Fedora-Server-KVM-Rawhide-20240517.n.0.ppc64le.qcow2". Can we just compare it and see what is different in the error-prone images? I'm new to this, but I can check.
I got:
    Disk Fedora-Rawhide-KVM.raw: 7 GiB, 7516192768 bytes, 14680064 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disklabel type: gpt
    Disk identifier: D1F17B85-7DD2-429A-AB2F-2138E4518A58

    Device                    Start      End  Sectors Size Type
    Fedora-Rawhide-KVM.raw1    2048    10239     8192   4M PowerPC PReP boot
    Fedora-Rawhide-KVM.raw2   10240  2107391  2097152   1G Linux extended boot
    Fedora-Rawhide-KVM.raw3 2107392 14678015 12570624   6G Linux LVM
Hi, thanks much for helping. I think we know what's wrong; the problem is that I could not find the reason in the kiwi code why we are hitting this. The background is this: the image is configured to use grub, grub on Fedora is BLS grub, and therefore grub-mkconfig produces data in boot/loader/entries/xxx.conf. The information in these files is unfortunately wrong (wrong paths, wrong cmdline options...) because grub always thinks it is running on the target and makes assumptions that are simply wrong when you build an image on a host which is not the target. We have code in kiwi, in the so-called _fix... methods, which tries to correct the broken information so that the resulting image can boot.
Here is the problem with this particular one: it seems that the _fix method did not do the job, even though in the log we can see that it fixes the information. The PR I did on kiwi for the one issue I found did not work according to @sharkcz, so I'm a bit clueless without further debugging.
Debugging would work best on a Fedora ppc64le host, but I don't have one. We could check out the git repository there, run poetry to set up the dev environment and rebuild from git, and I'm sure we will find what's going on that way.
Thanks @osinside for the insights. Can I see boot/loader/entries/xxx.conf? I want this data because maybe some alignment issue is occurring.
You can, just fetch the image and mount it similar to
    kpartx -a image.raw
    mount /dev/mapper/loop0p2 /mnt
    cat /mnt/boot/loader/entries/*.conf
hi @osinside , I have sent you an email with some options, please check your spam folder as google/gmail doesn't like me sometimes :-)
But @osinside for corrupted raw disk img, kpartx is not creating any partition map.
But @osinside for corrupted iso, kpartx is not creating any partition map.
iso ? I thought we are talking about a raw disk image ?
@sharkcz thanks for providing me access to the Fedora test infrastructure. I set up a build environment on the ppc64le machine and rebuilt the Fedora ppc64le cloud image. The thing is, I can't reproduce the error reported here.
Can you double check on the test server:
    (unit_py3_11) [osinside@ppc64le-test kiwi][PROD]$ ll ~
    total 8
    drwxr-xr-x.  9 osinside osinside 4096 Jun  5 16:52 fedora-kiwi-descriptions
    drwxr-xr-x. 13 osinside osinside 4096 Jun  5 16:06 kiwi
You can find the build results in /tmp/mytest/
    [root@ppc64le-test mytest][PROD]# fdisk -l /tmp/mytest/Fedora.ppc64le-Rawhide.raw
    Disk /tmp/mytest/Fedora.ppc64le-Rawhide.raw: 5 GiB, 5368709120 bytes, 10485760 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disklabel type: gpt
    Disk identifier: 4E1C342C-1A5C-4A71-B1AA-CAC7D982477B

    Device                                    Start      End Sectors  Size Type
    /tmp/mytest/Fedora.ppc64le-Rawhide.raw1    2048    18431   16384    8M PowerPC PReP boot
    /tmp/mytest/Fedora.ppc64le-Rawhide.raw2   18432  2066431 2048000 1000M Linux extended boot
    /tmp/mytest/Fedora.ppc64le-Rawhide.raw3 2066432 10485726 8419295    4G Linux root (PPC64LE)
I mounted the image and looked at the entries file which says:
    [root@ppc64le-test mytest][PROD]# cat /mnt/loader/entries/40e9937cdd1f4dae8a824d7307728866-6.10.0-0.rc2.24.fc41.ppc64le.conf
    title Fedora Linux (6.10.0-0.rc2.24.fc41.ppc64le) 41 (Cloud Edition Prerelease)
    version 6.10.0-0.rc2.24.fc41.ppc64le
    linux /vmlinuz-6.10.0-0.rc2.24.fc41.ppc64le
    initrd /initramfs-6.10.0-0.rc2.24.fc41.ppc64le.img
    options no_timer_check console=tty1 console=ttyS0,115200n8 systemd.firstboot=off root=UUID=906dbbdf-d292-4630-a3c5-b0f12f165b22 rootflags=subvol=root
    grub_users $grub_users
    grub_arg --unrestricted
    grub_class fedora
which looks correct to me, given the image has an extra boot partition
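For anyone who wants to repeat this check on their own build, a BLS snippet is just "key value" lines, so looking for leaked build-host paths can be scripted. This is a rough sketch under the assumption that a sane `linux`/`initrd` path is either `/<file>` (separate boot partition, as here) or `/boot/<file>`; both function names are made up for illustration.

```python
def parse_bls_entry(text):
    """Parse a BLS snippet into a dict of key -> value (last key wins)."""
    entry = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        entry[key] = value.strip()
    return entry


def leaked_host_paths(entry):
    """Return (key, path) pairs whose directory is neither '/' nor '/boot'."""
    bad = []
    for key in ("linux", "initrd"):
        path = entry.get(key, "").split(" ")[0]
        directory = path.rsplit("/", 1)[0]
        if directory not in ("", "/boot"):
            bad.append((key, path))
    return bad


good = parse_bls_entry("""\
title Fedora Linux (6.10.0-0.rc2.24.fc41.ppc64le) 41 (Cloud Edition Prerelease)
linux /vmlinuz-6.10.0-0.rc2.24.fc41.ppc64le
initrd /initramfs-6.10.0-0.rc2.24.fc41.ppc64le.img
""")
broken = parse_bls_entry(
    "linux /var/tmp/work/build/image-root/boot/vmlinuz-6.10.0\n"
)

print(leaked_host_paths(good))
print(leaked_host_paths(broken))
```

The first entry mirrors the snippet shown above and reports nothing; the second mimics the `/var/tmp/work/...` path from the failing boot and gets flagged.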
@osinside I have noticed that the disk images that are prone to the error report QEMU compat version 1.1, while all the working ones report 0.10.
Hey @osinside, the image you have tested is the one that shows the error, right?
So the paths are still wrong for me when running kiwi 10.0.21 (downloaded from koji) on an F-40 system (with SELinux set manually to permissive mode) :-( I have used sudo ./kiwi-build --output-dir=/var/tmp/work --image-type=oem --image-profile=Cloud-Base-Generic in the descriptions directory.
I'm not sure I can follow you. What I did was this:
    [root@ppc64le-test mytest][PROD]# qemu-system-ppc64 -hda Fedora.ppc64le-Rawhide.qcow2
    qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-cfpc=workaround
    qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-sbbc=workaround
    qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-ibs=workaround
    qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-ccf-assist=on
    gtk initialization failed
    Booting `Fedora Linux (6.10.0-0.rc2.24.fc41.ppc64le) 41 (Cloud Edition Prerelease)'
    OF stdout device is: /vdevice/vty@71000000
    Preparing to boot Linux version 6.10.0-0.rc2.24.fc41.ppc64le (mockbuild@dccfaea498bd46159692b12a4b579a84) (gcc (GCC) 14.1.1 20240522 (Red Hat 14.1.1-4), GNU ld version 2.42.50.20240513) #1 SMP Mon Jun 3 14:01:47 UTC 2024
    Detected machine type: 0000000000000101
    ... and so on until login
So for me it all worked. Thus I don't know what's wrong
I reproduced this step. I fetched the kiwi build 10.0.21 from koji here
and installed that package. Next I copied your command and built the image.
By the way, my test system is fc39, so I needed the following patch to make it succeed:
    [osinside@ppc64le-test fedora-kiwi-descriptions][PROD]$ git diff
    diff --git a/repositories/core-rawhide.xml b/repositories/core-rawhide.xml
    index c2ae124..b6977b3 100644
    --- a/repositories/core-rawhide.xml
    +++ b/repositories/core-rawhide.xml
    @@ -1,7 +1,6 @@
     <image>
         <repository type="rpm-md" alias="rawhide" sourcetype="metalink">
             <source path="https://mirrors.fedoraproject.org/metalink?repo=rawhide&arch=$basearch">
    -            <signing key="file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-rawhide-primary"/>
             </source>
         </repository>
     </image>
I needed this because the Rawhide GPG key is not present on the fc39 host. It just disables the package GPG checking and should not affect the test.
In the end I got these result files:
    [osinside@ppc64le-test fedora-kiwi-descriptions][PROD]$ cd /var/tmp/work/
    [osinside@ppc64le-test work][PROD]$ ls -l
    total 1586480
    drwxr-xr-x. 3 root root         46 Jun  7 09:50 build
    -rw-r--r--. 1 root root    2406399 Jun  7 09:57 Fedora.ppc64le-Rawhide.changes
    -rw-r--r--. 1 root root      42953 Jun  7 09:57 Fedora.ppc64le-Rawhide.packages
    -rw-r--r--. 1 root root  373096448 Jun  7 10:00 Fedora.ppc64le-Rawhide.qcow2
    -rw-r--r--. 1 root root 5368709120 Jun  7 09:57 Fedora.ppc64le-Rawhide.raw
    -rw-r--r--. 1 root root        149 Jun  7 09:58 Fedora.ppc64le-Rawhide.verified
    -rw-r--r--. 1 root root      49502 Jun  7 10:00 kiwi.result
    -rw-r--r--. 1 root root        904 Jun  7 10:00 kiwi.result.json
This is also what you got right ?
Then
    [osinside@ppc64le-test work][PROD]$ sudo kpartx -a Fedora.ppc64le-Rawhide.raw
    [osinside@ppc64le-test work][PROD]$ sudo mount /dev/mapper/loop0p2 /mnt
    [osinside@ppc64le-test work][PROD]$ sudo -i
    cd /mnt/boot/loader/entries
    cat 89beb977a02740b286d53f8d88072c75-6.10.0-0.rc2.20240605git32f88d65f01b.26.fc41.ppc64le.conf
And I cannot see any wrong path in there, and this image also just boots.
Sorry guys I'm running out of ideas :)
I believe I have a plausible idea :-) Things work fine for a pseries class machine (like KVM or an LPAR in PowerVM), but fail for powernv (aka bare metal). I was able to create a working image on an F-39 VM. But my main system is bare metal and it doesn't work well with BLS (the petitboot bootloader in the firmware is too old), and while the BLS snippet is correct, the grub.cfg in the image is not. It contains full paths to the workdir for the linux and initrd options (added by grub's 10_linux script) and also contains boot entries from the host itself (added by 30_os-prober).
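Both symptoms described here can be spotted mechanically in a grub.cfg. Below is a rough sketch of such a scan; `scan_grub_cfg` is a made-up name, the "more than two slashes" rule is only a heuristic for "not `/file` and not `/boot/file`", and the sample text is synthetic, modelled on the paths quoted in this ticket rather than copied from a real image.

```python
import re


def scan_grub_cfg(text):
    """Return (long_paths, os_prober_entries) found in a grub.cfg text.

    long_paths: linux/initrd paths deeper than '/boot/<file>' (heuristic).
    os_prober_entries: menuentry lines inside the 30_os-prober section.
    """
    long_paths = []
    os_prober_entries = []
    in_os_prober = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("### BEGIN") and "30_os-prober" in stripped:
            in_os_prober = True
        elif stripped.startswith("### END") and "30_os-prober" in stripped:
            in_os_prober = False
        elif in_os_prober and stripped.startswith("menuentry "):
            os_prober_entries.append(stripped)
        match = re.match(r"(linux|initrd)\s+(\S+)", stripped)
        if match and match.group(2).count("/") > 2:
            long_paths.append(match.group(2))
    return long_paths, os_prober_entries


sample = """\
### BEGIN /etc/grub.d/10_linux ###
menuentry 'Fedora Linux' {
    linux /var/tmp/work/build/image-root/boot/vmlinuz-6.10.0 root=UUID=x
    initrd /var/tmp/work/build/image-root/boot/initramfs-6.10.0.img
}
### END /etc/grub.d/10_linux ###
### BEGIN /etc/grub.d/30_os-prober ###
menuentry 'Fedora Linux 39 (Server Edition)' {
    linux /vmlinuz-6.8.10 root=/dev/dm-0
}
### END /etc/grub.d/30_os-prober ###
"""

long_paths, host_entries = scan_grub_cfg(sample)
print(len(long_paths), len(host_entries))
```

On an image where BLS is honoured these leftovers may never matter, which would fit the observation that pseries boots fine while a firmware that reads grub.cfg directly does not.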
@osinside, I can see now images are able to mount, did you fix something?
Hi, yes, that makes sense, but our kiwi _fix methods also cover the main grub.cfg file, not only the BLS snippets. So when I mount the created Fedora.ppc64le-Rawhide.raw image on the infrastructure machine I can see the following:
    [root@ppc64le-test work][PROD]# pwd
    /var/tmp/work
    kpartx -a Fedora.ppc64le-Rawhide.raw
    mount /dev/mapper/loop0p2 /mnt
    [root@ppc64le-test work][PROD]# find /mnt/ | grep grub.cfg
    /mnt/grub2/grub.cfg
    vi /mnt/grub2/grub.cfg

    ### BEGIN /etc/grub.d/30_os-prober ###
    menuentry 'Fedora Linux 39 (Server Edition) (on /dev/mapper/fedora_rh--power--vm14-root)' --class gnu-linux --class gnu --class os $menuentry_id_option 'osprober-gnulinux-/vmlinuz-6.8.10-200.fc39.ppc64le--4bdb04fc-5246-4711-9fa4-2cf87b0c1d60' {
        insmod part_msdos
        insmod xfs
        search --no-floppy --fs-uuid --set=root 0a7d9720-8834-4d2d-860d-920ea3bdf145
        linux /vmlinuz-6.8.10-200.fc39.ppc64le root=/dev/dm-0
        initrd /initramfs-6.8.10-200.fc39.ppc64le.img
    }
    ... and so on
So also this file looks correct to me.
@sharkcz is this the same file you looked at from your image build ?