dustymabe commented 5 years ago

hey @siosm - since f29 is almost out, can you try with f29 silverblue and see if the problem is resolved?

https://kojipkgs.fedoraproject.org/compose/branched/Fedora-29-20181016.n.0/compose/Silverblue/x86_64/iso/

walters commented 5 years ago

I can reproduce this with F29AH using default partitioning.

(Also I notice we regressed to using ext4 for / - seems like we lost the installclass?)

walters commented 5 years ago

This is filed against atomic-wg so let's leave Silverblue out of this.

What Anaconda is doing here in hardcoding the resume= argument seems pretty broken to me. Why would I want to enable hibernation on my servers?

dustymabe commented 5 years ago

> This is filed against atomic-wg so let's leave Silverblue out of this.

Oops. I confused this with "atomic workstation", my fault.

> What Anaconda is doing here in hardcoding the resume= argument seems pretty broken to me. Why would I want to enable hibernation on my servers?

Agreed. Let's see what rabbit hole this leads me down.

dustymabe commented 5 years ago

installclass issue

> (Also I notice we regressed to using ext4 for / - seems like we lost the installclass?)

Yep that's definitely true. I see differing behavior (i.e. xfs by default on rootfs) if I force the installclass by using the following in a kickstart:

%anaconda
installclass --name="Atomic Host"
%end

I opened https://pagure.io/atomic-wg/issue/514 for the installclass issue.

resume= issue

However, that doesn't fix the resume= issue. I suspect that is done for all Fedora now (related bz: 1206936).

I'll test to see what happens on fedora-server.

dustymabe commented 5 years ago

@walters
> What Anaconda is doing here in hardcoding the resume= argument seems pretty broken to me. Why would I want to enable hibernation on my servers?

I don't disagree, but I just checked and this was enabled in f28 too. So the "timing out" is new functionality in f29. We don't see this issue on our cloud images because we don't have swap on them so no resume=/path/to/swap gets added by default.

I just checked, and on Fedora Server we do get the resume= option on the kernel command line (i.e. it gets added for all Fedora), but I don't see the timeout delay that I do on Atomic Host:

Atomic Host

[root@localhost ~]# journalctl | grep swap 
Oct 17 23:04:11 localhost kernel: Command line: BOOT_IMAGE=/ostree/fedora-atomic-8f228e830fca568a137305e295ef96bafa0d08b0e27aa8c133a3bd24fff07280/vmlinuz-4.18.12-300.fc29.x86_64 resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap root=/dev/mapper/fedora-root ostree=/ostree/boot.0/fedora-atomic/8f228e830fca568a137305e295ef96bafa0d08b0e27aa8c133a3bd24fff07280/0
Oct 17 23:04:11 localhost kernel: Kernel command line: BOOT_IMAGE=/ostree/fedora-atomic-8f228e830fca568a137305e295ef96bafa0d08b0e27aa8c133a3bd24fff07280/vmlinuz-4.18.12-300.fc29.x86_64 resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap root=/dev/mapper/fedora-root ostree=/ostree/boot.0/fedora-atomic/8f228e830fca568a137305e295ef96bafa0d08b0e27aa8c133a3bd24fff07280/0
Oct 17 23:04:11 localhost kernel: zswap: loaded using pool lzo/zbud
Oct 17 23:04:11 localhost dracut-cmdline[204]: Using kernel command line parameters: BOOT_IMAGE=/ostree/fedora-atomic-8f228e830fca568a137305e295ef96bafa0d08b0e27aa8c133a3bd24fff07280/vmlinuz-4.18.12-300.fc29.x86_64 resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap root=/dev/mapper/fedora-root ostree=/ostree/boot.0/fedora-atomic/8f228e830fca568a137305e295ef96bafa0d08b0e27aa8c133a3bd24fff07280/0
Oct 17 23:05:41 localhost systemd[1]: dev-mapper-fedora\x2dswap.device: Job dev-mapper-fedora\x2dswap.device/start timed out.
Oct 17 23:05:41 localhost systemd[1]: Timed out waiting for device dev-mapper-fedora\x2dswap.device.
Oct 17 23:05:41 localhost systemd[1]: Dependency failed for Resume from hibernation using device /dev/mapper/fedora-swap.
Oct 17 23:05:41 localhost systemd[1]: systemd-hibernate-resume@dev-mapper-fedora\x2dswap.service: Job systemd-hibernate-resume@dev-mapper-fedora\x2dswap.service/start failed with result 'dependency'.
Oct 17 23:05:41 localhost systemd[1]: dev-mapper-fedora\x2dswap.device: Job dev-mapper-fedora\x2dswap.device/start failed with result 'timeout'.
Oct 17 23:05:41 localhost dracut-initqueue[484]: Scanning devices sda2  for LVM logical volumes fedora/root fedora/swap
Oct 17 23:05:41 localhost dracut-initqueue[484]: inactive '/dev/fedora/swap' [1.50 GiB] inherit
Oct 17 23:05:43 localhost.localdomain kernel: Adding 1572860k swap on /dev/mapper/fedora-swap.  Priority:-2 extents:1 across:1572860k FS

Fedora Server

[root@localhost ~]# journalctl | grep swap
Oct 17 22:46:45 localhost.localdomain kernel: Command line: BOOT_IMAGE=/vmlinuz-4.18.14-300.fc29.x86_64 root=/dev/mapper/fedora-root ro resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap rhgb quiet LANG=en_US.UTF-8
Oct 17 22:46:45 localhost.localdomain kernel: Kernel command line: BOOT_IMAGE=/vmlinuz-4.18.14-300.fc29.x86_64 root=/dev/mapper/fedora-root ro resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap rhgb quiet LANG=en_US.UTF-8
Oct 17 22:46:45 localhost.localdomain kernel: zswap: loaded using pool lzo/zbud
Oct 17 22:46:45 localhost.localdomain dracut-cmdline[199]: Using kernel command line parameters: BOOT_IMAGE=/vmlinuz-4.18.14-300.fc29.x86_64 root=/dev/mapper/fedora-root ro resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap rhgb quiet LANG=en_US.UTF-8
Oct 17 22:46:45 localhost.localdomain dracut-initqueue[298]: Scanning devices vda2  for LVM logical volumes fedora/root fedora/swap
Oct 17 22:46:45 localhost.localdomain dracut-initqueue[298]: inactive '/dev/fedora/swap' [1.50 GiB] inherit
Oct 17 22:46:46 localhost.localdomain systemd[1]: Found device /dev/mapper/fedora-swap.
Oct 17 22:46:46 localhost.localdomain systemd[1]: Starting Resume from hibernation using device /dev/mapper/fedora-swap...
Oct 17 22:46:46 localhost.localdomain systemd-hibernate-resume[395]: Could not resume from '/dev/mapper/fedora-swap' (253:1).
Oct 17 22:46:46 localhost.localdomain systemd[1]: Started Resume from hibernation using device /dev/mapper/fedora-swap.
Oct 17 22:46:46 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-hibernate-resume@dev-mapper-fedora\x2dswap comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 22:46:46 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-hibernate-resume@dev-mapper-fedora\x2dswap comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 22:46:47 localhost.localdomain kernel: Adding 1572860k swap on /dev/mapper/fedora-swap.  Priority:-2 extents:1 across:1572860k FS

On Atomic Host I'll highlight the dracut-initqueue message and also the timeout error:

Oct 17 23:05:41 localhost systemd[1]: dev-mapper-fedora\x2dswap.device: Job dev-mapper-fedora\x2dswap.device/start timed out.
Oct 17 23:05:41 localhost systemd[1]: Timed out waiting for device dev-mapper-fedora\x2dswap.device.
Oct 17 23:05:41 localhost systemd[1]: Dependency failed for Resume from hibernation using device /dev/mapper/fedora-swap.
...
...
Oct 17 23:05:41 localhost dracut-initqueue[484]: Scanning devices sda2  for LVM logical volumes fedora/root fedora/swap
Oct 17 23:05:41 localhost dracut-initqueue[484]: inactive '/dev/fedora/swap' [1.50 GiB] inherit

And the same for Fedora Server

Oct 17 22:46:45 localhost.localdomain dracut-initqueue[298]: Scanning devices vda2  for LVM logical volumes fedora/root fedora/swap
Oct 17 22:46:45 localhost.localdomain dracut-initqueue[298]: inactive '/dev/fedora/swap' [1.50 GiB] inherit
Oct 17 22:46:46 localhost.localdomain systemd[1]: Found device /dev/mapper/fedora-swap.
Oct 17 22:46:46 localhost.localdomain systemd[1]: Starting Resume from hibernation using device /dev/mapper/fedora-swap...

so it looks like something in dracut is causing LVM to get scanned early on Fedora Server, but not on Atomic Host.
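To put a number on the difference, here is a minimal sketch that measures the gap between two journal lines taken from the logs above. It assumes the short journal format (third field is HH:MM:SS) and that both lines fall on the same day; the helper names are made up for illustration.

```python
from datetime import datetime


def clock(line: str) -> datetime:
    # Field 3 of a short-format journal line is the HH:MM:SS stamp.
    return datetime.strptime(line.split()[2], "%H:%M:%S")


def stall_seconds(first_line: str, later_line: str) -> float:
    # Assumes both lines were logged on the same day.
    return (clock(later_line) - clock(first_line)).total_seconds()


# Lines copied from the two logs in this comment.
atomic_boot = "Oct 17 23:04:11 localhost kernel: zswap: loaded using pool lzo/zbud"
atomic_timeout = r"Oct 17 23:05:41 localhost systemd[1]: Timed out waiting for device dev-mapper-fedora\x2dswap.device."
server_boot = "Oct 17 22:46:45 localhost.localdomain kernel: zswap: loaded using pool lzo/zbud"
server_found = "Oct 17 22:46:46 localhost.localdomain systemd[1]: Found device /dev/mapper/fedora-swap."

print(stall_seconds(atomic_boot, atomic_timeout))  # 90.0
print(stall_seconds(server_boot, server_found))    # 1.0
```

So Atomic Host sits for 90 seconds before giving up on the swap device, while Fedora Server finds it within a second. The 90 seconds would be consistent with systemd's default job timeout, though I have not confirmed that is the setting in play here.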

Edited 5 years ago by dustymabe

dustymabe commented 5 years ago

@walters
> What Anaconda is doing here in hardcoding the resume= argument seems pretty broken to me. Why would I want to enable hibernation on my servers?

started a devel list discussion about this topic

walters commented 5 years ago

What changed in 29 might be the systemd hibernation implementation?

This also might be a server-side vs client-side initramfs thing. On classic systems, /etc/fstab will end up in the initramfs. That's not true for ostree-based ones by default.

That said Anaconda should be injecting the rd.lvm.lv bits on the kernel commandline and wait till they're probed.

Ahh...I have an idea; ordinarily nothing else in the initramfs is waiting for devices; the systemd fstab generator will handle /etc/fstab in the initramfs and make proper device/mount units for them. But now the resume generator is going out and running before dracut waits for devices or so?
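For anyone matching this up with the logs: the unit named in the timeout messages earlier in the thread, dev-mapper-fedora\x2dswap.device, is the systemd-escaped form of the resume= path. Below is a rough sketch of that escaping, written from memory of the documented rules rather than taken from systemd itself. It skips some details (for example, it does not collapse repeated slashes), and the function names are hypothetical.

```python
import string

# Characters that path escaping leaves untouched.
_SAFE = set(string.ascii_letters + string.digits + ":_.")


def escape_path(path: str) -> str:
    """Simplified approximation of `systemd-escape --path`."""
    trimmed = path.strip("/")
    if not trimmed:
        return "-"  # the root directory is special-cased
    out = []
    for i, ch in enumerate(trimmed):
        if ch == "/":
            out.append("-")
        elif ch in _SAFE and not (i == 0 and ch == "."):
            out.append(ch)
        else:
            # Everything else, including a literal "-", becomes \xXX.
            out.extend(f"\\x{b:02x}" for b in ch.encode("utf-8"))
    return "".join(out)


def resume_device_unit(resume_arg: str) -> str:
    # Hypothetical helper: the device unit a resume= path maps to.
    return escape_path(resume_arg) + ".device"


print(resume_device_unit("/dev/mapper/fedora-swap"))
# dev-mapper-fedora\x2dswap.device
```

That is why the "-" in fedora-swap shows up as \x2d in the journal: the plain dash is already used as the path separator.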

walters commented 5 years ago

OK so I can definitely reproduce this if I tweak the current FCOS config/image.ks to use LVM and have a swap partition.

dustymabe commented 5 years ago

> That said Anaconda should be injecting the rd.lvm.lv bits on the kernel commandline and wait till they're probed.

I definitely see them on the command line

Oct 17 23:04:11 localhost kernel: Command line: BOOT_IMAGE=/ostree/fedora-atomic-8f228e830fca568a137305e295ef96bafa0d08b0e27aa8c133a3bd24fff07280/vmlinuz-4.18.12-300.fc29.x86_64 resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap root=/dev/mapper/fedora-root ostree=/ostree/boot.0/fedora-atomic/8f228e830fca568a137305e295ef96bafa0d08b0e27aa8c133a3bd24fff07280/0
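If it helps anyone check their own machine, here is a small sketch that pulls the relevant arguments out of a kernel command line. The command line below is shortened from the log line above; the parser is illustrative only and does not handle quoted values containing spaces.

```python
from collections import defaultdict


def parse_kargs(cmdline: str) -> dict[str, list[str]]:
    """Group kernel arguments by key; bare flags get an empty-string value."""
    args = defaultdict(list)
    for token in cmdline.split():
        key, _, value = token.partition("=")
        args[key].append(value)
    return dict(args)


# Shortened from the Atomic Host command line shown above.
cmdline = (
    "resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root "
    "rd.lvm.lv=fedora/swap root=/dev/mapper/fedora-root"
)
kargs = parse_kargs(cmdline)
print(kargs["resume"])     # ['/dev/mapper/fedora-swap']
print(kargs["rd.lvm.lv"])  # ['fedora/root', 'fedora/swap']
```

On a running system the same function could be fed the contents of /proc/cmdline. A system with no swap (like our cloud images) would simply have no "resume" key.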

> Ahh...I have an idea; ordinarily nothing else in the initramfs is waiting for devices; the systemd fstab generator will handle /etc/fstab in the initramfs and make proper device/mount units for them. But now the resume generator is going out and running before dracut waits for devices or so?

yeah that seems to be the case at least from my log output in https://pagure.io/atomic-wg/issue/513#comment-536727

> OK so I can definitely reproduce this if I tweak the current FCOS config/image.ks to use LVM and have a swap partition.

+1.. I'll dig into dracut a bit. Maybe we just disable systemd-hibernate-resume.service on ostree systems for now, but there might be some people on silverblue that want it. Not sure.

Edited 5 years ago by dustymabe

walters commented 5 years ago

My attempts to remove the systemd generator weren't enough; I eventually discovered there's also a /usr/lib/dracut/modules.d/95resume that parses the resume= kernel command line. But nuking that still doesn't seem to be enough.
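One way to hunt for other consumers of resume= would be to search an unpacked initramfs tree for every file that mentions it. The sketch below is just that idea, not a tested debugging recipe: where the tree comes from is left as an assumption, and the demo builds a throwaway directory with made-up file names so it runs anywhere.

```python
import os
import tempfile


def files_mentioning(root: str, needle: bytes = b"resume=") -> list[str]:
    """Return paths (relative to root) of regular files containing needle."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    if needle in fh.read():
                        hits.append(os.path.relpath(path, root))
            except OSError:
                continue  # unreadable file: skip it
    return sorted(hits)


# Demo on a fake tree; both file names and contents are invented.
with tempfile.TemporaryDirectory() as tree:
    samples = {
        "a/uses-resume.sh": "dev=resume=/dev/example\n",
        "b/unrelated.sh": "echo hello\n",
    }
    for rel, text in samples.items():
        path = os.path.join(tree, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as fh:
            fh.write(text)
    demo_hits = files_mentioning(tree)

print(demo_hits)  # ['a/uses-resume.sh']
```

Searching bytes rather than text means compiled binaries that embed the string are caught as well as shell scripts.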

walters commented 5 years ago

Here's what I have now, but it still times out. Removing the resume= kernel command line argument fixes it. I can't figure out what else is parsing it.

diff --git a/fedora-coreos-base.yaml b/fedora-coreos-base.yaml
index db3e42c..89bc4f8 100644
--- a/fedora-coreos-base.yaml
+++ b/fedora-coreos-base.yaml
@@ -106,6 +106,14 @@ postprocess:
     enable coreos-growpart.service
     EOF

+    # https://pagure.io/atomic-wg/issue/513
+    sed -i '/ConditionKernelCommandLine.*resume/d' /usr/lib/dracut/modules.d/98dracut-systemd/dracut-cmdline.service
+    rm -rf /usr/lib/dracut/modules.d/95resume
+    rm -vf /usr/lib/systemd/system/systemd-hibernate*.service \
+           /usr/lib/systemd/system/systemd-hibernate \
+           /usr/lib/systemd/systemd-hibernate-resume \
+           /usr/lib/systemd/system-generators/systemd-hibernate-resume-generator
+
     # Let's have a non-boring motd, just like CL (although theirs is more subdued
     # nowadays compared to early versions with ASCII art).  One thing we do here
     # is add --- as a "separator"; the idea is that any "dynamic" information should
diff --git a/image.ks b/image.ks
index 34df026..3efc9cc 100644
--- a/image.ks
+++ b/image.ks
@@ -11,7 +11,7 @@
 # coreos-assembler.

 # This line is interpreted by coreos-virt-install
-#--coreos-virt-install-disk-size-gb: 8
+#--coreos-virt-install-disk-size-gb: 40
 # We use this because Kickstart doesn't have a way to specify
 # the *total* size of the disk.
 text
@@ -32,6 +32,7 @@ network --bootproto=dhcp --onboot=on

 zerombr
 clearpart --initlabel --all
+
 # Add the following to kernel boot args:
 #  - ip=dhcp           # how to get network
 #  - rd.neednet=1      # tell dracut we need network
@@ -42,9 +43,12 @@ bootloader --timeout=1 --append="no_timer_check console=ttyS0,115200n8 console=t
 # See also coreos-growpart.service defined in fedora-coreos-base.yaml
 # You can change this partition layout, but note that the `boot` and `root`
 # filesystem labels are currently mandatory (they're interpreted by coreos-assembler).
+part pv.01 --grow
+volgroup coreos pv.01
 part /boot --size=300 --fstype="xfs" --label=boot
+logvol swap --name=swap --size=9000 --label=swap --vgname=coreos
 # Note no reflinks for /boot since the bootloader may not understand them
-part / --size=3000 --fstype="xfs" --label=root --grow --mkfsoptions="-m reflink=1"
+logvol / --name=root --size=3000 --fstype="xfs" --label=root --grow --mkfsoptions="-m reflink=1" --vgname=coreos

 reboot

Edited 5 years ago by walters

dustymabe commented 5 years ago

i'm still poking at this. it's hurting my head as well

walters commented 5 years ago

Perhaps the more correct thing to do here is for Anaconda to have an API to disable its automatic injection of resume=. I don't think we can do it in %post since that happens before the bootloader, and we don't have an API to manipulate them there.

walters commented 5 years ago

https://github.com/rhinstaller/anaconda/issues/1667

dustymabe commented 5 years ago

interestingly I just downloaded a rawhide silverblue image to test out https://pagure.io/pungi-fedora/pull-request/665 and noticed I did not hit this issue. I then tried rawhide Atomic Host and did see the issue. So it doesn't appear to affect silverblue (at least in rawhide).

dustymabe commented 5 years ago

and.. just confirmed I do not see this issue on f29 silverblue.

dustymabe commented 5 years ago

OK, I've managed to convert the f29 Silverblue system (the one that was not showing the issue) into one that does show the issue. Here are my rough notes right now:

rpm-ostree install dracut-network iscsi-initiator-utils
rpm-ostree initramfs --enable
reboot

I need to eat lunch. will pick this up after

dustymabe commented 5 years ago

ok since this problem was introduced in the f28 stream I decided to run an rpm-ostree bisect to see where it first started happening. The results probably won't surprise you:

Starting RPM-OSTree Bisect Testing...
: Using data file at: /var/lib/rpm-ostree-bisect.json
: Did not find device timeout. Test Passes
: Removed /etc/systemd/system/multi-user.target.wants/rpm-ostree-bisect.service.
: BISECT TEST RESULTS:
: Last known good commit:
:   5736e83 : 28.20180708.0 : 2018-07-08T20:03:31Z
: First known bad commit:
:   bc3aa17 : 28.20180711.0 : 2018-07-11T18:26:22Z
: libostree pull from 'fedora-atomic' for fedora/28/x86_64/atomic-host complete
  security: GPG: commit http: TLS
  non-delta: meta: 270 content: 0
  transfer: secs: 43 size: 2.7 MB
: ostree diff commit old: 5736e832b1fd59208465458265136fbe2aa4ba89517d8bdcc91bc84724f40a8e
: ostree diff commit new: bc3aa17a5ad6c04103563bd93c4c668996ef786ec04f989a3209a9887c8e982c
: Upgraded:
:   acl 2.2.52-20.fc28 -> 2.2.53-1.fc28
:   attr 2.4.47-23.fc28 -> 2.4.48-1.fc28
:   dracut 047-8.git20180305.fc28 -> 048-1.fc28
:   dracut-config-generic 047-8.git20180305.fc28 -> 048-1.fc28
:   dracut-network 047-8.git20180305.fc28 -> 048-1.fc28
:   kernel 4.17.3-200.fc28 -> 4.17.4-200.fc28
:   kernel-core 4.17.3-200.fc28 -> 4.17.4-200.fc28
:   kernel-modules 4.17.3-200.fc28 -> 4.17.4-200.fc28
:   libacl 2.2.52-20.fc28 -> 2.2.53-1.fc28
:   libattr 2.4.47-23.fc28 -> 2.4.48-1.fc28
:   openldap 2.4.46-1.fc28 -> 2.4.46-2.fc28
:   podman 0.6.5-1.git9d97bd6.fc28 -> 0.7.1-1.git802d4f2.fc28
:   python3-pytoml 0.1.16-1.fc28 -> 0.1.17-1.fc28
: Removed:
:   grubby-8.40-11.fc28.x86_64
: Added:
:   libkcapi-1.1.1-1.fc28.x86_64
:   libkcapi-hmaccalc-1.1.1-1.fc28.x86_64
Started RPM-OSTree Bisect Testing.

So, with everything we've learned so far, the problem is in code shipped in the dracut-network rpm and was introduced in dracut 048. The smoking gun also seems to be somewhere in the iscsi module, because when iscsi-initiator-utils isn't installed we don't have the problem.
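For those unfamiliar with bisecting, the idea behind the run above can be sketched as a binary search over the commit history. This is not the actual rpm-ostree-bisect code. The first and last versions in the demo list are invented placeholders, and the `is_bad` check is a stand-in for "boot the deployment and look for the device timeout in the journal".

```python
from typing import Callable, Sequence


def bisect_commits(
    commits: Sequence[str], is_bad: Callable[[str], bool]
) -> tuple[str, str]:
    """Return (last_good, first_bad).

    Assumes commits are ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and the bug never disappears once introduced.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[lo], commits[hi]


history = [
    "28.20180701.0",  # hypothetical older version, for illustration only
    "28.20180708.0",  # last known good, from the output above
    "28.20180711.0",  # first known bad, from the output above
    "28.20180720.0",  # hypothetical newer version, for illustration only
]

# Stand-in test: same-format version strings compare correctly as text.
result = bisect_commits(history, lambda v: v >= "28.20180711.0")
print(result)  # ('28.20180708.0', '28.20180711.0')
```

Each step halves the remaining range, so even a long history needs only a handful of reboots.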

Edited 5 years ago by dustymabe

walters commented 5 years ago

Ah. Fun. Thanks a lot for diving into this. Hmm...let's look at the source to those modules.

Though I am feeling that the right approach here is to turn it off in Anaconda.

walters commented 5 years ago

On the other hand, if we are needing to change this relatively sensitive part of Anaconda now, maybe the right thing is to just mark it as a Known Issue for FAH29? If it doesn't affect Silverblue since it's some dracut-network/iscsi hot mess then...eh. Let's focus on FCOS?

dustymabe commented 5 years ago

passing this off to the dracut team: https://bugzilla.redhat.com/show_bug.cgi?id=1641268

dustymabe commented 5 years ago

The fix for this is now in updates-testing. Please test the updates-testing ISO and add karma to the Bodhi update.

siosm commented 5 years ago

LGTM for FAH 29 but I have not yet tested if this does not break iscsi support.

dustymabe commented 5 years ago

> LGTM for FAH 29 but I have not yet tested if this does not break iscsi support.

According to Harald (upstream), the socket activation should be sufficient to not break things, but please do report if you find issues.

siosm commented 5 years ago

Will do. Thanks!

atomic-wg

#513 Hibernation enabled with ISO based installation

Closed: Fixed 5 years ago Opened 5 years ago by siosm.
