Today I installed a package with rpm-ostree install and then ran rpm-ostree ex livefs because I didn't want to reboot right away. I was told, as is often the case with package layering, that some packages would be replaced and that replace wasn't specified, so I ran rpm-ostree ex livefs --replace. That didn't work. I don't recall the specific output, but after rebooting I found that my layered package wasn't installed.
I rebooted and selected my second/previous(?) boot entry in grub, and the system wouldn't boot at all. I went back to the other boot entry, tried to get my rpm-ostree status and noticed that the service wasn't running. The error message is:
Aug 08 12:31:45 sfjbrooks.localdomain systemd[1]: Starting RPM-OSTree System Management Daemon...
Aug 08 12:31:45 sfjbrooks.localdomain rpm-ostree[4987]: Reading config file '/etc/rpm-ostreed.conf'
Aug 08 12:31:45 sfjbrooks.localdomain rpm-ostree[4987]: error: Couldn't start daemon: Error setting up sysroot: readlinkat: No such file or directory
Aug 08 12:31:45 sfjbrooks.localdomain systemd[1]: rpm-ostreed.service: Main process exited, code=exited, status=1/FAILURE
Aug 08 12:31:45 sfjbrooks.localdomain systemd[1]: rpm-ostreed.service: Failed with result 'exit-code'.
Aug 08 12:31:45 sfjbrooks.localdomain systemd[1]: Failed to start RPM-OSTree System Management Daemon.
@walters suggested I post the result of this command:
# ls -ld /boot/loader*
lrwxrwxrwx. 1 root root 8 Aug 8 07:13 /boot/loader -> loader.0
drwxr-xr-x. 3 root root 4096 Aug 8 07:13 /boot/loader.0
I wonder if this is Silverblue-specific, or more of a bug in rpm-ostree?
Hmm. Did you have [Experimental] StageDeployments=true in /etc/rpm-ostreed.conf?
What's the output of ls -ald /ostree/boot.*/*/*/* ?
I do have [Experimental] StageDeployments=true in /etc/rpm-ostreed.conf
$ ls -ald /ostree/boot.*/*/*/*
lrwxrwxrwx. 1 root root 108 Aug 8 12:18 /ostree/boot.0.0/fedora-workstation/4e3b649dce917f9210948b8fd12433f664294c86c0660a09ef70f74cad1fd8a2/0 -> ../../../deploy/fedora-workstation/deploy/d3051fd505ecb778e2a0951f000b37f8b5f80944ac330122cc327bdc8c317a21.0
lrwxrwxrwx. 1 root root 108 Aug 8 12:18 /ostree/boot.0.0/fedora-workstation/4e3b649dce917f9210948b8fd12433f664294c86c0660a09ef70f74cad1fd8a2/1 -> ../../../deploy/fedora-workstation/deploy/d3051fd505ecb778e2a0951f000b37f8b5f80944ac330122cc327bdc8c317a21.1
lrwxrwxrwx. 1 root root 108 Aug 8 07:13 /ostree/boot.0.1/fedora-workstation/321d724b4aaed18afcfa63f1e9d7a7f34a7dc2720a07c892f752bdf08dd9138d/0 -> ../../../deploy/fedora-workstation/deploy/4e7e3d537d9a5758cc5ea248c36eb1124624e8e5d966d64f4f58679f4dc602aa.0
lrwxrwxrwx. 1 root root 108 Aug 8 07:13 /ostree/boot.0.1/fedora-workstation/4e3b649dce917f9210948b8fd12433f664294c86c0660a09ef70f74cad1fd8a2/0 -> ../../../deploy/fedora-workstation/deploy/d3051fd505ecb778e2a0951f000b37f8b5f80944ac330122cc327bdc8c317a21.0
lrwxrwxrwx. 1 root root 108 Aug 8 12:18 /ostree/boot.0/fedora-workstation/4e3b649dce917f9210948b8fd12433f664294c86c0660a09ef70f74cad1fd8a2/0 -> ../../../deploy/fedora-workstation/deploy/d3051fd505ecb778e2a0951f000b37f8b5f80944ac330122cc327bdc8c317a21.0
lrwxrwxrwx. 1 root root 108 Aug 8 12:18 /ostree/boot.0/fedora-workstation/4e3b649dce917f9210948b8fd12433f664294c86c0660a09ef70f74cad1fd8a2/1 -> ../../../deploy/fedora-workstation/deploy/d3051fd505ecb778e2a0951f000b37f8b5f80944ac330122cc327bdc8c317a21.1
OK, I think this is probably related to https://github.com/ostreedev/ostree/pull/1672 and https://github.com/projectatomic/rpm-ostree/pull/1456
Will I be able to recover from this?
Yeah, almost certainly, but it's a bit predicated on us figuring out exactly what's wrong. It's still not clear to me what's broken - I think it's something in the /ostree/boot.* symlinks but when I try to reproduce this starting from
ostree://fedora-atomic:fedora/28/x86_64/atomic-host Version: 28.20180722.0 (2018-07-23 00:38:05)
I get instead: Replacing /usr... error: No such metadata object 22e3a432406d2d9df2babffc800081913bc34e108f299a66754a1240041a76f2.commit which is the expected problem. It might
Anyway, does ostree admin deploy fedora-atomic:fedora/28/x86_64/workstation work? If you reboot into that, you should be on the base.
OK yep reproduced from ostree://fedora-atomic:fedora/28/x86_64/atomic-host Version: 28.20180804.0 (2018-08-04 19:52:51) Looking...
# ostree admin deploy fedora-atomic:fedora/28/x86_64/workstation
error: readlinkat: No such file or directory
OK, I'm still not sure why, but indeed there's an "orphaned" bootloader entry which is pointing at the rollback deployment. For me, just mv /boot/loader/entries/ostree-fedora-atomic-1.conf{,.bak} fixed it. Before doing this, run cat /proc/cmdline and look at the ostree=/ostree/boot.0/$stateroot/$somechecksum/$v arg - don't rename/delete a bootloader entry that contains that.
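The precaution above (don't touch the entry that matches the booted ostree= argument) can be sketched as a small read-only helper. This is only an illustration, not something rpm-ostree or ostree ships: the function names ostree_arg, entry_is_booted and classify_entries are hypothetical, and it assumes each bootloader entry file carries its ostree= argument as a whitespace-separated word. It renames nothing; it only prints which entries reference the booted deployment.

```shell
#!/bin/sh
# Hypothetical helpers: none of these names come from ostree or rpm-ostree.

# Print the first ostree=... word found in the kernel command line string $1.
# Returns non-zero if there is none.
ostree_arg() {
    printf '%s\n' "$1" | tr -s '[:space:]' '\n' | grep -m1 '^ostree='
}

# Succeed if entry file $2 contains the exact word $1 (for example
# ostree=/ostree/boot.0/<stateroot>/<checksum>/0).
entry_is_booted() {
    tr -s '[:space:]' '\n' < "$2" | grep -qxF -- "$1"
}

# For every *.conf in directory $2, print "KEEP <file>" if it references the
# ostree= argument from command line $1, otherwise "CANDIDATE <file>".
# Nothing is renamed or deleted.
classify_entries() {
    booted=$(ostree_arg "$1") || {
        echo "no ostree= argument found" >&2
        return 1
    }
    for entry in "$2"/*.conf; do
        [ -e "$entry" ] || continue
        if entry_is_booted "$booted" "$entry"; then
            echo "KEEP $entry"
        else
            echo "CANDIDATE $entry"
        fi
    done
}
```

On an affected system this would be called as classify_entries "$(cat /proc/cmdline)" /boot/loader/entries. A CANDIDATE line only means the entry is not the one currently booted; whether it is actually the orphaned one still has to be judged by hand.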
At this point let's take this to rpm-ostree upstream; I'll file an issue there.
https://github.com/projectatomic/rpm-ostree/issues/1495
OK, I'm still not sure why, but indeed there's an "orphaned" bootloader entry which is pointing at the rollback deployment. For me, just mv /boot/loader/entries/ostree-fedora-atomic-1.conf{,.bak} fixed it.
That appears to have done the trick, thanks!
Closing, since we are tracking this in https://github.com/projectatomic/rpm-ostree/issues/1495
Metadata Update from @otaylor: - Issue status updated to: Closed (was: Open)