Issue #65: Pacakge reset causes first run on new ostree - teamsilverblue

teamsilverblue

#65 Pacakge reset causes first run on new ostree

Closed 5 years ago by jakfrost. Opened 5 years ago by jakfrost.

Late last year, I was using fedora-toolbox on Silverblue. After a podman/buildah update, my fedora-toolbox broke. So I talked with @rishi about it and was guided to override those two packages, which solved my issue with using my pet container. Recently, with the new upgrade to fedora-toolbox, I tried to reset those downgrades, beginning with buildah. So I used a command something like this 'rpm-ostree override reset buildah https://koji.fedoraproject.org/koji/buildinfo?buildID=1157063' which failed. I then typed 'rpm-ostree install --allow-inactive buildah-1.5-1.gite94b4f9.fc29" which seemed to work, i did 'systemctl reboot'. On reboot I was faced with the first run user setup screen. I can repeat this, using a different package (podman in this case), which would seem to indicate it isn't related to those packages. This has also happened to me on an upgrade of Silverblue from F28 to F29.

jakfrost commented 5 years ago

Fun fact: I was notified of pending ostree updates (via Gnome SW) while I was filing this. After I finished I closed out everything (except Gnome SW) and applied updates, then did the restart and update for the ostree and codec's and firefox. When I rebooted, I was faced with the first login user setup screen. So it happens on an ostree update too!

jlebon commented 5 years ago

Hmm, I think I've seen reports of the initial setup showing up after an upgrade elsewhere too. Are all your configurations still in place otherwise?

rpm-ostree override reset buildah https://koji.fedoraproject.org/koji/buildinfo?buildID=1157063

To undo an override, you simply specify the package name on which the override was placed. E.g.

$ rpm-ostree override replace https://kojipkgs.fedoraproject.org//packages/podman/0.10.1.3/4.gitdb08685.fc29/x86_64/podman-0.10.1.3-4.gitdb08685.fc29.x86_64.rpm
$ # some time later...
$ rpm-ostree override reset podman

I then typed rpm-ostree install --allow-inactive buildah-1.5-1.gite94b4f9.fc29 which seemed to work

This likely doesn't do what you want. It'll make sure that buildah-1.5-1.gite94b4f9.fc29 is always installed in your tree. So on the next buildah bump, you'll get an error that buildah-1.5-1.gite94b4f9.fc29 was not found in the repos (and if somehow it were found, you'd get conflicts with the builtin one).

jakfrost commented 5 years ago

Hello,
Initially, I was trying override with reset option for either podman or buildah, I would get the message that there was nothing to do. When I checked their version it would show the earlier version I wanted to not use. So I did an rpm-ostree status to see if the packages were actually there, etc... anyway it showed them (podman and buildah) with their current up to date versions as inactive. So I thought to use --allow-inactive, although rpm-ostree balked at the switch/option. Yeah, i uninstalled the buildah rpm, then allowed the upgrade to go ahead.
As for the upgrade and new user thing, yes this is the third time it has happened to me personally, and there was another user I talked with on the Silverblue discussion site about it happening to them. I can access my original user afterwards, but I must go through the initial setup of another user to get there. The packages my original user had installed are all there less the layered ones, with the new user. The original user dotfiles, home, everything is still intact, so nothing lost. I suppose, I could have interrupted the bootup and gone directly to a prompt and logged in a terminal at start instead of Gnome. Then likely prevented this from happening by ???? creating a 'firstrundone' text file?

jakfrost commented 5 years ago

Is there a way for me to go back to the image of rpm-ostree prior to my updates being applied? I am thinking I can go through my shell history file to retrace the steps to create the problem.

jlebon commented 5 years ago

Yes, you should be able to do rpm-ostree deploy <VERSION>. If you can write down an exact procedure for reproducing the problem, it'll definitely be easier for someone to investigate.

jakfrost commented 5 years ago

Yes, you should be able to do rpm-ostree deploy <VERSION>. If you can write down an exact procedure for reproducing the problem, it'll definitely be easier for someone to investigate.

I did redeploy to the version I could find using ostree refs command. It was not far enough back.
So I have been going back as far as December 19 2018 version by guessing the version date and consequently number. I was wondering if there is a meta data file that hold info on the versions on my system. Also, could there be more than one version for a specific date, indicated by the last number .0 or .1 or some such? I have had success in getting back to those versions so far, just when I upgrade the issue doesn't happen yet. Also, I know for a fact that I had Podman and Buildah downgraded to specific versions for fedora-toolbox to work, would that have been part of the issue since it happened after I had tried to restore them to their proper release level due to the upgrade in fedora-toolbox (to 0.3). Something like dnf has with distro-sync would have been real nice.

Edited 5 years ago by jakfrost

jakfrost commented 5 years ago

Hello @jlebon ,
@dustymabe helped get me directed to where to look with ostree pull --commit -data-only and I was able to go back to where I know the timeframe is of the issue. I get the following printout from the command (showing only the pertinent timeframe)

The deploy for the 7th works, the one for the 11th (which was after I accepted defeat and just logged in as new user) works fine too. The two on 10th however are unreachable giving me the error message about

What I can tell you is I had my podman buildah set up as downgraded with rpm-ostree override command as per this info
"https://koji.fedoraproject.org/koji/buildinfo?buildID=1157063
https://koji.fedoraproject.org/koji/buildinfo?buildID=1157063
And also buildah down to 1.4.x.
downgrade to a 0.10.x version of podman"
from @rishi to get my fedora-toolbox working, which would have been for some time too since it broke for me after the initial release was upgraded (0.1 to 0.2 )? Is there anything file wise on my system that could help to diagnose this? I mean that you think I should paste somewhere. I don't mind testing something for it too if necessary, all my work is from my home at this point in time so it's easy to progress through things sometimes.

Edited 5 years ago by jakfrost

dustymabe commented 5 years ago

@jakfrost - I think the dates look a little off because the output is telling when the commits were signed in your local TZ but the version numbers of the commits represent the date in UTC when the compose was started.

Here are the commits for the 9th and 11th

commit d30e6754df8d97569cb90ce3eebb0b419c1afef7694fadd5526ac9ca52a1d49f
ContentChecksum:  1886731169822980471944f0173a5f0eb5326e46bdab9653d1a5b595eacb022d
Date:  2019-01-11 02:41:43 +0000
Version: 29.20190111.0
(no subject)

commit f701d4fac2af4d79dbcdc6a335b52e24cd85cd08d3a170957f592feb0a04667a
ContentChecksum:  7e9189688e34a21d9fa6316a4e26f735a71736281dce4d253e91f2c8ee17731d
Date:  2019-01-09 21:49:36 +0000
Version: 29.20190109.2
(no subject)

jakfrost commented 5 years ago

After spending the better part of two days trying to recreate this issue, I am still unable to. I was able to get back to the particular ostree image I was interested in, I just cannot get it to recreate the issue of first run after upgrade. Which makes me think it is likely something not part of the image itself, but the user system. Through the efforts of jogging my memory on the steps I executed that resulted in the issue noted, I realised that at least twice I had performed an rpm-ostree commit before rebooting into a previous commit. The last such time (in memory) podman was still at version 0.10.3 and still reporting nothing to reset with the override reset command. Buildah was for sure the rpm version I had downloaded (also noted above) and managed to get installed (through removing the base version). I know I had used ostree command's in the mix, but I remember that being for checking for the existing images, and no writing/committing function was ever used.

The screenshot above shows the output of both podman images and buildah images commands, done this morning before I continue trying. My point here is that I am pretty certain I was the creator of my issue as noted above, and so too were others likely that had the same problem. Having said that I am all for finding out what I did to avoid doing that specific thing again. In my day to day job there are times when the how is not as important as the why, which when applied to this issue brings me to the point that this approach to a workstation setup is new to a lot of us. Although I keep up to date on the newer technologies as they get rolled out, I am cognizant always that I generally don't employ most aspects of it at the levels of use that it was designed to be, and as a consequence misapply some commands likely, break some verboten rules (also likely) without realising it.

jakfrost commented 5 years ago

I have been going through my logs for the dates around the time this issue occurred and I noticed that just before the reboot for an ostree update, there was an error of Gimp failing to update due to remote disconnecting. I remember this as having happened on the same day as the issue. I forced a reboot when I saw that gdm had loaded the initial user setup after install dialogue. I did a hard reset to force the reboot. Which logs would be a good place to dig further into this?

jlebon commented 5 years ago

Unfortunately, if it's a subtle interaction of your system state in /etc and /var wrt the OSTree commit, it's possible indeed that it will be impossible to reproduce the issue. To confirm, other than the first user setup window, you didn't actually experience any data/config loss, right?

Which logs would be a good place to dig further into this?

The system journal is about the best you can do, esp. if it's something that happened a while ago already.

jakfrost commented 5 years ago

Unfortunately, if it's a subtle interaction of your system state in /etc and /var wrt the OSTree commit, it's possible indeed that it will be impossible to reproduce the issue. To confirm, other than the first user setup window, you didn't actually experience any data/config loss, right?

No loss of data, and the original user was intact right down to shell script's, et. al.

The system journal is about the best you can do, esp. if it's something that happened a while ago already.

I was doing a comparison between both the system and user journals, to try and piece together the steps (combined with the shell history file, and the rpm-ostree history). I think I should close this issue now, so as not to block any similar that may come along and be better diagnosed.

Metadata Update from @jakfrost:
- Issue status updated to: Closed (was: Open)

5 years ago

Metadata

Assignee

None

Tags

None

Blocking

None

Depending on

None

Milestone

None

Attachments 3

Screenshot_from_2019-01-25_13-32-42.png

Attached 5 years ago View Comment

Screenshot_from_2019-01-25_13-36-09.png

Attached 5 years ago View Comment

Screenshot_from_2019-01-26_08-09-13.png

Attached 5 years ago View Comment

teamsilverblue

Source Code

#65 Pacakge reset causes first run on new ostree Closed 5 years ago by jakfrost. Opened 5 years ago by jakfrost.

Metadata

Attachments 3

#65 Pacakge reset causes first run on new ostree

Closed 5 years ago by jakfrost. Opened 5 years ago by jakfrost.