#3169 Weigh in on installer storage setup issue
Closed: Accepted 11 months ago by zbyszek. Opened 11 months ago by sgallagh.

There's a potentially-blocking issue under discussion at https://bugzilla.redhat.com/show_bug.cgi?id=2263964

The short version of this is that blivet-gui (and the automatic and "custom" storage setups) in Anaconda for Fedora 39 and earlier had functionality that allowed queuing up the storage setup and then applying it as a batch, whereas the cockpit-storage-based approach makes changes immediately as you go. My primary concern is that this will lead to data loss for those users who are trying to set up a dual-boot and mis-click.

Please have the discussion over on the Bugzilla ticket, but I want to have this ticket available for us to possibly vote on this as a FESCo issue/blocker if we need to.


There are kinda wider concerns that I think FESCo should consider here, too.

This is the Change that we are supposedly implementing, as it was previously approved by FESCo. The scope of that Change describes the way partitioning will be implemented like this:

"Guided partitioning

The current (GTK) Anaconda UI approach is to have three types of partitioning.

  • Automatic - do everything automatically
  • Custom - you can do everything with top-down approach where users work on mount points and specified what technology they want to use and how
  • Blivet-gui - added later as bottom-up approach which enables users to create the partitioning stack themselves manually

These methods are giving great freedom but each of these has its issues. For automatic, the issue is almost no customizations and not a clear output. For custom and blivet-gui, you need to understand the Linux storage really well to know what you are doing, which could be intimidating. Because of those issues, we decided to choose another approach, which we are calling guided partitioning. This type of partitioning is giving users paths with explanations of what will happen but does not overload them with too many options at once. These paths could be then customized. This solution was taken as the best compromise between the automatic (no customization) and custom/blivet-gui, which was too heavy and hard to maintain.

We will provide the recommended solution and improved customization based on the users feedback. However, in case someone is not happy about the recommended solution, we are going to provide a way to guide users, to create their partitioning themselves (with a tool of their choice) and then tell Anaconda how to use it. This method could be also used for easy re-installation of the existing system and we are planning to improve the experience in the future even more."

That is not how things have wound up. The "there is no in-line custom partitioning" approach was abandoned fairly early and webui grew a button to launch blivet-gui as a separate app. Then various problems with that were noted and blivet-gui got a special mode which tweaked various things to be more appropriate to this flow. Then there was a period where blivet-gui was actually embedded into webui. And then finally, this anaconda update from 2024-02-07 - I believe - changed it so the "Modify storage" button runs an embedded instance of the Cockpit storage module (with some behavioural customizations for the installer environment). Rawhide composes were broken at that point, so we did not get a Rawhide compose with the change in it until Fedora-Rawhide-20240210.n.1 over the weekend.

So, the scope of the Change has crept substantially from what was initially approved, including a major revision which has landed (without prior formal notice through any Fedora process) three days short of the generic "testable" deadline for all Changes. Beta freeze is in two weeks, and we are going through branching now, which takes quite a lot of effort for releng and me, and interrupts the flow of composes for a bit. So we have two or three weeks to completely revise our installer testing approach for this custom storage mechanism which has never been used in an OS install context before - and all its integration into the actual installer workflow - carry out all the tests, and go through the test/fix/retest/file new bugs/repeat cycle with the anaconda team. I am not really confident that this is sufficient time, and I'm concerned it puts the F40 schedule substantially at risk.

Hi @awilliam, we definitely didn't want to make this a silent action. It was an unfortunate mistake on our side -- we just forgot that this wasn't officially announced by us :(.

Anyway, could you please write here which issues this has raised for you? What needs to be done? If we know the list of issues raised by this, we might be able to help mitigate them somehow.

Well, that's hard, because the main concern is known unknowns, in Rumsfeld terminology.

The known knowns are relatively trivial: we know we need to rewrite all the tests, and then execute them all. I would have preferred to have more time to do that, because I have now thrown away all the work I was otherwise planning to do for the next week and a half and will spend it doing that instead (this was less necessary with the prior approach because we already had tests for blivet-gui, they would only have needed relatively minor tweaking for the webui workflow).

But then there will be known unknowns: the problems that are exposed by the testing. We don't know what all of those will be yet, but it would be pretty surprising if there weren't any. Kamil and I have both spent just a couple of hours poking at this since it landed and already filed multiple bugs. The issue that prompted this ticket was one of them, and it's a significant one.

We would usually run at least one test day for a change of this magnitude, so we will probably need to pull one of those together quite quickly. By its nature, that will probably result in several more bug reports which will need review and triage.

We have to go through those processes, assess the initial bug reports, address the ones that are found to be most critical, confirm the fixes, and try to get this new approach into shippable state ideally by March 5 (the last 'comfortable' date for building a Beta RC that would hit the early release target of March 12), all the while trying to keep an assessment of whether we are practically going to be able to get something of vaguely shippable quality by that date or at least only a week or two behind it, and when we should abandon the effort and trigger the contingency plan (back to gtkui) if we decide we can't.

I would love to be able to say "these are the exact issues we need to deal with", but I can't right now, because figuring out what issues there are is the first step in the process and it will take us at least a few days. It would have been a deal more comfortable if we could have done that several weeks or months ago.

BTW, for convenience for anyone who wants to look at what this whole new workflow actually looks like - grab https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20240212.n.0/compose/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-Rawhide-20240212.n.0.iso and run through an install. To see this "cockpit storage" workflow you need to click "Modify storage" on the first page of the installer proper.

Oh, I forgot to mention, there is another concern around this Change more generally: it relies on some significant integration work with the desktop team which seems to be at-risk. The Change includes a carefully-designed new workflow where the live image boots to a specific mode of gnome-initial-setup that does locale and keyboard layout configuration, passes that configuration to anaconda, then anaconda runs, writes some configuration into the installed system which indicates that locale/keyboard layout configuration has already happened, then on first boot of the installed system, g-i-s runs again but skips the locale/layout steps that have already happened.

This all relies on some changes to GNOME components that never completely got upstreamed, at least the following (there may be others I'm missing):

we have been carrying these as downstream patches, but this is now causing issues because the latter two conflict with subsequent changes in GNOME 46, and that has not been reconciled. So we can't update F40/Rawhide to current GNOME 46 unless we drop those patches, which would break this Change. The patches need to at least be rebased and ideally finished and merged upstream, but - AIUI - @rstrode is heavily involved in writing them but is currently busy with something else and does not have time to work on them. So we are kind of stuck there. CCing @catanzaro and @fmuellner for that angle.

I can look into refreshing the gnome-desktop patchset. Unfortunately it was intended to be integrated with gnome-control-center as well, but the gnome-control-center patchset was dropped here.

Another wrinkle here: it appears a significant change to how the cockpit storage-based workflow works is expected to land fairly soon, and will change the behaviour in various ways (it has already been referenced as the fix for several of the bugs we've filed so far). This makes it a bit awkward to update the test cases (manual and automated) because they will likely almost immediately be obsolete and have to be revised again. It also means some of the bugs we're finding are "already obsolete" but, OTOH, there may well be new bugs associated with that PR that we can't find yet.

For instance, as I'm revising them right now I have to instruct the tester to set mount points in the anaconda "assign mount points" screen for the partitions created in the cockpit UI even if they also specified mount points in the cockpit UI, but that will apparently no longer be necessary after that PR lands.

If possible, I would not like to kill the initiative based on known unknowns until there is a known-known reason to do that. We are trying to keep the test coverage high, so my hope is that we have already hit a lot of bugs.

About the integration part, we deliberately decided to split this from the integration logic to simplify it. And honestly I'm happy about it because it started these discussions sooner :).

I'm trying the installer. Some questions:
1. The installer says on the first page "Instalowanie na 0x1af4 (vda)" (with Polish messages; "Installing to 0x1af4 (vda)"). This is a strange identifier, nothing in udevadm info /dev/vda has this number, it's also not the device number or name. Where does the 0x1af4 come from?
2. The "Configure storage" part says "Sformatuj /dev/vda. Typ: EXT4" ("Format /dev/vda. Type: EXT4"), which has me confused. Does it want to put the fs directly on the whole disk? Also, I think we switched to btrfs as the default.
3. OK, I tried to add a partition, and I immediately fell into the trap we are talking about here. I didn't make a boot partition, but the dialogue in the previous point defaults to using the full disk, so now I have a file system, I click OK and the system tells me that I'm missing the biosboot partition.
4. And it actually created the file system on the whole device, with no partition table.
5. OK, so I start the "Configure storage" part again, and create a GPT table and I can then create a partition with an fs. There's a strange gotcha: because I'm using Polish, the size is displayed as "21,5" GB (with a comma), but when I edit that to "20,5", it gets interpreted as 0. I guess the user would figure this out because the dialogue refuses to proceed with 0 size, but it's ugly.
6. OK, I created / and /boot, and I return to the installation window, and it still suggests "Erase data and install". I guess I need to select "Mount point assignment" to not lose the work I just did?
7. But if I go to "Mount point assignment" now, then why did it ask me to assign mount points in the "Configure storage" part?
8. In part 6., I gave labels to both partitions, but the partition with VFAT is shown as "vda2", the partition with btrfs is shown with the label.
9. Earlier on, it said that "/" and "biosboot" are required. Now it says "/" is required and "/boot" is Recommended.
10. The localization is partial. E.g. I see "Wymagany" (Required) and "Recommended" and various other parts are not translated, but that's OK at this point.
11. OK, now I click "Zakończ" (i.e. Finish) and then I get a dialogue which says something like "progress will not be saved. Finish or Continue installation". Maybe "Przerwij/Abort" was intended? It is very confusing as is. There's also a "Dalej/Continue" button, but it's greyed out.
12. Ah, OK, it didn't allow me to continue because there was an empty row with "Dodaj punkt montowania/Add a mountpoint" that I didn't fill in. I didn't expect this to matter. When I remove that, I can "Continue".
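13. And now it says "System plików /boot nie może być typu vfat. Komputer oparty na BIOS-ie wymaga specjalnej partycji do uruchamiania za pomocą formatu etykiet dysków GPT. Aby kontynuować, proszę utworzyć partycję typu „Partycja startowa BIOS” („biosboot”) o rozmiarze 1 MiB na dysku vda." (roughly: "The /boot file system cannot be of type vfat. A BIOS-based computer requires a special partition to boot using the GPT disk label format. To continue, please create a partition of type 'BIOS boot partition' ('biosboot') with a size of 1 MiB on disk vda."), i.e. that I screwed up.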
14. Note that it tells me to create a 1 MiB partition, which I don't think is the right size.
14. Now I go back to the "Configure storage" dialogue, and I figure out that I need to set the partition of "/dev/vda" to "BIOS boot partition". I wasn't asked this before and it's entirely nonobvious that you need to do that when initially adding partitions.
15. Also, "biosboot" vs. "BIOS boot partition".
16. OK, I return to "Mount point assignment" and all my assignments have been lost. I only have two, but if I had a system with a few partitions, I would be getting angry.
17. Oh, and now it doesn't allow me to select /dev/vda2 for /boot anymore. It just stopped showing in the device dropdown.
18. OK, so I go back to "Configure storage" and I find out that instead of VFAT I can format the partition as "Bios boot partition". I do that, and the partition shows up as "Unformatted data".
19. But hope springs eternal, so I click "Return to installation" and "Mount point assignment", and, surprise, /dev/vda2 still doesn't show up as an option in the device drop down.
20. Oh, but maybe the "biosboot" unformatted data magic thingy is just there in the background and doesn't need a mount point to be assigned? So I set "/" to /dev/vda1 again, but leave /boot unassigned. But no, "Dalej/Continue" is still greyed out with the message that "/boot cannot be of type vfat". Oh, I need to actually click the Trashcan icon to remove the line.
21. After clicking Continue, it says it'll format "DISK1", i.e. the label I gave to the file system, again. And indeed, the UUID changed. There was no option to skip the formatting. Actually, on the previous page there was a checkbox whether to Reformat, but it seems to have been ignored.
22. I get to "Installing". It's visually slick, but doesn't provide any feedback what it's actually installing. I guess that's OK, but I have to say that I do miss at least the list of rpms being installed.
23. There is a "Send us feedback on your installation" link. When I click that, it opens a new tab (I guess), but the tab interface is disabled. Fortunately I know that ^w can be used to close the tab, but if I didn't I would be stuck.
24. I reboot and the system boots up correctly, yay!
25. There's a Time Zone configuration box, and in the search box it proposes cities and airports and weather stations? E.g. I search for "War[saw]" and I get "Warszawa-Okęcie, Poland" and "Warsaw Municipal Airport, Indiana, United States" and "Jasper Warden Automated Reporting Station, Alberta, Canada" and "Warsaw, Indiana, United States". The other Warsaw city I understand, but the "Bowling Green-Warren County Regional Airport, Kentucky" is a bit much.
26. Everything goes smoothly afterwards. It doesn't even make a fuss about "test" as password.
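
To illustrate the decimal-comma gotcha from point 5: a minimal sketch in Python of parsing a size field with a locale-specific decimal separator and rejecting bad input loudly instead of turning it into 0. This is not how cockpit or anaconda actually parse the field; `parse_size` and its `decimal_sep` parameter are hypothetical names made up for this example.

```python
from decimal import Decimal, InvalidOperation


def parse_size(text: str, decimal_sep: str = ".") -> Decimal:
    """Parse a user-entered size such as '20,5' (hypothetical helper).

    decimal_sep is the decimal separator of the UI locale, e.g. ',' for
    Polish. Malformed input raises ValueError rather than becoming 0.
    """
    cleaned = text.strip().replace(" ", "")
    if decimal_sep != ".":
        if "." in cleaned:
            # Ambiguous: could be a thousands separator or a typo.
            raise ValueError(f"unexpected '.' in size: {text!r}")
        cleaned = cleaned.replace(decimal_sep, ".")
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        raise ValueError(f"not a valid size: {text!r}") from None
    if not value.is_finite() or value <= 0:
        raise ValueError(f"size must be a positive number: {text!r}")
    return value
```

With this approach, `parse_size("20,5", decimal_sep=",")` yields the same value as `parse_size("20.5")`, and something like `"20,5,1"` is reported as an error instead of being accepted as zero.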

So overall, I think this works, but feels like early-alpha software. There is very much a feeling of "flying blind", i.e. you click some buttons, and they do what they say, or sometimes they do nothing, or some other thing, and it's very hard to understand what is happening and what is supposed to happen. I would be too scared to try this on a system with actual user data.

If I stick to the guided partitioning workflow, the whole thing is pretty smooth.
1. The language selection box is a bit surprising because I search for "pol" and it doesn't offer any choices. It took me a second to figure out that I need to click the hamburger below to expand the one completion. I think it'd be nicer if this happened automatically.
2. The problem with tabless-tabbed interface: for example, there's an "About" dialogue which has a link to github Anaconda repo, and without ^W there's no way to go back to the installation.

I'm trying to avoid doing really deep dives for now because the behaviour is going to change significantly when https://github.com/rhinstaller/anaconda-webui/pull/72 lands. that will change a lot of the paths you found confusing, e.g. 6. and 7. But who knows what other issues will lurk.

biosboot and /boot are two different partitions. biosboot is indeed an "unformatted data magic thingy [that] is just there in the background and doesn't need a mount point to be assigned" (it provides space for bits of grub that, before GPT disk labels, were stuffed into a bit of unassigned space on the disk between the partition table and the first partition). 1 MB is the right size for it, but note that current webui/cockpit has a known bug that if you set the size to 1 MB it will actually create the smallest partition it is possible to create, and this won't work; you have to set it to 2 MB (or larger, but it doesn't need to be any larger). It is required on BIOS-native installs to GPT-labelled disks. This isn't new, it is also the case in the current installer, but the current installer does have the "create partitions for you" button which handles creating it.
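
As a rough illustration of the rule described above (biosboot is required only for BIOS-native installs to GPT-labelled disks, and about 1 MiB is enough), here is a minimal Python sketch. It is not anaconda or blivet code; `Partition`, `biosboot_problems` and the string values `"bios"`/`"gpt"` are hypothetical names chosen for this example, and the 1 MiB minimum is taken from the installer message quoted earlier in this ticket.

```python
from dataclasses import dataclass
from typing import List

MIB = 1024 * 1024
# Assumption: minimum size taken from the "1 MiB" installer message.
BIOSBOOT_MIN_BYTES = 1 * MIB


@dataclass
class Partition:
    name: str
    size_bytes: int
    is_biosboot: bool = False  # partition type is "BIOS boot partition"


def biosboot_problems(firmware: str, disk_label: str,
                      partitions: List[Partition]) -> List[str]:
    """Return human-readable problems with the biosboot setup, if any."""
    # Only BIOS-native installs to GPT-labelled disks need biosboot.
    if not (firmware == "bios" and disk_label == "gpt"):
        return []
    candidates = [p for p in partitions if p.is_biosboot]
    if not candidates:
        return ["biosboot partition is missing"]
    return [
        f"{p.name}: biosboot is smaller than 1 MiB"
        for p in candidates
        if p.size_bytes < BIOSBOOT_MIN_BYTES
    ]
```

For example, a BIOS/GPT layout with only a root partition would report the missing biosboot, while adding a 2 MiB partition of the right type (the workaround for the size bug mentioned above) would pass.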

/boot is...your /boot partition. It is considered 'optional but recommended', and there's currently some debate among the devs about how exactly to best represent that. In gtkui, IIRC, it more or less meant "if you use the 'create partitions for me' button it'll be created, but if you do it all yourself and don't create one, nothing complains".

  22. I get to "Installing". It's visually slick, but doesn't provide any feedback what it's actually installing. I guess that's OK, but I have to say that I do miss at least the list of rpms being installed.

this is probably just because it's a live image. you don't get a list of RPMs when installing a live image with gtkui either, because there aren't any RPMs being installed, it's just dumping a disk image.

I'm trying to avoid doing really deep dives for now because the behaviour is going to change significantly when https://github.com/rhinstaller/anaconda-webui/pull/72 lands

Yeah. This is also not the best place for such discussions.

Returning to the original question and https://bugzilla.redhat.com/show_bug.cgi?id=2263964, that bug was closed and reopened, and there's clearly no chance of cockpit growing a different workflow in the near future. So now the question becomes how to handle this for F40.

@jkonecny We have 11 days until the Beta freeze. Can you summarize what our options here are from your point of view? In particular, what are the important pull requests we're waiting for?

I would propose the following general workflow:

  • https://github.com/rhinstaller/anaconda-webui/pull/72 was just merged, let's see how things look with that.
  • the QA people should decide whether the state of the changes to the "Storage configuration" part is good enough for a beta release. I think FESCo members (and other interested people) can be involved in the decision, but it should follow the normal blocker bug procedure with participants who understand this area and the expectations much better than FESCo.
  • if the decision is negative, we need to invoke the contingency mechanism. If we do that, we should do it about a week from now. Time is tight.

I think for those purposes, we can keep https://bugzilla.redhat.com/show_bug.cgi?id=2263964 for the discussion around the 'plan then apply' criterion, and aside from that, we can evaluate the more functional aspects of the installer against the current criteria (with my hasty revisions for webui) as best we can, once we have a compose with https://github.com/rhinstaller/anaconda-webui/pull/72 in it. I will try and get the team to do as much testing as possible once I've confirmed we have a viable image to test.

Will this be discussed at the next FESCo meeting? It would be good to have clarity on what we should be doing here; we're kinda waiting on FESCo's input for https://bugzilla.redhat.com/show_bug.cgi?id=2263964. For now we are working on the assumption that the criteria will be changed and we will attempt to ship Beta with webui/cockpit-storage, as that is the "most work required" path.

I'm honestly a little shocked that this isn't a thing. Being able to tally up actions and only submit it to udisks when we're ready to apply seems like a logical thing to do given all the other partitioning tools do similar things.
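
For illustration, a minimal sketch in Python of the "tally up actions, apply only when ready" model. It does not reflect any real blivet, udisks or Cockpit API; `StorageAction` and `ActionQueue` are hypothetical names, and the `apply` callables stand in for whatever would actually touch the disk.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StorageAction:
    description: str            # shown to the user for review
    apply: Callable[[], None]   # the destructive step, run only on commit


@dataclass
class ActionQueue:
    pending: List[StorageAction] = field(default_factory=list)

    def queue(self, action: StorageAction) -> None:
        """Record an action without touching the disk."""
        self.pending.append(action)

    def summary(self) -> List[str]:
        """What would happen if the user confirmed now."""
        return [a.description for a in self.pending]

    def discard(self) -> None:
        """Undo a mis-click: drop everything, nothing was applied."""
        self.pending.clear()

    def commit(self) -> int:
        """Apply queued actions in order; return how many were applied."""
        applied = 0
        while self.pending:
            self.pending[0].apply()
            self.pending.pop(0)  # removed only after it succeeded
            applied += 1
        return applied
```

The point of this design is that a mis-click only ever adds an entry to `pending`, which can be reviewed with `summary()` and thrown away with `discard()`; nothing destructive happens until `commit()` is called.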

From my perspective, I'd be worried if we actually released with this and got roasted for creating scenarios where users inadvertently blew away their systems and left them in broken states.

It reminds me too much of the days of Disk Druid in old Red Hat Linux days... :frowning:

Metadata Update from @ngompa:
- Issue tagged with: meeting

11 months ago

Throwing my 2 cents in here as a plain old regular user... I'm on team "plan-then-apply". Partitioning is already scary enough for new users... it should not be easy to accidentally delete your whole Windows partition and everything on it just because of a mis-click. When that happens to new users, they will immediately have a bad opinion of Fedora - and probably Linux as a whole - forever.

all the other partitioning tools do similar things.

Except GNOME Disks, of course... :unamused:

In my opinion, this shouldn't be seen as the installation workflow. We are still following the original Change, which said that we don't want to include custom partitioning in Anaconda. Our focus should be on guided partitioning, which is where we want to drive the improvements. We are trying to make it clear that this is an external tool which we are able to provide for users and make easier to use; however, we definitely want to express that any tool could be used. We don't force users to use this; especially on Live media, they can use anything they prefer and then follow the Mount point assignment workflow. And yes, you have to know what you are doing, and it should be documented.

However, in situations where you don't have an environment to do this, we would like to provide something for the user. These are mainly the Gnome Initial Setup workflow and remote installations. Mainly for these cases, we added a tool which could be easily integrated and is maintained and used by another project, so we don't have to focus on it later and can work more on the guided solution.

In short, Cockpit storage in Anaconda web UI is just a possibility and it is an external tool. We have to make sure that users understand that we aren't forcing them to use this tool, which was the situation with the GTK UI.

To summarize the options here:

  • Keep the current solution
    Obviously.

  • Keep the GTK UI and postpone web UI once again
    That would be unfortunate, in my opinion: we already slipped it once, and postponing it again wouldn't make the situation better. If there are challenges because it came late, feel free to share these and we can try to help. I'm also not sure what the benefit would be. We are fixing the currently proposed bugs pretty well and so far the progress is good.

  • Remove Cockpit storage from web UI
    Not sure what the benefit would be. We can do that, but the result would most probably be that the Gnome Initial Setup workflow needs to be dropped. (IIRC blivet-gui has a blocker which would be hard to resolve and there are no developers to work on that.)
    I want to note that I don't think the Installer team will come up with a great solution which would allow us to switch from Cockpit storage to something else or to change Cockpit storage. Most probably this situation would just be postponed.

I would also like to point out that Desktop SIG is not included here and it's their product we are talking about. It would be great to have their opinion on this first.

Can I still switch to a virtual console and use a command-line partitioning tool? (It's been a while since I did a fresh installation, but I think that is Ctrl-Alt-F2?) That's probably what I'd try as an advanced user if I found the GUI inadequate or too confusing (and I have done it that way in the distant past). If that is to be considered a supported method for advanced partitioning, I'd request that gdisk be available. I prefer gdisk to parted, but that is just a preference.

Yes, that is supported, and it is a reasonable request.

This was discussed during today's FESCo meeting:
* AGREED: Changes/AnacondaWebUIforFedoraWorkstation is postponed to F41.
FESCo and QA will help arrange mass-testing early in the F41 cycle, shortly after F40 GA. (+5, 0, 0)

Metadata Update from @zbyszek:
- Issue untagged with: meeting
- Issue close_status updated to: Accepted
- Issue status updated to: Closed (was: Open)

11 months ago

@amoloney Can you reassign the Change to F41 please?

aoife is on vacation, I'm going to do a round of Change updates today (there are quite a few to handle after the testable deadline ping).

Somebody also needs to own actually implementing the reversion, BTW. We did the same thing in the F40 cycle so we just need to remember what was involved and do it again. I believe it involved updating a few GNOME packages to drop the downstream patches that service this installer flow, then dropping anaconda-webui from the Workstation live kickstart.

Hi everyone, please join the discussion to resolve web UI partitioning requirements

https://discussion.fedoraproject.org/t/feedback-anaconda-web-ui-partitioning/108995
