#153 btrfs by default
Closed: Fixed 3 years ago by catanzaro. Opened 3 years ago by chrismurphy.

Changes/BtrfsByDefault

  • Solves #152
  • Contributes to the existing effort to solve #154 (formerly #98) by bringing IO isolation
  • Regular CLI/GUI tools sanely report free/used space as most users expect
  • Desktop integration would be nice, but isn't immediately required.

Brings other benefits to the table:

  • Compression
  • Reflinks
  • Online shrink
  • Integrity checking
  • Snapshots
  • Simple and comprehensive CLI for all features, one master
    ...

This proposal solves the most problems with the least amount of change needed, and no negative knock on effects.


Metadata Update from @chrismurphy:
- Issue tagged with: installation, meeting

3 years ago

Wiki is unavailable for some folks, so I've put it up here.

I just did a quick check of what btrfs looks like from a UI integration point of view (tested using VMs):

  • Files: everything looks like normal with btrfs
  • Disks: disk partitioning shows fine with btrfs, but subvolumes aren't listed. In comparison, they are shown for LVM. Plain partitions work fine too.
  • Disk Usage Analyzer:
    • With the current default (LVM & volumes for home and root), it's not possible to see how much disk space is being used. It doesn't show an overview of the disk usage as a whole and, for some reason, it doesn't show the available space in home either (see #101)
    • The situation with btrfs appears to be identical to the default - you see home and root listed, but you can't see % of space used overall.
    • Plain partitions and a single ext4 partition work better, since you can scan the root partition and get an overview of disk space usage. It's not intuitive, since you have to scan the localhost.localdomain location, and it's not really clear what that is, but it can be done.
    • In general, the initial screen of the app doesn't perform well in terms of how it presents the disk and the different locations to the user. It could do with work.

Disks: disk partitioning shows fine with btrfs, but subvolumes aren't listed. In comparison, they are shown for LVM. Plain partitions work fine too.

I assume that GNOME Disks still uses udisks2? UDisks2 supports managing Btrfs and subvolumes, last I checked. Perhaps this is just something where we're not using an API correctly?

Disk Usage Analyzer: [...] The app doesn't perform well however the disks are partitioned, and generally needs work

Unfortunately I agree with that assessment. :disappointed:

Yes, Disks uses udisks2.

The app doesn't perform well however the disks are partitioned, and generally needs work

I wouldn't spend any time trying to improve Disk Usage Analyzer. My plan is to remove it as soon as Usage is ready. (Of course, Usage has issues with btrfs as well.)

I suspect none of these will be fixed until after we change our default, but we should prioritize Disks and Usage.

Ah, so we're going to say goodbye to baobab (the package name I could almost never remember and for the life of me couldn't figure out how you get "Disk Usage" from "baobab")?

Ah, so we're going to say goodbye to baobab (the package name I could almost never remember and for the life of me couldn't figure out how you get "Disk Usage" from "baobab")?

That's unfortunate. It's a great tool for finding what is consuming your disk.

It's redundant with Usage. We don't need two tools that do the same thing. :)

As a non-filesystem guy, I have wanted something like btrfs/zfs, particularly for VM disks, for a long time. However, I have always been cautious because I had heard "one of the ways it fails is it eats your data." For me, and I suspect many others, this is a non-starter "bug" in a filesystem. However, I have been hearing very positive experiences from users of btrfs and am inclined to trust the experts.

As a result, my concern turns to the UX of a change. I and, I suspect, many fedora users do know how to use lvm even if in a rudimentary way through tools like system-storage-manager. I think we are glossing over the UX change for the many technical (but maybe not with filesystems) users we have.

So, for me, I would want to see a short glossary and a mapping of operations, similar to the yum->dnf or dnf->rpm-ostree ones, that will quickly answer "how do i do x?" I am willing to help with the "questions" side of the "faq" if needed.

I and, I suspect, many fedora users do know how to use lvm even if in a rudimentary way through tools like system-storage-manager.

I've never heard of system-storage-manager before now.

So, for me, I would want to see a short glossary and a mapping of operations, similar to the yum->dnf or dnf->rpm-ostree ones, that will quickly answer "how do i do x?" I am willing to help with the "questions" side of the "faq" if needed.

That seems useful, of course.

Rough draft for a LVM->Btrfs non-secret decoder ring for Langdon. It's missing the use case/how do I? portion.

@aday

Disks: disk partitioning shows fine with btrfs, but subvolumes aren't listed. In comparison, they are shown for LVM. Plain partitions work fine too.

Since subvolumes don't have size, and can't be resized, it might be best to leave them out of this particular app, i.e. treat them as directories.

@catanzaro

(Of course, Usage has issues with btrfs as well.)

Whereas in this case, it might be necessary to account for "shared extents" to avoid counting them more than once. (Reflinks, snapshots, and dedup all create shared extents.) Maybe there's a better place to track/organize these kinds of questions?

@aday

Disks: disk partitioning shows fine with btrfs, but subvolumes aren't listed. In comparison, they are shown for LVM. Plain partitions work fine too.

Since subvolumes don't have size, and can't be resized, it might be best to leave them out of this particular app, i.e. treat them as directories.

I think that it's important for some place in the desktop to display that the subvolumes exist. Two reasons for this:

  1. General transparency and intelligibility - so people can learn how the system has been put together, should they wish to
  2. Awareness that home can be reused when reinstalling

Maybe there's a better place to track/organize these kinds of questions?

These issues do seem to become unwieldy very quickly. At the same time, this discussion does seem relevant to the topic at hand.

I'd personally say that the Disks issue should be a requirement that's part of the change proposal.

Under normal circumstances, I'd also argue that Disk Usage Analyzer ought to work effectively with whatever changes we make to the filesystem, and should therefore also be part of the change that's being proposed.

That requirement is slightly complicated by the fact that the current setup is already broken with the Disk Usage Analyzer. In that light, it might be enough to have some evidence to the effect that btrfs doesn't make it impractical to improve the state of that application (or its potential replacements).

I think that it's important for some place in the desktop to display that the subvolumes exist.
1. General transparency and intelligibility - so people can learn how the system has been put together, should they wish to
2. Awareness that home can be reused when reinstalling

Difficulties doing this in Disks
- Seems outside Disks' own scope description: configuration of disks and media.
- Snapshots are subvolumes. Will the UI show all subvolumes? Or a subset? What filtering mechanism?
- SUSE's default installation uses ~10 subvolumes, quickly and automatically producing dozens more in the first week of use. There's no practical limit on the number of subvolumes, it's something like 2^64.
- Container apps (e.g. podman) can also make prolific use of snapshots. Show them all in Disks? Why?
- Subvolumes mounted with -o subvol are just bind mounts, so is the proposal to show all bind mounts? rpm-ostree/Silverblue make prolific use of bind mounts, and exposing how such systems are put together is pretty confusing. And subvolumes aren't always bind mounted, they could just be nested like a directory.
- The installer has to contend with these issues too. I think it promotes the Fedora home subvolume into a Fedora specific listing by parsing /etc/fedora-release and /etc/fstab and everything else is in an Unknown listing.
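
To illustrate the bind-mount point, here is a minimal sketch that reads text in the /proc/self/mountinfo format and lists, for each btrfs mount, the source device, the root inside the filesystem, and the mount point. The function name and the sample lines are made up for illustration; the field layout follows proc(5). Several mounts of one device with different roots are what subvolume mounts look like to the kernel.

```python
def btrfs_mounts(mountinfo_text):
    """Return (source, root_in_fs, mount_point) for every btrfs mount.

    Expects text in the /proc/self/mountinfo format: the fields before
    the " - " separator include the root (4th) and mount point (5th);
    the fields after it start with the filesystem type and the source.
    """
    found = []
    for line in mountinfo_text.splitlines():
        if " - " not in line:
            continue
        left, right = line.split(" - ", 1)
        left_fields = left.split()
        right_fields = right.split()
        if len(left_fields) < 5 or len(right_fields) < 2:
            continue
        fstype, source = right_fields[0], right_fields[1]
        if fstype == "btrfs":
            found.append((source, left_fields[3], left_fields[4]))
    return found


# Hypothetical sample, shaped like a default Fedora btrfs layout.
SAMPLE = """\
60 1 0:33 /root / rw,relatime shared:1 - btrfs /dev/sda3 rw,subvolid=256,subvol=/root
75 60 0:33 /home /home rw,relatime shared:30 - btrfs /dev/sda3 rw,subvolid=257,subvol=/home
80 60 8:1 / /boot rw,relatime shared:40 - ext4 /dev/sda1 rw
"""

if __name__ == "__main__":
    for source, root, mount_point in btrfs_mounts(SAMPLE):
        print(f"{source}: {root} mounted at {mount_point}")
```

Nested subvolumes that are not mounted separately never appear in this list, which is the second half of the difficulty described above.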

I'd personally say that the Disks issue should be a requirement that's part of the change proposal.

I get that subvolumes are slightly maddening and could be anything or anywhere. But I also don't think it's that urgent to solve. SUSE has shipped it in this state for 6 years. So I'm not convinced they should be shown here, certainly not all of them.

It is possible to set an XATTR on a subvolume. There have been discussions about having a subvolume type GUID, similar to the concept of GPT partition type GUIDs to indicate purpose/owner/domain, etc. That might be better done in a standard/defined XATTR that everyone agrees on? (freedesktop.org spec?)
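
A rough sketch of what such tagging could look like from Python on Linux, using the standard os.setxattr and os.getxattr calls. The attribute name user.example.subvolume-role and the set of roles are purely hypothetical; no such standard exists today, which is the gap described above.

```python
import os

# Hypothetical attribute name and role vocabulary; not part of any spec.
XATTR_NAME = "user.example.subvolume-role"
KNOWN_ROLES = {"root", "home", "snapshot"}


def encode_role(role):
    """Validate a role name and return the bytes stored in the xattr."""
    if role not in KNOWN_ROLES:
        raise ValueError(f"unknown subvolume role: {role!r}")
    return role.encode("utf-8")


def tag_subvolume(path, role):
    """Attach the role to the subvolume's top-level directory (Linux only)."""
    os.setxattr(path, XATTR_NAME, encode_role(role))


def read_role(path):
    """Return the stored role, or None if the path carries no tag."""
    try:
        return os.getxattr(path, XATTR_NAME).decode("utf-8")
    except OSError:
        return None
```

An installer or Disks could then call read_role on each subvolume it finds and only treat the ones tagged "home" as reusable user data, instead of guessing from names.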

I think that it's important for some place in the desktop to display that the subvolumes exist.
1. General transparency and intelligibility - so people can learn how the system has been put together, should they wish to
2. Awareness that home can be reused when reinstalling

Difficulties doing this in Disks
- Seems outside Disks' own scope description: configuration of disks and media.

I don't know where you've got that description from, but it seems pretty obvious that Disks ought to show how device storage is set up. Where else would a user go to see that?

  • Snapshots are subvolumes. Will the UI show all subvolumes? Or a subset? What filtering mechanism?
  • SUSE's default installation uses ~10 subvolumes, quickly and automatically producing dozens more in the first week of use. There's no practical limit on the number of subvolumes, it's something like 2^64.
  • Container apps (e.g. podman) can also make prolific use of snapshots. Show them all in Disks? Why?
  • Subvolumes mounted with -o subvol are just bind mounts, so is the proposal to show all bind mounts? rpm-ostree/Silverblue make prolific use of bind mounts, and exposing how such systems are put together is pretty confusing. And subvolumes aren't always bind mounted, they could just be nested like a directory.
  • The installer has to contend with these issues too. I think it promotes the Fedora home subvolume into a Fedora specific listing by parsing /etc/fedora-release and /etc/fstab and everything else is in an Unknown listing.

These are interesting questions. My immediate answer is that Disks ought to expose the information and features that users need for basic to intermediate (as opposed to advanced) storage tasks.

What that means in the context of btrfs is probably a conversation to have elsewhere. :smile:

I'd personally say that the Disks issue should be a requirement that's part of the change proposal.

I get that subvolumes are slightly maddening and could be anything or anywhere. But I also don't think it's that urgent to solve.

It's a question of which features we think should be available to users. One of the main arguments for btrfs has been to allow home reuse without hitting space limits. That implies that we want home reuse to be easy to do with btrfs and, I'd argue, that implies that users ought to be able to see when a subvolume for home is present. Otherwise, how will users know that home reuse is possible?

Logically, I don't think that it's correct to argue that btrfs is the best filesystem option because it allows home reuse, but then not do the work to expose that feature to users. Either we solve the problem for our users, or we don't.

I also agree with @aday that we should make GNOME Disks expose Btrfs features better (or at all, really).

openSUSE's case is... interesting. They don't rely very much on KDE Partition Manager or GNOME Disks, as their YaST Partitioner tool far outstrips both and is shipped by default on openSUSE already.

For the Workstation case, I think we'd want GNOME Disks to be better here. I'm also having conversations on the KDE side about improving things over time there too.

Upstream issues for Disks btrfs support:

I don't know where you've got that description from, but it seems pretty obvious that Disks ought to show how device storage is set up. Where else would a user go to see that?

I got it from Disks>About (opensuse tumbleweed). On Fedora there is no About, but the description in Software: Disks provides an easy way to inspect, format, partition, and configure disks and block devices.

A subvolume, on-disk, is a dedicated b-tree. They're more sophisticated than a directory, less sophisticated than a file system, and are not a block device.

Where I'd go look for them depends on context I suppose. Maybe Disks only shows the subvolumes listed in fstab or systemd native mounts? And maybe Files continues to show those same subvolumes as directories; but does somehow graphically represent subvolumes and their snapshots? There is quite a lot of metadata associated with subvolumes and snapshots that indicates their relationship to each other (if any), but offhand the API may be limited.
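
As a sketch of the "only show what's in fstab" idea: the function below takes fstab-format text and returns the btrfs mount points together with the subvolume each one names via the subvol= option. The function name and sample entries are invented for illustration, and it ignores subvolid= and systemd native mount units, which a real filter would also need to handle.

```python
def fstab_subvolumes(fstab_text):
    """Map mount point -> subvolume name for btrfs entries using subvol=."""
    mounts = {}
    for raw in fstab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # fstab fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "btrfs":
            continue
        mount_point = fields[1]
        for option in fields[3].split(","):
            if option.startswith("subvol="):
                mounts[mount_point] = option[len("subvol="):]
    return mounts


# Hypothetical fstab contents for a two-subvolume layout.
SAMPLE = """\
# /etc/fstab
UUID=abcd / btrfs subvol=root,compress=zstd:1 0 0
UUID=abcd /home btrfs subvol=home,compress=zstd:1 0 0
UUID=ef01 /boot ext4 defaults 1 2
"""

if __name__ == "__main__":
    for mount_point, subvol in fstab_subvolumes(SAMPLE).items():
        print(f"{mount_point} <- subvolume {subvol}")
```

Filtering this way would keep snapshots and container subvolumes out of the listing by construction, since they are normally not in fstab.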

More in the domain of Disks as it relates to Btrfs? Multiple device support.

It's a question of which features we think should be available to users. One of the main arguments for btrfs has been to allow home reuse without hitting space limits.

  • It could be specific to Anaconda, not sure.
  • If this idea were implemented to reuse a /home directory, would you say it should be visible in Disks as a distinct thing?

Otherwise, how will users know that home reuse is possible?

Numerous examples in Windows and macOS world of /Users (functional equivalent of /home) being reused without graphical representation - they just use text "preserve user files" and offer no other indication that /Users can be reused. I'm not saying it's a bad idea to figure out a way to represent it, I just don't agree it's so significant that it should be attached as a feature requirement.

Logically, I don't think that it's correct to argue that btrfs is the best filesystem option because it allows home reuse, but then not do the work to expose that feature to users. Either we solve the problem for our users, or we don't.

It is exposed in Anaconda. It seems like a leap to me that a thing exposed in Disks is a thing that a completely different piece of software will necessarily protect and allow to be reused. And while Anaconda does do a good job of protecting user data by default without explicitly saying so, it doesn't say so - and my guess is that on both Windows and macOS they decided to be explicit and verbose about it rather than risk ambiguity.

https://gitlab.gnome.org/GNOME/gnome-disk-utility/-/issues/10

This is just bad advice without context or constraints on what is shown. At one time I had some grandiose ideas about btrfs integration, but since Apple recently shipped functional equivalents to subvolumes and they have almost no UI for them, shrug I've reduced my expectations. And also, this RFE has no responses, no thumbs up, and it just doesn't seem compelling after six years - let alone one that would later translate into a must have feature before btrfs could be the default file system.

At one time I had some grandiose ideas about btrfs integration, but since Apple recently shipped functional equivalents to subvolumes and they have almost no UI for them, shrug I've reduced my expectations

To be fair here, Apple also doesn't generally want you to use those features. That's not a generally equivalent case. The features are mostly for Apple to implement its new OS lockdown features, not for the user.

What about performance and benchmarks? How much has changed these days compared to 2019? It was noticeably slower, sometimes 5x, for example in application start-up time. Why would we want to make our desktops slower instead of faster?

Also, what has really changed since the last discussion, according to authoritative and competent members in this area? It is a time bomb and always will be. In openSUSE, BTRFS is used only for / and XFS for /home, but Fedora wants to use BTRFS for the /home partition as well, not only for / (see #152). This means that when disaster happens, it will also happen to user data. Just imagine how this could affect Fedora's reputation in general.

... it seems pretty obvious that Disks ought to show how device storage is set up. Where else would a user go to see that?
...

Where I'd go look for them depends on context I suppose. Maybe Disks only shows the subvolumes listed in fstab or systemd native mounts? And maybe Files continues to show those same subvolumes as directories; but does somehow graphically represent subvolumes and their snapshots?

It's a question of which features we think should be available to users. One of the main arguments for btrfs has been to allow home reuse without hitting space limits.

It could be specific to Anaconda, not sure.
...

Otherwise, how will users know that home reuse is possible?

Numerous examples in Windows and macOS world of /Users (functional equivalent of /home) being reused without graphical representation - they just use text "preserve user files" and offer no other indication that /Users can be reused. I'm not saying it's a bad idea to figure out a way to represent it, I just don't agree it's so significant that it should be attached as a feature requirement.

Logically, I don't think that it's correct to argue that btrfs is the best filesystem option because it allows home reuse, but then not do the work to expose that feature to users. Either we solve the problem for our users, or we don't.

It is exposed in Anaconda. It seems like a leap to me that a thing exposed in Disks is a thing that a completely different piece of software will necessarily protect and allow to be reused.

From a UX perspective, I think that there are two aspects to this question.

Exposing the data preservation feature

The primary requirement here is that someone can verify that their current configuration will support preserving home on reinstall.

This could be via a very simple, fairly abstracted experience where we have a label somewhere in Disks that says "Data can be preserved on reinstall" or similar, and a corresponding "Preserve user data" check box in Anaconda.

If Anaconda doesn't have that check box, and we have to go with more of a DIY approach, we might need to expose more of the technical details, so users know that the home subvolume is present.

The reason Windows and Mac likely don't do this is that users can be generally confident that data can be preserved, due to the defaults being more established and less likely to be deviated from.

Storage transparency

Historically we have made the current storage configuration visible to the user, assuming it's relatively simple.

Now it could be that we have already deviated from this transparency expectation with EFI and boot partitions and so on - certainly disk partitioning is more complicated now than in the past. However, from a UX perspective I do think that it's worth considering to what extent we expect Fedora to provide a transparent view of the storage setup, given that:

  • it could be an established user expectation
  • we're targeting relatively technical users
  • part of our mission (I think) is to enable people, particularly newcomers to Linux, to learn the underlying technologies

@atim

What about performance and benchmarks? How much has changed these days compared to 2019? It was noticeably slower, sometimes 5x, for example in application start-up time. Why would we want to make our desktops slower instead of faster?

We've asked @josef about this (as it certainly has come up). Btrfs performance (throughput and latency) is on par with our standard filesystem configuration at the levels we operate at. We'd have to push to some very extreme levels to make Btrfs latency become an issue. Moreover, with stuff like transparent compression, we can actually make performance higher than it is with even XFS in most circumstances, since effective throughput would be higher by virtue of being able to read more data in less physical space.
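
A back-of-the-envelope version of the compression argument, with every number hypothetical rather than measured. It assumes that effective read throughput is the device throughput multiplied by the compression ratio, capped by how fast the CPU can decompress; that cap is an assumption of this sketch, not something stated above.

```python
def effective_read_mbps(device_mbps, compression_ratio, decompress_mbps):
    """Estimate logical MB/s delivered to applications.

    device_mbps: physical read speed of the drive
    compression_ratio: logical size / on-disk size (1.0 = incompressible)
    decompress_mbps: assumed ceiling imposed by CPU decompression
    """
    return min(device_mbps * compression_ratio, decompress_mbps)


if __name__ == "__main__":
    # Hypothetical SATA SSD, data that compresses to two thirds of its size.
    print(effective_read_mbps(500, 1.5, 2000))
    # Incompressible data gains nothing.
    print(effective_read_mbps(500, 1.0, 2000))
```

The gain depends entirely on how compressible the data is, so already-compressed media files would see none of it.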

Also, what has really changed since the last discussion, according to authoritative and competent members in this area? It is a time bomb and always will be. In openSUSE, BTRFS is used only for / and XFS for /home, but Fedora wants to use BTRFS for the /home partition as well, not only for / (see #152). This means that when disaster happens, it will also happen to user data. Just imagine how this could affect Fedora's reputation in general.

openSUSE Tumbleweed switched to Btrfs for /home over a year ago. My most recent setups with openSUSE Tumbleweed have had YaST always proposing everything on Btrfs. openSUSE Leap, by virtue of being based on SUSE Linux Enterprise 15, inherits the older configuration from when SUSE Linux Enterprise branched from openSUSE Tumbleweed about 2~3 years ago.

In fact, the major difference between openSUSE's setup and Fedora's is that openSUSE's has more subvolumes because they want to add more controls for automatic snapshotting. We're not enabling automatic snapshotting by default right now, and I am not convinced all those extra subvolumes are needed in that scenario.

What about performance and benchmarks? How much has changed these days compared to 2019? It was noticeably slower, sometimes 5x, for example in application start-up time. Why would we want to make our desktops slower instead of faster?

I asked Chris about these Phoronix benchmarks a couple of months ago, and he posted a long rebuttal about how the results are "self-evidently wrong." Chris, is it OK for me to paste that here, or shall I let you field this one...?

Also, what has really changed since the last discussion, according to authoritative and competent members in this area? It is a time bomb and always will be. In openSUSE, BTRFS is used only for / and XFS for /home, but Fedora wants to use BTRFS for the /home partition as well, not only for / (see #152). This means that when disaster happens, it will also happen to user data. Just imagine how this could affect Fedora's reputation in general.

So far, the only concrete "issue" with btrfs that we've identified is that it detects failing hard drives more reliably than other filesystems, and will only mount them read-only, which could prevent you from booting your computer when you would be able to boot fine with ext4. That's a design decision to prevent further data corruption. If you're aware of evidence that btrfs is a "time bomb," please provide it.

So far, the only concrete "issue" with btrfs that we've identified is that it detects failing hard drives more reliably than other filesystems, and will only mount them read-only, which could prevent you from booting your computer when you would be able to boot fine with ext4. That's a design decision to prevent further data corruption.

We could maybe add that systemd people are slowly working on allowing boot in this read-only case. So it shouldn't be a problem in the future.

If you're aware of evidence that btrfs is a "time bomb," please provide it.

TBH I don't have any. This is based mostly on talks with our people and their feedback; even those who were positive about it said after some time that they had lost data and ended up with a broken file system.

Anyway, I am trying to be more optimistic about it now and want to experiment with it as well. :) Also, after some talks and some investigation, I understand now that it could be even faster than current filesystems, as @ngompa said. Worth a try.

I think filesystem is something where we should be very, very cautious with how we proceed. It's very important that we don't lose users' data, and switching to a less tested file system can easily lead to disasters. Right now, ext4 and xfs are the ones that are used by most distros and most of the Linux user base, so they should be the ones that are getting the most testing.

I'm not convinced that it's Fedora that has to do the btrfs spearheading here. Maybe it's OK to let other distros switch to it first. People who know me know that I'm usually very pro improvements and getting latest cutting edge stuff into Fedora, but file system is one area where I like to be conservative.

At the same time, I recognize that there's a really nice feature in btrfs that would neatly solve the / vs /home partitioning issue -- subvolumes. None of the other file systems have that.

I think I'd personally lean towards moving from the current partitioning scheme to a single big ext4 and not switch to btrfs yet (at least not for Workstation -- perhaps it would be nice to experiment in some Fedora spin and see how it goes with btrfs).

But if we do switch to btrfs then let's make use of subvolumes. Just a single big btrfs partition offers no user-visible improvements over a single big ext4 that I am aware of.

I'm going to be gone for 3 weeks so consider me -1 to btrfs and +1 to a single big ext4 partition if it comes down to a vote.

@kalev

I think filesystem is something where we should be very, very cautious with how we proceed. It's very important that we don't lose users' data, and switching to a less tested file system can easily lead to disasters. Right now, ext4 and xfs are the ones that are used by most distros and most of the Linux user base, so they should be the ones that are getting the most testing.
I'm not convinced that it's Fedora that has to do the btrfs spearheading here. Maybe it's OK to let other distros switch to it first. People who know me know that I'm usually very pro improvements and getting latest cutting edge stuff into Fedora, but file system is one area where I like to be conservative.

Does the fact that openSUSE and SUSE Linux Enterprise have both done this by default since 2014, and that openSUSE has used a configuration similar to ours since 2018, mean nothing? I think at this point we are far from being first, and I've personally carefully watched what they've done and applied lessons learned from their efforts. I've done a fair bit of work in Fedora to pull in what they've done for opening integration possibilities lower in the stack, as well.

I'm very comfortable with how well Btrfs is supported upstream, and I'm confident that we can support it quite well in Fedora with folks like @josef helping, and leveraging the experience/expertise in openSUSE to help us.

At the same time, I recognize that there's a really nice feature in btrfs that would neatly solve the / vs /home partitioning issue -- subvolumes. None of the other file systems have that.
I think I'd personally lean towards moving from the current partitioning scheme to a single big ext4 and not switch to btrfs yet (at least not for Workstation -- perhaps it would be nice to experiment in some Fedora spin and see how it goes with btrfs).
But if we do switch to btrfs then let's make use of subvolumes. Just a single big btrfs partition offers no user-visible improvements over a single big ext4 that I am aware of.
I'm going to be gone for 3 weeks so consider me -1 to btrfs and +1 to a single big ext4 partition if it comes down to a vote.

We're definitely doing subvolumes for our Btrfs layout. We're using the default configuration Anaconda does now, which does / and /home as subvolumes.

We're definitely doing subvolumes for our Btrfs layout. We're using the default configuration Anaconda does now, which does / and /home as subvolumes.

I am confused why you are using wording like "we are definitely doing" -- this is, after all, a ticket to discuss how to do things. If you are talking about your own personal preference, please use "I" and "would like to" instead of "definitely doing".

Otherwise, thanks for the reply -- sounds like things are in a good shape :)

I think filesystem is something where we should be very, very cautious with how we proceed. It's very important that we don't lose users' data, and switching to a less tested file system can easily lead to disasters. Right now, ext4 and xfs are the ones that are used by most distros and most of the Linux user base, so they should be the ones that are getting the most testing.

I take issue with the "less tested" statement here. We have 207 btrfs specific tests in xfstests, in addition to the 600 that are run on all file systems. Facebook has run btrfs in production since 2015, to the point that almost all of our root drives are btrfs in the fleet, which numbers in the millions of machines. On top of that previous iterations of our container deployments utilized btrfs loop back devices, so each machine had generally 4-5 btrfs file systems mounted at the same time, putting the number of btrfs file systems we use in the tens of millions.

I'm for being cautious, I'm a fs developer and so am more cautious than most by nature. But this is not some experimental file system. This is a battle tested fs that's a core part of the Facebook infrastructure. There's plenty of work to be done, but let's not continue propagating this idea that it's somehow this fragile dangerous thing.

We're definitely doing subvolumes for our Btrfs layout. We're using the default configuration Anaconda does now, which does / and /home as subvolumes.

I am confused why you are using wording like "we are definitely doing" -- this is, after all, a ticket to discuss how to do things. If you are talking about your own personal preference, please use "I" and "would like to" instead of "definitely doing".

I used that wording because Anaconda does this for Btrfs installs by default already. That's irrespective of the switch to Btrfs by default.

Otherwise, thanks for the reply -- sounds like things are in a good shape :)

Indeed. I wouldn't have pushed for this if I didn't think it was ready. :wink:

@aday

From a UX perspective, I think that there are two aspects to this question.
Exposing the data preservation feature
The primary requirement here is that someone can verify that their current configuration will support preserving home on reinstall.

This requirement isn't met now in Disks with LVM (or plain partitions). It's left up to the user to connect the dots: what is a block device and knowing the old convention of /home being reusable (which lately isn't as common on many linux distros, having switched to one big ext4).

Improving the UI/UX, even raising the bar for all storage types, is great. Disks needs a variety of improvements. But I'm not clear whether 'requirement' for this particular conveyance of subvolume reuse suggests it should be a blocking enhancement. Do you want to create a RHBZ for this? Setting the blocks field to 1851166 will track it as part of the change proposal.

This could be via a very simple, fairly abstracted experience where we have a label somewhere in Disks that says "Data can be preserved on reinstall" or similar, and a corresponding "Preserve user data" check box in Anaconda.

Default partitioning UI doesn't permit any reuse for any storage type. Custom partitioning UI preserves user data by default but does not assign it to a mount point (i.e. /home), the user needs to explicitly do that. I think the difficulty is unambiguously identifying the 'home' device to reuse, it's not tagged with unambiguous metadata for this purpose.

If there were such a checkbox, I don't know what unchecked would translate into, in terms of installer behavior because it doesn't unambiguously know what user data is.

If Anaconda doesn't have that check box, and we have to go with more of a DIY approach, we might need to expose more of the technical details, so users know that the home subvolume is present.

The user sees existing subvolumes as a device, just the same as any partition or LVM LV. Including home (or whatever name/label they gave it when it was created).

I'm gonna take a pass on Phoronix related things in this issue. If it comes up on devel@ though...

@chrismurphy

This could be via a very simple, fairly abstracted experience where we have a label somewhere in Disks that says "Data can be preserved on reinstall" or similar, and a corresponding "Preserve user data" check box in Anaconda.

Default partitioning UI doesn't permit any reuse for any storage type. Custom partitioning UI preserves user data by default but does not assign it to a mount point (i.e. /home), the user needs to explicitly do that. I think the difficulty is unambiguously identifying the 'home' device to reuse, it's not tagged with unambiguous metadata for this purpose.

If there were such a checkbox, I don't know what unchecked would translate into, in terms of installer behavior because it doesn't unambiguously know what user data is.

YaST Installation lets you carry over the user data (basically /home contents, username and password) for every user in the existing installation you have on the system. I don't know how it does it, but considering my experience with the codebase, I wouldn't want to know either. It isn't exactly a checkbox though, it's a modal with a list of users, in which you have to check the checkboxes to restore them.

Shared extents, also called reflinks, or reflink copies, are a bit of a mind trip. This might be an integration opportunity. For example in Usage. Possibly even Files. You might use them as a file level snapshot. And you'd probably want some way of knowing that if you make 100 copies (for whatever silly reason, because I can), it only costs 1 copy and a little extra metadata.

Shared extents are created by cp --reflink on both xfs and btrfs. And by snapshots and dedup on btrfs. Extents are just a range of blocks, so shared blocks is not incorrect. A less common term is 'efficient copies'. Also, cp is about to start doing cp --reflink=auto by default in the next version of coreutils.

df will report the truth that these copies don't take up more space
du sorta tells the truth in that each file really is the size that it is, but also doesn't tell the whole truth in that they are pointing to the same data. So du will over report.
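To make the df vs du difference concrete, here is a minimal toy model in Python. It is only a sketch: the extent IDs, the `EXTENT_SIZE` value, and the function names are assumptions made up for illustration, and it ignores the small amount of extra metadata each copy costs.

```python
# Toy model: a file is a list of extent IDs; a reflink copy reuses the same IDs.
EXTENT_SIZE = 4096  # bytes per extent in this toy model (an assumption)


def reflink_copy(extents):
    """Return a new file that shares every extent with the original."""
    return list(extents)


def du_style(files):
    """Sum each file's apparent size, so shared extents are counted once per file."""
    return sum(len(extents) * EXTENT_SIZE for extents in files.values())


def df_style(files):
    """Count each distinct extent once, the way used space on the filesystem grows."""
    unique = set()
    for extents in files.values():
        unique.update(extents)
    return len(unique) * EXTENT_SIZE


files = {"original": [1, 2, 3, 4]}
for i in range(100):
    files[f"copy{i}"] = reflink_copy(files["original"])

print("du-style total:", du_style(files))  # 101 files x 4 extents x 4096 bytes
print("df-style used: ", df_style(files))  # 4 extents x 4096 bytes
```

In this model the 100 copies add nothing to the df-style number, while the du-style number grows with every copy.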

Possibly Properties and Usage could show something similar to btrfs filesystem du where it shows total, exclusive, and set shared (for a file, directory or subvolume/snapshot). It's not just one value to indicate the amount of things, because of sharing. If a file is partly changed, the changed part is 'exclusive' data to that file. There are ioctls for figuring this out but I'm pretty sure it requires privilege to work.
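As a rough illustration of why one number isn't enough, the sketch below computes total, exclusive, and set shared for a small set of files. It is a toy model, not how btrfs does it: extents are plain integers, `usage_report` is a hypothetical helper, and "exclusive" here only considers the files passed in rather than the whole filesystem.

```python
from collections import Counter

EXTENT_SIZE = 4096  # bytes per extent in this toy model (an assumption)


def usage_report(files):
    """Return (per-file totals and exclusive bytes, bytes shared within the set)."""
    # Count how many files in the set reference each extent.
    refs = Counter()
    for extents in files.values():
        refs.update(set(extents))

    report = {}
    for name, extents in files.items():
        distinct = set(extents)
        exclusive = [e for e in distinct if refs[e] == 1]
        report[name] = {
            "total": len(distinct) * EXTENT_SIZE,
            "exclusive": len(exclusive) * EXTENT_SIZE,
        }

    # Extents referenced by more than one file, counted once.
    set_shared = sum(1 for count in refs.values() if count > 1) * EXTENT_SIZE
    return report, set_shared


# "b" starts as a reflink copy of "a", then its last extent is rewritten.
files = {"a": [1, 2, 3, 4], "b": [1, 2, 3, 9]}
report, set_shared = usage_report(files)
for name, row in report.items():
    print(name, row)
print("set shared:", set_shared)
```

Each file still reports its full size as total, but only the rewritten part (and the original extent it replaced) shows up as exclusive.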

Tricky.

Metadata Update from @chrismurphy:
- Issue untagged with: meeting

3 years ago

FESCo has unanimously approved this change. The WG can either make its own decision, or just decline to do anything and allow the change to be implemented.

The anaconda PR is https://github.com/rhinstaller/anaconda/pull/2688/files

The WG can either make its own decision, or just decline to do anything and allow the change to be implemented.

It would be wrong for this change to happen without the WG making a decision one way or another. It's our product and we're responsible for it.

During yesterday's working group meeting, we discussed whether it would be better to introduce btrfs for F33 or F34. In response to that discussion, I made the argument that the important factor is ensuring that the change is sufficiently tested, not which release we target.

It's therefore worth discussing what we think constitutes sufficient testing for a change like this. My current view is that, given the sensitive and significant nature of the change, we ought to have a rigorous round of testing prior to rolling the change out.

To me this would mean:

  • having a reasonably high number of Fedora testers using btrfs as their daily driver, for a period of time
  • having a reliable way of ascertaining whether our testers had any issues
  • identifying any particular use cases that we have concerns about, and specifically testing those

Personally speaking, this is an important condition for implementing the btrfs change. I'd be happy to help develop and run a test plan with these goals in mind.

I think it's somewhat unlikely that we'll decide that Workstation should go its own way here - especially since we let this go to FESCO without expressing an opinion. But that being said, I have concerns about how we go live with this in 2-3 months. I'd like to hear more.

Particular concerns:

  • We know that an unmountable root partition as the result of hardware errors is more common with btrfs than with ext4. But we have no idea how common. The Fedora experience when you have an unmountable root partition is about as bad as you could imagine. Do we need to put effort into improving this prior to F33 beta?

  • The number of people testing Fedora with btrfs on their main machines is currently extremely small. We think we'll have more people testing this over the next few months with the decision to switch to btrfs. Is this going to just be some FESCO and WG members, or will we succeed in getting hundreds of people testing this? How will we know?

  • We are told that some workloads involving VMs and databases result in poor performance and/or badly fragmented filesystems unless you set the +C (nodatacow) option on the containing directory before you create the files in it. What are we doing about this concretely before F33?

  • We have no performance numbers for desktop/developer workloads. We have anecdotal evidence that it's perceptually identical, but certainly nothing to point to if people say that F33 is slow/sluggish and blame it on btrfs.

A quote from: https://arstechnica.com/gadgets/2020/05/linux-distro-review-fedora-workstation-32/

I initiated a reboot on each VM first—and despite clicking Fedora's reboot button before Ubuntu's, Ubuntu got to the desktop noticeably faster. A similar test with a first launch of Firefox after the reboot also came in quicker on Ubuntu. This confirmed my "seat-of-the-pants" impression on the laptop itself—Fedora can be a bit slower than Ubuntu with some tasks.

We should at the bare minimum be doing this sort of testing ourselves to show that Fedora on btrfs is up to Fedora on ext4, adding in a kernel compile maybe.

I'm also thinking that we should have some guide about BTRFS if we approve this, explaining to ordinary users what BTRFS is, what the benefits are, and why Fedora is taking these steps. It's a huge step for Fedora, so we have to communicate as best as we can.

@aday

Personally speaking, this is an important condition for implementing the btrfs change. I'd be happy to help develop and run a test plan with these goals in mind.

There are many test cases and release criteria that should hold Btrfs to the same standard as LVM+ext4. But, I expect it will be the edge cases for which there aren't test cases or criteria that will be the source of questions. Also, QA meetings are Mondays @ 1500 UTC in #fedora-meeting.

Yesterday's test day results

@otaylor

We know that an unmountable root partition as the result of hardware errors is more common with btrfs than with ext4. But we have no idea how common.

That is hypothetically true because it has more critical metadata than ext4. As it turns out, these patterns are sufficiently rare that in practice Btrfs, ext4, and xfs falling over are in the same ballpark. We do know this from Josef's reporting. The actual difference of concern is that with ext4 the first and only attempt to fix this problem is fsck, whereas with Btrfs the approach is to attempt a recovery of important data before attempting to repair. I think it's possible to have a udev rule automatically attempt a rescue read-only mount, and with the future features of read-only boot support in systemd and the desktop, the desktop can notify the user of the problem, that they are in a limited read-only rescue mode, and should refresh backups before attempting to repair. This is generally better advice than what's seen on any platform - it increases the chance that important data is not lost even in the face of a disaster.

Is this going to just be some FESCO and WG members, or will we succeed in getting hundreds of people testing this?

This is in a sense a recruitment drive, appealing on Fedora pride. No one wants a wobbly release.

set the +C (nodatacow) option on the containing directory before you create the files in it. What are we doing about this concretely before F33?

RFE bug
RFC email
The ability to detect Btrfs and just set +C when raw or qcow2 are created is straightforward enough as a fallback, but I think libvirtd folks might have a better idea. This question also applies to databases - where some databases don't benefit from nodatacow but none are hurt by it. Still, nodatacow takes away compression and checksumming so it's a question who "owns" this domain.
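For what the fallback could look like, here is a minimal Python sketch, assuming the caller passes in the text of /proc/mounts and decides per image file. The helper names (`fs_type_for`, `wants_nocow`, `set_nocow`) and the sample mount table are hypothetical; this is not what libvirt or any existing tool does. It also ignores escaped characters in mount paths.

```python
import subprocess

NOCOW_SUFFIXES = (".raw", ".qcow2")  # the image types named in the discussion


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the longest mount point that contains path."""
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def wants_nocow(image_name, fs_type):
    """True when the image would land on Btrfs and is a raw or qcow2 file."""
    return fs_type == "btrfs" and image_name.endswith(NOCOW_SUFFIXES)


def set_nocow(directory):
    """Run `chattr +C` on the directory so files created in it later inherit it."""
    subprocess.run(["chattr", "+C", directory], check=True)


# Sample mount table in /proc/mounts format (made up for illustration).
SAMPLE_MOUNTS = """\
/dev/sda2 / btrfs rw,subvol=/root 0 0
/dev/sda2 /home btrfs rw,subvol=/home 0 0
/dev/sdb1 /srv ext4 rw 0 0
"""

for directory in ("/home/user/images", "/srv/images"):
    fs_type = fs_type_for(directory, SAMPLE_MOUNTS)
    print(directory, fs_type, wants_nocow("disk.qcow2", fs_type))
```

On a real system the caller would read /proc/mounts and call `set_nocow()` on the directory before creating the image, since +C only takes effect for files created afterwards.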

nothing to point to if people say that F33 is slow/sluggish and blame it on btrfs

Josef and I have discussed this and it's a fairly straightforward A vs B setup bug report with reproduce steps. As long as the user has some reasonable set of steps to reproduce, he's got plenty of test hardware to set up those conditions and compare - so I expect we won't even have to ask the user to do a bunch of regression testing unless things are truly exotic.

Metadata Update from @chrismurphy:
- Issue tagged with: meeting

3 years ago

We agreed in the meeting today that we're following through on this change as FESCo has approved it, and we've broken out the work items for this change into separate tickets labeled with the btrfs tag.

These can be found here: https://pagure.io/fedora-workstation/issues?tags=btrfs

Metadata Update from @ngompa:
- Issue untagged with: meeting

3 years ago

Metadata Update from @ngompa:
- Issue set to the milestone: Fedora 33

3 years ago

Btrfs by default tracking bug ID 1851166

RFE (request for enhancement) and bug reports should have this bug ID added to the Blocks field.

Metadata Update from @chrismurphy:
- Issue tagged with: btrfs

3 years ago

I don't think we need to keep this open anymore. The future is now!

Metadata Update from @catanzaro:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 years ago


Metadata
Boards 2
Installing Status: Done
Btrfs Status: Done