#10513 Disk images have significantly increased in size with Fedora 35
Closed: Fixed 2 years ago by humaton. Opened 2 years ago by pwhalen.

  • Describe the issue

Precanned qcow2/raw disk images generated in Fedora 35 are significantly larger than they were in Fedora 34. Affects all arches.

X86_64:
Fedora-Cloud-Base-34-1.2.x86_64.raw.xz - 176M
Fedora-Cloud-Base-35-1.2.x86_64.raw.xz - 287M

Fedora-Cloud-Base-34-1.2.aarch64.qcow2 - 252M
Fedora-Cloud-Base-35-1.2.aarch64.qcow2 - 360M

AArch64:
Fedora-Cloud-Base-34-1.2.aarch64.raw.xz - 176M
Fedora-Cloud-Base-35-1.2.aarch64.raw.xz - 282M

The most significant increase is in the ARM/AArch64 images:

Fedora-Minimal-34-1.2.aarch64.raw.xz 542M
Fedora-Minimal-35-1.2.aarch64.raw.xz 1.1G

  • When do you need this? (YYYY/MM/DD)

This would be great to have fixed in F36.

  • If we cannot complete your request, what is the impact?

Wasted bandwidth and disk space.


So comparing 33,34,35:

$ du -sch /srv/web/pub/fedora/linux/releases/3{3,4,5}/Cloud/x86_64/images/Fedora-*raw.xz 
195M    /srv/web/pub/fedora/linux/releases/33/Cloud/x86_64/images/Fedora-Cloud-Base-33-1.2.x86_64.raw.xz
177M    /srv/web/pub/fedora/linux/releases/34/Cloud/x86_64/images/Fedora-Cloud-Base-34-1.2.x86_64.raw.xz
288M    /srv/web/pub/fedora/linux/releases/35/Cloud/x86_64/images/Fedora-Cloud-Base-35-1.2.x86_64.raw.xz
$ du -sch /srv/web/pub/fedora/linux/releases/3{3,4,5}/Cloud/aarch64/images/Fedora-*raw.xz 
202M    /srv/web/pub/fedora/linux/releases/33/Cloud/aarch64/images/Fedora-Cloud-Base-33-1.2.aarch64.raw.xz
177M    /srv/web/pub/fedora/linux/releases/34/Cloud/aarch64/images/Fedora-Cloud-Base-34-1.2.aarch64.raw.xz
283M    /srv/web/pub/fedora/linux/releases/35/Cloud/aarch64/images/Fedora-Cloud-Base-35-1.2.aarch64.raw.xz

The size of the Minimal spin is similar:

$ du -sch fedora-secondary/releases/3*/Spins/aarch64/images/Fedora-Minimal*xz
737M    fedora-secondary/releases/32/Spins/aarch64/images/Fedora-Minimal-32-1.6.aarch64.raw.xz
876M    fedora-secondary/releases/33/Spins/aarch64/images/Fedora-Minimal-33-1.2.aarch64.raw.xz
881M    fedora-secondary/releases/33/Spins/aarch64/images/Fedora-Minimal-33-1.3.aarch64.raw.xz
544M    fedora-secondary/releases/34/Spins/aarch64/images/Fedora-Minimal-34-1.2.aarch64.raw.xz
1.1G    fedora-secondary/releases/35/Spins/aarch64/images/Fedora-Minimal-35-1.2.aarch64.raw.xz

F34 seems smaller, for some reason, than both the earlier and the later releases.

OK, uncompressing the files shows that the F34 and F35 raw files have the same apparent size (6 GiB); the F33 file is 5 GiB:

-rw-rw-r--. 1 smooge smooge 5368709120 2022-01-05 19:34 /tmp/Fedora-Minimal-33-1.3.aarch64.raw
-rw-rw-r--. 1 smooge smooge 6442450944 2022-01-05 19:33 /tmp/Fedora-Minimal-34-1.2.aarch64.raw
-rw-rw-r--. 1 smooge smooge 6442450944 2022-01-05 19:33 /tmp/Fedora-Minimal-35-1.2.aarch64.raw

du shows that accounting for empty data in the raw, the files are sized as:

3.6G    /tmp/Fedora-Minimal-33-1.3.aarch64.raw
2.2G    /tmp/Fedora-Minimal-34-1.2.aarch64.raw
2.5G    /tmp/Fedora-Minimal-35-1.2.aarch64.raw
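The gap between the `ls` and `du` figures above comes from sparse files: the apparent size counts holes, the allocated size does not. A minimal sketch of reading both values, assuming a Linux host where `st_blocks` is reported in 512-byte units; the temporary file is a hypothetical stand-in for a raw image:

```python
import os
import tempfile


def apparent_and_allocated(path):
    """Return (apparent size, allocated size) in bytes for a file."""
    st = os.stat(path)
    # Assumption: st_blocks counts 512-byte units, as it does on Linux.
    return st.st_size, st.st_blocks * 512


# Create a 1 MiB file consisting of a single hole (no data written).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.truncate(1024 * 1024)
    demo_path = f.name

apparent, allocated = apparent_and_allocated(demo_path)
print(f"apparent: {apparent} bytes, allocated: {allocated} bytes")
os.unlink(demo_path)
```

On a filesystem that supports sparse files the allocated figure should be far below the apparent one, which is the same effect `du` reports for the raw images.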

Running xz -9 on Fedora-Minimal-35-1.2.aarch64.raw did not result in any savings. I think the size growth must be in the file structure, which is making compression hard or impossible.

So the on-disk size of the rootfs isn't significantly different. The 100 MB or so of difference is also likely because I updated something for other testing on 35/36 before remembering to run df:
F-34: 1.4G
F-35: 1.5G
F-36: 1.5G

In terms of file changes there's a little bit but nothing of real note:

F-35 -> F-36:


+authselect
+authselect-libs
+cxl-libs
+iniparser
+libbpf
+libusb1
+malcontent-libs
+mozjs78
+nano
+openssl1.1
+pcsc-lite
+pcsc-lite-ccid
+pcsc-lite-libs
+polkit
+polkit-pkla-compat
+python3-packaging
+python3-pyparsing
+vim-data

F34 -> F35:


+chkconfig
-compat-readline5
-glibc-doc
+glibc-gconv-extra
+hunspell
+hunspell-en
+hunspell-en-GB
+hunspell-en-US
+hunspell-filesystem
+initscripts-service
+iptables-legacy-libs
+libcap-ng-python3
+libfsverity
-libmetalink
-libtar
-libtextstyle
-libusbx
+mpdecimal
-python3-chardet
+python3-charset-normalizer
-python3-decorator
-python3-pip
-python3-pytz
-python3-setuptools
-python3-slip
-python3-slip-dbus
-shared-mime-info
-systemd-rpm-macros
+systemd-resolved
+util-linux-core

The "biggest" one is hunspell, and I'm not sure why that's pulled in. I suspect an unnecessary comps addition, but overall it's only a few MB. The openssl1.1 addition is due to an FTBFS of libarchive, but that's another problem.

I wonder if the "free" space isn't being zeroed out in the images.
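To illustrate why un-zeroed free space would matter for the .raw.xz size, here is a rough sketch that compresses the same amount of "free" space twice: once zero-filled, once filled with random bytes. The random bytes are only a stand-in for leftover, already-compressed data such as deleted RPM payloads; the 4 MiB size is arbitrary, and the default xz preset is used instead of -9 to keep memory use low.

```python
import lzma
import os

FREE_SPACE = 4 * 1024 * 1024  # arbitrary amount of "free" space for the demo

zeroed = bytes(FREE_SPACE)       # free space that was zeroed or trimmed
stale = os.urandom(FREE_SPACE)   # stand-in for stale, incompressible leftovers

zeroed_xz = len(lzma.compress(zeroed))
stale_xz = len(lzma.compress(stale))

print(f"zeroed free space compresses to {zeroed_xz} bytes")
print(f"stale free space compresses to  {stale_xz} bytes")
```

Zeroed space all but disappears after compression, while stale data costs roughly its full size in the download.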

In fact, in F-36 the addition of authselect, which I strongly suspect is completely irrelevant for Minimal/Cloud/Container, pulls in polkit/mozjs and friends.

Did the filesystem for these change to a different format in F35 (i.e., is it xfs, ext4, or btrfs)? To me it seems like the data was compressible in F33/F34 and then became hard to compress in F35.

Only affects the ext4 and xfs disk images:

Btrfs
Fedora-Workstation-34-1.2.aarch64.raw.xz - 3.6G
Fedora-Workstation-35-1.2.aarch64.raw.xz - 3.5G

Fedora-Xfce-34-1.2.aarch64.raw.xz - 3.0G
Fedora-Xfce-35-1.2.aarch64.raw.xz - 3.0G

xfs
Fedora-Server-34-1.2.aarch64.raw.xz - 788M
Fedora-Server-35-1.2.aarch64.raw.xz - 1.3G

ext4
Fedora-Minimal-34-1.2.aarch64.raw.xz 542M
Fedora-Minimal-35-1.2.aarch64.raw.xz 1.1G
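As a quick check on the per-filesystem figures above, the F34 to F35 growth factors can be worked out as follows. This is only a sketch: it assumes the `M` and `G` suffixes are binary units (MiB/GiB), and the inputs are already rounded, so the ratios are approximate.

```python
UNIT_MIB = {"M": 1, "G": 1024}  # assumption: binary units, as printed by ls/du -h


def to_mib(size):
    """Convert a size such as '788M' or '1.3G' to MiB."""
    return float(size[:-1]) * UNIT_MIB[size[-1]]


# (F34 size, F35 size) of the aarch64 .raw.xz images listed above
sizes = {
    "btrfs (Workstation)": ("3.6G", "3.5G"),
    "btrfs (Xfce)": ("3.0G", "3.0G"),
    "xfs (Server)": ("788M", "1.3G"),
    "ext4 (Minimal)": ("542M", "1.1G"),
}

growth = {name: to_mib(f35) / to_mib(f34) for name, (f34, f35) in sizes.items()}
for name, ratio in growth.items():
    print(f"{name}: x{ratio:.2f}")
```

The Btrfs images stay flat, while the xfs and ext4 images grow by roughly 1.7x and 2x.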

@adamwill has been looking into this too in https://bugzilla.redhat.com/show_bug.cgi?id=2031214 and related.

Might be firmware related.

Well, not quite, I was looking into F36 netinst images going over-size. The thing that kicked them over the 700M limit turned out to be due to a mistake when we tried to split some firmware into a subpackage - we created the subpackage but didn't leave the files out of the main package. I fixed that and the F36 images are back down under the 700M limit, but they're still ~40M bigger than the F35 images:

[adamw@xps13k authselect (main %)]$ curl -I https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20220112.n.0/compose/Everything/x86_64/iso/Fedora-Everything-netinst-x86_64-Rawhide-20220112.n.0.iso | grep -i length
content-length: 719323136
[adamw@xps13k authselect (main %)]$ curl -I https://dl.fedoraproject.org/pub/fedora/linux/releases/35/Everything/x86_64/iso/Fedora-Everything-netinst-x86_64-35-1.2.iso | grep -i length
Content-Length: 677380096

I'll see if I can trim some more, but anyway, yeah, this was about 35->36 changes, not 34->35.

I didn't spend too much time on it last week, but as far as I could tell, something in the filesystem layout of the xfs/ext4 images is not as 'compressible' as it was previously. After uncompressing the image and looking inside, I didn't find any large differences in the space used by packages. The only thing left seemed to be what Peter said earlier: the 'empty' space in the image.

Fedora-Workstation-Live-aarch64-Rawhide-20220118.n.0.iso       1.8G
Fedora-Workstation-Rawhide-20220118.n.0.aarch64.raw.xz         3.7G

After decompressing the raw.xz

5.6G -rw-r--r--. 1 chris chris  13G Jan 18 08:38 Fedora-Workstation-Rawhide-20220118.n.0.aarch64.raw

losetup->kpartx->blkid

/dev/mapper/loop0p1: UUID="74B9-54F9" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="6f79a5f3-01"
/dev/mapper/loop0p2: UUID="d04179cf-b930-4b86-bf62-8209f9efaa02" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="6f79a5f3-02"
/dev/mapper/loop0p3: LABEL="fedora_fedora" UUID="244c82c0-9f62-47b7-8cb2-affd742e227b" UUID_SUB="948e7836-52c2-42b4-888f-7c12d6ec10dd" BLOCK_SIZE="4096" TYPE="btrfs" PARTUUID="6f79a5f3-03"

fstrim the partitions in order

5.6G -rw-r--r--. 1 chris chris  13G Jan 18 08:38 Fedora-Workstation-Rawhide-20220118.n.0.aarch64.raw
5.5G -rw-r--r--. 1 chris chris  13G Jan 18 08:39 Fedora-Workstation-Rawhide-20220118.n.0.aarch64.raw
3.1G -rw-r--r--. 1 chris chris  13G Jan 18 08:40 Fedora-Workstation-Rawhide-20220118.n.0.aarch64.raw

Anaconda uses either /var/tmp or /home on the newly created root file system for staging the RPMs it downloads. It then performs the installation and deletes the staged RPMs at the end. While the RPMs are deleted from the file system's perspective, the backing file still contains them until fstrim punches holes to make the backing file sparse where those deleted RPMs once were. I suspect the image build qemu VM doesn't have discard="unmap" set. This is similar to the feature request I filed for CoreOS the other day:
https://github.com/coreos/fedora-coreos-tracker/issues/1069

I think anaconda is doing the right thing here, but as noted in the last comment, if some other image builder is being used, it may not do an fstrim before the final umount of the file system.
https://bugzilla.redhat.com/show_bug.cgi?id=1971186

Note that the qemu default for block devices is to omit discard, so it's effectively set to "ignore".

Fedora-Workstation-Rawhide-20220202.n.1.aarch64.raw is currently busting its 4G max size due in part to this issue.

Before trim (uncompressed)

5625396 -rw-r--r--. 1 chris chris 13958643712 Feb  3 14:49 Fedora-Workstation-Rawhide-20220202.n.1.aarch64.raw

After trim (uncompressed)

3219872 -rw-r--r--. 1 chris chris 13958643712 Feb  3 15:37 Fedora-Workstation-Rawhide-20220202.n.1.aarch64.raw

That's 2.3G of garbage being compressed by Fedora infra every time the image is created, and a fraction (the compressed amount) of that is downloaded by everyone.
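The 2.3G figure follows from the two listings above. A small worked calculation, assuming the first column of the `ls` output is the allocated size in 1 KiB blocks (the GNU `ls -s` default):

```python
# Allocated sizes from the listings above, assumed to be in 1 KiB blocks.
before_kib = 5625396
after_kib = 3219872

reclaimed_gib = (before_kib - after_kib) / (1024 * 1024)
reclaimed_fraction = 1 - after_kib / before_kib

print(f"reclaimed by fstrim: {reclaimed_gib:.2f} GiB")
print(f"share of the untrimmed image: {reclaimed_fraction:.0%}")
```

So a bit over 40% of the allocated space in the untrimmed image is data that has already been deleted.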

There is a side-effect of Btrfs transparent compression used in the raw files making xz much less effective. The Btrfs zstd:1 compression set by Anaconda is not as aggressive as xz -9, but xz also can't do any additional compression of zstd:1 blocks.
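The effect of compressing data twice can be sketched with the standard library. Note the assumptions: zlib at level 1 stands in for Btrfs zstd:1 (zstd is not used here), the default xz preset stands in for xz -9, and the input is synthetic text invented for the demo.

```python
import lzma
import random
import zlib

# Synthetic, compressible "file content" (hypothetical data for the demo).
rng = random.Random(0)
raw = "".join(
    f"package-{rng.randrange(500):04d} version {rng.randrange(100)} installed\n"
    for _ in range(50_000)
).encode()

# Path 1: xz sees the plain data, as on an uncompressed file system.
xz_direct = len(lzma.compress(raw))

# Path 2: a fast compressor runs first (stand-in for zstd:1), then xz.
fast = zlib.compress(raw, 1)
xz_after_fast = len(lzma.compress(fast))

print(f"raw:              {len(raw)} bytes")
print(f"xz only:          {xz_direct} bytes")
print(f"fast compressor:  {len(fast)} bytes")
print(f"fast, then xz:    {xz_after_fast} bytes")
```

Once the fast compressor has run, xz has little redundancy left to find, so the two-stage result ends up larger than xz applied to the plain data.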

But a central problem for all live images is that the RPMs are staged on the newly created file system, then installed to create the live image, and then deleted. Those deleted RPMs still take up space in the raw file unless they are fstrim'd away.

This issue seems to be resolved by a patch in lorax. If the issue persists or you have any questions please reopen.

Metadata Update from @humaton:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

2 years ago
