#61 RFE: Add support for writing already compressed data directly to a file
Opened 10 months ago by siosm. Modified 8 months ago

This is a request for enhancement for the btrfs filesystem (kernel side).

Ostree is like git for binary files and is used by rpm-ostree to manage systems (Silverblue, Kinoite, IoT, CoreOS, etc.) and by Flatpak to manage application files.

Objects (files) in an ostree repo are optionally compressed, currently only using gzip, but zstd support could be added. We'll assume that objects can be zstd compressed for the rest of this discussion.

When updating a Flatpak (or your system on the Fedora variants mentioned above), ostree pulls objects (files) from a remote repository over the network and writes them to the local repository. The content fetched from the remote repository is usually compressed, to reduce network usage, and is then decompressed before being written to files on the local filesystem.

If btrfs compression is enabled on the filesystem, then the file content will be compressed again by the kernel before being written to the disk.

The goal of this new feature would be to enable applications (here ostree) to tell the kernel, when writing a new file, that the content is already compressed. Ostree could then directly write the compressed files on the filesystem, skipping the decompression step and the kernel could directly write the content to the disk, skipping the re-compression step.

Once the file has been written, it would behave like any other compressed file and be decompressed on demand when read.

Overall, this would save time and CPU usage.


If ostree decides to retain compression on writes, Btrfs has a cheap heuristic that detects when there is no advantage in even attempting to compress such files. In effect, it's a no-op.
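To illustrate why already-compressed objects get skipped, here is a rough sketch of an entropy check. This is not the kernel's algorithm (as far as I know the real heuristic samples the data and combines several cheap tests); it only shows the general idea that compressed bytes look close to random. The function names and the 7.0 bits-per-byte threshold are made up for this example.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the byte-level Shannon entropy of data, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_incompressible(data: bytes, threshold: float = 7.0) -> bool:
    """Hypothetical check: treat near-random byte distributions as not worth compressing.

    The threshold is an assumption picked for illustration, not a value
    taken from Btrfs.
    """
    return shannon_entropy(data) >= threshold
```

Plain text usually lands well below 5 bits per byte, while zlib or zstd output sits close to 8, so a check of this kind is cheap compared to actually running the compressor.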

You could also set the compression property to none on a parent directory at the time it's created; any files and directories created under it afterwards will inherit the property. The kernel honors this.
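A minimal sketch of setting that property from a script, assuming the `btrfs` command-line tool from btrfs-progs is installed and the directory is on a Btrfs filesystem. The helper names are hypothetical, and whether `none` is accepted as a value may depend on your kernel and btrfs-progs versions, so check `btrfs-property(8)` on your system.

```python
import subprocess


def compression_property_cmd(directory: str, value: str = "none") -> list[str]:
    """Build the `btrfs property set` command line for the compression property."""
    return ["btrfs", "property", "set", directory, "compression", value]


def disable_compression(directory: str) -> None:
    """Run the command; raises CalledProcessError if btrfs reports a failure.

    Assumption: `directory` already exists on a Btrfs filesystem and the
    caller is allowed to change its properties.
    """
    subprocess.run(compression_property_cmd(directory), check=True)
```

Calling `disable_compression()` on the repository's objects directory right after creating it, before any objects are written, means everything created below it should inherit the setting.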

You'd need to benchmark the two methods to see whether one has an advantage, and also benchmark whether ostree-compressed files have an advantage over btrfs-compressed files.

Metadata Update from @ngompa:
- Issue tagged with: Cloud, Desktop, Kernel

9 months ago

Metadata Update from @ngompa:
- Issue tagged with: Dev

9 months ago

Btrfs supports writing pre-compressed data with the BTRFS_IOC_ENCODED_WRITE ioctl since Linux v5.18. It's documented in linux/btrfs.h. Note that there are a handful of restrictions on sizes, offsets, etc. imposed by the Btrfs on-disk format. This also bypasses the page cache, which may be better or worse depending on your workload.
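For anyone who wants to poke at this without writing C, here is a rough ctypes sketch of filling in the argument struct and issuing the ioctl. Treat it as an illustration under assumptions, not a reference: the struct layout and constants are transcribed from my reading of linux/btrfs.h, the ioctl number is computed with the generic `_IOW` encoding (as on x86-64 and aarch64), and the helper names are my own. It uses zlib because that is in the Python standard library; as I read the header, `len` is the logical length in the file rather than the compressed size, each call is limited to 128 KiB, the offset must be sector-aligned, and the call needs CAP_SYS_ADMIN.

```python
import ctypes
import fcntl
import zlib

BTRFS_IOCTL_MAGIC = 0x94
BTRFS_ENCODED_IO_COMPRESSION_ZLIB = 1  # value as transcribed from linux/btrfs.h
BTRFS_ENCODED_IO_ENCRYPTION_NONE = 0


class IoVec(ctypes.Structure):
    """Mirror of struct iovec."""
    _fields_ = [("iov_base", ctypes.c_void_p), ("iov_len", ctypes.c_size_t)]


class EncodedIoArgs(ctypes.Structure):
    """Mirror of struct btrfs_ioctl_encoded_io_args (layout assumed from the header)."""
    _fields_ = [
        ("iov", ctypes.POINTER(IoVec)),
        ("iovcnt", ctypes.c_ulong),
        ("offset", ctypes.c_int64),
        ("flags", ctypes.c_uint64),
        ("len", ctypes.c_uint64),            # logical length of the data in the file
        ("unencoded_len", ctypes.c_uint64),  # length after decompression
        ("unencoded_offset", ctypes.c_uint64),
        ("compression", ctypes.c_uint32),
        ("encryption", ctypes.c_uint32),
        ("reserved", ctypes.c_uint8 * 64),
    ]


def _iow(magic: int, nr: int, size: int) -> int:
    """Generic Linux _IOW() encoding; some architectures use a different layout."""
    return (1 << 30) | (size << 16) | (magic << 8) | nr


BTRFS_IOC_ENCODED_WRITE = _iow(BTRFS_IOCTL_MAGIC, 64, ctypes.sizeof(EncodedIoArgs))


def compress_chunk(data: bytes) -> bytes:
    """Compress one chunk as a single zlib stream (assumed acceptable to Btrfs)."""
    return zlib.compress(data)


def build_encoded_write_args(compressed: bytes, unencoded_len: int, offset: int = 0):
    """Return (args, keepalive); keepalive holds the buffers that args points to."""
    buf = ctypes.create_string_buffer(compressed, len(compressed))
    iov = IoVec(ctypes.addressof(buf), len(compressed))
    args = EncodedIoArgs()
    args.iov = ctypes.pointer(iov)
    args.iovcnt = 1
    args.offset = offset
    args.len = unencoded_len
    args.unencoded_len = unencoded_len
    args.compression = BTRFS_ENCODED_IO_COMPRESSION_ZLIB
    args.encryption = BTRFS_ENCODED_IO_ENCRYPTION_NONE
    return args, (buf, iov)


def encoded_write(fd: int, compressed: bytes, unencoded_len: int, offset: int = 0) -> int:
    """Issue the ioctl; raises OSError (e.g. ENOTTY off Btrfs, EPERM without privileges)."""
    args, _keepalive = build_encoded_write_args(compressed, unencoded_len, offset)
    return fcntl.ioctl(fd, BTRFS_IOC_ENCODED_WRITE, args)
```

Because of the per-call size limit, a larger object would have to be split into chunks that are compressed independently and written with one call each, which means an existing whole-file compressed stream cannot simply be passed through unchanged.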

I have some example programs in an xfstests branch: btrfs_compress_extent compresses data in a Btrfs-friendly format, and btrfs_encoded_io does the actual ioctl. (Those links will go away once I get that branch merged, so check https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/tree/src/btrfs_compress_extent.c and https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/tree/src/btrfs_encoded_write.c if the earlier links don't work.)

Nice! This looks exactly like what I need. I'll have to give this a try.


Metadata
Boards 1
Development Status: Backlog