#12 Integrated backup and restore
Opened 5 years ago by chrismurphy. Modified a year ago

The proposal is to use btrfs snapshots and send/receive for this functionality.

  • btrfs is the default file system on Fedora desktop editions and spins
  • btrfs snapshot creation is cheap, and incremental replication of data is very cheap
    • significantly, most of the work of tracking file changes is done as part of normal btrfs operation
    • no deep traversal on either the source or destination is necessary to determine which files have changed
      • the btrfs file b-tree contains a per-file generation value, making it very cheap for btrfs send to find the files that have changed since the last backup; unchanged files don't even need their metadata read
    • shines when there are many user files with few changes: the total time for a backup is proportional to the size of the changes (the difference payload)
  • btrfs snapshots behave like directories, so even if the configuration is lost, the software is uninstalled or becomes stale, or the distribution changes, the user's files are still intact, locatable, and retrievable by any Linux kernel with btrfs support
  • when btrfs receives an incremental backup, it is merged with a previous snapshot, so the latest backup appears as a directory containing the complete current backup (not just the changed files)
  • the btrfs snapshots are read-only, so their data and metadata are mostly immutable even by root: an inadvertent relabel, recursive chown, or rm won't affect them; the root user can still delete the snapshots, of course
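
The "no deep traversal" point above can be illustrated with a toy model. This is only a sketch: it is not btrfs's on-disk format or the btrfs send implementation. Node, changed_paths, and the sample tree are hypothetical names, and the model assumes each directory node records the newest generation found anywhere beneath it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    generation: int  # toy assumption: newest generation at or below this node
    children: List["Node"] = field(default_factory=list)


def changed_paths(node, parent_gen, prefix="", visited=None):
    """Collect leaf paths newer than parent_gen, skipping unchanged subtrees."""
    if visited is not None:
        visited.append(node.name)  # record every node the walk touches
    if node.generation <= parent_gen:
        return []  # nothing below here changed since the parent snapshot
    path = f"{prefix}/{node.name}" if prefix else node.name
    if not node.children:
        return [path]
    found = []
    for child in node.children:
        found.extend(changed_paths(child, parent_gen, path, visited))
    return found


# Hypothetical tree: only notes.txt changed after generation 10.
tree = Node("home", 12, [
    Node("photos", 5, [Node("a.jpg", 5), Node("b.jpg", 4)]),
    Node("docs", 12, [Node("notes.txt", 12), Node("old.txt", 3)]),
])

visited = []
changed = changed_paths(tree, parent_gen=10, visited=visited)
print(changed)  # the only changed file
print(visited)  # the photos subtree is touched once and never descended into
```

In this model the cost of the walk depends on how much changed, not on how many files exist, which is the property the bullets above describe.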

There are two modes envisioned: local and remote

  • local means plugging in a USB stick or drive and using it as the destination receiving the backup
  • remote means using ssh to securely transmit the incremental backup stream to another system; this proposal envisions Fedora Server as that system
    • Fedora Server does not need to adopt btrfs by default, nor does the user need to choose it at installation time
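
As a rough sketch of how the two modes differ, the helper below only builds the command lines and does not run them. It uses the documented btrfs send -p and btrfs receive forms; the function names, snapshot paths, mount point, and the backup@server.example host are hypothetical placeholders, not part of the proposal.

```python
import shlex


def send_cmd(snapshot, parent=None):
    """Build 'btrfs send', incremental when a parent snapshot is given."""
    cmd = ["btrfs", "send"]
    if parent:
        cmd += ["-p", parent]
    cmd.append(snapshot)
    return cmd


def receive_cmd(dest_dir, ssh_host=None):
    """Build 'btrfs receive', wrapped in ssh for the remote mode."""
    cmd = ["btrfs", "receive", dest_dir]
    if ssh_host:
        # remote mode: ssh runs the receive on the other system
        return ["ssh", ssh_host, shlex.join(cmd)]
    return cmd


def pipeline(snapshot, dest_dir, parent=None, ssh_host=None):
    """Return the shell pipeline as a string, for display or logging."""
    return (shlex.join(send_cmd(snapshot, parent))
            + " | "
            + shlex.join(receive_cmd(dest_dir, ssh_host)))


# Hypothetical paths and host, for illustration only.
local = pipeline("/home/.snapshots/home.2", "/run/media/backup",
                 parent="/home/.snapshots/home.1")
remote = pipeline("/home/.snapshots/home.2", "/srv/backups",
                  parent="/home/.snapshots/home.1",
                  ssh_host="backup@server.example")
print(local)
print(remote)
```

Only the receiving half of the pipeline changes between the two modes; the send side is identical. Both commands normally need root privileges.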

Restore scenarios

  • Manual: the user can use any of sshfs, NFS, SMB, or rsync to restore a single file, everything, or anything in between. The user should have confidence they can always restore their data; this is often a weak point of backups.
  • Automatic/integrated: a monolithic (full) restore using btrfs send/receive (without -p) back to the workstation. It might be a 2.0 feature, and it may not be strictly necessary.
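
Because a received snapshot is just a directory tree, a manual single-file restore can be an ordinary file copy. The sketch below assumes the snapshot is already reachable as a local path (for example through an sshfs, NFS, or SMB mount); restore_file and all paths are hypothetical, and the demo uses a temporary directory in place of a real snapshot.

```python
import shutil
import tempfile
from pathlib import Path


def restore_file(snapshot_root, relative_path, target_root):
    """Copy one file out of a snapshot directory, keeping its relative path."""
    src = Path(snapshot_root) / relative_path
    dst = Path(target_root) / relative_path
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)  # copy2 also preserves timestamps and mode bits
    return dst


# Demo with a temporary directory standing in for a mounted snapshot.
with tempfile.TemporaryDirectory() as tmp:
    snap = Path(tmp) / "backups" / "home.20240103"
    (snap / "chris" / "docs").mkdir(parents=True)
    (snap / "chris" / "docs" / "notes.txt").write_text("hello")

    restored = restore_file(snap, "chris/docs/notes.txt", Path(tmp) / "home")
    restored_text = restored.read_text()
    print(restored_text)
```

No backup-specific tooling is involved in this path, which is why the manual route keeps working even if the backup software itself is gone.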

Putting it together

  • Btrfs in the kernel does a lot of the heavy lifting, code-wise
  • Btrfs-progs user space tools provide the commands
  • Btrbk is a utility that leverages btrfs commands to perform snapshots and send/receive replication (local or remote) on a schedule
  • The proposal means discussion, design, and implementation for
    • Cockpit UI to help the user create a suitably sized btrfs volume to receive the backups
    • Cockpit UI to advertise the server as providing a backup-receiving service (e.g. via ssh)
    • Policy for desktop and server snapshot frequency, retention and deletion
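
To make the policy item concrete, here is one possible shape for a retention rule. It is purely illustrative: the proposal does not define a policy, and plan_retention, the snapshot names, and the "newest snapshot per day for the last N days" rule are all hypothetical.

```python
from datetime import datetime


def plan_retention(snapshots, keep_days):
    """Split snapshots into (keep, delete) lists of names.

    snapshots maps a snapshot name to its creation time. The newest
    snapshot of each calendar day is kept for the most recent
    keep_days days that have any snapshot at all.
    """
    if keep_days < 1:
        raise ValueError("keep_days must be at least 1")
    newest_per_day = {}
    for name, created in snapshots.items():
        day = created.date()
        current = newest_per_day.get(day)
        if current is None or created > snapshots[current]:
            newest_per_day[day] = name
    kept_days = sorted(newest_per_day, reverse=True)[:keep_days]
    keep = sorted(newest_per_day[day] for day in kept_days)
    delete = sorted(set(snapshots) - set(keep))
    return keep, delete


# Hypothetical snapshot names and times.
snapshots = {
    "home.20240101T0900": datetime(2024, 1, 1, 9, 0),
    "home.20240101T1800": datetime(2024, 1, 1, 18, 0),
    "home.20240102T0900": datetime(2024, 1, 2, 9, 0),
    "home.20240103T0900": datetime(2024, 1, 3, 9, 0),
}
keep, delete = plan_retention(snapshots, keep_days=2)
print(keep)
print(delete)
```

Any real policy would also have to keep the snapshot that serves as the parent for the next incremental send on both the desktop and the server. This rule covers that only because it always keeps the newest snapshot.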

Project references:

man btrfs subvolume
man btrfs send
man btrfs receive

btrbk - backup tool for btrfs subvolumes

bees - deduplication agent


Metadata Update from @chrismurphy:
- Issue set to the milestone: Future Release
- Issue tagged with: Desktop, Server, Utils

5 years ago

A liability/limitation for this idea is SELinux will prevent btrfs receive from restoring security labels it doesn't recognize.

Example: Fedora 36 has a new security label, system_u:object_r:NetworkManager_dispatcher_script_t:s0, which Fedora 35 is not aware of. If a Fedora 35 Server is the destination for backups from a Fedora 36 desktop, the receive fails by default. While we have a way to suppress the error so receive succeeds, the received subvolume snapshot then does not have the "Received UUID" set, which is a piece of metadata needed for incremental send/receive. Thus, while we have the data, the received snapshot cannot be used as a source (or parent) in a subsequent incremental receive. That breaks incremental receive, which is the key feature of send/receive: the incremental computation on btrfs is cheap because no deep traversal is needed on either source or destination.
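
One way to check whether a received snapshot can serve as a parent is to look at the "Received UUID" line in the output of btrfs subvolume show. The sketch below parses such output as text. The sample strings and UUID are made up for illustration, and it assumes an unset value is printed as "-".

```python
def received_uuid(show_output):
    """Return the Received UUID, or None when it is unset ('-') or missing."""
    for line in show_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Received UUID":
            value = value.strip()
            return None if value in ("", "-") else value
    return None


# Illustrative excerpts only; real output contains more fields.
SAMPLE_SET = """\
home.20240103
\tName: \t\t\thome.20240103
\tUUID: \t\t\t1b2c3d4e-0000-4000-8000-000000000001
\tReceived UUID: \t\t9f8e7d6c-0000-4000-8000-000000000002
\tCreation time: \t\t2024-01-03 09:00:00 -0600
"""

SAMPLE_UNSET = """\
home.20240103
\tName: \t\t\thome.20240103
\tUUID: \t\t\t1b2c3d4e-0000-4000-8000-000000000001
\tReceived UUID: \t\t-
"""

print(received_uuid(SAMPLE_SET))
print(received_uuid(SAMPLE_UNSET))  # None: not usable as an incremental parent
```

A backup tool could run a check like this after each receive and fall back to a full send when the value is missing.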

Relaxing the rule requiring full replication of all data and metadata before the Received UUID can be set is a slippery slope. It would take some evaluation to make sure we don't cause other problems by allowing this metadata to be dropped while still setting the Received UUID.

Perhaps it's possible to backport these labels to earlier versions of Fedora?

Still another thought: these new security labels tend to crop up only under /usr, /etc, and /var, not /home. So if the feature were constrained to backing up just user /home, might that be a suitable workaround? Is there a use case for users creating and setting arbitrary security labels?

See also: send|receive ERROR: lsetxattr failed, SELINUX_ERR op=setxattr invalid_context

Interesting. I've been using zfs send/receive to back up my Fedora servers for many years and I've never hit this issue. I guess that's because I have zfs configured not to mount the backups on the receiving end (the received stream inherits mountpoint=none).
