#1589 Migrate main backup storage pool to different hardware
Opened a month ago by arrfab. Modified a month ago

As we'll have to shuffle some services and hardware, we should migrate our main backup machine in advance to a different one (still under warranty, and which will be part of the DC move).
Doing that ahead of time would ensure a smooth migration during the DC move itself.
As we also have multiple artifacts which share some blocks (at the block-device level, that is), the new backup solution will be configured directly with VDO (Virtual Data Optimizer).

Criteria:

  • New machine is deployed and configured with VDO (new Ansible role needed)
  • Monitoring in place to cover VDO usage and stats (Zabbix template)
  • Backup pool configured and operational on the new machine
  • restic role applied to have offsite encrypted backups, as on the existing machine (see the sketch after this list)
  • Ability to decommission the old machine (EOL)
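For the restic item, here's a minimal sketch of the kind of offsite run the existing role presumably drives (the repository URL, password file and retention values below are hypothetical, not what the current machine uses):

export RESTIC_REPOSITORY="s3:s3.example.com/centos-backups"  # hypothetical offsite repo
export RESTIC_PASSWORD_FILE=/etc/restic/password             # key for the encrypted repo

restic init             # once, to create the encrypted repository
restic backup /backup   # push the local backup pool offsite
restic forget --keep-daily 7 --keep-weekly 5 --prune   # example retention policy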

Metadata Update from @arrfab:
- Issue marked as blocking: #1579
- Issue tagged with: centos-common-infra, dc-move, high-gain, high-trouble

a month ago

Writing and testing some of this seems like an excellent self-contained thing for me to look at next week, if no one gets there first. Would you agree?

Edit: ugh, phone blipped in a tunnel, sorry for the duplicate

Metadata Update from @gwmngilfen:
- Issue assigned to gwmngilfen

a month ago

I did some research into this, and it seems a good fit. The latest versions of VDO are now part of LVM/device-mapper, so creating a volume is as simple as:

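# Create a deduplicated, compressed VDO LV from all free extents in VG 'vdo-vg'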
lvcreate --type vdo \
  --name vdo-lv1 \
  --extents "100%FREE" \
  --virtualsize 200G \
  --vdosettings 'vdo_slab_size_mb=1024' \
  vdo-vg

virtualsize is arbitrary, as the LV is thin-provisioned on top of the storage device (in this case 20G): you implicitly state your expected compression ratio ahead of time (a 20G disk backing a 200G LV implies 10:1 compression). In reality this doesn't matter much, as you can easily extend both the LV and the underlying devices later if needed. You do need something like mkfs.xfs -K to avoid discarding blocks on the LV, though.
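For reference, a quick sketch of the format-and-grow commands, using the names from the lvcreate example above (the grow sizes here are hypothetical):

# -K skips the discard pass at mkfs time, which would otherwise walk the
# whole 200G virtual size of the thin LV
mkfs.xfs -K /dev/vdo-vg/vdo-lv1

# If the implied 10:1 ratio turns out to be wrong, both sides can be grown
# online later:
lvextend -L +100G vdo-vg/vdo-lv1   # grow the virtual (logical) size
lvextend -L +10G vdo-vg/vpool0     # grow the backing VDO pool (default pool name)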

There is already a RHEL system role for VDO, so we should be able to re-use that upstream content in our Ansible infra. See https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/9/html/deduplicating_and_compressing_logical_volumes_on_rhel/creating-a-deduplicated-and-compressed-logical-volume_deduplicating-and-compressing-logical-volumes-on-rhel#configuring-an-lvm-vdo-volume-using-the-storage-rhel-system-role_creating-a-deduplicated-and-compressed-logical-volume
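If we go the re-use route, one option (an assumption on my part; we may prefer to vendor the role into our own repo instead) is to pull the published upstream collection and drive its storage role, roughly per the linked doc (the host name, disk and mount point below are hypothetical):

# The upstream linux-system-roles content is published on Galaxy
ansible-galaxy collection install fedora.linux_system_roles

# Quick smoke-test against the new machine, variables as in the linked doc
ansible backup02.example.com -m include_role \
  -a name=fedora.linux_system_roles.storage \
  -e '{"storage_pools": [{"name": "vdo-vg", "disks": ["sdb"], "volumes":
       [{"name": "vdo-lv1", "size": "200 GiB", "compression": true,
         "deduplication": true, "vdo_pool_size": "20 GiB",
         "fs_type": "xfs", "mount_point": "/backup"}]}]}'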

Monitoring is quite doable: we can have Zabbix monitor the LV usage as normal, along with vdostats for the physical device:

# vdostats --human-readable vdo--vg-vpool0-vpool
Device                    Size      Used Available Use% Space saving%
vdo--vg-vpool0-vpool     20.0G      6.6G     13.4G  33%           67%

We can parse out Use% and alert when it goes over, say, 90%.
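One way to wire that in is a Zabbix agent UserParameter; a sketch, where the item key vdo.use_percent and the conf file name are our own invention:

# /etc/zabbix/zabbix_agentd.d/vdo.conf
# vdostats prints a header plus one data row; field 5 is Use%
UserParameter=vdo.use_percent,vdostats vdo--vg-vpool0-vpool | awk 'NR==2 {gsub(/%/,"",$5); print $5}'

A trigger on that item firing above 90 would then cover the alerting side.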

The one thing we should probably do is set up a test with some real (staging) data, so we can gauge what sort of compression level we'll get.
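Something like the below would do for that test (the staging host and paths are hypothetical):

# Pull a representative slice of staging data onto the VDO-backed LV...
rsync -a staging01.example.com:/srv/backup-sample/ /backup/staging-test/
# ...then read Space saving% to see the real dedup+compression ratio
vdostats --human-readable vdo--vg-vpool0-vpool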
