Secure External Backup with ZFS Native Encryption

August 3rd, 2021 by Philip Iezzi

Let's improve on the Simple and Secure External Backup solution I published back in 2018. Back then, I was using rsync over SSH to pull backup data, and LUKS as full-disk encryption for the external drives. As we all know, transferring millions of small files with rsync can get horribly slow and blow up your I/O. LUKS encryption is also a bit low-level and inflexible. What we want to accomplish: a performant and secure backup solution based on ZFS, using zfs send|recv for efficient data transfer, and ZFS native encryption to secure our external drives. So let's go ahead and build that thing from scratch on a fresh 2021 stack!

Let's assume you already have an existing backup server that is connected to the internet 24/7 and runs daily/weekly/monthly backup jobs. Backup data is stored in ZFS datasets, ideally as individual datasets per full-system backup for each host. We would now like to set up a second offsite backup server that only stores data on an encrypted external drive, which you physically detach after each backup run.

So we are talking about offline backups, in addition to having this server offsite - at a different location than your main backup server.

Preferably, your main backup server would also be offsite. But as it needs to pull data frequently, its storage is always available and not getting detached.

Let's call your main backup server backup and the one we are going to set up here extbackup.

Requirements

We have the following – from a security point of view – rather strict requirements:

  • Hardware: Affordable but reliable fanless system that can be mounted to any office desk.
  • System: Proxmox VE for well-maintained OpenZFS and PVE-zsync support
  • No backup data should ever be stored (not even temporarily!) on rpool (system partition), only on encrypted external drives.
  • System is installed on internal SSD, external drives attached over USB 3 or eSATA
  • External drives fully encrypted using ZFS native encryption (preferred over LUKS for better handling and easier setup)
  • ZFS encryption key not stored anywhere on disk, only generated at runtime.
  • Snapshot-based backup for optimal performance: Backup data is pulled from backup over SSH, using zfs send|recv. We choose PVE-zsync as a simple backup tool / replication manager
  • SSH private key to connect to backup is not stored anywhere on extbackup in plaintext, only decrypted to RAM at runtime.
  • SSH access on remote backup server is strictly limited to pve-zsync operations, without providing a full shell.
  • Both ZFS encryption key and SSH private key should not be required to be stored in any password manager or anywhere else.
  • Unlock password for those keys should only be used as a 2nd factor / seed, so encryption security does not depend on its length.
  • External disks should only be decrypted during a backup run and encrypted/unmounted right afterwards.
  • Backup data is directly streamed to external disks without temporary storage on extbackup server.
  • Ideally, the external drive should be a plain SATA drive without external casing, which could be plugged directly into a server via onboard SATA or eSATA (in case of an emergency when we lack time to transfer the data for recovery). Performance should also be decent in case we ever run from such a drive in production, so SSD is preferred!

Phew! This sounds like a lot of tight requirements! But it is all doable in a very simple slick and sexy setup. Keep on reading!

Hardware Setup

I do recommend the following hardware for a small robust fanless system that is built for 24/7 operation:

Component   Spec
System      Shuttle Barebone XPC slim DS10U Series (DS10U5 / Intel Core i5-8265U)
RAM         2x SO-DDR4-RAM Vengeance 2400MHz 16GB (CMSX16GX4M1A2400C16)
SSD         Samsung SSD 860 EVO AHCI (SATA) M.2 2280 250GB (MZ-N6E250BW)

This hardware is currently (Aug 2021) available for ~ $900 in total. You can go much cheaper with some Intel NUC system that comes with a basic RAM / SSD setup. But we do recommend the Shuttle XPC line as it is an industry-grade platform and we never had any issues with its predecessors DS57U / DS77U. It's also 100% quiet, producing no noise at all.

And maybe you want to save some bucks and not get the powerful DS10U5 with an i5, but rather a DS10U3 with an i3 or even the base model with a Celeron 4205U.

As external 8TB SSD (we're up for speed!), I do recommend the following:

  • Samsung 870 QVO 8TB (model MZ-77Q8T0BW), or go for 4/2/1 TB variant, if that is enough for you
  • StarTech USB 3.1 to SATA adapter (model USB312SAT3CB)

Do NOT use any USB3-to-SATA adapter that emulates its own device, e.g. this shitty piece of crap: Manhattan USB 3.0 to SATA (model 130424). The adapter should support direct access to your drive. The StarTech adapter supports UASP (USB Attached SCSI) which is what we want!
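
To double-check that a given adapter really runs in UASP mode, you can look at which USB storage driver the Linux kernel bound to it (a quick check, assuming the drive is already connected):

$ lsusb -t | grep 'Class=Mass Storage'
# a UASP-capable adapter shows up with Driver=uas,
# a plain bulk-only adapter shows up with Driver=usb-storage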

This whole hardware setup might sound luxurious in your eyes. It all depends on how important the (customer) data is and how far you want to optimize disaster recovery time. You can go much cheaper if all that doesn't matter to you.

These are my recommendations and they have been very well tested at Onlime GmbH webhosting.

System Setup

Install Proxmox VE 7 (Debian Bullseye based) from a USB stick, which you create as follows:

$ wget https://www.proxmox.com/images/download/pve/iso/proxmox-ve_7.0-1.iso
$ cat proxmox-ve_7.0-1.iso > /dev/sdx

On macOS, use diskutil list to identify the correct device name of your USB stick and run diskutil unmountDisk /dev/disk2 prior to writing to it. Something like:

$ diskutil list
$ sudo su -
$ diskutil unmountDisk /dev/disk2
$ cat proxmox-ve_7.0-1.iso > /dev/disk2
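
Optionally, verify the downloaded ISO before writing it, using the SHA-256 sum published on the Proxmox download page (the actual checksum is intentionally not repeated here):

$ sha256sum proxmox-ve_7.0-1.iso    # on macOS: shasum -a 256 proxmox-ve_7.0-1.iso
# compare the output against the checksum listed on proxmox.com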

I am not going to explain how to set up Proxmox VE here as its installer is super slick and I am sure you are going to manage that by yourself.

Build encryption key

As promised, the encryption key for ZFS native encryption (and a second LUKS encryption key, should we ever need LUKS again) is only stored at runtime in ramfs (volatile memory, aka RAM). It is built upon first login as root. So let's hook the script below into /root/.profile:

echo "/usr/local/sbin/build-encryption-key.sh" >> /root/.profile

Deploy the following script to /usr/local/sbin/build-encryption-key.sh:

build-encryption-key.sh
#!/bin/bash
#
# This script is usually called on first login and asks for a password
# to build the LUKS + ZFS encryption keys, which are then stored only in volatile
# memory /mnt/ramfs (ramfs).
# This script is added to your /root/.profile so that you won't forget to
# build the encryption keys each time you reboot the server.
#
# We are using ramfs instead of tmpfs as ramfs has no swapping support,
# which is good from a security perspective.
# see: http://www.thegeekstuff.com/2008/11/overview-of-ramfs-and-tmpfs-on-linux
#

################### CONFIGURATION #######################
RAMFS_PATH='/mnt/ramfs'
RAMFS_SIZE='20M'
LUKS_KEYFILE=$RAMFS_PATH/luks_pw
ZFS_KEYFILE=$RAMFS_PATH/zfs_enc_key
SALT='**************************************************'
SHA1_CHECK='fb9e740efe20f541349d37eff7aa34efd4ac823d'
#########################################################

printinfo() {
    echo "[INFO] $1"
}

printwarn() {
    echo "[WARNING] $1" | grep --color "WARNING"
}

if [[ -f "$LUKS_KEYFILE" && -f "$ZFS_KEYFILE" ]]; then
    # exit silently, as the key file already exists
    exit
else
    printwarn "The LUKS ($LUKS_KEYFILE) and ZFS ($ZFS_KEYFILE) key files don't yet exist!"
fi

# set up RAM disk if it is not yet mounted
if ! mountpoint -q "$RAMFS_PATH"; then
    echo "Setting up ramfs on $RAMFS_PATH (size=$RAMFS_SIZE) ..."
    mkdir -p $RAMFS_PATH
    mount -t ramfs -o size=$RAMFS_SIZE ramfs $RAMFS_PATH
    if ! grep -q "$RAMFS_PATH" /etc/fstab; then
        echo "Adding line to /etc/fstab to persist mounting of $RAMFS_PATH ..."
        echo "ramfs   $RAMFS_PATH              ramfs   defaults,size=$RAMFS_SIZE        0 0" >> /etc/fstab
    fi
fi

# make this script start on each login
SCRIPT=$(readlink -f $0)
if ! grep -q "$SCRIPT" /root/.profile; then
    echo "$SCRIPT" >> /root/.profile
fi

# get password from interactive user input
while read -s -p 'Unlock encryption keys: ' PASS && [[ $(echo -n "$PASS" | wc --chars) -lt 8 ]] ; do
    echo
    echo "Your password must be at least 8 characters long!"
done
echo

# calculate encryption key (SHA-512 hash of salt.password concatenation)
KEY=$(echo -n "$SALT.$PASS" | sha512sum | cut -d' ' -f1)

# store LUKS key file to ramfs
touch $LUKS_KEYFILE && chmod 600 $LUKS_KEYFILE
echo -n "$KEY" > $LUKS_KEYFILE

# SHA-1 check of the key - assure you have correctly built it by entering the correct password
KEY_SHA1=`cat $LUKS_KEYFILE | sha1sum | cut -d' ' -f1`
if [ "$KEY_SHA1" != "$SHA1_CHECK" ]; then
    printwarn "Your key does not seem to be correct. You might have entered the wrong password. Please run `basename $SCRIPT` again!"
    printinfo "If you are sure you have entered the right password, try to set SHA1_CHECK='$KEY_SHA1' in `basename $SCRIPT`."
    rm -f $LUKS_KEYFILE
    exit 1
fi

printinfo "The LUKS key file was successfully stored in $LUKS_KEYFILE."

# ZFS raw key needs to be exactly 32 bytes long
touch $ZFS_KEYFILE && chmod 600 $ZFS_KEYFILE
head -c 32 $LUKS_KEYFILE > $ZFS_KEYFILE
printinfo "The ZFS key file was successfully stored in $ZFS_KEYFILE."

############################
# SSH private key encryption
############################
#
# INITIAL SETUP:
#
# Initially, we need to create an encrypted version of id_rsa and destroy the plaintext private key:
#   $ cat ~/.ssh/id_rsa | openssl enc -e -aes-256-cbc -pbkdf2 -a -pass file:/mnt/ramfs/luks_pw > ~/.ssh/id_rsa.encrypted
#
# or you could also copy the password from /mnt/ramfs/luks_pw and enter it interactively:
#   $ cat ~/.ssh/id_rsa | openssl enc -e -aes-256-cbc -pbkdf2 -a > ~/.ssh/id_rsa.encrypted
#   $ enter aes-256-cbc encryption password: (...)
#   $ chmod 600 ~/.ssh/id_rsa.encrypted
#   $ shred -u ~/.ssh/id_rsa
#
# Updating the encrypted version of id_rsa:
#   $ cat /mnt/ramfs/id_rsa.decrypted | openssl enc -e -aes-256-cbc -pbkdf2 -a -pass file:/mnt/ramfs/luks_pw > ~/.ssh/id_rsa.encrypted
# Testing encryption and decryption:
#   $ cat /mnt/ramfs/id_rsa.decrypted | openssl enc -e -aes-256-cbc -pbkdf2 -a -pass file:/mnt/ramfs/luks_pw > ~/.ssh/id_rsa.encrypted-testing
#   $ cat ~/.ssh/id_rsa.encrypted-testing | openssl base64 -d | openssl enc -d -aes-256-cbc -pbkdf2 -pass file:/mnt/ramfs/luks_pw > /mnt/ramfs/id_rsa.testing
#   $ diff /mnt/ramfs/id_rsa.decrypted /mnt/ramfs/id_rsa.testing
#   $ shred -u /mnt/ramfs/id_rsa.testing
#
if [ -f ~/.ssh/id_rsa.encrypted ]; then
    touch $RAMFS_PATH/id_rsa.decrypted && chmod 600 $RAMFS_PATH/id_rsa.decrypted
    cat ~/.ssh/id_rsa.encrypted | openssl base64 -d | openssl enc -d -aes-256-cbc -pbkdf2 -pass file:$LUKS_KEYFILE > $RAMFS_PATH/id_rsa.decrypted
    ln -sf $RAMFS_PATH/id_rsa.decrypted ~/.ssh/id_rsa
    printinfo "SSH private key was successfully decrypted to $RAMFS_PATH/id_rsa.decrypted."
else
    printwarn "Please encrypt your SSH private key to ~/.ssh/id_rsa.encrypted so I can decrypt it to ramfs."
fi

But before using this script in production, make sure you have set your own SALT in the CONFIGURATION section. Don't worry about SHA1_CHECK yet - it is only used to verify that the key got built correctly. Upon first run, the script will provide you with the right value. The unlock password can be freely chosen, just make sure you remember it:

$ build-encryption-key.sh
Unlock encryption keys: (choose a new unlock password)
[WARNING] Your key does not seem to be correct. You might have entered the wrong password. Please run build-encryption-key.sh again!
[INFO] If you are sure you have entered the right password, try to set SHA1_CHECK='fb9e740efe20f541349d37eff7aa34efd4ac823d' in build-encryption-key.sh.

The unlock password is only needed to unlock/generate your LUKS + ZFS encryption keys and is only used as a second factor. For ZFS native encryption, we are then only going to use the generated /mnt/ramfs/zfs_enc_key which you should not store anywhere else!!!
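
For illustration, the whole key derivation in the script boils down to these two lines (same variable names as in build-encryption-key.sh):

# /mnt/ramfs/luks_pw: SHA-512 hex digest (128 characters) of "<SALT>.<password>"
$ echo -n "$SALT.$PASS" | sha512sum | cut -d' ' -f1 > /mnt/ramfs/luks_pw
# /mnt/ramfs/zfs_enc_key: first 32 characters of luks_pw (a ZFS raw key must be exactly 32 bytes)
$ head -c 32 /mnt/ramfs/luks_pw > /mnt/ramfs/zfs_enc_key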

Set the correct SHA1_CHECK (from the above output) in /usr/local/sbin/build-encryption-key.sh and test that the script gets correctly invoked upon first login as root:

extbackup$ sudo su -
[WARNING] The LUKS (/mnt/ramfs/luks_pw) and ZFS (/mnt/ramfs/zfs_enc_key) key files don't yet exist!
Unlock encryption keys: 
[INFO] The LUKS key file was successfully stored in /mnt/ramfs/luks_pw.
[INFO] The ZFS key file was successfully stored in /mnt/ramfs/zfs_enc_key.
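
A quick sanity check that the keys really only live in volatile memory and have the expected length:

$ mount | grep /mnt/ramfs              # must be of type ramfs, not a disk-backed filesystem
$ ls -l /mnt/ramfs                     # luks_pw (128 bytes) and zfs_enc_key (32 bytes)
$ wc -c < /mnt/ramfs/zfs_enc_key
32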

Set up SSH keypair with encrypted private key

You may have noticed that our build-encryption-key.sh script also tries to decrypt your SSH private key to ramfs upon first login as root. We are going to use the default identity file ~/.ssh/id_rsa to access the remote backup server over SSH (using pve-zsync to pull data). But we don't want the private key to be stored in plaintext anywhere on persistent storage.

build-encryption-key.sh cares about the following:

  • Asks for encryption key password
  • Builds LUKS encryption key /mnt/ramfs/luks_pw using a seed and the provided password
  • Derives ZFS encryption key /mnt/ramfs/zfs_enc_key from LUKS encryption key (first 32 chars)
  • Decrypts the encrypted SSH private key of user root ~/.ssh/id_rsa.encrypted to /mnt/ramfs/id_rsa.decrypted

Initially, you need to do this in two steps, as we have not yet encrypted the id_rsa private key.

Also see setup instructions in comments inside build-encryption-key.sh!

$ ssh extbackup
$ sudo su -
[WARNING] The LUKS (/mnt/ramfs/luks_pw) and ZFS (/mnt/ramfs/zfs_enc_key) key files don't yet exist!
Unlock encryption keys: 
[INFO] The LUKS key file was successfully stored in /mnt/ramfs/luks_pw.
[INFO] The ZFS key file was successfully stored in /mnt/ramfs/zfs_enc_key.
[WARNING] Please encrypt your SSH private key to ~/.ssh/id_rsa.encrypted so I can decrypt it to ramfs.

# regenerate RSA/4096 keypair, if not already done
$ ssh-keygen -t rsa -b 4096 -P ''

# encrypt SSH private key
$ cd /root/.ssh/
$ cat id_rsa | openssl enc -e -aes-256-cbc -pbkdf2 -a -pass file:/mnt/ramfs/luks_pw > id_rsa.encrypted
$ chmod 600 id_rsa.encrypted

Let's now secure erase our plaintext private key:

$ shred -u /root/.ssh/id_rsa

Testing, logout and login again:

$ rm -f /mnt/ramfs/*
$ logout
$ sudo su -
[WARNING] The LUKS (/mnt/ramfs/luks_pw) and ZFS (/mnt/ramfs/zfs_enc_key) key files don't yet exist!
Unlock encryption keys: 
[INFO] The LUKS key file was successfully stored in /mnt/ramfs/luks_pw.
[INFO] The ZFS key file was successfully stored in /mnt/ramfs/zfs_enc_key.
[INFO] SSH private key was successfully decrypted to /mnt/ramfs/id_rsa.decrypted.

Your keypair is now ready. Copy-paste your public key to the remote backup server:

$ cat /root/.ssh/id_rsa.pub

Append it to /root/.ssh/authorized_keys on backup server.
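
All examples here assume that backup (and extbackup) resolve as SSH host aliases. If they don't in your environment, a minimal /root/.ssh/config on extbackup could look like this (hostname is a placeholder for your own setup):

Host backup
    HostName backup.example.com
    User root
    IdentityFile ~/.ssh/id_rsa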

Encrypt external drive: Set up dpool

NOTE: We only need to set up the first external drive. To set up further external drives, use mirroring as described in the final chapter below.

QUICK HOWTO: Initialize a new external drive /dev/sdb

# encrypt disk / set up dpool (use -f to force, if dpool already exists)
$ zpool create -O acltype=posixacl -O encryption=on -O keylocation=file:///mnt/ramfs/zfs_enc_key -O keyformat=raw dpool sdb
# check status
$ zfs get encryption,keylocation,keystatus,keyformat dpool

# set up LABEL and dataset(s)
$ echo -n "1a" > /dpool/LABEL
$ zfs create dpool/zfsdisks

# unmount and export (only disconnect disk, after exporting dpool!)
$ zfs unmount dpool && zfs unload-key dpool
$ zpool export dpool

In more detail...

Initialize a new external drive with ZFS native encryption, putting everything under encrypted pool dpool (name stands for "data-pool", while rpool is the default system "root-pool" in Proxmox VE):

With encryption=on, current OpenZFS releases default to the aes-256-gcm algorithm, as confirmed by the output below (don't be confused by older/misleading ZFS documentation that still mentions aes-128-ccm or aes-256-ccm as the default).

# use lsblk to check device names
$ lsblk

# create dpool on /dev/sdb
# NOTE: key must be exactly 32 characters long, so use /mnt/ramfs/zfs_enc_key (first 32 chars of luks_pw)
$ zpool create -O acltype=posixacl -O encryption=on -O keylocation=file:///mnt/ramfs/zfs_enc_key -O keyformat=raw dpool sdb

# check pool and encryption status
$ zpool status dpool
  pool: dpool
 state: ONLINE
config:
    NAME        STATE     READ WRITE CKSUM
    dpool       ONLINE       0     0     0
      sdb       ONLINE       0     0     0

$ zfs get encryption,keylocation,keystatus,keyformat dpool
NAME   PROPERTY     VALUE                          SOURCE
dpool  encryption   aes-256-gcm                    -
dpool  keylocation  file:///mnt/ramfs/zfs_enc_key  local
dpool  keystatus    available                      -
dpool  keyformat    raw                            -
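
If you prefer not to rely on the default, you could also pin the algorithm explicitly when creating the pool:

$ zpool create -O acltype=posixacl -O encryption=aes-256-gcm \
    -O keylocation=file:///mnt/ramfs/zfs_enc_key -O keyformat=raw dpool sdb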

Unmount/encrypt dpool like this:

$ zfs unmount dpool
$ zfs unload-key dpool
$ zpool export dpool

WARNING: Never forget to export dpool before you detach the external drive or before you reboot!
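
A quick way to confirm that the pool is really exported before you pull the drive:

$ zpool list
# dpool must no longer appear in this list; only rpool should be left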

Mount/decrypt dpool like this:

$ zpool import dpool
$ zfs load-key dpool
$ zfs mount -l -a

man zfs:

zfs mount [-Olv] [-o options] -a | filesystem
 -a  Mount all available ZFS file systems.  Invoked automatically as
     part of the boot process if configured.
 -l  Load keys for encrypted filesystems as they are being mounted. This
     is equivalent to executing zfs load-key on each encryption root be‐
     fore mounting it. Note that if a filesystem has a keylocation of
     prompt this will cause the terminal to interactively block after
     asking for the key.

Finally, create a dataset on dpool - it will also be encrypted:

$ zfs create dpool/zfsdisks

$ zfs get encryption,keystatus dpool/zfsdisks
NAME            PROPERTY    VALUE        SOURCE
dpool/zfsdisks  encryption  aes-256-gcm  -
dpool/zfsdisks  keystatus   available    -

Set up mounting aliases

Let's set up aliases for the commonly used mounting commands in /etc/profile.d/zfs-encryption.sh:

alias mount-extbackup='zpool import dpool && zfs load-key dpool && zfs mount -l -a'
alias umount-extbackup='zfs unmount dpool && zfs unload-key dpool && zpool export dpool'

We can now use those aliases to mount/unmount our external drive to /dpool:

$ mount-extbackup
$ cat /dpool/LABEL
$ umount-extbackup

Limit SSH access to pve-zsync operations

For security reasons, we want to limit SSH access on the remote backup server to only those specific commands that pve-zsync fires on the remote side.

In the extbackup-zfs.sh backup script below, pve-zsync is executed as follows, e.g. for VMID 181:

$ pve-zsync sync --source backup:dpool/zfsdisks/subvol-181-disk-1 --dest dpool/zfsdisks --maxsnap 12 --name extbackup

This produces the following remote commands...

Sample pve-zsync remote commands on first run:

$ zfs list -r -t snapshot -Ho name -S creation dpool/zfsdisks/subvol-181-disk-1
$ zfs snapshot dpool/zfsdisks/subvol-181-disk-1@rep_extbackup_2021-07-12_23:15:53
$ zfs send -- dpool/zfsdisks/subvol-181-disk-1@rep_extbackup_2021-07-12_23:15:53

Sample pve-zsync remote commands on the following runs:

$ zfs list -rt snapshot -Ho name dpool/zfsdisks/subvol-181-disk-1@rep_extbackup_2021-07-12_23:15:53
$ zfs send -i dpool/zfsdisks/subvol-181-disk-1@rep_extbackup_2021-07-12_23:15:53 -- dpool/zfsdisks/subvol-181-disk-1@rep_extbackup_2021-07-12_23:48:52
$ zfs destroy dpool/zfsdisks/subvol-181-disk-1@rep_extbackup_2021-07-12_23:13:13

To enforce these specific restrictions, we need to use a script in the command= section before extbackup's pubkey in .ssh/authorized_keys on backup:

command="/root/.ssh/allowed-commands-onlime.sh",no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding ssh-rsa AAAAB3N... root@extbackup

allowed-commands-onlime.sh then does pattern matching to verify the commands, using $SSH_ORIGINAL_COMMAND to get the original command:

allowed-commands-onlime.sh
#!/bin/bash
#
# https://serverfault.com/a/803873/299863
# You can have only one forced command in ~/.ssh/authorized_keys. Use this wrapper to allow several commands.
# And use https://3widgets.com/ - Regex Numeric Range Generator
#

## CONFIGURATION ##############################################
# host range 194-254
veid_pattern='(19[4-9]|2[0-4][0-9]|25[0-4])'
backup_srcdir='/backup'
pool_name=dpool
###############################################################

# Regex patterns
ds_pattern="${pool_name}/zfsdisks/subvol-${veid_pattern}-disk-1"
ts_pattern='20[2-9][0-9]-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])_(0[0-9]|1[0-9]|2[0-3])(:(0[0-9]|[1-5][0-9])){2}'
snap_prefix_pattern='rep_extbackup_([1-9]a_)?'
snap_pattern="@${snap_prefix_pattern}${ts_pattern}"
ds_snap_pattern="${ds_pattern}${snap_pattern}"

patt1='zfs list -r( -)?t snapshot -Ho name( -S creation)? '"$ds_pattern($snap_pattern)?"
patt2='zfs (snapshot|destroy) '"$ds_snap_pattern"
patt3='zfs send '"(-i $ds_snap_pattern )?-- $ds_snap_pattern"
patt4="zfs rename $ds_snap_pattern $ds_snap_pattern" # only used in extbackup-migrate-zfs-snaps.sh

cmd="$SSH_ORIGINAL_COMMAND"

if [[ "$cmd" == 'list-datasets' ]]; then
    # special command to get a server list of all datasets and mountpoints
    zfs list -H -o name,mountpoint,usedds | grep $backup_srcdir
    exit
elif [[ $cmd =~ ^$patt1$ || $cmd =~ ^$patt2$ || $cmd =~ ^$patt3$ || $cmd =~ ^$patt4$ ]]; then
    $SSH_ORIGINAL_COMMAND
    exit
else
    logger "$(basename $0) violation: $cmd"
    echo "Access denied"
    exit 1
fi

We can then test the magic list-datasets command to get a list of all datasets and mountpoints from the backup server:

extbackup$ ssh backup list-datasets

This magic command is going to be used by our extbackup script to loop over the ZFS datasets that need to be backed up.
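
The output is the tab-separated name / mountpoint / usedds listing produced by zfs list -H -o name,mountpoint,usedds on the remote side; it looks roughly like this (shown with spaces, dataset names and sizes are just placeholders):

dpool/zfsdisks/subvol-181-disk-1    /backup/zfsdisks/subvol-181-disk-1    123G
dpool/zfsdisks/subvol-182-disk-1    /backup/zfsdisks/subvol-182-disk-1    46.2G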

Final extbackup script

The following script basically decrypts and mounts the external USB drive, then runs pve-zsync to pull the data from backup for each dataset, and unmounts / closes the encrypted device right afterwards.

extbackup-zfs.sh:

extbackup-zfs.sh
#!/bin/bash

######### CONFIGURATION ############
USB_DEVICE=/dev/sdb
DISKS=3
DATAPOOL=dpool
BKUP_SERVER="backup"
BKUP_SRCDIR="/backup"
MAXSNAP=1
KEYFILE="/mnt/ramfs/zfs_enc_key"
LOGFILE="/var/log/extbackup.log"
EXCLUDES=() # datasets that should be excluded from backup
SCRIPTNAME=$(basename $0)
CREATE_RERUN=1 # default to creating rerun on failure
RERUN_FILE="/tmp/$SCRIPTNAME.rerun"
####################################

usage() {
    echo "Usage:"
    echo "$(basename $0) [-m|--maxsnap N] [-r|--rerun]"
    echo
    echo "Where:"
    echo "  -m|--maxsnap N  The number of snapshots to keep until older ones are erased (default: 1)"
    echo "  -r|--rerun      Rerun mode: rerun only if previous run has failed"
    exit 1
}

# Get options
while :
do
    case $1 in
        -m | --maxsnap)
            MAXSNAP="${2:-$MAXSNAP}"
            shift
            shift
            ;;
        -r | --rerun)
            RERUN="true"
            shift
            ;;
        -h | --help)
            usage
            ;;
        *)  # no more options. Stop while loop
            break
            ;;
    esac
done

# include common functions
source /usr/local/sbin/common-functions.sh
# ensure this is the only running instance
assert_single_instance
# append STDOUT and STDERR to logfile and output both at the same time
logall_output_stderr_stdout $LOGFILE

# zfs command convenience aliases
zfsgetval='zfs get -H -o value'

# Rerun if `--rerun` was provided and previous run has failed
if [ -n "$RERUN" ]; then
    if [[ ! -f $RERUN_FILE ]]; then
        # printinfo "did not find rerun file $RERUN_FILE; exiting OK"
        exit
    fi
elif [[ $CREATE_RERUN -eq 1 ]]; then
    touch $RERUN_FILE
fi

# check if key file exists
if [ ! -f $KEYFILE ]; then
    errquit "The key file $KEYFILE does not exist yet. Please run build-encryption-key.sh first!"
fi

# First, make sure /dev/sdb will wake up and actually exists (using parted as a simple workaround)
parted -s $USB_DEVICE print > /dev/null
if [ $? -ne 0 ]; then
    errquit "External USB disk does not seem to show up as $USB_DEVICE. Backup script aborted!"
fi

# Import dpool, load encryption key and mount all datasets
printinfo "Importing ZFS pool $DATAPOOL ..."
zpool import $DATAPOOL
if [ $? -ne 0 ]; then
    errquit "Could not import $DATAPOOL. Backup script aborted!"
fi
printinfo "Loading key for ZFS pool $DATAPOOL ..."
zfs load-key $DATAPOOL
printinfo "Mouting/decrypting all ZFS datasets ..."
zfs mount -l -a

# Check if decryption/mounting of dpool was done
if [ $? -ne 0 ]; then
    errquit "Could not mount/decrypt ZFS filesystems. Backup script aborted!"
fi
if [ "$($zfsgetval keystatus $DATAPOOL)" != "available" ]; then
    errquit "ZFS dataset $DATAPOOL does not seem to be correctly decrypted. Backup script aborted!"
fi

# abort on missing /dpool/LABEL or wrongly formatted label
if [ ! -f /$DATAPOOL/LABEL ]; then
    errquit "Disk label (/$DATAPOOL/LABEL) does not exist. Backup script aborted!!!"
fi
label=$(cat /$DATAPOOL/LABEL)
if [[ ! $label =~ ^[1-$DISKS]a$ ]]; then
    errquit "Invalid label in /$DATAPOOL/LABEL: $label"
fi

printinfo "=== STARTING EXTBACKUP TO DISK WITH LABEL $label ==="

# loop over all datasets and run pve-zsync
# `list-datasets` is a magic command that is resolved to `zfs list -H -o name,mountpoint,usedds | grep /backup` on remote
ssh $BKUP_SERVER list-datasets | while read dataset mountpoint usedds; do
    zsync="pve-zsync sync --source $BKUP_SERVER:$dataset --dest $(dirname $dataset) --maxsnap $MAXSNAP --name extbackup_$label"
    # check if the dataset should be excluded from extbackup
    if [[ ${EXCLUDES[*]} =~ $dataset ]]; then
        printwarn "(SKIPPED) $zsync"
        continue
    fi

    # check if dataset exists
    if zfs list $dataset &>/dev/null; then
        # unmount dataset prior to pve-zsync run to avoid "dataset is busy"
        zfs unmount $dataset
    fi
    printinfo "$zsync # $usedds"
    # use `< /dev/null` as workaround to avoid pve-zsync breaking out of while loop (as it somehow expects input)
    $zsync < /dev/null

    # set mountpoint to the same as on remote backup server
    if [ "$($zfsgetval mountpoint $dataset)" != "$mountpoint" ]; then
        cmd="zfs set mountpoint=$mountpoint $dataset"
        printinfo "$cmd"
        $cmd
        sleep 1
    else
        zfs mount $dataset
    fi
done

printinfo "Unmounting/encrypting ZFS pool $DATAPOOL ..."
zfs unmount $DATAPOOL
printinfo "Unloading key for ZFS pool $DATAPOOL ..."
zfs unload-key $DATAPOOL
printinfo "Exporting ZFS pool $DATAPOOL ..."
zpool export $DATAPOOL
if [ $? -ne 0 ]; then
    errquit "Could not export $DATAPOOL. Please fix!"
fi

# Remove rerun file on successful run
rm -f $RERUN_FILE

sync
printinfo "DONE."

common-functions.sh:

common-functions.sh
#!/bin/bash

####### CONFIGURATION #################
TSFORMAT="%Y-%m-%d %H:%M:%S"
EXECUTING_SCRIPT=$(basename $0)
#######################################

printinfo() {
    echo "[`date +"$TSFORMAT"`] $1"
}

printwarn() {
    echo "[`date +"$TSFORMAT"`] WARNING: $1" | grep --color "WARNING"
}

printerr() {
    >&2 echo "[`date +"$TSFORMAT"`] ERROR: $1"
}

errquit() {
    if [ -n "$1" ]; then
        printerr "$1"
    fi
    if [ -n "$LOCKFILE" ]; then
        rm -f $LOCKFILE
    fi
    exit 1
}

assert_single_instance() {
    # Check if another instance of this script is already running
    # https://stackoverflow.com/a/16807995/5982842
    for pid in $(pidof -x $EXECUTING_SCRIPT); do
        if [ $pid != $$ ]; then
            errquit "$EXECUTING_SCRIPT process is already running with PID $pid"
        fi
    done
}

logall_output_stderr_stdout() {
    logfile=${1:-${EXECUTING_SCRIPT}.log}
    # Append STDOUT and STDERR to logfile and output both at the same time
    exec >  >(tee -ia $logfile)
    exec 2> >(tee -ia $logfile >&2)
}

Run this as a weekly cronjob, e.g. via /etc/cron.d/extbackup, starting Sat morning:

# weekly extbackup on encrypted USB drive (Sat 07:30AM), rerun on failure the next 3 days
30 07   * * 6   root    extbackup-zfs.sh
30 07   * * 0-2 root    extbackup-zfs.sh --rerun    

On first run (e.g. in a screen session instead of via cronjob), make sure you follow the log:

$ tail -f /var/log/extbackup.log
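
Since each run appends to /var/log/extbackup.log, you may also want to rotate that logfile, e.g. with a small /etc/logrotate.d/extbackup (a sketch, adjust retention to your needs):

/var/log/extbackup.log {
    monthly
    rotate 12
    compress
    missingok
    notifempty
}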

Set up another encrypted external drive

We want to rotate several external drives, storing them in different physical locations. Setting up an additional drive should not require us to transfer all the data again from the backup server (which in our case was > 4TB that needed to be transferred between two datacenters). We want to start from the latest external drive, replicate its data instead, and then only continue with incremental backups.

Idea

Problem: ZFS does not support duplicating/copying existing snapshots on the same dataset (under a new snapshot name, obviously). So the initial idea was to set up encryption on the new disk first, then connect the first disk and import the second one as dpool2, and copy all datasets over using zfs send|recv. That would force us to write another complex script. Way too complicated! Let's use mirroring to replicate the full disk instead!

Solution:

  • Part 1) Replicate Disk
    1. Connect existing disk with LABEL 1a and import dpool: zpool import dpool
    2. Connect new (empty) disk and attach it to dpool, creating a mirror: zpool attach dpool sdb sdd
    3. Wait until resilvering has completed
    4. Split mirror again to have new disk in a separate dpool2: zpool split dpool dpool2
    5. Export first dpool, disconnect first disk, and import dpool2 by renaming it to dpool: zpool import dpool2 dpool
    6. Export dpool (now the new final replicated disk), never forget exporting before detaching a disk!
  • Part 2) Renaming LABEL / snapshots
    1. Remember, we have a full copy of the initial disk with LABEL 1a on the new disk, which should be renamed to 2a
    2. mount/decrypt new disk and check label (still must be the same as the previous disk!), unmount/encrypt
    3. run extbackup-zfs.sh --maxsnap 2 with old LABEL, to create new snapshots that can be renamed (and keeping first ones alive on backup for first disk)
    4. change LABEL
    5. run extbackup-migrate-zfs-snaps.sh to rename all local and remote (backup) snapshots to new LABEL
    6. purge local snapshots with old label by running extbackup-migrate-zfs-snaps.sh --purge
    7. unmount/encrypt disk

Part 1) Replicate a disk

Replicate a full disk by mirroring dpool:

# On extbackup with 1st disk (LABEL 1a) connected
$ zpool import dpool
$ zpool status dpool
  pool: dpool
 state: ONLINE
config:
    NAME        STATE     READ WRITE CKSUM
    dpool       ONLINE       0     0     0
      sdb       ONLINE       0     0     0

# Connect 2nd disk (already partitioned or not should not matter)
$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sdb      8:16   0   7.3T  0 disk 
├─sdb1   8:17   0   7.3T  0 part 
└─sdb9   8:25   0    64M  0 part 
sdd      8:48   0   7.3T  0 disk 

# Attach new device to existing zpool, creating a mirror
$ zpool attach dpool sdb sdd
# if the disk was already initialized before, you may need to force with -f:
$ zpool attach -f dpool sdb sdd
# Resilvering starts immediately.

# Wait until fully resilvered ...
$ zpool status dpool
  pool: dpool
 state: ONLINE
  scan: resilvered 4.73T in 09:15:53 with 0 errors on ...
config:
    NAME        STATE     READ WRITE CKSUM
    dpool       ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        sdb     ONLINE       0     0     0
        sdd     ONLINE       0     0     0

You can use the zpool split command to detach disks from a mirrored ZFS storage pool to create a new pool with one of the detached disks. The new pool will have identical contents to the original mirrored ZFS storage pool:

$ zpool split dpool dpool2
# dpool2 is not imported automatically, which is what we want!

# now, export dpool to free up this pool name
$ zpool export dpool

# rename new dpool2 to dpool and export it again
$ zpool import dpool2 dpool
$ zpool export dpool

# disconnect both disks and list importable pools after reconnecting the second disk (should then show up as /dev/sdb)
$ zpool import
   pool: dpool
     id: 9764679279839179262
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:
    dpool                                          ONLINE
      ata-Samsung_SSD_870_QVO_8TB_S5SSNF0R201266B  ONLINE

# import it to check status
$ zpool import dpool
$ zpool status dpool
  pool: dpool
 state: ONLINE
  scan: resilvered 4.61T in 08:45:42 with 0 errors on ...
config:
    NAME                                           STATE     READ WRITE CKSUM
    dpool                                          ONLINE       0     0     0
      ata-Samsung_SSD_870_QVO_8TB_S5SSNF0R201266B  ONLINE       0     0     0

$ zpool export dpool

All cool! We now have two identical disks, both still having label set to 1a in /dpool/LABEL.

Part 2) Renaming LABEL / snapshots

Now, set up new replicated disk:

  1. mount/decrypt new disk and check label (still must be the same as the previous disk!), unmount/encrypt
  2. run extbackup-zfs.sh --maxsnap 2 with old LABEL, to create new snapshots that can be renamed (and keeping first ones alive on backup for first disk).
  3. change LABEL to 2a, 3a,...
  4. run extbackup-migrate-zfs-snaps.sh to rename all local and remote (backup) snapshots to new LABEL
  5. purge local snapshots with old label by running extbackup-migrate-zfs-snaps.sh --purge
  6. unmount/encrypt disk

For mounting/decryption and encryption/unmounting, we're using the following aliases in the commands below:

alias mount-extbackup='zpool import dpool && zfs load-key dpool && zfs mount -l -a'
alias umount-extbackup='zfs unmount dpool && zfs unload-key dpool && zpool export dpool'
$ mount-extbackup
# check LABEL (must match previous disk, DON'T CHANGE IT YET!) and review snapshots
$ cat /dpool/LABEL
1a
$ zfs list -H -t snapshot -o name

# ... and unmount it again (to be ready for extbackup-zfs.sh)
$ umount-extbackup

# run extbackup by keeping 2 snapshots (and still pretending we're 1a, keeping old LABEL !!)
$ screen
$ extbackup-zfs.sh --maxsnap 2

# after extbackup completion change LABEL
$ mount-extbackup
$ echo -n "2a" > /dpool/LABEL

# Rename latest @rep_extbackup_1a_* snapshots to @rep_extbackup_2a_* (new LABEL)
$ extbackup-migrate-zfs-snaps.sh --dryrun
# This will rename them both locally and on remote backup server
$ extbackup-migrate-zfs-snaps.sh

# destroy old snapshots (only on extbackup, KEEP THEM ON backup !!!)
$ extbackup-migrate-zfs-snaps.sh --purge --dryrun
$ extbackup-migrate-zfs-snaps.sh --purge
# review (there should now only be @rep_extbackup_2a_* snapshots)
$ zfs list -H -t snapshot -o name | grep -v @rep_extbackup_2a

# unmount/encrypt disk
$ umount-extbackup

The disk can now be unplugged and is ready for the next backup run.

Helper script extbackup-migrate-zfs-snaps.sh:

extbackup-migrate-zfs-snaps.sh
#!/bin/bash
######### CONFIGURATION ############
DISKS=3
DATAPOOL=dpool
OLDLABEL=1a
BKUP_SERVER="backup"
####################################

usage() {
    echo "Usage:"
    echo "$(basename $0) [-l|--label OLDLABEL]"
    echo
    echo "Where:"
    echo "  -l|--label OLDLABEL  The label of the disks where the previous snapshots have been performed (default: 1a)"
    echo "  -s|--dryrun          Dry-run / simulation mode: Don't rename any snapshots"
    echo "  -p|-purge            Purge all snapshots that don't match current disk label (only run this as a final migration!)"
    exit 1
}

# Get options
while :
do
    case $1 in
        -l | --label)
            OLDLABEL="${2:-$OLDLABEL}"
            shift
            shift
            ;;
        -s | --dryrun)
            DRYRUN="true"
            shift
            ;;
        -p | --purge)
            PURGE="true"
            shift
            ;;
        -h | --help)
            usage
            ;;
        *)  # no more options. Stop while loop
            break
            ;;
    esac
done

# include common functions
source /usr/local/sbin/common-functions.sh

# zfs command convenience aliases
zfslistname='zfs list -H -o name'

# abort on missing label
if [ ! -f /$DATAPOOL/LABEL ]; then
    errquit "Disk label (/$DATAPOOL/LABEL) does not exist. Backup script aborted!!!"
fi
label=$(cat /$DATAPOOL/LABEL)
if [[ ! $label =~ ^[1-$DISKS]a$ ]]; then
    errquit "Invalid label in /$DATAPOOL/LABEL: $label"
fi

dspattern="$DATAPOOL/zfsdisks/[[:alnum:]-]+"
for dataset in $($zfslistname -t filesystem -d 2 $DATAPOOL); do
    if [[ $dataset =~ ^$dspattern$ ]]; then
        # local purge run (as a final migration)
        if [ -n "$PURGE" ]; then
            for snapshot in $($zfslistname -t snapshot -s creation $dataset | grep @rep_extbackup_${OLDLABEL}_); do
                cmd="zfs destroy $snapshot"
                if [ -z "$DRYRUN" ]; then
                    printinfo "$cmd"
                    $cmd
                else
                    printinfo "(DRYRUN) $cmd"
                fi
            done
            continue # don't proceed with snapshot renaming below, as that is part of first migration
        fi

        # snapshot renaming (first migration)
        snapold=$($zfslistname -t snapshot -s creation $dataset | grep @rep_extbackup | tail -n1)
        if [[ -z "$snapold" ]]; then
            printwarn "Skipping $dataset: no extbackup snapshot found."
            continue
        fi
        snapnew="${snapold/@rep_extbackup_$OLDLABEL/@rep_extbackup_$label}"
        cmd="zfs rename $snapold $snapnew"
        if [ -z "$DRYRUN" ]; then
            printinfo "$cmd"
            $cmd
            if [ $? -ne 0 ]; then
                errquit "Failed to locally rename snapshot. Migration script aborted!"
            fi
            ssh $BKUP_SERVER $cmd
            if [ $? -ne 0 ]; then
                errquit "Failed to remotely rename snapshot on $BKUP_SERVER. Migration script aborted!"
            fi
        else
            printinfo "(DRYRUN) $cmd"
        fi
    fi
done