My Backup Solution Leveraging OpenZFS, rsync, WOL and crontab

The other day I managed to destroy my hard drive’s partition table as I attempted to fix a grub(2) issue. To make matters worse, my backup of important files was old, and while attempting to make a boot-disk to re-install Linux, I selected the wrong disk and over-wrote the backup! Clearly I needed a more robust backup solution.

My requirements for the solution were as follows:

  • Runs during idle time, in my case 5:00 am and 5:00 pm.
  • Does not require me to leave my computer on 24/7. I try to reduce my power consumption.
  • Does not require non-standard (read: non-open source) software.

My setup is as follows:

  • Linux Mint desktop. Hate on me if you want, I love the Cinnamon desktop.
  • FreeBSD server, which serves as the ZFS backup sink (and a lot more).
  • A small Linux (soon-to-be NetBSD) Raspberry Pi 1 that functions as my IPv6 router, VPN endpoint and jump host when I need to tunnel into my home network from the outside.

Let's begin!

At 4:58 AM, the Raspberry Pi’s crontab(1) runs a script to ping my desktop. If the ping is successful, it notes that the computer is online by writing 0 to /var/log/pc-status. Otherwise, it notes that the computer is off-line, writes 1 to /var/log/pc-status and sends a Wake-On-LAN (WOL) frame to my computer to power on the machine. The script is as follows:

#!/bin/sh
# -w takes whole seconds, so use a 1-second deadline for the single probe.
ping6 -w 1 -c 1 -q PC_IP6_ADDRESS 2> /dev/null > /dev/null
if [ $? -ne 0 ]; then
     # It was off, so write 1
     sudo wakeonlan PC_MAC_ADDRESS 2> /dev/null > /dev/null
     sudo sh -c 'echo 1 > /var/log/pc-status'
else
     # It's already on, so write 0
     sudo sh -c 'echo 0 > /var/log/pc-status'
fi

I put this in the crontab(1) as follows: 58 4 * * * /home/farhan/bin/wake-script.sh.
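The decision the wake script makes can be pulled into two small functions, which makes it possible to try the logic without touching the network. This is only a sketch: the names record_status and needs_wake are made up for illustration, and the status file is passed as an argument rather than hard-coded to /var/log/pc-status.

```shell
#!/bin/sh
# record_status PROBE_EXIT_CODE FILE
# Writes 0 if the probe succeeded (PC already on), 1 otherwise (PC was off).
record_status() {
    if [ "$1" -eq 0 ]; then
        echo 0 > "$2"
    else
        echo 1 > "$2"
    fi
}

# needs_wake FILE
# Succeeds only when the recorded status is exactly 1 (PC was off).
needs_wake() {
    [ "$(cat "$1" 2> /dev/null)" = "1" ]
}

# Possible use inside the wake script (placeholders as in the article):
#   ping6 -c 1 -q PC_IP6_ADDRESS > /dev/null 2>&1
#   record_status $? /var/log/pc-status
#   needs_wake /var/log/pc-status && wakeonlan PC_MAC_ADDRESS
```

Keeping the status write and the wake decision separate means the same file can be inspected by hand if the desktop ever fails to come up.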

Next, at 5:00 AM, the Linux desktop runs rsync(1) against the FreeBSD backup machine to synchronize all files. Because rsync(1), unlike scp, transfers only what has changed, it does not waste time on files that are already up to date. Upon completion, the Linux desktop reads the status file on the Raspberry Pi to determine whether it had been on or off before the backup. If it was already on (0), the desktop does nothing. If it had been off (1), it suspends itself. This is done as follows in another script:

# Trailing slash on the source so dotfiles are copied too; a shell glob (*) would skip them.
rsync -6 -q -a --exclude "VirtualBox VMs" /home/farhan/ farhan@FREEBSD_SERVER:/usr/local/home/farhan/pc_home/
# grep -q exits 0 when the status is 1, meaning the Pi had to wake this machine.
ssh farhan@RASPBERRY_PI cat /var/log/pc-status | grep -q 1
if [ $? -eq 0 ]; then
     sudo pm-suspend 2> /dev/null > /dev/null # The redirects are unnecessary.
fi

My storage device is 4 TB and mostly unused, so I do not even bother with the --delete option. Yes, my ~/Downloads directory will likely grow quite large in the next few weeks, but that is not a problem. However, I excluded the VirtualBox VMs directory because even just powering on a VM changes the VDI disk image. As with the wake script, this goes in crontab(1) as follows: 0 5 * * * /home/farhan/bin/sleep-script.sh.
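One case worth guarding against in the sleep script is a failed ssh(1) call, which yields no status at all. A minimal sketch, assuming hypothetical helpers named was_off and decide_action, treats anything other than an exact 1 as "stay on", so the desktop is suspended only when the Pi positively recorded that it had been off.

```shell
#!/bin/sh
# was_off: reads the status text on stdin; succeeds only if it is exactly "1".
was_off() {
    status=$(cat)
    [ "$status" = "1" ]
}

# decide_action STATUS_TEXT
# Prints "suspend" if the PC had been off before the backup, else "stay-on".
# An empty status (for example, ssh failed) also yields "stay-on".
decide_action() {
    if printf '%s\n' "$1" | was_off; then
        echo suspend
    else
        echo stay-on
    fi
}

# Possible use after rsync finishes (placeholder host as in the article):
#   status=$(ssh farhan@RASPBERRY_PI cat /var/log/pc-status)
#   [ "$(decide_action "$status")" = "suspend" ] && sudo pm-suspend
```

Matching the whole value instead of grepping for a 1 also avoids a false positive if the file ever contains something like 10.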

Finally, on the FreeBSD side I run daily OpenZFS snapshots and daily prunes of snapshots that are older than 14 days. My crontab(1) is as follows:

@daily /sbin/zfs snapshot -r Data@daily-`date "+\%Y-\%m-\%d"`
@daily /sbin/zfs list -t snapshot -o name | /usr/bin/grep tank/home/farhan/pcbackup@daily- | /usr/bin/sort -r | /usr/bin/tail -n +15 | /usr/bin/xargs -n 1 /sbin/zfs destroy -r
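The prune pipeline can be tried without a ZFS pool by feeding it snapshot names on standard input. The following is a rough sketch: snapshots_to_prune is a hypothetical helper, and it relies on the date suffix sorting chronologically, which holds for the YYYY-MM-DD format used above.

```shell
#!/bin/sh
# snapshots_to_prune PREFIX KEEP
# Reads snapshot names on stdin and prints those containing PREFIX that fall
# outside the KEEP newest. Reverse sort puts the newest date first, and
# tail -n +N starts printing at line N, hence KEEP + 1.
snapshots_to_prune() {
    grep -F -- "$1" | sort -r | tail -n "+$(( $2 + 1 ))"
}

# Possible use on the FreeBSD side (dataset name taken from the crontab above):
#   zfs list -H -t snapshot -o name \
#       | snapshots_to_prune tank/home/farhan/pcbackup@daily- 14 \
#       | xargs -n 1 zfs destroy
```

Printing the candidates first, before piping them to zfs destroy, is a cheap way to check that the filter matches only the intended dataset.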

And that’s it! A low-powered, open-source backup solution that relies 100% on Unix tools.

Notes: The only way to make this solution more elegant would be if the Linux desktop ran OpenZFS and used the zfs(8) send command. But given that Linux Mint does not support ZFS out of the box, I am concerned about what might happen if the OpenZFS module fails to load and I am stuck with a non-functional machine. Also, notice the explicit use of IPv6, not legacy IP.
