ZYNK: The Zuper Zimple ZFS Sync (Replication) Tool
Posted on November 17, 2008
I’ve been working on building better and better ZFS replication tools for use at Joyent, but it often gets complex and frustrating. Replication in ZFS is itself very simple; the nightmare is managing all the snapshots and retention, plus mountains of error checking and handling, on top of reporting and stats collection.
So, just to relax, I wrote a fun, simple replication tool I call “Zynk”. It’s pathetically simple (read: elegant) and fun. As the comment says, if something breaks it’s a PITA to clean up, but otherwise it should work well once set in motion. The intention is to run it from cron every 30-600 seconds or so, but do the first run manually, because the initial full send is going to take some time… the incrementals afterwards should be able to finish within whatever interval you set in cron.
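Since the plan is to fire this from cron on a short interval, one thing worth guarding against is a run that takes longer than the interval and overlaps the next one. Here is a minimal sketch of a lock wrapper, assuming a hypothetical lock directory at /tmp/zynk.lock and a hypothetical helper named run_locked; neither is part of Zynk itself.

```shell
#!/bin/bash
# Hypothetical lock location; override LOCKDIR to put it elsewhere.
LOCKDIR="${LOCKDIR:-/tmp/zynk.lock}"

# run_locked COMMAND [ARGS...]
# Runs COMMAND only if no other run currently holds the lock.
# mkdir is atomic, so only one caller can create the directory.
run_locked() {
    if ! mkdir "$LOCKDIR" 2>/dev/null; then
        echo "zynk: previous run still in progress, skipping" >&2
        return 75
    fi
    local status=0
    "$@" || status=$?
    rmdir "$LOCKDIR"
    return "$status"
}
```

From the cron wrapper you would call something like `run_locked ./zynk data/home/tamr root@localhost backup/zynk/tamr`. Note that this sketch does not clean up the lock if the process is killed mid-run, so a stale lock directory would have to be removed by hand.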
#!/bin/bash
## ZYNK: The Zuper Zimple ZFS Sync (Replication) Tool
## Form: zynk local/dataset root@remote.host destination/dataset

# Please note: The reason this is so simple is because there is no error checking, reporting, or cleanup.
# In the event that something goes wonky, you'll manually need to fix the snapshots and
# modify or remove the /var/run/zynk datafile which contains the most recent snapshot name.
# Furthermore, this absolutely relies on the GNU version of 'date' in order to get epoch time.
# Before using, make sure you've distributed your SSH key to the remote host and can ssh without password.

if [ -z "$3" ]
then
    echo "Usage: zynk local/dataset root@remote.host destination/dataset"
    echo "WARNING: The destination is the full path for the remote dataset, not the prefix dataset stub."
    exit 1
fi

DATE=`date +%s`
if [ "$DATE" == "%s" ]
then
    echo "Must use GNU Date, please install and modify script."
    exit 1
fi

if [ -e /var/run/zynk ]
then
    # Datafile is found, creating incr.
    echo "Incremental started at `date`"
    zfs snapshot ${1}@${DATE}
    zfs send -i ${1}@`cat /var/run/zynk` ${1}@${DATE} | ssh ${2} zfs recv -F ${3}
    zfs destroy ${1}@`cat /var/run/zynk`
    ssh ${2} zfs destroy ${3}@`cat /var/run/zynk`
    echo ${DATE} > /var/run/zynk
    echo "Incremental complete at `date`"
else
    # Datafile not found, creating full.
    echo "Full started at `date`"
    zfs snapshot ${1}@${DATE}
    zfs send ${1}@${DATE} | ssh ${2} zfs recv ${3}
    echo ${DATE} > /var/run/zynk
    echo "Full completed at `date`"
fi
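The script above destroys the previous snapshot and advances the datafile even if the send or receive failed, which is exactly the "wonky" case the comments warn about. As a first step toward error checking, here is a rough sketch of the incremental branch that only moves forward when the whole pipeline succeeds. The function name zynk_incremental and the STATEFILE variable are made up for illustration, and unlike the original it updates the datafile before destroying the old snapshots.

```shell
#!/bin/bash
# Sketch only. STATEFILE defaults to the datafile path the original script uses.
STATEFILE="${STATEFILE:-/var/run/zynk}"

# zynk_incremental SRC_DATASET REMOTE_HOST DST_DATASET
zynk_incremental() {
    local src=$1 remote=$2 dst=$3
    local prev now
    prev=$(cat "$STATEFILE") || return 1
    now=$(date +%s)

    zfs snapshot "${src}@${now}" || return 1

    # pipefail makes the pipeline fail if either 'zfs send' or the remote
    # 'zfs recv' fails; the subshell keeps the option from leaking out.
    if ! ( set -o pipefail
           zfs send -i "${src}@${prev}" "${src}@${now}" |
               ssh "$remote" zfs recv -F "$dst" ); then
        echo "zynk: send/recv failed, keeping ${src}@${prev}" >&2
        zfs destroy "${src}@${now}"    # drop the snapshot that never arrived
        return 1
    fi

    # Record the new snapshot first, then remove the old one on both sides.
    echo "$now" > "$STATEFILE"
    zfs destroy "${src}@${prev}"
    ssh "$remote" zfs destroy "${dst}@${prev}"
}
```

On failure, the old snapshot and the datafile are left as they were, so the next run can retry the same incremental. A partially received stream on the destination is not handled here and may still need manual cleanup.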
Here it is in action:
root@quadra ~$ rm /var/run/zynk
root@quadra ~$ ./zynk data/home/tamr root@localhost backup/zynk/tamr
Full started at Mon Nov 17 13:44:28 PST 2008
Full completed at Mon Nov 17 13:44:28 PST 2008
root@quadra ~$ ./zynk data/home/tamr root@localhost backup/zynk/tamr
Incremental started at Mon Nov 17 13:44:58 PST 2008
Incremental complete at Mon Nov 17 13:44:58 PST 2008
root@quadra ~$ ./zynk data/home/tamr root@localhost backup/zynk/tamr
Incremental started at Mon Nov 17 13:45:01 PST 2008
Incremental complete at Mon Nov 17 13:45:02 PST 2008
root@quadra ~$ ./zynk data/home/tamr root@localhost backup/zynk/tamr
Incremental started at Mon Nov 17 13:45:19 PST 2008
Incremental complete at Mon Nov 17 13:45:20 PST 2008
root@quadra ~$ zfs list -r data/home/tamr backup/zynk/tamr
NAME                          USED  AVAIL  REFER  MOUNTPOINT
backup/zynk/tamr             2.45M   296G  2.45M  /backup/zynk/tamr
backup/zynk/tamr@1226958319      0      -  2.45M  -
data/home/tamr               2.47M   196M  2.45M  /data/home/tamr
data/home/tamr@1226958319        0      -  2.45M  -
What’s important to note is that it only maintains a single snapshot on both the source and the destination, so you don’t consume a bunch of additional space or have to worry about screwing up quotas.
This isn’t intended so much as a “real” tool, but something you can play around with and hopefully excite the mind about some new fun applications. Add error checking, add retention, add reporting, re-implement in a new language. Have fun…. drink Zima. 🙂
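If you want to try the "add retention" idea, one possible starting point is a filter that decides which snapshots are old enough to prune. The sketch below assumes snapshot names end in an epoch timestamp, as Zynk's do; the function name snapshots_to_prune is hypothetical.

```shell
#!/bin/bash
# snapshots_to_prune KEEP
# Reads snapshot names (dataset@epoch) on stdin, one per line, and prints
# every name except the KEEP newest, ordered newest first.
snapshots_to_prune() {
    local keep=$1
    # Sort numerically, descending, on the epoch after '@',
    # then skip the first KEEP lines.
    sort -t@ -k2,2nr | tail -n +"$((keep + 1))"
}
```

You could feed it from something like `zfs list -H -o name -t snapshot -r data/home/tamr` and hand the result to `zfs destroy` one name at a time. Keep at least one snapshot, and make sure the newest one still exists on both sides, since the next incremental depends on it.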
For a thorough discussion of ZFS Replication, see my post from a couple weeks ago: Understanding ZFS: Replication, Archive and Backup.