NAME
zrepl - zrepl Documentation
zrepl is a one-stop, integrated solution for ZFS replication.
GETTING STARTED
The 10 minute quick-start guides give you a first impression.
MAIN FEATURES
ATTENTION: zrepl as well as this documentation is still under active
development. There is no stability guarantee on the RPC protocol or
configuration format, but we do our best to document breaking changes in the
changelog.
CONTRIBUTING
We are happy about any help we can get!
TABLE OF CONTENTS
Quick Start by Use Case
The goal of this quick-start guide is to give you an impression of how zrepl can accommodate your use case.
Install zrepl
Follow the OS-specific installation instructions and come back here.
Overview Of How zrepl Works
Check out the overview section to get a rough idea of what you are going to configure in the next step, then come back here.
Configuration Examples
zrepl is configured through a YAML configuration file in /etc/zrepl/zrepl.yml. We have prepared example use cases that showcase typical deployments and different functionality of zrepl. We encourage you to read through all of the examples to get an idea of what zrepl has to offer, and how you can mix and match configurations for your use case. Keep the full config documentation handy if a config snippet is unclear.
Example Use Cases
Continuous Backup of a Server
This config example shows how we can back up our ZFS-based server to another machine using a zrepl push job.
Our backup solution should fulfill the following requirements:
Analysis
We can model this situation as two jobs:
Generate TLS Certificates
We use the TLS client authentication transport to protect our data on the wire. To get things going quickly, we skip setting up a CA and generate two self-signed certificates as described here. For convenience, we generate the key pairs on our local machine and distribute them using ssh:

(name=backups; openssl req -x509 -sha256 -nodes \
  -newkey rsa:4096 \
  -days 365 \
  -keyout $name.key \
  -out $name.crt \
  -addext "subjectAltName = DNS:$name" \
  -subj "/CN=$name")
(name=prod; openssl req -x509 -sha256 -nodes \
  -newkey rsa:4096 \
  -days 365 \
  -keyout $name.key \
  -out $name.crt \
  -addext "subjectAltName = DNS:$name" \
  -subj "/CN=$name")

ssh root@backups "mkdir /etc/zrepl"
scp backups.key backups.crt prod.crt root@backups:/etc/zrepl
ssh root@prod "mkdir /etc/zrepl"
scp prod.key prod.crt backups.crt root@prod:/etc/zrepl

Note that alternative transports exist, e.g. via TCP without TLS, or ssh.

Configure server prod
We define a push job named prod_to_backups in /etc/zrepl/zrepl.yml on host prod:

jobs:
- name: prod_to_backups
  type: push
  connect:
    type: tls
    address: "backups.example.com:8888"
    ca: /etc/zrepl/backups.crt
    cert: /etc/zrepl/prod.crt
    key: /etc/zrepl/prod.key
    server_cn: "backups"
  filesystems: {
    "zroot<": true,
    "zroot/var/tmp<": false,
    "zroot/usr/home/paranoid": false
  }
  snapshotting:
    type: periodic
    prefix: zrepl_
    interval: 10m
  pruning:
    keep_sender:
    - type: not_replicated
    - type: last_n
      count: 10
    keep_receiver:
    - type: grid
      grid: 1x1h(keep=all) | 24x1h | 30x1d | 6x30d
      regex: "^zrepl_"

Configure server backups
We define a corresponding sink job named sink in /etc/zrepl/zrepl.yml on host backups:

jobs:
- name: sink
  type: sink
  serve:
    type: tls
    listen: ":8888"
    ca: "/etc/zrepl/prod.crt"
    cert: "/etc/zrepl/backups.crt"
    key: "/etc/zrepl/backups.key"
    client_cns:
    - "prod"
  root_fs: "storage/zrepl/sink"

Go Back To Quickstart Guide
Click here to go back to the quickstart guide.

Local Snapshots + Offline Backup to an External Disk
This config example shows how we can use zrepl to take periodic snapshots of our local workstation and back it up to a zpool on an external disk which we occasionally connect. The local snapshots should be taken every 15 minutes for pain-free recovery from CLI disasters (rm -rf / and the like). However, we do not want to keep the snapshots around for very long because our workstation is a little tight on disk space. Thus, we only keep one hour's worth of high-resolution snapshots, then fade them out to one per hour for a day (24 hours), then one per day for 14 days. At the end of each work day, we connect our external disk that serves as our workstation's local offline backup. We want zrepl to inspect the filesystems and snapshots on the external pool, figure out which snapshots were created since the last time we connected the external disk, and use incremental replication to efficiently mirror our workstation to our backup disk. Afterwards, we want to clean up old snapshots on the backup pool: we want to keep all snapshots younger than one hour, 24 for each hour of the first day, then 360 daily backups. A few additional requirements:
The following config snippet implements the setup described above. You will likely want to customize some aspects mentioned in the top comment in the file. # This config serves as an example for a local zrepl installation that # backups the entire zpool `system` to `backuppool/zrepl/sink` # # The requirements covered by this setup are described in the zrepl documentation's # quick start section which inlines this example. # # CUSTOMIZATIONS YOU WILL LIKELY WANT TO APPLY: # - adjust the name of the production pool `system` in the `filesystems` filter of jobs `snapjob` and `push_to_drive` # - adjust the name of the backup pool `backuppool` in the `backuppool_sink` job # - adjust the occurences of `myhostname` to the name of the system you are backing up (cannot be easily changed once you start replicating) # - make sure the `zrepl_` prefix is not being used by any other zfs tools you might have installed (it likely isn't) jobs: # this job takes care of snapshot creation + pruning - name: snapjob type: snap filesystems: { "system<": true, } # create snapshots with prefix `zrepl_` every 15 minutes snapshotting: type: periodic interval: 15m prefix: zrepl_ pruning: keep: # fade-out scheme for snapshots starting with `zrepl_` # - keep all created in the last hour # - then destroy snapshots such that we keep 24 each 1 hour apart # - then destroy snapshots such that we keep 14 each 1 day apart # - then destroy all older snapshots - type: grid grid: 1x1h(keep=all) | 24x1h | 14x1d regex: "^zrepl_.*" # keep all snapshots that don't have the `zrepl_` prefix - type: regex negate: true regex: "^zrepl_.*" # This job pushes to the local sink defined in job `backuppool_sink`. # We trigger replication manually from the command line / udev rules using # `zrepl signal wakeup push_to_drive` - type: push name: push_to_drive connect: type: local listener_name: backuppool_sink client_identity: myhostname filesystems: { "system<": true } send: encrypted: true replication: protection: initial: guarantee_resumability # Downgrade protection to guarantee_incremental which uses zfs bookmarks instead of zfs holds. # Thus, when we yank out the backup drive during replication # - we might not be able to resume the interrupted replication step because the partially received `to` snapshot of a `from`->`to` step may be pruned any time # - but in exchange we get back the disk space allocated by `to` when we prune it # - and because we still have the bookmarks created by `guarantee_incremental`, we can still do incremental replication of `from`->`to2` in the future incremental: guarantee_incremental snapshotting: type: manual pruning: # no-op prune rule on sender (keep all snapshots), job `snapshot` takes care of this keep_sender: - type: regex regex: ".*" # retain keep_receiver: # longer retention on the backup drive, we have more space there - type: grid grid: 1x1h(keep=all) | 24x1h | 360x1d regex: "^zrepl_.*" # retain all non-zrepl snapshots on the backup drive - type: regex negate: true regex: "^zrepl_.*" # This job receives from job `push_to_drive` into `backuppool/zrepl/sink/myhostname` - type: sink name: backuppool_sink root_fs: "backuppool/zrepl/sink" serve: type: local listener_name: backuppool_sink Click here to go back to the quickstart guide. Fan-out replicationThis quick-start example demonstrates how to implement a fan-out replication setup where datasets on a server (A) are replicated to multiple targets (B, C, etc.).This example uses multiple source jobs on server A and pull jobs on the target servers. 
WARNING: Before implementing this setup, please see the caveats
listed in the fan-out replication configuration overview.
OverviewOn the source server (A), there should be:
On each target server, there should be:
Generate TLS CertificatesMutual TLS via the TLS client authentication transport can be used to secure the connections between the servers. In this example, a self-signed certificate is created for each server without setting up a CA.source=a.example.com targets=( b.example.com c.example.com # ... ) for server in "${source}" "${targets[@]}"; do openssl req -x509 -sha256 -nodes \ -newkey rsa:4096 \ -days 365 \ -keyout "${server}.key" \ -out "${server}.crt" \ -addext "subjectAltName = DNS:${server}" \ -subj "/CN=${server}" done # Distribute each host's keypair for server in "${source}" "${targets[@]}"; do ssh root@"${server}" mkdir /etc/zrepl scp "${server}".{crt,key} root@"${server}":/etc/zrepl/ done # Distribute target certificates to the source scp "${targets[@]/%/.crt}" root@"${source}":/etc/zrepl/ # Distribute source certificate to the targets for server in "${targets[@]}"; do scp "${source}.crt" root@"${server}":/etc/zrepl/ done Configure source server Ajobs: # Separate job for snapshots and pruning - name: snapshots type: snap filesystems: 'tank<': true # all filesystems snapshotting: type: periodic prefix: zrepl_ interval: 10m pruning: keep: # Keep non-zrepl snapshots - type: regex negate: true regex: '^zrepl_' # Time-based snapshot retention - type: grid grid: 1x1h(keep=all) | 24x1h | 30x1d | 12x30d regex: '^zrepl_' # Source job for target B - name: target_b type: source serve: type: tls listen: :8888 ca: /etc/zrepl/b.example.com.crt cert: /etc/zrepl/a.example.com.crt key: /etc/zrepl/a.example.com.key client_cns: - b.example.com filesystems: 'tank<': true # all filesystems # Snapshots are handled by the separate snap job snapshotting: type: manual # Source job for target C - name: target_c type: source serve: type: tls listen: :8889 ca: /etc/zrepl/c.example.com.crt cert: /etc/zrepl/a.example.com.crt key: /etc/zrepl/a.example.com.key client_cns: - c.example.com filesystems: 'tank<': true # all filesystems # Snapshots are handled by the separate snap job snapshotting: type: manual # Source jobs for remaining targets. Each one should listen on a different port # and reference the correct certificate and client CN. # - name: target_c # ... Configure each target serverjobs: # Pull from source server A - name: source_a type: pull connect: type: tls # Use the correct port for this specific client (eg. B is 8888, C is 8889, etc.) address: a.example.com:8888 ca: /etc/zrepl/a.example.com.crt # Use the correct key pair for this specific client cert: /etc/zrepl/b.example.com.crt key: /etc/zrepl/b.example.com.key server_cn: a.example.com root_fs: pool0/backup interval: 10m pruning: keep_sender: # Source does the pruning in its snap job - type: regex regex: '.*' # Receiver-side pruning can be configured as desired on each target server keep_receiver: # Keep non-zrepl snapshots - type: regex negate: true regex: '^zrepl_' # Time-based snapshot retention - type: grid grid: 1x1h(keep=all) | 24x1h | 30x1d | 12x30d regex: '^zrepl_' Go Back To Quickstart GuideClick here to go back to the quickstart guide.Use zrepl configcheck to validate your configuration. No output indicates that everything is fine. NOTE: Please open an issue on GitHub if your use case for zrepl
is significantly different from those listed above. Or even better, write it
up in the same style as above and open a PR!
Apply Configuration Changes
We hope that you have found a configuration that fits your use case. Use zrepl configcheck once again to make sure the config is correct (no output indicates that everything is fine). Then restart the zrepl daemon on all systems involved in the replication, likely using service zrepl restart or systemctl restart zrepl.
WARNING: Please read up carefully on the pruning rules before
applying the config. In particular, note that most example configs apply to
all snapshots, not just zrepl-created snapshots. Use the following keep rule
on sender and receiver to prevent this:
- type: regex
  negate: true
  regex: "^zrepl_.*" # <- the 'prefix' specified in snapshotting.prefix

Watch it Work
Run zrepl status on the active side of the replication setup to monitor snapshotting, replication and pruning activity. To re-trigger replication (snapshots are separate!), use zrepl signal wakeup JOBNAME (refer to the example use case document if you are uncertain which job you want to wake up). You can also use basic UNIX tools to see what's going on. If you like tmux, here is a handy script that works on FreeBSD:

pkg install gnu-watch tmux
tmux new -s zrepl -d
tmux split-window -t zrepl "tail -f /var/log/messages"
tmux split-window -t zrepl "gnu-watch 'zfs list -t snapshot -o name,creation -s creation'"
tmux split-window -t zrepl "zrepl status"
tmux select-layout -t zrepl tiled
tmux attach -t zrepl

The Linux equivalent might look like this:

# make sure tmux is installed & let's assume you use systemd + journald
tmux new -s zrepl -d
tmux split-window -t zrepl "journalctl -f -u zrepl.service"
tmux split-window -t zrepl "watch 'zfs list -t snapshot -o name,creation -s creation'"
tmux split-window -t zrepl "zrepl status"
tmux select-layout -t zrepl tiled
tmux attach -t zrepl

What Next?
Installation
TIP: Check out the quick-start guides if you want a first impression of zrepl.
User Privileges
It is possible to run zrepl as an unprivileged user in combination with ZFS delegation. On FreeBSD, it is also possible to run it in a jail by delegating a dataset to the jail.
TIP: Check out the installation-freebsd-jail-with-iocage
for FreeBSD jail setup instructions.
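The exact ZFS permissions required for an unprivileged setup depend on your job types; as a rough, hedged sketch (the user name zrepl and the dataset names are placeholders, consult zfs-allow(8) and your job configuration for the authoritative set):

# on the sending side: snapshotting, bookmarks, holds and sends
zfs allow -u zrepl send,snapshot,bookmark,hold,release,destroy tank

# on the receiving side: receives under the sink job's root_fs
zfs allow -u zrepl receive,create,mount,destroy storage/zrepl/sink

Depending on the platform, additional privileges (e.g. access to /dev/zfs or the ability to mount) may still be needed; verify with zrepl configcheck and a test replication before relying on it.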
Packageszrepl source releases are signed & tagged by the author in the git repository. Your OS vendor may provide binary packages of zrepl through the package manager. Additionally, binary releases are provided on GitHub. The following list may be incomplete, feel free to submit a PR with an update:
Debian / Ubuntu APT repositories
We maintain APT repositories for Debian, Ubuntu and derivatives. The fingerprint of the signing key is E101 418F D3D6 FBCB 9D65 A62D 7086 99FC 5F2E BF16. It is available at https://zrepl.cschwarz.com/apt/apt-key.asc. Please open an issue on GitHub if you encounter any issues with the repository.

(
set -ex
zrepl_apt_key_url=https://zrepl.cschwarz.com/apt/apt-key.asc
zrepl_apt_key_dst=/usr/share/keyrings/zrepl.gpg
zrepl_apt_repo_file=/etc/apt/sources.list.d/zrepl.list

# Install dependencies for subsequent commands
sudo apt update && sudo apt install curl gnupg lsb-release

# Deploy the zrepl apt key.
curl -fsSL "$zrepl_apt_key_url" | gpg --dearmor | sudo tee "$zrepl_apt_key_dst" > /dev/null

# Add the zrepl apt repo.
ARCH="$(dpkg --print-architecture)"
CODENAME="$(lsb_release -i -s | tr '[:upper:]' '[:lower:]') $(lsb_release -c -s | tr '[:upper:]' '[:lower:]')"
echo "Using Distro and Codename: $CODENAME"
echo "deb [arch=$ARCH signed-by=$zrepl_apt_key_dst] https://zrepl.cschwarz.com/apt/$CODENAME main" | sudo tee "$zrepl_apt_repo_file"

# Update apt repos.
sudo apt update
)

NOTE: Until zrepl reaches 1.0, the repositories will be updated
to the latest zrepl release immediately. This includes breaking changes
between zrepl versions. Use apt-mark hold zrepl to prevent upgrades of
zrepl.
RPM repositories
We provide a single RPM repository for all RPM-based Linux distros. The zrepl binary in the repo is the same as the one published to GitHub. Since Go binaries are statically linked, the RPM should work almost everywhere. The fingerprint of the signing key is F6F6 E8EA 6F2F 1462 2878 B5DE 50E3 4417 826E 2CE6. It is available at https://zrepl.cschwarz.com/rpm/rpm-key.asc. Please open an issue on GitHub if you encounter any issues with the repository. Copy-paste the following snippet into your shell to set up the zrepl repository. Then dnf install zrepl and make sure to confirm that the signing key matches the one shown above.

cat > /etc/yum.repos.d/zrepl.repo <<EOF
[zrepl]
name = zrepl
baseurl = https://zrepl.cschwarz.com/rpm/repo
gpgkey = https://zrepl.cschwarz.com/rpm/rpm-key.asc
EOF

NOTE: Until zrepl reaches 1.0, the repository will be updated
to the latest zrepl release immediately. This includes breaking changes
between zrepl versions. If that bothers you, use the dnf versionlock
plugin to pin the version of zrepl on your system.
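For example, assuming the versionlock plugin is provided by dnf-plugins-core on your distro, pinning could look like this:

sudo dnf install 'dnf-command(versionlock)'
sudo dnf versionlock add zrepl
# later, to allow upgrades again:
sudo dnf versionlock delete zrepl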
Compile From SourceProducing a release requires Go 1.11 or newer and Python 3 + pip3 + docs/requirements.txt for the Sphinx documentation. A tutorial to install Go is available over at golang.org. Python and pip3 should probably be installed via your distro's package manager.
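A minimal, hedged sketch of a source build follows; the repository URL is the upstream GitHub project, the venv is only needed for the documentation, and the exact build invocation may differ between releases:

git clone https://github.com/zrepl/zrepl.git
cd zrepl

# documentation build dependencies only (skip if you just want the binary)
python3 -m venv .venv && . .venv/bin/activate
pip install -r docs/requirements.txt

# fetch Go dependencies and build the zrepl binary
./lazy.sh godep
go build -o zrepl .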
The Python venv is used for the documentation build dependencies. If you just want to build the zrepl binary, leave it out and use ./lazy.sh godep instead. Alternatively, you can use the Docker build process: it is used to produce the official zrepl binary releases and serves as a reference for build dependencies and procedure:

cd to/your/zrepl/checkout
# make sure your user has access to the docker socket
make release-docker
# if you want .deb or .rpm packages, invoke the following
# targets _after_ you invoked release-docker
make deb-docker
make rpm-docker
# build artifacts are available in ./artifacts/release
# packages are available in ./artifacts

NOTE: It is your job to install the built binary in the zrepl
user's $PATH, e.g. /usr/local/bin/zrepl. Otherwise, the
examples in the quick-start guides may need to be adjusted.
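For example (the artifact name below is a placeholder; pick the one matching your platform from ./artifacts/release):

sudo install -m 0755 ./artifacts/release/zrepl-<os>-<arch> /usr/local/bin/zrepl
zrepl version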
FreeBSD Jail With iocage
This tutorial shows how zrepl can be installed in a jail on FreeBSD or FreeNAS using iocage. While this tutorial focuses on iocage, much of the setup would be similar with a different jail manager.
NOTE: From a security perspective, keep in mind that zfs send/recv was never designed with jails in mind: an attacker who gains access to the jail could probably crash the receive-side kernel or, worse, induce stateful damage to the receive-side pool.
The jail provides management benefits, not security benefits.
Requirements
A dataset that will be delegated to the jail needs to be created if one does not already exist. For this tutorial, tank/zrepl will be used:
zfs create -o mountpoint=none tank/zrepl
The only software requirement on the host system is iocage, which can be installed from ports or packages:
pkg install py37-iocage
NOTE: By default iocage will "activate" on
first use which will set up some defaults such as which pool will be used. To
activate iocage manually the iocage activate command can be
used.
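For example, to activate iocage manually on the pool used in this tutorial:

zpool list
iocage activate tank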
Jail CreationThere are two options for jail creation using FreeBSD.
Manual JailCreate a jail, using the same release as the host, called zrepl that will be automatically started at boot. The jail will have tank/zrepl delegated into it.iocage create --release "$(freebsd-version -k | cut -d '-' -f '1,2')" --name zrepl \ boot=on nat=1 \ jail_zfs=on \ jail_zfs_dataset=zrepl \ jail_zfs_mountpoint='none' Enter the jail: iocage console zrepl Install zrepl pkg update && pkg upgrade pkg install zrepl Create the log file /var/log/zrepl.log touch /var/log/zrepl.log && service newsyslog restart Tell syslogd to redirect facility local0 to the zrepl.log file: service syslogd reload Enable the zrepl daemon to start automatically at boot: sysrc zrepl_enable="YES" Now jump to the summary below. PluginWhen using the plugin, zrepl will be installed for you in a jail using the following iocage properties.
Additionally, the delegated dataset should be specified upon creation, and starting on boot can optionally be enabled. This can also be done from the FreeNAS web UI.
fetch https://raw.githubusercontent.com/ix-plugin-hub/iocage-plugin-index/master/zrepl.json -o /tmp/zrepl.json
iocage fetch -P /tmp/zrepl.json --name zrepl jail_zfs_dataset=zrepl boot=on
Configuration
Now zrepl can be configured. Enter the jail:
iocage console zrepl
Modify the /usr/local/etc/zrepl/zrepl.yml configuration file.
TIP: Check out the quick-start guides for examples of a sink job.
Now zrepl can be started:
service zrepl start
Now jump to the summary below.
Summary
Congratulations, you have a working jail!
NOTE: With FreeBSD 13's transition to OpenZFS 2.0, please
ensure that your jail's FreeBSD version matches the one in the kernel module.
If you are getting cryptic errors such as cannot receive new filesystem
stream: invalid backup stream the instructions posted here might
help.
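A hedged sketch of checking for such a mismatch and upgrading the jail (the release string is a placeholder; verify against iocage(8) before running):

freebsd-version -k                     # kernel (host) version
iocage exec zrepl freebsd-version -u   # userland version inside the jail
iocage upgrade -r <host-release> zrepl # align the jail with the host release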
What next?
Read the configuration chapter and then continue with the usage chapter. Reminder: If you want a quick introduction, please read the quick-start guides.
Configuration
Overview & Terminology
All work zrepl does is performed by the zrepl daemon, which is configured in a single YAML configuration file loaded on startup. The following paths are considered:
The zrepl configcheck subcommand can be used to validate the configuration. The command will output nothing and exit with a zero status code if the configuration is valid. The error messages vary in quality and usefulness: please report confusing config errors to the tracking issue #155. Full example configs such as those in the quick-start guides or the config/samples/ directory might also be helpful. However, copy-pasting examples is no substitute for reading documentation!
Config File Structure

global: ...
jobs:
- name: backup
  type: push
- ...

zrepl is configured using a single YAML configuration file with two main sections: global and jobs. The global section is filled with sensible defaults and is covered later in this chapter. The jobs section is a list of jobs which we are going to explain now.
Jobs & How They Work Together
A job is the unit of activity tracked by the zrepl daemon. The type of a job determines its role in a replication setup and in snapshot management. Jobs are identified by their name, both in log files and in the zrepl status command.
NOTE: The job name is persisted in several places on disk and
thus cannot be changed easily.
Replication always happens between a pair of jobs: one is the active side, and one the passive side. The active side connects to the passive side using a transport and starts executing the replication logic. The passive side responds to requests from the active side after checking its permissions. The following table shows how different job types can be combined to achieve both push and pull mode setups. Note that snapshot-creation denoted by "(snap)" is orthogonal to whether a job is active or passive.
How the Active Side Works
The active side (push and pull job) executes the replication and pruning logic:
TIP: The progress of the active side can be watched live using
the zrepl status subcommand.
How the Passive Side Works
The passive side (sink and source) waits for connections from the corresponding active side, using the transport listener type specified in the serve field of the job configuration. When a client connects, the transport listener performs listener-specific access control (cert validation, IP ACLs, etc.) and determines the client identity. The passive side job then uses this client identity as follows:
TIP: The implementation of the sink job requires that
the connecting client identities be valid ZFS filesystem name
components.
How Replication Works
One of the major design goals of the replication module is to avoid any duplication of the nontrivial logic. As such, the code works on abstract sender and receiver endpoints, where typically one is implemented by a local program object and the other is an RPC client instance. Regardless of push- or pull-style setup, the logic executes on the active side, i.e. in the push or pull job. The following high-level steps take place during replication and can be monitored using the zrepl status subcommand:
The idea behind the execution order of replication steps is that if the sender snapshots all filesystems simultaneously at fixed intervals, the receiver will have all filesystems snapshotted at time T1 before the first snapshot at T2 = T1 + $interval is replicated.
ZFS Background Knowledge
This section gives some background knowledge about ZFS features that zrepl uses to provide guarantees for a replicated filesystem. Specifically, zrepl guarantees by default that incremental replication is always possible and that started replication steps can always be resumed if they are interrupted.
ZFS Send Modes & Bookmarks
ZFS supports full sends (zfs send fs@to) and incremental sends (zfs send -i @from fs@to). Full sends are used to create a new filesystem on the receiver with the send-side state of fs@to. Incremental sends only transfer the delta between @from and @to. Incremental sends require that @from be present on the receiving side when receiving the incremental stream. Incremental sends can also use a ZFS bookmark as from on the sending side (zfs send -i #bm_from fs@to), where #bm_from was created using zfs bookmark fs@from fs#bm_from. The receiving side must always have the actual snapshot @from, regardless of whether the sending side uses @from or a bookmark of it.
Plain and raw sends
By default, zfs send sends the most generic, backwards-compatible data stream format (a so-called 'plain send'). If the sending filesystem uses newer features, e.g. compression or encryption, zfs send has to un-do these operations on the fly to produce the plain send stream. If the receiver uses newer features (e.g. compression or encryption inherited from the parent FS), it applies the necessary transformations again on the fly during zfs recv. Flags such as -e, -c and -L tell ZFS to produce a send stream that is closer to how the data is stored on disk. Sending with those flags removes computational overhead from sender and receiver. However, the receiver will not apply certain transformations, e.g., it will not compress with the receive-side compression algorithm. The -w (--raw) flag produces a send stream that is as raw as possible. For unencrypted datasets, its current effect is the same as -Lce. Encrypted datasets can only be sent plain (unencrypted) or raw (encrypted) using the -w flag.
Resumable Send & Recv
The -s flag for zfs recv tells zfs to save the partially received send stream in case it is interrupted. To resume the replication, the receiving side filesystem's receive_resume_token must be passed to a new zfs send -t <value> | zfs recv command. A full send can only be resumed if @to still exists. An incremental send can only be resumed if @to still exists and either @from still exists or a bookmark #fbm of @from still exists.
ZFS Holds
ZFS holds prevent a snapshot from being deleted through zfs destroy, letting the destroy fail with a dataset is busy error. Holds are created and referred to by a tag. They can be thought of as a named, persistent lock on the snapshot.
ZFS Abstractions Managed By zrepl
With the background knowledge from the previous paragraphs, we now summarize the different on-disk ZFS objects that zrepl manages to provide its functionality. Placeholder filesystems on the receiving side are regular ZFS filesystems with the ZFS property zrepl:placeholder=on. Placeholders allow the receiving side to mirror the sender's ZFS dataset hierarchy without replicating every filesystem at every intermediary dataset path component.
Consider the following example: S/H/J shall be replicated to R/sink/job/S/H/J, but neither S/H nor S shall be replicated. ZFS requires the existence of R/sink/job/S and R/sink/job/S/H in order to receive into R/sink/job/S/H/J. Thus, zrepl creates the parent filesystems as placeholders on the receiving side. If at some point S/H and S shall be replicated, the receiving side invalidates the placeholder flag automatically. The zrepl test placeholder command can be used to check whether a filesystem is a placeholder. The replication cursor bookmark and last-received-hold are managed by zrepl to ensure that future replications can always be done incrementally. The replication cursor is a send-side bookmark of the most recent successfully replicated snapshot, and the last-received-hold is a hold of that snapshot on the receiving side. Both are moved atomically after the receiving side has confirmed that a replication step is complete. The replication cursor has the format #zrepl_CUSOR_G_<GUID>_J_<JOBNAME>. The last-received-hold tag has the format zrepl_last_received_J_<JOBNAME>. Encoding the job name in the names ensures that multiple sending jobs can replicate the same filesystem to different receivers without interference. Tentative replication cursor bookmarks are short-lived bookmarks that protect the atomic moving-forward of the replication cursor and last-received-hold (see this issue). They are only necessary if step holds are not used as per the replication.protection setting. The tentative replication cursor has the format #zrepl_CUSORTENTATIVE_G_<GUID>_J_<JOBNAME>. The zrepl zfs-abstraction list command provides a listing of all bookmarks and holds managed by zrepl. Step holds are zfs holds managed by zrepl to ensure that a replication step can always be resumed if it is interrupted, e.g., due to network outage. zrepl creates step holds before it attempts a replication step and releases them after the receiver confirms that the replication step is complete. For an initial replication full @initial_snap, zrepl puts a zfs hold on @initial_snap. For an incremental send @from -> @to, zrepl puts a zfs hold on both @from and @to. Note that @from is not strictly necessary for resumability -- a bookmark on the sending side would be sufficient --, but size-estimation in currently used OpenZFS versions only works if @from is a snapshot. The hold tag has the format zrepl_STEP_J_<JOBNAME>. A job only ever has one active send per filesystem. Thus, there are never more than two step holds for a given pair of (job,filesystem). Step bookmarks are zrepl's equivalent for holds on bookmarks (ZFS does not support putting holds on bookmarks). They are intended for a situation where a replication step uses a bookmark #bm as incremental from where #bm is not managed by zrepl. To ensure resumability, zrepl copies #bm to step bookmark #zrepl_STEP_G_<GUID>_J_<JOBNAME>. If the replication is interrupted and #bm is deleted by the user, the step bookmark remains as an incremental source for the resumable send. Note that zrepl does not yet support creating step bookmarks because the corresponding ZFS feature for copying bookmarks is not yet widely available . Subscribe to zrepl issue #326 for details. The zrepl zfs-abstraction list command provides a listing of all bookmarks and holds managed by zrepl. NOTE: More details can be found in the design document
replication/design.md.
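To illustrate the ZFS mechanisms described in the background section above (bookmarks as incremental sources, resumable receive, and holds), here is a minimal sketch using hypothetical datasets pool/fs and backup/fs; it is not something zrepl runs verbatim:

# full send establishes the filesystem on the receiving side
zfs snapshot pool/fs@from
zfs send pool/fs@from | zfs recv -s backup/fs

# a bookmark can replace @from on the sending side for future incremental sends
zfs bookmark pool/fs@from pool/fs#bm_from
zfs snapshot pool/fs@to
zfs send -i '#bm_from' pool/fs@to | zfs recv -s backup/fs

# -s makes an interrupted receive resumable via the receive_resume_token property
zfs get -H -o value receive_resume_token backup/fs
# zfs send -t <token> | zfs recv -s backup/fs

# holds prevent zfs destroy of a snapshot until they are released
zfs hold example_tag pool/fs@to
zfs release example_tag pool/fs@to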
Limitations
ATTENTION: Currently, zrepl does not replicate filesystem properties. When receiving a filesystem, it is never mounted (-u flag) and mountpoint=none is set. This is temporary and is being worked on in issue #24.
Multiple Jobs & More than 2 MachinesMost users are served well with a single sender and a single receiver job. This section documents considerations for more complex setups.ATTENTION: Before you continue, make sure you have a working
understanding of how zrepl works and what zrepl does to ensure
that replication between sender and receiver is always possible without
conflicts. This will help you understand why certain kinds of multi-machine
setups do not (yet) work.
NOTE: If you can't find your desired configuration, have
questions or would like to see improvements to multi-job setups, please
open an issue on GitHub.
Multiple Jobs on one MachineAs a general rule, multiple jobs configured on one machine must operate on disjoint sets of filesystems. Otherwise, concurrently running jobs might interfere when operating on the same filesystem.On your setup, ensure that
Exceptions to the rule:
More Than 2 MachinesThis section might be relevant to users who wish to fan-in (N machines replicate to 1) or fan-out (replicate 1 machine to N machines).Working setups:
Setups that do not work:
Job Types in DetailJob Type push
Example config: config/samples/push.yml Job Type sink
Example config: config/samples/sink.yml Job Type pull
Example config: config/samples/pull.yml Job Type source
Example config: config/samples/source.yml Local replicationIf you have the need for local replication (most likely between two local storage pools), you can use the local transport type to connect a local push job to a local sink job.Example config: config/samples/local.yml. Job Type snap (snapshot & prune only)Job type that only takes snapshots and performs pruning on the local machine.
Example config: config/samples/snap.yml TransportsThe zrepl RPC layer uses transports to establish a single, bidirectional data stream between an active and passive job. On the passive (serving) side, the transport also provides the client identity to the upper layers: this string is used for access control and separation of filesystem sub-trees in sink jobs. Transports are specified in the connect or serve section of a job definition.Contents
ATTENTION: The client identities must be valid ZFS dataset path
components because the sink job uses ${root_fs}/${client_identity}
to determine the client's subtree.
tcp TransportThe tcp transport uses plain TCP, which means that the data is not encrypted on the wire. Clients are identified by their IPv4 or IPv6 addresses, and the client identity is established through a mapping on the server.This transport may also be used in conjunction with network-layer encryption and/or VPN tunnels to provide encryption on the wire. To make the IP-based client authentication effective, such solutions should provide authenticated IP addresses. Some options to consider:
Servejobs: - type: sink serve: type: tcp listen: ":8888" listen_freebind: true # optional, default false clients: { "192.168.122.123" : "mysql01", "192.168.122.42" : "mx01", "2001:0db8:85a3::8a2e:0370:7334": "gateway", # CIDR masks require a '*' in the client identity string # that is expanded to the client's IP address "10.23.42.0/24": "cluster-*" "fde4:8dba:82e1::/64": "san-*" } ... listen_freebind controls whether the socket is allowed to bind to non-local or unconfigured IP addresses (Linux IP_FREEBIND , FreeBSD IP_BINDANY). Enable this option if you want to listen on a specific IP address that might not yet be configured when the zrepl daemon starts. Connectjobs: - type: push connect: type: tcp address: "10.23.42.23:8888" dial_timeout: # optional, default 10s ... tls TransportThe tls transport uses TCP + TLS with client authentication using client certificates. The client identity is the common name (CN) presented in the client certificate.It is recommended to set up a dedicated CA infrastructure for this transport, e.g. using OpenVPN's EasyRSA. For a simple 2-machine setup, mutual TLS might also be sufficient. We provide copy-pastable instructions to generate the certificates below. The implementation uses Go's TLS library. Since Go binaries are statically linked, you or your distribution need to recompile zrepl when vulnerabilities in that library are disclosed. All file paths are resolved relative to the zrepl daemon's working directory. Specify absolute paths if you are unsure what directory that is (or find out from your init system). If intermediate CAs are used, the full chain must be present in either in the ca file or the individual cert files. Regardless, the client's certificate must be first in the cert file, with each following certificate directly certifying the one preceding it (see TLS's specification). This is the common default when using a CA management tool. NOTE: As of Go 1.15 (zrepl 0.3.0 and newer), the Go TLS / x509
library requires that Subject Alternative Names be present in certificates.
You might need to re-generate your certificates using one of the two
alternatives provided below.
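To check whether an existing certificate already carries a Subject Alternative Name, something like the following can be used (the certificate path is an example; the -ext option needs a reasonably recent OpenSSL, otherwise fall back to -text):

openssl x509 -in /etc/zrepl/prod.crt -noout -subject -ext subjectAltName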
Note further that zrepl continues to use the CommonName field to assign client identities. Hence, we recommend to keep the Subject Alternative Name and the CommonName in sync. Servejobs: - type: sink root_fs: "pool2/backup_laptops" serve: type: tls listen: ":8888" listen_freebind: true # optional, default false ca: /etc/zrepl/ca.crt cert: /etc/zrepl/prod.fullchain key: /etc/zrepl/prod.key client_cns: - "laptop1" - "homeserver" The ca field specified the certificate authority used to validate client certificates. The client_cns list specifies a list of accepted client common names (which are also the client identities for this transport). The listen_freebind field is explained here. Connectjobs: - type: pull connect: type: tls address: "server1.foo.bar:8888" ca: /etc/zrepl/ca.crt cert: /etc/zrepl/backupserver.fullchain key: /etc/zrepl/backupserver.key server_cn: "server1" dial_timeout: # optional, default 10s The ca field specifies the CA which signed the server's certificate (serve.cert). The server_cn specifies the expected common name (CN) of the server's certificate. It overrides the hostname specified in address. The connection fails if either do not match. Mutual-TLS between Two MachinesHowever, for a two-machine setup, self-signed certificates distributed using an out-of-band mechanism will also work just fine:Suppose you have a push-mode setup, with backups.example.com running the sink job, and prod.example.com running the push job. Run the following OpenSSL commands on each host, substituting HOSTNAME in both filenames and the interactive input prompt by OpenSSL: (name=HOSTNAME; openssl req -x509 -sha256 -nodes \ -newkey rsa:4096 \ -days 365 \ -keyout $name.key \ -out $name.crt -addext "subjectAltName = DNS:$name" -subj "/CN=$name") Now copy each machine's HOSTNAME.crt to the other machine's /etc/zrepl/HOSTNAME.crt, for example using scp. The serve & connect configuration will thus look like the following: # on backups.example.com - type: sink serve: type: tls listen: ":8888" ca: "/etc/zrepl/prod.example.com.crt" cert: "/etc/zrepl/backups.example.com.crt" key: "/etc/zrepl/backups.example.com.key" client_cns: - "prod.example.com" ... # on prod.example.com - type: push connect: type: tls address:"backups.example.com:8888" ca: /etc/zrepl/backups.example.com.crt cert: /etc/zrepl/prod.example.com.crt key: /etc/zrepl/prod.example.com.key server_cn: "backups.example.com" ... Certificate Authority using EasyRSAFor more than two machines, it might make sense to set up a CA infrastructure. Tools like EasyRSA make this very easy:#!/usr/bin/env bash set -euo pipefail HOSTS=(backupserver prod1 prod2 prod3) curl -L https://github.com/OpenVPN/easy-rsa/releases/download/v3.0.7/EasyRSA-3.0.7.tgz > EasyRSA-3.0.7.tgz echo "157d2e8c115c3ad070c1b2641a4c9191e06a32a8e50971847a718251eeb510a8 EasyRSA-3.0.7.tgz" | sha256sum -c rm -rf EasyRSA-3.0.7 tar -xf EasyRSA-3.0.7.tgz cd EasyRSA-3.0.7 ./easyrsa ./easyrsa init-pki ./easyrsa build-ca nopass for host in "${HOSTS[@]}"; do ./easyrsa build-serverClient-full $host nopass echo cert for host $host available at pki/issued/$host.crt echo key for host $host available at pki/private/$host.key done echo ca cert available at pki/ca.crt ssh+stdinserver Transportssh+stdinserver uses the ssh command and some features of the server-side SSH authorized_keys file. It is less efficient than other transports because the data passes through two more pipes. 
However, it is fairly convenient to set up and allows the zrepl daemon to not be directly exposed to the internet, because all traffic passes through the system's SSH server.The concept is inspired by git shell and Borg Backup. The implementation is provided by the Go package github.com/problame/go-netssh. NOTE: ssh+stdinserver generally provides inferior error
detection and handling compared to the tcp and tls transports.
When encountering such problems, consider using tcp or tls
transports, or help improve package go-netssh.
Servejobs: - type: source serve: type: stdinserver client_identities: - "client1" - "client2" ... First of all, note that type=stdinserver in this case: Currently, only connect.type=ssh+stdinserver can connect to a serve.type=stdinserver, but we want to keep that option open for future extensions. The serving job opens a UNIX socket named after client_identity in the runtime directory. In our example above, that is /var/run/zrepl/stdinserver/client1 and /var/run/zrepl/stdinserver/client2. On the same machine, the zrepl stdinserver $client_identity command connects to /var/run/zrepl/stdinserver/$client_identity. It then passes its stdin and stdout file descriptors to the zrepl daemon via cmsg(3). zrepl daemon in turn combines them into an object implementing net.Conn: a Write() turns into a write to stdout, a Read() turns into a read from stdin. Interactive use of the stdinserver subcommand does not make much sense. However, we can force its execution when a user with a particular SSH pubkey connects via SSH. This can be achieved with an entry in the authorized_keys file of the serving zrepl daemon. # for OpenSSH >= 7.2 command="zrepl stdinserver CLIENT_IDENTITY",restrict CLIENT_SSH_KEY # for older OpenSSH versions command="zrepl stdinserver CLIENT_IDENTITY",no-port-forwarding,no-X11-forwarding,no-pty,no-agent-forwarding,no-user-rc CLIENT_SSH_KEY
NOTE: You may need to adjust the PermitRootLogin option
in /etc/ssh/sshd_config to forced-commands-only or higher for
this to work. Refer to sshd_config(5) for details.
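A minimal sketch of the relevant sshd_config excerpt, followed by a config test and reload (service management differs per OS):

# /etc/ssh/sshd_config (excerpt)
PermitRootLogin forced-commands-only

# validate the configuration and reload sshd
sshd -t && service sshd reload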
To recap, this is how client authentication works with the ssh+stdinserver transport:
Connectjobs: - type: pull connect: type: ssh+stdinserver host: prod.example.com user: root port: 22 identity_file: /etc/zrepl/ssh/identity # options: # optional, default [], `-o` arguments passed to ssh # - "Compression=yes" # dial_timeout: 10s # optional, default 10s, max time.Duration until initial handshake is completed The connecting zrepl daemon
As discussed in the section above, the connecting zrepl daemon expects that zrepl stdinserver $client_identity is executed automatically via an authorized_keys file entry. The known_hosts file used by the ssh command must contain an entry for connect.host prior to starting zrepl. Thus, run the following on the pulling host's command line (substituting connect.host): ssh -i /etc/zrepl/ssh/identity root@prod.example.com NOTE: The environment variables of the underlying SSH process
are cleared. $SSH_AUTH_SOCK will not be available. It is suggested to
create a separate, unencrypted SSH key solely for that purpose.
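For example, generating such a dedicated, unencrypted key and priming known_hosts might look like this (substitute the host from connect.host):

ssh-keygen -t ed25519 -N '' -C zrepl-replication -f /etc/zrepl/ssh/identity
ssh -i /etc/zrepl/ssh/identity root@prod.example.com   # accept the host key once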
local TransportThe local transport can be used to implement local replication, i.e., push replication between a push and sink job defined in the same configuration file.The listener_name is analogous to a hostname and must match between serve and connect. The client_identity is used by the sink as documented above. jobs: - type: sink serve: type: local listener_name: localsink ... - type: push connect: type: local listener_name: localsink client_identity: local_backup dial_timeout: 2s # optional, 0 for no timeout ... Filter SyntaxFor source, push and snap jobs, a filesystem filter must be defined (field filesystems). A filter takes a filesystem path (in the ZFS filesystem hierarchy) as parameter and returns true (pass) or false (block).A filter is specified as a YAML dictionary with patterns as keys and booleans as values. The following rules determine which result is chosen for a given filesystem path:
The subtree wildcard < means "the dataset left of < and all its children". TIP: You can try out patterns for a configured job using the
zrepl test filesystems subcommand for push and source jobs.
Examples
Full Access
The following configuration will allow access to all filesystems.

jobs:
- type: source
  filesystems: {
    "<": true,
  }
  ...

Fine-grained
The following configuration demonstrates all rules presented above.

jobs:
- type: source
  filesystems: {
    "tank<": true,         # rule 1
    "tank/foo<": false,    # rule 2
    "tank/foo/bar": true,  # rule 3
  }
  ...

Which rule applies to a given path, and what is the result?

tank/foo/bar/loo => 2 false
tank/bar         => 1 true
tank/foo/bar     => 3 true
zroot            => NONE false
tank/var/log     => 1 true

Send & Recv Options
Send Options
Source and push jobs have an optional send configuration section.

jobs:
- type: push
  filesystems: ...
  send:
    # flags from the table below go here
  ...

The following table specifies the list of (boolean) options. Flags with an entry in the zfs send column map directly to the zfs send CLI flags. zrepl does not perform feature checks for these flags. If you enable a flag that is not supported by the installed version of ZFS, the zfs error will show up at runtime in the logs and zrepl status. See the upstream man page (man zfs-send) for their semantics.
encryptedThe encrypted option controls whether the matched filesystems are sent as OpenZFS native encryption raw sends. More specifically, if encrypted=true, zrepl
Filesystems matched by filesystems that are not encrypted are not sent and will cause error log messages. If encrypted=false, zrepl expects that filesystems matching filesystems are not encrypted or have loaded encryption keys. NOTE: Use encrypted instead of raw to make your
intent clear that zrepl must only replicate filesystems that are actually
encrypted by OpenZFS native encryption. It is meant as a safeguard to prevent
unintended sends of unencrypted filesystems in raw mode.
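As a hedged sketch (the dataset name is hypothetical), a push job that raw-sends only an encrypted subtree could look like this:

jobs:
- type: push
  filesystems: {
    "tank/secret<": true   # must be OpenZFS-native-encrypted datasets when encrypted: true
  }
  send:
    encrypted: true
  ...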
propertiesSends the dataset properties along with snapshots. Please be careful with this option and read the note on property replication below.backup_propertiesWhen properties are modified on a filesystem that was received from a send stream with send.properties=true, ZFS archives the original received value internally. This also applies to inheriting or overriding properties during zfs receive.When sending those received filesystems another hop, the backup_properties flag instructs ZFS to send the original property values rather than the current locally set values. This is useful for replicating properties across multiple levels of backup machines. Example: Suppose we want to flow snapshots from Machine A to B, then from B to C. A will enable the properties send option. B will want to override critical properties such as mountpoint or canmount. But the job that replicates from B to C should be sending the original property values received from A. Thus, B sets the backup_properties option. Please be careful with this option and read the note on property replication below. large_blocksThis flag should not be changed after initial replication. Prior to OpenZFS commit 7bcb7f08 it was possible to change this setting which resulted in data loss on the receiver. The commit in question is included in OpenZFS 2.0 and works around the problem by prohibiting receives of incremental streams with a flipped setting.WARNING: This bug has not been fixed in the OpenZFS 0.8
releases which means that changing this flag after initial replication
might cause data loss on the receiver.
Recv OptionsSink and pull jobs have an optional recv configuration section:jobs: - type: pull recv: properties: inherit: - "mountpoint" override: { "org.openzfs.systemd:ignore": "on" } bandwidth_limit: ... placeholder: encryption: unspecified | off | inherit ... Jump to properties , bandwidth_limit , and placeholder. propertiesoverride maps directly to the zfs recv -o flag. Property name-value pairs specified in this map will apply to all received filesystems, regardless of whether the send stream contains properties or not.inherit maps directly to the zfs recv -x flag. Property names specified in this list will be inherited from the receiving side's parent filesystem (e.g. root_fs). With both options, the sending side's property value is still stored on the receiver, but the local override or inherit is the one that takes effect. You can send the original properties from the first receiver to another receiver using send.backup_properties. A Note on Property ReplicationIf a send stream contains properties, as per send.properties or send.backup_properties, the default ZFS behavior is to use those properties on the receiving side, verbatim.In many use cases for zrepl, this can have devastating consequences. For example, when backing up a filesystem that has mountpoint=/ to a storage server, that storage server's root filesystem will be shadowed by the received file system on some platforms. Also, many scripts and tools use ZFS user properties for configuration and do not check the property source (local vs. received). If they are installed on the receiving side as well as the sending side, property replication could have unintended effects. zrepl currently does not provide any automatic safe-guards for property replication:
Below is a non-exhaustive list of problematic properties. Please open a pull request if you find a property that is missing from this list (both with regard to core ZFS tools and other software in the broader ecosystem). Mount behaviour
Note: inheriting or overriding the mountpoint property on ZVOLs fails in zfs recv. This is an issue in OpenZFS . As a workaround, consider creating separate zrepl jobs for your ZVOL and filesystem datasets. Please comment at zrepl issue #430 if you encounter this issue and/or would like zrepl to automatically work around it. SystemdWith systemd, you should also consider the properties processed by the zfs-mount-generator .Most notably:
EncryptionIf the sender filesystems are encrypted but the sender does plain sends and property replication is enabled, the receiver must inherit the following properties:
Placeholdersplaceholder: encryption: unspecified | off | inherit During replication, zrepl creates placeholder datasets on the receiving side if the sending side's filesystems filter creates gaps in the dataset hierarchy. This is generally fully transparent to the user. However, with OpenZFS Native Encryption, placeholders require zrepl user attention. Specifically, the problem is that, when zrepl attempts to create the placeholder dataset on the receiver, and that placeholder's parent dataset is encrypted, ZFS wants to inherit encryption to the placeholder. This is relevant to two use cases that zrepl supports:
For encrypted-send-to-untrusted-receiver, the placeholder datasets need to be created with -o encryption=off. Without it, creation would fail with an error, indicating that the placeholder's parent dataset's key needs to be loaded. But we don't trust the receiver, so we can't expect that to ever happen. However, for send-plain-encrypt-on-receive, we cannot set -o encryption=off. The reason is that if we did, any of the (non-placeholder) child datasets below the placeholder would inherit encryption=off, thereby silently breaking our encrypt-on-receive use case. So, to cover this use case, we need to create placeholders without specifying -o encryption. This will make zfs create inherit the encryption mode from the parent dataset, and thereby transitively from root_fs. The zrepl config provides the recv.placeholder.encryption knob to control this behavior. In unspecified mode (the default), placeholder creation bails out and asks the user to configure a behavior. In off mode, the placeholder is created with encryption=off, i.e., the encrypted-send-to-untrusted-receiver use case. In inherit mode, the placeholder is created without specifying -o encryption at all, i.e., the send-plain-encrypt-on-receive use case.
Common Options
Bandwidth Limit (send & recv)

bandwidth_limit:
  max: 23.5 MiB # -1 is the default and disables rate limiting
  bucket_capacity: # token bucket capacity in bytes; defaults to 128KiB

Both send and recv can be limited to a maximum bandwidth through bandwidth_limit. For most users, it should be sufficient to just set bandwidth_limit.max. The bandwidth_limit.bucket_capacity refers to the token bucket size. The bandwidth limit only applies to the payload data, i.e., the ZFS send stream. It does not account for transport protocol overheads. The scope is the job level, i.e., all concurrent sends or incoming receives of a job share the bandwidth limit.
Replication Options

jobs:
- type: push
  filesystems: ...
  replication:
    protection:
      initial: guarantee_resumability     # guarantee_{resumability,incremental,nothing}
      incremental: guarantee_resumability # guarantee_{resumability,incremental,nothing}
    concurrency:
      size_estimates: 4
      steps: 1
  ...

protection option
The protection variable controls the degree to which a replicated filesystem is protected from getting out of sync through a zrepl pruner or external tools that destroy snapshots. zrepl can guarantee resumability or just incremental replication.
guarantee_resumability is the default value and guarantees that a replication step is always resumable and that incremental replication will always be possible. The implementation uses replication cursors, last-received-hold and step holds.
guarantee_incremental only guarantees that incremental replication will always be possible. If a step from -> to is interrupted and its to snapshot is destroyed, zrepl will remove the half-received to's resume state and start a new step from -> to2. The implementation uses replication cursors, tentative replication cursors and last-received-hold.
guarantee_nothing does not make any guarantees with regards to keeping sending and receiving side in sync. No bookmarks or holds are created to protect sender and receiver from diverging.
Tradeoffs
Using guarantee_incremental instead of guarantee_resumability obviously removes the resumability guarantee. This means that replication progress is no longer monotonic, which might lead to a replication setup that never makes progress if mid-step interruptions are too frequent (e.g. frequent network outages).
However, the advantage and reason for existence of the incremental mode is that it allows the pruner to delete snapshots of interrupted replication steps which is useful if replication happens so rarely (or fails so frequently) that the amount of disk space exclusively referenced by the step's snapshots becomes intolerable. NOTE: When changing this flag, obsoleted zrepl-managed
bookmarks and holds will be destroyed on the next replication step that is
attempted for each filesystem.
concurrency optionThe concurrency options control the maximum amount of concurrency during replication. The default values allow some concurrency during size estimation but no parallelism for the actual replication.
Note that initial replication cannot start replicating child filesystems before the parent filesystem's initial replication step has completed. Some notes on tuning these values:
Taking Snapshots
The push, source and snap jobs can automatically take periodic snapshots of the filesystems matched by the filesystems filter field. The snapshot names are composed of a user-defined prefix followed by a UTC date formatted like 20060102_150405_000. We use UTC because it avoids name conflicts when switching time zones or between summer and winter time. When a job is started, the snapshotter attempts to get the snapshotting rhythms of the matched filesystems in sync, because snapshotting all filesystems at the same time results in a more consistent backup. To find that sync point, the most recent snapshot made by the snapshotter in any of the matched filesystems is used. A filesystem that does not have snapshots by the snapshotter has lower priority than filesystems that do, and thus might not be snapshotted (and replicated) until it is snapshotted at the next sync point. For push jobs, replication is automatically triggered after all filesystems have been snapshotted. Note that the zrepl signal wakeup JOB subcommand does not trigger snapshotting.

jobs:
- type: push
  filesystems: {
    "<": true,
    "tmp": false
  }
  snapshotting:
    type: periodic
    prefix: zrepl_
    interval: 10m
    hooks: ...
  ...

There is also a manual snapshotting type, which covers the following use cases:
Note that you will have to trigger replication manually using the zrepl signal wakeup JOB subcommand in that case.

jobs:
- type: push
  filesystems: {
    "<": true,
    "tmp": false
  }
  snapshotting:
    type: manual
  ...

Pre- and Post-Snapshot Hooks
Jobs with periodic snapshots can run hooks before and/or after taking the snapshot, specified in snapshotting.hooks. Hooks are called per filesystem before and after the snapshot is taken (pre- and post-edge). Pre-edge invocations are in configuration order, post-edge invocations in reverse order, i.e. like a stack. If a pre-snapshot invocation fails, err_is_fatal=true cuts off subsequent hooks, does not take a snapshot, and only invokes post-edges corresponding to previous successful pre-edges. err_is_fatal=false logs the failed pre-edge invocation but does not affect subsequent hooks nor snapshotting itself. Post-edges are only invoked for hooks whose pre-edges ran without error. Note that hook failures for one filesystem never affect other filesystems. The optional timeout parameter specifies a period after which zrepl will kill the hook process and report an error. The default is 30 seconds and may be specified in any units understood by time.ParseDuration. The optional filesystems filter limits the filesystems the hook runs for; it uses the same filter specification as jobs. Most hook types take additional parameters, please refer to the respective subsections below.
command Hooksjobs: - type: push filesystems: { "<": true, "tmp": false } snapshotting: type: periodic prefix: zrepl_ interval: 10m hooks: - type: command path: /etc/zrepl/hooks/zrepl-notify.sh timeout: 30s err_is_fatal: false - type: command path: /etc/zrepl/hooks/special-snapshot.sh filesystems: { "tank/special": true } ... command hooks take a path to an executable script or binary to be executed before and after the snapshot. path must be absolute (e.g. /etc/zrepl/hooks/zrepl-notify.sh). No arguments may be specified; create a wrapper script if zrepl must call an executable that requires arguments. The process standard output is logged at level INFO. Standard error is logged at level WARN. The following environment variables are set:
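As an illustration of how a hook can use these variables, a minimal notification hook might look like the following sketch. The ZREPL_* variable names are assumptions based on the shipped template script (referenced below); verify them against that template before relying on this script.

#!/bin/sh
# /etc/zrepl/hooks/zrepl-notify.sh -- minimal sketch of a command hook.
# Variable names are assumed from config/samples/hooks/template.sh; verify there.
set -eu
case "${ZREPL_HOOKTYPE:-}" in
    pre_snapshot)
        logger -t zrepl-hook "about to snapshot ${ZREPL_FS}@${ZREPL_SNAPNAME}"
        ;;
    post_snapshot)
        logger -t zrepl-hook "took snapshot ${ZREPL_FS}@${ZREPL_SNAPNAME}"
        ;;
    *)
        # unknown hook type: exit 0 so future hook types do not break snapshotting
        ;;
esac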
An empty template hook can be found in config/samples/hooks/template.sh. postgres-checkpoint HookConnects to a Postgres server and executes the CHECKPOINT statement pre-snapshot. Checkpointing applies the WAL contents to all data files and syncs the data files to disk. This is not required for a consistent database backup: it merely forward-pays the "cost" of WAL replay to the time of snapshotting instead of at restore time. However, the Postgres manual recommends against checkpointing during normal operation. Further, the operation requires Postgres superuser privileges. zrepl users must decide on their own whether this hook is useful for them (it likely isn't).ATTENTION: Note that the WAL files and the Postgres data directory (with all database data files) must be on the same filesystem to guarantee a correct point-in-time backup with the ZFS snapshot.
The DSN syntax is documented here: https://godoc.org/github.com/lib/pq CREATE USER zrepl_checkpoint PASSWORD yourpasswordhere; ALTER ROLE zrepl_checkpoint SUPERUSER; - type: postgres-checkpoint dsn: "host=localhost port=5432 user=postgres password=yourpasswordhere sslmode=disable" filesystems: { "p1/postgres/data11": true } mysql-lock-tables HookConnects to MySQL and executes FLUSH TABLES WITH READ LOCK before the snapshot and UNLOCK TABLES after it.
The above procedure is documented in the MySQL manual as a means to produce a consistent backup of a MySQL DBMS installation (i.e., all databases). DSN syntax: [username[:password]@][protocol[(address)]]/dbname[?param1=value1&...&paramN=valueN] ATTENTION: All MySQL databases must be on the same ZFS filesystem to
guarantee a consistent point-in-time backup with the ZFS snapshot.
CREATE USER zrepl_lock_tables IDENTIFIED BY 'yourpasswordhere'; GRANT RELOAD ON *.* TO zrepl_lock_tables; FLUSH PRIVILEGES; - type: mysql-lock-tables dsn: "zrepl_lock_tables:yourpasswordhere@tcp(localhost)/" filesystems: { "tank/mysql": true } Pruning PoliciesIn zrepl, pruning means destroying snapshots. Pruning must happen on both sides of a replication or the systems would inevitably run out of disk space at some point.Typically, the requirements to temporal resolution and maximum retention time differ per side. For example, when using zrepl to back up a busy database server, you will want high temporal resolution (snapshots every 10 min) for the last 24h in case of administrative disasters, but cannot afford to store them for much longer because you might have high turnover volume in the database. On the receiving side, you may have more disk space available, or need to comply with other backup retention policies. zrepl uses a set of keep rules per sending and receiving side to determine which snapshots shall be kept per filesystem. A snapshot that is not kept by any rule is destroyed. The keep rules are evaluated on the active side (push or pull job) of the replication setup, for both active and passive side, after replication completed or was determined to have failed permanently. Example Configuration: jobs: - type: push name: ... connect: ... filesystems: { "<": true, "tmp": false } snapshotting: type: periodic prefix: zrepl_ interval: 10m pruning: keep_sender: - type: not_replicated # make sure manually created snapshots by the administrator are kept - type: regex regex: "^manual_.*" - type: grid grid: 1x1h(keep=all) | 24x1h | 14x1d regex: "^zrepl_.*" keep_receiver: - type: grid grid: 1x1h(keep=all) | 24x1h | 35x1d | 6x30d regex: "^zrepl_.*" # manually created snapshots will be kept forever on receiver - type: regex regex: "^manual_.*" DANGER: You might have existing snapshots of filesystems
affected by pruning which you want to keep, i.e. not be destroyed by zrepl.
Make sure to actually add the necessary regex keep rules on both sides,
like with manual in the example above.
Policy not_replicatedjobs: - type: push pruning: keep_sender: - type: not_replicated ... not_replicated keeps all snapshots that have not been replicated to the receiving side. It only makes sense to specify this rule for the keep_sender. The reason is that, by definition, all snapshots on the receiver have already been replicated to there from the sender. To determine whether a sender-side snapshot has already been replicated, zrepl uses the replication cursor bookmark which corresponds to the most recent successfully replicated snapshot. Policy gridjobs: - type: pull pruning: keep_receiver: - type: grid regex: "^zrepl_.*" grid: 1x1h(keep=all) | 24x1h | 35x1d | 6x30d │ │ │ └─ 1 repetition of a one-hour interval with keep=all │ │ └─ 24 repetitions of a one-hour interval with keep=1 │ └─ 6 repetitions of a 30-day interval with keep=1 ... The retention grid can be thought of as a time-based sieve that thins out snapshots as they get older. The grid field specifies a list of adjacent time intervals. Each interval is a bucket with a maximum capacity of keep snapshots. The following procedure happens during pruning:
The syntax to describe the bucket list is as follows: Repeat x Duration (keep=all)
Example: Assume the following grid specification: grid: 1x1h(keep=all) | 2x2h | 1x3h This grid specification produces the following constellation of buckets: 0h 1h 2h 3h 4h 5h 6h 7h 8h 9h | | | | | | | | | | |-Bucket1-|-----Bucket2-------|------Bucket3------|-----------Bucket4-----------| | keep=all| keep=1 | keep=1 | keep=1 | Now assume that we have a set of snapshots @a, @b, ..., @D. Snapshot @a is the most recent snapshot. Snapshot @D is the oldest snapshot, it is almost 9 hours older than snapshot @a. We place the snapshots on the same timeline as the buckets: 0h 1h 2h 3h 4h 5h 6h 7h 8h 9h | | | | | | | | | | |-Bucket1-|-----Bucket2-------|------Bucket3------|-----------Bucket4-----------| | keep=all| keep=1 | keep=1 | keep=1 | | | | | | | a b c | d e f g h i j k l m n o p |q r s t u v w x y z |A B C D We obtain the following mapping of snapshots to buckets: Bucket1: a,b,c Bucket2: d,e,f,g,h,i Bucket3: j,k,l,m,n,o,p Bucket4: q,r,s,t,u,v,w,x,y,z No bucket: A,B,C,D For each bucket, we now prune snapshots until it only contains `keep` snapshots. Newer snapshots are destroyed first. Snapshots that do not fall into a bucket are always destroyed. Result after pruning: 0h 1h 2h 3h 4h 5h 6h 7h 8h 9h | | | | | | | | | | |-Bucket1-|-----Bucket2-------|------Bucket3------|-----------Bucket4-----------| | | | | | | a b c | i | p | z | Policy last_njobs: - type: push pruning: keep_receiver: - type: last_n count: 10 regex: ^zrepl_.*$ # optional ... last_n filters the snapshot list by regex, then keeps the last count snapshots in that list (last = youngest = most recent creation date) All snapshots that don't match regex or exceed count in the filtered list are destroyed unless matched by other rules. Policy regexjobs: - type: push pruning: keep_receiver: # keep all snapshots with prefix zrepl_ or manual_ - type: regex regex: "^(zrepl|manual)_.*" - type: push snapshotting: prefix: zrepl_ pruning: keep_sender: # keep all snapshots that were not created by zrepl - type: regex negate: true regex: "^zrepl_.*" regex keeps all snapshots whose names are matched by the regular expression in regex. Like all other regular expression fields in prune policies, zrepl uses Go's regexp.Regexp Perl-compatible regular expressions (Syntax). The optional negate boolean field inverts the semantics: Use it if you want to keep all snapshots that do not match the given regex. Source-side snapshot pruningA source jobs takes snapshots on the system it runs on. The corresponding pull job on the replication target connects to the source job and replicates the snapshots. Afterwards, the pull job coordinates pruning on both sender (the source job side) and receiver (the pull job side).There is no built-in way to define and execute pruning on the source side independently of the pull side. The source job will continue taking snapshots which will not be pruned until the pull side connects. This means that extended replication downtime will fill up the source's zpool with snapshots. If the above is a conceivable situation for you, consider using push mode, where pruning happens on the same side where snapshots are taken. Workaround using snap jobAs a workaround (see GitHub issue #102 for development progress), a pruning-only snap job can be defined on the source side: The snap job is in charge of snapshot creation & destruction, whereas the source job's role is reduced to just serving snapshots. 
However, since jobs run independently of each other, it is possible that the snap job will prune snapshots that are queued for replication / destruction by the remote pull job that connects to the source job. Symptoms of such race conditions are spurious replication and destroy errors.Example configuration: # source side jobs: - type: snap snapshotting: type: periodic pruning: keep: # source side pruning rules go here ... - type: source snapshotting: type: manual root_fs: ... # pull side jobs: - type: pull pruning: keep_sender: # let the source-side snap job do the pruning - type: regex regex: ".*" ... keep_receiver: # feel free to prune on the pull side as desired ... Loggingzrepl uses structured logging to provide users with easily processable log messages.Logging outlets are configured in the global section of the config file. global: logging: - type: OUTLET_TYPE level: MINIMUM_LEVEL format: FORMAT - type: OUTLET_TYPE level: MINIMUM_LEVEL format: FORMAT ... jobs: ... ATTENTION: The first outlet is special: if an error writing
to any outlet occurs, the first outlet receives the error and can print it.
Thus, the first outlet must be the one that always works and does not block,
e.g. stdout, which is the default.
Default ConfigurationBy default, the following logging configuration is used: global: logging: - type: "stdout" level: "warn" format: "human" Building BlocksThe following sections document the semantics of the different log levels, formats, and outlet types.Levels
Incorrectly classified messages are considered a bug and should be reported. Formats
OutletsOutlets are the destination for log entries.stdout Outlet
Writes all log entries with minimum level level formatted by format to stdout. If stdout is a tty, interactive usage is assumed and both time and color are set to true. Can only be specified once. syslog Outlet
Writes all log entries formatted by format to syslog. On normal setups, you should not need to change the retry_interval. Can only be specified once. tcp Outlet
Establishes a TCP connection to address and sends log messages with minimum level level formatted by format. If tls is not specified, an unencrypted connection is established. If tls is specified, the TCP connection is secured with TLS + Client Authentication. The latter is particularly useful in combination with log aggregation services.
WARNING: zrepl drops log messages to the TCP outlet if the
underlying connection is not fast enough. Note that the kernel's TCP buffers must fill up before messages are dropped.
Make sure to always configure a stdout outlet as the special error outlet to be informed about problems with the TCP outlet (see above). NOTE: zrepl uses Go's crypto/tls and crypto/x509
packages and leaves all but the required fields in tls.Config at their
default values. In case of a security defect in these packages, zrepl has to
be rebuilt because Go binaries are statically linked.
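Putting the outlets together, a logging section that keeps a human-readable stdout outlet as the first (error) outlet, additionally ships entries to syslog, and forwards JSON-formatted entries to a TLS-protected log collector might look roughly like the following sketch. The host name and certificate paths are made up, and the exact field names should be verified against the full config documentation.

global:
  logging:
    # first outlet: must always work and never block (see above)
    - type: stdout
      level: warn
      format: human
    - type: syslog
      level: info
      format: human
      retry_interval: 10s
    - type: tcp
      level: debug
      format: json
      address: "logs.example.com:10514"   # made-up collector address
      retry_interval: 10s
      tls:                                 # omit this block for an unencrypted connection
        ca: /etc/zrepl/logserver.crt
        cert: /etc/zrepl/prod-log.crt
        key: /etc/zrepl/prod-log.key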
MonitoringMonitoring endpoints are configured in the global.monitoring section of the config file.Prometheus & Grafanazrepl can expose Prometheus metrics via HTTP. The listen attribute is a net.Listen string for tcp, e.g. :9811 or 127.0.0.1:9811 (port 9811 was reserved for zrepl on the official list). The listen_freebind attribute is explained here. The Prometheus monitoring job appears in the zrepl control job list and may be specified at most once.zrepl also ships with an importable Grafana dashboard that consumes the Prometheus metrics: see dist/grafana. The dashboard also contains some advice on which metrics are important to monitor. NOTE: At the time of writing, there is no stability guarantee
on the exported metrics.
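The zrepl-side configuration is shown below. On the Prometheus server, a minimal scrape job for that endpoint could look like the following sketch (the target host name is an assumption):

scrape_configs:
  - job_name: 'zrepl'
    static_configs:
      - targets: ['prod.example.com:9811']   # zrepl's monitoring listen address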
global: monitoring: - type: prometheus listen: ':9811' listen_freebind: true # optional, default false MiscellaneousRuntime Directories & UNIX SocketsThe zrepl daemon needs to open various UNIX sockets in a runtime directory:
There is no authentication on these sockets except the UNIX permissions. The zrepl daemon will refuse to bind any of the above sockets in a directory that is world-accessible. The following sections of the global config shows the default paths. The shell script below shows how the default runtime directory can be created. global: control: sockpath: /var/run/zrepl/control serve: stdinserver: sockdir: /var/run/zrepl/stdinserver mkdir -p /var/run/zrepl/stdinserver chmod -R 0700 /var/run/zrepl Durations & IntervalsInterval & duration fields in job definitions, pruning configurations, etc. must match the following regex:var durationStringRegex *regexp.Regexp = regexp.MustCompile(`^\s*(\d+)\s*(s|m|h|d|w)\s*$`) // s = second, m = minute, h = hour, d = day, w = week (7 days) Super-Verbose Job DebuggingYou have probably landed here because you opened an issue on GitHub and some developer told you to do this... So just read the annotated comments ;)job: - name: ... ... # JOB DEBUGGING OPTIONS # should be equal for all job types, but each job implements the debugging itself debug: conn: # debug the io.ReadWriteCloser connection read_dump: /tmp/connlog_read # dump results of Read() invocations to this file write_dump: /tmp/connlog_write # dump results of Write() invocations to this file rpc: # debug the RPC protocol implementation log: true # log output from rpc layer to the job log ATTENTION: Connection dumps will almost certainly contain your or
others' private data. Do not share them in a bug report.
UsageCLI OverviewNOTE:The zrepl binary is self-documenting: run zrepl
help for an overview of the available subcommands or zrepl SUBCOMMAND
--help for information on available flags, etc.
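For orientation, a few commonly used invocations are sketched below; double-check the subcommand names against zrepl help on your version:

zrepl daemon              # run the daemon (usually started via the init system)
zrepl status              # live view of job and replication progress
zrepl signal wakeup JOB   # wake up JOB immediately (does not trigger snapshotting)
zrepl configcheck         # parse and validate the configuration file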
zrepl daemonAll actual work zrepl does is performed by a daemon process. The daemon supports structured logging and provides monitoring endpoints.When installing from a package, the package maintainer should have provided an init script / systemd.service file. You should thus be able to start zrepl daemon using your init system. Alternatively, or for running zrepl in the foreground, simply execute zrepl daemon. Note that you won't see much output with the default logging configuration: ATTENTION: Make sure to actually monitor the error level output of
zrepl: some configuration errors will not make the daemon exit.
Example: if the daemon cannot create the transport-ssh+stdinserver sockets in the runtime directory, it will emit an error message but not exit because other tasks such as periodic snapshots & pruning are of equal importance. RestartingThe daemon handles SIGINT and SIGTERM for graceful shutdown. Graceful shutdown means at worst that a job will not be rescheduled for the next interval. The daemon exits as soon as all jobs have reported shut down.Systemd Unit FileA systemd service definition template is available in dist/systemd. Note that some of the options only work on recent versions of systemd. Any help & improvements are very welcome; see issue #145.Ops RunbooksMigrating Sending SideObjective: Move the sending-side zpool to new hardware. Make the move fully transparent to the sending-side jobs. After the move is done, all sending-side zrepl jobs should continue to work as if the move had not happened. In particular, incremental replication should be able to pick up where it left off before the move.Suppose we want to migrate all data from one zpool oldpool to another zpool newpool. A possible reason might be that we want to change RAID levels, ashift, or just migrate over to next-gen hardware. If the pool names are different, zrepl's matching between sender and receiver datasets will break because the receive-side dataset names contain oldpool. To avoid this, we will need the name of the new pool to match that of the old pool. The following steps will accomplish this:
Note that, depending on pruning rules, it will not be possible to switch back to the old pool seamlessly, i.e., without a full re-replication. Platform TestsAlong with the main zrepl binary, we release the platformtest binaries. The zrepl platform tests are an integration test suite that is complementary to the pure Go unit tests. Any test that needs to interact with ZFS is a platform test.The platform tests need to run as root. For each test, we create a fresh dummy zpool backed by a file-based vdev. The file path, and a root mountpoint for the dummy zpool, must be specified on the command line: mkdir -p /tmp/zreplplatformtest ./platformtest \ -poolname 'zreplplatformtest' \ # <- name must contain zreplplatformtest -imagepath /tmp/zreplplatformtest.img \ # <- zrepl will create the file -mountpoint /tmp/zreplplatformtest # <- must exist WARNING: platformtest will unconditionally overwrite the
file at imagepath and unconditionally zpool destroy $poolname.
So, don't use a production poolname, and consider running the test in a VM.
It'll be a lot faster as well because the underlying operations, zfs
list in particular, will be faster.
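For iterating on a single failing test, the basic invocation above can be combined with the selection flags described further below; for example (the test name regex is just an illustration):

sudo ./platformtest \
    -poolname 'zreplplatformtest' \
    -imagepath /tmp/zreplplatformtest.img \
    -mountpoint /tmp/zreplplatformtest \
    -run 'ReplicationCursor' \
    -failure.stop-and-keep-pool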
While the platformtests are running, there will be a lot of log output. After all tests have run, the test runner prints a summary with a list of tests, grouped by result type (success, failure, skipped): PASSING TESTS: github.com/zrepl/zrepl/platformtest/tests.BatchDestroy github.com/zrepl/zrepl/platformtest/tests.CreateReplicationCursor github.com/zrepl/zrepl/platformtest/tests.GetNonexistent github.com/zrepl/zrepl/platformtest/tests.HoldsWork ... github.com/zrepl/zrepl/platformtest/tests.SendStreamNonEOFReadErrorHandling github.com/zrepl/zrepl/platformtest/tests.UndestroyableSnapshotParsing SKIPPED TESTS: github.com/zrepl/zrepl/platformtest/tests.SendArgsValidationEncryptedSendOfUnencryptedDatasetForbidden__EncryptionSupported_false FAILED TESTS: [] If there is a failure, or a skipped test that you believe should be passing, re-run the test suite, capture stderr & stdout to a text file, and create an issue on GitHub. To run a specific test case, or a subset of tests matched by regex, use the -run REGEX command line flag. To stop test execution at the first failing test, and prevent cleanup of the dummy zpool, use the -failure.stop-and-keep-pool flag. To build the platformtests yourself, use make test-platform-bin. There's also the make test-platform target to run the platform tests with a default command line. Talks & Presentations
ChangelogThe changelog summarizes bugfixes that are deemed relevant for users and package maintainers. Developers should consult the git commit log or GitHub issue tracker.0.6 (Unreleased)
0.5
Note to all users: please read up on the following OpenZFS bugs, as you might be affected:
Finally, I'd like to point you to the GitHub discussion about which bugfixes and features should be prioritized in zrepl 0.6 and beyond! NOTE: zrepl is a spare-time project primarily developed by Christian Schwarz. You can support maintenance and feature development through one of the following services: Donate via Patreon Donate via GitHub Sponsors Donate via Liberapay Donate via PayPal Note that PayPal processing fees are relatively high for small donations. For SEPA wire transfer and commercial support, please contact Christian directly. 0.4.0
For users who skipped the 0.3.1 update: please make sure your pruning grid config is correct. The following bugfix in 0.3.1 caused problems for some users:
0.3.1Mostly a bugfix release for zrepl 0.3.
0.3This is a big one! Headlining features:
TIP: We highly recommend studying the updated overview section
of the configuration chapter to understand how replication works.
TIP: Go 1.15 changed the default TLS validation policy to
require Subject Alternative Names (SAN) in certificates. The openssl
commands we provided in the quick-start guides up to and including the zrepl
0.3 docs seem not to work properly. If you encounter certificate validation
errors regarding SAN and wish to continue to use your old certificates, start
the zrepl daemon with env var GODEBUG=x509ignoreCN=0. Alternatively,
generate new certificates with SANs (see both options in the TLS transport docs).
Quick-start guides:
Additional changelog:
0.2.1
0.2
0.1.1
0.1This release is a milestone for zrepl and required significant refactoring if not rewrites of substantial parts of the application. It breaks both configuration and transport format, and thus requires manual intervention and updates on both sides of a replication setup.DANGER: The changes in the pruning system for this release
require you to explicitly define keep rules: for any snapshot that you
want to keep, at least one rule must match. This is different from previous
releases where pruning only affected snapshots with the configured
snapshotting prefix. Make sure that snapshots to be kept or ignored by zrepl
are covered, e.g. by using the regex keep rule. Learn more in the
config docs...
Notes to Package Maintainers
Changes
Previous ReleasesNOTE:Due to limitations in our documentation system, we only show the changelog between the last release and the time this documentation was built. For the changelog of previous releases, use the version selection in the hosted version of these docs at zrepl.github.io.
Donate via Patreon Donate via GitHub Sponsors Donate via Liberapay Donate via PayPal zrepl is a spare-time project primarily developed by Christian Schwarz. You can support maintenance and feature development through one of the services listed above. For SEPA wire transfer and commercial support, please contact Christian directly. Thanks for your support! NOTE: PayPal takes a relatively high fixed processing fee plus a percentage of the donation. Larger, less frequent donations make more sense there.
SupportersWe would like to thank the following people and organizations for supporting zrepl through monetary and other means:
AUTHORChristian SchwarzCOPYRIGHT2017-2019, Christian Schwarz