Move a Mastodon instance with less than 3 minutes downtime (LXD/ZFS-based)
As you might know, I’m running metalhead.club, a Mastodon instance for metalheads. Due to the increasing storage and computing demand (and because I wanted to drop my old host), I decided to move the instance to my new, more powerful host. Luckily, I had packed the whole instance and all its dependencies into an LXC container (with LXD as container manager) a couple of months earlier. Usually you would restore your Ruby / NodeJS environment on your new host, transfer the database, application files and media files, and make sure everything fits. In my case it was basically just a file system transfer and a re-import on the new LXD host: much easier and less error-prone.
In this post I’ll show you the exact steps I took to move my Mastodon instance yesterday.
In this guide I assume that:
- You’re using LXD via the official snap package
- You’re using LXD with ZFS
- LXD is set up on the new host
- A trusted SSH connection is available from the old host to the new one
Step #1: Decrease DNS TTL
The TTL for DNS records is usually somewhere between 3600 (1 hour) and 86400 (24 hours), which would cause major downtime for clients that refreshed their DNS data just before the switch. Before you move your instance, decrease the TTL to one minute (60) or even less. After changing the TTL, wait for the old TTL to expire so that the new value has propagated to all resolvers. If your previous TTL was 86400, wait 24 hours before you continue with the next steps.
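Before continuing, it can help to check which TTL is actually being served. The following is a minimal sketch, assuming dig is installed; min_ttl is a hypothetical helper name, and mastodon.example.org and ns1.example.org are placeholders for your instance's hostname and your authoritative name server. Keep in mind that a caching resolver reports the remaining TTL, not the configured one.

```shell
#!/bin/sh
# Print the smallest TTL found in `dig +noall +answer` output read from stdin.
# Answer lines have the form: NAME TTL CLASS TYPE RDATA
min_ttl() {
    awk 'NF >= 5 && $2 ~ /^[0-9]+$/ {
             t = $2 + 0
             if (!seen || t < min) { min = t; seen = 1 }
         }
         END { if (seen) print min }'
}

# Usage (placeholder names, replace with your own):
#   dig +noall +answer mastodon.example.org A | min_ttl
#   dig @ns1.example.org +noall +answer mastodon.example.org A | min_ttl
```

Once the printed value matches the new TTL and the old TTL has elapsed, it should be safe to go on.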
Step #2: First data migration (foundation)
To keep downtime low and not miss any new or updated data, the data migration is done in two parts: First, a basic copy of the container’s file system is transferred. This might take a while, depending on the data size. Then, shortly before switching the IP address, updated data is synchronized to the new host with the help of ZFS snapshots. This second transfer is much faster because the older data is already there.
Let’s first create a snapshot of the container’s filesystem on the old host:
root@oldhost# zfs snapshot default/containers/mastodon@mig1
Then transfer the snapshot to the new host’s ZFS filesystem:
root@oldhost# zfs send default/containers/mastodon@mig1 | pv -ps 35g | ssh newhost zfs recv default/containers/mastodon
The pv -ps 35g command shows the overall progress, where 35g is the expected data size (shown by e.g. zfs list). The ZFS dataset default/containers/mastodon must not exist on the new host!
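Both preconditions can be checked from the old host before starting the long transfer. This is only a sketch under assumptions: snapshot_bytes and target_is_free are hypothetical helper names, newhost is the SSH host from the commands above, and the referenced property is used as a rough size hint for pv rather than an exact stream size.

```shell
#!/bin/sh
# Print the number of bytes referenced by a snapshot (exact value via -p),
# usable as a size hint for `pv -s`.
snapshot_bytes() {
    zfs list -H -p -o referenced "$1"
}

# Succeed only if listing the dataset on the given SSH host fails,
# i.e. the dataset does not exist there yet.
# Caveat: an SSH connection failure also looks like "free".
target_is_free() {
    ! ssh "$1" zfs list -H -o name "$2" >/dev/null 2>&1
}

# Usage:
#   target_is_free newhost default/containers/mastodon || exit 1
#   zfs send default/containers/mastodon@mig1 \
#       | pv -ps "$(snapshot_bytes default/containers/mastodon@mig1)" \
#       | ssh newhost zfs recv default/containers/mastodon
```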
After the transfer, mount the file system in the namespace context of the LXD process:
root@newhost# nsenter -t $(pgrep daemon.start) -m
root@newhost# mkdir /var/snap/common/lxd/storage-pools/default/containers/mastodon
root@newhost# mount -t zfs default/containers/mastodon /var/snap/common/lxd/storage-pools/default/containers/mastodon
root@newhost# exit
Then import the new filesystem into LXD:
root@newhost# lxd import --force mastodon
The --force option is needed because the new LXD instance can’t find any earlier snapshots in the ZFS dataset, although backup.yml announces them.
lxc list should now display the imported Mastodon container. You can try to start it via lxc start mastodon, just as usual.
Step #3: Final data sync
Let’s start with the final data migration. To make sure the container file system is in a consistent state and no more data is written to it, both Mastodon instances must now be shut down!
root@oldhost# lxc stop mastodon
root@newhost# lxc stop mastodon
Create a new, updated snapshot (mig2) on the old host and send it incrementally to the new server:
root@oldhost# zfs snapshot default/containers/mastodon@mig2
root@oldhost# zfs send -i default/containers/mastodon@mig1 default/containers/mastodon@mig2 | ssh newhost zfs recv -F default/containers/mastodon
The -F option of zfs recv makes ZFS roll the target dataset back to the mig1 snapshot before receiving, which discards changes that were caused by the previous startup of the mastodon container.
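Since the downtime window starts with the first lxc stop, the final sync could also be scripted so that no time is lost typing. The following is a rough sketch to be run as root on the old host, not a tested migration tool: final_sync is a hypothetical function name, and calling /snap/bin/lxc over SSH is an assumption about where the snap package places the client on the new host.

```shell
#!/bin/sh
# Final sync, run on the old host. Arguments:
#   $1 container name, $2 ZFS dataset, $3 SSH host of the new server
final_sync() {
    container=$1
    dataset=$2
    newhost=$3

    # Stop the old instance first; abort if that fails.
    lxc stop "$container" || return 1
    # Stop the imported copy; ignore the error if it was never started.
    ssh "$newhost" /snap/bin/lxc stop "$container" || true
    # Take the second snapshot and send only the changes since mig1.
    zfs snapshot "$dataset@mig2" || return 1
    zfs send -i "$dataset@mig1" "$dataset@mig2" \
        | ssh "$newhost" zfs recv -F "$dataset"
}

# Usage:
#   final_sync mastodon default/containers/mastodon newhost
```

The function returns the exit status of the receiving side, so a failed transfer can be detected before the new instance is started.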
Step #4: Update DNS to new IP address
It is now time to update your DNS settings and publish the new server’s IP address for the Mastodon instance. Start the new instance:
root@newhost# lxc start mastodon
… and keep the old instance stopped!
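To confirm that the DNS change is visible, the published address can be compared with the new server's IP. A minimal sketch, assuming dig is installed; resolves_to is a hypothetical helper, and the hostname and address in the usage comment are placeholders.

```shell
#!/bin/sh
# Succeed if one of the A records returned for the name ($1)
# is exactly the expected IPv4 address ($2).
resolves_to() {
    dig +short "$1" A | grep -qxF "$2"
}

# Usage (placeholder hostname and documentation-range address):
#   resolves_to mastodon.example.org 192.0.2.10 && echo "DNS updated"
```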
Container migration is now complete! :-)