Mastodon: Upgrading PostgreSQL database version 11 to 13

Quite a while ago I migrated my Mastodon container from Debian Buster to Debian Bullseye - but without migrating the PostgreSQL database from version 11 to version 13. There were more important issues on my list, and a migration seemed tricky at first - especially with a database that is, for my circumstances, relatively large at 80 GB. The stability of my Mastodon instance was more important, and an upgrade would only have eaten up time unnecessarily and exposed the instance to downtime, which I initially wanted to avoid.

With the newly announced Debian version 12 "Bookworm", the topic has become pressing again, because Debian Bookworm ships PostgreSQL in version 15. I want to be prepared for this and have decided that the upgrade must now finally be completed.

Apply Pgtune settings and other configuration changes

PostgreSQL 13 had already been installed in parallel with PostgreSQL 11 by the system upgrade from Buster to Bullseye back then. Surprisingly, my customizations in postgresql.conf had apparently also been transferred automatically to the new configuration of version 13 in /etc/postgresql/13/main/postgresql.conf - or did I do that myself and simply forget again? …
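For reference, a pgtune-style block of overrides in /etc/postgresql/13/main/postgresql.conf might look like this - the values below are purely illustrative (sized for a hypothetical 16 GB / 4-core host), not my actual settings:

```ini
# Illustrative pgtune-style overrides for a 16 GB RAM / 4-core host
# (example values only - generate your own with pgtune for your hardware)
max_connections = 200
shared_buffers = 4GB
effective_cache_size = 12GB
maintenance_work_mem = 1GB
work_mem = 10MB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
random_page_cost = 1.1
effective_io_concurrency = 200
```

Whatever values you use, it is worth diffing the old and new postgresql.conf after the upgrade to make sure nothing was silently dropped.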

Perform upgrade - with pg_upgradecluster

There are three ways to perform the upgrade. The easiest way (and the one usually suggested by the system during the OS upgrade) is to use the pg_upgradecluster script. After a simple

pg_dropcluster --stop 13 main
pg_upgradecluster 11 main

the upgrade script takes care of everything else and you can sit back and relax.

… because although the script works great for small databases (and I would recommend it for those anytime!), for large databases it not only takes a lot of time but also a lot of disk space: a second, PostgreSQL 13-compatible cluster is created in the background and all data is copied into it, so the disk space requirement is doubled in the end. Essentially, nothing more happens than dumping all data from the old version 11 cluster and importing it into the new PostgreSQL 13 database.

After 40 minutes I gave up and aborted the process on my Mastodon test deployment. The downtime was clearly too long for me.

Fortunately, there is a more efficient method: upgrading with --link and --method=upgrade. Here the old PostgreSQL 11 database is not kept but converted "in-place" into a PostgreSQL 13 database. The process cannot be undone and the old database cannot be restored - but the method has a crucial advantage: thanks to the --link option, the data files mostly do not need to be regenerated or re-imported; they are simply carried over to the new version via so-called hard links in the file system. No copy operation is necessary, and we save valuable time!

pg_dropcluster --stop 13 main
pg_upgradecluster --method=upgrade --link 11 main

For my 80 GB Mastodon database, the process took 11 seconds (!). With that, the upgrade was as good as done - more on that in a moment. After the upgrade, the old version 11 data can be deleted using pg_dropcluster 11 main:

Success. Please check that the upgraded cluster works. If it does, you can remove the old cluster with
	pg_dropcluster 11 main

Ver Cluster Port Status Owner    Data directory              Log file
11  main    5433 down   postgres /var/lib/postgresql/11/main /var/log/postgresql/postgresql-11-main.log
Ver Cluster Port Status Owner    Data directory              Log file
13  main    5432 online postgres /var/lib/postgresql/13/main /var/log/postgresql/postgresql-13-main.log

I was cautious at first and made sure the data was now in my PostgreSQL 13 database:

su - postgres
psql mastodon_production
select * from accounts limit 5;
\q
exit

The data was there and I was able to delete remnants of the old PostgreSQL 11 data:

sudo pg_dropcluster 11 main

By the way: since the file system only deletes a so-called "inode" (where the actual data of a file is stored) once not a single hard link pointing to it is left, you don't have to worry that files still needed by the new cluster remain in /var/lib/postgresql/11/, or that pg_dropcluster 11 main could cause unforeseen damage.
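This hard-link behaviour is easy to demonstrate with any ordinary file - the file names below are made up for the demo and have nothing to do with the actual cluster layout:

```shell
# Demo: an inode is only freed once its LAST hard link is gone,
# so deleting the old cluster's link leaves the data intact.
tmpdir=$(mktemp -d)
echo "table data" > "$tmpdir/old_cluster_file"
ln "$tmpdir/old_cluster_file" "$tmpdir/new_cluster_file"  # hard link, like --link does
links=$(stat -c %h "$tmpdir/old_cluster_file")            # link count is now 2
rm "$tmpdir/old_cluster_file"                             # "pg_dropcluster 11 main"
content=$(cat "$tmpdir/new_cluster_file")                 # data still readable
echo "links=$links content=$content"                      # prints: links=2 content=table data
rm -r "$tmpdir"
```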

Third upgrade path: Replication

The most complicated (but also most elegant) way to upgrade is to attach a second, still-empty PostgreSQL 13 database to the old one in replication mode (active/passive), migrate the data slowly in the background until both databases contain the same data, and finally perform the actual switchover to version 13 during a minimal downtime.
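With PostgreSQL's built-in logical replication, the core of this approach looks roughly as follows. This is only a sketch with illustrative names and ports - in practice you also need wal_level = logical on the old cluster, the schema pre-created on the new one, and some care with sequences, which logical replication does not copy:

```sql
-- On the old PostgreSQL 11 cluster (here assumed on port 5432):
CREATE PUBLICATION mastodon_upgrade FOR ALL TABLES;

-- On the new, schema-initialised PostgreSQL 13 cluster (here on port 5433):
CREATE SUBSCRIPTION mastodon_upgrade
    CONNECTION 'host=127.0.0.1 port=5432 dbname=mastodon_production user=postgres'
    PUBLICATION mastodon_upgrade;

-- Once the initial copy has caught up: stop Mastodon, wait for the lag to
-- reach zero, point the application at the new cluster, drop the subscription.
```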

There is even a Ruby script for this that can automate the process to some extent: https://github.com/shayonj/pg_easy_replicate.

Due to lack of time I have not tried this method. For even larger databases it is certainly useful. Maybe I will test it during the next upgrade …

Unexpected: Reindexing after upgrade

As mentioned earlier, the upgrade via pg_upgradecluster with the --link option finished within seconds and completely without problems. However, I had not considered that the upgrade to PostgreSQL 13 would result in a complete re-indexing of the database. As can be seen in the release notes, the indexing algorithm was adjusted and its performance improved. To benefit from these improvements, a reindex must be performed - and the automatic upgrade script pg_upgradecluster apparently did this for me without further notice. Convenient, but surprising: the reindexing caused the performance of my Mastodon database to be so poor in the first 5 minutes that metalhead.club was unreachable due to database timeouts. Things slowly improved, and after about 10 minutes the database was back to its original performance.
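If you would rather keep the reindexing under your own control (e.g. run it in a maintenance window instead of right at startup), PostgreSQL 12 and later can rebuild indexes without blocking writes. A sketch, assuming the database name used above:

```sql
-- Rebuild all indexes of the database without blocking writes
-- (REINDEX ... CONCURRENTLY is available from PostgreSQL 12 onwards)
REINDEX DATABASE CONCURRENTLY mastodon_production;

-- Watch the progress from another session:
SELECT * FROM pg_stat_progress_create_index;
```

The concurrent variant needs roughly twice the disk space of the indexes being rebuilt while it runs, so check free space first.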

It would have been useful to know this beforehand… ;-)

Customize Pgbouncer

By the way - if you use pgbouncer in your Mastodon setup, you might have to adjust the port number for your new PostgreSQL 13 cluster. You can find out the current port number via

sudo pg_lsclusters

In my case the port number did not change - because the upgrade script simply took over the port number from version 11 for version 13. Convenient! Otherwise, the port in /etc/pgbouncer/pgbouncer.ini would have had to be adjusted and pgbouncer restarted:

[databases]
mastodon_production = host=127.0.0.1 port=5432 dbname=mastodon_production user=mastodon password=blablabla