Recently I ran into an issue with acme.sh / Let’s Encrypt and a failing ACME validation
Error 404 when running acme.sh --renew -d mydomain.tld
[Wed May 3 15:31:45 UTC 2023] Pending, The CA is processing your order, please just wait. (1/30)
[Wed May 3 15:31:49 UTC 2023] mydomain.tld:Verify error:<ipaddress> Invalid response from https://mydomain.tld/.well-known/acme-challenge/5GmSwd0P0ukTtX302yHHhAuZMCEDJx7MmAaBBoPIKtk: 404
[Wed May 3 15:31:49 UTC 2023] Please add '--debug' or '--log' to check more details.
[Wed May 3 15:31:49 UTC 2023] See: https://github.com/acmesh-official/acme.sh/wiki/How-to-debug-acme.sh
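To reproduce a 404 like this outside of acme.sh, it can help to request a file under the challenge path yourself. The following is only a minimal sketch: the helper names `challenge_url` and `probe` are my own, and it assumes you have manually placed a test file (here the hypothetical name `test-token`) in the directory your web server is supposed to serve for `/.well-known/acme-challenge/`.

```python
import urllib.error
import urllib.request

# Path prefix that the CA requests during validation (taken from the log above).
CHALLENGE_PATH = "/.well-known/acme-challenge/"


def challenge_url(domain: str, token: str, scheme: str = "https") -> str:
    """Build the URL under which a challenge file is expected to be served."""
    return f"{scheme}://{domain}{CHALLENGE_PATH}{token}"


def probe(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url, including error codes such as 404."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still what we want.
        return err.code
```

Calling `probe(challenge_url("mydomain.tld", "test-token"))` with your real domain should return 200 if the test file is reachable. A 404 for a file that you know exists suggests that the web server is not serving that directory for this host.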
My Mastodon instance metalhead.club has existed since summer 2016 and has seen several waves of new users, but never as many new users as in early November 2022. This has led not only to heavy CPU load on the servers (see my post about scaling up Mastodon’s Sidekiq workers), but also to greater demand for storage space. Mastodon uses a media cache that stores not only copies of preview images for posts containing links, but also copies of all media files that the server knows of. Before the user wave of late 2022, metalhead.club’s media cache was about 350 GB in size with a cache retention time of 60 days. The numbers escalated quickly: after a few days we were already at 400 GB, and after about 3 weeks we had more than 550 GB of cached media files. And that was not with 60 days of retention time, but with only 30.
Although I added hundreds of GB of new storage space, the cache showed no signs of shrinking in the near future, so I decided to offload the storage to an S3 storage provider. Otherwise, the local disks would have been full a few days later.