Speeding up global DNS resolution by avoiding CNAMEs
As shown in my article “A small CDN for my Mastodon instance metalhead.club,” I like to use CNAMEs to organize my DNS entries. I usually create one DNS record per host that maps its hostname to its IP address. Then I use one or more CNAME entries to point certain (sub)domains at these hostnames, depending on the service and purpose. This keeps things easy to oversee and simplifies maintenance, especially when a host's IP address has to change. Thanks to the chaining of entries, only the last mapping, from server hostname to IP, needs to be adjusted in that case.
Using the example from the article mentioned above, a CNAME chain might look like this:
[thomas@thomas-nb]~% dig media.metalhead.club
;; ANSWER SECTION:
media.metalhead.club. 1800 IN CNAME metalheadclub-media.cdn.650thz.de.
metalheadclub-media.cdn.650thz.de. 21600 IN CNAME s3.650thz.de.
s3.650thz.de. 3600 IN A 5.1.72.141
So first media.metalhead.club is resolved, then metalheadclub-media.cdn.650thz.de, and finally s3.650thz.de. Only then is the IP address of the target host known. I like this tidiness because the structure matches how I think about my setup. However, as I discovered when analyzing my small CDN for metalhead.club, it is not particularly good for performance. CNAMEs have a significant disadvantage: the time required for DNS resolution increases with each CNAME in the chain leading up to the IP address.
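To make the chain explicit, here is a minimal Python sketch that parses the answer section shown above and follows it hop by hop. The helper names (`parse_answer`, `follow_chain`) are my own for illustration, and the parser assumes the simple five-column format of this particular dig output.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str
    ttl: int
    rtype: str
    target: str


def parse_answer(text):
    """Parse lines of the form 'name ttl class type target', skipping comments."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue
        name, ttl, _cls, rtype, target = line.split()
        records.append(Record(name, int(ttl), rtype, target))
    return records


def follow_chain(records, start):
    """Follow CNAME records from 'start' until a non-CNAME record is reached."""
    by_name = {r.name: r for r in records}
    hops, seen, current = [], set(), start
    while current in by_name and current not in seen:
        seen.add(current)  # guards against CNAME loops
        rec = by_name[current]
        hops.append(rec)
        if rec.rtype != "CNAME":
            break
        current = rec.target
    return hops


ANSWER = """\
;; ANSWER SECTION:
media.metalhead.club. 1800 IN CNAME metalheadclub-media.cdn.650thz.de.
metalheadclub-media.cdn.650thz.de. 21600 IN CNAME s3.650thz.de.
s3.650thz.de. 3600 IN A 5.1.72.141
"""

chain = follow_chain(parse_answer(ANSWER), "media.metalhead.club.")
cname_hops = sum(1 for r in chain if r.rtype == "CNAME")
final_ip = chain[-1].target
print(cname_hops, final_ip)
```

For this answer section the sketch counts two CNAME hops before it reaches the A record.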
For example, resolving just the second line of the answer section means: first query a DNS root server for .de, then query the .de nameserver for 650thz.de, then query the 650thz.de nameserver for cdn, and finally query the cdn nameserver for metalheadclub-media. Only then is it clear that s3.650thz.de must be resolved next, and the game starts over until the IP address is finally determined.
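The walk from the root down to the responsible zone can be sketched in a few lines of Python. This is a simplified model and not a real resolver: `delegation_path` is a hypothetical helper, the set of zone apexes is typed in by hand from the description above, and caching of already known nameservers is ignored.

```python
def delegation_path(name, zone_apexes):
    """Return the zones that get queried, from the root down, to resolve 'name'."""
    labels = name.rstrip(".").split(".")
    path = ["."]  # every uncached resolution starts at the root
    # Walk from the rightmost label to the left and collect every zone cut.
    for i in range(len(labels) - 1, -1, -1):
        candidate = ".".join(labels[i:]) + "."
        if candidate in zone_apexes:
            path.append(candidate)
    return path


# Zones as described in this article (entered by hand, not discovered via DNS).
ZONES = {"de.", "650thz.de.", "cdn.650thz.de."}

cdn_path = delegation_path("metalheadclub-media.cdn.650thz.de.", ZONES)
s3_path = delegation_path("s3.650thz.de.", ZONES)
print(cdn_path)
print(s3_path)
```

In practice a resolver remembers the nameservers for .de and 650thz.de from the first walk, so the second walk for s3.650thz.de is shorter than this model suggests.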
It should be clear that this chain of two CNAME entries takes time for the user's DNS resolver to work through. For regional users who are close to these name servers, the effort is not so significant, as resolving each link in the chain usually takes no more than a few tens of milliseconds.
However, the situation is different for users who (as in the case of metalhead.club) have to access name servers in Europe from the US or Australia, for example. This is because:
- Root server: Global
- .de name server: Global
- 650thz.de: Regional, EU
- cdn.650thz.de: Regional, EU
My chain involves four zones on four (virtual) name servers. Each one must be queried individually, and only two of them are globally distributed, meaning they can be reached from anywhere in the world with minimal latency. The regional servers can only be reached from outside the EU with significantly higher latency. Because a US user has to query two regional name servers that are "slow" from their perspective, the added latencies pile up very quickly.
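A rough back-of-the-envelope model shows how the latencies add up. The zone list comes from the article, but the round-trip times in `RTT_MS` are invented placeholder values for illustration only, not measurements, and the model assumes exactly one query per zone with no retries or caching.

```python
# Zone and scope as listed in this article.
ZONES = [
    (".", "global"),
    ("de.", "global"),
    ("650thz.de.", "regional"),
    ("cdn.650thz.de.", "regional"),
]

# Hypothetical round-trip times in milliseconds (assumed, not measured).
RTT_MS = {
    "eu": {"global": 10, "regional": 15},
    "overseas": {"global": 10, "regional": 280},
}


def uncached_lookup_ms(client):
    """Sum one round trip per zone for a client in the given region."""
    return sum(RTT_MS[client][scope] for _zone, scope in ZONES)


eu_ms = uncached_lookup_ms("eu")
overseas_ms = uncached_lookup_ms("overseas")
print(eu_ms, overseas_ms)
```

With these assumed numbers, the two regional zones account for almost the entire overseas lookup time, which is why shortening the chain helps remote users the most.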
From the perspective of a US user or a user from Asia, it is therefore particularly important that resolution chains are kept as short as possible. This becomes particularly clear when using a tool such as globalping.io and looking at the global resolution times in DNS mode:
While EU users get media.metalhead.club (media-old.metalhead.club in the screenshot) resolved in about 0.004–0.070 seconds, someone in Japan has to wait a lot longer: about 1.5 seconds! Only then can their browser start connecting to the final destination server.
Of course, this only applies to the first DNS query after the TTL has expired. Subsequent queries are usually answered by the DNS resolver from its cache, so they are answered orders of magnitude faster. After the TTL expires, though, the first request takes another 1.5 seconds! And given the small number of users in some regions, I cannot assume that they share a common DNS resolver whose cache another user has already warmed up. Therefore, it is important to me that uncached requests are also answered quickly.
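The caching behaviour can be illustrated with a toy TTL cache in Python. This is only a sketch of the principle, not how any particular resolver is implemented: `TtlCache`, `resolve`, and `fake_slow_lookup` are hypothetical names, time is passed in as plain seconds, and the TTL of 1800 is taken from the first record in the dig output above.

```python
class TtlCache:
    """Tiny cache that forgets entries once their TTL has run out."""

    def __init__(self):
        self._entries = {}  # name -> (value, expires_at)

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry and now < entry[1]:
            return entry[0]
        return None

    def put(self, name, value, ttl, now):
        self._entries[name] = (value, now + ttl)


def resolve(name, now, cache, slow_lookup):
    """Answer from the cache if possible, otherwise do the slow upstream lookup."""
    cached = cache.get(name, now)
    if cached is not None:
        return cached, "cache"
    value, ttl = slow_lookup(name)
    cache.put(name, value, ttl, now)
    return value, "upstream"


calls = []


def fake_slow_lookup(name):
    """Stand-in for the full uncached resolution; records each call."""
    calls.append(name)
    return "5.1.72.141", 1800


cache = TtlCache()
first = resolve("media.metalhead.club.", 0, cache, fake_slow_lookup)
second = resolve("media.metalhead.club.", 100, cache, fake_slow_lookup)
third = resolve("media.metalhead.club.", 1800, cache, fake_slow_lookup)
print(first, second, third)
```

Only the first lookup and the one after expiry hit the slow path. If every user sits behind a different resolver, each of those caches starts out empty and pays the full price on its own.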
So I decided to do without CNAME chains altogether. And this is how it worked:
- In the metalhead.club zone, I delegated the subdomain media.metalhead.club directly to Scaleway’s GeoIP nameserver.
- The Scaleway nameserver then resolves directly to an IP address that matches the user’s region.
This means that CNAMEs are no longer involved in the resolution: the entire resolution is now based solely on zone delegation.
The whole process could be improved further by taking the following measures:
- No separation of “normal” DNS nameserver (Core-Networks.de) and GeoIP nameserver (Scaleway): This would eliminate the need for the “media” zone delegation.
- Global accessibility of all nameservers (however, this is a significant cost factor and also requires a new DNS provider).
But even without these two improvements, I was able to significantly reduce resolution time simply by dispensing with CNAMEs, as this screenshot shows:
That’s much better: after dropping the CNAMEs, the resolution of media.metalhead.club now takes only around 0.7 seconds instead of 1.9 seconds!
By the way: for technical reasons, the main domain metalhead.club resolves to IP addresses anyway. No optimization was necessary here.