The Domain Name System (DNS) is one of the most important protocols on the Internet. In fact, it’s often referred to as ‘the phonebook of the Internet’, although most DNS experts aren’t huge fans of this description.
Essentially, DNS is a decentralised directory.
Its most familiar job is the lookup, where a DNS resolver (often exposed through a ‘DNS lookup’ tool) resolves an individual host name to an IP address.
Say, for instance, you want to visit F5’s website. Rather than memorising the numeric IP address of F5’s web server, you can just type “www.f5.com” into your web browser, and DNS will provide your operating system with the correct IP address.
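To make the lookup concrete, here is a minimal Python sketch that asks the operating system’s resolver for the addresses behind a host name. The helper name `resolve` is made up for this illustration, and the addresses you get back will depend on your network and on the DNS answers served at the time.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the unique IP addresses the OS resolver reports for hostname."""
    # getaddrinfo hands the name to the system resolver, which in turn
    # consults DNS (or local sources such as the hosts file).
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses: list[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling `resolve("www.f5.com")` on a connected machine should return one or more IPv4 or IPv6 addresses. An address literal such as `"127.0.0.1"` is returned unchanged, without any DNS query.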
Historically, DNS is an interesting beast. It is an old protocol, and the publication of the first Requests for Comments (RFC 882 and RFC 883) dates back to 1983. DNS has evolved since then, picking up the pace in recent years with the introduction of DNS-over-TLS (DoT) and DNS-over-HTTPS (DoH). DoT wraps DNS traffic in Transport Layer Security (TLS) encryption. DoH is an alternative to DoT whereby DNS queries and responses are encrypted and sent over HTTPS (instead of unencrypted over UDP). Like DoT, DoH makes it much harder for attackers to forge or alter DNS traffic between the client and its resolver (for example, in a man-in-the-middle attack). And, by encrypting DNS traffic, it keeps the names you look up hidden from anyone observing the network.
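As a rough illustration of what “DNS over HTTPS” means on the wire, the sketch below builds an ordinary DNS query in binary form and encodes it into a DoH GET URL in the style described by RFC 8484 (base64url, no padding, in a `dns` parameter). The function names and the server URL are placeholders chosen for this example, not any particular provider’s endpoint, and the sketch stops short of actually sending the request.

```python
import base64
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0) -> bytes:
    """Encode a single-question DNS query in wire format (qtype 1 = A record)."""
    # Header: ID, flags (only "recursion desired" set), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"invalid label length in {name!r}")
        qname += bytes([len(raw)]) + raw  # each label is length-prefixed
    qname += b"\x00"  # the root label terminates the name
    return header + qname + struct.pack("!HH", qtype, 1)  # class 1 = IN


def doh_get_url(server: str, name: str) -> str:
    """Return a DoH GET URL carrying the query as unpadded base64url."""
    message = build_query(name)  # ID 0, which RFC 8484 suggests for HTTP caching
    encoded = base64.urlsafe_b64encode(message).rstrip(b"=").decode("ascii")
    return f"{server}?dns={encoded}"
```

For instance, `doh_get_url("https://dnsserver.example.net/dns-query", "www.example.com")` yields a URL ending in `?dns=AAABAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB`. The DNS message itself is unchanged from classic DNS; only the transport around it differs.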
It’s always DNS!
If you work in IT, chances are you’ve heard somebody exclaim that “it’s always DNS”, usually when something has gone awry. When DNS has hiccups, people notice immediately.
The fact is that DNS is often taken for granted. Once it’s up and running, people tend to forget about it. That neglect extends to performance monitoring and even to how the infrastructure was built out in the first place. It is not an ideal situation, and it means that mistakes made at the outset can cause considerable operational issues over time.
Common setup mistakes include:
- running all DNS servers in the same location, so that an outage at that location takes DNS down with it,
- running DNS infrastructure from a single network (one Autonomous System, identified by its ASN), so that a network issue also leads to a DNS outage,
- not applying software diversity (i.e., running the same software across all DNS servers, so that a single bug can impact every server at the same time).
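The three mistakes above boil down to the same question: is there anything that every name server has in common? A small sketch of that check, assuming you keep a simple inventory of your servers, could look like this. The `NameServer` record, its fields, and the sample values are hypothetical; the ASNs used in the usage note come from the range reserved for documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameServer:
    hostname: str
    location: str   # e.g. a data centre or city code
    asn: int        # Autonomous System Number the server is announced from
    software: str   # DNS server software in use


def single_points_of_failure(servers: list[NameServer]) -> list[str]:
    """Report every attribute that is identical across all servers."""
    findings = []
    for attribute in ("location", "asn", "software"):
        values = {getattr(server, attribute) for server in servers}
        if len(values) == 1:
            findings.append(
                f"all servers share the same {attribute}: {values.pop()}"
            )
    return findings
```

Given two servers in the same location and on ASN 64496 but running different software, the function flags the shared location and the shared ASN and stays quiet about software. An empty result does not prove the setup is resilient; it only means none of these three obvious overlaps is present.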
Battling against the outages
Recent DNS outages making the headlines include those experienced by Akamai (July 2021) and Cloudflare (July 2020). The latter resulted in a network-wide outage, which also impacted DNS.
Unfortunately, DNS outages will always happen, no matter how redundant systems are. Whether caused by routing issues, software bugs or human error, it is almost impossible to guarantee that a system will always be up and running.
One of the best ways to stop an outage at a single provider from taking your DNS down with it is to use multiple DNS providers.
Fortunately, it is a straightforward thing to do. The DNS protocol has built-in mechanisms that enable the easy addition of “Secondary DNS services” via zone transfers (the process of copying the contents of the zone file on a primary DNS server to a secondary DNS server). This means that, whenever a change is made at your main provider, it sends a notification (NOTIFY) message to your secondary provider(s), which in turn request the latest changes through a zone transfer.
These mechanisms are standard, so you can use most DNS providers as your secondary option (provided they support NOTIFY and zone transfers, which is the case with most providers).
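How does a secondary know whether there is anything new to fetch? After a NOTIFY, it typically looks up the zone’s SOA record on the primary and compares the serial number with its own copy, transferring the zone only if the primary’s serial is newer. Because serials are 32-bit counters that can wrap around, the comparison follows the serial number arithmetic of RFC 1982. The sketch below shows that comparison only; the function names are illustrative, and real DNS servers handle all of this for you.

```python
HALF_RANGE = 2 ** 31  # half of the 32-bit serial number space


def serial_is_newer(candidate: int, current: int) -> bool:
    """True if candidate is 'greater than' current under RFC 1982 rules."""
    if candidate == current:
        return False
    return (
        (current < candidate and candidate - current < HALF_RANGE)
        or (current > candidate and current - candidate > HALF_RANGE)
    )


def needs_transfer(primary_serial: int, secondary_serial: int) -> bool:
    """Decide whether the secondary should request a zone transfer."""
    return serial_is_newer(primary_serial, secondary_serial)
```

With date-style serials, `needs_transfer(2024010102, 2024010101)` is true and the secondary pulls the change, while equal serials mean there is nothing to do. The wrap-around case also works: a primary serial of 5 counts as newer than a secondary serial of 4294967290.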
In addition to creating a Plan B on the server front, having another DNS provider in play can unlock a range of other benefits, including:
- Software diversity. Provider B will likely use different DNS software than provider A. If a bug hits A, it (hopefully) won’t affect B.
- Network redundancy. Providers serve DNS requests out of their own network. This means that, even if the DNS servers themselves are healthy, a network outage will make them unreachable. Having a second DNS provider on a different network (a different Autonomous System, with its own ASN) helps to mitigate that possibility.
- Latency. Low latency is critical for fast DNS answers. However, some networks have better latencies than others in specific regions. Having another DNS provider in the mix can help ensure optimal latency across the globe.
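The latency point is easy to see with a toy calculation. Assuming you have measured response times per region for each provider, and assuming resolvers tend to favour the faster-responding name server (many do, though it is not guaranteed), the best case per region is simply the minimum across providers. The provider names and numbers in the usage note are invented for illustration.

```python
def best_latency_per_region(
    measurements: dict[str, dict[str, float]],
) -> dict[str, tuple[str, float]]:
    """Map each region to the (provider, latency_ms) pair with the lowest latency.

    measurements has the shape {provider: {region: latency_ms}}.
    """
    best: dict[str, tuple[str, float]] = {}
    for provider, by_region in measurements.items():
        for region, latency_ms in by_region.items():
            if region not in best or latency_ms < best[region][1]:
                best[region] = (provider, latency_ms)
    return best
```

If a hypothetical provider A answers in 12 ms in Europe and 95 ms in Asia-Pacific, while provider B answers in 30 ms and 40 ms respectively, the combination gives you A’s 12 ms in Europe and B’s 40 ms in Asia-Pacific, which neither provider offers on its own.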
Don’t wait for the next outage before you act. It is easier and faster than you think to reinforce your operations with a second DNS provider.