
Monitoring DNS Availability

Pretty much all Internet communication relies on the availability of the DNS. If the DNS is down, the Internet seems down.
DNSmonitor checks every Internet-facing authoritative DNS server for the monitored domain, making sure that each one responds to queries for the domain.


All connections between computer hosts, with very few exceptions, are preceded by a DNS lookup that tells the connecting host which IP address to connect to. If the DNS lookup fails, no connection will be attempted. That is why DNS availability is key!

In today's world, where everything relies on fast connections and short load times, you need to make sure that the DNS is not the cause of any delay.

The DNS protocol was designed back in the 1980s. Even though the protocol has evolved over the years, the availability and load balancing mechanisms remain virtually untouched.

Resolver cache

To limit the number of queries directed to authoritative servers (saving bandwidth and increasing speed), a DNS resolver will always try to respond with cached data. The resolver iteration process and the resolver cache are explained and illustrated in the DNS Tutorial part 1, so instead of repeating myself, I'll point you there if you need a refresher on the subject.

A resolver will only query an authoritative server if it can't find the response in its cache. On a normally utilised resolver, most query responses will originate from the resolver cache. Let us not forget that cached data may exist on the local machine as well.
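
The cache-first behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how any particular resolver is implemented. The class name TtlCache, the lookup callback standing in for the query to an authoritative server, and the injectable clock are all assumptions made for the example.

```python
import time


class TtlCache:
    """Toy resolver cache: an answer is reused until its TTL has expired."""

    def __init__(self, lookup, clock=time.monotonic):
        self._lookup = lookup    # hypothetical stand-in for an authoritative query
        self._clock = clock      # injectable so expiry can be exercised in tests
        self._entries = {}       # (name, rtype) -> (expires_at, answer)

    def resolve(self, name, rtype="A"):
        """Return (answer, from_cache)."""
        key = (name.lower(), rtype)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and cached[0] > now:
            return cached[1], True           # still fresh: served from cache
        answer, ttl = self._lookup(name, rtype)
        self._entries[key] = (now + ttl, answer)
        return answer, False                 # had to ask the authoritative side
```

As long as an entry is fresh, the lookup callback is never called, which is exactly why an outage on the authoritative side can stay hidden for a while.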

With this in mind, even if you can reach your intended host, the authoritative DNS server or servers for that host or domain could actually be down. You wouldn't know for sure until the cached data in your stub and recursive resolvers has expired at the end of the record TTL. The only way to know for sure whether the authoritative DNS servers are available is to query them directly and bypass cached data altogether.
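
As a rough sketch of what "query them directly" can look like, the Python below builds a DNS query by hand with the recursion-desired bit cleared and sends it over UDP straight to one server. The function names, the fixed transaction ID, the 2-second timeout, and the IPv4-only socket are assumptions chosen for brevity. A production check would also randomise the ID, verify the reply's source address, and handle truncated replies.

```python
import socket
import struct


def build_query(name, qtype=1, txid=0x1234):
    """Build a minimal DNS query (QTYPE 1 = A) with the RD bit cleared."""
    # Header: ID, flags (all zero = standard query, no recursion), QDCOUNT=1.
    header = struct.pack("!HHHHHH", txid, 0x0000, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in labels
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN


def is_authoritative_answer(response, txid=0x1234):
    """True if the reply matches our ID, is a response, has AA set and RCODE 0."""
    if len(response) < 12:
        return False
    rid, flags = struct.unpack("!HH", response[:4])
    is_response = bool(flags & 0x8000)   # QR bit
    is_authoritative = bool(flags & 0x0400)   # AA bit
    rcode = flags & 0x000F
    return rid == txid and is_response and is_authoritative and rcode == 0


def check_server(server_ip, name, timeout=2.0):
    """Query one authoritative server directly, bypassing every cache."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server_ip, 53))
            data, _ = sock.recvfrom(512)
        except OSError:                  # covers timeouts and network errors
            return False
    return is_authoritative_answer(data)
```

Calling check_server once for each address of each authoritative nameserver gives a per-server view that no cached answer can mask.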

DNS Availability

Two or more authoritative DNS servers are normally used for redundancy, to mitigate risks and enable load balancing. This in turn raises another issue: how do resolvers know whether a server is available or not? Remember that the DNS protocol for the most part uses UDP as transport and doesn't rely on heartbeats the way other modern network equipment does (unless you have built your DNS infrastructure behind load balancers)!

When the resolver iterates through the domain infrastructure to resolve a query, it receives and builds a list of nameservers authoritative for the relevant domains. The servers are listed in random order, and the resolver will query the first nameserver in that list. If that server is unresponsive for any reason, the resolver will proceed with a query to the next server in the list, and so on until a response is received or the list is exhausted.
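
The try-the-next-server loop described above can be sketched as follows. This is a simplified model, not any vendor's actual algorithm. The function name resolve_with_failover and the query callable (which is assumed to raise TimeoutError when a server does not respond) are hypothetical.

```python
import random


def resolve_with_failover(nameservers, query, rng=random):
    """Try each nameserver in random order until one of them answers.

    `query(server)` is a hypothetical callable that returns an answer or
    raises TimeoutError when the server does not respond.
    Returns (answer, servers_tried); answer is None if every server failed.
    """
    candidates = list(nameservers)
    rng.shuffle(candidates)              # the list is used in random order
    tried = []
    for server in candidates:
        tried.append(server)
        try:
            return query(server), tried
        except TimeoutError:
            continue                     # unresponsive: move on to the next one
    return None, tried                   # list exhausted without a response
```

Every entry in the returned list except the last successful one represents a full timeout that the client had to sit through before getting its answer.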

This is where things get interesting. The DNS protocol doesn't really go into detail on how this failover should work, only that it should work. Much has been left to the vendors to figure out, each in its own implementation. Some resolver vendors wait only a couple of seconds for a response before continuing to the next server in the list, while others leave it to the TCP/IP stack to work it out, usually with a delay of 30 seconds or more before continuing. A delay of 30 seconds or more caused by a non-responsive DNS server is simply not OK. In an interactive session, a user would definitely click on the next Google link. If you run a web shop or similar, this would probably mean a lost customer.

DNSmonitor makes a huge difference in this area! Because we monitor the publicly available authoritative DNS servers for a customer domain, we will find out if a server is unresponsive or prone to time-outs. This is processed and reported to the monitor dashboard. And since we aren't biased by providing the actual DNS service, we simply provide the information in a neutral fashion.

Internet view

Most organisations use a split-horizon setup where the internal and external DNS name spaces are different and usually reside on different DNS servers. Problems with the external name space and DNS infrastructure can be hard to detect unless monitored closely. On the internal network you can count on your users detecting any DNS or resolver problems before your monitoring system does. On the Internet you normally don't have that luxury...