Visualising DNS Issues with DNSMON
The user experience of an Internet service depends partly on the availability and speed of the Domain Name System (DNS). DNS operators continually need to identify and solve problems that can be located at the end user, a name server, or somewhere in between. In this paper, we show how DNSMON, a production service for measuring and comparing the availability and responsiveness of key name servers, correlates and visualises different types of measurements collected by RIPE Atlas vantage points worldwide.
The Internet Domain Name System (DNS) provides a mapping from user-visible domain names to other identifiers, such as network layer addresses. The performance of most Internet services can be perceptibly influenced by the quality and responsiveness of the DNS. Since it is a distributed system, its performance depends in turn on a number of elements, such as the authoritative name servers, caching name servers, local resolvers and the network in between. Due to the importance of this infrastructure, there have been many monitoring projects collecting data about different aspects of DNS, such as stability, security, performance, and traffic. Some of these projects also provide visualisations for administrators and decision makers — especially the DNS operators at various levels.
The DNSMON visualisation gives operators a quick overview of DNS performance
DNSMON is a monitoring project started in 2001 to actively measure authoritative DNS servers at the root and top-level domain (TLD) level, from a large enough number of vantage points to reliably identify issues at, or close to, the servers themselves. DNSMON aims to constantly monitor all the name servers belonging to entire zones — considered strategic for the functioning of the whole Internet — through performance measurements.
It was initially conceived as a response to claims that root name servers performed poorly. Such claims were often based on measurements from one or — at most — a handful of vantage points, and were thus heavily influenced by network performance on a small number of network paths. The first implementation of DNSMON was based on a small process running on the nodes of the RIPE NCC's Test Traffic Measurement (TTM) network. This process executed DNS queries and reported response times to a central server that stored them in round-robin database (RRD) files. The project has proven its usefulness among operators and has evolved over time. The system's functionality has now been incorporated into the RIPE Atlas network measurement project.
Although this tool mostly targets network operators, the services they offer are important enough that all users of the wider Internet benefit when operators can analyse and improve quality of service for diverse locations around the world.
Example Use Case
A typical use of DNSMON is the following: Suppose that the DNS resolution of a zone exhibits packet loss or high latency. Is there a specific name server involved in the problem? Is the problem bound to a geographical region? How can we exclude the possibility that the problem is related to a malfunction of the monitoring system? Moreover, how did the monitoring system reach the name server before and after the issue and what DNS responses were received?
DNSMON offers an interactive view, both historic and near real-time and at different levels of detail, that helps answer just such questions. It has revealed many operational issues, including less obvious ones, and made it possible to analyse them.
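To make the questions in the use case more concrete, here is a minimal sketch of the kind of grouping that lets one tell a server problem apart from a regional or vantage-point problem. It is not DNSMON's implementation: the `Measurement` record, the region labels, the `loss_rate_by` and `suspects` helpers, the 0.6 threshold and the sample data are all assumptions made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, Iterable, List, Optional


@dataclass(frozen=True)
class Measurement:
    """One DNS query result (hypothetical record layout, not the DNSMON schema)."""
    probe_id: int              # vantage point that sent the query
    region: str                # illustrative region label for that vantage point
    server: str                # name server that was queried
    rtt_ms: Optional[float]    # response time; None means no reply was received


def loss_rate_by(measurements: Iterable[Measurement],
                 key: Callable[[Measurement], Hashable]) -> Dict[Hashable, float]:
    """Fraction of unanswered queries, grouped by an arbitrary key."""
    sent: Dict[Hashable, int] = defaultdict(int)
    lost: Dict[Hashable, int] = defaultdict(int)
    for m in measurements:
        group = key(m)
        sent[group] += 1
        if m.rtt_ms is None:
            lost[group] += 1
    return {group: lost[group] / sent[group] for group in sent}


def suspects(rates: Dict[Hashable, float], threshold: float = 0.6) -> List[Hashable]:
    """Groups whose loss rate reaches the (arbitrary, assumed) threshold."""
    return sorted(group for group, rate in rates.items() if rate >= threshold)


# Invented sample data: four vantage points querying two name servers.
samples = [
    Measurement(1, "EU", "a.example", 20.0), Measurement(1, "EU", "b.example", None),
    Measurement(2, "AS", "a.example", 80.0), Measurement(2, "AS", "b.example", None),
    Measurement(3, "EU", "a.example", 22.0), Measurement(3, "EU", "b.example", None),
    Measurement(4, "AS", "a.example", 85.0), Measurement(4, "AS", "b.example", 90.0),
]

by_server = loss_rate_by(samples, key=lambda m: m.server)    # one name server?
by_region = loss_rate_by(samples, key=lambda m: m.region)    # one region?
by_probe = loss_rate_by(samples, key=lambda m: m.probe_id)   # one vantage point?

print("servers:", suspects(by_server))
print("regions:", suspects(by_region))
print("probes: ", suspects(by_probe))
```

In this invented data set only the per-server grouping crosses the threshold, which points at one name server and not at a region or a single misbehaving vantage point. Grouping by vantage point is a rough way to address the question about the monitoring system itself: a probe that loses queries to every server it measures is more likely to have a local problem.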
The paper “Visualization and Monitoring for the Identification and Analysis of DNS Issues” was presented at the Tenth International Conference on Internet Monitoring and Protection in Brussels in June 2015.