Two new DNS measurements, both designed to assess the impact of issues with one or more root name servers, have been added to every RIPE Atlas probe.
As you may be aware, DNSMON and RIPE Atlas built-in measurements are often used when analysing outages and other issues with important DNS services, particularly the root name servers. These measurements do not directly indicate the impact on actual end-user experience, because they query authoritative name servers directly. They therefore ignore two things that shield end users from outages: the redundancy of having multiple authoritative name servers, and caching by local recursive resolvers.
To improve upon our DNS monitoring, we have initiated two built-in DNS measurements from all RIPE Atlas probes that use the probes' default resolvers instead of authoritative name servers. These measurements should complement each other and existing RIPE Atlas DNS measurements to allow extra analysis of important DNS events.
The first measurement queries for random top-level domains with the intention of avoiding caches and therefore hitting at least one root name server. The second measurement queries for popular domain names, with the intention of hitting caches where appropriate and getting an idea of the true impact on users.
Random domains
The first measurement, which has ID 30001, deliberately bypasses DNS resolver caches to explicitly measure the availability of root servers. Probes query for an A record for a domain called <probe_id>.atlas.ripe.net.<random_string>. The intention of this measurement is to make it possible to see the effect on the root name server infrastructure as a whole if a limited number of root server letters are affected by availability issues.
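To illustrate the naming scheme described above, here is a minimal sketch of how such a query name could be assembled. The function name, the length of the random string and its alphabet are assumptions made for illustration; the article only specifies the overall pattern <probe_id>.atlas.ripe.net.<random_string>.

```python
import secrets
import string


def random_tld_query_name(probe_id: int, length: int = 12) -> str:
    """Build a query name of the form <probe_id>.atlas.ripe.net.<random_string>.

    The random string is the rightmost label, so it plays the role of a
    (non-existent) top-level domain. The length and the lowercase-letter
    alphabet are illustrative assumptions, not the probes' real settings.
    """
    alphabet = string.ascii_lowercase
    random_label = "".join(secrets.choice(alphabet) for _ in range(length))
    return f"{probe_id}.atlas.ripe.net.{random_label}"


if __name__ == "__main__":
    # 1234 is a made-up probe ID used only for this example.
    print(random_tld_query_name(1234))
```

Because the rightmost label is freshly generated for each query, a resolver is unlikely to have an answer for it in its cache and has to ask a root name server, which is what makes the measurement sensitive to root server availability.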
Popular domains
As a starting point for the second measurement, ID 30002, we will use a recent snapshot of the global top 50 visited sites according to the Alexa top sites list, which combines unique visitors and individual page views in its calculations. Probes will cycle through the domains on this list, querying for a different A record every 10 minutes. This list could be changed in the future to reflect changes in end-user usage or to use a different data source. In the event of a major DNS outage, this measurement would provide a fairly good approximation of end-user impact, because it is sensitive both to the redundancy of the root name server system and to the caches of DNS resolvers.
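The cycling behaviour can be sketched as follows. This is only an illustration under stated assumptions: the article does not describe how probes schedule their queries, so the time-slot approach, the function name and the placeholder domains below are all hypothetical.

```python
from typing import Sequence

# Interval between queries, as stated in the article: 10 minutes.
INTERVAL_SECONDS = 10 * 60


def domain_for_time(domains: Sequence[str], epoch_seconds: int,
                    interval: int = INTERVAL_SECONDS) -> str:
    """Pick the domain to query in the time slot containing epoch_seconds.

    Each slot of `interval` seconds maps to the next domain on the list,
    wrapping around at the end. This slot-based scheme is an assumption,
    not the probes' documented behaviour.
    """
    slot = epoch_seconds // interval
    return domains[slot % len(domains)]


if __name__ == "__main__":
    # Placeholder names standing in for the real list of 50 domains.
    placeholder_domains = ["example.com", "example.net", "example.org"]
    for t in range(0, 4 * INTERVAL_SECONDS, INTERVAL_SECONDS):
        print(t, domain_for_time(placeholder_domains, t))
```

With 50 domains and one query every 10 minutes, a single probe needs 500 minutes, a little over eight hours, to work through the whole list once.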
Comments 2
Paul Hoffman •
Issuing requests for the top 50 sites from our Atlas probes could open us to some privacy issues due to the content at some of those sites. For example, today pornhub.com is #53, but I'm pretty sure it was higher in the list just a few years ago. It could be difficult to explain to one's boss why the IT Department's trace of your office internet access was accessing such a site. Please consider making the list of the 50 not based on "end-user usage" but on what names might not get us in trouble.
Chris Amin •
Hi Paul, just replicating the response that I sent via email for the benefit of anybody else who reads the article: Sorry, I should have referred to the actual list of domains in the article. I sent a mail to the RIPE Atlas and MAT-WG mailing lists last month (link below*) proposing the list of domains and asking for any objections. Indeed pornhub or other porn sites are not in the top 50, but in any case I explicitly excluded such sites from the 25 "backup" options in case any of the top 50 were to be excluded for political or other reasons. Rest assured, we had these kinds of concerns in mind when setting up these measurements, and the list is fully manual and domains won't automatically float into it, but please do let us know if you have any further concerns. Kind regards, Chris * https://www.ripe.net/ripe/mail/archives/mat-wg/2017-February/000721.html