
The DNS Server That Lagged Behind
• 3 min read
Around the end of October and beginning of November 2024, twenty six African TLDs had a technical problem - one of their authoritative name servers served stale data. This is a tale of monitoring, anycast, and debugging.
For Orange (AS 3215), it does not seem solved yet. A lot of timeouts: % blaeu-resolve -r 1000 --as 3215 --nameserver 1.1.1.1 --type AAAA --displayvalidation www.bortzmeyer.org Nameserver 1.1.1.1 [ (Authentic Data flag) 2001:4b98:dc0:41:216:3eff:fe27:3d3f 2605:4500:2:245b::42] : 129 occurrences [TIMEOUT(S)] : 84 occurrences [2001:4b98:dc0:41:216:3eff:fe27:3d3f 2605:4500:2:245b::42] : 1 occurrences Test #12196625 done at 2018-04-17T12:38:13Z
The tools presented here have been superseded by Blaeu, available at https://framagit.org/bortzmeyer/blaeu and documented at https://labs.ripe.net/Members/stephane_bortzmeyer/creating-ripe-atlas-one-off-measurements-with-blaeu
It seems a very good and useful feature. But calling it "tags" is a bad idea since it already means something else for RIPE Atlas :-(
The link to "RIPE Atlas API v2 manual" goes to atlas-ui-dev.atlas.ripe.net, which times out. (It is probably an internal site.)
@emileaben Rather than standardizing human-readable output format, why not emitting a standard structured format, separating the network part (traceroute) and the visualisation part (a tool using the structured output format). Such a format already exists, in RFC 5388. I let you do the same in JSON :-)
Nice tool, thanks, I recommend it. But a bit rough in both usage (documentation could be improved) and in high-level explanations.
Another reference which may be useful is the paper "Ethical issues in research using datasets of illicit origin" by Daniel R. Thomas, Sergio Pastrana, Alice Hutchings , Richard Clayton and Alastair R. Beresford https://dl.acm.org/citation.cfm?id=3131389 (PDF in https://www.cl.cam.ac.uk/~drt24/papers/2017-ethical-issues.pdf or on Sci-Hub) It is mostly about leaked datasets (such as the Patreon database) but it talks also about active network measurements (such as the Carna scan).
“Thanks Stéphane for your feedback. I refer to TTL violations as in [1] , which is when a resolver " overrides the TTL value" . In regardless if is increased or decreases; just different from what the authoritative returns. So in this context , violation is not protocol violation, is the violation or changing the original TTL value provided by the authoritative. thanks, /gio [1] https://dl.acm.org/citation.cfm?doid=3143361.3143375”
I still disagree with the term: first, a resolver does not always talk with an authoritative name server, it may talk to an upstream resolver a forwarder), and so receive a smaller TTL. Also, all DNS implementations have an upper bound for TTLs (sometimes configurable, as with BIND and Unbound). Is it a "violation" to cap a one-month TTL (seen in the wild) to one week?
Thanks for these very interesting measurements. Really useful. But I disagree with your use of the same term ("TTL violations") for the increase and the decrease of the TTL. A TTL is a *maximum*. A resolver is always free to keep the data for a *shorter* time, for instance because it reboots, or because the cache is full and it has to evict some data. It's only the increase of the TTL which is a protocol violation. Decreasing the TTL, like Amazon does systematically, is bad manners, it transfers costs to someone else, it is selfish, but it is not a protocol violation.
“With the stubby config above I only receive a FORMERR $ dig @::1 -p 8053 A www.catstuff.com ; <<>> DiG 9.8.3-P1 <<>> @::1 -p 8053 A www.catstuff.com ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: FORMERR, id: 39788 ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;www.catstuff.com. IN A ;; Query time: 115 msec ;; SERVER: ::1#8053(::1) ;; WHEN: Tue Nov 28 07:06:35 2017 ;; MSG SIZE rcvd: 45”
This is apparently because a known ECS bug https://github.com/getdnsapi/getdns/issues/357 already fixed in the code repository.
Showing 61 comment(s)