
The DNS Server That Lagged Behind
• 3 min read
Around the end of October and beginning of November 2024, twenty six African TLDs had a technical problem - one of their authoritative name servers served stale data. This is a tale of monitoring, anycast, and debugging.
The tools presented here have been superseded by Blaeu, available at https://framagit.org/bortzmeyer/blaeu and documented at https://labs.ripe.net/Members/stephane_bortzmeyer/creating-ripe-atlas-one-off-measurements-with-blaeu
It seems a very good and useful feature. But calling it "tags" is a bad idea since it already means something else for RIPE Atlas :-(
The link to "RIPE Atlas API v2 manual" goes to atlas-ui-dev.atlas.ripe.net, which times out. (It is probably an internal site.)
@emileaben Rather than standardizing human-readable output format, why not emitting a standard structured format, separating the network part (traceroute) and the visualisation part (a tool using the structured output format). Such a format already exists, in RFC 5388. I let you do the same in JSON :-)
Nice tool, thanks, I recommend it. But a bit rough in both usage (documentation could be improved) and in high-level explanations.
Another reference which may be useful is the paper "Ethical issues in research using datasets of illicit origin" by Daniel R. Thomas, Sergio Pastrana, Alice Hutchings , Richard Clayton and Alastair R. Beresford https://dl.acm.org/citation.cfm?id=3131389 (PDF in https://www.cl.cam.ac.uk/~drt24/papers/2017-ethical-issues.pdf or on Sci-Hub) It is mostly about leaked datasets (such as the Patreon database) but it talks also about active network measurements (such as the Carna scan).
“Thanks Stéphane for your feedback. I refer to TTL violations as in [1] , which is when a resolver " overrides the TTL value" . In regardless if is increased or decreases; just different from what the authoritative returns. So in this context , violation is not protocol violation, is the violation or changing the original TTL value provided by the authoritative. thanks, /gio [1] https://dl.acm.org/citation.cfm?doid=3143361.3143375”
I still disagree with the term: first, a resolver does not always talk with an authoritative name server, it may talk to an upstream resolver a forwarder), and so receive a smaller TTL. Also, all DNS implementations have an upper bound for TTLs (sometimes configurable, as with BIND and Unbound). Is it a "violation" to cap a one-month TTL (seen in the wild) to one week?
Thanks for these very interesting measurements. Really useful. But I disagree with your use of the same term ("TTL violations") for the increase and the decrease of the TTL. A TTL is a *maximum*. A resolver is always free to keep the data for a *shorter* time, for instance because it reboots, or because the cache is full and it has to evict some data. It's only the increase of the TTL which is a protocol violation. Decreasing the TTL, like Amazon does systematically, is bad manners, it transfers costs to someone else, it is selfish, but it is not a protocol violation.
“With the stubby config above I only receive a FORMERR $ dig @::1 -p 8053 A www.catstuff.com ; <<>> DiG 9.8.3-P1 <<>> @::1 -p 8053 A www.catstuff.com ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: FORMERR, id: 39788 ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;www.catstuff.com. IN A ;; Query time: 115 msec ;; SERVER: ::1#8053(::1) ;; WHEN: Tue Nov 28 07:06:35 2017 ;; MSG SIZE rcvd: 45”
This is apparently because a known ECS bug https://github.com/getdnsapi/getdns/issues/357 already fixed in the code repository.
“DNSDB link mentioned is only for account holders. Can we get the information about previous usage of 9.9.9.9 from any other source, which don't need an account ?”
DNSDB is subscription-only. But there are some gratis "passive DNS" databases such as https://passivedns.cn/ or https://www.circl.lu/services/passive-dns/
Showing 60 comment(s)