RFC 9199 puts forward six considerations for large authoritative DNS server operators, each derived from peer-reviewed research. Giovane Moura from SIDN Labs walks us through the list of considerations and the experience he and his co-authors had with the RFC process.
With colleagues from the University of Southern California/Information Sciences Institute, we have co-authored RFC 9199, an IETF document in which we present considerations for large authoritative DNS server operators. The considerations are derived from six peer-reviewed academic papers that we published between 2016 and 2021, and which we have also used to further improve the security, resilience, and performance of the .nl ccTLD. In this article, we explain our six considerations in more detail.
Not all RFCs are born equal
What RFC 9199 is not: RFC 9199 is not an IETF consensus document, nor is it a protocol standard. It is an informational document: its sole purpose is to document our findings. Even though most people think of RFCs as protocol standards, that is only one category of RFC. Each RFC falls into one of several categories, including:
- Informational (as RFC 9199)
- Standard (as RFC 4033, which covers DNSSEC)
- Best current practices (as RFC 9210, which covers DNS transport over TCP and operational requirements)
- Experimental (as RFC 9057, which covers the e-mail author header field)
Each category has its own purpose. The drafts can come from various sources, referred to in the IETF context as streams:
- IETF – typically from a working group (such as DNSOP, which standardises protocols)
- Independent: any individual can submit a draft document outside the IETF stream
- IAB: documents published by the Internet Architecture Board
- IRTF: documents published by the Internet Research Task Force
From document draft to RFC
We began by submitting our draft document to the IETF DNSOP working group (IETF stream), which was back in March 2019. While feedback was neutral to positive, the working group (WG) rightfully concluded that they could not contribute to improving the document, given that the conclusions were derived from already peer-reviewed and published research works. An IETF WG is a suitable stream to use when designing a new protocol as we did previously, such as with RFC 6781 (DNSSEC Operational Practices). However, we found it’s not the right stream when presenting conclusions that have already been reached.
We therefore changed the stream of our draft from IETF to Independent. That implied a different evaluation path, on which the Independent Submission Editor chooses reviewers for the document, just as an academic journal editor chooses reviewers for an article. That is very different from the IETF stream, in which anyone in the IETF can comment on the draft.
After individual reviews and multiple iterations, our document was finally published as an Independent Submission with Informational status.
So, what is RFC 9199?
RFC 9199 is a summary of six measurement-based research works, intended to be more accessible for authoritative DNS server operators. The RFC presents our considerations (advice about what to do) upfront and refers the reader to the respective papers for details about what we have determined. The added value of RFC 9199 is that it summarises our research conclusions and frees the operator to use our findings without reading through multiple dense academic papers.
Consideration #1: Deploy anycast on every authoritative server
In 2015, many authoritative server operators, such as the Root Zone operators and .nl (SIDN), used a mixed setup of authoritative servers: some were unicast, others anycast. Anycast adoption was still progressing.
We have shown that if you are a large, global provider, it is in your best interest to use anycast on every single authoritative server for your zones, for two reasons:
- Anycast servers cope better with DDoS attacks than unicast counterparts.
- Anycast servers deliver far faster responses than unicast servers because clients can be served by nearby servers, rather than by one server at a single location. And resolvers will see all available authoritative servers – so unicast servers will impact them too.
Fast-forward to 2022, and in the meantime all root server operators have deployed IP anycast. Same for SIDN and the .nl zone. Anycast adoption has also grown for ccTLDs and other zones from 2017 to 2021. We cannot speak for all the root operators, but SIDN's migration of .nl to all-anycast services was certainly driven partly by our research work.
Consideration #2: Routing matters more than the number of sites for anycast
When comparing anycast DNS providers, one common metric is the number of global anycast locations each provider has. Typical anycast networks range from fewer than ten sites to more than 150 (L-Root, for example).
Intuitively, one might think that more instances are better and will always lead to shorter response times. That is not necessarily true, however. Proper route engineering can matter more than the total number of locations.
We showed that an anycast network with eight locations could deliver similar latency to clients as K-Root and L-Root, which had 33 and 144 locations respectively at the time of the study. So, routing optimization can be more important than the number of global locations, depending on your client population and query distribution. We also showed that poor routing can lead to anycast polarization, which causes large clients to see an anycast service as unicast (they reach only one location).
We therefore recommend that operators monitor the latency of their clients. That can be done passively, by measuring the round-trip times of DNS queries that arrive over TCP.
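As a minimal sketch of what such monitoring could look like, the Python snippet below summarises round-trip times per anycast site. It assumes the RTT samples have already been extracted from DNS-over-TCP traffic by some other tool; the site names, sample values and the 100 ms threshold are made up for illustration and do not come from our measurements.

```python
from collections import defaultdict
from statistics import median


def median_rtt_by_site(samples):
    """Group (site, rtt_ms) samples by site and return the median RTT per site."""
    by_site = defaultdict(list)
    for site, rtt_ms in samples:
        by_site[site].append(rtt_ms)
    return {site: median(rtts) for site, rtts in by_site.items()}


# Hypothetical samples: (anycast site, client RTT in milliseconds).
samples = [
    ("ams", 12.0), ("ams", 15.0), ("ams", 11.0),
    ("nrt", 180.0), ("nrt", 210.0), ("nrt", 175.0),
]

medians = median_rtt_by_site(samples)

# Flag sites whose clients see a median RTT above an (arbitrary) threshold.
slow_sites = sorted(site for site, rtt in medians.items() if rtt > 100.0)

print(medians)
print(slow_sites)
```

A site that serves clients at consistently high RTTs is a candidate for a closer look at its routing, rather than an automatic argument for adding more locations.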
Consideration #3: Compile anycast catchment maps to improve design
Suppose you have an operational anycast network and want to expand it by adding new locations. Now, depending on your peering/transit providers and your client population, adding a single location can lead to massive traffic changes – sites that had lots of traffic may end up receiving little.
It is very hard to predict such traffic changes. To avoid unwanted surprises, operators can measure what effect a proposed change will have on an adjacent network. To measure the effect, the operator needs to announce a test prefix on every single location operated on the production network. The prefix can then be used to establish where the IPv4 client population will be mapped to, using a tool we have developed called Verfploeter. Verfploeter sends ICMP echo requests (ping packets) to an IPv4 hitlist, and the responses go to individual anycast locations. The distribution of the responses received at a given location is that location's 'catchment'.
The impact of each new routing change can be measured by running Verfploeter repeatedly. One limitation of Verfploeter is that it does not currently support IPv6, as the IPv4 hitlists used are generated by means of frequent large-scale ICMP echo scanning, which is not possible using IPv6.
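To make the idea of a catchment concrete, here is a small Python sketch that turns two measurement runs into per-site shares and lists the sources that moved between them. It does not use Verfploeter's actual output format or interface: the input is assumed to be a plain list of (source address, receiving site) pairs, and the addresses and site names are illustrative.

```python
from collections import Counter


def catchment_shares(responses):
    """Return the fraction of responding sources that each site received."""
    counts = Counter(site for _source, site in responses)
    total = sum(counts.values())
    return {site: n / total for site, n in counts.items()}


def moved_sources(before, after):
    """Return {source: (old_site, new_site)} for sources whose site changed."""
    old, new = dict(before), dict(after)
    return {
        src: (old[src], new[src])
        for src in old
        if src in new and old[src] != new[src]
    }


# Hypothetical runs before and after adding a site called "fra".
before = [("192.0.2.1", "ams"), ("192.0.2.2", "ams"),
          ("192.0.2.3", "ams"), ("192.0.2.4", "lhr")]
after = [("192.0.2.1", "ams"), ("192.0.2.2", "fra"),
         ("192.0.2.3", "fra"), ("192.0.2.4", "lhr")]

shares_before = catchment_shares(before)
shares_after = catchment_shares(after)
moved = moved_sources(before, after)

print(shares_before)
print(shares_after)
print(moved)
```

In this made-up example the new site takes over half of the responders, all of them from "ams", which is the kind of shift that is hard to predict without measuring.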
Consideration #4: Employ two strategies when under stress
DDoS attacks are still a significant threat to DNS operators. How should a DNS operator engineer their anycast authoritative DNS server to respond to a DDoS attack?
We found out empirically that there are two main strategies for defending against a DDoS attack:
- Shift or block traffic. An operator can withdraw their routes, pre-prepend their AS route to some or all of their neighbors, perform other traffic-shifting tricks (such as reducing route announcement propagation using BGP communities), or communicate with their upstream network providers to apply filtering (potentially using FlowSpec) or the DDoS Open Threat Signaling (DOTS) protocol (RFC 8811, RFC 9132, RFC 8783). Such techniques shift both legitimate and attack traffic to other anycast instances (hopefully with greater capacity) or block traffic entirely.
- Absorb the attack. Alternatively, operators can let sites become degraded absorbers by continuing to operate them, knowing that they will drop some incoming legitimate requests due to queue overflow. However, that approach will also absorb attack traffic directed toward the catchment, hopefully protecting the other anycast instances.
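The trade-off between the two strategies can be illustrated with a deliberately simple toy model in Python. It is not taken from our papers: it assumes that an overloaded site drops queries uniformly at random, that withdrawing a site moves all of its traffic to one other site, and it uses invented capacities and loads (in queries per second).

```python
def served_legit_fraction(sites):
    """Fraction of legitimate queries answered across all sites."""
    served = total = 0.0
    for site in sites:
        load = site["legit"] + site["attack"]
        # Assumption: an overloaded site answers a uniform share of its load.
        accepted = 1.0 if load <= site["capacity"] else site["capacity"] / load
        served += site["legit"] * accepted
        total += site["legit"]
    return served / total


def withdraw(sites, name, target):
    """Withdraw site `name`; assume all of its traffic lands at `target`."""
    moved = next(s for s in sites if s["name"] == name)
    remaining = []
    for s in sites:
        if s["name"] == name:
            continue
        s = dict(s)  # copy so the input list is left untouched
        if s["name"] == target:
            s["legit"] += moved["legit"]
            s["attack"] += moved["attack"]
        remaining.append(s)
    return remaining


def scenario(capacity_b):
    """Site A is under attack; site B has a configurable capacity."""
    return [
        {"name": "A", "capacity": 100, "legit": 20, "attack": 180},
        {"name": "B", "capacity": capacity_b, "legit": 20, "attack": 0},
    ]


absorb = served_legit_fraction(scenario(400))
shift_big = served_legit_fraction(withdraw(scenario(400), "A", "B"))
shift_small = served_legit_fraction(withdraw(scenario(100), "A", "B"))

print(absorb, shift_big, shift_small)
```

Under these assumptions, shifting traffic helps only when the receiving site has spare capacity; when it does not, leaving site A in place as a degraded absorber serves more legitimate queries overall.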
Consideration #5: Consider longer time-to-live values whenever possible
Caching is the cornerstone of good DNS performance and reliability. A 50 ms response to a new DNS query may be considered fast, but a response of less than 1 ms to a cached entry is far quicker. We have shown that caching also protects users from short outages and even significant DDoS attacks.
Caching on DNS resolvers (analogous to short-term memory) is directly controlled by authoritative DNS server operators. This is because each DNS record has a time-to-live (TTL) field, which specifies how long a record should remain in the resolver's cache.
We recommend that operators use TTLs of at least 4 hours (possibly more) for their records because it makes responses faster. Exceptions should be made for DNS-based load balancing and some DDoS-mitigation services that require short TTLs (although 15 minutes may provide sufficient agility for many operators).
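The effect of the TTL on authoritative server load can be sketched with a toy resolver cache in Python. This is an illustration under simplifying assumptions, not a model of any real resolver: a single cached name, one client query per minute for a day, and a placeholder answer from the documentation address range.

```python
class TtlCache:
    """Minimal cache that forgets entries once their TTL has elapsed."""

    def __init__(self):
        self._entries = {}  # name -> (value, expiry time in seconds)

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is None or now >= entry[1]:
            return None  # never cached, or expired
        return entry[0]

    def put(self, name, value, ttl, now):
        self._entries[name] = (value, now + ttl)


def count_upstream_queries(ttl, query_times, name="example.com."):
    """Count how often the resolver has to ask the authoritative server."""
    cache = TtlCache()
    misses = 0
    for now in query_times:
        if cache.get(name, now) is None:
            misses += 1
            cache.put(name, "192.0.2.53", ttl, now)  # placeholder answer
    return misses


one_day = range(0, 86400, 60)  # one client query per minute for 24 hours

short_ttl_queries = count_upstream_queries(300, one_day)    # 5-minute TTL
long_ttl_queries = count_upstream_queries(14400, one_day)   # 4-hour TTL

print(short_ttl_queries, long_ttl_queries)  # prints "288 6"
```

Every query that is not counted here is answered from the cache, which is where both the latency benefit and the resilience during short outages come from.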
Consideration #6: Consider the difference in parent and child TTL values
In the DNS, there is some level of information replication, both on parent and child authoritative servers. For example, the NS records of the domain example.com can be retrieved from the parent .com authoritative servers:
    dig ns example.com @e.gtld-servers.net

    ;; AUTHORITY SECTION:
    example.com.    172800  IN  NS  a.iana-servers.net.
    example.com.    172800  IN  NS  b.iana-servers.net.
But they can also be retrieved from the child authoritative servers:
    dig NS example.com @a.iana-servers.net.

    ;; ANSWER SECTION:
    example.com.    86400   IN  NS  a.iana-servers.net.
    example.com.    86400   IN  NS  b.iana-servers.net.
Look at the TTL fields: the parent server reports a TTL of 172800 seconds (two days), while the child reports a TTL of 86400 seconds (one day). Which one should the resolver trust?
We found that roughly 90 per cent of resolvers trust the child's authoritative response, which means that the remaining resolvers do not. The important conclusion from our study is therefore that authoritative operators cannot depend on their published TTL values alone: the parent's values are also used for timing cache entries in the wild. Operators planning infrastructure changes should assume that an older infrastructure must be left on and operational for at least the longer of the two TTLs.
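As a worked example of that rule of thumb, the following Python sketch computes the earliest moment at which the old name servers could be switched off, using the two TTLs from the dig output above. The function name and the change date are hypothetical, and the result is a lower bound only: it ignores resolvers that hold records for longer than the TTL.

```python
from datetime import datetime, timedelta, timezone


def earliest_decommission(change_time, parent_ttl, child_ttl):
    """Earliest time the old servers can go, given both TTLs in seconds."""
    # Some resolvers honour the parent's TTL and others the child's,
    # so the old infrastructure must outlive the longer of the two.
    return change_time + timedelta(seconds=max(parent_ttl, child_ttl))


# TTLs taken from the example.com NS records shown earlier.
parent_ttl = 172800  # as served by the .com servers
child_ttl = 86400    # as served by the example.com servers

# Hypothetical moment at which the NS records are changed.
change_time = datetime(2022, 3, 1, 0, 0, tzinfo=timezone.utc)

deadline = earliest_decommission(change_time, parent_ttl, child_ttl)
print(deadline.isoformat())  # prints "2022-03-03T00:00:00+00:00"
```

Here the parent's two-day TTL dominates, even though the operator's own zone publishes only one day.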
Researchers and RFCs
Overall, our experience with the RFC process has been very positive. Although it took a long time to achieve RFC status (three years and four months from the first draft to the RFC publication), it allowed us to reach an audience (operators) that would have been unaware of our findings if they had remained confined to academia.
Communicating the findings to a wider audience is mutually advantageous: the operators benefit from information that they can put to practical use, while we have benefitted from their feedback and comments in various phases of the process. We are grateful for their patience and help in the process.
RFC 9199 is a summary of the primary considerations of six research papers written over six years. The authors of those papers and the following people who contributed substantially to the content should therefore be considered co-authors. This document would not have been possible without their work:
- Ricardo de O. Schmidt
- Wouter B. de Vries
- Moritz Müller
- Lan Wei
- Cristian Hesselman
- Jan Harm Kuipers
- Pieter-Tjerk de Boer
- Aiko Pras
This article was originally published over on the SIDN Labs blog.