Yevheniya Nosyk

Extended DNS Errors: Unlocking the Full Potential of DNS Troubleshooting

Contributors: Maciej Korczynski, Andrzej Duda


The Domain Name System (DNS) has traditionally relied on response codes to signal anomalies, but these offer little help in pinpointing the root causes of failures. In this article, we examine the new Extended DNS Errors (EDE) mechanism, which provides extra feedback on DNS resolutions.


EDE was introduced in RFC 8914 as a proposed mechanism for overcoming existing shortcomings in the way DNS anomalies are reported. At Université Grenoble Alpes, we recently studied the implementation of this proposed standard and enumerated domain misconfigurations in the wild. Our goal here is to provide a summary of the key findings of our paper.

Background

Extended DNS Errors rely on EDNS(0) to carry data inside the OPT resource record under option code 15. As of September 2023, the Extended DNS Error Codes registry at IANA contains 30 entries, 5 of which were added after the release of the original RFC 8914. Table 1 presents them all.
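As a minimal sketch of what travels inside the OPT record, the Python below encodes and decodes a single EDE option, assuming the layout defined in RFC 8914: a 16-bit INFO-CODE followed by optional UTF-8 EXTRA-TEXT. The function names are illustrative and do not come from any particular library.

```python
import struct

EDE_OPTION_CODE = 15  # EDNS(0) option code assigned to Extended DNS Errors


def parse_ede_option(option_data: bytes) -> tuple[int, str]:
    """Decode the payload of one EDE option: INFO-CODE, then EXTRA-TEXT."""
    if len(option_data) < 2:
        raise ValueError("an EDE option must carry a 2-byte INFO-CODE")
    (info_code,) = struct.unpack("!H", option_data[:2])
    # EXTRA-TEXT is optional, UTF-8 encoded, and not null-terminated.
    extra_text = option_data[2:].decode("utf-8", errors="replace")
    return info_code, extra_text


def encode_ede_option(info_code: int, extra_text: str = "") -> bytes:
    """Build the full option: OPTION-CODE, OPTION-LENGTH, INFO-CODE, EXTRA-TEXT."""
    payload = struct.pack("!H", info_code) + extra_text.encode("utf-8")
    return struct.pack("!HH", EDE_OPTION_CODE, len(payload)) + payload
```

For instance, `encode_ede_option(6, "bogus")` produces an 11-byte option whose payload `parse_ede_option` turns back into `(6, "bogus")`.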

The codes cover different aspects of DNS, such as DNSSEC validation (1, 2, 5-12, 25, 27), caching (3, 13, 19, 29), resolver policies (4, 15-18, 20), software operation (14, 21-23), etc. These EDE codes exist independently from traditional response codes and the EDE specification does not prohibit any combination of the two. Importantly, any DNS system - whether a recursive resolver, a forwarder, or an authoritative nameserver - can generate, forward, and parse the EDE codes.
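The grouping above can be written down as a small lookup, sketched here in Python. The category names and code ranges are taken from the paragraph above; codes that the paragraph does not assign to a group (0, 24, 26, 28) fall through to "other", which is our own label.

```python
# Groupings as listed in the text; anything not listed is reported as "other".
EDE_CATEGORIES = {
    "DNSSEC validation": {1, 2, *range(5, 13), 25, 27},
    "caching": {3, 13, 19, 29},
    "resolver policies": {4, *range(15, 19), 20},
    "software operation": {14, *range(21, 24)},
}


def categorize(code: int) -> str:
    """Return the category of an EDE INFO-CODE, or "other" if ungrouped."""
    for name, codes in EDE_CATEGORIES.items():
        if code in codes:
            return name
    return "other"
```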

Table 1. Registered Extended DNS Error codes.
Code  Description                     Code  Description
0     Other Error                     15    Blocked
1     Unsupported DNSKEY Algorithm    16    Censored
2     Unsupported DS Digest Type      17    Filtered
3     Stale Answer                    18    Prohibited
4     Forged Answer                   19    Stale NXDomain Answer
5     DNSSEC Indeterminate            20    Not Authoritative
6     DNSSEC Bogus                    21    Not Supported
7     Signature Expired               22    No Reachable Authority
8     Signature Not Yet Valid         23    Network Error
9     DNSKEY Missing                  24    Invalid Data
10    RRSIGs Missing                  25    Signature Expired before Valid
11    No Zone Key Bit Set             26    Too Early
12    NSEC Missing                    27    Unsupported NSEC3 Iterations Value
13    Cached Error                    28    Unable to conform to policy
14    Not Ready                       29    Synthesized

Implementation

As of May 2023, Extended DNS Errors are implemented by major resolver software vendors (BIND9, Unbound, Knot Resolver, PowerDNS Recursor) and public resolvers (Cloudflare, Quad9, OpenDNS). Note that Google DNS announced its support of RFC 8914 two months after our experiments, in July 2023.

We wanted to know which issues cause recursive resolvers to return EDE codes. To answer this question, we set up 63 domains reflecting different misconfigurations and corner cases, such as erroneous DNSSEC configurations (wrong keys, signatures, digests, very old/new algorithms), unreachable nameservers, restrictive ACLs, and so on. Please refer to https://extended-dns-errors.com for a full list of domains and feel free to use them for your own tests.
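For readers who want to try the test domains themselves, here is a rough sketch that sends an EDNS(0) query over UDP and pulls any EDE options out of the response, using only the Python standard library. It is not the tooling used in our study. The function names, the default resolver address (1.1.1.1, Cloudflare's public resolver), and the assumption that the response fits in one UDP datagram are ours; the rcode returned is the 4-bit header value only.

```python
import socket
import struct

EDE_OPTION_CODE = 15
OPT_RR_TYPE = 41


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a recursive query (default type A) carrying an empty OPT record."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)  # RD=1
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.split(".") if label
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # class IN
    # OPT RR: root name, type 41, UDP payload size 1232, zero flags, no options.
    opt = b"\x00" + struct.pack("!HHIH", OPT_RR_TYPE, 1232, 0, 0)
    return header + question + opt


def skip_name(msg: bytes, pos: int) -> int:
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = msg[pos]
        if length == 0:
            return pos + 1
        if length & 0xC0 == 0xC0:  # a compression pointer ends the name
            return pos + 2
        pos += 1 + length


def extract_ede(msg: bytes) -> tuple[int, list[tuple[int, str]]]:
    """Return (rcode, [(info_code, extra_text), ...]) from a raw DNS message."""
    _, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", msg[:12])
    pos = 12
    for _ in range(qd):
        pos = skip_name(msg, pos) + 4  # skip QTYPE and QCLASS
    errors = []
    for _ in range(an + ns + ar):
        pos = skip_name(msg, pos)
        rtype, _cls, _ttl, rdlen = struct.unpack("!HHIH", msg[pos:pos + 10])
        pos += 10
        rdata_end = pos + rdlen
        if rtype == OPT_RR_TYPE:  # rdata is a sequence of (code, length, data)
            opt_pos = pos
            while opt_pos + 4 <= rdata_end:
                code, length = struct.unpack("!HH", msg[opt_pos:opt_pos + 4])
                data = msg[opt_pos + 4:opt_pos + 4 + length]
                if code == EDE_OPTION_CODE and length >= 2:
                    info = struct.unpack("!H", data[:2])[0]
                    errors.append((info, data[2:].decode("utf-8", "replace")))
                opt_pos += 4 + length
        pos = rdata_end
    return flags & 0x000F, errors


def query_ede(name: str, resolver: str = "1.1.1.1", timeout: float = 3.0):
    """Send one UDP query to the resolver and return extract_ede() of the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (resolver, 53))
        response, _ = sock.recvfrom(4096)
    return extract_ede(response)
```

Calling `query_ede()` with one of the test domains listed on the site, against several resolvers in turn, is enough to observe how differently they report the same misconfiguration.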

Next, we queried Cloudflare, Quad9, OpenDNS, as well as our own instances of BIND 9.19.9, Unbound 1.16.2, PowerDNS Recursor 4.8.2, and Knot Resolver 5.6.0. Overall, our 63 test domains generated 12 different EDE codes. Only 4 of the 63 test cases produced the same result across all seven tested systems: the no-ds, nsec3-iter-200, unsigned, and valid subdomains did not result in any extended error code. The following factors contributed to the inconsistency among the remaining 94% of tests:

  • Some systems implemented a subset of EDE codes that may not cover all our test cases. For example, as a first step, Unbound focused on DNSSEC-related errors.
  • Some EDE codes depend on the individual resolver’s capabilities. For example, the Cloudflare public resolver was the only system to return Unsupported DNSKEY Algorithm when resolving a domain name signed with the Ed448 algorithm.
  • Some EDE codes are more specific than others, but still point to the same problem. The majority of DNSSEC-related problems were signalled with either DNSKEY Missing or DNSSEC Bogus extended error codes, depending on the software.
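The consistency figures above come from a simple comparison: a test case counts as consistent only when every system returns the same set of EDE codes. The sketch below shows that comparison and the arithmetic behind the 94% figure. The sample data and resolver names are purely illustrative, not our measured results.

```python
def consistent_cases(results: dict[str, dict[str, frozenset[int]]]) -> list[str]:
    """Return test cases for which every resolver reported the same EDE codes."""
    return [
        case
        for case, per_resolver in results.items()
        if len(set(per_resolver.values())) == 1
    ]


def inconsistent_share(total: int, consistent: int) -> int:
    """Percentage of test cases with differing results, rounded to an integer."""
    return round(100 * (total - consistent) / total)


# Illustrative data only: two hypothetical resolvers, two hypothetical cases.
sample = {
    "valid": {"resolver-a": frozenset(), "resolver-b": frozenset()},
    "hypothetical-dnssec-case": {
        "resolver-a": frozenset({9}),  # DNSKEY Missing
        "resolver-b": frozenset({6}),  # DNSSEC Bogus
    },
}

print(consistent_cases(sample))   # ['valid']
print(inconsistent_share(63, 4))  # 94
```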

Misconfigurations in the wild

We now set out to discover the most prevalent issues in the wild. We gathered a dataset of more than 303 million registered domains across 1,475 TLDs and requested Cloudflare public DNS to resolve their A records. Overall, 17.7 million domain names triggered 14 individual EDE codes or their combinations.

Lame delegations are the most common issue encountered: 14.8 million domains triggered No Reachable Authority and/or Network Error EDEs. These refer to cases when recursive resolvers cannot reach some or all of a domain’s authoritative nameservers. Cloudflare used the EXTRA-TEXT field of the EDE entry to report that some nameservers returned REFUSED or SERVFAIL response codes, and thus were not serving the authoritative data.
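At scale, this kind of enumeration boils down to tallying which EDE codes, alone or in combination, each domain triggered. The following is a minimal sketch of such a tally, assuming the per-domain codes have already been collected; the domain names and values in the usage note are placeholders, not data from our scan.

```python
from collections import Counter

NO_REACHABLE_AUTHORITY = 22
NETWORK_ERROR = 23
LAME_CODES = {NO_REACHABLE_AUTHORITY, NETWORK_ERROR}


def summarize(observations: dict[str, set[int]]) -> tuple[Counter, int]:
    """Count EDE code combinations and domains showing signs of lame delegation."""
    combos = Counter(
        frozenset(codes) for codes in observations.values() if codes
    )
    lame = sum(1 for codes in observations.values() if codes & LAME_CODES)
    return combos, lame
```

Given `{"example.com": {22, 23}, "example.net": {23}, "example.org": set()}`, the function reports two distinct combinations and two domains with a likely lame delegation.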

DNSSEC misconfigurations are another prevalent problem. Expired, missing, or not-yet-valid signatures, missing keys or proofs of non-existence, DNSKEYs not corresponding to DS records, and broken chains of trust all make those domains inaccessible when end users are behind validating DNS resolvers. However, when a domain uses unsupported cryptographic algorithms, the resolution does not fail, but is instead accompanied by Unsupported DNSKEY Algorithm or Unsupported DS Digest Type. Finally, two debugging EDEs were returned to signal that we were served stale answers (Stale Answer) or a previously cached SERVFAIL (Cached Error).
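The distinction drawn above, between EDEs that accompany a failed resolution and EDEs that merely annotate a successful one, can be sketched as a function of the response code and the EDE codes together. The mapping of the listed problems to specific codes (6 to 10 and 12) and the returned labels are our assumptions for illustration.

```python
NOERROR, SERVFAIL = 0, 2

# Assumed mapping of the DNSSEC problems named in the text to EDE codes.
DNSSEC_FAILURE = {6, 7, 8, 9, 10, 12}
UNSUPPORTED_ALGORITHM = {1, 2}
STALE_ANSWER = 3
CACHED_ERROR = 13


def interpret(rcode: int, codes: set[int]) -> str:
    """Give a coarse reading of a response from its RCODE and EDE codes."""
    if rcode == SERVFAIL and codes & DNSSEC_FAILURE:
        return "validation failure"
    if rcode == SERVFAIL and CACHED_ERROR in codes:
        return "cached failure"
    if rcode == NOERROR and codes & UNSUPPORTED_ALGORITHM:
        return "resolved without validation"
    if STALE_ANSWER in codes:
        return "stale answer"
    return "no diagnosis"
```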

Interestingly, 2.47 million domain names under two European ccTLDs triggered the RRSIGs Missing EDE code without leading to DNSSEC validation failures. We reached out to one of the TLD operators, who explained that although the TLD zone was correctly configured, Cloudflare DNS signalled the problem because of a so-called stand-by KSK, i.e., a key published in the zone file in case an emergency key rollover is needed, but not actively used to establish the chain of trust.

We identified 22 more public suffixes and TLDs with stand-by DNSSEC keys triggering the same error. We contacted Cloudflare and reported our findings. They, in turn, confirmed that this was expected behavior and updated their documentation to note that “key rollover in-progress, stand-by key, and attacker stripping signatures” may trigger the RRSIGs Missing EDEs.

Conclusions

Our measurements revealed that all the systems implementing RFC 8914 were successful in determining root causes of misconfigurations with different levels of specificity. Moreover, this standard is particularly useful to enumerate misconfigurations at scale. Therefore, we believe that EDE is a promising technique that assists DNS operators, domain owners, and end clients in identifying and resolving DNS issues.


About the author

Yevheniya Nosyk, based in Grenoble, France

I am a Ph.D. student at Université Grenoble Alpes (France), where I work on DNS and network security from the perspective of large-scale Internet measurements.
