petr_spacek

NXNSAttack: Upgrade Resolvers to Stop New Kind of Random Subdomain Attack


This article describes a newly discovered DNS protocol vulnerability that affects most recursive DNS resolvers. NXNSAttack allows attackers to mount random subdomain attacks using the DNS delegation mechanism, resulting in a large packet amplification factor.


First things first

If you operate your own DNS resolver, no matter what brand it is, please upgrade to the latest version now (if you're disappointed that you have to rush with an upgrade now, ask your vendor about early notification for security releases).

Read on if you want to know how the attack works and how to protect your systems next time.

NXNSAttack principle

The newly discovered vulnerability abuses the DNS delegation mechanism to force DNS resolvers to generate more queries to authoritative servers of the attacker’s choice.

How is this possible? The whole DNS is built on a delegation principle, where authoritative DNS servers responsible for upper levels of the DNS hierarchy delegate (we could also say "redirect") questions for lower-level domains to different servers, thus eliminating the need to maintain one huge database of DNS data for the whole Internet. For example, this is how the authoritative DNS server named “a.gtld-servers.net.”, which is responsible for the “com.” domain, delegates the question “example.com. A” to a different set of servers:

$ kdig @a.gtld-servers.net example.com A
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10976
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 0

;; QUESTION SECTION:
;example.com. IN A

;; AUTHORITY SECTION:
example.com. 172800 IN NS a.iana-servers.net.
example.com. 172800 IN NS b.iana-servers.net.
;; From 2001:503:a83e::2:30@53(UDP) in 46.9 ms

Here we can see that even though we asked for the name “example.com.” with type A, the server authoritative for the “com.” domain did not answer the question itself. Instead, it returned a delegation for “example.com.”, which contains the names of two other authoritative DNS servers but not the answer to our original question.

This is called a glueless delegation, i.e. a delegation which contains only the names of the authoritative DNS servers (a.iana-servers.net. and b.iana-servers.net.) but not their IP addresses. Obviously, the DNS resolver cannot send a query to a name, so it first needs to obtain the IPv4 or IPv6 address of “a.iana-servers.net.” or “b.iana-servers.net.”, and only then can it continue resolving the original query “example.com. A”.

This glueless delegation is the basic principle of the NXNSAttack: the attacker simply sends back a delegation with fake (random) server names pointing to the victim DNS domain, thus forcing the resolver to generate queries towards the victim's DNS servers (in a futile attempt to resolve fake authoritative server names).
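To make the mechanism concrete, here is a minimal Python sketch of the two steps described above: a fake glueless delegation, and the follow-up queries a naive resolver would derive from it. The zone names (sub.attacker.example., victim.example.), the TTL and both function names are made up for illustration, and the assumption that the resolver asks for both A and AAAA for every NS name is a simplification, not a description of any particular implementation.

```python
import secrets


def fake_delegation(attacker_zone: str, victim_zone: str, count: int) -> list:
    """Build NS records of a glueless delegation whose targets lie in the victim zone."""
    records = []
    for _ in range(count):
        # A random label guarantees the name is not in any resolver cache.
        label = secrets.token_hex(8)
        records.append(f"{attacker_zone} 3600 IN NS {label}.{victim_zone}")
    return records


def resolver_followups(ns_records: list) -> list:
    """Queries a naive resolver would send to learn the addresses of the NS names."""
    names = [record.split()[-1] for record in ns_records]
    # Assumption: the resolver asks for both IPv4 (A) and IPv6 (AAAA) addresses.
    return [(name, rtype) for name in names for rtype in ("A", "AAAA")]


delegation = fake_delegation("sub.attacker.example.", "victim.example.", 3)
queries = resolver_followups(delegation)
for name, rtype in queries:
    print(name, rtype)
```

Every printed query ends in victim.example., so it lands on the victim's authoritative servers, and none of the names can ever resolve.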

 

Impact

The main discovery in the NXNSAttack paper is that an attacker needs only two packets, a single DNS query towards a DNS resolver and a single DNS answer with fake delegations, to make the resolver fire multiple random queries at the victim's authoritative servers. This effectively turns a standards-compliant DNS resolver into an amplifier for random subdomain attacks. In practice, the packet amplification factor (PAF) depends heavily on the strategy employed by the DNS resolver implementation used in the attack. For example:

  • The BIND 9.12.3 resolver resolves IPv4 and IPv6 addresses for all NS names obtained from a delegation in parallel, leading to a packet amplification factor of 1000x.
  • The Knot Resolver 5.1.0 resolves NS names one at a time and places other limits on the number of resolution steps generated by a single client query, limiting the PAF to the order of tens. In fact, half of its 48x packet amplification factor is caused by workarounds for non-compliant authoritative servers. Without the workarounds for RFC 8020 and RFC 7816 non-compliance, the PAF of the Knot Resolver would only be 24x (yet another example that workarounds are bad, but that’s another story).

None of these strategies is inherently wrong; they just represent different trade-offs between resources invested into a single client query vs. processing multiple client queries in parallel.
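The effect of resolver strategy on the amplification factor can be illustrated with some back-of-the-envelope arithmetic. The sketch below assumes a deliberately simplified model: the attacker contributes two packets, each NS name triggers one A and one AAAA query, and each such query costs two packets (query plus response). The function name, the step budget and the numbers are hypothetical and are not meant to reproduce the figures measured for BIND or the Knot Resolver.

```python
from typing import Optional


def packet_amplification(ns_names: int, step_budget: Optional[int] = None) -> float:
    """Packets exchanged with the victim per packet contributed by the attacker."""
    attacker_packets = 2  # one client query plus one answer with the fake delegation
    followups = ns_names * 2  # assumption: A and AAAA lookup for every NS name
    if step_budget is not None:
        # A resolver that caps the work done for one client query stops early.
        followups = min(followups, step_budget)
    victim_packets = followups * 2  # each follow-up is a query and a response
    return victim_packets / attacker_packets


print(packet_amplification(100))                  # resolve everything: 200.0
print(packet_amplification(100, step_budget=10))  # capped at 10 steps: 10.0
```

In this toy model the uncapped resolver scales linearly with the number of names in the delegation, while the capped one stays constant no matter how many names the attacker packs in.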

In the end, spare capacity on the resolver and on the authoritative servers determines which party will be the “victim” of NXNSAttack, because one of them gets overloaded first. As long as the capacity is sufficient, servers will continue to operate just fine, possibly leaving one of the parties virtually unaffected and, in the absence of appropriate monitoring, oblivious to the attack.

Unfortunately, NXNSAttack abuses a basic principle of the DNS protocol, which in practice means there is no fix, only mitigation. Luckily, the researchers followed the responsible disclosure protocol and allowed vendors to implement and release mitigations before making the attack public.

NXNSAttack is a special case of the well-known random subdomain attack, so mitigation approaches fall into two categories: specific to NXNSAttack, and generic for random subdomain attacks.

NXNSAttack mitigation

Unlike traditional random subdomain attacks, in the case of NXNSAttack the queries are generated by the resolver itself. This difference allows vendors to implement simple mitigation techniques, such as limiting the number of names resolved when processing a single delegation.

An obvious advantage of this counter-based approach is that it is simple, at least in theory.

A disadvantage of mitigation based on counters is that it requires vendors to invent arbitrary limits not based on the DNS protocol specification, basically determining the maximum packet amplification factor. At the same time, these arbitrary limits might break resolution for some domains because they put additional limits on the resolution process.
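A counter-based limit and its downside can both be shown in a few lines. The following is a minimal sketch, not any vendor's actual code: MAX_NS_PER_DELEGATION, resolve_delegation and the example names are all hypothetical, and lookup_address stands in for whatever the resolver does to find the address of an NS name.

```python
MAX_NS_PER_DELEGATION = 5  # hypothetical "magic number", not taken from any vendor


def resolve_delegation(ns_names, lookup_address, limit=MAX_NS_PER_DELEGATION):
    """Try NS names in order and give up after `limit` address lookups."""
    attempts = 0
    for name in ns_names:
        if attempts >= limit:
            break  # budget for this delegation is exhausted
        attempts += 1
        address = lookup_address(name)
        if address is not None:
            return address, attempts
    return None, attempts


# Attack: 100 fake NS names, none of which resolves.
fake = [f"ns{i}.victim.example." for i in range(100)]
print(resolve_delegation(fake, lambda name: None))       # (None, 5)

# Legitimate but badly maintained zone: only the seventh NS name works.
working = {"ns7.slow.example.": "192.0.2.53"}
legit = [f"ns{i}.slow.example." for i in range(1, 8)]
print(resolve_delegation(legit, working.get))            # (None, 5)
print(resolve_delegation(legit, working.get, limit=10))  # ('192.0.2.53', 7)
```

The same limit that caps the attack at five lookups also makes the badly maintained zone unresolvable, which is exactly the trade-off vendors have to weigh when picking the number.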

This is a very practical problem because recently published research estimates that 4% of second-level domains (e.g. example.com.) have a problem in their delegation from the top level (e.g. com.), so any change which adds arbitrary limits to retries during the resolution process has to be weighed very carefully.

In the upcoming days we will see how successful vendors were in determining their magic numbers and whether they get away without breaking any major domains.

Generic Random Subdomain Attack mitigation

Any random subdomain attack, NXNSAttack included, generates random query names to bypass DNS caches. It follows that generic mitigation has to prevent attackers from bypassing the cache – and luckily we already have technology to do that!

Aggressive Use of DNSSEC-Validated Cache (RFC 8198) uses DNSSEC “metadata” in the form of NSEC(3) and RRSIG records to generate negative answers without the need to contact authoritative servers. How does this work? First, let’s have a look at some example NSEC records:

$ kdig +dnssec @l.root-servers.net example.
;; ->>HEADER<<- opcode: QUERY; status: NXDOMAIN; id: 55933
;; Flags: qr aa rd; QUERY: 1; ANSWER: 0; AUTHORITY: 6; ADDITIONAL: 1

;; EDNS PSEUDOSECTION:
;; Version: 0; flags: do; UDP size: 4096 B; ext-rcode: NOERROR

;; QUESTION SECTION:
;; example. IN A

;; AUTHORITY SECTION:
events. 86400 IN NSEC exchange. NS DS RRSIG NSEC
events. 86400 IN RRSIG NSEC 8 1 86400 20200531170000 20200518160000 48903 . bWcSkQHURJGO...
. 86400 IN NSEC aaa. NS SOA RRSIG NSEC DNSKEY
. 86400 IN RRSIG NSEC 8 0 86400 20200531170000 20200518160000 48903 . Ru23msHh23...
. 86400 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2020051801 1800 900 604800 86400
. 86400 IN RRSIG SOA 8 0 86400 20200531170000 20200518160000 48903 . pIolh2KxjZbgtwuePLA4...

We sent the DNS query example. A to one of the root DNS servers, and it answered with NXDOMAIN, indicating that the name does not exist. At the same time, we received two proofs of nonexistence in the form of NSEC records (and their DNSSEC signatures in RRSIG records).

The first NSEC record events. 86400 IN NSEC exchange. NS DS RRSIG NSEC means that the root zone contains domain events. with record types NS DS RRSIG NSEC, and more importantly, there are no domains in between names events. and exchange.

The second NSEC record . 86400 IN NSEC aaa. NS SOA RRSIG NSEC DNSKEY means that the root zone contains DNS root . (surprise!) with record types NS SOA RRSIG NSEC DNSKEY, and also that there are no domains in between names . and aaa.. This proves there is no wildcard record *. and thus NXDOMAIN is really the correct answer to query example. A.

Each of the records has time-to-live specified as 86,400 seconds. This allows resolvers to synthesise NXDOMAIN answers for any queries falling into indicated ranges (. – aaa., events. – exchange.) for one full day, effectively cutting traffic towards authoritative servers.
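The range check a resolver performs against a cached NSEC record can be sketched in Python as follows. This is an illustration only: canonical_key and nsec_covers are made-up names, the ordering is a simplified version of the canonical DNS name order (labels compared right to left as lowercase byte strings), and the sketch ignores escaped characters, the wrap-around of the last NSEC record in a zone, and the DNSSEC validation that must happen before a record may be used this way.

```python
def canonical_key(name: str) -> tuple:
    """Sort key approximating canonical DNS name order."""
    labels = [label for label in name.lower().split(".") if label]
    # Compare labels starting with the one closest to the root.
    return tuple(label.encode("ascii") for label in reversed(labels))


def nsec_covers(owner: str, next_name: str, qname: str) -> bool:
    """True if qname sorts strictly between the NSEC owner and its next name."""
    return canonical_key(owner) < canonical_key(qname) < canonical_key(next_name)


print(nsec_covers("events.", "exchange.", "example."))  # True: name does not exist
print(nsec_covers(".", "aaa.", "*."))                   # True: no wildcard either
print(nsec_covers("events.", "exchange.", "org."))      # False: outside the range
```

With both checks returning True for a query name, a resolver holding these two validated records can answer NXDOMAIN from its cache without asking the root servers again.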

As a consequence, querying a DNS zone which contains N names at random will fully populate the resolver’s cache after roughly O(N) answers. In other words, the cost of eliminating random subdomain attacks between DNSSEC-validating resolvers and authoritative servers for the duration of the TTL is linear in the number of names in the target DNS zone. It works surprisingly well even for large zones with one million domains in them; pretty charts about this setup can be found in my older presentation (from 2018).
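The linear bound can be checked with a small simulation. The sketch below is a toy model under several assumptions: names are single random labels compared as plain strings, every NXDOMAIN answer teaches the resolver exactly one gap between two neighbouring zone names, TTL expiry and the wildcard proof are ignored, and all function names are hypothetical.

```python
import bisect
import random
import string


def random_label(rng, length=6):
    """Random lowercase label, standing in for an attacker-chosen subdomain."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def upstream_queries(zone_names, query_names):
    """Count queries that still reach the authoritative server."""
    names = sorted(set(zone_names))
    cached_gaps = set()  # index of every gap between zone names learned so far
    upstream = 0
    for qname in query_names:
        gap = bisect.bisect_left(names, qname)
        if gap < len(names) and names[gap] == qname:
            continue  # the name exists; positive answers are not modelled here
        if gap not in cached_gaps:
            cached_gaps.add(gap)  # one NXDOMAIN answer covers the whole gap
            upstream += 1
    return upstream


rng = random.Random(2020)
zone = [random_label(rng) for _ in range(1000)]
attack = [random_label(rng) for _ in range(100_000)]
sent = upstream_queries(zone, attack)
print(f"{len(attack)} random queries, {sent} reached the authoritative server")
```

However many random names the attacker sends, the number of upstream queries in this model can never exceed the number of gaps in the zone, which is the number of names plus one.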

What next?

First of all, upgrade your DNS resolvers to get at least some NXNSAttack mitigation.

Once the dust settles, please consider deploying DNSSEC on authoritative servers, and also on DNS resolvers.

Aggressive Use of DNSSEC-Validated Cache limits the impact of random subdomain attacks. It is already implemented in the Knot Resolver. Unbound also has partial support (NSEC only) and BIND has a prototype as well. If your DNS resolver vendor does not offer it at the moment, ask for the feature and stop random subdomain attacks once and for all!

If you are not used to speaking to your DNS software vendor, please fill in this cross-vendor survey.

 

This article has been adapted from the original post which appeared on the CZ.NIC Blog.
