DNSSEC Algorithm Roll-over
The RIPE NCC was among the first organisations to sign its zones with DNSSEC. Back then, the only algorithm defined for this purpose was RSA/SHA1, and all the RIPE NCC's zones were signed with this algorithm.
In 2009, RFC 5702 was published, standardising the use of SHA2 in DNSSEC signatures. At the time, most software did not yet support SHA2. Then in 2010, the root zone was signed with SHA2. There was a lot of publicity around this event, and SHA2 support arrived in many validators. This was of course necessary, because without it they would not be able to validate the root zone.
In recent years, researchers have demonstrated that SHA1 is no longer strong enough, and have been suggesting the use of stronger algorithms. This, combined with the fact that the root zone is also signed with SHA2, gave us reason to consider rolling to a stronger algorithm. We also wanted to use this opportunity to learn about controlled algorithm roll-overs and to share our experiences with our community of users.
In DNSSEC, rolling keys while retaining the same algorithm is a relatively straightforward process. It has been analysed, documented and performed so many times by so many people that it is no longer a skill reserved for experts. In fact, DNSSEC signing software generally has full support for it, such that most aspects of regular roll-overs can be automated. We use Secure64's DNSSEC signer, and it automates almost all aspects of key rolls. As far as I know, OpenDNSSEC and BIND also handle key rolls by themselves.
Most zones are set up with separate Key-Signing Keys (KSK) and Zone-Signing Keys (ZSK). With this setup, the signer can roll the ZSK all by itself, using the pre-publish method. For rolling the KSK, signers will generally use the double-signature method. They will also generate Delegation Signer (DS) records, which a human operator then has to submit to the parent zone.
The thing to note here is that the KSKs and the ZSKs can be rolled independently. They are not tied to each other in any way. In most configurations, the signer will have been configured to roll the KSKs at a low frequency (for example, annually) and the ZSKs at a higher frequency (for example, monthly).
Planning and preparation
Most signer software has little or no support for algorithm roll-overs. For example, while our signer was quite happy to sign a zone with only SHA1 or only SHA2, it couldn't sign with two different algorithms at the same time. We asked our vendor, Secure64, to provide us with a new version of the signer software that could do this. This new version of the software was ready in August.
Firstly, we upgraded our backup signer to this new version, and then set up some testing equipment.
Next, we got ourselves a virtual machine and installed BIND and Unbound on it. We also made a list of public recursive DNS resolvers that are known to validate. This list includes Google DNS, Verisign DNS and DNS OARC's public validating resolvers.
Thirdly, we created a new sub-domain, called dar.testdns.ripe.net, for testing this roll-over. We initially configured the signer to sign this zone with just SHA1, so that the zone looked exactly like all our other zones. We then reconfigured our anycast name server, pri.authdns.ripe.net, to serve this zone.
Finally, we wrote a script that would send queries for this zone to BIND, Unbound and all the public DNS services listed above, and record the output to files. We also installed DNSViz on this server, and configured it to test the zone periodically, and record its results in graphical form in PNG image files.
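A query script of this kind could look roughly like the sketch below. It is not the script we used; it is a minimal illustration that shells out to dig and records each answer to a file. The resolver labels and addresses in RESOLVERS, the output file naming and all function names are assumptions made for the example.

```python
import re
import subprocess

# Hypothetical resolver list: labels and addresses are illustrative only.
RESOLVERS = {
    "local-validator": "127.0.0.1",
    "google-dns": "8.8.8.8",
}

ZONE = "dar.testdns.ripe.net"


def build_dig_command(server, name, rrtype="SOA"):
    """Return the dig command line for one DNSSEC-aware query."""
    return ["dig", "+dnssec", "+time=5", "+tries=1", f"@{server}", name, rrtype]


def parse_status(dig_output):
    """Extract the response code (NOERROR, SERVFAIL, ...) from dig's header."""
    match = re.search(r"status: ([A-Z]+)", dig_output)
    return match.group(1) if match else "NO-RESPONSE"


def has_ad_flag(dig_output):
    """True if the resolver set the AD (authenticated data) flag."""
    match = re.search(r"flags: ([a-z ]*);", dig_output)
    return bool(match) and "ad" in match.group(1).split()


def query_all(name=ZONE, rrtype="SOA"):
    """Query every resolver, append raw output to a log file, return statuses."""
    results = {}
    for label, server in RESOLVERS.items():
        proc = subprocess.run(
            build_dig_command(server, name, rrtype),
            capture_output=True,
            text=True,
        )
        # One log file per resolver and record type (assumed naming scheme).
        with open(f"{label}-{rrtype}.txt", "a") as log:
            log.write(proc.stdout)
        results[label] = (parse_status(proc.stdout), has_ad_flag(proc.stdout))
    return results
```

Calling query_all() from a cron job at a fixed interval would give a time series of response codes per resolver, which is enough to spot the moment a validator starts answering SERVFAIL.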
Rolling and testing
We initiated a roll-over of both the KSK and ZSK by editing the signer configuration and introducing the SHA2 algorithm. As we expected, the signer first introduced only the SHA2 signatures generated by the new keys, but did not publish the keys. This is to allow the signatures to propagate and be available in caches before the key is revealed.
;; ANSWER SECTION:
dar.testdns.ripe.net. 899 IN SOA pri.authdns.ripe.net. dns.ripe.net. 1446570905 3600 600 864000 900
dar.testdns.ripe.net. 899 IN RRSIG SOA 5 4 900 20151203184347 20151103174347 64536 dar.testdns.ripe.net. zNk7i0y/OTwvyRM1g/OSSnzHyoAgKvptCCMiF2KYCrVnSCRdjvFDnUJw...
dar.testdns.ripe.net. 899 IN RRSIG SOA 8 4 900 20151203184347 20151103174347 65116 dar.testdns.ripe.net. msQlimTLz/QVtSZZ4gu4ysvKVa3rU0S7i5JHtg5PaytOkhGbNtdVc3Vx...
After a period of twice the largest TTL in the test zone, the signer introduced the new keys.
This is when things went wrong. We noticed that while BIND and Google DNS were able to validate the zone, Unbound and Verisign DNS began giving SERVFAIL responses. Specifically, queries for the SOA record were failing validation. The DNSKEY RRset was still okay and passing validation.
When we looked at the zone we noticed that the signer had completed the ZSK roll by withdrawing the old ZSK. Normally this would be just fine. However, with an algorithm roll, this breaks some validators. Unbound and Verisign DNS assume that the algorithm signalled by the DS record is used to sign all records in the zone, whether by the KSK or ZSK. They are interpreting section 5.11 of RFC 6840 rather strictly, even though the RFC advises validators to be more lenient.
;; ANSWER SECTION:
dar.testdns.ripe.net. 899 IN DNSKEY 257 3 8 AwEAAW03yVzMC+Ah7...
dar.testdns.ripe.net. 899 IN DNSKEY 256 3 8 AwEAAZaiyrjlejpX4...
dar.testdns.ripe.net. 899 IN DNSKEY 257 3 5 AwEAAbYmtTSuzK+wU...
In our case, we had not yet updated the DS record, so it was still referring to the old KSK, and so Unbound and Verisign DNS refused to validate with the SHA2 signatures made by the new ZSK. The latest version of Unbound now has an option to be more lenient in such cases, and validate anyway, but this version came out only a few weeks ago, so there will be many people out there using an older version. Additionally, Verisign DNS still appears to be strict, so we have to make sure that our zones can be validated by all these validators.
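The strict interpretation can be illustrated with a small sketch. It does not reproduce the code of Unbound or any other validator; it is a simplified model, with hypothetical function and variable names, of the rule "every algorithm signalled in the DS RRset must have a usable signature on every RRset". The assumption that the DNSKEY RRset was signed by both KSKs follows from the double-signature method, not from captured output.

```python
# DNSSEC algorithm numbers as assigned in the IANA registry.
ALGORITHM_NAMES = {5: "RSASHA1", 8: "RSASHA256"}


def unsigned_by_signalled_algorithm(ds_algorithms, rrsig_algorithms):
    """Return {rrset: [missing algorithm numbers]} for a strict validator.

    ds_algorithms: set of algorithm numbers present in the parent's DS RRset.
    rrsig_algorithms: dict mapping an RRset type to the set of algorithm
    numbers for which a signature AND its matching key are published.
    """
    problems = {}
    for rrset, algorithms in rrsig_algorithms.items():
        missing = set(ds_algorithms) - set(algorithms)
        if missing:
            problems[rrset] = sorted(missing)
    return problems


# The test zone at the point of failure (simplified model):
parent_ds = {5}  # DS still refers to the old SHA1 KSK
zone_signatures = {
    "DNSKEY": {5, 8},  # assumed: signed by both the old and the new KSK
    "SOA": {8},        # old SHA1 ZSK withdrawn, only SHA2 is usable
}

failures = unsigned_by_signalled_algorithm(parent_ds, zone_signatures)
for rrset, missing in failures.items():
    names = ", ".join(ALGORITHM_NAMES.get(a, str(a)) for a in missing)
    print(f"{rrset}: no usable signature for {names}")
```

In this model the DNSKEY RRset passes and the SOA RRset fails, which matches what we observed. A lenient validator would instead accept an RRset as soon as any one signature chains back to the DS record.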
Persisting the ZSK
After more testing and discussion with the developers of Unbound and Secure64, we came to the conclusion that when doing an algorithm roll-over, we need to hold off withdrawing the old ZSK and the old SHA1 signatures, until the KSK roll-over is also complete, and a new DS record is present in the parent. Fortunately, the Secure64 signer has an undocumented option, called "dnssec-zsk-publish-safety", which takes a time value. By setting this to a very large value, we can force the signer to keep publishing the old ZSK while we update the DS record of the zone and allow the KSK roll to complete. After the KSK roll is complete, we can set this value to zero and allow the signer to complete the ZSK roll-over.
DNSSEC algorithm roll-overs are in many ways similar to normal roll-overs, but with these two caveats:
- The KSK and ZSK should be rolled at the same time; and
- The old ZSK cannot be withdrawn until the KSK roll-over is complete.
It is possible to roll the algorithms of the KSK signatures and ZSK signatures separately, but it involves more work, especially because the DS records have to be updated more than once. This part of the process is usually the slowest, and in many cases involves manual work, so it's best to minimise the amount of work in it. If we take care of the above two issues, then we can roll the KSK and ZSK together and upgrade to a stronger algorithm for the entire zone in one go.
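The second caveat can be written down as a simple guard. This is only a sketch of the decision an operator (or a signer) has to make before releasing the old ZSK; the function name and parameters are hypothetical, and the extra wait for the old DS record to expire from caches is an assumption borrowed from ordinary key roll-over practice.

```python
def may_withdraw_old_zsk(parent_ds_algorithms, old_algorithm, new_algorithm,
                         seconds_since_ds_change, old_ds_ttl):
    """Decide whether the old ZSK and its signatures can be removed.

    parent_ds_algorithms: set of algorithm numbers in the parent's DS RRset.
    seconds_since_ds_change: time elapsed since the parent published the
    new DS RRset.
    old_ds_ttl: TTL of the old DS RRset, in seconds.
    """
    new_ds_present = new_algorithm in parent_ds_algorithms
    old_ds_gone = old_algorithm not in parent_ds_algorithms
    # Caches may still hold the old DS RRset until its TTL has run out.
    caches_expired = seconds_since_ds_change >= old_ds_ttl
    return new_ds_present and old_ds_gone and caches_expired


# Hypothetical TTL of one day for the DS RRset in the parent.
DS_TTL = 86400

# DS not yet updated: keep publishing the old ZSK.
print(may_withdraw_old_zsk({5}, 5, 8, 0, DS_TTL))
# New DS published an hour ago: still too early.
print(may_withdraw_old_zsk({8}, 5, 8, 3600, DS_TTL))
# New DS published two days ago: the old ZSK can go.
print(may_withdraw_old_zsk({8}, 5, 8, 2 * DS_TTL, DS_TTL))
```

Only when this guard returns True would we set the publish-safety value back to zero and let the signer finish the ZSK roll.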
While we were in the middle of our tests, NLnet Labs coincidentally published an article on rolling algorithms using OpenDNSSEC. In the article they say that OpenDNSSEC has no direct support for rolling algorithms, but it is possible to do so if the user is willing to do some manual work, edit some files by hand and pause a certain OpenDNSSEC process.
In concluding this article, I will say that algorithm roll-over in DNSSEC is tricky, and if you're going to do it then you'll have to be very careful. Support for it in many signers appears to be absent or limited, although the current version of the Secure64 signer (3.10) will do a fairly good job of it as long as you remember to use one undocumented option. Actually, I am going to suggest to Secure64 that they should document it and explain its use for this special case of algorithm roll-over.