TXT records are perhaps the most flexible type of DNS record available - but have you ever wondered how they’re really used? To answer this, I examined the TXT records of 1 million domains to see if there’s any rhyme or reason to how people employ this quirky, open-ended record type.
Probably the most common way that the Domain Name System (DNS) is described is as a way to map domains to IP addresses. But this only tells part of the story: domains can have all sorts of records associated with them other than just IP addresses. These different resource record types, known as RRTYPEs, are used for things like aliasing a domain to another one (CNAME and DNAME records), specifying which mail servers to use for a domain (MX records), or securing domains against poisoning attacks with DNSSEC (DS, DNSKEY, et al.) - and a whole bunch of other things.
One of the most common, and versatile, types is the TXT record. TXT records can contain any arbitrary text, and a domain can have multiple TXT records associated with it. Each TXT record can also contain multiple strings (see below). They’re often used to configure mail security for a domain, verify domain ownership, and just to record various bits of information.
Because they’re arbitrary though, they can really be used for anything. So I decided to take a closer look, by checking the TXT records for 1 million domains and seeing what I could find.
TXTual history
NB: If you’re not familiar with RFCs, they’re kind of like standards for the Internet - they codify common practices. See Wikipedia for a full description of what they are and how they work.
Due to their flexibility, it's hard to predict how TXT records will be used. But where did they come from? In 1987, TXT records were first defined in RFC 1035 as "descriptive text". The only note provided was that "the semantics of the text depends on the domain where it is found". So that left it pretty open.
Later, in 1993, RFC 1464 was published - "Using the Domain Name System To Store Arbitrary String Attributes". This formalised the use of TXT records to store configuration settings for a domain in the format "key=value". While this format wasn’t required when using TXT records in this way, it definitely seems to have become the most common method used since then.
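To make the "key=value" convention concrete, here is a minimal shell sketch that splits a TXT string at its first "=". The function names txt_key and txt_value are made up for this illustration, and the sketch ignores RFC 1464's backtick quoting of "=" inside attribute names, so treat it as an approximation rather than a full parser.

```shell
#!/bin/sh
# Split a TXT string of the form key=value at the first "=".
# Simplification (assumption): RFC 1464's backtick quoting of "=" inside
# attribute names is not handled here.

# Print the part before the first "="; fail if there is no "=" at all.
txt_key() {
    case $1 in
        *=*) printf '%s\n' "${1%%=*}" ;;
        *)   return 1 ;;
    esac
}

# Print everything after the first "="; fail if there is no "=" at all.
txt_value() {
    case $1 in
        *=*) printf '%s\n' "${1#*=}" ;;
        *)   return 1 ;;
    esac
}
```

For example, txt_key 'google-site-verification=abc123' prints google-site-verification and txt_value prints abc123 (the value here is invented). A string with no "=" makes both functions return a non-zero status.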
Another common use of TXT records is in other RFCs. Because of their flexibility, TXT records can be used as a place to store details about a protocol or framework, as in RFC 7208 (Sender Policy Framework). These types of RFCs specify the exact values and format that should be used, as an alternative to defining a whole new RRTYPE like with LOC.
Notes on the format of TXT records
Your basic TXT record looks like you’d expect - a simple character string:
"Example text here"
Domains can also have multiple TXT records associated:
"Example"
"text"
"here"
There is no way to specify any kind of ordering, so multiple records can be returned in a different order each time you ask for them.
But as mentioned earlier, it’s also valid to specify multiple strings:
"one" "two" "three"
Something I discovered while writing this is that RFC 7208 (Sender Policy Framework) has an interesting definition of how this usage is interpreted:
3.3. Multiple Strings in a Single DNS Record
As defined in [RFC1035], Sections 3.3 and 3.3.14, a single text DNS
record can be composed of more than one string. If a published
record contains multiple character-strings, then the record MUST be
treated as if those strings are concatenated together without adding
spaces. For example:
IN TXT "v=spf1 .... first" "second string..."
is equivalent to:
IN TXT "v=spf1 .... firstsecond string..."
TXT records containing multiple strings are useful in constructing
records that would exceed the 255-octet maximum length of a
character-string within a single TXT record.
I think this applies to all DNS records, but it might just be for SPF in particular. If I’m right, then this doesn’t appear to be a well-known fact, because a lot of domains out there specify multiple strings but seem to assume that a space would be added between them when concatenating them together. One particular example is for one of the DNS giants of the Internet, Akamai, who have the following set for akamai.net:
"This" "is" "not" "the" "nameserver" "you" "are" "looking" "for"
Which, according to RFC 7208, should end up as:
"Thisisnotthenameserveryouarelookingfor"
Move along.
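As a rough illustration of that concatenation rule, the sketch below joins the strings of a single record with nothing added between them. The function name join_txt_strings is hypothetical, and it assumes the record is given in the quoted form shown above with no escaped quotes (\") inside any string.

```shell
#!/bin/sh
# Join the character-strings of a single TXT record as RFC 7208 section 3.3
# describes: concatenated with nothing added between them.
# Assumption: input is in presentation format ("one" "two") and contains no
# escaped quotes (\") inside the strings.
join_txt_strings() {
    printf '%s\n' "$1" |
        sed -E -e 's/"[[:space:]]+"//g' \
               -e 's/^[[:space:]]*"//' \
               -e 's/"[[:space:]]*$//'
}
```

The first sed expression removes each closing-quote, whitespace, opening-quote boundary between strings; the other two strip the outermost quotes. Calling join_txt_strings '"one" "two" "three"' prints onetwothree.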
TXT work (source data and methodology)
I wrote a pretty basic shell script that I’m not particularly proud of, but that worked well enough, to go through two different lists of top domains and check the TXT records for each entry:
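#!/bin/bash
# Look up the TXT records for every domain listed in domains.txt.
domainsfile="domains.txt"
# IFS= and -r keep each line exactly as it appears in the file
while IFS= read -r domain
do
    echo "--"
    echo "[$(date)] CHECKING DOMAIN $domain"
    echo "--"
    # -W 5 for a 5 second timeout; the trailing dot makes the name fully qualified
    host -W 5 -t txt "$domain".
done < "$domainsfile"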
The general intent was to keep the output human-readable, because I wanted to be able to look through it myself, while still making it easy to parse for totals and the like.
I then ran this and captured the output:
./get-txt-records.sh > host.out 2>&1
This produces output in the file host.out for each domain that looks like this:
--
[Mon 03 Apr 2023 04:12:36 PM BST] CHECKING DOMAIN amazonaws.com
--
amazonaws.com descriptive text "pf2vv39dfkf9tszsg5lggfs6tp6bkjn4"
amazonaws.com descriptive text "v=spf1 include:amazon.com ~all"
amazonaws.com descriptive text "spf2.0/pra include:amazon.com ~all"
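Output in this shape is straightforward to total up. The following is only a sketch of one way to do it, not the processing actually used for the figures in this article: a hypothetical function, summarise_host_output, that counts lines containing " descriptive text " and the distinct names in front of them.

```shell
#!/bin/sh
# Summarise a capture file in the format shown above.
# Prints two numbers: total TXT records, then domains with at least one record.
summarise_host_output() {
    awk '
        {
            i = index($0, " descriptive text ")
            if (i == 0) next              # separator, header or error line
            records++
            domain = substr($0, 1, i - 1) # everything before the marker
            if (!(domain in seen)) { seen[domain] = 1; domains++ }
        }
        END { printf "%d %d\n", records, domains }
    ' "$1"
}
```

Running summarise_host_output host.out prints the two totals separated by a space. Lines without the marker, such as the "--" separators and the CHECKING DOMAIN headers, are skipped.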
The script took about 48 hours to run each time, against a nameserver whose cache hadn’t been specifically warmed with the results. The first run was with the Tranco list, generated on 10 March 2023, available at https://tranco-list.eu/list/PZJ3J. This wasn’t as interesting as I’d hoped though, I think because the list consisted entirely of effective second-level domains (eSLDs, which I call “parent domains”). So I ran it again with the Cisco top 1m domains list, downloaded on 13 March 2023.
I did a lot of checking with just cut, grep, etc., but I did also then write a little script to import all records into an SQLite database to make things easier.
All of the files mentioned here, including scripts, the SQLite database, and files created while looking at record lengths, unique records, etc., have been uploaded to GitHub under a Creative Commons Zero licence:
TXT by numbers
Number of TXT records | 765,650 |
Number of unique TXT records | 595,398 |
Domains with TXT records | 584,244 |
Domains without TXT records | 415,756 |
Average number of TXT records per domain (that has them) | 1.31 |
Longest TXT record | 7,886 characters |
Second-longest TXT record | 5,498 characters |
Total length of all TXT records concatenated | 49,813,321 characters |
DMARC1 records | 4,218 |
SPF1 records | 164,459 |
SPF1 records with "include" | 131,892 |
SPF1 records with just "v=spf1 -all" | 8,444 |
SPF2 records | 5,091 |
SPF3 records | 808 |
key=value TXT records | 630,317 |
Empty records (just "") | 183 |
Empty records with spaces ("\s+") | 12 |
Empty records with just "~" | 109 |
Verification / confirmation records | 402,230 |
Top 5 verification records | |
google-site-verification | 170,225 |
MS | 56,160 |
28,273 | |
Globalsign | 17,396 |
Apple | 17,201 |
Fixed-length TXT records | |
68 characters | 170,941 (mostly Google site verifications) |
13 characters | 48,763 (mostly MS= TXT records) |
32 characters | 48,648 (random strings) |
59 characters | 33,205 (Facebook verifications) |
26 characters | 25,559 (random strings) |
URLs | 708 |
Non-HTTP URLs | 16 |
Email addresses | 4,487 |
Hello worlds | 3 |
Greetings | 14 |
Swear words not appearing in domains | 0 |
<script> tags | 1 |
Embedded DNS records ("IN ...") | 225 |
Mentions of "ALIAS for" another domain | 363 |
Security code | 769 |
Please | 27 |
References to tickets | 28 |
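Rows like the fixed-length ones in the table above come from tallying records by length. A minimal sketch of such a tally follows; it is not necessarily how these figures were produced. The function name tally_record_lengths is hypothetical, and it assumes one record per line on standard input with the surrounding double quotes excluded from the length.

```shell
#!/bin/sh
# Tally TXT records by length, most common length first.
# Input: one record per line on stdin, optionally wrapped in double quotes.
# Assumption: the surrounding quotes are not counted towards the length.
# Output: one "count<TAB>length" line per distinct length.
tally_record_lengths() {
    awk '
        {
            record = $0
            sub(/^"/, "", record)   # drop the opening quote, if any
            sub(/"$/, "", record)   # drop the closing quote, if any
            count[length(record)]++
        }
        END { for (len in count) printf "%d\t%d\n", count[len], len }
    ' | sort -k1,1nr -k2,2n
}
```

Each output line is a count, a tab, and a length, so a large count against a single length hints at a machine-generated format such as a verification token.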
Conclusion
So, what are TXT records used for exactly? Well, we can see that key-value settings are the most common use case, with domain verification records being the majority of those. SPF records also make a strong showing, as well as a lot of seemingly random fixed-length records that are probably being used for encoding data somehow.
But overall, they really are used for anything and everything. There are some patterns we can pick out, but the lack of rigid rules means that the freedom to put whatever you like in a TXT record has been liberally accepted by the Internet as a whole. Which, if you ask me, is a good thing - having something in the DNS that can act as a config store, notes field, playground for new standards, or even the basis for file storage (not that this would really be recommended), has meant that we haven’t had to wait for standards to catch up in order to continue making use of this wonderful system that underpins a fundamental part of the Internet.
Comments 3
Francisco Osornio •
One more use for TXT records: exfiltrate/infiltrate data using DNS servers. TXT records depend on the TCP protocol, so all of DNS implementations must open the firewall TCP 53 port in order to TXT records to work properly. Since TCP is a connection-oriented protocol with bigger payloads, TXT records can be used to exfiltrate data from the internal networks to the outside or to receive new malware instructions or even to receive the ransomware keys the attackers use to lock the files in a computer or server. I like this reading!
Peter Lowe •
Hey Francisco. Yes, this is absolutely a common use - I think this may be what a lot of the fixed-length records are used for, although it's hard to tell. They're seemingly "random" characters, but probably some sort of base64-encoded text when properly reconstructed together. For some more unusual uses of DNS, you might like my presentation on "Bizarre and Unusual Uses of DNS": https://www.youtube.com/watch?v=1uNxHVXBQb4 (It's quite long - but there's a 10-minute version of this I did at FOSDEM here: https://fosdem.org/2023/schedule/event/dns_bizarre_and_unusual_uses_of_dns/)
Josh Rosenbluh •
The TXT record for akamai.net actually resolves to "Thisisnotthenameserveryouarelookingfor." (with a closing period).