Giovane Moura

NTP Pool - The Internet Timekeeper

Contributors: Marco Davids, Caspar Schutijser, Cristian Hesselman, John Heidemann, Georgios Smaragdakis


The NTP Pool is a network of volunteer-run servers providing time synchronisation services to millions of computers over the Internet using the Network Time Protocol (NTP). But how does it map clients to NTP servers? And why are some clients more equal than others? The team at SIDN Labs investigates.


Ancient Romans relied on sundials and water clocks to keep track of time. Keeping track of time is one thing; accurately transferring this information is another. In ancient Rome, you would have to walk up to a sundial or water clock to know what time it was — if you could find one.

Later, in the Middle Ages, churches started to deploy mechanical clocks in their towers (later upgraded to pendulum clocks, developed by Christiaan Huygens).

As a result, you no longer needed to walk to a sundial or water clock: time flew to your ears. By combining clocks with (very loud) bells, any person in town would receive a bell-encoded loud signal to tell them the time, as a public service. Many churches still provide this time service.

Modern humans rely on the stability of caesium atoms to deliver atomic precision time. Atomic clocks provide the time signal, and time information is propagated to computers and other devices using the Network Time Protocol (NTP) on the Internet. (There is also the issue of how to synchronise atomic clocks, which is a bit of a chicken-and-egg problem, but let us not get sidetracked.)

Internet timekeepers

In the US, NIST has been providing free time services for decades. These are delivered using publicly accessible stratum 1 servers. The US Naval Observatory (USNO) is also a popular time service provider. Later, vendors such as Apple, Google, Cloudflare, Meta, Microsoft and Ubuntu all started providing time services. We (SIDN Labs) also provide a free time service at https://time.nl.

The NTP Pool adds a layer on top of NTP servers: it uses DNS to offer a directory of publicly available NTP servers, but it does not operate NTP servers itself. The NTP servers are run by volunteers, who range from Raspberry Pi users to large cloud operators. The Pool currently lists more than 4,700 NTP servers.

Given there are so many time services to choose from, we wanted to know which ones are the most popular. To find out, we investigated root DNS queries from 2017 and 2022 (DITL datasets). As we do not have access to real NTP traffic from these services, we resorted to using DNS query names to infer how popular the various services are (there are some caveats with using DNS as a metric of popularity, which we discuss in a peer-reviewed paper).
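The counting step described above can be sketched in a few lines of Python. This is only an illustration, not the analysis code from the paper: the input format (pairs of resolver address and query name) and the short suffix table are assumptions made for the example.

```python
from collections import defaultdict

# Illustrative suffix table (an assumption for this sketch); the study
# covers more services than the three listed here.
SERVICE_SUFFIXES = {
    "pool.ntp.org": "NTP Pool",
    "time.nist.gov": "NIST",
    "time.cloudflare.com": "Cloudflare",
}


def service_for(qname):
    """Return the time service whose domain matches qname, or None."""
    name = qname.rstrip(".").lower()
    for suffix, service in SERVICE_SUFFIXES.items():
        if name == suffix or name.endswith("." + suffix):
            return service
    return None


def resolvers_per_service(queries):
    """Count distinct resolver addresses per service.

    queries: iterable of (resolver_ip, qname) pairs (hypothetical format).
    """
    seen = defaultdict(set)
    for resolver_ip, qname in queries:
        service = service_for(qname)
        if service is not None:
            seen[service].add(resolver_ip)
    return {service: len(ips) for service, ips in seen.items()}
```

Counting distinct resolvers rather than raw queries keeps one chatty resolver from dominating the result, which is why the sketch stores addresses in a set.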

The figure below shows how many IP addresses (DNS resolvers) have sent queries for each time service to the root DNS. We see that the NTP Pool is the most popular time service by far, even more popular than NIST and large cloud/content providers, in the datasets for both 2017 and 2022.

Number of resolvers per time server in DITL root DNS datasets

We obtained comparable results when aggregating resolvers by Autonomous System (AS). It is quite remarkable that the community-driven NTP Pool is the most popular time service provider, at least from the root DNS vantage points.

Number of ASes per time server in DITL Root DNS datasets

How does the NTP Pool map clients to NTP servers?

The NTP Pool currently lists more than 4,700 NTP servers. How does it decide which NTP servers are assigned to each client?

To answer that, we used roughly 10,000 RIPE Atlas probes to send DNS queries to the NTP Pool DNS servers, and analysed how many unique IP addresses (which are the NTP servers themselves) were returned. In short: clients send queries to pool.ntp.org, and we counted how many unique responses each client received over a 24-hour period.
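As a minimal sketch of that counting step, assume each measurement result has been reduced to a probe identifier plus the list of addresses in the DNS answer. That format is an assumption for this example, not the RIPE Atlas result format.

```python
from collections import defaultdict


def servers_per_probe(observations):
    """Count the unique NTP server addresses seen by each probe.

    observations: iterable of (probe_id, answer_ips) pairs, one per DNS
    response (hypothetical, simplified format).
    """
    unique = defaultdict(set)
    for probe_id, answer_ips in observations:
        unique[probe_id].update(answer_ips)
    return {probe_id: len(ips) for probe_id, ips in unique.items()}


def share_with_at_most(counts, limit):
    """Fraction of probes that saw at most `limit` unique servers."""
    if not counts:
        return 0.0
    return sum(1 for n in counts.values() if n <= limit) / len(counts)
```

Feeding a day of responses through `servers_per_probe` and then calling `share_with_at_most(counts, 12)` gives the kind of figure reported below.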

We found that 10% of the Atlas probes are served by up to 12 NTP servers, and 30% are served by more than 100 NTP servers. Why such a discrepancy? Why do some clients have a more diverse set of servers than other clients? Why are some clients more equal than others?

Number of NTP servers per RIPE Atlas probe

GeoDNS - the time server assigner

GeoDNS is the authoritative DNS server developed by the NTP Pool project to map clients to NTP servers, and it has ultimate responsibility for this assignment. We downloaded and configured GeoDNS, then carried out a series of experiments to figure out exactly how it works. The experiments are described in detail in our paper.

In a nutshell, our analysis showed that it all depends on the client’s geolocation. If you are in Japan, you will be served only by the 21 NTP servers located in Japan. If you are in Cameroon, you will have only 1 NTP server, even if the NTP Pool lists more than 4,700 servers. And if there are no NTP servers in your own country, then you will be served by NTP servers on your continent. For example, clients in Bolivia are served by all 46 servers located in South America.
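The fallback rule can be written down as a toy model. This is a sketch of the behaviour described above, not GeoDNS's actual code: the zone keys and server labels are placeholders, and only the counts (21, 1 and 46) come from the text.

```python
def servers_for_client(country, continent, country_zones, continent_zones):
    """Return the candidate NTP servers for a client (toy model)."""
    in_country = country_zones.get(country, [])
    if in_country:
        # The country zone has servers: the client only ever sees these.
        return in_country
    # Empty country zone: fall back to the continent zone.
    return continent_zones.get(continent, [])


# Placeholder labels; only the list lengths reflect the article.
country_zones = {
    "jp": [f"jp-{i}" for i in range(21)],
    "cm": ["cm-0"],
}
continent_zones = {
    "south-america": [f"sa-{i}" for i in range(46)],
}

for country, continent in [("jp", "asia"), ("cm", "africa"), ("bo", "south-america")]:
    servers = servers_for_client(country, continent, country_zones, continent_zones)
    print(country, len(servers))
```

Note that the total size of the Pool never enters the function: a client's choice is bounded by its own zone, however many servers exist elsewhere.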

Try it yourself

GeoDNS maps the user to NTP servers using either the source IP address of the DNS query or, if present, the EDNS Client Subnet (ECS) option carried in the query; ECS takes priority. The implication of the mapping is that clients are bound by the number of servers available in their country.
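The priority rule can be illustrated with the standard library alone. The function below is a hypothetical helper written for this post, not part of GeoDNS; it only shows which address would be handed to geolocation under the rule just described.

```python
import ipaddress


def address_for_geolocation(source_ip, ecs_subnet=None):
    """Pick the address to geolocate: the ECS subnet wins if present.

    source_ip: address the DNS query arrived from (often a resolver).
    ecs_subnet: optional 'address/prefix' string from the ECS option.
    """
    if ecs_subnet is not None:
        # strict=False accepts host bits, e.g. '165.210.33.254/24'.
        return ipaddress.ip_network(ecs_subnet, strict=False).network_address
    return ipaddress.ip_address(source_ip)
```

This is why the experiment further down works from anywhere: putting a Cameroonian subnet in the ECS option makes the query look Cameroonian, regardless of where it is sent from.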

As we said, Cameroon has only 1 NTP server, as reported by the NTP Pool website. To find out which NTP server this is, we can send DNS queries for pool.ntp.org with a Cameroon-located IP address in the ECS option, and see which NTP servers the NTP Pool assigns. (If your device is configured to use, say, debian.pool.ntp.org or any other vendor zone, the same mapping applies.)

Want to try it yourself? Just run the Python code below.

import dns.message
import dns.query
import dns.rdatatype
import dns.edns

# ECS parameters: the client address that GeoDNS should geolocate.
# This example uses an address in Cameroon; replace it with an IP address
# geolocated in the country you are interested in.
ADDRESS = '165.210.33.254'

PREFIX = 24  # Prefix length (typically 24 for IPv4)

# We query the default zone (pool.ntp.org),
# but any vendor zone works too, like
# debian.pool.ntp.org or android.pool.ntp.org
ZONE = 'pool.ntp.org'

# Create an ECS option
ecs = dns.edns.ECSOption(ADDRESS, PREFIX)

# Make a DNS query for 'pool.ntp.org'
q = dns.message.make_query(ZONE, 'A', use_edns=0, options=[ecs])

# Send the query to one of the Pool authoritative servers.
# In this case, I am using the IP address of c.ntpns.org.
# The timeout stops the script from hanging if the server does not answer.
auth_server_ip = '50.116.32.247'
response = dns.query.udp(q, auth_server_ip, timeout=5)

# Extract and process the response (e.g. print the IP addresses)
for rrset in response.answer:
    for rr in rrset:
        if rr.rdtype == dns.rdatatype.A:
            print(f'IPv4 Address: {rr.address}')

For any client address geolocated in Cameroon, this code returns the same single NTP server. In our view, this is a very restrictive form of mapping: why should a single server be assigned to all users in Cameroon? (Cameroon has 28.28m inhabitants and 12.89m Internet users.)

This restrictive mapping of clients to servers raises two questions:

  1. Why does GeoDNS use such a restrictive form of mapping?
  2. What are the consequences for clients?

Why is the mapping so constrained? Is such constraint necessary?

We asked the NTP Pool operators about the mapping, and we were told that it is about “minimising the risk of asymmetric routing and dropped packets”.

Well, it turns out that most Internet paths are already asymmetrical, so it is not an NTP Pool-only problem. (There have been several studies dealing with NTP and asymmetrical paths.)

With regard to packet loss, we ran experiments from 132 RIPE Atlas probes in 21 countries, all of which have Cloudflare as their only time provider when using the NTP Pool. For each probe, we measured packet loss and precision towards NTP servers on other continents as well as towards Cloudflare, to see what would happen if GeoDNS assigned these clients to servers elsewhere.

The figure below shows the results. The x axis shows the individual RIPE Atlas probes, our vantage points (VPs), which act like real clients. The y axis shows each NTP server (one per continent, plus Cloudflare for reference). Most VPs have no problem reaching NTP servers on other continents; the only exception is the South American servers, which many VPs had trouble reaching. For most VPs, servers on other continents delivered precise time information, and packet loss was not an issue. This small example demonstrates that such restrictive mapping is not needed, at least not in our small-scale experiment.

Missing responses ratio per time server and RIPE Atlas probe
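For reference, the metric in the figure is simple arithmetic. The sketch below assumes per-pair counters of NTP queries sent and responses received; that input layout is an assumption for the example, not the format of our measurement data.

```python
def missing_ratio(sent, received):
    """Fraction of NTP queries that got no response."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return (sent - received) / sent


def missing_ratio_matrix(counts):
    """Compute the ratio for every (probe, server) pair.

    counts: dict mapping (probe_id, server) to a (sent, received) tuple
    (hypothetical layout).
    """
    return {pair: missing_ratio(sent, received)
            for pair, (sent, received) in counts.items()}
```

A value of 0 means every query was answered; a value of 1 means the server was unreachable from that probe.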

Consequences for users

The implications of the mapping for users are clear when we look into how many NTP servers they are assigned to.

The figure below shows the number of NTP servers available to all users in a country, if they use the Pool. Given that the NTP Pool comprises more than 4,700 NTP servers, we regard this distribution as highly skewed and unfair for the client population: African clients are served by far fewer servers than US or West European clients. Unintentionally, it seems to perpetuate the division between the haves and the have-nots.

Number of NTP servers for all users in a country

But the real issue is that users from 27 countries (totalling 767m inhabitants and 465m Internet users) are served by a single Autonomous System as time provider when using the NTP Pool, even though the NTP Pool lists more than 4,700 servers. These are the countries in red in the figure below and expanded in the table.

Number of ASes (time providers) serving each country

The table below lists all countries served by a single AS if they use the NTP Pool.

Table of countries served by a single time provider: Cloudflare and other ASes (bold)
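Finding such countries is a matter of grouping servers by country and counting distinct ASes. A minimal sketch, assuming a list of (country, AS number) pairs with one entry per NTP server in the country's zone; the country labels and AS numbers in the usage example are placeholders, not data from our study.

```python
from collections import defaultdict


def single_as_countries(servers):
    """Return the countries whose NTP servers all sit in one AS.

    servers: iterable of (country, asn) pairs, one per NTP server
    (hypothetical input format).
    """
    ases = defaultdict(set)
    for country, asn in servers:
        ases[country].add(asn)
    return sorted(country for country, seen in ases.items() if len(seen) == 1)


# Placeholder data: AS numbers are taken from the documentation range.
example = [("aa", 64496), ("aa", 64496), ("bb", 64496), ("bb", 64497)]
print(single_as_countries(example))
```

A country can have several servers and still appear in the result, as "aa" does here, because what matters is the number of operators, not the number of machines.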

Next, we can compute the number of Internet users per NTP server if they choose to use the NTP Pool. We see that Nigeria, with 2 NTP servers only, has 60m Internet users per server. The US and Western Europe have fewer than 0.47m users per NTP server. (Many African countries have similar ratios, but this is because they do not have NTP servers in their country, so they fall back to the African zone.)

Ratio of million Internet users per NTP server
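The ratio itself is a single division. In the sketch below, the Cameroon figures come from this post; the Nigerian total of 120m Internet users is not stated in the text but is implied by 2 servers at 60m users each.

```python
def users_per_server(internet_users_m, servers):
    """Million Internet users per NTP server in a zone."""
    if servers <= 0:
        raise ValueError("zone has no servers")
    return internet_users_m / servers


print(users_per_server(12.89, 1))  # Cameroon: 12.89m users, 1 server
print(users_per_server(120, 2))    # Nigeria: implied 120m users, 2 servers
```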

Security implications

There are multiple security implications of the constrained mapping. First, countries with no servers in their country zone (which fall back to their continent zone) can have all their traffic monopolised by a single NTP server. All it takes is for one NTP server to be added to their country zone. If this NTP server happens to be malicious, that is, if it sends false time information, it can be used to carry out time-shift attacks. We have shown in our paper (Section 4.3) how that happens incidentally with Guernsey.

The NTP Pool has its own monitors, which detect and evict badly behaved NTP servers, but they can also be fooled. The same attacks can be applied to affect some of the NTP traffic from one country by creating a race condition. A determined attacker can shift the clocks of all NTP Pool devices in a country, if they carry out their attack carefully.
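The monopolisation effect follows directly from the fallback rule. The toy model below assumes that answers are drawn uniformly from the effective zone, which is a simplification made for this sketch, and it uses made-up zone sizes and labels.

```python
def effective_zone(country_zone, continent_zone):
    """Country zone if it has servers, otherwise the continent zone."""
    return country_zone if country_zone else continent_zone


def share_of_answers(server, country_zone, continent_zone):
    """Expected fraction of answers naming `server` (uniform selection)."""
    zone = effective_zone(country_zone, continent_zone)
    if not zone:
        return 0.0
    return zone.count(server) / len(zone)


# Hypothetical continent zone: nine existing servers plus one newcomer.
continent = [f"existing-{i}" for i in range(9)] + ["newcomer"]

# The country zone is empty, so clients fall back to the continent zone.
before = share_of_answers("newcomer", [], continent)
# The newcomer is listed in the country zone and becomes its only server.
after = share_of_answers("newcomer", ["newcomer"], continent)
print(before, after)
```

In this model the newcomer goes from one answer in ten to every answer for that country, without any change to the other servers.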

What’s next

We presented our findings to the NTP Pool operators in July 2023, and they are planning to fix the issue we identified, by “having a new DNS name for the new zones and then over time migrate the old names to point to the new one probably country by country so we can start by migrating things that work poorly now”. However, as far as we can tell, these changes have not yet been made.

Finally, even though the current NTP Pool set-up has the issues described, let us not forget the big picture: we should thank the NTP Pool’s volunteer operators, who have been running this service for 20+ years. The Pool is the most popular time service on the Internet, and one of the few services that have not (yet) been replaced by large cloud and content operators. Nevertheless, the system can be improved to prevent such restrictive mapping and potential security incidents.


This blog summarises the main findings of our ACM SIGMETRICS ’24 paper, which will be presented in Venice in June.


About the author

Giovane Moura Based in Arnhem, The Netherlands

Giovane is a Data Scientist with SIDN Labs (.nl registry) and an Assistant Professor at TU Delft, in the Netherlands. He works on security and Internet measurements research projects. You can reach him at http://giovane-moura.nl/
