
NTP Measurements with RIPE Atlas

Philip Homburg — 17 Feb 2015
Contributors: Vesna Manojlovic
This article describes how RIPE Atlas probes and anchors maintain their clocks, and how accurate these clocks are. We also plan to make the NTP measurements we describe here available as an additional measurement type for RIPE Atlas users.

Introduction

It is important that RIPE Atlas probes' clocks are accurate, so that the timestamps of the measurements can be trusted and can be correlated to other network events and observations.

How do NTP measurements work?

The Network Time Protocol (NTP) is a network protocol for clock synchronisation between computer systems. See RFC 5905 for a detailed description of NTP.

The measurement code installed on the RIPE Atlas probes uses the client-server mode of the NTP protocol. That means the RIPE Atlas probe acts as a client and sends a request to the NTP server that is specified as the target of the measurement. Then the probe reports the server's reply together with other information about the measurement, for instance the exact time the measurement was performed.

At the moment, RIPE Atlas developers create NTP measurements using the API. See Appendix A for an example of how we create NTP measurements in RIPE Atlas, and see below for more information about making NTP measurements available to all RIPE Atlas users.

Most of the fields in an NTP measurement result are taken directly from the NTP protocol. From the NTP timestamps, the probe computes two values: offset and RTT.

Offset is the time difference between the target's clock and the probe's local clock. A positive value means that the probe's clock is ahead; a negative value means that it is behind. RTT is the round trip time, the time between sending the request and receiving the reply.
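The two definitions above can be sketched in a few lines of Python. This is an illustration, not the probe's actual code: the function name is hypothetical, and the sign convention (probe clock minus server clock, the opposite of the offset defined in RFC 5905) is inferred from the description above and from the sample result in Appendix A.

```python
def atlas_offset_and_rtt(origin_ts, receive_ts, transmit_ts, final_ts):
    """Return (offset, rtt) in seconds from the four NTP timestamps.

    origin_ts   : probe clock when the request was sent
    receive_ts  : server clock when the request arrived
    transmit_ts : server clock when the reply was sent
    final_ts    : probe clock when the reply arrived
    """
    # Probe clock minus server clock, averaged over both directions.
    # Assumption: this is the negative of the RFC 5905 offset, so a
    # positive value means the probe's clock is ahead.
    offset = ((origin_ts - receive_ts) + (final_ts - transmit_ts)) / 2
    # Time elapsed on the probe minus the server's processing time.
    rtt = (final_ts - origin_ts) - (transmit_ts - receive_ts)
    return offset, rtt


# Timestamps of the first reply in the Appendix A sample result.
offset, rtt = atlas_offset_and_rtt(
    origin_ts=3627199583.5634074211,
    receive_ts=3627199583.5902590752,
    transmit_ts=3627199583.5902853012,
    final_ts=3627199583.5645947456,
)
print(f"offset={offset:.6f} s, rtt={rtt:.6f} s")
```

Because double-precision floats resolve these large timestamps only to about half a microsecond, the computed values agree with the reported 'offset' and 'rtt' fields to within a few microseconds rather than exactly.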

How do RIPE Atlas devices maintain their clocks?

The Test Traffic Measurement Service (TTM) was the predecessor to the RIPE Atlas network. In the TTM network, every node had a local GPS receiver to provide accurate time. RIPE Atlas probes and anchors do not have GPS receivers, for two reasons. First, a receiver would make the regular RIPE Atlas probes bigger and more expensive. Second, receiving a GPS signal in the data centres where anchors are likely to be installed proved difficult: an antenna would have to be mounted on the roof, which is often not possible or practical. As a result, RIPE Atlas uses other means to keep probes' and anchors' clocks synchronised.

Different versions of RIPE Atlas probes maintain their clocks in different ways, and the differences are mainly caused by architectural differences.

There are currently four versions of RIPE Atlas measurement devices:

  • The small, black v1 and v2 probes were custom made. The main difference between the two is that the v1 probes have 8 MB of internal memory and the v2 probes have 16 MB. Both types of probes have a total of 16 MB of flash memory for storage. We no longer distribute these probes, but more than 2,500 of them are still deployed in the network.
  • The current v3 probes are based on TP-LINK TL-MR3020 routers. These probes have 32 MB of internal memory, 4 MB of internal flash and 4 GB of USB flash.
  • Finally, RIPE Atlas anchors are small servers that run the CentOS Linux distribution and the RIPE Atlas probe code as an application on top of that.

How RIPE Atlas anchors maintain their clocks

RIPE Atlas anchors run an NTP daemon that synchronises with pool.ntp.org. The pool.ntp.org project is a big virtual cluster of timeservers providing NTP service for millions of clients. On anchors, the RIPE Atlas code cannot set the time itself, so anchor hosts must allow unrestricted access for the NTP protocol.

How RIPE Atlas probes v1, v2, v3 maintain their clocks

For v1, v2, and v3 probes the situation is a lot more complex. First of all, some of these probes are in an environment where the NTP protocol is blocked or filtered somewhere in the network. Additionally, these probes do not have a battery-backed real-time clock.

The lack of a battery-backed clock requires some tricks to give a probe a sense of time shortly after it boots. The potential lack of access to NTP servers requires a fallback mechanism in case NTP is unavailable.

This fallback is integrated with the code that submits the measurement results. Every three minutes, probes will try to submit their measurement results to the RIPE Atlas infrastructure using an HTTP POST request. The probe then uses the timestamp in the HTTP reply to verify and, if necessary, correct its local clock. Basically, the probe checks whether its local clock is within two seconds of the controller's clock. If not, and the entire HTTP request took less than one second, the probe updates its local clock.
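The decision rule described above can be written as a small function. This is only a sketch of the logic as stated in the text, not the probe's implementation: the function and constant names are hypothetical, and treating a difference of exactly two seconds as "within" is an assumption.

```python
MAX_CLOCK_DIFF_S = 2.0    # allowed difference from the controller's clock
MAX_REQUEST_TIME_S = 1.0  # the HTTP exchange must complete faster than this


def should_update_clock(local_time, controller_time, request_duration):
    """Decide whether the probe should correct its local clock.

    local_time       : probe clock, in seconds
    controller_time  : timestamp taken from the HTTP reply, in seconds
    request_duration : how long the whole HTTP request took, in seconds
    """
    if request_duration >= MAX_REQUEST_TIME_S:
        # A slow exchange makes the reply timestamp too stale to trust.
        return False
    # Only correct the clock when it has drifted beyond the tolerance.
    return abs(local_time - controller_time) > MAX_CLOCK_DIFF_S


print(should_update_clock(1000.0, 1005.0, 0.3))  # drifted, fast request
print(should_update_clock(1000.0, 1001.0, 0.3))  # within tolerance
print(should_update_clock(1000.0, 1005.0, 1.5))  # drifted, slow request
```

Requiring a fast request bounds the error that network delay adds to the controller's timestamp, which is why a slow exchange never triggers a correction.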

This mechanism is the only one the v1 probes use to synchronise their clocks, because the main memory on the v1 probes is too small to run a separate NTP daemon (ntpd). The v2 probes run an NTP implementation called 'ntpclient'. The v3 probes currently use OpenNTPD but will switch to ntpd in the near future.

In addition to keeping the clock of a probe synchronised, we also want to include in the measurement results whether the timestamp is likely to be accurate or not. For example, when a probe boots but does not have Internet access, it cannot set its clock - but the probe will still perform measurements. In that case, when the probe later reports the measurement results, it is important to know that the timestamps might be incorrect.

The way this is implemented is as follows. As described above, when a probe reports measurement results, it compares the timestamp in the HTTP reply with its local clock. If the time difference is less than two seconds and the entire HTTP request took less than one second, the probe records that at that time the clock was in sync.

When a probe generates a measurement result, it includes a field called 'lts', which stands for "last time synchronised". In this field, the probe reports the number of seconds that have passed since the probe last recorded that its clock was in sync. So in the ideal case, the values should never exceed 180, because probes report results every three minutes. A probe uses the special value -1 to indicate that, since it last rebooted, it never found its clock to be in sync. Note that this code runs not only on the v1, v2, and v3 probes, but also on the RIPE Atlas anchors.
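The bookkeeping behind the 'lts' field might look roughly like the following. The class and method names are hypothetical, and using an uptime-style counter for "now" (so that the value does not jump when the wall clock is corrected) is an assumption rather than something the text specifies.

```python
class SyncTracker:
    """Track when the probe last found its clock to be in sync."""

    def __init__(self):
        # Uptime (in seconds) of the last in-sync observation, or None
        # if the clock has not been found in sync since boot.
        self.last_sync_uptime = None

    def record_report(self, uptime, local_time, controller_time, request_duration):
        """Call after each result submission to the infrastructure."""
        in_sync = abs(local_time - controller_time) < 2.0
        fast_enough = request_duration < 1.0
        if in_sync and fast_enough:
            self.last_sync_uptime = uptime

    def lts(self, uptime):
        """Seconds since the clock was last known to be in sync, or -1."""
        if self.last_sync_uptime is None:
            return -1
        return int(uptime - self.last_sync_uptime)


tracker = SyncTracker()
print(tracker.lts(10))                             # never synchronised
tracker.record_report(100, 5000.0, 5000.4, 0.2)    # in sync, fast request
print(tracker.lts(135))
tracker.record_report(280, 5180.0, 5180.1, 1.7)    # request too slow, ignored
print(tracker.lts(290))
```

In this sketch a slow submission does not reset the counter, so 'lts' can grow beyond 180 even on a probe whose clock is actually fine.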

How accurate are the RIPE Atlas probes' clocks?

The primary reason we care about the accuracy of RIPE Atlas probes' clocks is to find out whether we can trust the timestamps in measurement results. If their clocks are inaccurate, it becomes hard to correlate measurement results with other logs or observations.

In our experiment, we asked all probes to target ntp.atlas.ripe.net using NTP. By default, an NTP measurement sends three NTP requests to a specified target. The results of our specific experiment can be found in Measurement #1841877. 7,522 probes were allocated for this measurement, of which 7,274 probes provided results. Of these, 192 probes sent requests to ntp.atlas.ripe.net but did not receive a response.

We then took the median of the three results.
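As an illustration, the median can be taken over the 'offset' values in the 'result' array of a measurement result (see Appendix A). The helper name is hypothetical, and the `{"x": "*"}` entry used here to represent a request without a reply is an assumption about how such attempts appear in the data.

```python
from statistics import median


def median_offset(attempts):
    """Return the median offset of the answered requests, or None."""
    # Skip attempts that received no reply and so carry no offset.
    offsets = [a["offset"] for a in attempts if "offset" in a]
    return median(offsets) if offsets else None


# The three offsets from the sample result in Appendix A.
attempts = [
    {"offset": -0.026271},
    {"offset": -0.026314},
    {"offset": -0.026291},
]
print(median_offset(attempts))

# A probe that never received a reply yields no value.
print(median_offset([{"x": "*"}, {"x": "*"}, {"x": "*"}]))
```

Using the median rather than the mean keeps a single delayed reply from distorting the value reported for a probe.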

Figure 1 below shows the results for all four existing probe categories:

  • RIPE Atlas probes v1 (blue)
  • RIPE Atlas probes v2 (green)
  • RIPE Atlas probes v3 (red)
  • RIPE Atlas anchors (turquoise).

Results that either have an 'lts' value of -1 or give a result larger than 1,000 seconds are grouped under the category 'lts' (purple).

Figure 1: NTP measurements for all RIPE Atlas probes and anchors, sorted by version

In the image above a number of things are noteworthy:

  • Almost all results are within a few seconds of 0
  • The outliers are all tagged as 'lts'
  • There is an unusual distribution for the v3 probes

As a next step, we zoomed in to see what exactly was happening, especially with the v3 probes. You can see the results in Figure 2 below.

 

Figure 2: NTP measurements for all RIPE Atlas probes, zoomed in

RIPE Atlas v1 (blue) and v2 (green) probes show similar behaviour. Some probes are more than two seconds from 0, but not many. Also, v1 and v2 probes show a similar distribution. We can conclude that the NTP client program that runs on the v2 probes has no noticeable effect on clock accuracy, i.e. not running it on the v2 probes is likely to give the same results.

However, v3 probes (red) show an unusual distribution. As you can see in Figure 2 above, there is a thin spike in the middle, and the distribution of the rest of the v3 probes is not symmetric, but skewed a bit to the left.

So, let's zoom in some more on the spike to see what happened with the v3 probes (see Figure 3 below).

Figure 3: NTP measurements for v3 RIPE Atlas probes and anchors, zoomed in

This shows that the spike is a group of 557 v3 probes out of a total of 4,622 v3 probes that have their local clocks within 10 ms of ntp.atlas.ripe.net. In addition, most of the RIPE Atlas anchors (green) are also in that same range. Some v3 probes manage good time synchronisation, and it is not clear why the others don't. Regardless, it's important to note that this just means that some probes are more accurate, not that there is a problem with the less accurate probes.

Most RIPE Atlas anchors also have good time synchronisation. As mentioned above, anchors are synchronised to pool.ntp.org. This means that an asymmetric path between the anchor and ntp.atlas.ripe.net may skew results. The reason is that the computation of the 'offset' value assumes that the path is symmetric. These observations require some more study and we will report the results here on RIPE Labs in a future article.
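To see why an asymmetric path matters, the following sketch simulates one NTP exchange with different one-way delays. The function name and the delay values are made up for illustration; the offset formula uses the same sign convention as the 'offset' field described earlier (probe clock minus server clock).

```python
def measured_offset(true_offset, forward_delay, return_delay, processing=0.0):
    """Simulate one NTP exchange and return the offset a probe would compute.

    true_offset   : actual probe clock minus server clock, in seconds
    forward_delay : one-way delay from probe to server, in seconds
    return_delay  : one-way delay from server to probe, in seconds
    processing    : time the server spends before replying, in seconds
    """
    t1 = 1000.0                               # probe clock at send (arbitrary)
    t2 = t1 - true_offset + forward_delay     # server clock at receive
    t3 = t2 + processing                      # server clock at transmit
    t4 = t3 + return_delay + true_offset      # probe clock at receive
    return ((t1 - t2) + (t4 - t3)) / 2


# Symmetric path: the measured offset matches the true 5 ms offset.
print(measured_offset(0.005, forward_delay=0.020, return_delay=0.020))

# Asymmetric path: 30 ms out and 10 ms back shifts the result by
# (return_delay - forward_delay) / 2, which is -10 ms here.
print(measured_offset(0.005, forward_delay=0.030, return_delay=0.010))
```

In this model the error is half the difference between the two one-way delays, so it can never exceed half the round-trip time.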

NTP measurements available to RIPE Atlas users

We are currently testing NTP measurements within the RIPE Atlas infrastructure and, once these tests are successful, plan to make them available as a new measurement type (in addition to ping, traceroute and DNS measurements) to RIPE Atlas users. These NTP measurements will be available from all RIPE Atlas probes towards the NTP server specified as the target. This will allow RIPE Atlas users to measure the clock offset and round trip time between a collection of probes and a specific target.

This could be interesting for a number of reasons:

  • Network operators can compare various NTP servers and choose the best one for their network.
  • Users can check whether their connection is blocking NTP.
  • NTP server operators can use these measurements to improve their operations.
  • Researchers can explore new ways to measure, for example, the impact of asymmetric routes and network congestion.

We also might want to add other support for NTP measurements in the future, similar to what's available for other measurement types already, including: 

  • A web-based user interface
  • Tools for analysing results, for example to detect asymmetric path delays
  • Visualisations
  • "Status Checks" to enable alerts

The choice and priority of these features will be based on community feedback. Please let us know what you would like us to focus on, and stay tuned for more details.

Conclusion

In general, almost all RIPE Atlas probes are within two seconds of UTC, which means their clocks are in sync with the RIPE Atlas infrastructure. Big outliers that fall outside this range are flagged by their 'lts' value, which signals that the state of the probe's time synchronisation is unknown.

A small group of v3 probes and most anchors are within 10 ms of UTC. We intend to switch to ntpd and pool.ntp.org for the v3 probes in the near future. At the moment it is not clear why the NTP client has no effect on v2 probes and why OpenNTPD works well on only some of the v3 probes.

NTP measurements will also be made available to RIPE Atlas users soon.

Feedback

  • For feature suggestions please use the RIPE Atlas mailing list for active RIPE Atlas hosts and interested users, which is also followed and answered by RIPE Atlas developers: 
  • For more general discussions about measurements and future plans, please use the Measurements Analysis and Tools Working Group Mailing List:  mat-wg [at] ripe [dot] net
  • Join the discussion on Twitter: @RIPE_Atlas

 

Appendix A: Example of an NTP measurement in RIPE Atlas

As an example, you can include the following in the definitions section of a measurement specification:

 {
 	"target": "ntp.atlas.ripe.net",
 	"type": "ntp",
 	"af": "4",
 	"description": "oneoff ntp.atlas.ripe.net ntp 4 10",
 	"is_oneoff": true
 }

This would result in an NTP measurement result like the following:

 {
	"af":4,
	"dst_addr":"193.0.0.229",
	"dst_name":"193.0.0.229",
	"from":"193.0.10.127",
	"fw":4661,
	"group_id":1020238,
	"li":"no",
	"lts":35,
	"mode":"server",
	"msm_id":1020238,
	"msm_name":"Ntp",
	"poll":1,
	"prb_id":100,
	"precision":0.0000019074,
	"proto":"UDP",
	"ref-id":"GPS",
	"ref-ts":3627199554.7434468269,
	"result":[
		{
			"final-ts":3627199583.5645947456,
			"offset":-0.026271,
			"origin-ts":3627199583.5634074211,
			"receive-ts":3627199583.5902590752,
			"rtt":0.001161,
			"transmit-ts":3627199583.5902853012
		},
		{
			"final-ts":3627199583.5661029816,
			"offset":-0.026314,
			"origin-ts":3627199583.5650439262,
			"receive-ts":3627199583.59187603,
			"rtt":0.001035,
			"transmit-ts":3627199583.5918998718
		},
		{
			"final-ts":3627199583.5675640106,
			"offset":-0.026291,
			"origin-ts":3627199583.5665426254,
			"receive-ts":3627199583.5933327675,
			"rtt":0.000998,
			"transmit-ts":3627199583.5933551788
		}],
	"root-delay":0,
	"root-dispersion":0.00137329,
	"src_addr":"193.0.10.127",
	"stratum":1,
	"timestamp":1418210783,
	"type":"ntp",
	"version":4
}

(For more information, see the raw data structure documentation.)
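Note that the '-ts' fields in the result above use the NTP timescale (seconds since 1 January 1900), whereas the 'timestamp' field is a Unix timestamp (seconds since 1 January 1970). The following sketch converts between the two; the helper names are hypothetical, and the conversion is valid only for NTP era 0, which ends in 2036.

```python
from datetime import datetime, timezone

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_EPOCH_DELTA = 2208988800


def ntp_to_unix(ntp_seconds):
    """Convert an NTP timestamp (era 0) to Unix time in seconds."""
    return ntp_seconds - NTP_UNIX_EPOCH_DELTA


def ntp_to_datetime(ntp_seconds):
    """Convert an NTP timestamp (era 0) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ntp_to_unix(ntp_seconds), tz=timezone.utc)


origin_ts = 3627199583.5634074211  # 'origin-ts' of the first reply above
print(int(ntp_to_unix(origin_ts)))
print(ntp_to_datetime(origin_ts).isoformat())
```

The whole-second part of the converted 'origin-ts' equals the 'timestamp' field of the sample result, 1418210783.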

2 Comments

Stéphane Bortzmeyer says:
23 Feb, 2015 04:06 PM
RIPEAtlas.RequestSubmissionError: Status 400, reason "{"error": {"code": 104, "message": "__all__: You are not permitted to perform NTP measurements"}}"

I'm waiting these measurements impatiently...
Philip Homburg says:
24 Feb, 2015 02:14 PM
Creation of NTP measurements through the API is now available for all users. The API documentation will be updated next week.