Releasing RIPE Atlas Measurements Source Code

We've decided to release the RIPE Atlas measurement source code for the first time.

This article introduces the RIPE Atlas measurement source code. The measurement code is the part that runs on the RIPE Atlas probes to schedule and execute particular measurements, and thereby forms the core of the RIPE Atlas system.

Motivation

We have several motivations for releasing the source code:

We've heard from a number of users that they would be interested in looking at the source code for various reasons, including:

A purely technical interest in how RIPE Atlas performs measurements
Assessing the methodology and identifying possible systematic errors
Security audits
Strong principles in favour of using open source

We believe that in the long run this helps openness, transparency and security.

Releasing the code makes it possible for the community to contribute to RIPE Atlas at the core level, and encourages it to do so.

We're releasing this source code in good faith. We trust that the users who explore it will responsibly disclose any technical flaws or security issues they find, and report these to us first so that we can fix them.

If you do find this code useful and want to fork it, build your own measurement system with it, or in any other way reuse it, please be so kind as to notify us. We are extremely interested in all work in this field and we would like to be aware of others building on our work. Please also give proper credit to the RIPE NCC and RIPE Atlas.

If you have any questions or concerns, or find any bugs or security issues, please contact us at atlas [at] ripe [dot] net.

Technical details and some historical background

The measurement code started out as just small modifications to the standard Unix utilities in busybox. Shell scripts would run ping and traceroute, redirect the output to log files, and use the wget tool to post the results to the controlling infrastructure.

Over time this has evolved in two different ways. First, a scheduler was written, based on crond but with capabilities tailored towards RIPE Atlas use cases. This scheduler is called 'perd', the periodic daemon. The main difference from cron is that it takes a start time, a stop time, and an interval in seconds. In addition, it adds jitter to avoid all probes starting a measurement at exactly the same time and to avoid a constant period in a single measurement. Finally, perd calls the measurements as C function calls instead of starting new processes. This is required because of the limitations of the version one and two probes, which do not have an MMU.

The main limitation of perd and the traditional Unix utilities is that you can perform only one measurement at a time. To create some sort of parallelism, probes would run multiple instances of perd.

The next logical step was to rewrite the measurement code using libevent, a library that provides event management and abstracts from the select system call and non-blocking I/O. In addition, the output format of the measurements was changed to JSON. perd evolved into eperd to deal with the libevent adaptation. The new libevent-based measurement code can be found in the subdirectory 'eperd' in the source code.
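To illustrate why an event-driven design removes the one-measurement-at-a-time limitation, here is a minimal sketch. It uses Python's asyncio instead of libevent, and the "measurements" are hypothetical stand-ins that only sleep, so it shows the general idea and not the actual RIPE Atlas code.

```python
import asyncio


async def fake_measurement(name, delay, results):
    # Hypothetical stand-in for a measurement: waiting for a network
    # reply is modelled with a non-blocking sleep.
    await asyncio.sleep(delay)
    results.append(name)


async def run_all():
    results = []
    # All three "measurements" wait concurrently inside one process;
    # the event loop resumes each one when its timer fires.
    await asyncio.gather(
        fake_measurement("ping", 0.2, results),
        fake_measurement("traceroute", 0.3, results),
        fake_measurement("dns", 0.1, results),
    )
    return results


if __name__ == "__main__":
    print(asyncio.run(run_all()))
```

The results arrive in order of completion, not in the order the measurements were started, because none of them blocks the others while waiting.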

The libevent code has also been modified, in order to fix a bug in the DNS stub-resolver. This code is included in this release.

In this article we're not going to describe all utilities that perform various functions on the probe. However, two are worth mentioning, because they contain tweaks in order to support RIPE Atlas functions. The first one is the telnet daemon, which has been modified to accept commands from the controller to start and stop measurements. The second one is eooqd, a daemon that runs the one-off measurements.

Getting the source code

We plan to release the source code as a tar archive corresponding to firmware versions, starting with the current one, version 4520.

Building and installing

The procedure for building and installing is described in the file 'INSTALL' in the source code. Basically, it involves compiling and installing libevent and busybox.

This code is known to compile and run on Debian Wheezy and CentOS 6.3. It will probably compile on any modern Linux version.

Running

In this section we'll describe only the actual measurement code. The many other utilities that are needed for the internal operation of RIPE Atlas are beyond the scope of this release.

The command names of the main measurement utilities are evping, evtraceroute, evtdig, evhttpget, and sslgetcert.

These commands should either run as the root user, or with the capability 'cap_net_raw'. In RIPE Atlas, these utilities are invoked by two schedulers: perd and eperd.

Below we describe the commands and the options they take, one by one. All commands support both IPv4 and IPv6. The output of the measurements is mostly documented in the raw data structure documentation. The only differences are a few fields that are not present in the measurement output, but are inserted by the controlling infrastructure.

One thing worth mentioning is that the probe tries to protect itself from a buggy (or malicious) controller. For this reason, the utilities that accept commands from a controller carefully check whether, for example, file names are safe to use.
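The article does not spell out which checks are performed, so the following is only a sketch of what such a file name check could look like. The allowed directory (taken from the examples in this article), the character whitelist, and the function name are all assumptions made for illustration; the real utilities are written in C and may apply different rules.

```python
import posixpath
import re

# Assumption: output files must live under this directory, which is the
# one used in the examples in this article.
ALLOWED_PREFIX = "/home/atlas/data/"

# Assumption: only a conservative set of characters is accepted.
SAFE_CHARS = re.compile(r"[A-Za-z0-9._/-]+")


def is_safe_output_path(path, allowed_prefix=ALLOWED_PREFIX):
    """Return True if 'path' looks safe to use as an output file name."""
    if not SAFE_CHARS.fullmatch(path):
        return False          # empty, or contains spaces, ';', quotes, ...
    if not path.startswith("/"):
        return False          # relative paths depend on the working directory
    if ".." in path.split("/"):
        return False          # no climbing out of the allowed directory
    # Collapse duplicate slashes and '.' components before the prefix test.
    return posixpath.normpath(path).startswith(allowed_prefix)


if __name__ == "__main__":
    for candidate in ("/home/atlas/data/new/7", "/etc/passwd"):
        print(candidate, is_safe_output_path(candidate))
```

A whitelist is used here because it fails closed: anything the check does not explicitly recognise is rejected.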

Functions

evping

This is an implementation of ICMP ping based on libevent.

Option	Description
-4	Use IPv4 (default)
-6	Use IPv6
-A <atlas id>	Atlas ID string (measurement ID)
-O <file name>	Name of output file
-s <size>	Size of probe packet
-c <n packets>	Number of packets to send (default 3)

Example:

 
  evping -4 -c 3 -A "1001" -O /home/atlas/data/new/7 193.0.14.129

evtraceroute

This is a libevent-based implementation of traceroute that supports ICMP and UDP probes, with or without Paris-traceroute mode.

Option	Description
-4	Use IPv4 (default)
-6	Use IPv6
-I	Use ICMP
-U	Use UDP (default)
-F	Don't fragment outgoing packets, for path MTU discovery
-a <paris modulus>	Enables Paris-traceroute, which tries to keep all traceroute packets on the same path across load balancers. The modulus specifies the number of paths to try. Zero disables it (default 16)
-c <n packets>	Number of packets to send per hop (default 3)
-f <first hop>	First hop to probe (default 1)
-g <gap limit>	Give up after getting no response for this many hops (default 5)
-m <max hops>	Maximum number of hops to probe (default 32)
-w <timeout>	Time (in milliseconds) to wait for a response to come in (default 1000)
-z <timeout>	Time (in milliseconds) to wait for a duplicate response to come in (default 10)
-A <atlas id>	Atlas ID string (measurement ID)
-O <file name>	Name of output file
-S <size>	Size of probe packet. Note that for ICMP this includes the ICMP header but for UDP it excludes the UDP header (default 40 bytes)

Example:

 
  evtraceroute -4 -c 3 -U -w 1000 -A "5001" -O /home/atlas/data/new/7 193.0.14.129

evtdig

This is a DNS client implementation based on libevent.

Option	Description
--resolv	Send the query to the local name servers. When using this option there should be no 'server' argument.
--p_probe_id	Prepend '<probeid>.<time>.' to the query name of class IN queries so that each probe's query is unique
-4 | -6	IPv4 | IPv6
-O <file name>	output file name with full path
-t	Make a TCP query (default is UDP)
-e <size>	EDNS UDP buffer size. Default 512. Max 4096
-d	enable DO bit. Default is off
-n	query NSID. Default is off
--noabuf	don't append answer buffer/payload to results
--qbuf	append query buffer/payload to results
-A <atlas id>	Atlas ID string (measurement ID)
-R	Recursion desired.

Class "IN" Options
--a
--ns
--cname
--ptr
--rrsig
--dnskey
--mx
--txt
--ds
--aaaa
--any
--soa

Class "CHAOS" Options	Description
-b	version.bind
-h	hostname.bind
-r	version.server
-i	id.server

Example:

 evtdig -4 --soa . -A "10001" -O /home/atlas/data/new/7 193.0.14.129
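The effect of the --p_probe_id option listed above can be sketched as follows. The function name is hypothetical, and using Unix epoch seconds for the '<time>' part is an assumption, since the article does not specify the time format.

```python
def unique_qname(qname, probe_id, timestamp):
    """Prepend '<probeid>.<time>.' to a query name.

    Assumption: 'timestamp' is given in Unix epoch seconds.
    """
    prefix = f"{probe_id}.{int(timestamp)}."
    # The root name '.' has no labels of its own to append.
    return prefix if qname == "." else prefix + qname


if __name__ == "__main__":
    print(unique_qname("example.com", 1234, 1363873067))
```

Because the resulting name differs per probe and per point in time, a resolver is unlikely to have it cached already, so the query normally has to be resolved afresh.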

evhttpget

This is an HTTP client based on libevent.

Option	Description
-a or --all	Report on all addresses returned by resolving the host name in URL
-c or --combine	Combine the reports for all addresses in one JSON result. Otherwise, each result is a separate JSON.
--get	GET method
--head	HEAD method
--post	POST method
--post-file <filename>	File to post
--post-header <filename>	File to post (comes first)
--post-footer <filename>	File to post (comes last)
--store-headers <bytes>	Number of bytes of the header to report
--user-agent <string>	User agent header
-0	HTTP/1.0
-1	HTTP/1.1
-4	Only IPv4 addresses
-6	Only IPv6 addresses
-A <atlas id>	Atlas ID string (measurement ID)
-O <filename>	Output file name

Example:

 
  evhttpget -4 -1 -A "12023" -O /home/atlas/data/new/7 http://www.ripe.net/favicon.ico

sslgetcert

This is a utility that just gets the certificate of an SSL server; it doesn't actually set up an SSL connection.

Option	Description
-4	Only IPv4
-6	Only IPv6
-A <atlas id>	Atlas ID string (measurement ID)
-p	port (default 443)

Example:

 sslgetcert -4 -A "14001" www.ripe.net

eperd

This is a cron-like utility that can run measurements at regular intervals. It is designed to work with libevent-based measurements. The older version, perd, is for measurements that are not based on libevent.

The eperd utility is derived from crond and is still strongly influenced by it. Its input is a 'crontab' that has a somewhat different syntax. Commands are stored in a file called 'root' because they are supposed to be run as root.

An example entry is:

 
  60 1363873067 1366466607 UNIFORM 14 evtdig -4 -h --evdns -A "1015529" -O /home/atlas/data/new/7 193.0.14.129

The first number (60) is the interval in seconds.

The second (1363873067) is the start time in Unix epoch seconds. This also specifies the offset (or phase) in the interval. In this example, the command is supposed to run when the current time modulo 60 is equal to 1363873067 modulo 60. The third number (1366466607) is the end time of the measurement. The next two fields, 'UNIFORM 14', specify that the actual time the command is run should get a jitter in the range of plus or minus seven seconds. The rest of the line is the command to execute.

Note that command execution is completely internal; no separate process will be created.
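The entry format described above can be made concrete with a short sketch that parses the example line and computes when the command would nominally run. This is an illustration in Python, not the eperd implementation (which is C code derived from crond). The names Entry, parse_entry, next_nominal_time and jittered_time are hypothetical, and treating the end time as inclusive is an assumption.

```python
import random
import shlex
from dataclasses import dataclass


@dataclass
class Entry:
    interval: int      # seconds between runs
    start: int         # Unix epoch seconds; also fixes the phase
    end: int           # Unix epoch seconds
    distribution: str  # e.g. 'UNIFORM'
    jitter: int        # total jitter width in seconds
    command: list      # command and its arguments


def parse_entry(line):
    fields = line.split(None, 5)
    interval, start, end = (int(f) for f in fields[:3])
    return Entry(interval, start, end, fields[3], int(fields[4]),
                 shlex.split(fields[5]))


def next_nominal_time(entry, now):
    """First time t >= now with t % interval == start % interval.

    Returns None once the end time has passed (assumed inclusive).
    """
    if now <= entry.start:
        t = entry.start
    else:
        periods = -(-(now - entry.start) // entry.interval)  # ceiling
        t = entry.start + periods * entry.interval
    return t if t <= entry.end else None


def jittered_time(entry, nominal, rng=random):
    # 'UNIFORM 14' is read as a uniform offset of plus or minus 7 seconds.
    half = entry.jitter / 2
    return nominal + rng.uniform(-half, half)


if __name__ == "__main__":
    line = ('60 1363873067 1366466607 UNIFORM 14 evtdig -4 -h --evdns '
            '-A "1015529" -O /home/atlas/data/new/7 193.0.14.129')
    entry = parse_entry(line)
    nominal = next_nominal_time(entry, 1363873100)
    print(nominal, jittered_time(entry, nominal))
```

Anchoring every run to the start time, and not to the previous run, keeps the phase stable: the jitter on one run does not shift the ones that follow.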

Some of the options accepted by eperd are:

Option	Description
-c <directory>	directory where the 'crontab' is
-P <file>	file to store the pid
-O <filename>	output of eperd
-f	run in the foreground

For more options see the busybox documentation for crond.

By default, eperd forks and runs in the background.

Example:

 eperd -c /home/atlas/crons/7 -A 9807 -P /home/atlas/status/perd-7.pid.vol -O /home/atlas/data/new/7

perd

This command is the precursor to eperd and is very similar to it. perd is used for sslgetcert because that application is not yet based on libevent. In addition, perd runs the command that submits results (httppost).

Comments 6


Stephan Mueller • 04 Jan 2014 02:07

Amazing. Thanks for keeping up the spirit of Open Source! The net is open source, so should be it's tools!

Jonathan • 12 Mar 2015 15:14

Thanks, One question : What's the different between ping and evping / traceroute and evtraceroute ?

Philip Homburg • 12 Mar 2015 15:39

Evping and evtraceroute are re-implementations of the ping and traceroute functionality tailored for use in RIPE Atlas. The main differences are the use of libevent, which allows many simultaneous measurements within a single process, and output in JSON.

Thomas Rohwer • 27 Apr 2017 07:30

Hello, thanks for the work and publishing the software. This is quite useful and much better performing than forking standard programs. I noticed one problem with the evhttpget and sslgetcert tasks: The sockets they use for the TCP connect seems to be blocking (on connect). That leads to lots of timeouts, if you have many tasks, some of whom block for a long time. I traced this to the following in tcputil.c (around line 250): fd= socket(af, SOCK_STREAM, 0); When replacing this by fd= socket(af, SOCK_STREAM | O_NONBLOCK, 0); the frequent timeouts are gone.

Philip Homburg • 08 May 2017 11:40

Hi Thomas, I'm happy that our code is useful for you. And thanks for reporting this. I assumed that at the point of the connect, the socket would already be set to non-blocking. I'll take a look what is going on. Philip

Philip Homburg • 09 May 2017 14:02

Looking at the code, the common case where no interface is specified works fine because libevent makes the socket non-blocking. In the case where an interface is specified there is indeed a bug. I created a ticket to fix the bug.

Author: Philip Homburg