Releasing RIPE Atlas Measurements Source Code

Philip Homburg — May 06, 2013 04:35 PM
We've decided to release the RIPE Atlas measurement source code for the first time.

This article introduces the RIPE Atlas measurement source code. The measurement code is the part that runs on the RIPE Atlas probes to schedule and execute particular measurements, and thereby forms the core of the RIPE Atlas system.

Motivation

We have several motivations for releasing the source code:

  • We've heard from a number of users that they would be interested in looking at the source code for various reasons, including:
    • A purely technical interest in how RIPE Atlas performs measurements
    • Assessing the methodology and identifying possible systematic errors
    • Security audits
    • Strong principles in favour of using open source
  • We believe that in the long run this helps openness, transparency and security
  • Doing so makes it possible - and encourages the community - to contribute to RIPE Atlas at the core level

    We're releasing this source code in good faith. We trust that the users who explore it will responsibly disclose any technical flaws or security issues they find, and report these to us first so that we can fix them.

    If you do find this code useful and want to fork it, build your own measurement system with it, or in any other way reuse it, please be so kind as to notify us. We are extremely interested in all work in this field and we would like to be aware of others building on our work. Please also give proper credit to the RIPE NCC and RIPE Atlas.

    If you have any questions or concerns, or find any bugs or security issues, please contact us at atlas-bugs [at] ripe [dot] net.

    Technical details and some historical background

    The measurement code started out as just small modifications to the standard Unix utilities in busybox. Shell scripts would run ping and traceroute, redirect the output to log files, and use the wget tool to post the results to the controlling infrastructure.

    Over time this has evolved in two different ways. First, a scheduler was written, based on crond but with capabilities tailored towards RIPE Atlas use cases. This scheduler is called 'perd', the periodic daemon. The main difference from cron is that it takes a start and stop time, and an interval (frequency) in seconds. In addition, it adds jitter to avoid all probes starting a measurement at exactly the same time and to avoid a constant period in a single measurement. Finally, perd calls the measurements as C function calls instead of starting new processes. This is required because of the limitations of the version one and two probes, which do not have an MMU.

    The main limitation of perd and the traditional Unix utilities is that you can perform only one measurement at a time. To create some sort of parallelism, probes would run multiple instances of perd.

    The next logical step was to rewrite the measurement code using libevent, a library that provides event management and abstracts from the select system call and non-blocking I/O. In addition, the output format of the measurements was changed to JSON. perd evolved into eperd to deal with the libevent adaptation. The new libevent-based measurement code can be found in the subdirectory 'eperd' in the source code.
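To illustrate the event-driven style that libevent makes possible, here is a minimal Python sketch that uses the standard `selectors` module as a stand-in for libevent. It is not the eperd code: the socket pairs only simulate replies arriving for several measurements, and the names `run_event_loop` and `demo` are made up for this illustration.

```python
import selectors
import socket


def run_event_loop(pending):
    """Collect one reply per simulated measurement in a single thread.

    `pending` maps a measurement name to the readable end of a socket pair.
    """
    sel = selectors.DefaultSelector()
    for name, sock in pending.items():
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ, data=name)

    results = {}
    while len(results) < len(pending):
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing arrived in time; give up on the remaining replies
        # Handle every socket that became readable, in whatever order.
        for key, _mask in events:
            results[key.data] = key.fileobj.recv(1024).decode()
            sel.unregister(key.fileobj)
    sel.close()
    return results


def demo():
    """Simulate three measurements whose replies arrive out of order."""
    names = ["ping", "traceroute", "dns"]
    pairs = {name: socket.socketpair() for name in names}
    for name in reversed(names):
        pairs[name][1].sendall(f"reply for {name}".encode())
    results = run_event_loop({name: pair[0] for name, pair in pairs.items()})
    for reader, writer in pairs.values():
        reader.close()
        writer.close()
    return results
```

A single loop waits on all sockets at once, so no measurement blocks another and no extra processes are needed, which is the property that matters on probes without an MMU.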

    The libevent code has also been modified, in order to fix a bug in the DNS stub-resolver. This code is included in this release.

    In this article we're not going to describe all utilities that perform various functions on the probe. However, two are worth mentioning, because they contain tweaks in order to support RIPE Atlas functions. The first one is the telnet daemon, which has been modified to accept commands from the controller to start and stop measurements. The second one is eooqd, a daemon that runs the one-off measurements.

    Getting the source code

    We plan to release the source code as a tar archive corresponding to firmware versions, starting with the current one, version 4520.

    Building and installing

    The procedure for building and installing is described in the file 'INSTALL' in the source code. Basically, it involves compiling and installing libevent and busybox.

    This code is known to compile and run on Debian Wheezy and CentOS 6.3. It will probably compile on any modern Linux version.

    Running

    In this section we'll describe only the actual measurement code. The many other utilities that are needed for the internal operation of RIPE Atlas are beyond the scope of this release.

    The command names of the main measurement utilities are evping, evtraceroute, evtdig, evhttpget, and sslgetcert.

    These commands should either run as the root user, or with the capability 'cap_net_raw'. In RIPE Atlas, these utilities are invoked by two schedulers: perd and eperd.

    Below we describe the commands and the options they take, one by one. All commands support both IPv4 and IPv6. The output of the measurements is mostly documented in the raw data structure documentation. The difference is that a few fields described there are not present in the measurement output, because they are inserted later by the controlling infrastructure.

    One thing worth mentioning is that the probe tries to protect itself from a buggy (or malicious) controller. For this reason, the utilities that accept commands from a controller carefully check whether, for example, file names are safe to use.
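As a rough idea of what such a check could look like, here is a hypothetical Python sketch. The real checks are in the released C source and may differ. The allowed directory `/home/atlas/data` is an assumption derived from the example paths in this article, and `is_safe_path` is a name invented for this illustration.

```python
import posixpath

# Assumption for illustration: output files must live below this directory.
ALLOWED_DIR = "/home/atlas/data"


def is_safe_path(path, allowed_dir=ALLOWED_DIR):
    """Return True if `path` is absolute and stays inside `allowed_dir`."""
    if not path.startswith("/") or "\0" in path:
        return False
    # Collapse "." and ".." components without touching the file system.
    normalized = posixpath.normpath(path)
    return normalized.startswith(allowed_dir.rstrip("/") + "/")
```

With this sketch, `/home/atlas/data/new/7` is accepted, while a relative name or a path that escapes through `..` is rejected. Symbolic links are not resolved here, so a production check would need to consider them as well.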

    Functions

    evping

    This is an implementation of ICMP ping based on libevent.

    Option            Description
    -4                Use IPv4 (default)
    -6                Use IPv6
    -A <atlas id>     Atlas ID string (measurement ID)
    -O <file name>    Name of output file
    -s <size>         Size of probe packet
    -c <n packets>    Number of packets to send (default 3)

    Example:

    evping -4 -c 3 -A "1001" -O /home/atlas/data/new/7 193.0.14.129
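If you want to drive evping from a script, a small helper can assemble the command line from the options listed above. This is only a sketch: `build_evping_args` is a hypothetical name, and it uses nothing beyond the flags documented in this article.

```python
def build_evping_args(target, atlas_id, output_file, ipv6=False, count=3, size=None):
    """Assemble an evping command line from the documented options."""
    args = ["evping", "-6" if ipv6 else "-4", "-c", str(count)]
    if size is not None:
        args += ["-s", str(size)]
    args += ["-A", str(atlas_id), "-O", output_file, target]
    return args
```

The resulting list could be passed to `subprocess.run`, assuming an evping executable is on the PATH and the process runs as root or with 'cap_net_raw'.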

    evtraceroute

    This is a libevent-based implementation of traceroute that can send ICMP or UDP probes, with or without the "Paris" technique.

    Option                Description
    -4                    Use IPv4 (default)
    -6                    Use IPv6
    -I                    Use ICMP
    -U                    Use UDP (default)
    -F                    Don't fragment outgoing packets, for path MTU discovery
    -a <paris modulus>    Enables Paris traceroute, which tries to make the traceroute packets take the same route across load balancers. The modulus specifies the number of paths to try. Zero disables it (default 16)
    -c <n packets>        Number of packets to send per hop (default 3)
    -f <first hop>        First hop to probe (default 1)
    -g <gap limit>        Give up after getting no response for this many hops (default 5)
    -m <max hops>         Maximum number of hops to probe (default 32)
    -w <timeout>          Time (in milliseconds) to wait for a response to come in (default 1000)
    -z <timeout>          Time (in milliseconds) to wait for a duplicate response to come in (default 10)
    -A <atlas id>         Atlas ID string (measurement ID)
    -O <file name>        Name of output file
    -S <size>             Size of probe packet. Note that for ICMP this includes the ICMP header but for UDP it excludes the UDP header (default 40 bytes)
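The size convention for -S can be made concrete with a little arithmetic. The sketch below assumes that -S does not count the IP header and that no IPv4 options or IPv6 extension headers are present. These assumptions are ours, not statements from the source code. The header sizes are the standard ones: 20 bytes for IPv4, 40 for IPv6, and 8 each for ICMP echo and UDP.

```python
IP_HEADER = {4: 20, 6: 40}  # bytes, without IPv4 options or IPv6 extension headers
ICMP_HEADER = 8             # ICMP and ICMPv6 echo header
UDP_HEADER = 8


def ip_packet_size(size, protocol, ip_version=4):
    """Total IP packet length for a given -S value.

    For ICMP, `size` already includes the ICMP header; for UDP it does
    not include the UDP header.
    """
    if protocol == "icmp":
        if size < ICMP_HEADER:
            raise ValueError("ICMP size must be at least the ICMP header size")
        return IP_HEADER[ip_version] + size
    if protocol == "udp":
        return IP_HEADER[ip_version] + UDP_HEADER + size
    raise ValueError("protocol must be 'icmp' or 'udp'")
```

Under these assumptions the default of 40 bytes gives a 60-byte IPv4 packet for ICMP but a 68-byte IPv4 packet for UDP, so the two modes are not directly comparable at the same -S value.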

    Example:

    evtraceroute -4 -c 3 -U -w 1000 -A "5001" -O /home/atlas/data/new/7 193.0.14.129

    evtdig

    This is a DNS client implementation based on libevent.

    Option            Description
    --resolv          Send the query to local name servers. When using this option there should be no 'server' argument.
    --p_probe_id      Prepend '<probeid>.<time>.' to the class IN query name to make the query unique per probe
    -4 | -6           Use IPv4 | IPv6
    -O <file name>    Output file name with full path
    -t                Make a TCP query (default is UDP)
    -e <size>         EDNS UDP buffer size. Default 512. Max 4096
    -d                Enable DO bit. Default is off
    -n                Query NSID. Default is off
    --noabuf          Don't append answer buffer/payload to results
    --qbuf            Append query buffer/payload to results
    -A <atlas id>     Atlas ID string (measurement ID)
    -R                Recursion desired

    Class "IN" Options
    --a
    --ns
    --cname
    --ptr
    --rrsig
    --dnskey
    --mx
    --txt
    --ds
    --aaaa
    --any
    --soa

    Class "CHAOS" Options
    Option    Description
    -b        version.bind
    -h        hostname.bind
    -r        version.server
    -i        id.server

    Example:

    evtdig -4 --soa . -A "10001" -O /home/atlas/data/new/7 193.0.14.129

    evhttpget

    This is an HTTP client based on libevent.

    Option                      Description
    -a or --all                 Report on all addresses returned by resolving the host name in the URL
    -c or --combine             Combine the reports for all addresses in one JSON result. Otherwise, each result is a separate JSON.
    --get                       GET method
    --head                      HEAD method
    --post                      POST method
    --post-file <filename>      File to post
    --post-header <filename>    File to post (comes first)
    --post-footer <filename>    File to post (comes last)
    --store-headers <bytes>     Number of bytes of the header to report
    --user-agent <string>       User agent header
    -0                          HTTP/1.0
    -1                          HTTP/1.1
    -4                          Only IPv4 addresses
    -6                          Only IPv6 addresses
    -A <atlas id>               Atlas ID string (measurement ID)
    -O <filename>               Output file name

    Example:

    evhttpget -4 -1 -A "12023" -O /home/atlas/data/new/7 http://www.ripe.net/favicon.ico

    sslgetcert

    This is a utility that just gets the certificate of an SSL server; it doesn't actually set up an SSL connection.

    Option           Description
    -4               Only IPv4
    -6               Only IPv6
    -A <atlas id>    Atlas ID string (measurement ID)
    -p <port>        Port (default 443)

    Example:

    sslgetcert -4 -A "14001" www.ripe.net

    eperd

    This is a cron-like utility that can run measurements at regular intervals. It is designed to work with libevent-based measurements. The older version, perd, is for measurements that are not based on libevent.

    The eperd utility is derived from crond and is still strongly influenced by it. Its input is a 'crontab' that has a somewhat different syntax. Commands are stored in a file called 'root' because they are supposed to be run as root.

    An example entry is:

    60 1363873067 1366466607 UNIFORM 14 evtdig -4 -h --evdns -A "1015529" -O /home/atlas/data/new/7 193.0.14.129

    The first number (60) is the interval in seconds.

    The second (1363873067) is the start time in Unix epoch seconds. This also specifies the offset (or phase) in the interval. In this example, the command is supposed to run when the current time modulo 60 is equal to 1363873067 modulo 60. The third number is the end time of the measurement. The next two fields, 'UNIFORM 14', specify that the actual time the command is run should get a jitter in the range of plus or minus seven seconds. The rest of the line is the command to execute.

    Note that command execution is completely internal; no separate process will be created.
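The scheduling rule described above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the eperd implementation. The names `Entry`, `parse_entry`, `next_nominal_time` and `jittered_time` are invented here. The sketch also assumes that the end time is inclusive, that nothing runs before the start time, and that shell-style quoting applies to the command part.

```python
import random
import shlex
from dataclasses import dataclass


@dataclass
class Entry:
    interval: int      # seconds between runs
    start: int         # Unix time; also fixes the phase within the interval
    end: int           # Unix time after which the entry no longer runs
    distribution: str  # e.g. "UNIFORM"
    jitter: int        # total width of the jitter window in seconds
    command: list      # command name followed by its arguments


def parse_entry(line):
    """Split one crontab line into its five scheduling fields and the command."""
    interval, start, end, dist, jitter, *command = shlex.split(line)
    return Entry(int(interval), int(start), int(end), dist, int(jitter), command)


def next_nominal_time(entry, now):
    """First time t >= now with t % interval == start % interval, or None if past end."""
    if now <= entry.start:
        t = entry.start
    else:
        elapsed = now - entry.start
        # Round the elapsed time up to a whole number of intervals.
        t = entry.start + -(-elapsed // entry.interval) * entry.interval
    return t if t <= entry.end else None


def jittered_time(entry, nominal, rng=random):
    """Shift the nominal time by up to half the jitter width in either direction."""
    half = entry.jitter / 2
    return nominal + rng.uniform(-half, half)
```

For the example entry, a probe checking at time 1363873100 would get 1363873127 as the next nominal run time, which has the same remainder modulo 60 as the start time. The actual run would then fall within seven seconds of that value.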

    Some of the options accepted by eperd are:

    Option            Description
    -c <directory>    Directory where the 'crontab' is
    -P <file>         File to store the pid
    -O <filename>     Output of eperd
    -f                Run in the foreground

    For more options see the busybox documentation for crond.

    By default, eperd forks and runs in the background.

    Example:

    eperd -c /home/atlas/crons/7 -A 9807 -P /home/atlas/status/perd-7.pid.vol -O /home/atlas/data/new/7

    perd

    This command is the precursor to eperd and is very similar to it. Perd is used for sslgetcert because that application is not yet based on libevent. In addition, perd also runs the command that submits results (httppost).
