A few months ago we started offering daily archives of public RIPE Atlas measurements. We're happy to announce that this service is here to stay!
Background
RIPE Atlas measurement data is always available via the API, but many distinct measurements run every single day and aggregating the results manually can be a pain. We collect a rich set of measurement data and, as good citizens, we don't want to generate more measurement traffic than is necessary. It's important that we make this data as easy to retrieve as possible.
And you all want a ton of data, right? Right.
Last year, we prototyped a service to expose this data for around one month from the day of collection. The feedback has all been positive, so we've worked on making it more permanent.
Minor Updates
In making this more permanent, we're also making some minor modifications that you should take into account.
First, the new location for the data:
https://data-store.ripe.net/datasets/atlas-daily-dumps
Second, the directory structure and file naming have changed. We're now bundling all measurements of a particular type into one archive; that means that each bzip2 archive will, for example, contain all ping measurements in the given time window. Previously, those measurements were spread across multiple archives for IPv6, IPv4, built-in, and user-defined measurements.
All of those subsets still exist within the data. If you specifically want them, you can recover them as follows:
- To split out IPv6 from IPv4, use the 'af' field. Most of the results in the files are tagged with it, except in certain error conditions where a measurement was attempted but did not generate any results.
- To separate built-in from user-defined measurements, use the msm_id field. Built-in measurements always have a msm_id value less than 1,000,000.
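As a rough illustration of the two rules above, here is a minimal Python sketch that streams one archive and counts results per subset. It assumes each line of the decompressed file is a single JSON result object, which is not stated in this article; the function names and the "unknown" bucket for results without an 'af' field are ours, not part of the service.

```python
import bz2
import json

# From the article: built-in measurements have msm_id below 1,000,000.
BUILTIN_MSM_ID_LIMIT = 1_000_000


def classify(result):
    """Return (address_family, origin) for one result dict."""
    # 'af' may be missing in some error conditions, so do not assume it exists.
    family = {4: "ipv4", 6: "ipv6"}.get(result.get("af"), "unknown")
    origin = "builtin" if result["msm_id"] < BUILTIN_MSM_ID_LIMIT else "user-defined"
    return family, origin


def count_subsets(path):
    """Stream a bzip2 archive and count results per (family, origin) subset.

    Assumption: one JSON result object per line after decompression.
    """
    counts = {}
    with bz2.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            key = classify(json.loads(line))
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Reading the file line by line keeps memory use flat, which matters because a single archive can expand to several gigabytes.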
The other difference reflected in the filenames is that we are now generating archives per hour rather than per day. This makes the file sizes a little more manageable for download, and may help with sampling the data.
Finally, the archive will continue to hold data for about one month from the date of collection.
Please update your scripts; the old archive at https://ftp.ripe.net/ripe/atlas will be removed on 15 March 2018.
Be aware that the release of this data falls under the regular RIPE NCC Terms of Service. If you spot any glitches or have any problems using the service, please contact us at atlas@ripe.net!
Comments 4
Alexander •
So, historic data older than one month will be lost (or made unavailable publicly) forever?
Stephen Strowes •
Hi Alexander! Measurement data -- all of it -- is always available via the API. The archives discussed here are a convenience method, offering data from recent public measurements in bulk form. In the future, we may modify this service to hold data for longer or to provide samples of older data, in the same location. If so, we'll be sure to announce it.
Chenghui Hong •
Hi, I just downloaded an hourly .bz2 file that is about 7.9 GB after extraction. Yet when I try "GET /api/v2/measurements/traceroute/?start_time__gte=1519171404&stop_time__gte=1519178604", which should list the traceroutes happening within those 2 hours, and follow the result link, I get only about 500 MB of JSON. So what's missing here?
Stephen Strowes •
Hey! The difference is that the .bz2 is a dump of measurement results, but /api/v2/measurements/traceroute/ gives you measurement metadata. The json returned from the API call you provided includes 'result' fields, each of which refers to the measurement IDs with the actual results. If you want, you can run through those pointers and pull down the measurement results manually to get at the result data we're putting into the .bz2 archives. Note also that the output of that API call is paginated; use option "page_size=500" (the maximum) to reduce the number of pages you have to loop over. Hope that helps!
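For readers who want to script the pagination described in the reply above, here is a minimal Python sketch. It assumes the API is served from atlas.ripe.net and that each page is a JSON object with a "results" list and a "next" URL that is null on the last page; neither detail is stated in this thread, so check them against the API documentation. The names iter_measurements and fetch_json are hypothetical helpers.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://atlas.ripe.net"  # assumed host, not given in the thread


def fetch_json(url):
    """Fetch one URL and decode the JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)


def iter_measurements(path, params, fetch=fetch_json):
    """Yield measurement metadata entries across all pages.

    Assumption: each page looks like {"results": [...], "next": url-or-null}.
    """
    # page_size=500 is the maximum mentioned in the reply above.
    query = dict(params, page_size=500)
    url = f"{API_BASE}{path}?{urlencode(query)}"
    while url:
        page = fetch(url)
        yield from page["results"]
        url = page.get("next")
```

Something like iter_measurements("/api/v2/measurements/traceroute/", {"start_time__gte": 1519171404}) would then walk every page. Each yielded entry is metadata only; the actual results still have to be fetched through that entry's 'result' field.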