Mirjam Kühne

RIPE Data Repository - A 100 TB Portal of Easy-to-Access Datasets

The RIPE NCC has launched the RIPE Data Repository, a ‘portal’ that hosts a collection of data sets useful for scientific and operational Internet research. The repository can hold up to 100 TB of diverse data sets, some going back as far as 1993.

The RIPE Data Repository makes it easy for researchers and operators in the RIPE community to share data more efficiently, acting as a catalyst for useful research.

The project took root after Tony McGregor, a scientist from the University of Waikato in New Zealand, approached the RIPE NCC Science Group with the idea.

"Many interesting Internet-related data sets are lost to the Internet community when research grants expire and researchers can no longer afford to host them," explains Daniel Karrenberg, Chief Scientist for the RIPE NCC. "The RIPE Data Repository is a natural home for such orphaned data sets. Our community is interested in data sets with IP addresses or ASNs in them, and primarily those collected consistently over longer time periods."

Tony McGregor describes the benefits: "The data sets in our WITS repository have received a lot of attention over the years but housing them in the antipodes has created some access limits. For example, our data servers are only available via IPv6. RIPE NCC's efforts to store this data in a second location will improve accessibility to the data for the RIPE region and provide much appreciated redundancy."

Easy to Access Data Sets

Before the RIPE Data Repository, each data set had its own access mechanisms and requirements. Now, the only requirements for accessing all the data sets are a RIPE Labs account and agreement to the RIPE Data Repository Terms and Conditions. Note that data sets with specific privacy concerns may require an additional agreement in the future.

Bringing the data together and making it accessible in the same way has the potential to increase use and reduce misuse, improve attribution of credit and reduce management costs.

From NLANR PMA to IPv6...and more

The RIPE Data Repository currently has five data sets: the Waikato Internet Traffic Storage (WITS) passive data sets, NLANR PMA data, the Routing Information Service (RIS) raw data set, Reverse DNS delegations, and IPv6 web stats, with more to be added in the next few weeks. Please check out the RIPE Data Repository and keep checking back!

About the author

Mirjam Kühne Based in Amsterdam, The Netherlands

I wrote the articles collected here during my time as community builder of the RIPE NCC and the maintainer and editor of RIPE Labs. I have since taken on a new role serving as the Chair of the RIPE Community. You can reach my new profile via the website link below.