INRDB is a non-conventional database hosting many different number-resource-related data sets.
Inside the RIPE NCC Science Group cauldron we've been cooking something really tasty for a while now - the ultimate number resource database. Inside it one can find many historical data sets from the RIPE NCC and other entities. We've already been using this database for internal purposes such as the Registration Data Quality measurements, and from now on you too can see more and more services based on this technology. Let us give you a demonstration.
INRDB tries to preserve the full history of all the data sets it hosts, which come from different sources, including the RIPE NCC.
In our prototype system we currently store and serve:
- The full history of RIS table dumps, in three different variations
- All the "delegated" files ("stats" files) from all RIRs since December 2003
- Historical versions of the IANA assignments/allocations pages
- A sanitized version of the RIPE Database
- Geolocation information from MaxMind
- Blacklists/spamlists from various sources
- AS relations information and active traceroute measurements data from CAIDA
- Various NCC internal databases
These data sources differ widely in availability, access interface, ability to index on more or less specific resources, and whether they provide historical data at all. We wanted high performance, a unified query interface, and all of the above features for every data source, so we chose to import these data sets into INRDB. Applications can then access all the data in its processed form.
All in all, the system currently stores and can serve roughly 1 billion (10^9) observations (blobs, as we call them) with about 7.5 billion (7.5 × 10^9) different validity times (intervals, when the observations were valid).
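To make the blob/interval distinction concrete, here is a minimal Python sketch of one way such a model could look. It is not INRDB's actual storage format: the `Blob` class, its field names, and the merging rule are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Blob:
    """One observation plus the time intervals during which it was valid.

    Hypothetical structure for illustration; not INRDB's real schema.
    """
    resource: str                                   # e.g. a prefix or an AS number
    data: dict                                      # the observed attributes
    intervals: list = field(default_factory=list)   # (start, end) pairs, end exclusive

    def observe(self, start: datetime, end: datetime) -> None:
        """Record a validity interval; assumes calls arrive in chronological order.

        An interval that touches or overlaps the previous one extends it
        instead of creating a new entry.
        """
        if self.intervals and self.intervals[-1][1] >= start:
            last_start, last_end = self.intervals[-1]
            self.intervals[-1] = (last_start, max(last_end, end))
        else:
            self.intervals.append((start, end))

    def valid_at(self, when: datetime) -> bool:
        """Return True if the observation was valid at the given moment."""
        return any(start <= when < end for start, end in self.intervals)


# The same observation seen on two consecutive days and again after a gap
# is stored once (one blob) with two validity intervals.
blob = Blob("192.0.2.0/24", {"status": "assigned"})
blob.observe(datetime(2009, 1, 1), datetime(2009, 1, 2))
blob.observe(datetime(2009, 1, 2), datetime(2009, 1, 3))
blob.observe(datetime(2009, 1, 5), datetime(2009, 1, 6))
```

Storing each distinct observation once and attaching intervals to it is one plausible reason why the number of intervals can be several times the number of blobs.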
The system is currently running on a handful of servers, using a self-made storage mechanism (SQL databases just couldn't handle this much data with acceptable response times and cost factor).
Today we present you with access to two live databases: a simple one and a not-so-trivial one.
RIR stats
Every RIR publishes a so-called "stats" or "delegations" file every day, listing all its current allocations and assignments on that day. These files have had a common format since December 2003, and all this data is available in INRDB (about 10,000 files in total). Using the following tool, you can quickly query any number resource listed in any of these files.
Of course, this is a tiny data set, but it provides a very simple introduction to how data is stored in INRDB. Our online tool actually hides away the query interface, so you'll only see the output of the system.
For practical purposes we are limiting the number of items given as results.
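For readers who want to work with the stats files directly, the sketch below parses a single record line. It assumes the pipe-separated layout used by the RIR statistics exchange format (`registry|cc|type|start|value|date|status`, optionally followed by extension fields). The `Delegation` type and `parse_record` function are hypothetical names, and the sample line is made up using a documentation prefix.

```python
from typing import NamedTuple, Optional


class Delegation(NamedTuple):
    registry: str
    cc: str        # country code
    type: str      # "asn", "ipv4" or "ipv6"
    start: str     # first AS number or first address
    value: int     # asn: number of ASNs; ipv4: address count; ipv6: prefix length
    date: str      # YYYYMMDD, may be empty
    status: str


def parse_record(line: str) -> Optional[Delegation]:
    """Parse one line of a stats file; return None for non-record lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None                      # blank line or comment
    fields = line.split("|")
    # The version header and the per-type summary lines are not records:
    # they either have too few fields or no resource type in the third field.
    if len(fields) < 7 or fields[2] not in ("asn", "ipv4", "ipv6"):
        return None
    registry, cc, rtype, start, value, date, status = fields[:7]
    return Delegation(registry, cc, rtype, start, int(value), date, status)


# Made-up sample record (192.0.2.0/24 is a documentation prefix).
record = parse_record("ripencc|ZZ|ipv4|192.0.2.0|256|20090101|assigned")
```

Note that for IPv4 the `value` field is an address count rather than a prefix length, so a record does not always correspond to a single CIDR block.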
RIS RIB Light
As mentioned above, we also store the full RIS table dump history. The "light" version of this data class contains the following information from each RIS table dump:
- Route collector ID
- IPv4/IPv6 prefix
- Announcing AS
- First transit AS
- The number of routing peers seeing this data constellation at any given time (per route collector)
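As a rough illustration of how the fields listed above could be derived from full table dump entries, the following Python sketch reduces (collector, peer, prefix, AS path) tuples to "light" records with a peer count. The function name `to_light`, the input layout, and the interpretation of "first transit AS" as the nearest AS in the path that differs from the origin are assumptions, not a description of the actual INRDB import code. The AS numbers and the prefix are taken from documentation ranges.

```python
from collections import defaultdict


def to_light(rib_entries):
    """Reduce full RIB entries to light records with peer counts.

    rib_entries: iterable of (collector, peer, prefix, as_path) tuples,
    where as_path is a non-empty list of AS numbers ending at the origin.
    Returns {(collector, prefix, origin_as, first_transit_as): peer_count}.
    """
    peers = defaultdict(set)
    for collector, peer, prefix, as_path in rib_entries:
        origin = as_path[-1]
        # Assumption: the first transit AS is the closest AS to the origin
        # that is not the origin itself (this skips AS path prepending).
        transit = next((asn for asn in reversed(as_path) if asn != origin), None)
        peers[(collector, prefix, origin, transit)].add(peer)
    return {key: len(seen_by) for key, seen_by in peers.items()}


entries = [
    ("rrc00", "peerA", "192.0.2.0/24", [64496, 64497, 64498, 64498]),
    ("rrc00", "peerB", "192.0.2.0/24", [64499, 64497, 64498]),
    ("rrc00", "peerC", "192.0.2.0/24", [64500, 64501, 64498]),
]
light = to_light(entries)
```

In this example, peers A and B see the same combination of origin and first transit AS and are counted together, while peer C contributes a separate record.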
In this data class we have approximately 47 million blobs and 2.5 billion intervals online. With the following tool you can query any prefix or AS number seen in this data class. You can set options like:
- What resource to look up (AS number or IPv4 or IPv6 prefix)
- Which route collector to ask, or the unified/aggregated information from all of them
For practical purposes we are limiting the number of items given as results.
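A prefix lookup often needs to consider more and less specific prefixes as well as exact matches. The sketch below shows that relationship using Python's standard `ipaddress` module. The function `classify` and its linear scan over candidates are illustrative assumptions only; a system of INRDB's size would need a proper index rather than a loop.

```python
import ipaddress


def classify(query, candidates):
    """Sort candidate prefixes by their relation to the queried prefix."""
    q = ipaddress.ip_network(query)
    result = {"exact": [], "less_specific": [], "more_specific": []}
    for text in candidates:
        c = ipaddress.ip_network(text)
        if c.version != q.version:
            continue                              # never compare IPv4 with IPv6
        if c == q:
            result["exact"].append(text)
        elif q.subnet_of(c):
            result["less_specific"].append(text)  # candidate covers the query
        elif c.subnet_of(q):
            result["more_specific"].append(text)  # candidate lies inside the query
    return result


matches = classify(
    "192.0.2.0/24",
    ["192.0.0.0/16", "192.0.2.0/24", "192.0.2.128/25",
     "198.51.100.0/24", "2001:db8::/32"],
)
```

Unrelated prefixes and prefixes of the other address family are simply left out of the result.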
Credits
Idea: | Daniel Karrenberg |
Design: | Róbert Kisteleki Daniel Karrenberg René Wilhelm Tiziana Refice |
Implementation: | Róbert Kisteleki René Wilhelm Tiziana Refice |
Comments 5
Anonymous •
It's a good idea as far as combining the data sets is concerned, but I have not found any details on the data set structure or on access for researchers.
As a student, what I found most difficult is setting up a huge database to store all these data sets to run my experiments. It seems that INRDB can be the answer.
Please, if there are any other links, update me with them.
Anonymous •
INRDB is currently a prototype, hence we only provide "controlled" access, through this site for example. The production version is expected to be (more) open to the public.
Having said this, I'm happy to give access to the service itself to you or anyone else who's interested. I'll contact you.
Anonymous •
Thanks for your reply. I am looking forward to even "controlled" access. :)
Filip •
Hello, is it possible to use INRDB, and if so, how? I need to find out whether an Internet prefix was announced on the Internet between 2012 and 2013, by which AS, for how long, etc. Can you please help me?
Robert Kisteleki •
Hi, you should check out RIPEstat, specifically the routing history widget: https://stat.ripe.net/widget/routing-history#w.resource=193.0.0.0%2F21