Intro to INRDB - the Internet Number Resource Database

Robert Kisteleki — Sep 22, 2009 01:50 PM
INRDB is a non-conventional database hosting many different, number resource related data sets.


Inside the RIPE NCC Science Group cauldron we've been cooking something really tasty for a while now - the ultimate number resource database. Inside it one can find many historical data sets from the RIPE NCC and other entities. We've already been using this database for internal purposes such as the Registration Data Quality measurements, and from now on you too can see more and more services based on this technology. Let us give you a demonstration.

INRDB is a non-conventional database hosting many different, number resource related data sets. We try to preserve the history of all these data sets, which come from different sources, including the RIPE NCC.

In our prototype system we currently store and serve:

  • The full history of RIS table dumps, in three different variations
  • All the "delegated" files ("stats" files) from all RIRs since December 2003
  • Historical versions of the IANA assignments/allocations pages
  • A sanitized version of the RIPE Database
  • Geolocation information from MaxMind
  • Blacklists/spamlists from various sources
  • AS relations information and active traceroute measurements data from CAIDA
  • Various NCC internal databases

Most of these data sources differ widely in availability, access interface, the ability to index on more or less specific resources, and whether they provide historical data at all. We wanted high performance, a unified query interface and all of the above features for every data source, so we chose to import these data sets into INRDB. Applications can then access all the data in its processed form:

[Figure: INRDB overview]
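To make "indexing on more/less specific resources" concrete, here is a minimal Python sketch. It is not INRDB's query interface, which is not described in this article; the function names and the stored prefixes are made up for illustration, and a real system would use an index rather than a linear scan.

```python
import ipaddress

def _related(query, prefixes, want_more_specific):
    """Linear scan; a real system would use a prefix tree or similar index."""
    q = ipaddress.ip_network(query)
    hits = []
    for text in prefixes:
        n = ipaddress.ip_network(text)
        if n.version != q.version:
            continue  # subnet_of() raises TypeError on mixed IPv4/IPv6
        inside = n.subnet_of(q) if want_more_specific else q.subnet_of(n)
        if inside:
            hits.append(text)
    return hits

def more_specifics(query, prefixes):
    """Stored prefixes contained in `query` (an exact match counts)."""
    return _related(query, prefixes, True)

def less_specifics(query, prefixes):
    """Stored prefixes that contain `query` (an exact match counts)."""
    return _related(query, prefixes, False)

# Hypothetical stored prefixes, for illustration only.
STORED = ["10.0.0.0/8", "10.1.0.0/16", "10.1.2.0/24", "2001:db8::/32"]

print(more_specifics("10.1.0.0/16", STORED))
print(less_specifics("10.1.2.0/24", STORED))
```

A more specific lookup answers "what lies inside this block?", while a less specific lookup answers "which larger blocks cover this one?".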

 

All in all, the system currently stores and can serve roughly 1 billion (10^9) observations (blobs, as we call them) with about 7.5 billion (7.5 * 10^9) different validity times (intervals, when the observations were valid).
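One way to picture the relationship between blobs and intervals is sketched below in Python. This is only an illustration of the idea that a single observation can be valid during several separate periods, which is why there are several times more intervals than blobs. The class, field names and sample data are assumptions, not INRDB's actual storage format.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Blob:
    """One observation plus every interval during which it was valid."""
    resource: str
    payload: str
    intervals: list = field(default_factory=list)  # (start, end) dates, inclusive

    def valid_on(self, day):
        return any(start <= day <= end for start, end in self.intervals)

def observations_on(blobs, resource, day):
    """Payloads of all blobs for `resource` that were valid on `day`."""
    return [b.payload for b in blobs
            if b.resource == resource and b.valid_on(day)]

# Hypothetical data: the first observation was valid in two separate periods.
BLOBS = [
    Blob("AS64496", "originates 192.0.2.0/24",
         [(date(2008, 1, 1), date(2008, 6, 30)),
          (date(2009, 1, 1), date(2009, 3, 31))]),
    Blob("AS64496", "originates 198.51.100.0/24",
         [(date(2009, 2, 1), date(2009, 9, 22))]),
]

print(observations_on(BLOBS, "AS64496", date(2009, 2, 15)))
```

Storing the observation once and attaching intervals to it means an unchanged observation costs one extra interval per reappearance rather than a full copy.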

The system is currently running on a handful of servers, using a self-made storage mechanism (SQL databases just couldn't handle this much data with acceptable response times and cost factor).

Today we present you with access to two live databases: a simple one and a not-so-trivial one.

 

RIR stats

Every RIR publishes a so-called "stats" or "delegations" file every day, listing all of its allocations and assignments as of that day. These files have had a common format since December 2003, and all of this data is available in INRDB (about 10,000 files in total). Using the following tool, you can quickly look up any number resource listed in any of these files.

Of course, this is a tiny data set, but it provides a very simple introduction to how data is stored in INRDB. Our online tool actually hides away the query interface, so you'll only see the output of the system.
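For readers who have not seen a stats file, the following Python sketch parses a few record lines. It assumes the pipe-separated layout registry|cc|type|start|value|date|status used by the RIR statistics files; the sample lines are invented (documentation prefixes and AS numbers), and the class and function names are hypothetical. It shows the input data only, not how INRDB imports it.

```python
from dataclasses import dataclass

@dataclass
class Delegation:
    registry: str
    country: str
    rtype: str   # "asn", "ipv4" or "ipv6"
    start: str   # first address or first AS number
    value: int   # ipv4: address count; ipv6: prefix length; asn: AS count
    date: str    # YYYYMMDD
    status: str

def parse_stats(lines):
    """Return the resource records, skipping comments, header and summaries."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("|")
        if len(f) < 7 or f[2] not in ("asn", "ipv4", "ipv6"):
            continue  # version header or per-type summary line
        records.append(Delegation(f[0], f[1], f[2], f[3], int(f[4]), f[5], f[6]))
    return records

# Invented sample content, for illustration only.
SAMPLE = """\
2|ripencc|20090922|3|19830705|20090921|+0200
ripencc|*|ipv4|*|1|summary
ripencc|NL|ipv4|192.0.2.0|256|20090101|assigned
ripencc|NL|ipv6|2001:db8::|32|20090101|allocated
ripencc|NL|asn|64496|1|20090101|allocated
""".splitlines()

for record in parse_stats(SAMPLE):
    print(record)
```

Note that the meaning of the value field depends on the resource type, so IPv4 and IPv6 records must be converted differently before they can be compared as prefixes.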

For practical purposes we are limiting the number of items given as results.

Click here and try it out!

 

RIS RIB Light

As I mentioned above, we also store the full RIS table dump history. The "light" version of this data class contains the following information from each RIS table dump:

  • Route collector ID
  • IPv4/IPv6 prefix
  • Announcing AS
  • First transit AS
  • The number of routing peers seeing this data constellation at any given time (per route collector)
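The fields above can be illustrated with a short Python sketch that collapses full table-dump rows into "light" records. It rests on two assumptions that this article does not confirm: that the announcing AS is the last AS in the AS path, and that the "first transit AS" is the AS adjacent to it once prepending is ignored. The function name, peer labels and sample rows are hypothetical.

```python
from collections import defaultdict

def light_records(rib_entries):
    """Collapse (collector, peer, prefix, as_path) rows into peer counts.

    Assumption: the announcing AS is the last AS in the path, and the
    first transit AS is its nearest neighbour in the path that is not
    the announcing AS itself (so prepending is ignored).
    """
    peers = defaultdict(set)
    for collector, peer, prefix, as_path in rib_entries:
        origin = as_path[-1]
        upstream = [asn for asn in as_path if asn != origin]
        transit = upstream[-1] if upstream else None
        peers[(collector, prefix, origin, transit)].add(peer)
    return {key: len(seen) for key, seen in peers.items()}

# Hypothetical rows for a single point in time.
ENTRIES = [
    ("rrc00", "peerA", "192.0.2.0/24", [64500, 64497, 64496]),
    ("rrc00", "peerB", "192.0.2.0/24", [64501, 64497, 64496, 64496]),
    ("rrc00", "peerC", "192.0.2.0/24", [64502, 64498, 64496]),
    ("rrc01", "peerD", "192.0.2.0/24", [64496]),
]

for key, count in light_records(ENTRIES).items():
    print(key, count)
```

Dropping the rest of the AS path and keeping only a peer count per combination is what would make such a data class "light" compared to the full dumps.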

In this data class we have approximately 47 million blobs and 2.5 billion intervals online. With the following tool you can query any prefix or AS number seen in this data class. You can set options like:

  • What resource to look up (AS number or IPv4 or IPv6 prefix)
  • Which route collector to ask, or the unified/aggregated information from all of them

For practical purposes we are limiting the number of items given as results.

Click here and try it out!

 

Credits

Idea: Daniel Karrenberg
Design: Róbert Kisteleki, Daniel Karrenberg, René Wilhelm, Tiziana Refice
Implementation: Róbert Kisteleki, René Wilhelm, Tiziana Refice

3 Comments

pksoul says:
Jan 25, 2010 10:58 AM

It's a good idea as far as combining the datasets is concerned. But I have not found any details on the dataset structure or on access for researchers.

 

As a student, what I found most difficult is setting up a huge database to store all these data sets to run my experiments. It seems that INRDB can be the answer.

Please, if there are some other links, update me with them.

 

kistel says:
Jan 25, 2010 01:29 PM

INRDB is currently a prototype, hence we only provide "controlled" access, through this site for example. The production version is expected to be (more) open to the public.

 

Having said this, I'm happy to give access to the service itself to you or anyone else who's interested. I'll contact you.

pksoul says:
Jan 26, 2010 04:44 AM

Thanks for your reply. I am looking forward to even a "controlled" access. :)

