Ensuring Registry Data Quality

Xavier Le Bris — 08 Mar 2011
Contributors: Bert Wijnen, Róbert Kisteleki
To ensure accurate and up-to-date registration data, the RIPE NCC started the Registry Data Quality project in 2009. This article describes the project, the problems we found and how we fixed them.

Motivation

To keep the RIPE NCC's reputation as a trusted source of data, we consistently work on the quality of our internal and external records. One of our primary responsibilities as a Regional Internet Registry (RIR) is to keep our resource registry accurate and up to date. It is important that the records reflect the legitimate holders of IP number resources.

During a thorough clean-up activity, called Registry Data Quality (RDQ), we found that the data on the resources the RIPE NCC is responsible for was of good quality. However, inter-RIR database transfers and some procedural changes (some data that did not need to be registered in the past now does) had introduced inconsistencies over time.

The initial goal of the RDQ project was twofold:

  • to make sure all resources the RIPE NCC is responsible for are registered as RIPE NCC maintained resources, and
  • to make sure all resources that are currently registered as maintained by the RIPE NCC are correctly recorded in the RIPE NCC's internal records.

     

During this process we looked at the following data sources:

  • the IANA IP registry
  • the RIPE NCC's public resource files (also called stats files)
  • other RIRs' public stats files
  • various internal data sets
Note that the RIPE Database was not included in the first phase of this project. Once the inconsistencies found during phase one had been cleaned up, we started the second phase of the project, in which we compared our internal records with the public RIPE Database. This will soon be described in a follow-up article on RIPE Labs.

Methodology

Most records (address prefixes, AS numbers) are registered in more than one data source. In some cases, a record is listed in all the above data sets. For each of these records, we looked at all the data sets and determined how confident we were that the record was correct and up to date. This decision was based on a number of criteria, for instance:
  • has this record been updated recently?
  • have we been in contact with the address holder recently?
In essence we asked each record: "Do you think you are a RIPE NCC resource?" The record could answer with "Yes", "No", or "I don't know". If the record came back from all data sources with "Yes", then the confidence level for this record was high. If the answers were inconsistent, we knew that the record was not up to date in some of the data sources. For these records, we had to investigate further and make sure the record was updated in all data sets.
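
The check described above can be sketched in a few lines of Python. This is only an illustration, not the tooling used in the project: the data source names, the `Answer` enum and the `confidence` function are assumptions made for this example, and the prefixes are documentation ranges rather than real records.

```python
from enum import Enum


class Answer(Enum):
    YES = "yes"
    NO = "no"
    UNKNOWN = "unknown"


def confidence(answers):
    """Classify a record from the per-source answers to the question
    "Do you think you are a RIPE NCC resource?".

    `answers` maps a (hypothetical) data source name to an Answer.
    """
    values = set(answers.values())
    if values == {Answer.YES}:
        # Every data source agrees that this is a RIPE NCC resource.
        return "high"
    # Simplification for this sketch: anything else, including an
    # empty set of answers, is flagged for manual investigation.
    return "investigate"


# Hypothetical example records.
records = {
    "192.0.2.0/24": {
        "iana": Answer.YES,
        "ripe_stats": Answer.YES,
        "internal": Answer.YES,
    },
    "198.51.100.0/24": {
        "iana": Answer.YES,
        "ripe_stats": Answer.UNKNOWN,
        "internal": Answer.NO,
    },
}

results = {prefix: confidence(a) for prefix, a in records.items()}
for prefix, level in results.items():
    print(prefix, level)
```

In this sketch only a unanimous "Yes" yields high confidence; any mixed answers mark the record for follow-up, which mirrors the manual investigation described above.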

As a first step, we compared all records listed in the IANA registry with the internal records at the RIPE NCC and created repository files for all address space the RIPE NCC is responsible for. A lot of attention was devoted to legacy IPv4 addresses. Legacy space that is used in our service region was transferred to the RIPE NCC between 1992 and 2009. We did not have detailed records for those address ranges and had to find the details in internal mail archives and other internal records.
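
As a rough illustration of this kind of comparison, the following Python sketch checks two prefix lists against each other in both directions using the standard `ipaddress` module. The function name `compare`, the list names and the prefixes (documentation ranges) are assumptions for this example. The actual comparison worked on the IANA registry and internal RIPE NCC records, whose formats are not described here.

```python
import ipaddress


def compare(iana_blocks, internal_records):
    """Compare registry blocks with internal records.

    Returns a tuple (outside, empty):
      outside: internal records that fall inside no registry block
      empty:   registry blocks that contain no internal record
    """
    blocks = [ipaddress.ip_network(p) for p in iana_blocks]
    records = [ipaddress.ip_network(p) for p in internal_records]

    def inside(net, block):
        # subnet_of() raises TypeError across IP versions, so check first.
        return net.version == block.version and net.subnet_of(block)

    outside = [str(r) for r in records
               if not any(inside(r, b) for b in blocks)]
    empty = [str(b) for b in blocks
             if not any(inside(r, b) for r in records)]
    return outside, empty


# Hypothetical input data.
iana_blocks = ["192.0.2.0/24", "198.51.100.0/24"]
internal_records = ["192.0.2.0/25", "203.0.113.0/24"]

outside, empty = compare(iana_blocks, internal_records)
print("internal records outside any registry block:", outside)
print("registry blocks without internal records:", empty)
```

Both result lists point at candidates for further investigation; neither says on its own which data source is wrong.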

Another set of resources with inconsistent and outdated records were those transferred to AfriNIC from 2005 to 2009. This does not mean that AfriNIC is not keeping good records; it is simply because the RIPE NCC no longer has direct contact with the holders of these resources. The records that the RIPE NCC keeps about these ranges have now been updated in collaboration with AfriNIC.

Results

We fixed 1,097 inconsistencies related to IPv4 address space listed in our internal records. Of these, 600 were related to legacy IPv4 space and another 150 to transfers of address ranges after the creation of AfriNIC. Other problems were related to the creation of zonelets as part of the reverse delegation process.

In addition, we found 2,126 inconsistencies related to autonomous system numbers.

Each of these 2,126 ASN-related inconsistencies had to be investigated separately. To repair the data, we had to:

  • create 1,698 objects in the RIPE Database (these are placeholders, mostly related to ERX and AfriNIC transfers)
  • update 428 objects (holder names; the internal records had not been modified over time)
  • delete 135 objects (those objects were out of date and the resources no longer in use)

 

[Figure: ASN Problem Types (RDQ1)]

Figure 1: Three ways we used to clean up ASN-related inconsistencies

Figure 2 below shows the total number of IPv4-related records in which we found inconsistencies (note that one record can contain more than one inconsistency). The number of inconsistent records was constant up to January 2009, when we started the RDQ project, and has decreased since then. Early in 2010 it was down to zero. The peak in January 2010 was caused by a clean-up of DNS records that were not maintained properly and that the RIPE NCC was not responsible for.

Since then, all inconsistent records have been cleaned up. Each time the RIPE NCC is allocated a new /8, we run these tests again. This is visible in the small increase and immediate decrease in January 2011.

[Figure: Total number of inconsistencies]

Figure 2: Total number of inconsistent records found

While correcting these inconsistencies, we also collaborated closely with the IANA. Some of our investigations led to the IANA updating their records. All these inconsistencies have since been corrected.

An independent third party also audits the RIPE NCC’s Registration Services Department regularly to help ensure data accuracy.

Future Plans

Now that we are confident about the accuracy of the internal registration data, we can use this data as a basis to check the accuracy of public records, for instance in the RIPE Database and the Routing Information Service (RIS). This has been done in Phase 2 of the Registry Data Quality project. We will present the results of this second phase here on RIPE Labs shortly. They will show how confident we are that the maintainer of a record points to the legitimate holder of the resource.

This project helped us to:

  • Strengthen data quality where it is necessary.

  • Prevent conflicts about address space use and help to resolve any conflicts that do arise.

  • Support the policy development process with hard data.

 

 
