Proposed Improvements to "Dummification" of Personal Data in the RIPE Database

The RIPE NCC proposes improvements to the algorithm used for removing personal data from the bulk provisioning of RIPE Database data.

This article is obsoleted.

The RIPE NCC provides RIPE Database data in bulk format in three different ways:

Complete dump, accessible from ftp://ftp.ripe.net/ripe/dbase/ripe.db.gz

Split dump of data per object type, accessible from ftp://ftp.ripe.net/ripe/dbase/split/

A live feed through the Near Real Time Mirroring (NRTM) protocol

To adhere to data protection rules and community requirements, the RIPE NCC has to remove what is considered personal data from these data dumps.

Current Situation

The algorithm currently removes all objects containing personal data (PERSON and ROLE objects) and replaces them with a single dummy object:

Original Object

person: Fred Blogs
address: RIPE Network Coordination Centre (NCC)
address: Singel 258
address: 1016 AB Amsterdam
address: The Netherlands
phone: +31 20 535 4444
fax-no: +31 20 535 4445
e-mail: guy@ripe.net
nic-hdl: FB99999-RIPE
mnt-by: AARDVARK-MNT
notify: guy@ripe.net
changed: guy@ripe.net 20040225
source: RIPE

BECOMES

Dummified Object

person: Placeholder Person Object
address: RIPE Network Coordination Centre
address: P.O. Box 10096
address: 1001 EB Amsterdam
address: The Netherlands
phone: +31 20 535 4444
nic-hdl: DUMY-RIPE
mnt-by: RIPE-DBM-MNT
remarks: **********************************************************
remarks: * This is a placeholder object to protect personal data.
remarks: * To view the original object, please query the RIPE
remarks: * Database at:
remarks: * http://www.ripe.net/whois
remarks: **********************************************************
changed: ripe-dbm@ripe.net 20090724
source: RIPE

It also replaces all references to the PERSON or ROLE objects in other objects in order to keep the whole dataset consistent:

Original Object

inetnum: 193.0.0.0 - 193.0.7.255
netname: RIPE-NCC
descr: RIPE Network Coordination Centre
descr: Amsterdam, Netherlands
remarks: Used for RIPE NCC infrastructure.
country: NL
admin-c: JDR-RIPE
admin-c: BRD-RIPE
tech-c: OPS4-RIPE
status: ASSIGNED PI
source: RIPE
mnt-by: RIPE-NCC-MNT
mnt-lower: RIPE-NCC-MNT
changed: hostmaster@ripe.net 20090203
changed: bit-bucket@ripe.net 20110217

BECOMES

Dummified Object

inetnum: 193.0.0.0 - 193.0.7.255
netname: RIPE-NCC
descr: RIPE Network Coordination Centre
descr: Amsterdam, Netherlands
remarks: Used for RIPE NCC infrastructure.
country: NL
admin-c: DUMY-RIPE
tech-c: DUMY-RIPE
status: ASSIGNED PI
source: RIPE
mnt-by: RIPE-NCC-MNT
mnt-lower: RIPE-NCC-MNT
changed: unread@ripe.net 20110217
remarks: ****************************
remarks: * THIS OBJECT IS MODIFIED
remarks: * Please note that all data that is generally regarded as personal
remarks: * data has been removed from this object.
remarks: * To view the original object, please query the RIPE Database at:
remarks: * http://www.ripe.net/whois
remarks: ****************************
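The reference replacement shown in the example above could be sketched roughly as follows. This is only an illustration, not the RIPE NCC's actual implementation: the function name, the representation of an object as a list of (attribute, value) pairs, and the restriction to "admin-c:" and "tech-c:" are assumptions made for this sketch.

```python
# Hypothetical sketch of the current reference replacement.
# Assumption: only "admin-c:" and "tech-c:" are treated as contact references.
CONTACT_ATTRIBUTES = {"admin-c", "tech-c"}
DUMMY_HANDLE = "DUMY-RIPE"


def dummify_references(obj):
    """Point every contact reference at the dummy object.

    `obj` is a list of (attribute, value) pairs in object order.
    Repeated references collapse into one, as in the example above,
    where two "admin-c:" lines become a single "admin-c: DUMY-RIPE".
    """
    result = []
    for key, value in obj:
        if key in CONTACT_ATTRIBUTES:
            pair = (key, DUMMY_HANDLE)
            if pair in result:
                continue  # already replaced once, skip the duplicate
            result.append(pair)
        else:
            result.append((key, value))
    return result
```

After this step every resource points at the same placeholder, which is why the relation between a resource and its real contacts can no longer be recovered from the dump.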

Since ORGANISATION and MNTNER objects might also include personal data, the algorithm tries to obfuscate the contents by removing optional attributes and replacing the values in some mandatory attributes:

Original Object

organisation: ORG-NCC1-RIPE
org-name: RIPE Network Coordination Centre
org-type: RIR
address: RIPE NCC Singel 258 1016 AB Amsterdam Netherlands
phone: +31 20 535 4444
fax-no: +31 20 535 4445
e-mail: ncc@ripe.net
admin-c: AP110-RIPE
admin-c: CREW-RIPE
tech-c: CREW-RIPE
ref-nfy: hm-dbm-msgs@ripe.net
mnt-ref: RIPE-NCC-RIS-MNT
mnt-ref: RIPE-NCC-HM-MNT
notify: hm-dbm-msgs@ripe.net
mnt-by: RIPE-NCC-HM-MNT
changed: bitbucket@ripe.net 20121217
source: RIPE

BECOMES

Dummified Object

organisation: ORG-NCC1-RIPE
org-name: Dummy organisation name for ORG-NCC1-RIPE
org-type: RIR
address: Dummy address for ORG-NCC1-RIPE
e-mail: unread@ripe.net
mnt-ref: RIPE-NCC-RIS-MNT
mnt-ref: RIPE-NCC-HM-MNT
mnt-by: RIPE-NCC-HM-MNT
changed: unread@ripe.net 20000101
source: RIPE
remarks: ****************************
remarks: * THIS OBJECT IS MODIFIED
remarks: * Please note that all data that is generally regarded as personal
remarks: * data has been removed from this object.
remarks: * To view the original object, please query the RIPE Database at:
remarks: * http://www.ripe.net/whois
remarks: ****************************

Original Object

mntner: AARDVARK-MNT
descr: Mntner for guy's objects
admin-c: FB99999-RIPE
tech-c: FB99999-RIPE
upd-to: denis@ripe.net
auth: X509-1
auth: X509-1689
auth: MD5-PW # Filtered
notify: guy@ripe.net
mnt-by: AARDVARK-MNT
referral-by: AARDVARK-MNT
changed: guy@ripe.net 20120510
source: RIPE # Filtered

BECOMES

Dummified Object

mntner: AARDVARK-MNT
descr: Dummy description for AARDVARK-MNT
admin-c: DUMY-RIPE
upd-to: unread@ripe.net
auth: MD5-PW $1$SaltSalt$DummifiedMD5HashValue. # Real value hidden for security
mnt-by: AARDVARK-MNT
referral-by: AARDVARK-MNT
changed: unread@ripe.net 20000101
source: RIPE
remarks: ****************************
remarks: * THIS OBJECT IS MODIFIED
remarks: * Please note that all data that is generally regarded as personal
remarks: * data has been removed from this object.
remarks: * To view the original object, please query the RIPE Database at:
remarks: * http://www.ripe.net/whois
remarks: ****************************

It has been suggested to us that the dummification process goes beyond what is needed to protect the data. For example:

Currently, ORGANISATION and MNTNER objects are available through the live RIPE Database without any access restrictions or limits, so it is easy to collect object keys from one of the dummified dumps and retrieve the full objects from the live database.

Obfuscating the references doesn't provide any added value since all other object types are available with no limits from the live RIPE Database or the split dump files. All of the references are already exposed with no limits.

The current dummification process renders the data useless for many purposes, for example:

The references between actual resources and their administrative and technical contacts create a meaningful and useful relation between resources and entities which is completely lost in the current dummification process.

Useful research data is lost by collapsing all objects containing personal data into a single dummy placeholder object.

The RIPE NCC is proposing a new dummification process to address these shortcomings, while staying within the data protection rules.

Proposed algorithm:

Keeping the links and references in all of the objects

Keeping the PERSON and ROLE objects in the dump with their real NIC handles and only obfuscating the personal data fields:

For email addresses, we will keep the domain part of the address and will only obfuscate the email account part
For phone and fax numbers, we will keep the first half of the number and will obfuscate the rest
For addresses, if the address is longer than two lines, we will keep the last line
Names will be fully obfuscated
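The four field rules above can be illustrated with a minimal Python sketch. It is not the RIPE NCC's implementation. The function names and the "*" mask string are assumptions, "half" of a phone number is interpreted here as half of its digits (rounded down), and the proposal does not say what happens to addresses of two lines or fewer, so this sketch simply masks them completely.

```python
MASK = "*"  # assumption: the exact mask string is not specified in the proposal


def obfuscate_email(address):
    """Keep the domain part, mask the account (local) part."""
    _local, sep, domain = address.rpartition("@")
    return f"{MASK}@{domain}" if sep else MASK


def obfuscate_phone(number):
    """Keep the first half of the digits, replace the remaining digits with dots."""
    keep = sum(ch.isdigit() for ch in number) // 2
    out, seen = [], 0
    for ch in number:
        if ch.isdigit():
            seen += 1
            out.append(ch if seen <= keep else ".")
        else:
            out.append(ch)  # keep "+", spaces and other separators as they are
    return "".join(out)


def obfuscate_address(lines):
    """Keep only the last line when the address has more than two lines."""
    if len(lines) > 2:
        return [MASK] * (len(lines) - 1) + [lines[-1]]
    # Assumption: shorter addresses are masked completely.
    return [MASK] * len(lines)


def obfuscate_name(_name):
    """Names are fully obfuscated."""
    return "Name Removed"
```

With these definitions, obfuscate_phone("+31 20 535 4444") returns "+31 20 5.. ....", and a four-line address ending in "The Netherlands" keeps only that last line.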

One exception is a ROLE object with an "abuse-mailbox:" attribute. The email value of the "abuse-mailbox:" attribute will not be obfuscated. All other email addresses in the object will have the email account part obfuscated. None of the other obfuscation that applies to a ROLE object (as described above) will be done. By design, this object will be available in any bulk data without any query limits, so there is no added value in obfuscating too much of this object.
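A minimal sketch of this exception, assuming an object is represented as a list of (attribute, value) pairs. The helper names, the regular expression used to find email addresses, and the "*" mask are illustrative assumptions, not part of the proposal.

```python
import re

# Assumption: anything of the form "<non-space characters>@" is treated as
# the account part of an email address.
EMAIL_LOCAL_PART = re.compile(r"[^\s@<>]+@")


def is_abuse_role(obj):
    """True for a ROLE object that carries an "abuse-mailbox:" attribute."""
    keys = [key for key, _ in obj]
    return "role" in keys and "abuse-mailbox" in keys


def dummify_abuse_role(obj):
    """Mask email account parts everywhere except in "abuse-mailbox:".

    Names, addresses and phone numbers are left untouched.
    """
    if not is_abuse_role(obj):
        raise ValueError("not a ROLE object with an abuse-mailbox attribute")
    return [
        (key, value if key == "abuse-mailbox" else EMAIL_LOCAL_PART.sub("*@", value))
        for key, value in obj
    ]
```

The "changed:" attribute is covered as well, because the substitution only touches the account part and leaves the trailing date in place.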

In all other objects we will:

Always replace the MD5 password hash in the MNTNER objects with a default hash

For all email addresses in any attribute, keep the domain part of the address and obfuscate the email account part
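The two rules for all other objects might look like this in Python. This is a sketch under stated assumptions: the function name, the list-of-pairs object representation, the email regular expression, and the "*" mask are hypothetical; the default hash string is taken from the MNTNER examples in this article.

```python
import re

# Assumption: anything of the form "<non-space characters>@" is treated as
# the account part of an email address.
EMAIL_LOCAL_PART = re.compile(r"[^\s@<>]+@")

# Default value as it appears in the MNTNER examples in this article.
DUMMY_MD5_AUTH = (
    "MD5-PW $1$SaltSalt$DummifiedMD5HashValue. # Real value hidden for security"
)


def dummify_other_object(obj):
    """Replace MD5 password hashes and mask email account parts.

    `obj` is a list of (attribute, value) pairs. References, keys and
    all other values are kept as they are.
    """
    result = []
    for key, value in obj:
        if key == "auth" and value.upper().startswith("MD5-PW"):
            value = DUMMY_MD5_AUTH
        else:
            value = EMAIL_LOCAL_PART.sub("*@", value)
        result.append((key, value))
    return result
```

Other "auth:" schemes, such as the X509 entries in the example MNTNER object, pass through unchanged because they contain neither an MD5 hash nor an email address.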

If the proposal is accepted, the examples used in the previous section will look like the objects below:

Original Object

person: Fred Blogs
address: RIPE Network Coordination Centre (NCC)
address: Singel 258
address: 1016 AB Amsterdam
address: The Netherlands
phone: +31 20 535 4444
fax-no: +31 20 535 4445
e-mail: guy@ripe.net
nic-hdl: FB99999-RIPE
mnt-by: AARDVARK-MNT
notify: guy@ripe.net
changed: guy@ripe.net 20040225
source: RIPE

BECOMES

Dummified Object

person: Name Removed
address: *
address: *
address: *
address: The Netherlands
phone: +31 20 5.. ....
fax-no: +31 20 5.. ....
e-mail: *@ripe.net
nic-hdl: FB99999-RIPE
mnt-by: AARDVARK-MNT
notify: *@ripe.net
changed: *@ripe.net 20040225
source: RIPE

Original Object

inetnum: 193.0.0.0 - 193.0.7.255
netname: RIPE-NCC
descr: RIPE Network Coordination Centre
descr: Amsterdam, Netherlands
remarks: Used for RIPE NCC infrastructure.
country: NL
admin-c: JDR-RIPE
admin-c: BRD-RIPE
tech-c: OPS4-RIPE
notify: ncc@ripe.net
status: ASSIGNED PI
source: RIPE
mnt-by: RIPE-NCC-MNT
mnt-lower: RIPE-NCC-MNT
changed: bit-bucket@ripe.net 20110217

BECOMES

Dummified Object

inetnum: 193.0.0.0 - 193.0.7.255
netname: RIPE-NCC
descr: RIPE Network Coordination Centre
descr: Amsterdam, Netherlands
remarks: Used for RIPE NCC infrastructure.
country: NL
admin-c: JDR-RIPE
admin-c: BRD-RIPE
tech-c: OPS4-RIPE
notify: *@ripe.net
status: ASSIGNED PI
source: RIPE
mnt-by: RIPE-NCC-MNT
mnt-lower: RIPE-NCC-MNT
changed: *@ripe.net 20110217

Original Object

organisation: ORG-NCC1-RIPE
org-name: RIPE Network Coordination Centre
org-type: RIR
address: RIPE NCC Singel 258 1016 AB Amsterdam Netherlands
phone: +31 20 535 4444
fax-no: +31 20 535 4445
e-mail: ncc@ripe.net
admin-c: AP110-RIPE
admin-c: CREW-RIPE
tech-c: CREW-RIPE
ref-nfy: hm-dbm-msgs@ripe.net
mnt-ref: RIPE-NCC-RIS-MNT
mnt-ref: RIPE-NCC-HM-MNT
notify: hm-dbm-msgs@ripe.net
mnt-by: RIPE-NCC-HM-MNT
changed: bitbucket@ripe.net 20121217
source: RIPE

BECOMES

Dummified Object

organisation: ORG-NCC1-RIPE
org-name: RIPE Network Coordination Centre
org-type: RIR
address: RIPE NCC Singel 258 1016 AB Amsterdam Netherlands
phone: +31 20 5.. ....
fax-no: +31 20 5.. ....
e-mail: *@ripe.net
admin-c: AP110-RIPE
admin-c: CREW-RIPE
tech-c: CREW-RIPE
ref-nfy: *@ripe.net
mnt-ref: RIPE-NCC-RIS-MNT
mnt-ref: RIPE-NCC-HM-MNT
notify: *@ripe.net
mnt-by: RIPE-NCC-HM-MNT
changed: *@ripe.net 20121217
source: RIPE

Original Object

mntner: AARDVARK-MNT
descr: Mntner for guy's objects
admin-c: FB99999-RIPE
tech-c: FB99999-RIPE
upd-to: guy@ripe.net
auth: X509-1
auth: X509-1689
auth: MD5-PW # Filtered
notify: guy@ripe.net
mnt-by: AARDVARK-MNT
referral-by: AARDVARK-MNT
changed: guy@ripe.net 20120510
source: RIPE # Filtered

BECOMES

Dummified Object

mntner: AARDVARK-MNT
descr: Mntner for guy's objects
admin-c: FB99999-RIPE
tech-c: FB99999-RIPE
upd-to: *@ripe.net
auth: X509-1
auth: X509-1689
auth: MD5-PW $1$SaltSalt$DummifiedMD5HashValue. # Real value hidden for security
notify: *@ripe.net
mnt-by: AARDVARK-MNT
referral-by: AARDVARK-MNT
changed: *@ripe.net 20120510
source: RIPE

Original Object

role: RIPE NCC tech contact
address: RIPE Network Coordination Centre (NCC)
address: Singel 258
address: 1016 AB Amsterdam
address: The Netherlands
phone: +31 20 535 4444
fax-no: +31 20 535 4445
e-mail: ncc@ripe.net
nic-hdl: RNTC-RIPE
mnt-by: RIPE-DBM-MNT
notify: ripe-dbm@ripe.net
changed: ripe-dbm@ripe.net 20040225
source: RIPE

BECOMES

Dummified Object

role: RIPE NCC tech contact
address: *
address: *
address: *
address: The Netherlands
phone: +31 20 5.. ....
fax-no: +31 20 5.. ....
e-mail: *@ripe.net
nic-hdl: RNTC-RIPE
mnt-by: RIPE-DBM-MNT
notify: *@ripe.net
changed: *@ripe.net 20040225
source: RIPE

Original Object

role: RIPE NCC abuse handler
address: RIPE Network Coordination Centre (NCC)
address: Singel 258
address: 1016 AB Amsterdam
address: The Netherlands
phone: +31 20 535 4444
fax-no: +31 20 535 4445
e-mail: ncc@ripe.net
abuse-mailbox: abuse@ripe.net
nic-hdl: RNAH-RIPE
mnt-by: RIPE-DBM-MNT
notify: ripe-dbm@ripe.net
changed: ripe-dbm@ripe.net 20040225
source: RIPE

BECOMES

Dummified Object

role: RIPE NCC abuse handler
address: RIPE Network Coordination Centre (NCC)
address: Singel 258
address: 1016 AB Amsterdam
address: The Netherlands
phone: +31 20 535 4444
fax-no: +31 20 535 4445
e-mail: *@ripe.net
abuse-mailbox: abuse@ripe.net
nic-hdl: RNAH-RIPE
mnt-by: RIPE-DBM-MNT
notify: *@ripe.net
changed: *@ripe.net 20040225
source: RIPE

Implementation

Since the resulting dataset is still self-consistent and RIPE RPSL compliant, we don't expect any incompatibility with existing tools:

We can keep generating both old and new dumps on the RIPE NCC's FTP server. If we go live with this new dummification algorithm, we will move the old data format to a subdirectory on our FTP server and keep generating files in both formats for 30 days.

The same will happen for the NRTM feed. We will switch the main feed to the new format but keep a server running with the old format, and point customers to that server if they run into an incompatibility.

We will decommission the old dummification software if there are no open incompatibility reports after 30 days of running both processes in parallel.

Any feedback about this proposal would be appreciated on the RIPE Database Working Group mailing list's discussion for this topic: http://www.ripe.net/ripe/mail/archives/db-wg/2013-May/004048.html