The RIPE NCC is pleased to announce a new Geolocation prototype service for storing geolocation data in the RIPE Database. This article outlines the methodology applied in this prototype.
Please also see Example Usage of RIPE Database Geolocation Prototype
The RIPE Database Update at RIPE 62 in April 2011 raised some issues concerning geolocation. The presentation also outlined some suggestions for including this information in the RIPE Database. Based on the feedback we have received from:
- RIPE 62 Meeting
- Regional Meetings
- Mailing lists
- RIPE NCC trainings for Members
- Questions to RIPE Database Customer Support
- A preview of the results of the Member Survey
we built a prototype to store geolocation data in the RIPE Database. This was an action point on the RIPE NCC from the RIPE Database Working Group session at RIPE 62 . Please note that this is only a prototype and has no impact on real data in the RIPE Database. This prototype has its own sand-boxed test database. You can create any data objects you want in this sand-boxed environment to check out the geolocation functionality.
The new Geolocation prototype version of Webupdates is available now at: https://apps.db.ripe.net/webupdates/search.html
and the new Geolocation Finder tool is available now at: https://apps.db.ripe.net/search/geolocation-finder.html
Many people have commented on specific uses for having easy access to this information. The main use cases currently identified can be grouped into the following categories:
- Content providers can target their audience with a higher degree of accuracy
- Content consumers will receive content they can understand and is relevant to where they are located
- Service providers can offer services customised for a location and that conforms to any local regulations
- Commercial providers of geolocation data can take this as an additional (weighted) input
We can illustrate this with a couple of real case examples:
- The RIPE NCC Registration Services regularly get comments from LIRs who receive complaints from their users about language. Content providers may serve tailored information in the wrong language when IP address blocks are redistributed.
- Some service providers restrict access to their services based on location. Again redistributed IP addresses may cause some problems.
- Some countries have more than one official language.
The basic requirements we took from the discussions were:
- Optional location information stored with IP addresses
- Optional language information stored with IP addresses
- Efficient management of the data
- Resource holder administered
- Easy access to the data by different stakeholders
The new data is included in the INETNUM and INET6NUM objects. Two new optional attributes have been added:
The first attribute stores the geolocation data as longitude and latitude. The second stores the language as a two-letter code from the ISO 639-1 language code list. We made use of the hierarchical nature of address space by adopting the same mechanism as used for the IRT object reference. This allows for efficient management of data without needing to include these optional attributes in every object where the information is required.
Consider an LIR with an allocation where all assignments from that allocation are based in the same area for people speaking the same language(s). Then the LIR only needs to enter the "geoloc:" and "language:" attributes in the allocation INET6NUM object. These entries will apply to all more specific objects without their own attributes. If any of the more specific assignments are in a different location, they can include their own "geoloc:" attribute. This will take precedence over the parent object's value. Similarly, if any of the more specific assignments are for users who speak a different language, they can include their own "language:" attribute. This will also take precedence over the parent object's value. The two attributes apply independently. So an assignment can have its own value for one of them, but take the parent object's value for the other.
Searching for the data
If any INETNUM or INET6NUM object is queried that contains any of the new optional attributes, they will show in the object data just as any other attribute does. But there is no atomic query built into the whois server to search for this data. So we provided a Geolocation Finder tool based on the existing Abuse Finder tool . A user enters an IP address and the tool does all the queries necessary to find relevant data. We reused the approach from the IRT object, except that for IRT the searching up the hierarchy is done internally by the whois server and for this it is done by the Geolocation Finder tool.
We take the IP address entered as the starting point and query for the (encompassing) object. We are looking for two items of data contained in the "geoloc:" and "language:" attributes. If the encompassing object has both these attributes, we return this data. If it has one of these attributes, this value is taken for that part of the data and we search up the hierarchy for the other piece of data. If it has neither attribute we search up the hierarchy for the pieces of data.
When searching, the Geolocation Finder tool repeatedly does a query to return the one level less specific object to the current object. We then check the returned object looking for whichever attribute(s) that has not yet been found in the previous objects. Each attribute is taken independently. If one of them is found, the search for that attribute ends. If both are found, the search ends. The search also stops at the least specific point in the hierarchy of user data. For example an allocation object of an LIR.
At the end of the searching, any data found is returned to the user. In the text part of the returned data the primary key of the object we took the data from is shown in parentheses after the value. On the map we only show the location. This is represented as a circle, currently with the actual location at the centre of the circle.
For some use cases a high degree of accuracy is required to pin point the location with a very high precision. For other use cases, particularly where privacy is concerned, a much courser precision may be preferred. This could be achieved by the user removing some of the precision of the numerical values. Then the user has control over the level of precision. Or perhaps with the displayed circle not centered on the pin point. Some randomising could be applied here by the Geolocation Finder tool. Maybe the circle size could be user set. There are a number of issues yet to be decided in this area.
- Precision of location data as discussed above.
- Setting the default location for the map based on the country code attribute.
- Defining a meaning for the location to avoid repeating the problem that exists for the country code attribute in which no one knows what the country means. Is it the corporate headquarters of the LIR, the data centre or the End User address? If there are use cases for multiple meanings then we may need multiple attribute (sub) types.
- Provide a helper for adding language code so users don't need to browse for the ISO list.
- Display both the language name and ISO code in the Geolocation Finder.
- Daily dump of addresses covered by geolocation data, with the data.
Example of using the service
Please see Example Usage for RIPE Database Geolocation Prototype. This article provides a step-by-step guide showing how to add, manage and how to find this data.
What we have now is just a prototype demonstrating a proof of concept. As shown above there are already a list of issues that need to be further considered. We need your feedback, comments and input to decide if there is any general interest in this feature and, if so, how to move it forward.