Over the past months we've been looking at our Routing Information Service (RIS) and thinking about how to make it as fit for purpose as possible. Ahead of our upcoming RIPE NCC Open House on RIS, this post puts a set of open questions to our community, aimed at starting a conversation about how we can keep RIS useful to you.
At the RIPE NCC we run a large route collecting system called RIS. We currently have 21 active route collectors that collect BGP data from roughly 1,300 BGP peering sessions. We know from our last membership survey that our members regard RIS as a useful and important service. We also use RIS, and the raw data it collects, ourselves as a key back-end for RIPEstat and for producing analyses aimed at supporting our community.
Ultimately, RIS is one of the ways we help contribute to a stable and innovative Internet. As a neutral platform for the collection of data, it provides greater transparency of the inter-domain routing system. The data is potentially useful for a wide array of applications for troubleshooting, monitoring, post-mortem analyses of routing events, measuring adoption of new technologies, and so on.
A few years back RIS was redesigned so we could collect more data and be more realtime, and the numbers show that we indeed capture from more RIS peers than before:
| ASN | Hegemony Score | Peers with RIS | Fullfeed in RIS | Name |
|---|---|---|---|---|
| 721 | 3.0% | NO | NO | DNIC - DoD |
Table 2: Do we have the most central networks (according to AS-Hegemony for IPv4 address space) in RIS?
| ASN | Hegemony Score | Peers with RIS | Fullfeed in RIS | Name |
|---|---|---|---|---|
Table 3: A slightly different ranking. Now hegemony scores represent the number of networks a particular network has influence over.
The scoring shown in Table 2 is based on the centrality of a network with regard to IPv4 address space (based on AS-Hegemony). This biases the ranking towards networks that originate, and/or provide transit for, networks with a large address space footprint. This is why, for instance, AS721 scores so high in this particular metric, even though it is not one of the large transit networks. If we calculate this a bit differently (Table 3), namely by counting the number of networks that have a large dependency (directly or indirectly) on a given network, we get many of the same names. We could also calculate AS-Hegemony in other ways, for instance by weighting each ASN equally, regardless of how much address space it originates, or by weighting it by the end-user population in each network.
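To make the effect of the weighting choice concrete, here is a minimal sketch of a hegemony-style score. It is a simplification and not the published AS-Hegemony implementation: for each viewpoint it computes the weighted share of AS paths that cross a given ASN, then averages across viewpoints after trimming the extremes. The function name, the `origin_weight` parameter and the 10% trim default are assumptions made for illustration.

```python
from collections import defaultdict


def hegemony_scores(paths_by_viewpoint, origin_weight=None, trim=0.1):
    """Return {asn: score} for a simplified, hegemony-style centrality.

    paths_by_viewpoint: {viewpoint: [as_path, ...]}; each as_path is a list
        of ASNs as seen from that viewpoint, ending with the origin ASN.
    origin_weight: optional {origin_asn: weight}, e.g. address-space size or
        end-user population. Unlisted origins (or None) count as 1.0.
    trim: fraction of viewpoints discarded at each end before averaging
        (an assumed default, chosen for illustration).
    """
    fractions = {}  # viewpoint -> {asn: weighted share of paths crossing asn}
    for viewpoint, paths in paths_by_viewpoint.items():
        total = 0.0
        crossed = defaultdict(float)
        for path in paths:
            if not path:
                continue
            if origin_weight is None:
                weight = 1.0
            else:
                weight = origin_weight.get(path[-1], 1.0)
            total += weight
            for asn in set(path):  # set() counts a prepended ASN only once
                crossed[asn] += weight
        if total:
            fractions[viewpoint] = {a: w / total for a, w in crossed.items()}
        else:
            fractions[viewpoint] = {}

    n = len(fractions)
    all_asns = {asn for shares in fractions.values() for asn in shares}
    # Never trim so much that no viewpoint is left to average over.
    drop = min(int(trim * n), max(0, (n - 1) // 2))
    scores = {}
    for asn in all_asns:
        values = sorted(shares.get(asn, 0.0) for shares in fractions.values())
        kept = values[drop:n - drop]
        scores[asn] = sum(kept) / len(kept)
    return scores
```

Calling `hegemony_scores(paths)` weighs every origin equally, while passing `origin_weight` (address space per origin, or an end-user estimate) reproduces the kind of bias discussed above: networks that carry paths towards heavily weighted origins move up the ranking.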
So what do you think? Are particular flavours of AS-Hegemony a good metric to base RIS peering decisions on? Does this fit your intuition (or data) on what are the most influential networks on the Internet nowadays?
RIS has 21 route collectors, and most of them collect from peers that are physically close (i.e., on the same IXP LAN). While this gives us a somewhat localised view, by capturing local networks and their upstreams, we don't currently steer our peering towards the networks that are the most influential at that particular location. One way to do better here would be to leverage expertise from NOGs and other local organisations. If we can create lists of networks with the most local relevance, maybe we can use local netops expertise to validate this information and help capture BGP data from these networks. And if we capture the right local information, maybe we can develop insights that are locally relevant (like how, say, BGP hijacks affected the area your NOG represents).
Can we set measurable goals to track progress on the quality of our route collection by leveraging AS-Hegemony (or something similar) as a guiding principle? We think this makes sense, but again, we'd love to have some feedback.
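One possible shape for such a goal is "what share of the N most central networks do we peer with, and from how many do we get a full feed?". The sketch below computes that from a score table like the ones above. The function name, its parameters and the choice of N are hypothetical and only meant to illustrate the idea.

```python
def top_network_coverage(scores, ris_peers, full_feeds, top_n=10):
    """Summarise RIS coverage of the top_n networks by score.

    scores: {asn: hegemony-style score}
    ris_peers: set of ASNs that peer with RIS
    full_feeds: set of ASNs that give RIS a full feed
    """
    # Highest score first; ties are broken by ASN to keep the result stable.
    ranked = sorted(scores, key=lambda asn: (-scores[asn], asn))[:top_n]
    if not ranked:
        return {"peering_share": 0.0, "full_feed_share": 0.0, "missing": []}
    peering = [asn for asn in ranked if asn in ris_peers]
    full = [asn for asn in ranked if asn in full_feeds]
    return {
        "peering_share": len(peering) / len(ranked),
        "full_feed_share": len(full) / len(ranked),
        # Networks we would want to approach, most central first.
        "missing": [asn for asn in ranked if asn not in ris_peers],
    }
```

Tracking `peering_share` and `full_feed_share` over time would give a simple progress indicator, and the `missing` list doubles as a to-do list for peering requests.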
3) Figuring Out How Real-time We Want Our Data Collection To Be
Our current platform (specifically RIS-live) enables us to be near-realtime, which is something we know users appreciate. It enables innovations by the community. An excellent example of this is BGP-Alerter, which feeds off RIS-live to detect anomalies for your network in near-realtime. We have also found that at times, for instance when one of the RIS peers generates a large number of BGP updates, processing is delayed by minutes, and in extreme cases even hours. This in turn delays the processing pipelines that depend on RIS, such as RIS-backed data in RIPEstat. As we look for ways to improve data delivery, we would like to know what performance users want from it, and what that would enable for them and others.
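If you want to quantify the delay you see yourself, one rough approach is to compare the timestamp carried in each message with the time you received it. The sketch below assumes RIS-live style JSON messages of type `ris_message` whose `data` object carries a `timestamp` (seconds since the epoch) and a `host` naming the collector. Treat those field names as assumptions to check against the RIS-live documentation. The helper names are hypothetical, and the network connection itself is deliberately left out.

```python
import json
import math
from collections import defaultdict


def delay_sample(raw_message, received_at):
    """Return (collector, delay_in_seconds), or None for other messages.

    raw_message: a JSON string or an already-parsed dict.
    received_at: local receive time in seconds since the epoch.
    """
    msg = json.loads(raw_message) if isinstance(raw_message, str) else raw_message
    if msg.get("type") != "ris_message":
        return None
    data = msg.get("data", {})
    if "timestamp" not in data:
        return None
    return data.get("host", "unknown"), received_at - float(data["timestamp"])


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarise(samples):
    """Group (collector, delay) samples and report median, p95 and max."""
    by_collector = defaultdict(list)
    for collector, delay in samples:
        by_collector[collector].append(delay)
    return {
        collector: {
            "median": percentile(delays, 50),
            "p95": percentile(delays, 95),
            "max": max(delays),
        }
        for collector, delays in by_collector.items()
    }
```

Note that the measured delay also includes network latency and any clock offset between your machine and the collector, so it is best read as an upper bound and compared over time rather than taken as an absolute figure.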
4) Better Documenting RIS and the Data We Collect
Do you know that feeling when you get lost in a big supermarket and it takes you a long time to find what you need? What helps in these cases is signposts. When digging into RIS data, one can easily feel lost in the sheer amount of it. When looking at visualisations of routing (for instance BGPlay), it's easy to feel the same: there are a lot of things moving all the time here, but are they relevant for me? If we collected better meta-data, that would make for easier navigation of the data, in exactly the way those big hanging signs in supermarkets make for easier navigation through the aisles.
Better meta-data can also enable new uses of the data, or improve existing ones. For example, what about a BGPlay that shows which peers do route-origin validation? For each peer, we would like to be able to answer questions such as: What type of feed is this? What is the geographic location of this multihop peer? Is this peer doing route-origin validation? Some of this information we can try to infer, at the risk of being wrong, of course. We can also ask RIS peers for more information about the feed they provide us.
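Because some of this meta-data would be inferred and some would come from the peers themselves, it may help to store where each value came from. The sketch below is one hypothetical way to model that. The class and field names are invented for illustration and do not describe how RIS stores anything today.

```python
from dataclasses import dataclass
from typing import Optional

SOURCES = ("reported", "inferred")
FIELDS = ("feed_type", "country", "does_rov")


@dataclass(frozen=True)
class Claim:
    value: object
    source: str  # "reported" by the peer operator, or "inferred" by us


@dataclass
class PeerMetadata:
    peer_asn: int
    collector: str
    feed_type: Optional[Claim] = None   # e.g. "full" or "partial"
    country: Optional[Claim] = None     # physical location of the peer
    does_rov: Optional[Claim] = None    # route-origin validation in use?

    def update(self, name, value, source):
        """Record a value; an inference never overwrites a reported value.

        Returns True if the value was stored, False if it was ignored.
        """
        if name not in FIELDS:
            raise ValueError(f"unknown meta-data field: {name}")
        if source not in SOURCES:
            raise ValueError(f"unknown source: {source}")
        current = getattr(self, name)
        if current is not None and current.source == "reported" and source == "inferred":
            return False
        setattr(self, name, Claim(value, source))
        return True
```

Keeping the source next to the value lets a tool such as BGPlay show inferred information differently from information confirmed by the peer, which limits the damage when an inference turns out to be wrong.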
As an example of how better meta-data gives more insight, we tried to infer the physical location of the RIS peer for the cases where the peer and collector are not in the same place, i.e. for the multi-hop collectors rrc00 and rrc24. If you compare Figures 1 and 2, you can see that adding this meta-data gives a better picture of which physical locations we get data from, and which locations we are missing data from, should we want to focus on coverage of geographical areas.
Figure 1: RIS route collector locations
Figure 2: RIS route collector (red open circles) and inferred RIS peer locations (green circles). Countries have a dark shade if there is a collector in the country and a lighter shade if there is at least 1 RIS peer in the country.
As you can see from the images above, there are also quite a few countries, even in our service region, that we don't have a single BGP feed from.
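The shading rule used in Figure 2 is easy to reproduce once collector and peer locations are known, and the same classification yields the list of countries without any feed. A minimal sketch follows, with hypothetical function and label names and country codes as plain strings:

```python
def classify_countries(countries, collector_countries, peer_countries):
    """Label each country the way Figure 2 shades it.

    "collector": at least one route collector in the country (dark shade)
    "peer_only": no collector, but at least one RIS peer (lighter shade)
    "no_feed":   neither a collector nor a peer
    """
    labels = {}
    for country in countries:
        if country in collector_countries:
            labels[country] = "collector"
        elif country in peer_countries:
            labels[country] = "peer_only"
        else:
            labels[country] = "no_feed"
    return labels


def countries_without_feed(countries, collector_countries, peer_countries):
    """Return the countries we have no BGP feed from, in input order."""
    labels = classify_countries(countries, collector_countries, peer_countries)
    return [c for c in countries if labels[c] == "no_feed"]
```

Running `countries_without_feed` over the list of countries in a service region gives a concrete starting point for deciding where new peers would add the most geographic coverage.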
It would be great to get feedback on which meta-data others think is the most useful to collect, and how collecting it would improve the service or enable new use cases.
RIS isn't the only route collection system out there. A number of other systems, such as the RouteViews project, are also actively collecting data right now. Fortunately, collecting BGP data doesn't have to be a competitive endeavour. In fact, we like to think of it as a collaborative one, which benefits the entire community. A big part of what we want to figure out is how we can collaborate with other route collector projects in such a way as to offer the best possible joint datasets and services. What can we share and coordinate so our community is better off? We've already mentioned NOGs, but what other collaborations would help our community collect data that adds more insight?
Each of the five points raised above represents an area where we think progress can be made on improving RIS. And while we have ideas on how we would like to move forward in these areas, we really think that starting a conversation with the community right now will help make sure we move in the direction that is most useful to RIS users.
To facilitate this, we'll be holding a RIPE NCC Open House on RIS on October 20. The session will give anyone who wants to join a chance to come together with experts from across the community for a focused discussion on RIS. We very much hope to see you there.
In the meantime, we would appreciate any feedback you have to give us. Let us know your ideas on how to make things better. Let us know if you want to help out with specific parts of this. And of course, if you think of other relevant points we've not covered here, don't hesitate to tell us about them. Please write down your thoughts under this article, or email us at firstname.lastname@example.org.