Route Collection at the RIPE NCC - Where are we and where should we go?

Over the past months we've been looking at our Routing Information Service (RIS) and thinking about how to make it best fit for purpose. Ahead of our upcoming RIPE NCC Open House on RIS, this post raises a set of open questions to our community aimed at starting a conversation about how we can keep RIS useful to you.

At the RIPE NCC we run a large route collecting system called RIS. We currently have 21 active route collectors that collect BGP data from roughly 1,300 BGP peering sessions. We know from our last membership survey that RIS is regarded by our members as a useful and important service. And we use RIS and the raw data it collects ourselves as a key back-end for RIPEstat and for producing analyses aimed at supporting our community.

Ultimately, RIS is one of the ways we help contribute to a stable and innovative Internet. As a neutral platform for the collection of data, it provides greater transparency of the inter-domain routing system. The data is potentially useful for a wide array of applications for troubleshooting, monitoring, post-mortem analyses of routing events, measuring adoption of new technologies, and so on.

Moving Forward

A few years back RIS was redesigned so we could collect more data and get closer to real time, and the numbers show that we do indeed collect from more RIS peers than before:

Year	RIS peers
2017	732
2018	792
2019	1,003
2020	1,261

Table 1: Number of RIS peers as of 30 June each year
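To put the growth in Table 1 into numbers, here is a minimal sketch that computes the year-on-year increase in RIS peers. The counts are copied from the table; the function name `yearly_growth` is ours and purely illustrative.

```python
# RIS peer counts as of 30 June each year, copied from Table 1.
ris_peers = {2017: 732, 2018: 792, 2019: 1003, 2020: 1261}


def yearly_growth(counts):
    """Return {year: (absolute increase, percentage increase)} versus the previous year."""
    years = sorted(counts)
    growth = {}
    for prev, curr in zip(years, years[1:]):
        delta = counts[curr] - counts[prev]
        growth[curr] = (delta, 100.0 * delta / counts[prev])
    return growth


if __name__ == "__main__":
    for year, (delta, pct) in yearly_growth(ris_peers).items():
        print(f"{year}: +{delta} peers ({pct:.1f}%)")
```

By this count the number of peers grew by about 8% from 2017 to 2018, and by roughly a quarter in each of the two years after that.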

As a result, the volume of data we collect has grown too. In fact, it's grown to the point where we started to get slightly worried. The question is - does all that extra data give us a better service? What value is gained from each additional peer? Each additional route collector?

What makes questions like this hard to answer is that RIS is a multi-purpose platform. Each additional bit of information may be useful to somebody, but it also means extra storage and processing for everybody who has no use for it. That is exactly what you don't want when the timeliness of the insights you produce is essential, because timely results depend on keeping the amount of data you need to process small.

With that said, as we look at ways to improve the quality of service our users get from RIS, we're really looking to strike the right balance for our users between the amount of data available and the amount of time it takes to get information out of that data. But getting this right involves talking to the community about what you expect to get out of RIS. So, without further ado, here's a list of points where we'd like to get your input.

1) Use Cases for Peering with RIS

We don't explicitly ask RIS peers their reasons for peering when we establish the peering. Anecdotally, we think we know the main motivations. In general, we know many peers feed data for the good of the Internet, which is something we very much like to promote and encourage. So we develop RIS in ways that enable these peers to keep feeding. Likewise, we know many peers feed data because it is beneficial for their network: either they come out better in Internet rankings (for instance CAIDA's AS Rank), or having their data out in the open helps other networks troubleshoot issues between the networks.

But aside from all this, it's good to actually hear specific use-cases. What do you get out of RIS? What do you want out of RIS that's hard to get right now? If you have use-cases for peering with RIS, we would love to know.

2) Better Peering Strategy for RIS

Our current peering strategy is open, but mostly passive. We don't actively work out which new peers would add the most value to the system as a whole. By putting more effort in here we might be able to make a big leap in terms of adding value to the whole system.

We think this has two parts: globally important networks and locally important networks.

Globally Important Networks

Intuitively, you'd want to be at the most central networks, because these most likely have the greatest influence on the global routing system. And if you look at tables 2 and 3, we're doing pretty well in that regard. But there is always room for improvement, which is also visible here.

ASN	Hegemony Score	Peers with RIS	Fullfeed in RIS	Name
3356	13.4%	NO	NO	Level3
1299	10.0%	YES	YES	Telia
174	8.4%	YES	YES	Cogent
6939	6.4%	YES	NO	Hurricane Electric
2914	5.6%	YES	YES	NTT
4134	5.1%	YES	NO	China Telecom
7018	4.5%	YES	YES	ATT
209	4.0%	NO	NO	CenturyLink
3257	3.4%	NO	NO	GTT
721	3.0%	NO	NO	DNIC - DoD

Table 2: Do we have the most central networks (according to AS-Hegemony for IPv4 address space) in RIS?

ASN	Hegemony Score	Peers with RIS	Fullfeed in RIS	Name
1299	42348	YES	YES	Telia
174	32236	YES	YES	Cogent
3356	31294	NO	NO	Level3
6939	25115	YES	NO	Hurricane Electric
2914	17426	YES	YES	NTT
3257	15503	NO	NO	GTT
6453	7202	YES	YES	TATA
6461	6953	YES	NO	Zayo
6762	6070	YES	YES	Telecom Italia

Table 3: A slightly different ranking. Here the hegemony scores represent the number of networks a particular network has influence over.

The scoring that we show in table 2 is based on the centrality of a network with regard to IPv4 address space (based on AS-Hegemony). This biases the ranking towards networks that originate, and/or provide transit for, networks with a large address space footprint. This is why, for instance, AS721 scores so high in this particular metric, even though it is not one of the large transit networks. If we calculate this a bit differently (table 3), namely by counting the number of networks that have a large dependency (directly or indirectly) on a given network, we get many of the same names. We could also calculate AS-Hegemony differently, for instance by weighting each ASN equally, regardless of how much address space it originates, or by weighting it by the end-user population in each network.
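To make the effect of the weighting choice concrete, here is a minimal sketch of an AS-Hegemony-style score. It is a simplified illustration, not the computation behind tables 2 and 3: for each viewpoint it takes the weighted fraction of AS paths that cross a given AS, then averages those fractions over viewpoints after trimming the extremes. The function name `hegemony`, the toy paths, the address counts and the ASNs (taken from the range reserved for documentation) are all assumptions made for this example.

```python
from collections import defaultdict


def hegemony(paths_by_viewpoint, weight=lambda origin: 1.0, trim=0.1):
    """Simplified AS-Hegemony-style score per ASN (a value between 0 and 1).

    paths_by_viewpoint: {viewpoint: [as_path, ...]}; each as_path is a
        non-empty list of ASNs ending at the origin AS.
    weight: maps an origin ASN to the weight of paths towards it
        (constant for equal weighting, address count, user population, ...).
    trim: fraction of viewpoints dropped at each end before averaging.
    """
    if not 0 <= trim < 0.5:
        raise ValueError("trim must be in [0, 0.5)")

    fractions = []  # one {asn: weighted fraction of paths crossing asn} per viewpoint
    for paths in paths_by_viewpoint.values():
        total = sum(weight(path[-1]) for path in paths)
        if total == 0:
            continue  # no usable paths from this viewpoint
        crossed = defaultdict(float)
        for path in paths:
            for asn in set(path):  # count each AS once per path
                crossed[asn] += weight(path[-1])
        fractions.append({asn: w / total for asn, w in crossed.items()})

    n = len(fractions)
    k = int(trim * n)  # number of viewpoints dropped at each end
    scores = {}
    for asn in set().union(*fractions):
        values = sorted(f.get(asn, 0.0) for f in fractions)
        kept = values[k:n - k]
        scores[asn] = sum(kept) / len(kept)
    return scores


# Toy input (hypothetical): two viewpoints, three origins, two transit networks.
TOY_PATHS = {
    "vp1": [[64500, 64496, 64501], [64500, 64496, 64502], [64500, 64497, 64503]],
    "vp2": [[64510, 64496, 64501], [64510, 64497, 64502], [64510, 64497, 64503]],
}
TOY_ADDRESSES = {64501: 256, 64502: 256, 64503: 1024}  # made-up address counts

equal = hegemony(TOY_PATHS, trim=0)
by_space = hegemony(TOY_PATHS, weight=lambda origin: TOY_ADDRESSES[origin], trim=0)
```

In this toy input the two transit networks AS64496 and AS64497 both score 0.5 when every origin counts equally. Weighting by address space moves them to 0.25 and 0.75, because AS64497 carries the paths towards the origin with the largest footprint. With only two viewpoints nothing is trimmed; with many viewpoints the trimming is what removes the bias of networks that are only central from their own vantage point.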

So what do you think? Are particular flavours of AS-Hegemony a good metric to base RIS peering decisions on? Does this fit your intuition (or data) on what are the most influential networks on the Internet nowadays?

Locally Important Networks

RIS has 21 route collectors, and most of them collect from peers that are physically close (i.e. on the same IXP LAN). And while this gives us a somewhat localised view, by capturing local networks and their upstreams, we don't steer this towards capturing the networks that are the most influential at that particular location. One way to do better here would be to leverage expertise from NOGs and other local organisations. If we can create lists of networks with the most local relevance, maybe we can use local netops expertise to validate this information and help capture BGP data from these networks. And if we capture the right local information, maybe we can develop insights that are locally relevant (like how, say, BGP hijacks affected the area your NOG represents).

Can we set measurable goals to track progress on the quality of our route collection by leveraging AS-Hegemony (or something similar) as a guiding principle? We think this makes sense, but again, we'd love to have some feedback.

3) Figuring Out How Real-time We Want Our Data Collection To Be

Our current platform (specifically RIS-live) enables us to be near-realtime, which is something we know users appreciate. It enables innovations by the community. An excellent example of this is BGP-Alerter, which feeds off RIS-live to detect anomalies for your network in near-realtime. However, we have also found that sometimes, for instance when one of the RIS peers generates a large number of BGP updates, we have processing delays of minutes, and in extreme cases even hours. This in turn causes extra delays further down the processing pipelines, for instance for RIS-backed data in RIPEstat. As we look for improvements in data delivery, we would like to know what performance users want from this and what it would enable for them and others.
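One way to make "how real-time" measurable is to compare the timestamp carried in each message with the moment a client receives it. The sketch below does that for JSON messages shaped like RIS-live output. The field names (`type`, `data.timestamp`, `host`) follow the RIS-live message layout as we understand it and should be checked against the RIS-live documentation; the sample message and the helper names `message_delay` and `delay_summary` are made up for illustration.

```python
import json
import statistics


def message_delay(raw_message, received_at):
    """Seconds between the timestamp inside a message and its local receipt.

    Returns None for anything that is not a 'ris_message' carrying a timestamp.
    """
    msg = json.loads(raw_message)
    if msg.get("type") != "ris_message":
        return None
    timestamp = msg.get("data", {}).get("timestamp")
    if timestamp is None:
        return None
    return received_at - float(timestamp)


def delay_summary(delays):
    """Return (median, maximum) of the observed delays, ignoring None entries."""
    values = [d for d in delays if d is not None]
    if not values:
        return None
    return statistics.median(values), max(values)


# Made-up sample message; real messages carry more fields (peer, path, ...).
SAMPLE_MESSAGE = json.dumps(
    {"type": "ris_message",
     "data": {"timestamp": 1600000000.0, "host": "rrc00", "type": "UPDATE"}}
)

if __name__ == "__main__":
    delay = message_delay(SAMPLE_MESSAGE, received_at=1600000005.0)
    print(f"delay: {delay:.1f}s")
```

Note that a delay measured this way mixes together processing delay, network latency and any clock offset between the peer, the collector and the client, so it is an upper bound on the processing delay rather than a precise measurement of it.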

4) Better Documenting RIS and the Data We Collect

Do you know that feeling when you get lost in a big supermarket and it takes you a long time to actually find what you need? What helps in these cases is signposts. When digging into RIS data, one can easily feel lost in the sheer amount of it. When looking at visualisations of routing (for instance BGPlay), it's easy to feel the same: there are a lot of things moving all the time here, but are they relevant for me? If we collected better meta-data, that would make for easier navigation of the data, in exactly the way those big hanging signs in supermarkets make for easier navigation through the aisles.

Also, better meta-data can enable new uses of the data or improve existing ones. For example, what about a BGPlay that shows which peers do route-origin validation? Questions that meta-data could answer include: What type of feed is this? What is the geographic location of this multihop peer? Is this peer doing route-origin validation? Some of this information we can try to infer, at the risk of being wrong, of course. We can also try to ask RIS peers for more information about the feed they provide us.
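Because inferred meta-data can be wrong while peer-supplied meta-data is more trustworthy, it helps to record where each value came from. The sketch below shows one possible way to do that. The `Attribute` type, the `merge_metadata` helper and the field names (`feed_type`, `country`, `does_rov`) are hypothetical and only illustrate the idea; they do not describe an existing RIS data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """A single piece of peer meta-data together with its provenance."""
    value: object
    source: str  # "reported" by the peer, or "inferred" by us


def merge_metadata(inferred, reported):
    """Combine two {field: value} dicts, preferring what the peer reported."""
    merged = {field: Attribute(value, "inferred") for field, value in inferred.items()}
    merged.update(
        {field: Attribute(value, "reported") for field, value in reported.items()}
    )
    return merged


# Hypothetical example: we inferred three fields, the peer confirmed or corrected two.
inferred = {"feed_type": "full", "country": "NL", "does_rov": False}
reported = {"feed_type": "full", "does_rov": True}
peer_metadata = merge_metadata(inferred, reported)

if __name__ == "__main__":
    for field, attr in sorted(peer_metadata.items()):
        print(f"{field}: {attr.value} ({attr.source})")
```

A tool such as BGPlay could then choose to show only reported values, or to mark inferred ones differently, instead of presenting a guess as a fact.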

As an example of how better meta-data gives more insight, we tried to infer the physical location of the RIS peer for the cases where the peer and collector are not in the same place, i.e. for the multi-hop collectors rrc00 and rrc24. If you compare figures 1 and 2, you can see that by adding this meta-data we get a better picture of what physical locations we get data from, and also what locations we are missing data from, if we wanted to focus on coverage of geographical areas.

Figure 1: RIS route collector locations

Figure 2: RIS route collector (red open circles) and inferred RIS peer locations (green circles). Countries have a dark shade if there is a collector in the country and a lighter shade if there is at least 1 RIS peer in the country.

As you can see from the images above, there are also quite a few countries, even in our service region, that we don't have a single BGP feed from.

It would be great to get some feedback on what the most useful meta-data to collect is, and how collecting it would improve the service or enable new use-cases.

5) Collaboration

RIS isn't the only route collection system out there. A number of other systems are in place - e.g. the RouteViews project - that are also actively collecting data right now. But fortunately, collecting BGP data doesn't have to be a competitive endeavour. In fact, we like to think of it as a collaborative one, which benefits the entire community. A big part of what we want to figure out is how we can collaborate with other route collector projects in such a way as to offer the best possible joint datasets and services. What can we share and coordinate so our community is better off? I've already mentioned NOGs, but what other collaborations would help our community with collection of data that would add more insight?

Conclusion

Each of the five points raised above represents an area where we think progress can be made on improving RIS. And while we have ideas on how we would like to move forward in these areas, we really think that starting a conversation with the community right now will help make sure we move in the direction that is most useful to RIS users.

To facilitate this, we'll be holding a RIPE NCC Open House on RIS on October 20. The session will give anyone who wants to join a chance to come together with experts from across the community for a focused discussion on RIS. We very much hope to see you there.

In the meantime, we would appreciate any feedback you have to give us. Let us know your ideas on how to make things better. Let us know if you want to help out with specific parts of this. And of course, if you think of other relevant points we've not covered here, don't hesitate to tell us about them. Please write down your thoughts under this article, or email us at labs@ripe.net.

Comments 1

GT • 02 Nov 2020 21:33

For me, RIS is definitely useful in detecting BGP hijacking and localising such incidents to the origin. Also, in troubleshooting route flapping incidents. I currently use your API to capture the data (after filtering to specific ASNs and prefixes), then feed this into an ELK stack for data analysis. I have documented part of my setup here: https://www.avecseclabs.com/bif/2020/03/02/how-to-monitor-your-ip-space-for-bgp-hijacking/

Emile Aben