A Tale of BGP Collectors and Customer Cones
BGP route collectors, such as the RIPE Routing Information Service (RIS) and Route Views, collect near real-time information about the state of the control plane of the Internet from networks around the globe, and keep an archive that provides a fascinating view into the evolution of the Internet.
They provide a valuable service to the Internet, but we can always think about how to make them even more valuable. One piece of information that is complex to extract and difficult to get right is whether a path announced to a collector is from the ISP's customer cone, an internal route, or an external route learned via peering or transit. We can define these categories as follows:
- Customer Cone: Routes announced by BGP customers, static prefixes used for non-BGP customers, data center routes, etc.
- External Routes: Routes learned from peers and transit providers which the ISP would normally announce to customers. Often, ISPs do not announce such routes to collectors.
- Internal Routes: ISPs occasionally announce to the collector internal point-to-point and other routes they would not normally announce to customers, peers, or transit providers.
If it were easy to get this information from the data collected by BGP route collectors, it would enhance the value these services provide.
When an operator uses a collector to look at an ISP's announcement of a prefix, it is very useful to know if the ISP also announced it to their customers and/or peers/transits. Researchers want to differentiate similarly in order to understand route propagation. One usually wishes to ignore any internal-only routes an ISP may announce to the collector, as they would not be announcing them to peers, transits, or customers.
An ISP is expected to announce external and customer routes to its customers, and customer routes to its external peers and transits. In general, one does not need to differentiate whether the ISP will announce to peers or transits, and the ISP may not wish to expose its business relationships with external providers. So we do not differentiate peers from transit providers.
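The expected announcement behaviour can be written down as a small lookup table. The following is a minimal sketch that follows the category definitions in the list above; the names `RouteClass`, `Neighbour`, `EXPORT_POLICY` and `is_announced_to` are illustrative assumptions, not part of any tool or standard.

```python
from enum import Enum


class RouteClass(Enum):
    CUSTOMER_CONE = "customer cone"
    EXTERNAL = "external"
    INTERNAL = "internal"


class Neighbour(Enum):
    CUSTOMER = "customer"
    # Peers and transit providers are deliberately not distinguished.
    PEER_OR_TRANSIT = "peer or transit"


# Which neighbour types would normally receive each class of route,
# according to the three category definitions (an idealised model).
EXPORT_POLICY = {
    RouteClass.CUSTOMER_CONE: {Neighbour.CUSTOMER, Neighbour.PEER_OR_TRANSIT},
    RouteClass.EXTERNAL: {Neighbour.CUSTOMER},
    RouteClass.INTERNAL: set(),
}


def is_announced_to(route_class, neighbour):
    """Return True if routes of this class would normally reach this neighbour."""
    return neighbour in EXPORT_POLICY[route_class]
```

A collector sits outside this table: it may receive all three classes, which is exactly why it helps to know which class a given route belongs to.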
For instance, the following question was recently posted to the MAT-WG mailing list (by Job Snijders):
I'd really like the stat team to implement a tool that tells me what prefixes are originated AND transitted by a given ASN. The use-case is that I (before setting up BGP sessions) can make an automated assessment what amount of prefixes the network will announce, which prefixes they are likely to announce and if that matches up with the prefix-filter I would apply on that bgp session. A second use case is that when I want to connect a peer or customer to which I have never been connected before, I have no idea what they will announce to me. Should I want to make predictions for traffic engineering purposes, it would be beneficial if I could fetch a list of prefixes this peer is likely to announce to other peers from the stat system. A sliding window of say 30 days would be perfect, for my purposes I do not need the tool to offer data going back to the beginning of time. Do others agree that the above would be a useful feature?
Unfortunately, this is easier said than done. Several research groups have tried to infer customer cone-type information from BGP and related data. One popular approach is based on the valley-free assumption (also called Gao-Rexford), where peering relationships fall into one of three categories (customer-provider, peer-peer, or siblings) and there is a strict order in which to evaluate how to reach a destination: customer over peer, peer over provider. This can be regarded as a useful starting point for analysis, but it is also far from universally true. See the discussion at the end of Benno Overeinder's presentation at the RIPE 69 Routing WG, or the paper "10 Lessons from 10 Years of Measuring and Modeling the Internet’s Autonomous Systems" (Roughan et al.). The most comprehensive work out of the research community in this area is CAIDA's AS rank.
In 2006, there was an effort in the IETF to standardise announcing this type of information by using BGP communities (RFC4384). Unfortunately it didn't get a lot of traction, probably due to the complexity of that RFC.
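To make the valley-free assumption concrete, here is a minimal sketch of a check on a single AS path. It assumes a relationship table is already available, which is the hard part in practice. The function name, the relationship labels (`c2p`, `p2c`, `p2p`, `s2s`) and the choice to treat sibling links as transparent are assumptions made for illustration.

```python
def is_valley_free(as_path, rel):
    """Check one AS path against the valley-free model.

    as_path: list of ASNs as seen in BGP, i.e. the origin AS is last.
    rel: dict mapping (a, b) to the relationship of a towards b:
         "c2p" (a is a customer of b), "p2c", "p2p" or "s2s".
    """
    hops = list(reversed(as_path))  # propagation order: origin first
    past_top = False  # True once a peer or provider-to-customer link was used
    for a, b in zip(hops, hops[1:]):
        if a == b:
            continue  # AS-path prepending, not a real link
        kind = rel[(a, b)]  # raises KeyError if the link is unknown
        if kind not in ("c2p", "p2c", "p2p", "s2s"):
            raise ValueError(f"unknown relationship {kind!r}")
        if kind == "s2s":
            continue  # assumption: sibling links do not change direction
        if past_top:
            if kind != "p2c":
                return False  # going up or sideways again forms a valley
        elif kind in ("p2p", "p2c"):
            past_top = True
    return True
```

A path is accepted when it climbs over zero or more customer-to-provider links, crosses at most one peer link, and then only descends over provider-to-customer links.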
This doesn't mean that the information is not retrievable from BGP communities at all. Below, we document our attempts to figure out the customer cone of several networks, using RIS data from 1 September 2015. We'll only consider IPv4 prefixes in the text below, but the same analysis can be done for IPv6 prefixes. As you will see, this is not easy and is error-prone due to the complexity involved in the data currently available.
TL;DR: It's complex! Skip the examples for a proposed fix.
AS7018 (AT&T)
AT&T uses the following communities on its BGP routes (this information was provided to us by the AT&T engineer who established the RIS-AS7018 peering):
- 7018:1000 - large aggregates (e.g. 220.127.116.11/8 and 2001:1890::/29)
- 7018:2000 - routes from customers, announced to other customers and to peers
- 7018:2500 - routes from customers who request AT&T to announce only to other AT&T customers and not to AT&T peers
- 7018:5000 - peer routes
Each BGP route will have one and exactly one of these four communities. In addition some routes will have a second community in the range 7018:[30000-39999], but these communities have nothing to do with determining AT&T's 'customer cone.'
The set of routes received by AT&T's customers who want to see all of AT&T's customer routes is the union of the sets of routes tagged with communities 7018:1000, 7018:2000, and 7018:2500.
The set of routes received by AT&T's peers is union of the sets of routes tagged with communities 7018:1000 and 7018:2000.
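The rules described by the AT&T engineer translate directly into set membership tests. This is a sketch only: the community values are the ones quoted above, while the function name `classify_7018` and the representation of communities as strings are assumptions.

```python
# Community values as described for AS7018 above.
CUSTOMER_VISIBLE = {"7018:1000", "7018:2000", "7018:2500"}
PEER_VISIBLE = {"7018:1000", "7018:2000"}


def classify_7018(communities):
    """Return (seen_by_customers, seen_by_peers) for one route.

    communities: iterable of "ASN:value" strings attached to the route.
    Extra communities (e.g. in the 7018:30000-39999 range) are ignored.
    """
    tags = set(communities)
    return bool(tags & CUSTOMER_VISIBLE), bool(tags & PEER_VISIBLE)
```

For example, a route carrying `7018:2500` is visible to AT&T's customers but not to its peers, and a route carrying only `7018:5000` is a peer route that falls outside the customer cone.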
In the direct RIS-AS7018 RIS peering data, we find 109 IPv4 prefixes tagged with 7018:1000, 38,790 tagged with 7018:2000, and a further 9,681 tagged with 7018:2500.
In the measurements on the World IPv6 Launch website, AS6389 and AS7132 as well as AS7018 are listed as AT&T networks. AS6389, AS7132 and AS7018 can be considered sibling networks, i.e. autonomous systems under control of the same organisation. The prefixes originated from AS6389 and AS7132 are tagged with the 7018:2000 community, so from the point of view of AS7018 they are customer routes. That means they are in the customer cone. This is interesting because sibling relations do not have to be of the customer-provider kind.
Best estimate: 48,576 (routes tagged with 7018:1000, 7018:2000, or 7018:2500)
AS701 (Verizon)
For AS701 (Verizon, formerly MCI, formerly UUnet), we couldn't actually find any BGP community value that looked like tagging its customer cone, nor a likely candidate from the BGP communities that currently exist in RIS. AS701 doesn't peer with RIS or Route Views anymore, so currently this method can't be used to get to AS701's customer cone.
Best estimate: ????
AS3356 (Level 3)
AS3356 uses "3356:123" to tag its customer routes. In RIS, we see 235,314 prefixes tagged with this community. Level 3 doesn't have a direct peering with RIS, but RIS includes AS3549, which is part of the same company (in 2011 Global Crossing, which operated AS3549, was bought by Level 3). If we restrict ourselves to "3356:123" tagged prefixes through the AS3549 peering of RIS, we only find 17. This could be caused either by BGP best path selection or by the tags being stripped off.
Route Views does have direct AS3356 peering, and from that direct peering we find 283,400 prefixes tagged with "3356:123". If we consider the 2,555 prefixes that in this direct peering are directly originated by AS3356, roughly half are tagged with this community, the other half is not (1,052 tagged vs. 1,533 not).
Best estimate: 284,933 (Route Views direct peer + AS3356 as originator if not tagged yet)
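The estimate above is the union of two sets: prefixes carrying the customer tag, and prefixes originated by the peer itself whether tagged or not. A minimal sketch, assuming the routes from one direct peering have already been parsed into `(prefix, as_path, communities)` tuples; the function name and the tuple layout are assumptions.

```python
def estimate_cone(routes, peer_asn, customer_tag):
    """Estimate a customer cone from the routes of one direct peering.

    routes: iterable of (prefix, as_path, communities) tuples, where
            as_path is a list of ASNs with the origin AS last.
    A prefix counts if it carries customer_tag or is originated by peer_asn.
    """
    cone = set()
    for prefix, as_path, communities in routes:
        tagged = customer_tag in communities
        self_originated = bool(as_path) and as_path[-1] == peer_asn
        if tagged or self_originated:
            cone.add(prefix)
    return cone
```

Using a set means a prefix that is both tagged and self-originated is counted only once, which matches adding only the untagged self-originated prefixes to the tagged total.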
AS2914 (NTT)
AS2914 uses "2914:410" to tag its customer routes, and in RIS we see a total of 180,220 unique IPv4 prefixes tagged that way. Restricting ourselves to the three direct peerings RIS has with AS2914, we see 174,219 for the peering at PAIX and 176,900 for the other two (LINX and DE-CIX). A total of 173,551 prefixes is seen at all three of these feeds. It is interesting how this shows regional differences. If we take the prefixes seen in all feeds and compare them to the 546 prefixes that are originated from AS2914, we find ten prefixes that are not tagged with the customer route community.
Best estimate: 173,561
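When several direct feeds from the same network are available, the conservative approach used above is to keep only the tagged prefixes seen in every feed and then add the self-originated prefixes that lack the tag. A sketch under the assumption that each feed has already been reduced to a set of tagged prefixes; the function name and the input shapes are hypothetical.

```python
def cone_from_feeds(tagged_by_feed, originated):
    """Combine several feeds of one network into a cone estimate.

    tagged_by_feed: dict mapping a feed name to the set of prefixes
                    carrying the customer tag in that feed.
    originated: set of prefixes originated by the network itself.
    Returns (estimate, untagged_originated).
    """
    feeds = list(tagged_by_feed.values())
    common = set.intersection(*feeds) if feeds else set()
    untagged_originated = originated - common
    return common | untagged_originated, untagged_originated
```

The second return value corresponds to the self-originated prefixes that would be missed by looking at the customer community alone.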
These four examples show the complexities and inaccuracies involved in estimating customer cones from route collector data. This is not the fault of the network operators providing the feed; there is no standard by which they can signal this type of information.
An additional complexity, which we didn't analyse in detail, is networks tagging everything but the customer cone (AS6453 (TATA) appears to be doing this).
If you wanted to use current BGP community attributes to look into a network's customer cone, you'd need to maintain a mapping of specific tags for specific networks, i.e. an extra level of indirection which complicates analysis, and needs maintenance. This is a lot of work that a researcher or operator is not necessarily looking forward to. And even then, the current BGP community tags used by operators may or may not contain the prefixes the peer originates itself, which further complicates matters.
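Such a mapping could look like the sketch below. The community values are the ones found in the examples in this article; the table name, the function name, and the choice to return `None` for unmapped networks are assumptions, and a real table would need ongoing maintenance.

```python
# Per-network customer-route communities, taken from the examples above.
CUSTOMER_TAGS = {
    7018: {"7018:1000", "7018:2000", "7018:2500"},
    3356: {"3356:123"},
    2914: {"2914:410"},
}


def in_customer_cone(asn, communities):
    """Return True or False if a mapping exists for asn, else None.

    None means the network's tagging scheme is unknown (as for AS701),
    which is not the same as the route being outside the cone.
    """
    tags = CUSTOMER_TAGS.get(asn)
    if tags is None:
        return None
    return bool(tags & set(communities))
```

Note that this lookup does not cover self-originated prefixes that lack the tag, nor networks that tag everything except the customer cone, so each of those cases would need its own handling.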
We found two pieces of research in this area. First, there is a BGP Community taxonomy from 2008 by Benoît Donnet, with an accompanying webpage that contains the actual taxonomy. More recently, Vasileios Giotsas captured BGP community mappings as part of his thesis, and a mapping file is available here, which was used for validating CAIDA's AS rank work.
A new Internet-Draft, Marking Announcements to BGP Collectors, proposes to make it a best current practice to tag prefixes towards route collectors with one of three BGP community markings.
RIS and Route Views peers can make BGP route collector data more useful by adding a couple of simple BGP community markings. If you are in support of this and peer with RIS or Route Views, please consider tagging your routes once this Internet-Draft has matured. If you have ideas or comments on this, please don't hesitate to comment on the IETF GROW mailing list.