This is the first of two articles in which we present a new attack vector on the routing infrastructure of the Internet using BGP communities.
The Internet as we know it relies on a combination of different protocol stacks, protocol extensions, tools and best practices. Additional features and extensions, such as BGP communities, have been added over the years to make Internet operations feasible.
BGP communities (as defined in RFC 1997) are used to signal reachability information between Autonomous Systems (ASes). In principle, BGP is used to distribute routes amongst peers, but for network management purposes it is possible to attach additional properties to each routing update. These properties are encoded within the 'communities' attribute in BGP.
Most network operators happily work with BGP communities on a daily basis and, from what we have gathered, do not see them as a potential security issue. However, under certain circumstances, BGP communities can be used to drop and redirect traffic. In principle, the same effect can be achieved via BGP hijacks, but community-based attacks are much harder to spot and almost impossible to attribute.
We, as measurement researchers, took a closer look to understand what could possibly go wrong when dealing with BGP communities.
BGP community usage is increasing
When looking at the publicly available BGP data provided by services such as RIPE NCC's Routing Information Service (RIS), RouteViews, PCH and Isolario, not only is the number of ASes and prefixes increasing, but also the use of BGP communities. Between 2010 and 2018, the number of communities almost tripled (as shown in Figure 1 below). Such behaviour makes us researchers very curious.
Figure 1: Historic development of BGP data: RIPE Routing Information Service (RIS) and RouteViews
Before we explore this phenomenon any further, let’s first define what communities are.
A community is a single 32-bit field (an integer) carried in an optional attribute of BGP update messages. By convention, the integer is split into two 16-bit values, interpreted as an AS number and a value, where the AS referred to in the community usually defines the meaning of the value.
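The split into two 16-bit halves can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for this example; the `ASN:value` notation is the one used for communities such as 2:303 later in this article.

```python
from typing import Tuple


def encode_community(asn: int, value: int) -> int:
    """Pack a 16-bit AS number and a 16-bit value into one 32-bit integer."""
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both halves must fit into 16 bits")
    return (asn << 16) | value


def decode_community(raw: int) -> Tuple[int, int]:
    """Split a 32-bit community into its (AS number, value) halves."""
    if not 0 <= raw <= 0xFFFFFFFF:
        raise ValueError("a community is a 32-bit value")
    return raw >> 16, raw & 0xFFFF


def format_community(raw: int) -> str:
    """Render a community in the conventional 'ASN:value' notation."""
    asn, value = decode_community(raw)
    return f"{asn}:{value}"


raw = encode_community(2, 303)
print(raw, format_community(raw))
```

Note that `encode_community` rejects any AS number above 65535, so an AS with a larger number cannot appear in the left-hand half.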
Here we run into the first problem: today we have ASes with 32-bit AS numbers that do not fit into that scheme and therefore cannot use communities in a sensible way. To mitigate this, RFC 8092 defines BGP Large Communities, a 96-bit (12-byte) field that allows these ASes to use communities as well.
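As a rough sketch of the layout, the 96 bits of a Large Community are three 32-bit fields: a global administrator followed by two local data parts (the terminology used in RFC 8092). The helper names below are hypothetical, and the AS number 4200000000 is just an illustrative value from the private 32-bit range.

```python
import struct
from typing import Tuple


def pack_large_community(global_admin: int, local1: int, local2: int) -> bytes:
    """Encode the three 32-bit fields as 12 bytes in network byte order."""
    for field in (global_admin, local1, local2):
        if not 0 <= field <= 0xFFFFFFFF:
            raise ValueError("each field must fit into 32 bits")
    return struct.pack("!III", global_admin, local1, local2)


def unpack_large_community(data: bytes) -> Tuple[int, int, int]:
    """Decode 12 bytes back into (global administrator, local 1, local 2)."""
    if len(data) != 12:
        raise ValueError("a large community is exactly 12 bytes long")
    return struct.unpack("!III", data)


# A 32-bit AS number fits into the global administrator field,
# although it would not fit into the 16-bit half of a classic community.
encoded = pack_large_community(4200000000, 1, 303)
print(len(encoded), ":".join(str(part) for part in unpack_large_community(encoded)))
```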
The only problem here (from an Internet measurement perspective) is that BGP Large Communities are still quite young and adoption is slow (worse than that of IPv6). We currently see only 51 global administrators (comparable to the “AS part” of ‘legacy’ BGP communities) using them. Please note that at the time of publication, the number of BGP Large Communities has increased to over 100; for more information, see BGP Large Communities Uptake - An Update.
For a detailed description of the development and usage of BGP Large Communities, see this excellent article by Job Snijders in the Internet Protocol Journal, Volume 20, Issue 1.
Use of BGP communities
All BGP communities fall into one of the following two categories:
- Informational communities with passive semantics
- Action communities with active semantics.
Informational communities only tag an announcement with some additional information, for example, the location where a router has learned the route or a value signalling the RTT on a specific link. Action communities, by contrast, directly influence routing policy decisions on a peer. They are used to signal an intent from one AS to another, for example, to execute path prepending, to change the local preference/MED or to apply Remote Triggered Blackholing (RTBH) to a prefix (more on that in the next article).
The problem here is that there are almost no standardised communities. Without documentation, you cannot tell if a community is active or passive - you don’t even know what it’s being used for in the first place!
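To make this concrete, here is a small sketch that looks up a community in a table of the few well-known values standardised in RFCs (RFC 1997, RFC 7999 and RFC 8326) and reports everything else as unknown. The function name and the table are assumptions made for this illustration; the table is not a complete registry.

```python
from typing import Optional

# A handful of standardised (well-known) communities. All of them are
# action communities; anything outside this table is operator-defined.
WELL_KNOWN = {
    (65535, 0): "GRACEFUL_SHUTDOWN",
    (65535, 666): "BLACKHOLE",
    (65535, 65281): "NO_EXPORT",
    (65535, 65282): "NO_ADVERTISE",
    (65535, 65283): "NO_EXPORT_SUBCONFED",
}


def well_known_name(asn: int, value: int) -> Optional[str]:
    """Return the standard name of a community, or None if it has none."""
    return WELL_KNOWN.get((asn, value))


for community in [(65535, 65281), (3, 123)]:
    name = well_known_name(*community)
    label = name if name else "unknown without the operator's documentation"
    print(f"{community[0]}:{community[1]} -> {label}")
```

For a community such as 3:123, the lookup returns nothing: whether it is informational or triggers an action can only be learned from the documentation of AS3, if such documentation exists.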
BGP communities as an attack vector?
Given the increasing popularity of BGP communities and the ability to remotely trigger actions and relay information, the question arises - to what extent can BGP communities be leveraged for attacks?
One necessary condition for community-based attacks is that communities are being propagated from one peer to another.
We observed that 14% of transit providers propagate received communities to their peers. At first, this number (2.2k out of 15.5k ASes) seems rather small. But given the ongoing flattening of the Internet topology and the highly connected AS graph, it is actually sufficient for a wide propagation of communities.
The propagation behaviour of BGP communities is specified in two separate RFCs:
- RFC 1997 defines communities as a transitive optional attribute and
- RFC 7454 states an AS should scrub communities used internally but forward foreign communities.
Still, many people do not expect communities to be propagated that widely, including the authors of this work!
Propagated communities might trigger actions multiple hops away, beyond the direct peer they are announced to. There is no way of knowing whether the observed behaviour is intended (for example, for traffic management) or not. Which AS added the community to the announced prefix also remains unknown (more on that later).
Our assessment here is that there is a high risk of misuse or even attacks!
Let’s look into some measurements!
In order to assess how widely BGP communities are actually propagated throughout the Internet, we analysed the available BGP routing data on community propagation behaviour.
Our dataset consists of BGP dumps provided by four projects (RIPE RIS, RouteViews, Isolario, and PCH) for the full month of April 2018. There were almost 39 billion BGP messages to process - and we found more than 63k different communities in these messages.
More than 75% of all BGP announcements have at least one BGP community set. 5,659 ASes are using communities. That is about 9% of the total number of ASes.
We found that 10% of the observed communities have an AS-hop-count of more than six different ASes and more than 50% traverse more than four ASes. We even found a community that is forwarded across 11 ASes. If you compare this with the mean distance of AS-paths of 4.7, or the median of 5, from our datasets, this means that communities are pretty much propagated globally.
But how can we measure the AS-hop-count of a community while claiming that one cannot say which AS added the community in the first place? Well, to be honest, we cannot. We can only infer from the routing data which AS on the AS path has probably added it, which gives us a lower bound for the travel distance. This means that the actual travel distance could be even higher.
Consider this simple example (Figure 2):
Figure 2: Inferring the travel distance of a community (AS hops)
We infer that AS2 added the community 2:303 and that AS3 added the community 3:123, whereas in fact AS2 added community 3:123 as well. This gives us an AS-hop-count of 1 for community 3:123 and 2 for community 2:303, which is only a lower bound.
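The inference can be sketched in a few lines of Python. This is a simplified illustration, not our measurement code: the function names are hypothetical, and the AS path `[4, 3, 2, 1]` (collector peer first, origin last) is an assumed topology chosen so that the result matches the hop counts quoted for Figure 2.

```python
from typing import Dict, List, Tuple

Community = Tuple[int, int]  # (AS number, value)


def collapse_prepending(as_path: List[int]) -> List[int]:
    """Drop consecutive repeats so a prepended AS is counted only once."""
    collapsed: List[int] = []
    for asn in as_path:
        if not collapsed or collapsed[-1] != asn:
            collapsed.append(asn)
    return collapsed


def inferred_hop_counts(
    as_path: List[int], communities: List[Community]
) -> Dict[Community, int]:
    """Assume the AS named in a community added it, and count AS hops
    from that AS to the collector's peer (the first AS in the path).
    Communities whose AS is not on the path are skipped."""
    path = collapse_prepending(as_path)
    hops: Dict[Community, int] = {}
    for community in communities:
        asn = community[0]
        if asn in path:
            hops[community] = path.index(asn)
    return hops


# Assumed path as seen at the collector: AS4 is the peer, AS1 the origin.
print(inferred_hop_counts([4, 3, 2, 1], [(2, 303), (3, 123)]))
```

Because the AS named in the community is taken as the point of insertion, a community that was really added further upstream (such as 3:123 being added by AS2) is undercounted, which is why the result is only a lower bound.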
Leaking BGP communities
Another interesting observation we made is the ‘leaking’ of communities ‘off-path’. This refers to communities found on the route collectors that contain ASNs in the first 16 bits which are not present in the AS path of the announcement, that is, they have not traversed the AS denoted in the community (Figure 3 illustrates this scenario).
Figure 3: Communities for ASes not on the AS path - 'leaking' off-path
Why is this the case? Some ASes attach communities that are meant to trigger actions at one of their upstreams to the announcements sent to all of their peers, and not just to the upstream AS where this would be required. Furthermore, if a community is directed at an AS multiple hops away, the direct peer might propagate the prefix to other peers, including all communities received. We thus distinguish on-path and off-path communities.
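The distinction can be expressed as a simple membership test. The sketch below is an illustration under simplifying assumptions: the function name is hypothetical, the example path and communities are made up, and well-known or private-range communities are not treated specially.

```python
from typing import List, Tuple

Community = Tuple[int, int]  # (AS number, value)


def split_on_off_path(
    as_path: List[int], communities: List[Community]
) -> Tuple[List[Community], List[Community]]:
    """Separate communities whose AS number appears in the AS path
    (on-path) from those whose AS number does not (off-path)."""
    path_asns = set(as_path)
    on_path: List[Community] = []
    off_path: List[Community] = []
    for community in communities:
        if community[0] in path_asns:
            on_path.append(community)
        else:
            off_path.append(community)
    return on_path, off_path


# AS5 is not on the path, so 5:666 has 'leaked' off-path.
on_path, off_path = split_on_off_path([4, 3, 2, 1], [(3, 123), (5, 666)])
print("on-path:", on_path, "off-path:", off_path)
```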
More about this and Remote Triggered Blackholing (RTBH) in the next article.
On-path versus off-path
When comparing the distribution of values (the 16-bit right-hand part of the communities) for off-path and on-path communities, we find distinct differences, as you can see in Figure 4 below.
Figure 4: Distribution of values for off-path and on-path communities
In the off-path community values on the left, you find, for example, values used to trigger blackholing in upstream networks. Meanwhile, in the on-path communities on the right, you notice that the ‘nice’ numbers, which are easy to remember, are dominant. We think that blackholing communities are so prominent on the off-path side because ASes that do not implement blackholing will not touch blackholing communities and simply propagate them to all peers. On the other hand, the on-path values look like operator-assigned numbers for traffic engineering; limited manual checks against the community documentation of some ASes confirm that suspicion.
In our next article, we will discuss the results of our experiments with Remote Triggered Blackholing (RTBH), which is commonly used to mitigate denial of service (DoS) attacks, and provide recommendations as to what providers can do to prevent external parties from interfering with their networks.