Large BGP Communities - Largely Accepted Now?
BGP communities allow networks to share extra bits of information about routes. This allows network operators to better tune the traffic coming in and out of their network. The way the bits in the original BGP communities specification are used allows only 16-bit ASNs to be used in BGP communities.
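To make the size limit concrete, here is a minimal sketch of how a classic community is conventionally packed into 32 bits: the ASN in the upper 16 bits and an operator-defined value in the lower 16. The function name `encode_classic_community` is our own for illustration and is not taken from any BGP implementation; ASN 65536 is used only as an example of a 32-bit ASN.

```python
import struct

def encode_classic_community(asn: int, value: int) -> bytes:
    """Pack a classic BGP community (ASN:value) into its 4-octet wire form."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError(f"ASN {asn} does not fit in 16 bits")
    if not 0 <= value <= 0xFFFF:
        raise ValueError(f"value {value} does not fit in 16 bits")
    # Network byte order: 2 octets of ASN followed by 2 octets of value.
    return struct.pack("!HH", asn, value)

# A 16-bit ASN fits.
print(encode_classic_community(3333, 100).hex())

# A 32-bit ASN cannot be expressed in the ASN half of the community.
try:
    encode_classic_community(65536, 100)
except ValueError as err:
    print(err)
```

The second call fails by construction, which is exactly the limitation operators of 32-bit ASNs run into.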
And here is the problem: The shortage of 16-bit ASNs means that the Internet is increasingly populated with 32-bit ASNs. These have been in use since 2007, but still, people seem to prefer the old 16-bit ASNs over the new 32-bit ASNs. One reason for this is the lack of support for BGP communities for 32-bit ASNs (for more explanation, please see BGP Large Communities - Time to Take Action).
The latest attempt to fix the issue is called Large BGP Communities. And things look good for Large BGP Communities. There are a number of active implementations and blog posts, and very recently INEX became the first IXP to deploy it in production on its route server.
However, there was a small problem when deploying this new attribute in BGP. After the draft specification was largely worked out, and an early allocation of a specific identifier for this new BGP attribute was made, a test prefix was announced with the specific IANA BGP path attribute value of 30 (we will refer to it as attribute 30 in this article). It was discovered that one network could not send traffic to the test prefix, which was later tracked down to a specific router vendor. Subsequently, more attribute identifiers were discovered that had been squatted by various vendors, so they cannot actually be used to extend BGP.
In the meantime, IANA assigned the BGP path attribute value 32 to Large BGP Communities (attribute 32).
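As a rough illustration of what sits on the wire, the sketch below builds a path attribute carrying one Large BGP Community. It assumes the layout from the Large BGP Communities specification (three 32-bit fields per community) and the generic BGP path attribute header (flags, type code, length). The function and constant names are our own, and ASN 65536 is only an example value.

```python
import struct

LARGE_COMMUNITY_TYPE = 32      # IANA-assigned path attribute type code
FLAG_OPTIONAL = 0x80           # attribute is optional
FLAG_TRANSITIVE = 0x40         # attribute is passed on to other ASes
FLAG_EXTENDED_LENGTH = 0x10    # length field is 2 octets instead of 1

def encode_large_communities(communities):
    """Encode (global_admin, local1, local2) triples as a BGP path attribute."""
    # Each community is 12 octets: three unsigned 32-bit integers.
    value = b"".join(struct.pack("!III", ga, l1, l2) for ga, l1, l2 in communities)
    flags = FLAG_OPTIONAL | FLAG_TRANSITIVE
    if len(value) > 255:
        flags |= FLAG_EXTENDED_LENGTH
        header = struct.pack("!BBH", flags, LARGE_COMMUNITY_TYPE, len(value))
    else:
        header = struct.pack("!BBB", flags, LARGE_COMMUNITY_TYPE, len(value))
    return header + value

attr = encode_large_communities([(65536, 1, 2)])
print(attr.hex())
```

Because the Global Administrator field is a full 32 bits, a 32-bit ASN fits without any mapping tricks.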
In this article we measure if this new attribute creates any problems and if it can be widely used. We look at it both from the BGP control plane (the RIPE Routing Information System RIS and CAIDA Periscope) as well as the data plane (RIPE Atlas). Spoiler: For the new BGP communities attribute value 32, we did not find any problems. So as far as we can tell from our measurement data the path is clear for wide deployment of the current specification of Large BGP Communities.
Control Plane (RIPE RIS)
If we look at RIPE RIS, which collects BGP data, we see no sign that either the test prefix with attribute 30 or the prefix with attribute 32 had issues propagating. We tested attribute 30 one day after it was announced. For attribute 32, which is still announced, you can see the current visibility in the RIPEstat visibility widget. As Figure 1 shows, it is visible from all of the 158 full-feed BGP vantage points RIPE RIS has.
Figure 1: RIPE RIS visibility of a test prefix with attribute 32
For an alternative view of the control plane we also used Periscope. With Periscope we have access to 403 BGP looking glasses. For these we compared the visibility of a test prefix with attribute 32 to a prefix without the attribute but announced from the same location. We found that the tables of 310 of the BGP looking glasses included the test prefix without attribute 32, and all of them also included the prefix with attribute 32.
Unknown BGP attributes in the last year
Because of the history with the attribute value shifting from 30 to 32, it is interesting to see which other attribute values were visible in RIS over time. Because of a technical difficulty (see footnote 1) we can only do this on updates, so the reported values are based on the number of updates, not number of prefixes seen.
If we look at this data for the last year (November 2015 until October 2016) we see this many BGP update messages with attributes unknown to the libbgpdump parser:
|Attribute value||Description||Number of BGP updates|
|21||AS_PATHLIMIT (deprecated) (draft-ietf-idr-as-pathlimit)||74,200|
|20||Connector attribute (deprecated) (RFC6037)||57,157|
|30||draft-ietf-idr-large-community (now deprecated)||2,547|
|28||BGP Entropy Label Capability Attribute (RFC6790, deprecated by RFC7447)||436|
Table 1: "Unknown" BGP attributes, as found in BGP updates in RIPE RIS
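The tally behind a table like this can be sketched as follows. The sketch assumes the updates have already been parsed into one list of path attribute type codes per update message. Both that input format and the set of "known" type codes are illustrative assumptions, not the actual libbgpdump-based pipeline, and the sample input is toy data rather than RIS data.

```python
from collections import Counter

# Assumed set of type codes the parser understands (illustrative only).
KNOWN_TYPES = {1, 2, 3, 4, 5, 6, 7, 8, 14, 15, 16, 17, 18}

def count_unknown_attributes(updates):
    """Count update messages (not prefixes) per unknown attribute type code."""
    counts = Counter()
    for attr_types in updates:
        # Use a set so each update is counted once per type code.
        for code in set(attr_types) - KNOWN_TYPES:
            counts[code] += 1
    return counts

# Toy input: each inner list holds the attribute type codes of one update.
updates = [[1, 2, 3], [1, 2, 3, 21], [1, 2, 3, 8, 30], [1, 2, 21]]
print(count_unknown_attributes(updates).most_common())
```

Counting per update rather than per prefix mirrors the limitation described above: one update can carry many prefixes, so these numbers indicate prevalence only roughly.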
In this mail to the IETF IDR-WG list we detailed some of our findings: attributes 20, 21 and 28 are deprecated, and 128 should probably not be seen in the default free zone. We only saw attribute 30 after 11 October, which was when the first test prefix with attribute 30 was announced. In later data we of course also see attribute 32, but again only after it was announced on a test prefix. This is interesting, because it shows that while attribute 30 was squatted before these dates, we would not have been able to uncover that by looking at RIS BGP data alone. Currently there is a draft that will deprecate various squatted BGP attribute values.
Conclusion: With RIS we can see various BGP attribute values being used and we get a sense of their prevalence. By only monitoring BGP we would not have caught attribute 30 being squatted. For all the BGP data we looked into, we found no signs of prefixes in messages with attribute 32 not being propagated in BGP.
Data Plane (RIPE Atlas)
In the analysis above we looked at the Internet control plane, but looking at that alone will not answer the important question here: will packets flow through the Internet towards a prefix that has a Large BGP Community defined on it? We can't answer that for the whole Internet, but we can look at it from over 9,300 RIPE Atlas probes.
During the announcement of a prefix with attribute 30 there was only a single one-off measurement from all RIPE Atlas probes towards a pingable address in that prefix. The results of that measurement are visible in Figure 2 below. We found 216 probes that report reachability problems. Unfortunately we can't distinguish between temporary reachability failures and attribute 30 related problems from this data alone, but compared to the data we collected on attribute 32 later on (see Figure 3 below), there is a significantly larger number of probes where we see reachability problems. We know of at least one network that was affected by the attribute 30 squatting. We verified that the probes in this network indeed didn't receive responses to their ping requests towards the prefix with attribute 30 set. This indicates that this type of problem is detectable with this type of measurement.
Figure 2: 216 RIPE Atlas probes report reachability problems for attribute 30
For attribute 32 we set up a recurring ping measurement towards pingable targets both in an attribute 32 prefix and in a test prefix (no-attribute-32) that was announced without attribute 32 from the same router. If we look at the results of the reachability test towards the attribute 32 prefix on a map (Figure 3), we find roughly 40 RIPE Atlas probes for which none of the five ping requests the probe sent for this measurement got answered.
If we look at this over a two-day period and aggregate multiple rounds of this measurement, we see that most of these were temporary failures. For the 9,313 probes where we could compare the attribute 32 prefix test to the no-attribute-32 baseline test, all probes received responses to ping requests to the attribute 32 prefix (for details see footnote 2). For the part of the Internet we can test with RIPE Atlas, putting attribute 32 on a prefix does not seem to negatively affect reachability.
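Separating temporary failures from persistent ones across rounds can be sketched like this. The sketch assumes RIPE Atlas ping results as dictionaries carrying the `prb_id`, `sent` and `rcvd` fields, already grouped per measurement round. The helper name and the sample data are our own and purely illustrative.

```python
def persistently_unreachable(rounds):
    """Return IDs of probes that never received a ping reply in any round."""
    seen, reached = set(), set()
    for results in rounds:
        for res in results:
            seen.add(res["prb_id"])
            # A probe counts as reaching the target if at least one
            # of its ping requests in this round was answered.
            if res.get("rcvd", 0) > 0:
                reached.add(res["prb_id"])
    return seen - reached

# Toy data: probe 2 fails in round one only, probe 3 fails in every round.
rounds = [
    [{"prb_id": 1, "sent": 5, "rcvd": 5},
     {"prb_id": 2, "sent": 5, "rcvd": 0},
     {"prb_id": 3, "sent": 5, "rcvd": 0}],
    [{"prb_id": 1, "sent": 5, "rcvd": 4},
     {"prb_id": 2, "sent": 5, "rcvd": 3},
     {"prb_id": 3, "sent": 5, "rcvd": 0}],
]
print(persistently_unreachable(rounds))
```

A probe that shows up as failing in a single round, like probe 2 here, drops out once any later round succeeds.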
Figure 3: In a single round of measurements towards a pingable target in a prefix originated with attribute 32 set, only 40 RIPE Atlas probes report reachability problems to that target. These all turned out to be temporary failures or probes behind devices with partial Internet visibility.
Our measurement results on prefixes with Large BGP Communities are reassuring: we didn't find any problems - neither in RIPE RIS nor in RIPE Atlas.
RIPE Atlas is deployed in 6% of ASNs, so using RIPE Atlas allows us to measure a decent sample of the Internet's ASNs (although, like with election polls, we don't know how representative our sample is relative to the total Internet population).
We know that many people host RIPE Atlas probes for the good of the Internet. We hope this is one of the measurements that helps make things a little better. The benefit of hosting a probe is that you'll be included in future censuses like this, so if your network contains potentially problematic hardware we can measure it and alert you. If you are not hosting a probe yet, you can apply for a probe.
If you have comments, suggestions, or if you think we could or should do additional measurements, please comment below.
Footnote 1: Our collector infrastructure always captures the raw BGP messages within the MRT update files, regardless of whether the collector understood the attributes in the message. Within the MRT table dumps, the dumps produced by Quagga do not include all unknown transient attributes, and only include certain attributes which the Quagga code chooses to add. The dumps produced by our newer (ExaBGP-based) collectors (RRC18 and up) include all attributes from the original update message in the table-dump entry for each prefix.
Footnote 2: We take all measurements from 2016-11-07 to 2016-11-10 towards the attribute 32 prefix and a no-attribute-32 prefix announced from the same location. From both of these sets of measurements we create 4 sets of probes:
- probes that were part of the attribute 32 prefix test (ALL_LB32)
- probes that were part of the no-attribute-32 prefix test (ALL_NOLB32)
- probes that received responses when pinging the attribute 32 prefix pingable target (OK_LB32)
- probes that received responses when pinging the no-attribute-32 prefix pingable target (OK_NOLB32)
We then took the intersection of the OK_NOLB32 and ALL_LB32 probe sets (this is the set of probes that had visibility to the no-attribute-32 prefix and were part of the attribute 32 prefix test, a total of 9,313 probes) and removed the OK_LB32 probe set. This resulted in an empty set, i.e. all the probes that we could test this way did have visibility to the attribute 32 prefix. Both measurements were created with the system-ipv4-works tag, but because they were started on different days, the sets of probes taking part in the measurements were slightly different: probes are not removed from measurements if they change away from the system-ipv4-works state.
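The set arithmetic in this footnote can be written down in a few lines. The following is a minimal sketch with made-up probe IDs. The function name is our own, and the real analysis ran over the full measurement results rather than hand-written sets.

```python
def probes_missing_attr32(all_lb32, ok_lb32, ok_nolb32):
    """Return (comparable probes, probes that reach the baseline but not attribute 32)."""
    # Probes that reached the no-attribute-32 target and also took part
    # in the attribute 32 test.
    comparable = ok_nolb32 & all_lb32
    # Of those, the probes that never got a reply from the attribute 32 target.
    return comparable, comparable - ok_lb32

# Toy probe IDs: probe 4 only took part in the baseline test,
# probe 5 could reach neither target.
ALL_LB32 = {1, 2, 3, 5}
OK_LB32 = {1, 2, 3}
OK_NOLB32 = {1, 2, 3, 4}

comparable, missing = probes_missing_attr32(ALL_LB32, OK_LB32, OK_NOLB32)
print(len(comparable), missing)
```

An empty second set, as in the toy data, corresponds to the result reported above: no probe that could reach the baseline prefix failed to reach the attribute 32 prefix.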