RIS is a widely used public route collector project that provides a better view of Internet routing. One thing that makes RIS unique is that it inserts observable patterns of activity into BGP through a process called beaconing. We take a look at RIS beacons and ask how to make space for more.
RIS is designed to observe the state of global routing - but it isn’t an entirely passive observer. Part of the infrastructure is set up to actively announce and withdraw - at publicly scheduled intervals - a set of prefixes in BGP. These are the RIS beacons, and by emitting predictable patterns of activity into global routing whose effects can then be observed, they help us get a better understanding of the dynamics of BGP.
The actual beaconing process happens at the various RIS route collectors (RRCs) we have distributed across the Internet. Each RRC announces a set of prefixes (in IPv4, IPv6, anchoring prefixes, etc.) from AS12654 to networks connected to it by means of a BGP peering session. As RRCs have a distinct physical and topological location - i.e. a city and IXP - we see where the patterns originate, giving us a geographically rooted view of the resulting effects in BGP. The exceptions here are the multi-hop RRCs, which have established BGP sessions with networks from all over the world, meaning the beaconing done from them doesn’t have a strong geographical focus.
Making space
Since we've been doing RIS beaconing for some years now, adding more RRCs as we’ve gone along, we filled the IPv4 address space that we have dedicated to this, and with the current shortage of free IPv4 address space it doesn’t seem prudent to ask for more. At the same time there are a few new types of beaconing that we could start doing to gain better insights into various aspects of BGP routing. So the time seems right to revisit our beaconing setup and see how we can improve this.
This post is a more extensive version of the talk I gave at RIPE 86 and is best read as a call for feedback on plans to make space for new and improved ways of doing RIS beaconing.
The per-RRC beaconing setup
You can get a detailed writeup of our beaconing setup in the RIS Docs.
Our standard practice, each time we deployed a new RRC, has been to give it an anchor prefix (stable) and a beacon prefix (flipping between up and down state every two hours). As we did this in both IPv4 and IPv6, this setup requires four prefixes in total for each RRC.
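As a rough illustration of the beacon cycle, the sketch below computes whether a beacon prefix is expected to be announced at a given moment. It assumes the two-hour up and down periods are aligned to UTC, with announcements at 00:00, 04:00, 08:00 and so on. That alignment and the function name `beacon_is_up` are assumptions for this example, so check the RIS Docs for the published schedule.

```python
from datetime import datetime, timezone

UP_HOURS = 2      # beacon is announced for two hours ...
DOWN_HOURS = 2    # ... and then withdrawn for two hours


def beacon_is_up(ts: datetime) -> bool:
    """Return True if a beacon prefix is expected to be announced at `ts`.

    Assumption: each cycle starts with an announcement at 00:00, 04:00, ... UTC.
    """
    if ts.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    hour = ts.astimezone(timezone.utc).hour
    return hour % (UP_HOURS + DOWN_HOURS) < UP_HOURS


if __name__ == "__main__":
    print(beacon_is_up(datetime(2023, 6, 1, 1, 30, tzinfo=timezone.utc)))
    print(beacon_is_up(datetime(2023, 6, 1, 2, 0, tzinfo=timezone.utc)))
```

The anchor prefix needs no such function: it is announced all the time, which is what makes it a useful baseline to compare the beacon against.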
As mentioned earlier, since RRCs have a specific geographic and topological location, setting things up on a per-RRC basis in this way has allowed us to look at BGP propagation properties for these specific locations.
That said, an issue with this setup, as also mentioned, is that it takes up most of the IPv4 prefixes we have available for beaconing, which is a real problem when we want to extend or try out other more specialised setups (we’ll come to those a few sections down). Before we can move on with anything else, we need to reduce the current address space usage.
The next section looks at current patterns in propagation and after that we propose a way to reduce the footprint based on the analysis.
Analysis
Given the aim is to inject announcement/withdrawal patterns into BGP and observe the results for Internet routing, one thing we would want to ensure with RIS beacons is that a diverse range of upstream networks are actually choosing to propagate the relevant prefixes.
To get a measure of the diversity of upstreams around each RRC available for beaconing, I looked at how many upstream networks propagate our beacons to figure out to what extent these are useful. The idea is that if we have a limited number of upstreams, our propagation will be more about these upstreams than about the common behaviour we see at these points in the Internet topology (i.e. IXP LANs and how participants there propagate prefixes).
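One plausible way to count upstreams, sketched below, is to take the AS paths seen for a beacon prefix and collect the distinct networks that sit directly next to the origin AS12654. This is an illustration and not necessarily the method used for the numbers in this post. The helper names are hypothetical, and the example paths use ASNs from the documentation range.

```python
RIS_ORIGIN = 12654  # origin AS of the RIS beacons


def upstream_of_origin(as_path, origin=RIS_ORIGIN):
    """Return the AS adjacent to the origin in `as_path`, or None.

    `as_path` is a list of ASNs with the origin last. Trailing repeats of
    the origin (prepending) are skipped.
    """
    i = len(as_path)
    while i > 0 and as_path[i - 1] == origin:
        i -= 1
    if i == len(as_path) or i == 0:
        # Path does not end in the origin, or contains nothing but the origin.
        return None
    return as_path[i - 1]


def count_upstreams(as_paths, origin=RIS_ORIGIN):
    """Count distinct networks that propagate the prefix directly from the origin."""
    upstreams = set()
    for path in as_paths:
        upstream = upstream_of_origin(path, origin)
        if upstream is not None:
            upstreams.add(upstream)
    return len(upstreams)


if __name__ == "__main__":
    example_paths = [
        [64496, 64497, 12654],
        [64498, 64497, 12654],
        [64499, 12654, 12654],   # origin prepended once
        [64500, 64501],          # unrelated path, ignored
    ]
    print(count_upstreams(example_paths))
```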
The table below shows the number of networks that propagate our RIS beacon prefixes per RRC:
RRC | Where are the peers? [*] | IPv4 networks | IPv6 networks |
---|---|---|---|
RRC00 | Worldwide (Multihop) | 64 | 9 |
RRC01 | London (LINX/LONAP) | 33 | 27 |
RRC03 | Amsterdam (AMS-IX, NL-ix) | 30 | 29 |
RRC04 | Geneva (CIXP) | 5 | 3 |
RRC05 | Vienna (VIX) | 13 | 10 |
RRC06 | Tokyo (DIXIE,JPIX) | 1 | 1 |
RRC07 | Stockholm (Netnod) | 10 | 8 |
RRC10 | Milan (MIX) | 12 | 10 |
RRC11 | New York (NYIIX) | 11 | 9 |
RRC12 | Frankfurt (DE-CIX) | 35 | 33 |
RRC13 | Moscow (MSK-IX) | 6 | 5 |
RRC14 | Palo Alto (Equinix PAO) | 5 | 5 |
RRC15 | São Paulo (IX.br SPO) | 9 | 8 |
RRC16 | Miami (Equinix MIA) | 0 | 0 |
RRC18 | Barcelona (CATNIX) | - | 3 |
RRC19 | Johannesburg (NAPAfrica) | 6 | 5 |
RRC20 | Zürich (SwissIX) | - | 15 |
RRC21 | Paris/Marseilles (France-IX PRS/MRS) | 16 | 14 |
RRC22 | Bucharest (InterLAN) | - | 14 |
RRC23 | Singapore (Equinix SIN) | 0 | 0 |
RRC24 | LACNIC region (Multihop) | 3 | 3 |
[*] Due to some IXPs spanning multiple cities (notably NL-ix) and remote peering we know some of the peer routers at IXPs are not in the same city as our collector, but for a majority of cases of IXP-connected collectors we believe the peer router is in the same city as our collector.
My conclusion from looking at this table is that, while having our standard per-RRC beacons announced from many of the route collectors allows for inspection of how routing behaves from various parts of the Internet, it also looks like there is a big difference in the value that beaconing from a specific RRC adds.
If you also consider the value that each adds to the sum of already existing beacons, I think it's safe to say that we can remove some of the beaconing from some of the RRCs to make room for alternative setups that add more value. As we have enough IPv6 address space to keep doing beaconing, we can maintain a beacon/anchor prefix per RRC in IPv6 while cutting down on doing this in IPv4.
One solution I would like to propose is to keep this standard set-up for our three most diverse IXP collectors (RRC01, RRC03 and RRC12) and, for all the other collectors, use BGP anycast for their anchor and beacon IPv4 prefixes. Using BGP anycast means we would announce the same prefix from multiple RRCs at the same time.
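The choice of RRC01, RRC03 and RRC12 can be reproduced from the table above. The sketch below copies the table's figures into a dictionary and ranks the IXP-connected collectors by the number of IPv4 networks propagating their beacons. Ranking by IPv4 is a choice made for this example because IPv4 space is the constraint. The variable and function names are hypothetical.

```python
# (IPv4 networks, IPv6 networks) per collector, copied from the table above.
# None marks collectors for which the table shows "-" in the IPv4 column.
UPSTREAMS = {
    "RRC00": (64, 9), "RRC01": (33, 27), "RRC03": (30, 29), "RRC04": (5, 3),
    "RRC05": (13, 10), "RRC06": (1, 1), "RRC07": (10, 8), "RRC10": (12, 10),
    "RRC11": (11, 9), "RRC12": (35, 33), "RRC13": (6, 5), "RRC14": (5, 5),
    "RRC15": (9, 8), "RRC16": (0, 0), "RRC18": (None, 3), "RRC19": (6, 5),
    "RRC20": (None, 15), "RRC21": (16, 14), "RRC22": (None, 14),
    "RRC23": (0, 0), "RRC24": (3, 3),
}
MULTIHOP = {"RRC00", "RRC24"}  # listed as multihop in the table


def most_diverse_ixp_collectors(n=3):
    """Return the n IXP-connected collectors with the most IPv4 upstreams."""
    ixp_v4 = {
        rrc: v4
        for rrc, (v4, _v6) in UPSTREAMS.items()
        if rrc not in MULTIHOP and v4 is not None
    }
    return sorted(ixp_v4, key=ixp_v4.get, reverse=True)[:n]


if __name__ == "__main__":
    print(most_diverse_ixp_collectors())  # ['RRC12', 'RRC01', 'RRC03']
```

Ranking by the IPv6 column instead gives the same three collectors, only in a different order.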
The practical implementation that makes most sense to me would be to use all global multihop collectors (i.e., RRC00+RRC25) to announce a single anchor/beacon pair. For the other RRCs, I propose we treat them as a single entity per RIR region to anycast anchor/beacon prefix pairs from.
Overall this is what that would look like:
RRC | Function | Change? | Anycast? |
---|---|---|---|
01 | LON IXPs | no | no |
03 | AMS IXPs | no | no |
12 | FRA IXPs | no | no |
00, 25 | multihop-anycast | yes | yes |
24, 15 | LACNIC anycast | yes | yes |
19 [*] | AFRINIC anycast | no [*] | no [*] |
06, 23 | APNIC anycast | yes | yes |
11, 14, 16 | ARIN anycast | yes | yes |
04, 05, 07, 10, 13, 18, 20, 21, 22, 26 | RIPE NCC anycast | yes | yes |
[*] The situation with RRC19 is special in that it’s currently being replaced, so it’s not active. When it’s active it will be the only RRC in the AFRINIC region, so functionally it would not be anycast until we have a second collector in Africa.
This plan will reduce our "generic" beacon footprint to nine anchor and nine beacon IPv4 prefixes, which leaves room for alternative setups.
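The count of nine follows directly from the grouping in the table above: each row gets one anchor and one beacon prefix, regardless of how many collectors announce it. A small sketch, with the groups copied from the table and hypothetical names:

```python
# One entry per row of the proposed setup; values are the RRC numbers in that row.
IPV4_GROUPS = {
    "RRC01 (LON IXPs)": ["01"],
    "RRC03 (AMS IXPs)": ["03"],
    "RRC12 (FRA IXPs)": ["12"],
    "multihop anycast": ["00", "25"],
    "LACNIC anycast": ["24", "15"],
    "AFRINIC": ["19"],
    "APNIC anycast": ["06", "23"],
    "ARIN anycast": ["11", "14", "16"],
    "RIPE NCC anycast": ["04", "05", "07", "10", "13", "18", "20", "21", "22", "26"],
}


def ipv4_footprint(groups):
    """Return the number of anchor prefixes, beacon prefixes and collectors covered."""
    collectors = [rrc for members in groups.values() for rrc in members]
    if len(collectors) != len(set(collectors)):
        raise ValueError("a collector appears in more than one group")
    # Every group shares a single anchor prefix and a single beacon prefix.
    return {
        "anchors": len(groups),
        "beacons": len(groups),
        "collectors": len(collectors),
    }


if __name__ == "__main__":
    print(ipv4_footprint(IPV4_GROUPS))
    # {'anchors': 9, 'beacons': 9, 'collectors': 23}
```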
When it comes to IPv6, we intend to keep the per-RRC anchor/beacon prefix setup as is. For one thing, there aren’t the space constraints to take into account, but more than that, this approach keeps things consistent over time, and will allow us to re-evaluate upstream diversity on a per-RRC basis. All that said, if people think we should make changes here too - say, for the sake of staying consistent in how we do things in v4/v6 - please comment below.
Other existing setups
In addition to the 'standard' approach outlined above, we've also tried out a number of other setups that don't follow the per-RRC pattern, but which have helped explore other aspects of BGP. Here are a few of them with some details on what we intend to do with them:
- RPKI valid/invalid/unknown: We stably announce three IPv4 and four IPv6 prefixes from RRC03 (AMS IXPs) that have different states in the RPKI system ('valid', 'invalid', and 'unknown'). As route origin validation (ROV) becomes more prevalent, the value of this specific extensive set-up diminishes. I heard that having a prefix which is RPKI invalid has value in continuing to monitor the Internet, specifically for temporary lapses of ROV - i.e. this prefix should not propagate if ROV becomes ubiquitous. So for this setup it makes sense to only keep a single IPv4 and a single IPv6 prefix with RPKI 'invalid' state, and free up the rest of the address space for re-use. It also makes sense to anycast these RPKI invalids, while at the same time keeping the ability to monitor how far they spread. As we can’t predict what this will look like, we intend to find out during implementation what a good balance is between inserting and observing this prefix.
- Fast beacon (10m up / 10m down): This was set up as a special request for a community member, but we never turned it off. As this pulses 12x faster than our regular setup, it also generates roughly an order of magnitude more data on a per-prefix basis. The big advantage is that one can look in near-realtime at how announcements and withdrawals traverse the Internet. I think this has a large educational value, and an untapped potential for capturing the state of BGP route propagation with higher granularity. For instance, this could become useful as an early warning signal for disruptions in BGP route propagation, as delays in propagation can indicate the routing system as a whole behaving outside of expected bounds. To explore this I created an ObservableHQ notebook that looks at this prefix in detail. As I see this setup as having high potential, we intend to keep it as-is and explore the signal further.
- Anycast failover simulation: This was set up at a time when anycast was a bit rarer than it is in the current Internet, and tests a specific fail-over from two sites to one site. Feedback I received suggested that this set-up is probably not adding a lot of value in the current Internet, and I didn't hear anybody speak up about keeping it, so my conclusion is to stop this failover simulation and use this prefix for other purposes.
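The "12x faster" figure for the fast beacon is simple arithmetic on the cycle lengths: a regular beacon is up for two hours and down for two hours, while the fast beacon is up for ten minutes and down for ten minutes. The sketch below works this out; the function name is hypothetical, and it counts only the scheduled announcements and withdrawals at the origin, not the updates they trigger further out.

```python
from datetime import timedelta


def transitions_per_day(up: timedelta, down: timedelta) -> float:
    """Scheduled announcements plus withdrawals per day for one beacon prefix."""
    cycle = up + down
    return 2 * (timedelta(days=1) / cycle)


if __name__ == "__main__":
    regular = transitions_per_day(timedelta(hours=2), timedelta(hours=2))
    fast = transitions_per_day(timedelta(minutes=10), timedelta(minutes=10))
    print(regular, fast, fast / regular)  # 12.0 144.0 12.0
```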
The table below sums up what this entails for IPv4 and IPv6 address space use of beacons for special set-ups:
Setup | Plan |
---|---|
RPKI “invalid” | keep and anycast |
RPKI “unknown” | discontinue |
RPKI “valid” | discontinue [*] |
Anycast failover simulation | discontinue |
Fast beacon | keep |
[*] Our intent is to create ROAs and IRR records for all the other prefixes, so effectively we will have many prefixes originating from RIS that are RPKI valid.
New setups
Now that we have room for other setups, there are a few that we think add enough value that we should add them. Many of these are around RPKI and routing security, which we know is a topic our community cares about.
- RPKI key material beaconing: A recent paper by Fontugne et al. explored the propagation of RPKI key material changes into the routing system. We want a beacon that captures this propagation delay. Operationally it is very relevant, so people can understand what the typical delay looks like, and it would allow us to monitor if this delay becomes unreasonably large. For this you need a stable prefix in each address family (v4/v6), and software that changes the key material between RPKI “valid” and RPKI “invalid” states in such a way that one can determine the delay between the key material change and the moment its effects show up at RIS peers. As this is somewhat uncharted territory, we will need to determine a good approach for this. We either set a fixed schedule for the key material flip (easiest to implement), or we add a random component to the time the key material flips and keep a public log of when this happened (more complex, but allows for observations of temporal effects). In the aforementioned paper a random component was used. From the authors we heard that we need at least five hours of “invalid” state for this, but we also heard of plans for ROV deployments with filters that get updated daily, so we will likely need to fine-tune this as we go. Note that our current plan is to only flip RPKI key material in the RIPE NCC trust anchor, but we do not exclude the possibility of doing this for other trust anchors later, in collaboration with the other RIRs.
- ASPA key material beaconing: As we expect ASPA deployment to become operationally relevant in the very near future, we want to add a RIS beacon to capture the dynamics of key material switches for ASPA. Again this will be implemented as a switching of key material in the RPKI system. For now we want to reserve an IPv4 and an IPv6 prefix for this purpose.
The setup we intend to run is as follows. Towards RIS peers we want to announce a stable prefix with the following AS-PATH: 12654 196615. We want to flip ASPA states for AS196615 between having no upstreams and having AS12654 as the only valid upstream. Both ASNs are assigned to the RIS project, but due to software limitations we cannot yet announce this type of BGP message out of RIS.
We don’t expect to be able to do this in Q3 yet, but once we can, it will give us insight into the propagation speed of ASPA key material changes as well as into the uptake of ASPA validation on the Internet.
- RFD beaconing: The latest research on beaconing for route flap dampening (RFD) is available in two papers, both linked here. It is operationally relevant to track this since it helps understand and reduce BGP churn. After communications with the authors of the work mentioned above, we understand that to trigger RFD we need to flap a prefix between announced and withdrawn every minute for two hours. As this causes a significant amount of updates and withdrawals, I think it is wise to limit how often we do this. We intend to try this out and find the right frequency, starting off with an initial monthly schedule. Because this setup will have a longer “down” time than any other setup we have, it will also give more insight into another phenomenon: ghost/zombie routes. These are routes for prefixes that have long been withdrawn from the Internet, so this prefix with a longer “downtime” will allow us to see how long they stay, beyond the two hours of “downtime” we have for our other beaconing prefixes.
- Longer-than-/24-IPv4-prefixes: During RIPE 86 the topic of propagation of longer-than-/24 IPv4 prefixes surfaced again. RIS used to beacon two /24s, two /25s and two /28s out of ARIN experimental space for this purpose (more on that here and here). As this was temporarily assigned space, we stopped doing this with that address space. But with the proposed consolidation of IPv4 prefixes we can start using some of the reclaimed space for a similar setup, and keep track of changes to the propagation of longer-than-/24 IPv4 prefixes. This is very relevant for BGP filtering, and for the usability of longer-than-/24 IPv4 prefixes for global routing. In contrast to the ARIN prefixes we used, I think it would be good to try announcing this from many RRCs (i.e. anycast). Doing this from IXP-connected RRCs would be especially beneficial as I expect these prefixes to propagate best there. As we also want to retain the capability of observing these prefixes propagate, we can’t anycast this from all RRCs. We intend to find out the best balance between RRCs to announce from and RRCs to observe from once we implement this.
- A deliberately unannounced prefix: As far as I’m aware, there are no IPv4 prefixes that are deliberately unannounced. We might be able to find some of this by looking at AS0 in the RPKI system, but since we don’t know if the maintainers of this address space intend to keep it that way, we propose that we keep an IPv4 and IPv6 prefix allocated to RIS deliberately out of the BGP routing system. This is useful if you want to study non-BGP routing on the Internet, as well as finding networks that squat/hijack unannounced address space. We have set aside a prefix for this, and we have a RIPE Atlas traceroute measurement towards an address in that prefix (documented here). We will add an AS0 VRP in RPKI for this, so it is also clear in the RPKI key materials that our intent is that these prefixes are not visible in the public Internet routing.
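To make the second option for RPKI key material beaconing more concrete, here is a minimal sketch of a flip schedule with a random component and a log of flip times. The five-hour minimum for the “invalid” state comes from the text above. The length of the “valid” state, the size of the random offset and all names are assumptions made for this example, not decisions we have taken.

```python
import random
from datetime import datetime, timedelta, timezone

MIN_INVALID = timedelta(hours=5)   # minimum "invalid" duration mentioned above
VALID_BASE = timedelta(hours=5)    # assumption: base duration of the "valid" state
MAX_JITTER_SECONDS = 3600          # assumption: up to one hour of random offset


def flip_schedule(start, cycles, seed=None):
    """Return a log of (timestamp, new_state) entries for `cycles` flips each way."""
    rng = random.Random(seed)
    log = []
    t = start
    for _ in range(cycles):
        # Stay "valid" for the base duration plus a random offset, then flip.
        t += VALID_BASE + timedelta(seconds=rng.randint(0, MAX_JITTER_SECONDS))
        log.append((t, "invalid"))
        # Stay "invalid" for at least MIN_INVALID plus a random offset.
        t += MIN_INVALID + timedelta(seconds=rng.randint(0, MAX_JITTER_SECONDS))
        log.append((t, "valid"))
    return log


if __name__ == "__main__":
    start = datetime(2023, 7, 1, tzinfo=timezone.utc)
    for when, state in flip_schedule(start, cycles=2, seed=1):
        print(when.isoformat(), state)
```

Publishing such a log alongside the RIS data would let anyone compare the moment of a flip with the moment the prefix appears or disappears at RIS peers.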
Nitty gritty
There are a few things we do for the beacons to maximise the utility, in terms of ability to track the origin of an announcement, and to help debugging in the data plane. This section documents changes in this.
To help track the origin of an announcement, we "tag" BGP messages so that one can track which RRC the message originated from. For this we overload the AGGREGATOR field: we encode useful information about the when (i.e. timing) and where (i.e. which RRC) of a specific announcement from our RIS infrastructure. The RIS documentation describes the encoding as follows:
AGGREGATOR IP ADDRESS: 10.x.y.z, where x, y and z form a 24-bit count
of the number of seconds between the start of the month and the time of
the announcement.
AGGREGATOR AS NUMBER: 64512 + n, where n is a 10-bit number
encoded as:
 MSB                                 LSB
+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
+---+---+---+---+---+---+---+---+---+---+
|    RRC ID     |    Sequence Number    |
+---+---+---+---+---+---+---+---+---+---+
Adding 64512 brings the resulting number into the private AS number range.
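The encoding above can be sketched in a few lines of Python. This reads the layout as a 4-bit RRC ID in the most significant bits followed by a 6-bit sequence number, which matches the 10-bit total and the 4-bit RRC space mentioned in this section. The function names are hypothetical and this is an illustration of the documented format, not the code RIS runs.

```python
from ipaddress import IPv4Address

PRIVATE_AS_BASE = 64512
RRC_BITS = 4    # RRC ID occupies the four most significant bits of n
SEQ_BITS = 6    # sequence number occupies the remaining six bits


def encode_aggregator(rrc_id, seq, seconds_into_month):
    """Return (aggregator_asn, aggregator_ip) for one beacon announcement."""
    if not 0 <= rrc_id < 2 ** RRC_BITS:
        raise ValueError("RRC ID does not fit in 4 bits")
    if not 0 <= seq < 2 ** SEQ_BITS:
        raise ValueError("sequence number does not fit in 6 bits")
    if not 0 <= seconds_into_month < 2 ** 24:
        raise ValueError("timestamp does not fit in 24 bits")
    asn = PRIVATE_AS_BASE + ((rrc_id << SEQ_BITS) | seq)
    ip = IPv4Address((10 << 24) | seconds_into_month)  # 10.x.y.z
    return asn, ip


def decode_aggregator(asn, ip):
    """Return (rrc_id, seq, seconds_into_month) from an AGGREGATOR attribute."""
    n = asn - PRIVATE_AS_BASE
    if not 0 <= n < 2 ** (RRC_BITS + SEQ_BITS):
        raise ValueError("AS number is outside the encoded range")
    value = int(IPv4Address(ip))
    if value >> 24 != 10:
        raise ValueError("aggregator address is not in 10.0.0.0/8")
    return n >> SEQ_BITS, n & (2 ** SEQ_BITS - 1), value & 0xFFFFFF


if __name__ == "__main__":
    # RRC ID 3, sequence number 5, announced exactly one day into the month.
    print(encode_aggregator(3, 5, 86400))
    print(decode_aggregator(64709, "10.1.81.128"))  # (3, 5, 86400)
```

With only four bits for the RRC ID, values above 15 cannot be represented, which is the limitation discussed next.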
We already ran out of 4-bit space to number the RRCs, and as we propose anycast setups and different intervals, we need to change this encoding. This labs article serves as a heads-up for this change. We don't expect these changes to have an impact, but if people do operationally depend on this, please let us know so we can try to take this into account when implementing a new encoding.
For prefixes where we don't do a special thing for RPKI/IRR we make sure the prefixes are all in the IRR and have valid ROAs. The exceptions to this would be the RPKI invalid prefix, the prefixes where we would beacon RPKI key material (ROA, and ASPA), and the deliberately unannounced prefix where we will put an AS0.
While RIS and RIS beaconing specifically operate on the control plane, it can also be useful for debugging the data plane. To make RIS most useful for this we will make sure that there is a host in these prefixes that responds to ping. We will use the first normal address in the prefix for that - i.e., the .1 in an IPv4 /24, the ::1 in an IPv6 /48, etc. We will keep the pingable attribute in the RIPE DB for these up to date.
For the multihop collectors this is a bit more complicated. As we don't have direct data-plane connectivity to our peers, the results from reachability tests to an ICMP ping responder are probably more of an interesting academic exercise than something that is operationally relevant, but as the associated effort to keep this up is minimal, we intend to keep these active. For those wanting to look into this already, we have documented RIPE Atlas measurements for this purpose.
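The "first normal address" rule is easy to compute with Python's standard `ipaddress` module. The prefixes below are documentation prefixes used as placeholders, not actual RIS beacon prefixes, and the function name is hypothetical.

```python
import ipaddress


def ping_target(prefix):
    """Return the first address after the network address of `prefix`."""
    return ipaddress.ip_network(prefix)[1]


if __name__ == "__main__":
    print(ping_target("192.0.2.0/24"))    # 192.0.2.1
    print(ping_target("2001:db8::/48"))   # 2001:db8::1
```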
Discussion and conclusion
Once we have revamped this setup, we should be able to better expose the insights the beacons provide by building visualisations and analysis on top of them, together with those interested in the community. If you are interested in working with us on this, please let us know. Some of the other building blocks we plan to use for doing this are ready to use: RIS Live for streaming RIS data and ObservableHQ for visualising it.
Some setups we heard of didn’t make the cut for this plan, mostly because we want to focus on stable set-ups that we can continue to do for the upcoming years.
For specific time-limited experiments we advise people to consider using the PEERING BGP testbed.
In this post we document our plans to improve RIS beaconing, and we intend to implement this in the second half of 2023. We would love to hear of possible improvements to this plan so we can incorporate them into our revamp of RIS, so let us know what you think in the comments below or over on the RIPE NCC Forum.