New RIPE Atlas Probe Stability System Tags
For some time, RIPE Atlas probes have been tagged with system tags that indicate basic network configuration (IPv4 Capable and IPv6 Capable) and a minimal level of connectivity (IPv4 Works and IPv6 Works). These tags can be used to find or exclude probes that are unable or very unlikely to perform RIPE Atlas measurements using either IPv4 or IPv6. However, they do not indicate which probes are very likely to carry out reliable measurements, nor do they say anything about the medium- or long-term status of a probe.
To complement the existing system tags, a new set of tags has been introduced based on a probe stability metric. The metric is intended to indicate which probes have been generally reliable when performing measurements over rolling time periods of the past 24 hours, 30 days, and 90 days.
The success rates are considered in relation to the most successful probes targeting each destination, which has the effect of ignoring globally unreachable targets and reflecting the connectivity of probes rather than the targets that they happen to be measuring. Each time period is broken up into hourly buckets, and a maximum of 5% of the buckets are allowed to be "unstable", meaning that the probe has less than a 95% success rate for its "worst" targets compared with other probes. This means that outages or connectivity problems are tolerated so long as they are short and infrequent. It also means that after a two-hour outage a particular probe will not be considered stable over the previous 24 hours, because two unstable buckets are roughly 8% of the 24 hourly buckets. The same probe may still (for the time being) be considered stable over the past 30 or 90 days, where two buckets are well under 1% of the total.
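The bucket rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used by RIPE Atlas: it assumes each hourly bucket has already been reduced to a single relative success rate (the probe's rate for its "worst" targets compared with other probes), and the function name is_stable and both constant names are hypothetical. How the relative success rate itself is computed is not specified here, so it is taken as an input.

```python
from typing import Sequence

# Thresholds taken from the text above; the names are hypothetical.
MIN_RELATIVE_SUCCESS = 0.95   # a bucket below this rate counts as "unstable"
MAX_UNSTABLE_SHARE = 0.05     # at most 5% of the buckets may be unstable


def is_stable(bucket_rates: Sequence[float]) -> bool:
    """Return True if at most 5% of the hourly buckets are unstable.

    bucket_rates holds one relative success rate (0.0 to 1.0) per hourly
    bucket in the window, e.g. 24 values for the 24-hour window.
    Assumption: an empty window is treated as not stable.
    """
    if not bucket_rates:
        return False
    unstable = sum(1 for rate in bucket_rates if rate < MIN_RELATIVE_SUCCESS)
    return unstable / len(bucket_rates) <= MAX_UNSTABLE_SHARE


if __name__ == "__main__":
    # A two-hour outage inside a 24-hour window: 2 of 24 buckets unstable.
    print(is_stable([1.0] * 22 + [0.0] * 2))    # False
    # The same outage inside a 30-day window: 2 of 720 buckets unstable.
    print(is_stable([1.0] * 718 + [0.0] * 2))   # True
```

Under these assumptions a single unstable hour in a day (1 of 24 buckets, about 4.2%) is still tolerated, while two unstable hours are not.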
Understanding and Using the New Tags
The new stability tags are not intended as a value judgement about probes. The RIPE Atlas project is committed to having a diverse set of probes that is at least somewhat representative of the internet at large, and a network of perfectly stable probes would not properly reflect reality, and would exclude various forms of investigation and research. Instead, these tags are intended to be used to remove noise when trying to verify or investigate problems with a particular target in cases where the stability of individual probes is a distraction.
Example of CLI Toolset usage of new tags
Which tag you choose will depend on the time frame of the new measurement:
- For one-off or short-running measurements you will probably be interested in the "system-ipv4-stable-1d" and "system-ipv6-stable-1d" tags;
- For medium-term measurements the probes' record over the previous 30 days will likely be more relevant, so you should use the "system-ipv4-stable-30d" and "system-ipv6-stable-30d" tags;
- For long-term measurements and measurements that you do not plan to stop, the "system-ipv4-stable-90d" and "system-ipv6-stable-90d" tags will likely be appropriate.
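As a small illustration of the naming scheme in the list above, the following Python sketch builds the tag name for a given IP version and time window, and filters a list of probe records by that tag. The function names and the shape of the probe records (dictionaries with "id" and "tags" keys) are hypothetical and chosen only for this example; they are not part of any RIPE Atlas tool or API.

```python
from typing import Dict, List

VALID_WINDOWS_DAYS = (1, 30, 90)  # the three windows named in the list above


def stability_tag(ip_version: int, window_days: int) -> str:
    """Build a stability tag name such as 'system-ipv4-stable-1d'."""
    if ip_version not in (4, 6):
        raise ValueError("ip_version must be 4 or 6")
    if window_days not in VALID_WINDOWS_DAYS:
        raise ValueError("window_days must be 1, 30 or 90")
    return f"system-ipv{ip_version}-stable-{window_days}d"


def probes_with_tag(probes: List[Dict], tag: str) -> List[int]:
    """Return the ids of the probe records that carry the given tag.

    Assumption: each record looks like {"id": 1, "tags": ["..."]}.
    """
    return [probe["id"] for probe in probes if tag in probe["tags"]]


if __name__ == "__main__":
    sample = [
        {"id": 1, "tags": ["system-ipv4-stable-1d", "system-ipv4-stable-30d"]},
        {"id": 2, "tags": ["system-ipv4-stable-1d"]},
        {"id": 3, "tags": []},
    ]
    tag = stability_tag(4, 30)
    print(tag)                           # system-ipv4-stable-30d
    print(probes_with_tag(sample, tag))  # [1]
```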
Considerations and Caveats
As with any metric, the stability tags are based on various methodological decisions that come with certain caveats. One such consideration is that only ICMP ping measurements are processed when deciding on stability, thereby potentially ignoring problems that only occur with the TCP or UDP transport protocols. Various thresholds, such as requiring a 95% success rate per target or tolerating outages of up to 5% of the considered time period, were set experimentally and seem to produce good results, but are necessarily somewhat arbitrary. Although considerable care has been taken to emphasise issues close to probes and ignore issues close to targets, it is very difficult to do this with absolute certainty, and there may be cases where major instability of certain targets affects the probe stability tags. Parts of the methodology could be changed in future based on user experience and feedback.
The graphs below show how many probes have the various tags for both IPv4 and IPv6, excluding probes that have never connected to the RIPE Atlas infrastructure or have been disconnected for more than 90 days. The first two graphs show the raw numbers of probes and RIPE Atlas anchors with the various tags. The second two graphs show the number of probes and anchors as a percentage of those that are eligible for the tag. A probe or anchor is not eligible if it first connected to the RIPE Atlas infrastructure after the start of the relevant time period, or if it does not have a configured interface of the given IP version.
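The percentage described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical Probe record with three fields; it is not the code used to produce the graphs, and the field and function names are invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable


@dataclass
class Probe:
    """Hypothetical record; field names are assumptions for this sketch."""
    first_connected: datetime  # first connection to the infrastructure
    has_interface: bool        # configured interface of the given IP version
    has_tag: bool              # carries the stability tag for this window


def tagged_percentage(probes: Iterable[Probe], window_days: int,
                      now: datetime) -> float:
    """Percentage of eligible probes that carry the stability tag.

    A probe is left out of the denominator if it first connected after
    the start of the window or has no interface of the given IP version.
    """
    window_start = now - timedelta(days=window_days)
    eligible = [
        p for p in probes
        if p.has_interface and p.first_connected <= window_start
    ]
    if not eligible:
        return 0.0
    tagged = sum(1 for p in eligible if p.has_tag)
    return 100.0 * tagged / len(eligible)
```

For example, with four probes of which one connected only last week and one has no interface of the given IP version, a 30-day percentage is computed over the remaining two probes only.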
The graphs show that RIPE Atlas anchors are much more stable than the average probe, especially when it comes to IPv6 connectivity. In general, the success rate of probes is noticeably lower over IPv6 than IPv4, even when compared with the number of probes that have an IPv6 interface to begin with. Although the anchors have a lower success rate over IPv6 than IPv4, this difference is much less pronounced than in the general population of probes.