Further Virtualisation Testing for the RIPE Atlas Anchor

This report is a follow-up to an earlier article, in which we detailed our investigations into virtualisation technology as a means of providing other services on the RIPE Atlas anchor hardware, beyond its primary anchor function.

Introduction

The RIPE NCC has decided not to offer multiple services on the RIPE Atlas anchor boxes for non-technical reasons. You can learn more about those reasons in RIPE Atlas Anchors Pilot: Summary and Next Steps. Although we are not moving forward with multiple services, we wanted to report the findings of the virtualisation investigations we carried out during the RIPE Atlas anchors pilot phase.

In a previous report, we described the results of an investigation into three alternative virtualisation technologies for running multiple services on the RIPE Atlas anchor boxes.

The outcome of that investigation was that a container-based virtualisation technique could support a multi-service RIPE Atlas anchor solution. However, that first project left a few questions unanswered, primarily about the sub-second time behaviour of a virtualised RIPE Atlas anchor. We address them here.

Next steps

A follow-up investigation was done by the RIPE NCC to address two remaining open questions:

Is there a measurable impact of lost tick compensation on the accuracy of timekeeping in the virtual environment used?

Is time stability at the millisecond scale sufficient to use the virtualised anchor for reliable measurements?

A series of experiments were run with a small Python script that measures time drift by repeatedly pinging an external system (connected to the same local switch) and adding a random period of inactivity. These experiments were run with three average values for the inactivity period: 0.1, 1, and 10 seconds.
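The original script is not reproduced in this report, so the following is only a minimal sketch of what such a measurement loop could look like. It assumes that the random inactivity period is drawn from an exponential distribution around the stated average (the report only gives the averages), and it uses a hypothetical `tcp_probe` helper that times a TCP connect as a stand-in for a ping, since raw ICMP usually requires elevated privileges. All function names are illustrative.

```python
import random
import socket
import statistics
import time


def tcp_probe(host, port=80, timeout=1.0):
    """Hypothetical stand-in for a ping: time one TCP connect, in ms."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def run_experiment(probe, mean_wait, count, rng=None, sleep=time.sleep):
    """Call `probe` `count` times, idling a random period after each call.

    Assumption: waits follow an exponential distribution whose mean is
    `mean_wait` seconds; the report does not state the distribution.
    """
    rng = rng or random.Random()
    rtts = []
    for _ in range(count):
        rtts.append(probe())                       # one RTT sample in ms
        sleep(rng.expovariate(1.0 / mean_wait))    # random inactivity
    return rtts


def summarise(rtts):
    """Return the four statistics reported in the tables, in ms."""
    return {
        "min": min(rtts),
        "max": max(rtts),
        "mean": statistics.mean(rtts),
        "stdev": statistics.stdev(rtts),  # needs at least two samples
    }
```

A run against a neighbouring host might then look like `summarise(run_experiment(lambda: tcp_probe("192.0.2.1"), mean_wait=1.0, count=1000))`, where the address is a placeholder. Passing `probe` and `sleep` as arguments keeps the loop independent of the probing method, so the same code can be exercised on the physical and the virtual system.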

Results

The outcome of the described tests is given in the tables and illustrated by the graphs below.

The measurements showed no relevant difference in behaviour between a physical system and a virtual system. There was a small difference in the measured elapsed time: the mean RTT of the virtual system was 0.01 ms longer. However, for the target function of the RIPE Atlas anchors, a timing accuracy of the order of a millisecond is considered more than sufficient. The results are given in more detail below.

System	Min (ms)	Max (ms)	Mean (ms)	Std. dev. (ms)
Physical	0.084	0.614	0.23	0.027
Virtual	0.106	0.608	0.24	0.027

Table 1: Timing measurements (10 sec average wait time)

System	Min (ms)	Max (ms)	Mean (ms)	Std. dev. (ms)
Physical	0.084	0.611	0.23	0.027
Virtual	0.098	0.612	0.24	0.028

Table 2: Timing measurements (1 sec average wait time)

System	Min (ms)	Max (ms)	Mean (ms)	Std. dev. (ms)
Physical	0.085	1.2	0.23	0.030
Virtual	0.094	0.599	0.24	0.028

Table 3: Timing measurements (0.1 sec average wait time)

In the results from Table 3, we see a larger value for the max RTT for the physical system. This was due to a single outlier result; it was not further investigated.
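The comparison behind these observations is simple arithmetic on the table values. As a short sketch, the snippet below copies the mean and maximum RTT from Tables 1 to 3 and computes the difference between the virtual and the physical system for each wait time. The dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Mean and max RTT in ms, copied from Tables 1 to 3,
# keyed by the average wait time in seconds.
TABLES = {
    10.0: {"physical": {"mean": 0.23, "max": 0.614},
           "virtual": {"mean": 0.24, "max": 0.608}},
    1.0: {"physical": {"mean": 0.23, "max": 0.611},
          "virtual": {"mean": 0.24, "max": 0.612}},
    0.1: {"physical": {"mean": 0.23, "max": 1.2},
          "virtual": {"mean": 0.24, "max": 0.599}},
}


def virtual_minus_physical(table, stat):
    """Difference (virtual - physical) for one statistic, in ms."""
    return round(table["virtual"][stat] - table["physical"][stat], 3)


for wait, table in TABLES.items():
    d_mean = virtual_minus_physical(table, "mean")
    d_max = virtual_minus_physical(table, "max")
    print(f"wait {wait:>4} s: mean diff {d_mean:+.3f} ms, max diff {d_max:+.3f} ms")
```

The mean difference is the same in all three tables, while the maximum differs noticeably only in the 0.1 second case, where the physical system recorded the single outlier.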

Below are the graphs that illustrate the values in Table 2.

Conclusion

Based on these results, we concluded that a virtualised RIPE Atlas anchor, built on OpenVZ, would allow us to provide multiple services while still using the same RIPE Atlas anchor system for reliable RIPE Atlas measurements.

However, the RIPE NCC has decided not to offer multiple services on the RIPE Atlas anchor box for several non-technical reasons, as outlined in another RIPE Labs article.