Petros Gigis

openPenny: Developing an Open-Source Tool for Detecting Non-Spoofed Traffic


In this article, we introduce openPenny, an open-source traffic checker currently under development as part of the RIPE NCC Community Projects Fund. The goal of openPenny is to help network operators identify non-spoofed traffic arriving at unexpected entry points: this offers a new primitive to detect routing misconfigurations, evaluate policy or commercial adjustments, and defend against security threats such as BGP hijacks.


Imagine an ISP with Points of Presence (PoPs) around the globe, or an IXP with many members. One of the key challenges in such large networks is getting visibility on the traffic they receive and have to forward.

We envision a monitoring system able to build and constantly update a traffic map of the Internet from the perspective of the network it is running in. The system would collect information on where traffic from each source IP prefix enters and where it goes.

How would this map be useful? For one thing, it may support alerting on any practically relevant external traffic or path change, especially if sudden or unexpected.

Stealthy hijacks

Let's look at an example. Figure 1 shows a real-world case of a BGP prefix hijack, invisible to control plane monitoring. In this case, the BGP path for prefix 203.127.0.0/16, owned by AS3758 (SingNet), is legitimately announced by that AS to all its neighbours with an RPKI-compliant route. On February 10, 2025, however, AS17894 (Innove Comm) announced a /24 sub-prefix of SingNet’s 203.127.0.0/16 to AS4775 (Globe Tel). Since AS17894 was not the legitimate owner of the prefix, such a /24 announcement was RPKI invalid. So, the announcement from AS17894 (Innove Comm) propagated through all the ASes not enforcing Route Origin Validation (ROV) but was discarded by the ROV-enabled ASes.

This is why in the figure, AS4775, AS15412 and AS6762 all have a /24 route towards AS17894 (Innove Comm), but AS37100 (SEACOM) and AS6461 (Zayo) don’t. Note that by discarding the /24 BGP announcement, AS37100 (SEACOM) and AS6461 (Zayo) also stopped the propagation of the BGP route: this meant that SingTel, SingNet and public BGP monitors on these ASes never received the /24 BGP announcement originated by AS17894 (Innove Comm).

Figure 1. Real-world incident: Innove Comm announces to GlobeTel a BGP route for the /24 prefix. SEACOM and Zayo discard the announcement because it is RPKI-invalid.
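To make the ROV decision concrete, here is a minimal sketch of route origin validation in the spirit of RFC 6811. The ROA contents are an assumption made for illustration (a single ROA authorising AS3758 to originate the /16 with a maximum length of 16); the real ROAs for this prefix may differ, and the `Roa` class and `rov_state` function are hypothetical names.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Roa:
    prefix: ipaddress.IPv4Network
    max_length: int
    origin_asn: int


def rov_state(prefix, origin_asn, roas):
    """Classify an announcement as 'valid', 'invalid' or 'not-found'."""
    covered = False
    for roa in roas:
        if prefix.subnet_of(roa.prefix):
            covered = True  # at least one ROA covers the announced prefix
            if roa.origin_asn == origin_asn and prefix.prefixlen <= roa.max_length:
                return "valid"
    # Covered by a ROA but matching none of them: the route is invalid.
    return "invalid" if covered else "not-found"


# Assumed ROA for illustration only: AS3758 may originate the /16 and nothing longer.
roas = [Roa(ipaddress.ip_network("203.127.0.0/16"), 16, 3758)]

legit = rov_state(ipaddress.ip_network("203.127.0.0/16"), 3758, roas)
hijack = rov_state(ipaddress.ip_network("203.127.255.0/24"), 17894, roas)
print(legit, hijack)
```

An ROV-enabled AS would drop the second announcement, while an AS without ROV would install it like any other route.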

From a data plane perspective, though, traffic from AS37100 (SEACOM) – and any other AS sending packets through it – destined to an IP in 203.127.255.0/24 was redirected to AS17894 (Innove Comm), as shown by the orange path in the figure. This traffic never reached its intended destination AS3758 (SingNet). 
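The redirection follows from longest-prefix-match forwarding: a router holding both the /16 and the /24 always prefers the more specific route. The sketch below illustrates this with two simplified forwarding tables. The next-hop labels are assumptions based on the scenario described above, not real router state, and `longest_prefix_match` is a hypothetical helper.

```python
import ipaddress


def longest_prefix_match(dst, fib):
    """Return the next hop of the most specific prefix containing dst, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in fib if addr in net]
    if not matches:
        return None
    return fib[max(matches, key=lambda net: net.prefixlen)]


# ROV-enabled AS (e.g. SEACOM): only the legitimate /16 is installed.
rov_fib = {
    ipaddress.ip_network("203.127.0.0/16"): "via AS6762 (TISparkle)",
}

# AS without ROV (e.g. TISparkle): the invalid /24 is installed as well.
non_rov_fib = {
    ipaddress.ip_network("203.127.0.0/16"): "toward AS3758 (SingNet)",
    ipaddress.ip_network("203.127.255.0/24"): "toward AS17894 (Innove Comm)",
}

print(longest_prefix_match("203.127.255.10", rov_fib))
print(longest_prefix_match("203.127.255.10", non_rov_fib))
print(longest_prefix_match("203.127.1.10", non_rov_fib))
```

The ROV-enabled AS forwards correctly according to its own table, yet the packet is diverted one hop later by a table that contains the /24.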

To investigate the issue, SingNet could look at control-plane data. However, doing so will provide no help whatsoever in troubleshooting the connectivity loss if no vantage points exist within the non-ROV ASes (AS6762, AS15412, AS4775) that accepted the invalid /24 BGP route. Alternatively, SingNet might use active methods such as RIPE Atlas traceroutes. While these can help, they require broad vantage point coverage along the affected path, precise timing, and fine-grained monitoring, and may still miss the anomaly if intermediate non-ROV ASes silently drop packets with low TTLs.

Disclaimer: According to the related IETF draft, this event was likely caused by a benign misconfiguration, as AS17894 and AS3758 are connected through their parent companies, Globe Tel and SingTel.

Now imagine a variation of the previous incident, illustrated in Figure 2, where traffic eventually reaches its intended destination, AS3758 (SingNet). In this case, AS17894 (Innove Comm) is connected to AS7473 (SingTel). As before, AS17894 announces a more specific /24 subnet of the /16 prefix owned by SingNet to AS4775 (GlobeTel). As a consequence, traffic crossing SEACOM and destined to the /24 subnet announced by Innove Comm starts to be forwarded via AS6762 (TISparkle), AS15412 (FLAG Tel), and AS4775 (GlobeTel), a path that differs from the AS path in the BGP route used by SEACOM.

Figure 2. Hypothetical scenario: Innove Comm announces an RPKI-invalid BGP route for the /24 prefix to GlobeTel, and forwards the corresponding traffic to SingTel. Being ROV-enabled, SEACOM and Zayo reject the RPKI-invalid announcement originated by Innove Comm.

Now traffic for 203.127.255.0/24 can be inspected, delayed, or tampered with by networks, such as Innove Comm, that were never meant to carry it. If the /24 announcement is due to a misconfiguration, the attracted traffic can also overload the new path, leading to performance degradation, and can cause economic damage to some ASes (e.g., by making them pay for traffic they inadvertently provide transit for).

Even worse, detecting and troubleshooting cases like the one illustrated in Figure 2 is nearly impossible today. From SingNet’s perspective, there is no reachability problem: traffic from AS37100 (SEACOM) and its downstream ASes still arrives. SingTel also doesn’t see any anomaly: on the control plane, it never received the /24 route originated by Innove Comm, as it was discarded by AS6461 (Zayo); on the data plane, SingTel receives traffic from Innove Comm towards destinations within the /16 prefix it did announce to all its neighbours. More generally, unless there are vantage points inside TISparkle, FLAG Tel or Globe Tel, the invalid /24 BGP route originated by Innove Comm does not show up in any control plane log.

Not all is lost, however: evidence of the path change can still be extracted from data-plane traffic measurements. More precisely, SingTel may notice that part of the traffic sourced at SEACOM (or at other ASes transiting through SEACOM) is now entering through a different ingress point, as highlighted in the figure. If the traffic map we envision is built and maintained at SingTel, such an entry-point change, affecting only part of the traffic destined to SingNet, is easy to spot.
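As a rough illustration of such a traffic map, the sketch below records the ingress points seen for each source prefix and flags an ingress that is new for an already known prefix. It assumes that only observations already confirmed as non-spoofed are fed into it. The class name, the documentation prefix 198.51.100.0/24 and the link labels are all hypothetical.

```python
from collections import defaultdict


class TrafficMap:
    """Map each source prefix to the set of ingress points it has been seen on."""

    def __init__(self):
        self._ingress = defaultdict(set)

    def observe(self, src_prefix, ingress):
        """Record an observation; return True if the ingress is new for a known prefix."""
        known = self._ingress[src_prefix]
        is_change = bool(known) and ingress not in known
        known.add(ingress)
        return is_change


traffic_map = TrafficMap()
# Baseline: the (hypothetical) source prefix normally enters via the Zayo link.
first = traffic_map.observe("198.51.100.0/24", "link-to-AS6461")
# Later: part of the same prefix's traffic shows up on the Innove Comm link.
changed = traffic_map.observe("198.51.100.0/24", "link-to-AS17894")
print(first, changed)
```

A production system would also need ageing of old entries and per-destination granularity, which this sketch leaves out.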

This is great! Except that there is one big caveat: what if the traffic received by SingTel is source-spoofed? Receiving spoofed traffic provides no real information on its actual source, and hence no signal about inter-domain paths. It’s absolutely not worth alerting operators, asking them to embark on complex troubleshooting analyses of possible path changes, whenever a network receives spoofed traffic.

This is exactly why we are building openPenny. Its goal is to tell non-spoofed traffic aggregates from spoofed ones, providing a reliable primitive to build Internet traffic maps as well as detection or alerting systems on top of them. In fact, we have already outlined (see this paper) several cases where detecting entry-point traffic changes can uncover inter-domain paths inconsistent with intended routing policies, provide evidence of peering violations and expose route leaks as soon as they emerge.

The challenge: accurately detecting non-spoofed traffic

If IP spoofing weren’t a problem, the ISP in the previous example could rely on simple packet counters to track where traffic enters the network and trust these observations. However, the problem is far from solved. Eliminating spoofing requires widespread adoption of BCP 38, which remains distant from reality. Source Address Validation (SAV) helps at the edge, where IP ownership is clearer and spoofed traffic is easier to filter. But deeper in the core, where traffic from many networks converges, ownership is much harder to trace, making SAV far less effective.

As a result, even if traffic arrives through an unexpected ingress point, operators cannot be fully confident that the source address is genuine and not forged. Despite decades of progress in network monitoring, conventional methods relying on control-plane data or passive sampling, as used by ISPs and IXPs, still struggle to establish this. Tools like NetFlow and sFlow can show where and how much traffic enters, but they cannot answer a more fundamental question: is the traffic non-spoofed, or is someone forging the source address?

In contrast to prior research focused on detecting spoofed traffic, this project takes a different approach: identifying non-spoofed traffic with very high accuracy, even when it’s mixed with spoofed flows. We do this because, for the types of problems we care about, confirming that traffic is non-spoofed is what truly matters as it shows that a valid path exists between the source and our monitor. That said, accurate detection is also important, as without it unnecessary alerts would be generated.

Although finding non-spoofed traffic might sound straightforward, getting it consistently right is surprisingly hard, especially when we want to build reliable measurements on top of it, as in the example above. One key challenge is that Internet routing is inherently asymmetric: traffic often takes one path outbound and a completely different one on the return. Most monitoring tools only see one direction, making it difficult to reconstruct the full path. On top of that, packet loss, variations in transport protocol behaviour, and malicious activity make it even harder to understand the traffic we observe.

Intuition: minimally perturbing traffic

Distinguishing non-spoofed traffic from spoofed traffic is not trivial. Our approach builds on the key idea that non-spoofed traffic usually consists of closed-loop flows, where there is meaningful two-way communication: the sender transmits data, the receiver responds with acknowledgements, and the sender adjusts its behaviour accordingly. This back-and-forth creates a feedback loop that spoofed traffic cannot replicate, as spoofed traffic is unidirectional. By looking for this kind of closed-loop behaviour, we can use it as a reliable proxy for identifying non-spoofed traffic.

Based on this insight, we designed Penny, a probabilistic traffic checker that detects non-spoofed traffic by dropping a small number of TCP packets and observing retransmissions. The idea is simple: non-spoofed sources will retransmit the dropped packets, while spoofed ones will not. But while the concept sounds straightforward, it gets more complicated in practice. Load balancing, differences in TCP behaviour, performance impact on flows, and even attackers trying to mimic retransmissions can all challenge this assumption.
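The drop-and-observe idea can be sketched as a toy simulation. This is not Penny's implementation: it ignores load balancing, TCP variants and adversaries, and simply models a closed-loop sender as one that resends any segment the monitor dropped, while a spoofed sender never does. The function name `probe` and its parameters are hypothetical.

```python
import random
from collections import deque


def probe(total_segments, closed_loop, drop_rate, seed=0):
    """Drop segments at drop_rate; return (dropped, retransmitted) counts."""
    rng = random.Random(seed)
    dropped_once = set()   # segments the monitor deliberately dropped
    retransmitted = set()  # dropped segments that were observed a second time
    queue = deque(range(total_segments))
    while queue:
        seq = queue.popleft()
        if seq in dropped_once:
            # Second copy of a dropped segment: count it and let it through.
            retransmitted.add(seq)
            continue
        if rng.random() < drop_rate:
            dropped_once.add(seq)
            if closed_loop:
                # The missing acknowledgement triggers a retransmission.
                queue.append(seq)
    return len(dropped_once), len(retransmitted)


genuine = probe(total_segments=1000, closed_loop=True, drop_rate=0.01)
spoofed = probe(total_segments=1000, closed_loop=False, drop_rate=0.01)
print(genuine, spoofed)
```

With the same seed, both runs drop the same segments, but only the closed-loop sender produces matching retransmissions.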

Penny’s drop rate is tuneable, so operators can balance detection accuracy against operational overhead. In our tests, dropping just 12 packets across multiple flows produced a strong signal, with false positive rates as low as one in a million, even in environments with mixed traffic, and caused negligible impact on non-spoofed flows most of the time. It is worth noting that when spoofed traffic dominates, Penny switches from flow-aggregate-level analysis to tracking individual flows, aiming to identify at least one legitimate, non-spoofed source within the set.
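To get a feel for why a handful of drops can give such a strong signal, consider a deliberately simplified back-of-the-envelope model. It is not Penny's actual statistical test: it assumes each dropped packet from a spoofed aggregate is independently followed by something that looks like a retransmission with some fixed probability, and that a false positive requires all drops to be matched. Both function names are hypothetical.

```python
def chance_all_mimicked(per_drop_probability, drops):
    """Probability that every drop is matched by chance, assuming independence."""
    return per_drop_probability ** drops


def max_per_drop_probability(target, drops):
    """Largest per-drop probability that keeps the overall chance at or below target."""
    return target ** (1.0 / drops)


# With 12 drops, how likely may a chance match be per drop while the overall
# false positive rate stays at one in a million (under the assumptions above)?
bound = max_per_drop_probability(1e-6, 12)
print(round(bound, 3))
```

Under these toy assumptions, the overall error shrinks exponentially with the number of drops, which is why a small, fixed budget of dropped packets can be enough.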

For further details, see our ACM SIGCOMM’24 paper, which presents the full methodology and results, and explains how we address key challenges: minimising performance degradation for legitimate flows, handling external factors such as path changes or remote packet loss, and ensuring resilience against spoofers attempting to mimic legitimate traffic patterns.

Introducing openPenny

openPenny is an open-source extension of Penny that we are developing to help network operators detect non-spoofed traffic across traffic aggregates. It supports two complementary operational modes designed to maximise visibility for the deploying network.

Active Mode: This is the primary detection mode. It requires selected traffic slices to be redirected to an external analysis box running the openPenny tool. Utilising Penny’s probabilistic model, openPenny intentionally drops a small number of packets, observing retransmissions to generate a robust signal indicating the presence of non-spoofed sources.

Passive Mode: This mode is transparent and non-intrusive. It operates on mirrored packets and uses flow sampling to provide detailed visibility without touching live traffic. Passive mode enables operators to:

  • Detect packet loss and determine whether it occurs upstream or downstream of the monitor.
  • Identify per-packet load balancing by observing gaps in TCP sequence numbers.
  • Flag abrupt TCP terminations that may suggest misbehaving middleboxes.
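One common way to reason about the first two items is to track TCP sequence numbers in a single direction of a flow. The sketch below is an assumption about how such a check could look, not openPenny's code. A segment the monitor has already seen suggests loss downstream of the monitor (or a spurious retransmission), while a late segment filling an earlier gap suggests loss upstream of the monitor or reordering, for example from per-packet load balancing. Sequence wraparound and partial overlaps are ignored for brevity.

```python
def classify_segments(segments):
    """Label each (seq, length) segment seen by a one-directional monitor."""
    seen = set()        # starting sequence numbers already observed
    highest_end = None  # highest sequence number seen so far, exclusive
    labels = []
    for seq, length in segments:
        if seq in seen:
            # The original passed the monitor, so it was lost later on the path
            # (or retransmitted spuriously).
            labels.append("seen-before")
        elif highest_end is not None and seq < highest_end:
            # Never seen here, but arrives late: upstream loss or reordering.
            labels.append("fills-gap")
        elif highest_end is not None and seq > highest_end:
            labels.append("gap-opened")
        else:
            labels.append("in-order")
        seen.add(seq)
        end = seq + length
        if highest_end is None or end > highest_end:
            highest_end = end
    return labels


trace = [(0, 100), (100, 100), (300, 100), (200, 100), (100, 100)]
labels = classify_segments(trace)
print(labels)
```

Telling upstream loss apart from reordering would additionally require timing information, such as how long the gap stayed open.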

The two modes are complementary and can be combined. For example, operators can first use Passive mode, which operates transparently, to examine specific traffic segments. If Passive mode flags anomalies, operators can then selectively switch to Active mode on the flagged traffic segment to check for the presence of non-spoofed traffic.

How openPenny fits into ISP deployments

Before analysing traffic, openPenny must first gain access to it. Depending on the mode, this is done by either redirecting or mirroring selected traffic slices to an external analysis box.

While PoP architectures vary, we envision a general deployment model (see Figure 3). In this setup, traffic from AS3 enters AS1 through router R3. Part of it is forwarded normally (see orange arrows inside AS1’s PoP). The rest of the entering traffic, represented by the blue arrows, is steered through (or mirrored to) an x86 analysis box running openPenny.

Figure 3. openPenny ISP deployment scenario.

System design

openPenny is designed with the following principles in mind:

  • Local Operation – All processing stays within the ISP’s network; no external data sharing.
  • Low False Positives – It won’t say traffic is not spoofed when it is, even in the presence of an active adversary.

To run efficiently and avoid introducing packet loss or delay, openPenny is configurable and designed to operate within key system constraints:

  • Bandwidth-Aware – It must not overload the link between the router or switch and the analysis box.
  • Resource-Efficient – It must stay within the processing limits of the middlebox handling the analysis.
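One conceivable way to respect the bandwidth constraint is to redirect only a hash-selected fraction of flows, sized so that the redirected slice fits the link towards the analysis box. The sketch below is an illustration under that assumption, not a description of openPenny's configuration. The function names, the headroom factor and the example addresses are hypothetical.

```python
import hashlib


def slice_fraction(link_capacity_bps, aggregate_rate_bps, headroom=0.8):
    """Fraction of the aggregate that fits on the link, leaving some headroom."""
    if aggregate_rate_bps <= 0:
        return 1.0
    return min(1.0, headroom * link_capacity_bps / aggregate_rate_bps)


def in_slice(flow, fraction):
    """Deterministically decide whether a flow belongs to the redirected slice."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < fraction


# Example: a 10 Gbps link to the analysis box and a 100 Gbps traffic aggregate.
fraction = slice_fraction(10e9, 100e9)
flow = ("198.51.100.7", 44321, "203.0.113.9", 443, "tcp")
print(fraction, in_slice(flow, fraction))
```

Hashing on the flow identifier keeps all packets of a flow in the same slice, which matters because retransmissions can only be observed if the whole flow is visible to the analysis box.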

Project status

We are currently developing the openPenny codebase and testing it in a real testbed consisting of a 100 Gbps switch environment to evaluate its performance under conditions that are close to real ISP deployments.

To improve the design and explore new use cases, we are also working closely with the RIPE community. At RIPE 90 in Lisbon, we received helpful feedback on both the active and passive modes of openPenny. Since we expect openPenny to be used directly by operators in their NOCs, operator feedback is essential. We want the system to integrate well with real workflows and be practically useful, so any feedback is welcome.

We plan to release the tool as open-source on GitHub later in 2025.



About the author

Petros Gigis, based in London, United Kingdom

Petros Gigis is a final-year PhD student in Computer Science at University College London. His research interests include computer networks, Internet routing, and Internet measurements.
