
Gunfight at OK Corral: MPLS vs. Probing

Benoit Donnet — 12 Oct 2018
Contributors: Pascal Mérindol
In this article, we investigate the various relationships between MPLS tunnels and standard probing techniques (i.e., ping and traceroute). We investigate several possible MPLS configurations and explain how they interact with measurement techniques. We also provide an external link to an online platform allowing users to test their MPLS configuration.


MPLS routers, i.e., Label Switching Routers (LSRs), exchange labelled packets over Label Switched Paths (LSPs). In practice, those packets are tagged with one or more label stack entries (LSE) inserted between the frame header (data-link layer) and the IP packet (network layer). Each LSE is made of four fields as illustrated by Figure 1: an MPLS label used for forwarding the packet to the next router, a Traffic Class field for quality of service, priority, and Explicit Congestion Notification, a bottom of stack flag bit (to indicate whether the current LSE is the last in the stack), and a time-to-live (LSE-TTL) field having the same purpose as the IP-TTL field (i.e., avoiding routing loops).

Figure 1: LSE format
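
As a minimal sketch of the layout above, the following decodes a label stack from raw bytes. It assumes the standard 32-bit encoding (20-bit label, 3-bit Traffic Class, 1-bit bottom-of-stack flag, 8-bit LSE-TTL); the names `LSE`, `parse_lse` and `parse_stack` are hypothetical and chosen for illustration only.

```python
import struct
from typing import NamedTuple


class LSE(NamedTuple):
    label: int          # 20-bit forwarding label
    traffic_class: int  # 3-bit Traffic Class
    bottom: bool        # bottom-of-stack flag
    ttl: int            # 8-bit LSE-TTL


def parse_lse(raw: bytes) -> LSE:
    """Decode one 4-byte label stack entry (network byte order)."""
    (word,) = struct.unpack("!I", raw[:4])
    return LSE(
        label=word >> 12,
        traffic_class=(word >> 9) & 0x7,
        bottom=bool((word >> 8) & 0x1),
        ttl=word & 0xFF,
    )


def parse_stack(raw: bytes) -> list[LSE]:
    """Decode successive LSEs until the bottom-of-stack flag is set."""
    stack = []
    for offset in range(0, len(raw) - 3, 4):
        entry = parse_lse(raw[offset:offset + 4])
        stack.append(entry)
        if entry.bottom:
            break
    return stack


# Hypothetical entry: label 19, Traffic Class 0, bottom of stack, LSE-TTL 254.
example = struct.pack("!I", (19 << 12) | (0 << 9) | (1 << 8) | 254)
print(parse_stack(example))
```

The loop stops at the first entry whose bottom-of-stack flag is set, since whatever follows it is the IP packet rather than another LSE.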

Labels may be allocated through the Label Distribution Protocol (LDP) as follows:

  1. Through ordered LSP control (default configuration of Juniper routers),
  2. Through independent LSP control (default configuration of Cisco routers).

In the former mode, an LSR only binds a label to a prefix if this prefix is local (typically, the exit point of the LSP), or if it has received a label binding proposal from the IGP next hop towards this prefix. This mode is thus iterative, as each intermediate upstream LSR waits for a proposal from its downstream LSR (so that the LSP is built from the exit point to the entry point). Juniper routers use this mode by default and only propose labels for loopback IP addresses. In the second mode, which is the Cisco default, an LSR creates a label binding for each prefix it has in its RIB (only connected routes and routes in, or redistributed into, the IGP) and distributes it to all its neighbours. This mode does not require any proposal from downstream LSRs. Consequently, a label proposal is sent to all neighbours without ensuring that the LSP is enabled up to the exit point of the tunnel. LSP setup takes less time but may lead to an uncommon situation in which an LSP ends abruptly before reaching the exit point.
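
The difference between the two modes can be illustrated with a deliberately simplified sketch. It models a single prefix on a linear chain of routers and ignores label values, timers and the Juniper loopback-only policy. The router names, the `ldp_enabled` set and both functions are assumptions made for this example, not an LDP implementation.

```python
def label_bindings(chain, ldp_enabled, mode):
    """Return the routers holding a label binding for the exit-point prefix.

    chain: routers ordered from the entry point to the exit point.
    ldp_enabled: set of routers that run LDP.
    mode: "ordered" or "independent".
    """
    if mode == "independent":
        # Every LDP speaker binds a label without waiting for a proposal.
        return {router for router in chain if router in ldp_enabled}
    if mode != "ordered":
        raise ValueError(f"unknown mode: {mode}")
    # Ordered control: the exit point binds first, then each upstream router
    # binds only once its downstream next hop has proposed a label.
    bound = set()
    for router in reversed(chain):
        if router not in ldp_enabled:
            break
        bound.add(router)
    return bound


def lsp_end(chain, bound):
    """Return the last router reached by labelled packets (None if no LSP).

    Assumes the first router of the chain is able to push a label.
    """
    end = None
    for router in chain[1:]:
        if router not in bound:
            break
        end = router
    return end


CHAIN = ["PE1", "P1", "P2", "P3", "PE2"]
PARTIAL_LDP = {"PE1", "P1", "P2"}  # hypothetical: P3 and PE2 do not run LDP

for mode in ("ordered", "independent"):
    bound = label_bindings(CHAIN, PARTIAL_LDP, mode)
    print(mode, sorted(bound), "LSP ends at:", lsp_end(CHAIN, bound))
```

In this toy topology, ordered control builds no LSP at all, whereas independent control yields an LSP that stops at P2, before the exit point.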

The last LSR is the Egress Label Edge Router (the Egress LER). Depending on its configuration, one of two labelling modes is used. The default mode is Penultimate Hop Popping (PHP), where the Egress LER advertises an Implicit Null label (label value of 3). The previous LSR, the Penultimate Hop LSR (PH, P3 in Figure 2), is in charge of removing the LSE to reduce the load on the Egress LER. In Ultimate Hop Popping (UHP), the Egress LER advertises an Explicit Null label (label value of 0). The PH uses this Explicit Null label and the Egress LER is responsible for its removal. The Ending Hop LSR (EH) is the LSR in charge of removing the label: it can be the PH in case of PHP, the Egress LER in case of UHP, or possibly another LSR in the case of independent LSP control.

Depending on its location along the LSP, an LSR applies one of the three following operations:

  • Push: The first MPLS router (Ingress Label Edge Router, or Ingress LER) pushes one or several LSEs onto the IP packet, which turns it into an MPLS packet. The Ingress LER associates the Forwarding Equivalence Class (FEC) of the packet with its LSP. Prior to pushing the LSE, the Ingress LER has to initialise the LSE-TTL. Two behaviours can be configured: either the Ingress LER sets the LSE-TTL to an arbitrary value (255, no-ttl-propagate) or it copies the current IP-TTL value into the LSE-TTL (ttl-propagate, the default behaviour). Operators can select the former using the no-ttl-propagate option provided by the router manufacturer. Once the LSE-TTL has been initialised, the LSE is pushed onto the packet, which is then sent out through an outgoing interface of the Ingress LER. In most cases, except for a given Juniper OS (i.e., Olive), the IP-TTL is decremented before the packet is encapsulated with the MPLS header.
  • Swap: Within the LSP, each LSR makes a label lookup in the Label table, swaps the incoming label with its corresponding outgoing label and sends the MPLS packet further along the LSP.
  • Pop: The EH, the last LSR of the LSP, removes the LSE and converts the MPLS packet back into an IP one. The EH can be the Egress LER when UHP is enabled, or the PH otherwise.
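
To see how these operations interact with traceroute, here is a rough simulation of TTL handling along a PHP tunnel. It rests on several assumptions: the router names PE1, P2 and CE1 are invented to complete the topology, the EH sets the IP-TTL to min(IP-TTL, LSE-TTL) when popping, and the UHP, Opaque and vendor-specific cases are left out. The functions `send_probe` and `traceroute` are hypothetical.

```python
def send_probe(path, ingress, ending_hop, ttl_propagate, probe_ttl):
    """Return the name of the router that answers a probe sent with probe_ttl."""
    ip_ttl = probe_ttl
    lse_ttl = None  # None while the packet travels as plain IP
    for router in path:
        if lse_ttl is None:
            # IP forwarding: the IP-TTL is decremented and checked.
            ip_ttl -= 1
            if ip_ttl == 0 or router == path[-1]:
                return router
            if router == ingress:
                # Push: the LSE-TTL is initialised after the IP-TTL decrement.
                lse_ttl = ip_ttl if ttl_propagate else 255
        else:
            # Swap (or Pop at the EH): only the LSE-TTL is decremented.
            lse_ttl -= 1
            if lse_ttl == 0:
                return router
            if router == ending_hop:
                # Pop. Assumption: the IP-TTL becomes min(IP-TTL, LSE-TTL).
                ip_ttl = min(ip_ttl, lse_ttl)
                lse_ttl = None
    return None


def traceroute(path, ingress, ending_hop, ttl_propagate):
    """Send probes with increasing TTL until the destination answers."""
    trace = []
    for probe_ttl in range(1, 2 * len(path) + 1):
        replier = send_probe(path, ingress, ending_hop, ttl_propagate, probe_ttl)
        trace.append(replier)
        if replier == path[-1]:
            break
    return trace


# Hypothetical topology; P3 is the PH and therefore the EH under PHP.
PATH = ["CE1", "PE1", "P1", "P2", "P3", "PE2", "CE2"]
print(traceroute(PATH, "PE1", "P3", ttl_propagate=True))
print(traceroute(PATH, "PE1", "P3", ttl_propagate=False))
```

With ttl-propagate, every router of the path shows up in the trace. With no-ttl-propagate, P1, P2 and P3 disappear and the trace jumps from PE1 to PE2.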

If the LSE-TTL expires, the LSR, like any IP router, forges an ICMP time-exceeded message that is sent back to the packet originator. It is worth noting that an LSR may implement RFC 4950 (as should be the case in all recent OSes). If so, the LSR quotes the full MPLS LSE stack of the expired packet in the ICMP time-exceeded message.

ICMP processing in MPLS tunnels varies according to the type of ICMP message. ICMP Information messages (e.g., echo-reply) are sent directly to the destination (e.g., the originator of the echo-request) if the IP FIB allows for it (otherwise, no reply is generated). In contrast, ICMP Error messages (e.g., time-exceeded) are generally forwarded to the Egress LER, which is then in charge of forwarding the packet through its IP plane. This leads to unexpected values in the TTL fields of received packets (from the prober's perspective; see the LTE, LER, and LJER fields in Figure 2) that might be used to identify tunnels (see RRR Sec. 2).

Figure 2 illustrates the main vocabulary associated with MPLS tunnels.

Figure 2: Illustration of MPLS vocabulary and relationship between MPLS and standard probing techniques. The figure consists of two parts. The upper part represents the network topology we use throughout the post to illustrate concepts. In particular, with respect to MPLS, P1 is the LSP First Hop (FH), while P3 is the Penultimate Hop (PH). In case of PHP, P3 is the Ending Hop (EH) and is responsible for removing the LSE. In case of UHP, the LSE is removed by the Egress LER (PE2). The middle part of the figure presents the MPLS behaviour with respect to probing (i.e., traceroute and ping).

Probing MPLS Network

According to whether LSRs implement RFC 4950 or not and whether they activate the ttl-propagate option or not, MPLS tunnels can be revealed to traceroute.

Four types of tunnels can be distinguished:

  • Explicit tunnels are those with both RFC 4950 and the ttl-propagate option activated (this is the default configuration). As such, they are fully visible to traceroute, including the labels along the LSP.
  • Implicit tunnels activate the ttl-propagate option but do not implement RFC 4950. No IP information is missed, but LSRs are viewed as ordinary IP routers, so the traceroute output lacks MPLS “semantics”.
  • Opaque tunnels are obscured from traceroute: RFC 4950 is implemented but the ttl-propagate option is not activated and, moreover, the EH that pops the last label has not received an Explicit or Implicit Null label. Consequently, only the EH is revealed while the remainder of the tunnel is hidden.
  • Invisible tunnels are hidden because the no-ttl-propagate option is activated (RFC 4950 may be implemented or not).
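
The four categories described above can be summarised as a small decision function. This is only a sketch of the taxonomy as worded in this post; the function name and its three boolean parameters are hypothetical.

```python
from itertools import product


def classify_tunnel(ttl_propagate: bool, rfc4950: bool,
                    null_label_at_eh: bool = True) -> str:
    """Map an LSP configuration to its visibility class for traceroute.

    null_label_at_eh: whether the EH received an Explicit or Implicit
    Null label (False models an LSP that ends abruptly).
    """
    if ttl_propagate:
        # Every LSR answers; RFC 4950 decides whether labels are quoted.
        return "Explicit" if rfc4950 else "Implicit"
    if rfc4950 and not null_label_at_eh:
        # Only the EH is revealed, with its quoted LSE.
        return "Opaque"
    return "Invisible"


for propagate, quoting, null_label in product((True, False), repeat=3):
    print(f"ttl-propagate={propagate!s:5} RFC4950={quoting!s:5} "
          f"null-label={null_label!s:5} -> "
          f"{classify_tunnel(propagate, quoting, null_label)}")
```

The loop prints the class for all eight combinations of the three flags, which makes it easy to check that only no-ttl-propagate configurations hide LSRs from traceroute.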

As illustrated in Figure 2 (in the lower part), Explicit tunnels are the ideal case as all the MPLS information comes natively with traceroute.

Implicit tunnels might be identified based on the way LSRs process ICMP messages (see Sec. 1, the u-turn effect) and on the IP-TTL quoted in the time-exceeded message (the so-called qTTL), which increases by one at each subsequent LSR of the LSP due to the ttl-propagate option (ICMP time-exceeded messages are generated based on the LSE-TTL, while the IP-TTL of the probe is left unchanged within the LSP and is thus quoted as such in the ICMP time-exceeded message). The u-turn effect relies on the fact that, by default, LSRs send ICMP time-exceeded messages to the Egress LER which, in turn, forwards them to the probing source. However, LSRs reply directly to other kinds of probes (e.g., echo-request) using their own IP forwarding table, if available. As a result, return paths are generally longer for time-exceeded messages than for echo-reply messages. The u-turn effect is the signature given by the difference between these lengths.
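
A rough sketch of how these two signatures could be computed from measurement data follows. It assumes that replies start with one of the common initial TTL values (64, 128 or 255) and that the return path is shorter than the gap between those values. The function names and the sample numbers are hypothetical.

```python
COMMON_INITIAL_TTLS = (64, 128, 255)  # assumption: usual initial TTL values


def return_path_length(received_ttl: int) -> int:
    """Estimate the hops travelled by a reply from the TTL left on arrival."""
    initial = min(t for t in COMMON_INITIAL_TTLS if t >= received_ttl)
    return initial - received_ttl


def u_turn(te_received_ttl: int, er_received_ttl: int) -> int:
    """Difference between time-exceeded and echo-reply return path lengths."""
    return (return_path_length(te_received_ttl)
            - return_path_length(er_received_ttl))


def qttl_increases(qttls):
    """Indices of hops whose qTTL is one more than the previous hop's qTTL."""
    return [i for i in range(1, len(qttls)) if qttls[i] == qttls[i - 1] + 1]


# Hypothetical replies from one LSR: time-exceeded arrives with TTL 247,
# echo-reply arrives with TTL 251.
print(u_turn(247, 251))

# Hypothetical qTTL values collected hop by hop along a trace.
print(qttl_increases([1, 1, 1, 2, 3, 1, 1]))
```

A positive u-turn value, or a run of increasing qTTL values, is only a hint: asymmetric routing and unusual initial TTLs can produce similar patterns.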

Opaque tunnels are only encountered with Cisco LSPs and are a consequence of the way labels are distributed with LDP (see Sec. 1). Indeed, a label proposal may be sent to all neighbours without ensuring that the LSP is enabled up to the Egress LER. This leads to Opaque tunnels because an LSP can end abruptly without reaching the Egress LER (where the prefix is injected in the IGP), which should bind an Explicit Null (UHP) or Implicit Null (PHP) label. As illustrated in Figure 2, Opaque tunnels and their length can be identified thanks to the LSE-TTL. Such LSPs end without a standard terminating label (Implicit or Explicit Null), and so they break with the last MPLS header of the neighbour, which may not be an MPLS speaker. The vast majority of Opaque tunnels seem to be caused by Carrier-of-Carriers VPN or similar technologies. Indeed, these provoke an abrupt tunnel ending (the bottom label is necessarily carried up to the end of the tunnel to determine the correct outgoing VPN) and, unfortunately, lead to tunnels whose content cannot be revealed.

For Invisible tunnels, the traceroute behaviour differs according to the way the LSE is popped from the packet (i.e., UHP or PHP), as illustrated in Figure 2.

With Invisible UHP tunnels (typically, with Cisco routers running IOS 15.2), upon reception of a packet with an IP-TTL of 1, the Egress LER does not decrement this TTL but, rather, forwards the packet to the next hop (CE2 in the example), so that the Egress LER does not show up in the trace. In contrast, the next hop appears twice: once for the probe that should have expired at the Egress LER and once for the next probe. UHP thus provokes a surprising pattern, a duplicated IP address at two successive hops, illustrated as “Invisible UHP” in Figure 2. This duplicated IP address might be mistaken for a forwarding loop.
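
Spotting this pattern in a trace is straightforward. The sketch below flags any address seen at two successive hops, which marks a candidate Invisible UHP tunnel rather than proving one. The function name and the addresses (taken from the documentation range 192.0.2.0/24) are hypothetical.

```python
def duplicated_hops(trace):
    """Return (hop_number, address) for addresses seen at two successive hops.

    trace: list of replying addresses, one per hop, with None for
    hops that did not answer. Hop numbers start at 1.
    """
    return [
        (i + 1, trace[i])
        for i in range(len(trace) - 1)
        if trace[i] is not None and trace[i] == trace[i + 1]
    ]


# Hypothetical trace: the address at hops 3 and 4 is the same.
trace = ["192.0.2.1", "192.0.2.2", "192.0.2.9", "192.0.2.9", "192.0.2.20"]
print(duplicated_hops(trace))
```

Non-responsive hops are ignored so that two consecutive missing replies are not reported as a duplicate.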

In contrast, PHP moves the Pop function to the PH, one hop before the end of the tunnel. This PH does not decrement the IP-TTL, whatever its value. Except for some JunOS versions, the packet is still MPLS switched because its LSE-TTL has not expired. This is somewhat surprising because, for Explicit and Implicit tunnels, the PH replies on its own. It does so because, in those cases, the LSE-TTL has also expired. In Figure 2, we can see that there is no longer any asymmetry in path length for router P3, which shows that its reply does not follow a u-turn via the Egress LER. In contrast, any other LSR on the LSP builds a time-exceeded message when the LSE-TTL expires and then MPLS switches this error packet towards the Egress LER, unless the mpls ip ttl-expiration pop <stack size> command has been activated on Cisco routers. On Juniper routers, this behaviour seems to be just an option, enabled with the icmp-tunneling command.

We provide an online platform for testing interactions between traceroute/ping and MPLS tunnels. It is enough to probe towards the platform. The platform also allows users to assess advanced probing algorithms to reveal the content of PHP/UHP Invisible tunnels (see our presentation at RIPE 77 for more details).


