When RFC1883 was published in 1995, it marked not the end of the process that produced the IPv6 protocol architecture but a milestone in the protocol’s evolution.
Some elements of the architecture have withstood the test of time largely unchanged, while others have required changes or been found to be unusable and are therefore observably rare to nonexistent on the Internet. The emergence and massive impact of the commercial Internet largely postdate the early work on replacing the IPv4 protocol, and our demands and expectations for the protocol have evolved accordingly.
In RFC1883, the IPv6 flow label was a newly introduced element of the IPv6 header, not present in any form in the IPv4 header, allowing for the extension of connection (flow) oriented packet handling down from the transport layer.
“All packets belonging to the same flow must be sent with the same source address, destination address, priority, and flow label.” — RFC1883, Section 6
It was thought at the time that flow-cached approaches to forwarding might represent a significant optimization over the expense of performing a forwarding lookup based on longest prefix match.
“Routers are free to “opportunistically” set up flow-handling state for any flow, even when no explicit flow establishment information has been provided to them via a control protocol, a hop-by-hop option, or other means.” — RFC1883, Section 6
Instead, over time, as Content Addressable Memory table-based Longest Prefix Match lookups in routers supplanted software forwarding, flow-based forwarding state, and the expense of maintaining it, fell out of favour as a way of scaling routers.
Flow state as a forwarding optimization remained exclusively in choke-points where flow-state-tracking was an actual necessity by design such as firewalls and Network Address Translation/Port Address Translation. Modern routers are bounded in performance by the worst-case scenario for forwarding costs, namely that the next-hop for all packets forwarded must be looked up on a per-packet basis.
In 2011, RFC6294 and RFC6437 were published, the latter an update of the flow label specification. RFC6294 examines the history of the flow label’s use and establishes a consensus basis for the use of this 20-bit field in the IPv6 header.
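As a rough illustration of where those 20 bits sit, the sketch below splits the first 32-bit word of an IPv6 header into its three fields. It assumes the current header layout (4-bit version, 8-bit traffic class, 20-bit flow label); the function name and the sample value are hypothetical and chosen only for this example.

```python
import struct


def parse_ipv6_first_word(header: bytes):
    """Split the first 32 bits of an IPv6 header into its three fields.

    Assumed layout: 4-bit version, 8-bit traffic class, 20-bit flow label.
    """
    if len(header) < 4:
        raise ValueError("need at least the first 4 bytes of the header")
    (word,) = struct.unpack("!I", header[:4])  # network byte order
    version = word >> 28
    traffic_class = (word >> 20) & 0xFF
    flow_label = word & 0xFFFFF  # low 20 bits
    return version, traffic_class, flow_label


# Hypothetical sample: version 6, traffic class 0, flow label 0xABCDE.
sample = struct.pack("!I", (6 << 28) | (0 << 20) | 0xABCDE)
version, traffic_class, flow_label = parse_ipv6_first_word(sample)
print(version, traffic_class, hex(flow_label))
```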
One enduring application for the flow label (mentioned in RFC6294) is as a component of a hash for equal-cost multi-path forwarding, whether for stateless load balancers or simply across an aggregated bundle of links. Using only the IP source and destination for hashing provides a very limited source of entropy, particularly for two hosts communicating with each other, as the source and destination will be the same for all flows.
Typically, IPv4 fields included in the hash might be source IP, destination IP, protocol, source port, and destination port. In the IPv6 hashing scenario, hashing has the added complication of potentially not being able to find the L4 transport header for several reasons: intervening extension headers, hop-by-hop options, non-initial fragments, or IPsec.
Due to these complications, it’s desirable to be able to hash exclusively on the source IP, destination IP, and flow label. For cases where the flow label is not set to zero, the extra 20 bits of randomness are more than sufficient to achieve an extremely even distribution of flows across a cluster of servers or link aggregation bundles.
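The idea of hashing on source, destination, and flow label can be sketched in a few lines. This is only an illustration under stated assumptions: SHA-256 stands in for whatever (much cheaper) function real hardware uses, and `pick_path`, the addresses, and the 4-way path group are hypothetical names and values chosen for the example.

```python
import hashlib
import ipaddress


def pick_path(src: str, dst: str, flow_label: int, n_paths: int) -> int:
    """Map (source, destination, flow label) onto one of n_paths."""
    if not 0 <= flow_label < 2**20:
        raise ValueError("flow label is a 20-bit field")
    key = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + flow_label.to_bytes(3, "big")
    )
    # Assumption: SHA-256 is a stand-in; switch hardware uses simpler hashes.
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


# Hypothetical pair of hosts (documentation prefix) and a 4-way path group.
SRC, DST, N_PATHS = "2001:db8::1", "2001:db8::2", 4

# With the flow label left at zero, every flow between the pair shares a path.
zero_label_paths = {pick_path(SRC, DST, 0, N_PATHS) for _ in range(1000)}

# With a distinct non-zero label per flow, the flows spread across the paths.
labelled = [pick_path(SRC, DST, label, N_PATHS) for label in range(1, 1001)]
per_path = [labelled.count(path) for path in range(N_PATHS)]

print("paths used with label 0:", len(zero_label_paths))
print("flows per path with distinct labels:", per_path)
```

Nothing here needs to locate the L4 header, which is the attraction of the three-field hash when extension headers or fragments are in the way.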
Applications of the flow label
As IPv6 deployment has progressed, various parts of the stack have added or improved support for the use of the flow label.
Major operating systems now set the flow label to non-zero values. Ethernet switches, routers, load balancers, and other devices that depend on being able to distribute traffic across two or more paths for scalability reasons have added the ability to use the flow label as an additional hash component.
For applications where flow handling is stateful (such as firewalls, Layer 3/4 load-balancers, and connection-oriented load balancers) it’s important that the flow label is not changed for the duration of the connection, otherwise the computed hash will change and, depending on the number of possible destinations, there is anywhere from a 50% chance (two destinations) to a near certainty (hundreds or thousands of possible paths) that the flow will be hashed onto a new destination.
The use of the flow label as a hash element for any stateful application requires that the flow label remain unchanged over the life of the flow.
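The range quoted above, from 50% to a near certainty, follows from simple arithmetic if the hash is assumed to spread flows uniformly: a changed label lands on the original destination with probability 1/N, so the flow moves with probability 1 - 1/N. A minimal sketch of that calculation, with a hypothetical function name:

```python
def remap_probability(n_destinations: int) -> float:
    """Chance that a changed flow label hashes to a different destination.

    Assumes an ideal, uniform hash over n_destinations.
    """
    if n_destinations < 1:
        raise ValueError("need at least one destination")
    return 1 - 1 / n_destinations


for n in (2, 8, 256, 1000):
    print(f"{n:>5} destinations: {remap_probability(n):.1%} chance of moving")
```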
Stateful inspection firewalls have a further complication with state expiration in that they will emit Internet Control Message Protocol unreachable messages or TCP resets for established connections without having recourse to the original outgoing flow label, so they will either set it to zero or use a new flow label. In either case, assumptions about hashing the return message to the sender using the flow label depend on having the one associated with the flow handy.
Cases have been observed where devices that connect to hashed endpoints do not in fact honour this behaviour. By and large, this flow label changing behaviour has been traced to IPv6-supporting CPE/firewalls, which change the flow label between the initial SYN and the ACK.
It’s not always clear what the motivation for changing the flow label mid-flow might be, but quality of service policies that favour preferential treatment for TCP handshakes appear to play a role. This behaviour will break TCP connections that hash through a load balancer that includes the flow label as a hash component.
At Fastly, this hashing is performed by an Ethernet switch ASIC, and to avoid breakage, the IPv6 hashing function must not include the flow label. As in IPv4, the hash function includes the source and destination information in the L3 and L4 headers.
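To see why leaving the flow label out avoids the breakage, the sketch below compares a hash that includes the flow label with one built only from L3 and L4 fields, for a packet whose label changes after the SYN. It is an illustration, not a description of any particular ASIC: SHA-256 is a stand-in hash, and the `Packet` type, function names, addresses, ports, and server count are all hypothetical.

```python
import hashlib
import ipaddress
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    src_port: int
    dst_port: int
    next_header: int  # 6 = TCP
    flow_label: int


def _bucket(key: bytes, n: int) -> int:
    # Assumption: SHA-256 stands in for the hardware hash function.
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % n


def _addresses(pkt: Packet) -> bytes:
    return (ipaddress.IPv6Address(pkt.src).packed
            + ipaddress.IPv6Address(pkt.dst).packed)


def hash_with_flow_label(pkt: Packet, n: int) -> int:
    """Source, destination, and flow label."""
    return _bucket(_addresses(pkt) + pkt.flow_label.to_bytes(3, "big"), n)


def hash_five_tuple(pkt: Packet, n: int) -> int:
    """Source, destination, protocol, and ports; flow label ignored."""
    key = (_addresses(pkt) + bytes([pkt.next_header])
           + pkt.src_port.to_bytes(2, "big")
           + pkt.dst_port.to_bytes(2, "big"))
    return _bucket(key, n)


N_SERVERS = 8
syn = Packet("2001:db8::1", "2001:db8::2", 49152, 443, 6, flow_label=0x12345)

# The same connection, but with the label rewritten to 1000 other values.
acks = [replace(syn, flow_label=label) for label in range(1, 1001)]

moved_with_label = sum(
    hash_with_flow_label(ack, N_SERVERS) != hash_with_flow_label(syn, N_SERVERS)
    for ack in acks
)
moved_five_tuple = sum(
    hash_five_tuple(ack, N_SERVERS) != hash_five_tuple(syn, N_SERVERS)
    for ack in acks
)

print("ACKs sent to a different server (label in hash):", moved_with_label)
print("ACKs sent to a different server (5-tuple only): ", moved_five_tuple)
```

The trade-off is that the 5-tuple hash depends on being able to reach the transport header, which the flow label approach was meant to avoid.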
More recently in the IETF, alternative uses for the flow label have been proposed. One such proposal is draft-fioccola-spring-flow-label-alt-mark, which describes an alternative marking method for packets that is embedded in the flow label rather than, as originally proposed, carried in an inserted extension header. The proposal envisions that it can only be used in controlled domains where flow label hashing is disabled.
This application is probably safe from immediate damage to IPv6 flow label hashing. However, once one application that manipulates the field between packets associated with a flow has been approved, it will be tempting to consider other uses for the 20 bits available in the flow label field. For example, in IPv4 the IP TOS bits have been recycled many times for both public and internal purposes; some of those applications will no doubt escape the sandbox over time.
Hashing over rapidly changing flow labels increases the potential for packet reordering even for the most basic uses of a flow label hash.
Avoid using the flow label as a hash component
Network operators that have to support load-balanced services should probably avoid using the flow label as a hash component. We have seen Ethernet switch ASICs that include flow label hashing by default, with no way to disable it. The ability to control the inputs to the hash function should be a consideration in any load-balancing RFP.
While it is tempting to consider using the flow label bits for one’s own purposes internally, doing so runs the risk of colliding with someone else’s application, which then requires additional sanitization of packets entering or leaving one’s network. While sanitization is allowed by the RFCs (RFC6437, Section 6.1), it creates its own risks where leakage occurs and flow labels are treated as non-opaque values either within a network or by a third party.
This blog post was originally published on the APNIC blog.
Cecil Ward •
I would appreciate some clarification and expansion. Please be kind and forgive my ignorance. Who what where in the stack is choosing values for the flow label? Applications are not necessarily being written or rewritten to be flow-label aware. If they are dual-stacked how can they manage when this feature is not available in IPv4. Transports could use the flow label instead of apps but how would they cope with lack of support in the entity at the remote end or even worse a different and unknown usage scheme for the flow label by the other end? The lack of clarity in the original spec and the lack of a usage scheme self-identifier are problems that imho can't be fixed without the availability of the Tardis. In closed controlled environments these problems are not a factor but kit and software is, I would assume, not being built that only works in such niche environments. Or is it? An RFC that I read recently seemed to present a fairly hopeless picture. Thanks for a useful article. Cecil Ward.
joel jaeggli •
If you're working at the Linux sockets layer or above, the selection of a (random) flow label is taken care of for you. All you're specifying is (C example) sockfd6 = socket (r->ai_family, r->ai_socktype, r->ai_protocol); You can take a look at: https://www.lugod.org/presentations/ipv6programming/PortMeth.pdf RFCs tend not to specify APIs, at least historically, but as with source port numbers this is supposed to be handled automatically unless you have a really good reason to mess with it. I do agree that it's a bit of a mess from my vantage point, hence the motivation for writing the above missive.