Networking in Docker comes with challenges, but IPv6’s abundance of addresses holds some solutions.
Recently, I was working on migrating my RIPE Atlas probe from running natively on a Raspberry Pi to running on a different Pi inside Docker.
If you don’t know about it already, you should definitely check out the RIPE Atlas project, but it’s not necessary in order to follow along here.
By default, a Docker container will be assigned an IPv4 address in a private (RFC 1918) range, which the Docker daemon will then NAT to the host’s address. You can expose ports (essentially port forward) from containers to be visible at ports on the host’s IP.
Also by default, Docker just doesn’t do IPv6, which has definitely been added to my list of Docker pet peeves! If you want to expose ports from containers over IPv6, then one current practice seems to be:
- Create a Docker network with an IPv6 unique local address (ULA) range
- Use the docker-ipv6nat container to NAT this to the host’s IPv6 address
My cringing at the thought of NATv6 aside, NAT does, for both IPv4 and IPv6, work *okay* for exposing many services, and in both cases does allow you to specify an address on the host to bind to.
The difference with my RIPE Atlas container is that the probes don’t actually need to expose any ports. They simply open an outbound connection to the RIPE Atlas servers and then conduct measurements as instructed.
Because of this, my objective was actually to specify an IPv6 address (in the range assigned by my ISP) for outgoing connections from the Atlas container. The add-on NATv6 container, and also the built-in NATv4 for that matter, doesn’t seem to have this capability.
In my specific circumstance, I don’t mind NATing IPv4 in Docker — I only have a single public v4 address so my router does NATv4 anyway. But I did want to set the outgoing IPv6 address for neatness, ease of identification, and separation of Atlas traffic.
A Solution or Two
I think there are probably two ways to accomplish my objective.
The first, which isn’t the one I ended up using, is to connect the container to a Docker network using the macvlan driver. This essentially makes the container the same as any other device on the network: it can receive DHCP and RAs directly from a router and set itself up however you configure it. The main reason I didn’t go this route is that I would’ve had to build my own Docker image and embed my chosen IPv6 address in the guest OS’s settings (or write a script to pull it from an environment variable). This is kind of clunky and also works against one of Docker’s main objectives: portability.
The second option I think is sneakily clever. First up, here are the relevant parts of my docker-compose.yaml, with generic IP addresses (assume 2001:db8::/56 is the range assigned to me by my ISP):
    services:
      probe:
        # other container options...
        networks:
          probe-network:
            ipv6_address: 2001:db8::a:71a5

    networks:
      probe-network:
        driver: bridge
        enable_ipv6: true
        ipam:
          driver: default
          config:
            - subnet: 2001:db8::a:71a5/125
This alone will establish Docker’s default IPv4+NAT setup, but will also give the container 2001:db8::a:71a5 and add the necessary routes on the host for the 2001:db8::a:71a5/125 range. As a side note, a /125 was the smallest range that worked for me, due to the address required for the interface on the host and other overheads.
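To make the size of that /125 concrete, here is a small sketch of the arithmetic in plain shell. It only looks at the last 16-bit group of the address, which is enough because a /125 leaves just three host bits; the function name `subnet_addresses` is made up for this illustration.

```shell
#!/bin/sh
# Sketch: list the addresses covered by the /125 containing 2001:db8::a:71a5.
# Only the final 16-bit group varies, since a /125 leaves 3 host bits.

last_group=$((0x71a5))                      # final group of the container address
prefix_len=125
host_bits=$((128 - prefix_len))             # 3 host bits
block_size=$((1 << host_bits))              # 8 addresses in the block
first=$((last_group & ~(block_size - 1)))   # clear the host bits to find the start

# Hypothetical helper: print every address in the block, one per line.
subnet_addresses() {
    i=0
    while [ "$i" -lt "$block_size" ]; do
        printf '2001:db8::a:%x\n' $((first + i))
        i=$((i + 1))
    done
}

subnet_addresses
```

That works out to eight addresses, 2001:db8::a:71a0 through 2001:db8::a:71a7, which have to cover the host-side interface address as well as the container itself.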
At this point, the host should be able to ping the container at 2001:db8::a:71a5, but nothing beyond the host will work, and that’s simply because nothing else knows where to find that address on the local network.
To fix this, some sysctl options first need to be set on the Docker host to allow it to route packets:
    $ sysctl -w net.ipv6.conf.all.forwarding=1         # Enable IPv6 forwarding
    $ sysctl -w net.ipv6.conf.INTERFACE.accept_ra=2    # If required, re-enable RAs on INTERFACE
Or if the host uses systemd’s networkd, then the following option in the external interface’s config file will set the same sysctl options in the background:
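A minimal sketch of what that could look like, assuming a `.network` file that matches the external interface. The file name and `eth0` are placeholders. `IPForward=` and `IPv6AcceptRA=` are options described in systemd.network(5); newer systemd releases document `IPv6Forwarding=` as the replacement for `IPForward=`, so check the manual for the version you are running.

```ini
# /etc/systemd/network/10-eth0.network (hypothetical file name)
[Match]
# Placeholder for the external interface
Name=eth0

[Network]
# Enable IPv6 forwarding
IPForward=ipv6
# Keep accepting RAs on this interface even though forwarding is enabled
IPv6AcceptRA=true
```

Comments sit on their own lines here because systemd config files do not support trailing comments after a value.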
The host will now be able to route packets to the container, but other devices still won’t know that the host is responsible for the container’s address. There are a few ways to fix this depending on your exact network setup, such as having the Docker host send Router Advertisements, or adding the subnet to a router as a static route (with the latter noteworthy for also being possible with IPv4 if desired).
Both of those options would be great for an IPv6 range, but for individual addresses, particularly ones within the range of an existing network, there’s a simpler way…
One of the neatest features of IPv6 I’ve discovered so far is NDP proxying. NDP (the Neighbour Discovery Protocol) is the IPv6 equivalent of IPv4’s ARP. They’re both essentially the protocols that devices use to find the MAC address for a given IP address on their local network segment.
NDP proxying allows a device to attract traffic from the local network bound for a given address by way of Neighbour Advertisements, but without actually assigning the address to an interface.
If the Docker host uses networkd, another option in the external interface’s config will add an address for NDP proxying:
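As a sketch, and assuming the same `.network` file that matches the external interface, the relevant line would be something like the following. `IPv6ProxyNDPAddress=` is described in systemd.network(5), which also notes that setting it implies `IPv6ProxyNDP=` is enabled unless that option has been explicitly turned off.

```ini
[Network]
# Answer Neighbour Solicitations for the container's address on this interface
IPv6ProxyNDPAddress=2001:db8::a:71a5
```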
Once a packet arrives, the host will see that it matches the route for 2001:db8::a:71a5/125 and pass it to the Docker interface.
If you’re not using systemd, then you can of course control NDP proxying directly:
    $ sysctl -w net.ipv6.conf.INTERFACE.proxy_ndp=1           # Enable NDP proxying on INTERFACE
    $ ip -6 neigh add proxy 2001:db8::a:71a5 dev INTERFACE    # Add address for proxying on INTERFACE
NDP proxying works in my case as the address of the container is within the range of my LAN, so routes already exist for it. Something on the network just needs to stick up its hand and say “Hey! That’s me!”, which is pretty much exactly what NDP proxying makes the host do. If I wanted the container address to be outside of my LAN, then I’d have to go with the RA or static route option.
I should note that ARP proxying is also possible, but from my very brief look at it, it doesn’t seem to be selectable like NDP proxying. At least from the systemd manual, it sounds like a method a router may use to attract all traffic from a network segment so it can route it itself, so definitely something that could brick a network if you’re not careful.
I’ll end by noting that I haven’t extensively tested or researched these solutions, so you should do so yourself before implementing them in anything important.
If you discover any big drawbacks then let me know.
This article was originally published on Cam's Blog.