Please read this guest post by Samir Jafferali from LinkedIn in which he explains which steps he followed when setting up an anycast network: from acquiring address space to setting up routing and announcing the prefix from multiple PoPs.
Virtually every web page or app you use talks to a unique remote server with a unique IP address. When popular websites or apps have servers around the world, one problem is that complex systems have to be built to make sure you're talking to the optimal server (usually the closest one to you), otherwise performance can suffer. With anycast, a single IP address is assigned to the global servers and then the glue of the Internet, BGP (Border Gateway Protocol), routes you to the “closest” server.
The server you reach isn’t necessarily the one physically closest to you, but the one that is “network-close” to you. While anycast isn't new, using it to improve performance for web traffic is a relatively recent trend that sites such as LinkedIn are adopting.
If you want to see anycast in action, click here to see where my network routes you.
This guide is intended for techies and sysadmin types who’d like to build a “Hello World” anycast network. If you’re not technical, consider skimming it to get an overview of how Internet plumbing is assembled. Running your own network is not only fun, but also instructive, and will give you a different vantage point on key topical issues like net neutrality, censorship, and IPv6.
The administrative steps will take a couple weeks since you’ll be interacting with other organisations. The technical steps will take a few evenings to get things up and running, and considerably more time if you decide to science the heck out of it. The completed network spans 20 Points of Presence (PoPs) across six continents as confirmed with a trial Catchpoint account:
Figure 1: The completed network spanning 20 PoPs across six continents
1. Register with a Regional Internet Registry
The Regional Internet Registry (RIR) is the body that'll assign the building blocks such as your subnet and Autonomous System. The location of your service, your sensitivity to cost, and how much red tape you're comfortable with will decide which RIR you pick. I gravitated to the RIPE NCC as it had a good mix of accessibility and cost, and four of the PoPs I was planning to use were in Europe. Creating an account with the RIPE NCC is straightforward. Get familiar with the RIPE Database, as you can immediately start creating the metadata (maintainer, person, organisation, etc.) used in the next steps.
Figure 2: Screenshot of the RIPE Database search field
What criteria do you use for selecting a RIR? Share them with me in the comments.
2. Acquire a /24 and /48
The smallest prefixes that are generally accepted across the Internet over BGP are a /24 for IPv4 and a /48 for IPv6, so you'll need to get your hands on at least one of them. Acquiring IPv6 space is relatively easy due to its abundance, and some organisations will assign you a /44 for free.
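To make the prefix sizes concrete, here is a minimal Python sketch using the standard `ipaddress` module. The function names are mine, and the prefixes are documentation ranges rather than real allocations. It checks whether a prefix is short enough to be accepted globally and counts how many /48s fit in a /44.

```python
import ipaddress

# Longest prefix lengths that are generally accepted in the global routing table.
MAX_ACCEPTED_PREFIXLEN = {4: 24, 6: 48}


def is_globally_announceable(prefix: str) -> bool:
    """Return True if the prefix is no more specific than a /24 (IPv4) or /48 (IPv6)."""
    network = ipaddress.ip_network(prefix, strict=True)
    return network.prefixlen <= MAX_ACCEPTED_PREFIXLEN[network.version]


def count_subnets(prefix: str, new_prefixlen: int) -> int:
    """Return how many subnets of length new_prefixlen fit inside the prefix."""
    network = ipaddress.ip_network(prefix, strict=True)
    if new_prefixlen < network.prefixlen:
        raise ValueError("new_prefixlen must not be shorter than the prefix itself")
    return 2 ** (new_prefixlen - network.prefixlen)


if __name__ == "__main__":
    print(is_globally_announceable("192.0.2.0/24"))   # a /24 is fine
    print(is_globally_announceable("192.0.2.0/25"))   # a /25 is too specific
    print(count_subnets("2001:db8::/44", 48))         # /48s inside a /44
```

A free /44 therefore leaves room for 16 separately announceable /48s, which is plenty for experimenting.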
Figure 3: My inetnum object in the RIPE Database
IPv4 is optional for this project. You have two choices: become a holder of space by becoming a member of an RIR, or lease space from an existing holder. Holding space is more complex for many reasons, and even if you receive a /22 from the RIPE NCC (as of December 2016 they still have 13.3 million IPv4 addresses to hand out) you’ll be stuck with hefty annual RIR fees. I initially went down this route and could write an entire other blog post about that. For this project and the remainder of this guide I'm going to cover leasing, as it has low upfront costs and the ability to go month to month.
If you decide to use IPv4 you’ll need to shop around; webhostingtalk and lowendbox are your friends. After discussions with several parties it became clear that Prager IT was the way to go, as their professionalism and enthusiasm checked all my boxes. Not only do they include a complimentary IPv6 /48 with the lease of IPv4 space, but they were also responsive to multiple Letter of Authority (LOA) requests. While the leasing process was an administrative slog, requiring contract work and somewhat elaborate verification, it was straightforward.
What suggestions do you have for acquiring IPv4 and IPv6 space? Let me know in the comments.
3. Apply for an Autonomous System
The Autonomous System number (ASN) is used to identify your network on routers. This step is technically optional. You can announce your IP space without a public ASN by using a private ASN with a transit provider that already has a public ASN. However, without a public ASN you won’t be able to peer or use advanced traffic management like prepending and communities. If you don't want an ASN, jump to step 4.
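If you are unsure whether an ASN is private or public, the reserved private ranges (defined in RFC 6996) are easy to check. The following is a small sketch, not part of the original setup; the function name is mine.

```python
# Private-use ASN ranges reserved by RFC 6996 (inclusive bounds).
PRIVATE_ASN_RANGES = (
    (64512, 65534),            # 16-bit private range
    (4200000000, 4294967294),  # 32-bit private range
)


def is_private_asn(asn: int) -> bool:
    """Return True if the ASN falls in a private-use range."""
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


if __name__ == "__main__":
    for asn in (64512, 49073, 4200000000):
        kind = "private" if is_private_asn(asn) else "not private"
        print(f"AS{asn}: {kind}")
```

A private ASN is only meaningful between you and your transit provider, which replaces it with its own public ASN before the route reaches the rest of the Internet.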
With the RIPE NCC, there is no fee for an ASN as long as you get sponsored by an existing member of the RIPE NCC (an LIR). The recommendation here is to lease your IP space from someone that’ll sponsor you. Also note that applying for an ASN requires you to be “multi-homed”, meaning that your paperwork will need to show you have contracts with at least two transit providers (see next step).
Figure 4: My aut-num object in the RIPE Database
Once you have your ASN you can create IPv4 and IPv6 route objects in the RIPE DB so that transit providers can automatically whitelist your prefixes without an LOA.
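As a rough illustration of what those route objects contain, here is a Python sketch that renders the core attributes of a `route` or `route6` object. The prefixes, the ASN (AS64496, a documentation ASN), and the maintainer name `EXAMPLE-MNT` are placeholders, and the real database may require or accept further attributes, so treat this as a sketch rather than a submission-ready template.

```python
import ipaddress


def render_route_object(prefix: str, origin_asn: int, maintainer: str,
                        source: str = "RIPE") -> str:
    """Render the core attributes of a route (IPv4) or route6 (IPv6) object."""
    network = ipaddress.ip_network(prefix, strict=True)
    object_type = "route" if network.version == 4 else "route6"
    attributes = [
        (object_type, str(network)),
        ("origin", f"AS{origin_asn}"),
        ("mnt-by", maintainer),   # hypothetical maintainer name
        ("source", source),
    ]
    # Align values in a column, as the database's text output usually does.
    return "\n".join(f"{name + ':':<16}{value}" for name, value in attributes)


if __name__ == "__main__":
    print(render_route_object("192.0.2.0/24", 64496, "EXAMPLE-MNT"))
    print()
    print(render_route_object("2001:db8::/48", 64496, "EXAMPLE-MNT"))
```

The key relationship is between the prefix and the `origin` attribute: it tells upstreams which ASN is allowed to originate the prefix, which is what lets them build filters automatically.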
Are there other approaches to acquiring an ASN?
4. Get Connectivity and Virtual instances
You’ll acquire the last two building blocks of your network in this step. Getting connectivity and compute cycles has traditionally been a daunting step as it required getting real hardware in colocated space or an Internet Exchange (IX). By leveraging Virtual Private Server (VPS) providers that support “Bring Your Own IP” you get easy access to both transit and virtual instances. While most networks peer with others, for simplicity we’ll rely solely on transit.
Singlehome: My recommendation here is to use Vultr as your primary cloud+transit provider as they have 15 locations across 4 continents, have good BGP support (like communities), and will set you back $5 per month per PoP (BGP session fees included). And, if you don’t have an ASN, you can use theirs. Because of their high level of automation, once your BGP session has been approved it's really as simple as spinning up VPS instances and announcing your space.
Figure 5: My IPv4 and IPv6 peers
Multihome: While this step is optional, redundancy is always recommended. The goal here is twofold. First, add a second provider in your primary markets (North America in my case), then maximise coverage. As I wanted to have presence on 6 continents and shore up APAC, I strategically picked Hong Kong, India, Brazil, and South Africa as my next expansions and set out to find providers in those regions.
After a lot of searching and sending emails, I settled on these 3 providers:
Because the providers' processes and environments are heterogeneous, this step will take some time. I ran into many issues such as IP binding, upstream filtering and approvals, invalid boot devices, BGP config issues, nuances with OpenVZ and IPv6, LOA requests, iptables and SELinux madness, inoperable VNC/console access, and timezone delays. Despite this, the support staff were terrific, which helped push the process along (and a special shout out to Andrew at HostUs for going above and beyond). VPS fees range from $5 to $7 a month; however, these providers also tack on one-time or recurring BGP session fees.
There are more VPS suggestions here. I also found Nat Morris’s great presentation about anycast on a shoestring when completing this write-up. It was interesting to see how we tackled the same problem for different audiences in very different ways. Some other providers that I considered were Zappie, Packet.net (bare metal), and Safehouse. YMMV.
What are your recommendations for VPS providers that support BGP and ‘Bring Your Own IP’? Do they include communities?
5. Configure your cloud platform
Here’s where you set up your favorite web services on one instance. For HTTP I went with Caddy, a reverse proxy load balancer written in Go that’s been used by the likes of Netflix and Gopher Academy. It’ll give you HTTP/2 and TLS (through Let’s Encrypt) with just a few config lines. For DNS I went with NSD, a lightweight authoritative server used by some TLDs and root servers. The recommendation here is to configure basic health checks that validate BGP, DNS, and HTTP, and that otherwise restart them in a logical order. Config management helps, and do harden your boxes.
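One way to sketch such a health check in Python is shown below. It is deliberately simple and rests on assumptions of mine: the service names (`nsd`, `caddy`, `bird`), the restart order (DNS and HTTP before the BGP daemon), and the use of a plain TCP connect as the probe are all illustrative, not the configuration actually used on this network.

```python
import socket
from typing import Dict, List

# Hypothetical restart order: bring the services back before the BGP daemon,
# so traffic is only attracted once something is ready to answer it.
RESTART_ORDER = ["nsd", "caddy", "bird"]


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def plan_restarts(health: Dict[str, bool]) -> List[str]:
    """Return the unhealthy services in the order they should be restarted.

    Services missing from the health report are treated as unhealthy.
    """
    return [name for name in RESTART_ORDER if not health.get(name, False)]
```

A cron job or systemd timer could call `tcp_port_open("127.0.0.1", 443)` for HTTP and `tcp_port_open("127.0.0.1", 53)` for DNS over TCP, feed the results into `plan_restarts`, and act on the returned list. A TCP connect only proves the port accepts connections; a stricter check would issue a real DNS query and HTTP request.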
6. Announce your /24 & /48 from one PoP
With your platform ready, it’s time to announce it to the world. You’ll need to pick a BGP client like exabgp, bird (what I used), or quagga and configure 4 announcements: /24 and /48 (your entire IPv4/IPv6 space), and /32 and /128 (the instance). The former will get shared with the world and the latter only to your local router.
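To see how the four announcements relate to each other, the sketch below derives them from an instance's addresses with Python's `ipaddress` module. It assumes your leased space is exactly the /24 and /48 that contain the instance addresses, and it uses documentation addresses; the function name is mine.

```python
import ipaddress
from typing import Dict, List


def plan_announcements(ipv4_address: str, ipv6_address: str) -> Dict[str, List[str]]:
    """Derive the covering prefixes and host routes for one instance."""
    v4 = ipaddress.ip_address(ipv4_address)
    v6 = ipaddress.ip_address(ipv6_address)
    return {
        # Shared with the world: the entire IPv4 and IPv6 space.
        "global": [
            str(ipaddress.ip_network(f"{v4}/24", strict=False)),
            str(ipaddress.ip_network(f"{v6}/48", strict=False)),
        ],
        # Shared only with the local router: the instance itself.
        "local": [
            str(ipaddress.ip_network(f"{v4}/32")),
            str(ipaddress.ip_network(f"{v6}/128")),
        ],
    }


if __name__ == "__main__":
    plan = plan_announcements("192.0.2.10", "2001:db8:0:1::10")
    for scope, prefixes in plan.items():
        print(scope, prefixes)
```

The host routes tell the local router which instance owns the anycast address, while the covering prefixes are what the rest of the Internet sees.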
Figure 6: Announcing my ASN with bird
If you’re on OpenVZ virtualization you may skip the /32 and /128 announcements due to how your IPs are bound. The `birdc show route` and `birdc show proto all` commands and BGP looking glasses are your friends. Vultr has a good bird guide. Once your BGP sessions are established and you are announcing your space to the world, move to the next step.
Figure 7: AS 49073 IPv4 Route Propagation
7. Announce from multiple PoPs
After you have one host working, replicate your configs to multiple PoPs. Once you’re announcing from a second site you’re officially anycasted.
You can quickly validate your performance by using any global ping or traceroute tool that measures latency from multiple locations.
Figure 8: Using a global ping tool
8. Correcting glaring routing issues
The network built so far is certainly not optimised and there will be routing woes. While anycast routing provides an elegant and simplified way of routing users based on routing policies and shortest AS path, it does have some pitfalls. BGP is not latency aware, not QoS aware, and not server load aware. Having administered a few large-scale anycast DNS platforms at prior companies, I have seen many cases of users being routed to a sub-optimal country or continent, server load asymmetry, and routing over degraded networks.
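The following toy model shows why this matters. It is a heavily simplified sketch: real BGP best-path selection has many more steps, and the PoP names, ASNs (documentation ASNs), and latencies are made up for illustration. The only rule modelled is "prefer the shortest AS path", with latency ignored entirely.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    pop: str                   # PoP that would serve the user
    as_path: Tuple[int, ...]   # ASNs between the user's network and the origin
    latency_ms: float          # measured latency; invisible to BGP


def bgp_choice(candidates: List[Candidate]) -> Candidate:
    """Pick the route with the shortest AS path, as a stand-in for BGP."""
    return min(candidates, key=lambda c: len(c.as_path))


def lowest_latency(candidates: List[Candidate]) -> Candidate:
    """Pick the route a latency-aware system would have chosen."""
    return min(candidates, key=lambda c: c.latency_ms)


if __name__ == "__main__":
    routes = [
        Candidate("Frankfurt", (64501, 64502, 64496), latency_ms=15.0),
        Candidate("Mumbai", (64500, 64496), latency_ms=220.0),
    ]
    print("BGP picks:", bgp_choice(routes).pop)
    print("Fastest is:", lowest_latency(routes).pop)
```

In this made-up case a European user lands on the Mumbai PoP simply because its upstream is one AS closer, even though Frankfurt would answer far sooner.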
For this guide I’ll cover what to spot check as a first pass to ensure that routing is at least somewhat sane. Because the Internet is constantly changing, a real anycast network needs to be continuously tuned.
Because your upstreams have varied levels of connectivity, they will attract traffic from varied types of networks. In my case, some remote nodes were so well connected that they were pulling traffic from Europe and North America. While there are many knobs, the three main tools in your belt are communities, prepending, and selective announcements. Since not all providers supported self-service communities, and prepending failed to help, I needed to work with my upstreams.
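Prepending can fail for several reasons. One common reason, which may or may not be what happened here, is that the receiving network sets a local preference, and local preference is evaluated before AS-path length. The sketch below models only those two steps of best-path selection; the names, ASNs, and preference values are illustrative assumptions.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Route:
    pop: str
    as_path: Tuple[int, ...]   # nearest AS first, origin AS last
    local_pref: int = 100      # set by the receiving network, not by you


def prepend(route: Route, origin_asn: int, times: int) -> Route:
    """Return a copy of the route with the origin ASN repeated `times` more times."""
    return replace(route, as_path=route.as_path + (origin_asn,) * times)


def select(routes: List[Route]) -> Route:
    """Prefer the highest local preference, then the shortest AS path."""
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))


if __name__ == "__main__":
    frankfurt = Route("Frankfurt", (64501, 64502, 64496))
    mumbai = Route("Mumbai", (64500, 64496))

    print(select([frankfurt, mumbai]).pop)                      # shorter path wins
    print(select([frankfurt, prepend(mumbai, 64496, 2)]).pop)   # prepending helps

    # If the receiving network prefers the Mumbai upstream, prepending is ignored.
    preferred = replace(prepend(mumbai, 64496, 2), local_pref=200)
    print(select([frankfurt, preferred]).pop)
```

When prepending has no effect, communities (which ask the upstream to change its own behaviour) or selective announcements are the remaining options, which is why working directly with the upstreams was necessary.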
Because my HTTP responses include a header with the PoP name, I could tag traffic in Catchpoint to show traffic patterns. The first half of the following graph shows global traffic tagged by the Mumbai PoP. After I asked my upstream to drop all its peers and announce only at the National Internet Exchange of India, the situation improved, as the second half of the graph shows.
Figure 9: Tagging traffic in Catchpoint
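If you don't have a monitoring service, you can do a cruder version of this tagging yourself. The sketch below tallies which PoP answered probes from each region, based on a response header. The header name `X-PoP` is a hypothetical stand-in (the actual header name is not given in this post), and the observations are assumed to have been collected already, for example from probes you run in different regions.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping, Tuple

POP_HEADER = "X-PoP"  # hypothetical header name carrying the PoP identifier


def tally_pops(
    observations: Iterable[Tuple[str, Mapping[str, str]]]
) -> Dict[str, Counter]:
    """Count, per probe region, how often each PoP served the response.

    Each observation is (probe_region, response_headers).
    """
    tally: Dict[str, Counter] = {}
    for region, headers in observations:
        # HTTP header names are case-insensitive, so normalise before lookup.
        normalised = {name.lower(): value for name, value in headers.items()}
        pop = normalised.get(POP_HEADER.lower(), "unknown")
        tally.setdefault(region, Counter())[pop] += 1
    return tally


if __name__ == "__main__":
    sample = [
        ("Europe", {"X-PoP": "Mumbai"}),
        ("Europe", {"x-pop": "Frankfurt"}),
        ("Asia", {"X-PoP": "Mumbai"}),
    ]
    for region, counts in tally_pops(sample).items():
        print(region, dict(counts))
```

A European region showing a large share of responses from a distant PoP is exactly the kind of pattern worth raising with the upstream.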
HostUs, my Hong Kong provider, also worked closely with me to tune the communities.
9. GeoDNS to the rescue
Host1Plus’s Sao Paulo and Johannesburg nodes provided a couple of challenges. First, my Sao Paulo BGP announcements were pulling traffic from North America. Second, the Johannesburg upstream delayed provisioning my BGP session for weeks. Because Host1Plus didn’t support self-service communities and couldn’t assist me in setting communities either, I decided to improvise. The plan was now to use anycast globally and unicast strategically. Because I was using a 3rd level domain, I was able to use my 2nd level DNS service, Route53, to advertise the unicast IPs of the Sao Paulo and Johannesburg nodes in those regions. Using GeoDNS for DNS name server A records is not common; however, it is totally valid, practical for 3rd level domains, and used by at least one of the top 5 Alexa sites.
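The "anycast globally, unicast strategically" logic boils down to a lookup with a default. The sketch below models that decision in plain Python; it is not Route53 configuration, and the region labels and documentation IP addresses are placeholders of mine.

```python
# Placeholder addresses from the documentation ranges.
ANYCAST_IP = "192.0.2.1"

# Regions where the resolver hands out a unicast address instead of anycast.
UNICAST_OVERRIDES = {
    "South America": "198.51.100.10",  # stands in for the Sao Paulo node
    "Africa": "203.0.113.10",          # stands in for the Johannesburg node
}


def answer_for(region: str) -> str:
    """Return the A record a GeoDNS service would serve to a client region."""
    return UNICAST_OVERRIDES.get(region, ANYCAST_IP)


if __name__ == "__main__":
    for region in ("Europe", "South America", "Africa"):
        print(region, "->", answer_for(region))
```

Everyone outside the overridden regions still gets the anycast address, so BGP keeps doing the routing wherever it behaves well.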
Hopefully someone reading this can share a variety of VPS vendors that have BGP community support, especially in emerging markets.
This is the final footprint of the network:
In total, how much will this project set you back?
You can spend zero or as much as you want.
- $0 : Use free IPv6 space and Vultr’s ASN + self-labeled "free VPS hosting"
- $65+ a month: Lease your IPv4 and IPv6 and spin up anywhere from 2 to 15 Vultr PoPs. In my case the IPv4+IPv6 lease cost is $55 a month, and Vultr costs $5 per PoP.
- $190 a month + one time BGP session fees: The network described in this guide which includes IPv4+IPv6 lease cost of $55 a month and another $135 a month for a server in each of the 20 locations.
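The arithmetic behind the paid tiers above can be checked with a few lines of Python. The figures come straight from the list; the function and constant names are mine.

```python
LEASE_PER_MONTH = 55      # IPv4 + IPv6 lease, USD per month
VULTR_PER_POP = 5         # Vultr VPS with BGP, USD per PoP per month
FULL_NETWORK_SERVERS = 135  # all 20 servers, USD per month
FULL_NETWORK_POPS = 20


def vultr_only_cost(pops: int) -> int:
    """Monthly cost of leasing space and running `pops` Vultr PoPs."""
    return LEASE_PER_MONTH + VULTR_PER_POP * pops


def full_network_cost() -> int:
    """Monthly cost of the 20-PoP network, excluding one-time BGP session fees."""
    return LEASE_PER_MONTH + FULL_NETWORK_SERVERS


if __name__ == "__main__":
    print(vultr_only_cost(2))    # smallest anycast setup: 65
    print(vultr_only_cost(15))   # every Vultr location: 130
    print(full_network_cost())   # the network in this guide: 190
    print(FULL_NETWORK_SERVERS / FULL_NETWORK_POPS)  # average per server: 6.75
```

So the middle tier ranges from $65 to $130 a month depending on how many Vultr locations you use, and the full network averages $6.75 per server.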
If you have reservations about spending money, when you consider how much some networking classes/labs cost, you may have the justification you need. If you’re more comfortable building in a laboratory environment rather than the real Internet, you can somewhat follow along using the DN42 network. DN42 is a private Internet operated by enthusiasts and is a safe sandbox.
There are many things to test, including peering, traffic steering, route dampening, and BGP convergence. You could try these yourself, or you could just wait... Because yes, I just gave you a preview of my next posts.
If you have tips, ideas, feedback, or questions, then please participate in the comment dialogue.