IPv6 Internet Pollution
The traditional view of Internet pollution has been that it consists primarily of worm scanning and DDoS backscatter. However, some of our recent analysis of IPv4 darknets has shown that Internet pollution is much broader in scope.
It can consist of traffic that results from mis-configurations, topology mapping scans, software bugs, bad default settings, routing instability, and even Internet censorship. Based on this background, we sought to better understand Internet pollution in IPv6. There had been some prior work by Sandia Labs and APNIC in this area. Our goal was to perform a significantly larger study, one that would cover multiple RIR regions, use the largest amount of IPv6 space we could advertise, and last as long as possible, long enough to detect any emerging trends.
As networks attempt to enable an increasing number of IPv6 applications and services, it is highly likely that inexperience, software configuration differences, or even software or hardware bugs will result in errors that can lead to Internet pollution. Observing such undesired traffic can provide valuable insight to help navigate the rocky start of the new technology. In addition, by observing IPv6 background radiation traffic, we can watch for the emergence of malicious activity, such as scanning and worms, via the new protocol. Identifying these issues early in the adoption process minimizes the cost of fixing them and can provide a best-practice template for future IPv6 network administrators.
The IPv6 darknet experiment is set up in a similar manner to previous experiments with IPv4 address space. We first obtain Letters of Authority (LoA) from the Regional Internet Registries (RIRs) for the prefixes to be studied. Next we work with our upstream Internet providers to make sure they accept our announcements of these prefixes. All the resulting data for these prefixes that flows towards our routers is collected and archived for analysis. The significant difference in this scenario is that, whereas in IPv4 we were able to announce address space prior to any sub-allocations, for the IPv6 experiment we announce a covering /12 prefix for each RIR, which is the largest single block of address space that each of them has been allocated. Because routers prefer more-specific routes, traffic to sub-prefixes that are already allocated and announced still reaches its intended destination, and only traffic with no more-specific route flows to our collectors. We announced the following prefixes:
- AfriNIC (2c00::/12)
- APNIC (2400::/12)
- ARIN (2600::/12)
- LACNIC (2800::/12)
- RIPE NCC (2a00::/12, modified to 2a08::/13+2a04::/14 after 2 days)
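As a rough illustration of how captured traffic might be attributed to a region, the sketch below maps a destination address to the RIR whose covering /12 contains it, using Python's standard `ipaddress` module. The `RIR_PREFIXES` table and the `rir_for` helper are hypothetical names introduced for this example, not part of the experiment's tooling. For simplicity the RIPE NCC entry uses the original /12 rather than the later /13 and /14 announcements.

```python
import ipaddress

# Covering prefixes listed above (RIPE NCC shown as the original /12).
RIR_PREFIXES = {
    "AfriNIC": ipaddress.ip_network("2c00::/12"),
    "APNIC": ipaddress.ip_network("2400::/12"),
    "ARIN": ipaddress.ip_network("2600::/12"),
    "LACNIC": ipaddress.ip_network("2800::/12"),
    "RIPE NCC": ipaddress.ip_network("2a00::/12"),
}


def rir_for(address):
    """Return the RIR whose covering /12 contains `address`, or None."""
    addr = ipaddress.ip_address(address)
    for name, network in RIR_PREFIXES.items():
        if addr in network:
            return name
    return None


# The reduced RIPE NCC announcements both fall inside the original /12.
REDUCED_RIPE = [
    ipaddress.ip_network("2a08::/13"),
    ipaddress.ip_network("2a04::/14"),
]
RIPE_REDUCED_IS_SUBSET = all(
    n.subnet_of(RIR_PREFIXES["RIPE NCC"]) for n in REDUCED_RIPE
)
```

The subset check simply confirms that the two smaller RIPE NCC prefixes cover part of the same address range as the original /12.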
Data Analysis and Results
Figures 2 and 3 above show the traffic rates we observed for each RIR dataset in a one-week sample of our collected data. The traffic varies considerably across RIRs and is highest for the ARIN and LACNIC regions. In particular, we see significant spikes in the ARIN dataset that peak at 1 Mbps. The composition of the traffic also varies greatly from one region to another. In the one-week sample shown here, ICMP dominates in the LACNIC region, while UDP appears to dominate in the ARIN and APNIC regions. TCP traffic is in all cases a small portion of the overall traffic. On the other hand, an earlier three-month data sample that we analyzed shows TCP dominating the ARIN data. We intend to report on a longitudinal view of these data in an upcoming publication.
While we find no evidence of worm activity, we are able to detect limited amounts of scanning that appear to be directed at small subsets of particular network prefixes. Some of this is perhaps related to the topology discovery and mapping performed by large CDN operations.
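A per-region protocol breakdown of this kind could be computed along the following lines. This is a minimal sketch, not the analysis pipeline used for the figures: it assumes each packet has already been reduced to a hypothetical `(rir, next_header, byte_count)` record, and it looks only at the first IPv6 Next Header value, ignoring extension header chains. The protocol numbers (6 for TCP, 17 for UDP, 58 for ICMPv6) are the standard IANA assignments.

```python
from collections import defaultdict

# IANA protocol numbers as they appear in the IPv6 Next Header field.
PROTOCOL_NAMES = {6: "TCP", 17: "UDP", 58: "ICMPv6"}


def protocol_shares(records):
    """Return {rir: {protocol: fraction of that region's bytes}}.

    `records` is an iterable of (rir, next_header, byte_count) tuples;
    this record layout is an assumption made for the example.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for rir, next_header, byte_count in records:
        protocol = PROTOCOL_NAMES.get(next_header, "other")
        totals[rir][protocol] += byte_count

    shares = {}
    for rir, by_protocol in totals.items():
        region_total = sum(by_protocol.values())
        if region_total == 0:
            shares[rir] = {}
            continue
        shares[rir] = {
            protocol: count / region_total
            for protocol, count in by_protocol.items()
        }
    return shares
```

Computing shares per region rather than globally keeps a high-volume region from masking the composition of a quieter one.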
In one of several interesting case studies we examined, we were able to observe link-local source addresses in our dataset indicating the lack of proper filtering at network edges to prevent such traffic from leaking into the Internet. A large portion of the observed traffic can be characterized as DNS requests and replies. Interestingly, we also observe BGP, NTP and SMTP traffic in our datasets.
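Leaked link-local sources are straightforward to flag, since link-local unicast addresses fall in fe80::/10. The sketch below uses the `is_link_local` property from Python's standard `ipaddress` module. The helper name `link_local_sources` and the idea of passing in a plain list of source address strings are assumptions made for illustration.

```python
import ipaddress


def link_local_sources(sources):
    """Return the distinct link-local (fe80::/10) addresses in `sources`.

    `sources` is an iterable of IPv6 source addresses as strings.
    """
    leaked = set()
    for source in sources:
        addr = ipaddress.ip_address(source)
        if addr.is_link_local:
            leaked.add(addr)
    return leaked
```

Any address this returns should never have crossed a router, so each one points at a network edge that is missing a source-address filter.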
We have only begun the process of combing through the data to get a better understanding of IPv6 background radiation. So far our high-level analysis has revealed that, like IPv4 Internet pollution, IPv6 Internet pollution is highly unpredictable and varied. Over 90% of the pollution is directed at fewer than 100 specific destinations, and over 90% of IPv6 Internet pollution is sourced from fewer than 1,000 unique sources. We find interesting instances of pollution that can be characterized as DNS, BGP, NTP, SMTP, HTTP and ICMP traffic.
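Concentration figures like these can be derived by ranking endpoints by traffic volume and counting how many are needed to reach a given share of the total. The following is a minimal sketch of that calculation. The function name `endpoints_covering` and the input mapping of address to packet or byte count are assumptions for illustration, not the method used to produce the numbers above.

```python
from collections import Counter


def endpoints_covering(counts, fraction=0.9):
    """Return the smallest number of top endpoints whose combined
    traffic reaches `fraction` of the total.

    `counts` maps an endpoint (source or destination address) to its
    packet or byte count.
    """
    total = sum(counts.values())
    if total == 0:
        return 0
    running = 0
    ranked = Counter(counts).most_common()  # heaviest endpoints first
    for rank, (_, count) in enumerate(ranked, start=1):
        running += count
        if running >= fraction * total:
            return rank
    return len(ranked)
```

Running this once over per-destination counts and once over per-source counts gives the two figures that such a summary compares.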
Perhaps more interestingly, we find evidence that shows the highly unstable nature of the average prefix in the IPv6 routing table. We are also able to show that a significant amount of pollution traffic has a high degree of locality of reference to existing prefixes in the routing table. We also believe that using a covering prefix to detect Internet pollution is an important mechanism for detecting potential mis-configurations, instability and other issues as an increasing number of networks introduce IPv6 based services.
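One possible way to examine how pollution relates to existing prefixes is to compare each pollution destination against a snapshot of the routing table and find the most specific routed prefix that contains it. The sketch below assumes that interpretation. The `longest_match` helper and the routing-table snapshot as a list of prefix strings are hypothetical, and the linear scan is chosen for clarity rather than speed.

```python
import ipaddress


def longest_match(destination, routed_prefixes):
    """Return the most specific prefix in `routed_prefixes` that
    contains `destination`, or None if no prefix contains it.

    `routed_prefixes` is an iterable of prefix strings taken from a
    routing-table snapshot (an assumed input format).
    """
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix in routed_prefixes:
        network = ipaddress.ip_network(prefix)
        if addr in network:
            if best is None or network.prefixlen > best.prefixlen:
                best = network
    return best
```

A destination that matches a prefix present in the snapshot, yet still arrived at the covering /12, suggests that the more-specific route was withdrawn or unreachable when the packet was in flight. For a full routing table, a radix trie would be a more practical data structure than this linear scan.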
We are continuing to monitor some of the /12 prefixes with the goal of observing long term trends. In the future we would like to particularly consider the following additional areas to enhance our research:
- Improved co-ordination with various network operations groups – Originally we did not widely publicize our experiment, as we did not want to risk contamination of our data samples through intentional tampering by malicious actors. However, we have not observed any such activity, and we therefore believe that it is possible to relax this concern and better co-ordinate future activities, particularly the beaconing described below, on public forums.
- Summary reports based on our analysis – It is possible to generate summary reports in some cases where specific networks are seen as the sources of traffic of various types in our pollution dataset. This feedback can help the network operations research community identify issues.
- IPv6 routing beacons – We would like to particularly focus our future research activities on implementing IPv6 routing beacons which would give us a better opportunity to understand the dynamics of traffic in flight during periods of routing instability.
 Internet Pollution – Part 2, Scot Walls and Manish Karir, NANOG51, Miami, February 2011, http://www.merit.edu/research/pdf/2011/Internet-Pollution-Part2.pdf
 1.0.0.0/8, Manish Karir, Eric Wustrow, George Michaelson, Geoff Huston, Michael Bailey, Farnam Jahanian, NANOG 49, San Francisco, June 2010, http://www.merit.edu/research/pdf/2010/karir_1slash8.pdf
 IPv6 Background Radiation, Geoff Huston, First International Workshop on Darkspace and Unsolicited Traffic Analysis (DUST), 201