Danny Lachos

Beyond the Network View: DNS-Driven Application Visibility

Contributors: Ingmar Poese

Network operators often lack visibility into which applications drive their traffic. We present an open-source DNS-based correlation method that enriches NetFlow and BGP data with application and CDN information. This enables a shift from a network-centric to an application-oriented view of traffic.


Internet-based applications and data-intensive science applications increasingly rely on specialised network providers for high-quality connectivity. However, identifying how these applications generate and transmit traffic flows remains a significant challenge for network operators.

A purely network view is no longer enough

Over-The-Top applications (OTT-Apps), including data-intensive science applications (e.g., the Large Hadron Collider, telescopes, light sources) and Internet-based applications (e.g., video, gaming, social networks), rely on specialised network providers for high-quality connectivity to distribute their traffic globally.

To work efficiently, network providers (Research & Education (R&E) and/or commercial) need to understand how applications create network traffic flows and how these flows behave. Historically, however, they have focused on obtaining information only about Autonomous Systems (ASes), transit providers, and peers. This network-oriented approach does not provide visibility into the traffic patterns of a given OTT-App, nor does it allow its network usage to be estimated.

A telling example of why an application-oriented view matters is the global CrowdStrike outage in 2024. While the underlying network infrastructure often remained technically operational, critical applications worldwide (including those for airlines, banking, and healthcare) were partially or completely shut down. Many people remember the massive failures of those services on 19 July 2024; no one remembers whether a specific AS reported a "down" status.

In this context, we propose and develop an open-source methodology that covers the analysis, design, and implementation of a large-scale, real-time network data correlation system. The system uses several data sources (e.g., NetFlow, BGP) but feeds mainly on DNS streams to generate new, extended multi-dimensional traffic information. As a result, a purely network-centric view can be expanded to an application-oriented view that efficiently identifies the OTT-Apps delivering traffic to a network through different CDN domains.

Towards an application-oriented approach: architecture and workflow

The high-level architecture and entire workflow rely on two components:

1. DNS-NetFlow Correlation. The output of this component is extended and correlated data: NetFlow records together with a list of Hostnames/URLs representing a DNS domain name resolution. The sequence of events is:

  • (1.1) DNS Classification: DNS records are categorised into an A/AAAA list (used to map an IP address back to the name that resolved to it) and a CNAME list (mapping name to name).
  • (1.2) NetFlow Capture: In parallel, ingress interfaces capture records containing timestamps, source/destination IPs, bytes, etc.
  • (1.3) Iterative Lookup: The getName(IP) function finds the domain name for a given source IP; the CNAME list is then traversed (using getName(Name)) until the final domain is reached or a pre-defined loop limit is hit.
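
The iterative lookup in step (1.3) can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names, the in-memory dictionaries, the direction in which the CNAME list is indexed (canonical name to alias), and the loop limit of 10 are all assumptions made for the example.

```python
from typing import Dict, List

MAX_CNAME_HOPS = 10  # assumed loop limit; the article does not state a value


def get_name(key: str, table: Dict[str, str]) -> str:
    """Return the name stored for an IP or a name, or '' if none is known."""
    return table.get(key, "")


def resolve_chain(
    src_ip: str,
    a_list: Dict[str, str],       # assumed layout: IP -> name from A/AAAA answers
    cname_list: Dict[str, str],   # assumed layout: canonical name -> alias
    max_hops: int = MAX_CNAME_HOPS,
) -> List[str]:
    """Build the list of hostnames behind a flow's source IP."""
    name = get_name(src_ip, a_list)  # getName(IP)
    if not name:
        return []  # no DNS answer seen for this IP
    chain = [name]
    for _ in range(max_hops):
        next_name = get_name(chain[-1], cname_list)  # getName(Name)
        # Stop at the end of the chain or when a CNAME loop is detected.
        if not next_name or next_name in chain:
            break
        chain.append(next_name)
    return chain


def correlate(flow: dict, a_list: Dict[str, str], cname_list: Dict[str, str]) -> dict:
    """Return a copy of the flow record extended with its hostname list."""
    return {**flow, "hostnames": resolve_chain(flow["src_ip"], a_list, cname_list)}
```

Calling `correlate({"src_ip": "203.0.113.10", "bytes": 1500}, a_list, cname_list)` returns the original flow fields plus a `hostnames` list. The list is empty when no DNS answer was seen for that source IP.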

2. CDN-APP Classification. The output of this component, which is also the final output, extends the traffic flows with CDN domain and OTT-App information (including BGP):

  • (2.1) BGP Correlation: Data is correlated with BGP to identify traffic paths, including source, handover, nexthop, and destination ASes.
  • (2.2) CDN Identification: The getCDN() function extracts the second-level (2LD) and top-level domain (TLD) from the initial Hostname/URL, utilising the Public Suffix List (PSL) database published by Mozilla.
  • (2.3) App Association: This second lookup goes through the list of domain names to obtain an OTT-App. The getAPP() function uses a URL-APP database to associate a specific domain name or URL with the OTT-App it belongs to (e.g., dssott.com belongs to Disney+, pv-cdn.net to Amazon Prime). The URL-APP database is a curated list that continually evolves as new sources are discovered.
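
Steps (2.2) and (2.3) can be illustrated with a short sketch. It makes several assumptions: the function names are hypothetical, the tiny `PUBLIC_SUFFIXES` set stands in for the full Public Suffix List (whose wildcard and exception rules are ignored here), the sample hostnames are made up, and `URL_APP_DB` contains only the two mappings quoted in the article.

```python
from typing import Dict, List, Optional, Set

# Tiny stand-in for the Public Suffix List; a real system would load the full list.
PUBLIC_SUFFIXES: Set[str] = {"com", "net", "org", "co.uk"}

# Hypothetical excerpt of a URL-APP database (the two examples from the article).
URL_APP_DB: Dict[str, str] = {
    "dssott.com": "Disney+",
    "pv-cdn.net": "Amazon Prime",
}


def registrable_domain(hostname: str, suffixes: Set[str] = PUBLIC_SUFFIXES) -> Optional[str]:
    """Return the public suffix plus one label (e.g. 2LD + TLD), or None."""
    labels = hostname.lower().rstrip(".").split(".")
    # Candidates are tried from longest to shortest, so the longest suffix wins.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            if i == 0:
                return None  # the hostname itself is a public suffix
            return ".".join(labels[i - 1:])
    return None  # no known public suffix


def get_cdn(hostnames: List[str]) -> Optional[str]:
    """CDN domain taken from the initial hostname of the resolution list."""
    return registrable_domain(hostnames[0]) if hostnames else None


def get_app(hostnames: List[str], db: Dict[str, str] = URL_APP_DB) -> Optional[str]:
    """Return the first OTT-App matched by any hostname in the list."""
    for name in hostnames:
        domain = registrable_domain(name)
        app = db.get(name) or (db.get(domain) if domain else None)
        if app:
            return app
    return None
```

With the made-up list `["a1.w10.akamai.net", "video.dssott.com"]`, `get_cdn` yields `akamai.net` and `get_app` yields `Disney+`. Matching on the registrable domain keeps the database small, at the cost of missing applications that share a domain with other services.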

From a network view to an application-oriented view

The shift from a network-centric to an application-oriented view is essential because the modern Internet is defined by services, not just infrastructure. Traditional network-focused solutions, such as legacy flow tools or Deep Packet Inspection (DPI), are significantly constrained in their ability to associate traffic data with specific applications. DPI is also more invasive, becomes increasingly ineffective as more traffic is encrypted, and demands substantial hardware resources, particularly at large scale.

To address these limitations, our methodology shifts the focus from inspecting traffic payloads to passive DNS-based correlation. By leveraging DNS signals, we can identify application flows in a much less invasive way, without requiring payload information, and with high computational efficiency. This approach bridges the gap between raw network metrics and the actual services being delivered, enriching purely network-level data with useful application-level information.

Several benefits can be obtained by adding an application-oriented perspective to the network view:

  • Improved troubleshooting: Respond to user complaints more effectively, as these issues are typically related to specific applications rather than abstract IP addresses or ASes.
  • Operational intelligence: Gain a deeper understanding of what drives traffic, such as sporting events like Champions League games or Formula 1 races.
  • Resource management and optimisation: Knowing which applications drive traffic growth provides the information needed for data-driven coordination with service stakeholders.
  • Usage trends: Analyse changing use patterns to improve future network planning.

To visualise this transition, the following figure provides a multi-dimensional dashboard that connects these two worlds:

  • The Application-oriented view (left side): This section uncovers the service-layer details, identifying which OTT-Apps (e.g., ByteDance, TikTok) are generating traffic and which CDN Domains (e.g., akamai.net, fastly.net) are being used to deliver that content.
  • The Network-oriented view (right side): This section maintains the traditional infrastructure perspective, mapping those application flows through the sequence of Source, Handover, Nexthop, and Destination ASes.
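
A dashboard of this kind boils down to summing traffic volume over a chosen set of dimensions. The sketch below is one possible way to do that, assuming each enriched flow is a dictionary; the field names (`app`, `cdn`, `src_as`, `dst_as`, `bytes`) and the function name are hypothetical, and the AS numbers in the usage note come from the range reserved for documentation.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence, Tuple


def aggregate_bytes(flows: Iterable[Dict], dimensions: Sequence[str]) -> Counter:
    """Sum the 'bytes' field of enriched flows, grouped by the given dimensions."""
    totals: Counter = Counter()
    for flow in flows:
        # Missing dimensions become None so incomplete flows are still counted.
        key: Tuple = tuple(flow.get(dim) for dim in dimensions)
        totals[key] += flow["bytes"]
    return totals
```

Given flows such as `{"app": "Disney+", "cdn": "akamai.net", "src_as": 64496, "dst_as": 64511, "bytes": 1000}`, grouping by `("app", "cdn")` gives the application-oriented totals, while grouping by `("src_as", "dst_as")` gives the network-oriented ones over the same records.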

Conclusion

The transition from a network-centric to an application-oriented perspective is fundamental for analysing network traffic and Internet-scale events. By correlating DNS streams with Netflow and BGP data, network providers can look beyond abstract ASes or IP addresses. This level of visibility into OTT-Apps and CDN domains is essential for effective troubleshooting, informed resource management, and a better understanding of the service-driven trends that define today's Internet traffic.


More research/technical details are available in our:

About the author

Danny Lachos, based in Berlin, Germany

Danny Lachos is a Senior Network Engineer at BENOCS. He received his Ph.D. and M.Sc. degrees in Computer Engineering from the University of Campinas (UNICAMP), Brazil, in 2021 and 2016, respectively. Danny has a particular interest in new and flexible network and application integration mechanisms in the context of multi-domain environments.
