Sampling Evolution

By Dmitry Shokarev posted 10-16-2022 00:00

  

Are flow caches still effective in today's IPFIX implementations? What happens if we stop using them? Learn about the new IPFIX implementation in Juniper PTX routers.

Introduction

NetFlow was initially designed as a flow monitoring technique in which a network element monitors IP flows, aggregates statistics over multiple packets of the same flow, and periodically exports the flow statistics to a collector. Flow statistics typically include the information that identifies a flow (a combination of IP header fields) and metadata, such as the incoming and outgoing interfaces of the packet. The modern version of the protocol is standardized by the IETF as IPFIX (RFC 7011, which obsoletes RFC 5101). The protocol is extensible and allows arbitrary information to be reported from the network element. IPFIX applications extend beyond traditional routing applications: IPFIX is also used for Carrier Grade NAT session logging and for exporting arbitrary MIB statistics.

However, the main IPFIX application is export of flows, and IPFIX is strongly associated with flow aggregation on the router itself.

The goal of this article is to demonstrate that, for many practical core and peering applications, there is little to no benefit in using flow aggregation on the router anymore. Moreover, there are applications where flow aggregation is undesirable because of the extra latency it introduces.

Flow Export Implementation

A typical IPFIX flow export implementation consists of two processes, sampling and export, plus a flow table where millions of flows reside temporarily.

Figure 1. Sampling, Aggregation and Export.

When a packet is sampled, the sampling process extracts the packet header fields that form the flow key and consults the flow table. On a miss, a new entry is added, and its flow key and other fields are initialized. On a hit, the packet and byte counters of the existing entry are incremented.

The export process periodically walks the table and exports flows that had no activity for the specified duration (inactive flow timeout), or active flows that have not been exported for a long time (active flow timeout).
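The interaction between the two processes and the flow table can be sketched in a few lines of Python. This is only an illustrative model, not the router implementation: the class and field names (`FlowTable`, `FlowEntry`, `sample`, `export`) are hypothetical, time is passed in explicitly, and an exported flow is simply removed from the table, so an active flow re-enters as a new entry on its next sampled packet.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    # Counters kept per flow (hypothetical field names).
    packets: int
    bytes: int
    first_seen: float
    last_seen: float


class FlowTable:
    """Toy flow cache: flow key -> counters, with inactive/active timeouts."""

    def __init__(self, inactive_timeout, active_timeout):
        self.inactive_timeout = inactive_timeout
        self.active_timeout = active_timeout
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def sample(self, key, length, now):
        """Sampling process: look up the key, then create or update the entry."""
        entry = self.entries.get(key)
        if entry is None:
            # Miss: initialize a new entry from the sampled packet.
            self.misses += 1
            self.entries[key] = FlowEntry(1, length, now, now)
        else:
            # Hit: increment the packet and byte counters.
            self.hits += 1
            entry.packets += 1
            entry.bytes += length
            entry.last_seen = now

    def export(self, now):
        """Export process: walk the table and return the flows that timed out."""
        expired = [
            (key, entry)
            for key, entry in self.entries.items()
            if now - entry.last_seen >= self.inactive_timeout
            or now - entry.first_seen >= self.active_timeout
        ]
        # Simplification: exported entries are deleted, so first_seen also
        # marks the time of the last export for a long-lived flow.
        for key, _ in expired:
            del self.entries[key]
        return expired


table = FlowTable(inactive_timeout=15, active_timeout=60)
key = ("192.0.2.1", "198.51.100.7", 6, 49152, 443)  # example 5-tuple
table.sample(key, 1500, now=0.0)   # miss: new entry
table.sample(key, 400, now=5.0)    # hit: counters incremented
records = table.export(now=30.0)   # idle for 25s, exported as inactive
```

The `hits` and `misses` counters are kept because the ratio between them is the measure of flow table effectiveness discussed in the rest of this article.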

The sampling and export processes may be implemented in the ASIC or in software, and the flow table may likewise reside in the ASIC or in software. Hybrid models are possible too. Regardless of the implementation, how effective is flow aggregation?

The answer depends on the sampling rate and the timeout configuration. Figure 2 shows statistics collected from a production network serving Internet traffic. With a 1:4096 sampling rate, a 60s active flow timeout and a 15s inactive flow timeout, 90.7661% of flow records contain only one packet.

Figure 2. Distribution of number of packets in an exported flow.

To measure the effectiveness of the flow table, we need to measure the proportion of packets that hit existing flow table entries. The formula to compute this number is rather simple and is shown in the plot. In this case, only 16.56% of packets can leverage the flow aggregation capability of the device.

Junos reports both total packet and total flow statistics today; they can be collected by running the command below.

user@router> show services accounting flow inline-jflow fpc-slot 0
  Flow information
    FPC Slot: 0
    Flow Packets: 2525882835, Flow Bytes: 1759044202265
    Active Flows: 16602, Total Flows: 2045424663
    Flows Exported: 2042540070, Flow Packets Exported: 776552744
    Flows Inactive Timed Out: 2037891816, Flows Active Timed Out: 7516247
 
    IPv4 Flows:
    IPv4 Flow Packets: 2464206449, IPv4 Flow Bytes: 1714169514735
    IPv4 Active Flows: 16109, IPv4 Total Flows: 1996750136
    IPv4 Flows Exported: 1993944138, IPv4 Flow Packets exported: 730194911
    IPv4 Flows Inactive Timed Out: 1989433298, IPv4 Flows Active Timed Out: 7300731
 
    IPv6 Flows:
    IPv6 Flow Packets: 61676386, IPv6 Flow Bytes: 44874687530
    IPv6 Active Flows: 493, IPv6 Total Flows: 48674527
    IPv6 Flows Exported: 48595932, IPv6 Flow Packets Exported: 46357833
    IPv6 Flows Inactive Timed Out: 48458518, IPv6 Flows Active Timed Out: 215516
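As a rough illustration, the counters above can be turned into a hit ratio. The sketch below assumes that every flow record begins with exactly one miss, so that hits equal sampled packets minus flows created; this is one plausible reading of "the proportion of packets that hit known flow table entries" and is not necessarily the exact formula shown in Figure 2. The function name is hypothetical, and the mapping to the "Flow Packets" and "Total Flows" counters is likewise an assumption.

```python
def flow_table_hit_ratio(flow_packets: int, total_flows: int) -> float:
    """Fraction of sampled packets that updated an existing flow entry.

    Assumption: the first sampled packet of each flow is a miss and every
    later packet of that flow is a hit, so hits = packets - flows.
    """
    if flow_packets <= 0:
        raise ValueError("flow_packets must be positive")
    if not 0 <= total_flows <= flow_packets:
        raise ValueError("total_flows must be between 0 and flow_packets")
    return (flow_packets - total_flows) / flow_packets


# "Flow Packets" and "Total Flows" from the sample output above.
ratio = flow_table_hit_ratio(flow_packets=2_525_882_835,
                             total_flows=2_045_424_663)
print(f"Flow table hit ratio: {ratio:.2%}")
```

For the sample output this gives roughly 19%. That need not match the percentage quoted for Figure 2, since the command output may have been captured on a different device or over a different period.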

We analyzed several deployments; the results are presented in Table 1.

ID   Rate (1:N)   Inactive Timeout (s)   Active Timeout (s)   Flow Table Hit Ratio
1    100          15                     600                  80.01%
2    500          60                     60                   63.41%
3    1000         15                     60                   46.13%
4    1000         15                     60                   55.86%
5    1000         60                     60                   35.38%
6    1024         60                     1800                 16.40%
7    2000         60                     1800                 1.74%
8    4000         60                     1800                 29.43%
9    4096         60                     1800                 17.66%
10   8192         60                     120                  13.55%
11   10000        60                     1800                 16.39%
12   32767        60                     1800                 3.54%

Table 1: Flow Table Hit Ratio

As the data suggests, the flow table hit ratio tends to fall as the sampling rate (N in 1:N) grows, which means that the percentage of flows with a single packet increases. There is a theoretical explanation for this.

Figure 3 shows a theoretical analysis of the single-packet flow percentage as a function of the sampling rate (from 16 to 8192) for different flow lengths (from 16 to 4096). See the Appendix for details on how these plots are produced.

Figure 3. Percentage of flows with only one packet for different sampling rates and flow lengths.

Studies [1] and [2] in the References section show that more than 99% of flows have fewer than 200 packets. Hence, at reasonable sampling rates, most of the time only a single packet of a flow is sampled.
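The same calculation can be reproduced for a single data point without R. The minimal sketch below assumes, like the Appendix script, that each packet of a flow is sampled independently with probability 1/rate, and it conditions on the flow being seen at all (at least one packet sampled). The function name is hypothetical.

```python
def p_single_packet_record(flow_length: int, rate: int) -> float:
    """P(exactly one packet sampled | at least one packet sampled).

    Binomial model: flow_length independent trials, each sampled with
    probability 1/rate.
    """
    if flow_length < 1 or rate < 1:
        raise ValueError("flow_length and rate must be >= 1")
    p = 1.0 / rate
    p_zero = (1.0 - p) ** flow_length                         # flow never seen
    p_once = flow_length * p * (1.0 - p) ** (flow_length - 1)  # seen exactly once
    return p_once / (1.0 - p_zero)


# A 200-packet flow at 1:4096 sampling, and the same flow at 1:100.
for rate in (100, 4096):
    share = p_single_packet_record(flow_length=200, rate=rate)
    print(f"rate 1:{rate}: {share:.1%} of records carry a single packet")
```

For a fixed flow length, the result approaches 100% as the rate grows, which is the trend visible in each panel of Figure 3.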

If only a single packet is sampled, then maintaining a flow aggregation table offers no benefit, and the implementation can be simplified.

In a simplified implementation, the flow table with millions of entries is replaced with a small circular buffer with hundreds of entries, as shown in Figure 4.

The sampling process adds new entries to the buffer, and the export process gathers them, creates flow reports, and sends them to the collector.

Figure 4. Simplified sampling implementation.
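The simplified design can be modeled with a bounded ring that the sampling process fills and the export process drains. The sketch below is an assumption-laden illustration rather than the PTX implementation: the class name, the record layout, and the overflow policy (overwrite the oldest entry and count it as dropped) are all hypothetical.

```python
from collections import deque


class SampleRing:
    """Small circular buffer of per-packet records; no flow table, no timeouts."""

    def __init__(self, capacity: int):
        # deque(maxlen=N) discards the oldest item when a new one is appended
        # to a full buffer, which gives circular-buffer behavior.
        self._ring = deque(maxlen=capacity)
        self.dropped = 0

    def push(self, record: dict) -> None:
        """Sampling process: store one record per sampled packet."""
        if len(self._ring) == self._ring.maxlen:
            self.dropped += 1  # assumed policy: the oldest record is overwritten
        self._ring.append(record)

    def drain(self) -> list:
        """Export process: take everything queued so far, oldest first."""
        records = list(self._ring)
        self._ring.clear()
        return records


ring = SampleRing(capacity=256)
ring.push({"src": "192.0.2.1", "dst": "198.51.100.7", "bytes": 1500})
ring.push({"src": "192.0.2.9", "dst": "203.0.113.4", "bytes": 64})
batch = ring.drain()  # each record becomes a one-packet flow report
```

Because every record is exported as soon as the export process runs, there is no per-flow state to look up, age out, or walk.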

Conclusion

A simplified IPFIX sampling implementation reduces the memory footprint and increases performance, because no flow table management is needed.
There is another benefit: flow reporting latency drops to less than a second, compared with 15 seconds or more in the flow cache case (typical inactive and active flow timeouts). If no flow table is managed, the next step in the evolution is to report the packet content instead of just the packet fields. This is a topic for a future article.
Keep in mind that the sampling rate (N in 1:N) must be relatively high for this approach to be feasible, but such sampling rates are typical in most peering and core deployments. Lower sampling rates, down to 1:1, require a very different implementation, to be described in a future article.

Application in Juniper Routers

Starting with the 21.3 release, Juniper 100GE PTX routers based on the Express 2 chipset (Paradise) use the new simplified implementation by default, with an option to fall back to flow cache mode when the nexthop-learning knob is configured; see the documentation.
400GE PTX routers use only the new simplified IPFIX implementation and demonstrate outstanding sampling performance among routers in their class: up to 150 thousand sampled packets per second per line card or fixed form factor system.

References

  • [1] L. Qian and B. E. Carpenter, "A flow-based performance analysis of TCP and TCP applications," 2012 18th IEEE International Conference on Networks (ICON), 2012, pp. 41-45, doi: 10.1109/ICON.2012.6506531.
  • [2] P. Jurkiewicz, G. Rzym, and P. Boryło, "Flow length and size distributions in campus Internet traffic," Computer Communications 167 (2021), pp. 15-30.
  • RFC 7011: https://datatracker.ietf.org/doc/html/rfc7011

Acknowledgements

Many thanks to Ranjith S V V Kumar and Alex Baban for reviewing this article.

Glossary

  • ASIC: Application Specific Integrated Circuit
  • CGNAT: Carrier Grade Network Address Translation
  • IETF: Internet Engineering Task Force
  • RFC: Request for Comments
  • IPFIX: IP Flow Information Export

Appendix: Theoretical Analysis R Script

The R script below is used to produce Figure 3.

# tidyverse attaches dplyr, tidyr and ggplot2
library(tidyverse)
library(scales)    # label_percent()
library(ggthemes)  # theme_hc()

# Every combination of sampling rate (1:rate) and flow length (in packets).
# expand_grid() replaces left_join(by = character()), which newer dplyr
# versions deprecate.
summary_analysis <- expand_grid(
    rate = seq(16, 8192, 16),
    flow_length = c(16, 32, 64, 128, 256, 512, 1024, 2048, 4096)) %>%
  mutate(
    # Each packet is sampled independently with probability 1/rate
    p_sampled_once = dbinom(1, size = flow_length, prob = 1 / rate),
    p_sampled_twice_or_more = 1 - pbinom(1, size = flow_length, prob = 1 / rate),
    # Condition on the flow being seen at all (at least one packet sampled)
    p_seeing_one = p_sampled_once / (p_sampled_once + p_sampled_twice_or_more),
    p_seeing_two_or_more = p_sampled_twice_or_more / (p_sampled_once + p_sampled_twice_or_more))

# One panel per flow length; x is the sampling rate, y the single-packet share
ggplot(summary_analysis, aes(x = rate, y = p_seeing_one)) +
  geom_line() +
  scale_y_continuous(labels = label_percent(accuracy = 1)) +
  theme_hc() +
  theme(panel.border = element_blank(),
        plot.background = element_blank()) +
  facet_wrap(vars(flow_length), labeller = labeller(flow_length = label_both)) +
  xlab("Sampling rate") +
  ylab("Percentage of records with one packet")

Revision History

Version   Author(s)         Date           Comments
1         Dmitry Shokarev   October 2022   Initial publication
2         Dmitry Shokarev   October 2022   Minor corrections

