Network administrators want to detect TCP-level packet reordering to diagnose performance problems and attacks. However, reordering is expensive to measure, because each packet must be processed relative to the TCP sequence number of its predecessor in the same flow. Due to the volume of traffic, detection should take place in the data plane as the packets fly by. Yet restrictions on the memory size and the number of memory accesses per packet make it impossible to design an efficient algorithm for pinpointing individual flows with heavy packet reordering. In practice, packet reordering is typically a property of a network path, caused by a congested or flaky link. Flows traversing the same path are correlated in their out-of-orderness, so aggregating out-of-order statistics at the IP prefix level provides useful diagnostic information. In this paper, we present two efficient algorithms for identifying IP prefixes with heavy packet reordering under memory restrictions. The first algorithm samples as many flows as possible, regardless of their sizes, but only for a short period at a time. The second algorithm keeps this flow sampling and, in addition, separately monitors the large flows over long periods. In both algorithms, we measure at the flow level, and aggregate statistics and allocate memory at the prefix level. Our simulation experiments, using packet traces from campus and backbone networks, and our P4 prototype show that our algorithms correctly identify 80% of the prefixes with heavy packet reordering using moderate memory resources.
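To make the measurement idea concrete, the following is a minimal sketch of flow-level reordering detection with prefix-level aggregation. It is not the paper's data-plane algorithm: it uses unbounded dictionaries instead of constrained memory, it assumes a packet counts as out-of-order when its sequence number is lower than that of its predecessor in the same flow, it ignores sequence-number wraparound, and it assumes /24 source prefixes. The names `prefix_of` and `reordering_by_prefix` are hypothetical.

```python
from collections import defaultdict
from ipaddress import ip_network


def prefix_of(src_ip: str, prefix_len: int = 24) -> str:
    """Return the prefix (e.g. '10.0.0.0/24') that contains src_ip."""
    return str(ip_network(f"{src_ip}/{prefix_len}", strict=False))


def reordering_by_prefix(packets, prefix_len=24):
    """Count out-of-order packets per source prefix.

    packets: iterable of (src_ip, src_port, dst_ip, dst_port, seq) tuples,
    given in arrival order. Returns {prefix: (out_of_order, total)}.
    """
    last_seq = {}                         # flow 4-tuple -> seq of previous packet
    stats = defaultdict(lambda: [0, 0])   # prefix -> [out_of_order, total]
    for src_ip, src_port, dst_ip, dst_port, seq in packets:
        flow = (src_ip, src_port, dst_ip, dst_port)
        counters = stats[prefix_of(src_ip, prefix_len)]
        counters[1] += 1
        prev = last_seq.get(flow)
        # Assumed definition: out-of-order if seq is below the predecessor's.
        if prev is not None and seq < prev:
            counters[0] += 1
        last_seq[flow] = seq              # measure per flow, aggregate per prefix
    return {prefix: tuple(c) for prefix, c in stats.items()}


packets = [
    ("10.0.0.5", 1234, "192.0.2.1", 80, 1000),
    ("10.0.0.5", 1234, "192.0.2.1", 80, 3000),
    ("10.0.0.5", 1234, "192.0.2.1", 80, 2000),  # arrives after seq 3000
    ("10.0.1.9", 5555, "192.0.2.1", 80, 500),
    ("10.0.1.9", 5555, "192.0.2.1", 80, 900),
]
result = reordering_by_prefix(packets)
```

In this sketch the per-flow state (`last_seq`) grows with the number of flows, which is exactly the cost that motivates sampling flows and allocating memory at the prefix level under data-plane constraints.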