Programmable data planes offer precise control over the low-level processing steps applied to network packets, which makes them a valuable tool for analysing malicious flows in intrusion detection. Despite their limited physical resources and capabilities, they allow for the efficient extraction of detailed traffic information, which can then be used by Machine Learning (ML) algorithms to identify security threats. To cope with these resource constraints, existing solutions in the literature compress network data by collecting statistical traffic features in the data plane. While this compression saves switch memory and reduces the load on the control channel between the data plane and the control plane, it also reduces the information available to the Network Intrusion Detection System (NIDS): the NIDS loses access to packet payload, to categorical features, and to the semantics of network communications, such as the behaviour of packets within traffic flows. This paper proposes P4DDLe, a framework that exploits the flexibility of P4-based programmable data planes for packet-level feature extraction and pre-processing. P4DDLe uses the programmable data plane to extract raw packet features from network traffic, categorical features included, and organises them so that the semantics of traffic flows are preserved. To minimise memory and control channel overheads, P4DDLe selectively processes and filters packet-level data, so that only the features required by the NIDS are collected. The experimental evaluation with recent Distributed Denial of Service (DDoS) attack data shows that the proposed approach efficiently collects compact, high-quality representations of network flows that support precise detection of DDoS attacks.
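The idea of keeping packet-level features grouped by flow, under a fixed memory budget, can be illustrated with a small host-side sketch. This is written in Python rather than P4 and is not the P4DDLe implementation: the `Packet` fields, the bidirectional `flow_key`, and the `max_flows` and `max_pkts_per_flow` limits are all hypothetical names and values chosen for illustration. It only shows the general pattern of retaining a few selected per-packet features (including categorical ones such as protocol and TCP flags) in arrival order for each flow, and discarding everything else.

```python
from collections import OrderedDict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical parsed packet header (illustration only)."""
    ts: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int      # categorical: IP protocol number
    length: int
    tcp_flags: int  # categorical: TCP flag bits, 0 for non-TCP traffic


def flow_key(p: Packet):
    # Sort the two endpoints so both directions map to the same flow.
    a = (p.src_ip, p.src_port)
    b = (p.dst_ip, p.dst_port)
    return (min(a, b), max(a, b), p.proto)


def collect(packets, max_flows=4, max_pkts_per_flow=3):
    """Group selected packet features by flow within fixed bounds."""
    flows = OrderedDict()
    for p in packets:
        key = flow_key(p)
        if key not in flows:
            if len(flows) >= max_flows:
                continue  # flow table full: ignore packets of new flows
            flows[key] = []
        rows = flows[key]
        if len(rows) < max_pkts_per_flow:
            # Keep only the selected features, in arrival order.
            rows.append((p.ts, p.length, p.proto, p.tcp_flags))
    return flows


sample = [
    Packet(0.0, "10.0.0.1", "10.0.0.2", 1234, 80, 6, 60, 0x02),
    Packet(0.1, "10.0.0.2", "10.0.0.1", 80, 1234, 6, 60, 0x12),
    Packet(0.2, "10.0.0.1", "10.0.0.2", 1234, 80, 6, 52, 0x10),
    Packet(0.3, "10.0.0.1", "10.0.0.2", 1234, 80, 6, 500, 0x18),
    Packet(0.4, "10.0.0.3", "10.0.0.2", 5555, 53, 17, 80, 0),
]
flows = collect(sample)
for key, rows in flows.items():
    print(key, rows)
```

In this sketch the fourth TCP packet is discarded because its flow has already reached the per-flow limit, while the packet order inside each flow is preserved. A real data-plane version would have to express the same bounds with the fixed-size storage a switch offers, which this example does not attempt to model.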