Many cybersecurity problems that require real-time decision-making from temporal observations can be abstracted as sequence modeling problems, e.g., network intrusion detection from a sequence of arriving packets. Existing approaches such as reinforcement learning may be ill-suited to these decision problems, since the Markov property does not necessarily hold and the underlying network states are often unobservable. In this paper, we cast real-time network intrusion detection as causal sequence modeling and draw upon the transformer architecture for real-time decision-making. By conditioning a causal decision transformer on past trajectories consisting of rewards, network packets, and detection decisions, our proposed framework generates future detection decisions that aim to achieve a desired return. This enables decision transformers to be applied to real-time network intrusion detection and provides a novel tradeoff between the accuracy and the timeliness of detection. The proposed solution is evaluated on public network intrusion detection datasets and outperforms several reinforcement learning and sequence modeling baselines in terms of detection accuracy and timeliness.
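As an illustration of the conditioning described in the abstract, the following is a minimal sketch of how a trajectory of rewards, packets, and detection decisions could be arranged into a causal sequence. It is not the authors' implementation. The function names, the `"wait"`/`"alert"` decision labels, and the reward values (a per-step waiting penalty plus a terminal reward for a correct detection) are assumptions made here for illustration; the abstract does not specify the reward design or tokenization.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Suffix sums of the rewards: rtg[t] = rewards[t] + rewards[t+1] + ..."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def build_sequence(
    rewards: Sequence[float],
    packets: Sequence[Any],
    decisions: Sequence[Any],
) -> List[Tuple[str, Any]]:
    """Interleave (return-to-go, packet, decision) for every time step."""
    if not (len(rewards) == len(packets) == len(decisions)):
        raise ValueError("rewards, packets, and decisions must have equal length")
    sequence: List[Tuple[str, Any]] = []
    for g, p, d in zip(returns_to_go(rewards), packets, decisions):
        sequence.append(("return", g))
        sequence.append(("packet", p))
        sequence.append(("decision", d))
    return sequence


def causal_mask(n: int) -> List[List[bool]]:
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


# Hypothetical three-packet flow: waiting costs 1 per step, and a correct
# alert on the third packet earns 10. All values are illustrative only.
rewards = [-1.0, -1.0, 10.0]
packets = ["pkt0", "pkt1", "pkt2"]          # placeholders for packet features
decisions = ["wait", "wait", "alert"]

sequence = build_sequence(rewards, packets, decisions)
mask = causal_mask(len(sequence))
```

In this sketch, the waiting penalty is what would tie the return to timeliness: a higher target return at the first step can only be reached by alerting earlier. The causal mask restricts each position to earlier positions, so a decision depends only on packets that have already arrived.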