Many cybersecurity problems that require real-time decision-making based on temporal observations can be abstracted as sequence modeling problems, e.g., network intrusion detection from a sequence of arriving packets. Existing approaches such as reinforcement learning may not be suitable for such cybersecurity decision problems, since the Markov property does not necessarily hold and the underlying network states are often not observable. In this paper, we cast the problem of real-time network intrusion detection as causal sequence modeling and draw upon the power of the transformer architecture for real-time decision-making. By conditioning a causal decision transformer on past trajectories, consisting of the rewards, network packets, and detection decisions, our proposed framework generates future detection decisions to achieve the desired return. This enables decision transformers to be applied to real-time network intrusion detection and offers a novel tradeoff between the accuracy and timeliness of detection. The proposed solution is evaluated on public network intrusion detection datasets and outperforms several baseline algorithms based on reinforcement learning and sequence modeling in terms of detection accuracy and timeliness.
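To make the trajectory conditioning described in the abstract more concrete, the following is a minimal sketch of how a trajectory of rewards, packets, and detection decisions might be arranged as input for a causal decision transformer. It assumes the return-to-go convention of the original Decision Transformer formulation, in which each reward is replaced by the sum of rewards from that step onward; the abstract does not state whether the paper does exactly this. The function names, the per-step waiting penalty, the packet feature vectors, and the `"wait"`/`"alert"` labels are all hypothetical and chosen only for illustration.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Undiscounted suffix sums: rtg[t] = rewards[t] + rewards[t+1] + ..."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step adds its own reward to the future total.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def build_trajectory(
    rewards: Sequence[float],
    packets: Sequence[Any],
    decisions: Sequence[Any],
) -> List[Tuple[str, Any]]:
    """Interleave (return-to-go, packet, decision) tokens per time step."""
    if not (len(rewards) == len(packets) == len(decisions)):
        raise ValueError("rewards, packets, and decisions must align per step")
    tokens: List[Tuple[str, Any]] = []
    for g, p, d in zip(returns_to_go(rewards), packets, decisions):
        tokens.append(("return_to_go", g))
        tokens.append(("packet", p))
        tokens.append(("decision", d))
    return tokens


# Hypothetical episode: a small penalty for each step spent waiting,
# then a reward for a correct alert (illustrative reward scheme only).
rewards = [-0.25, -0.25, 1.0]
packets = [[0.1, 0.0], [0.4, 1.0], [0.9, 1.0]]  # made-up packet features
decisions = ["wait", "wait", "alert"]

tokens = build_trajectory(rewards, packets, decisions)
```

Under this assumed reward scheme, delaying the alert lowers the achievable return, so the return the model is conditioned on at inference time acts as a knob between waiting for more packets and deciding early. This is one plausible reading of the accuracy and timeliness tradeoff mentioned above, not a description of the paper's actual mechanism.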