Recently, reinforcement learning and deep reinforcement learning (RL/DRL) have been applied to develop autonomous agents for cyber network operations (CyOps), where the agents are trained in a representative environment using RL and, in particular, DRL algorithms. The training environment must simulate, with high fidelity, the CyOps that the agent aims to learn and accomplish. Such a simulator is hard to build because of the extreme complexity of the cyber environment. The trained agent must also generalize to network variations, because operational cyber networks change constantly. In this work, we take the red agent case to discuss these two issues. We elaborate on their essential requirements and potential solution options, illustrated by preliminary experiments in the Cyber Gym for Intelligent Learning (CyGIL) testbed.