This work aims to enable autonomous agents for network cyber operations (CyOps) by applying reinforcement learning and deep reinforcement learning (RL/DRL). The required RL training environment is particularly challenging to build, as it must balance the need for high fidelity, best achieved through real network emulation, with the need to run large numbers of training episodes, best achieved through simulation. A unified training environment, the Cyber Gym for Intelligent Learning (CyGIL), is developed in which an emulated environment, CyGIL-E, automatically generates a simulated environment, CyGIL-S. In preliminary experiments, CyGIL-S can train agents in minutes, compared with the days required in CyGIL-E. Agents trained in CyGIL-S are directly transferable to CyGIL-E, where they show full decision proficiency in the emulated "real" network. By enabling offline RL, the CyGIL solution presents a promising direction towards sim-to-real for leveraging RL agents in real-world cyber networks.