Autonomous cyber agents may be developed by applying reinforcement learning and deep reinforcement learning (RL/DRL), where agents are trained in a representative environment. The training environment must simulate with high fidelity the network Cyber Operations (CyOp) that the agent aims to explore. Given the complexity of network CyOps, a good simulator is difficult to achieve. This work presents a systematic solution to automatically generate a high-fidelity simulator in the Cyber Gym for Intelligent Learning (CyGIL). Through representation learning and continuous learning, CyGIL provides a unified CyOp training environment in which an emulator, CyGIL-E, automatically generates a simulator, CyGIL-S. Simulator generation is integrated with the agent training process to further reduce the required agent training time. An agent trained in CyGIL-S transfers directly to CyGIL-E, demonstrating full transferability to the emulated "real" network. Experimental results are presented to demonstrate the CyGIL training performance. By enabling offline RL, the CyGIL solution presents a promising direction towards sim-to-real for leveraging RL agents in real-world cyber networks.