Autonomous Cyber Defence is required to respond to high-tempo cyber-attacks. To facilitate research in this challenging area, we explore the utility of the autonomous cyber operation environments presented as part of the Cyber Autonomy Gym for Experimentation (CAGE) Challenges, with a specific focus on CAGE Challenge 2. CAGE Challenge 2 required a defensive Blue agent to defend a network from an attacking Red agent. We provide a detailed description of this challenge and describe the approaches taken by challenge participants. From the submitted agents, we identify four classes of algorithms, namely Single-Agent Deep Reinforcement Learning (DRL), Hierarchical DRL, Ensembles, and Non-DRL approaches. Of these classes, we found that the Hierarchical DRL approach was the most capable of learning an effective cyber defensive strategy. Our analysis of the agent policies identified that different algorithms within the same class produced diverse strategies and that the strategy used by the defensive Blue agent varied depending on the strategy used by the offensive Red agent. We conclude that DRL algorithms are a suitable candidate for autonomous cyber defence applications.