Developing autonomous defense strategies and action recommendations for real-world cyber systems is challenging, in part because it involves characterizing system state uncertainties and attack-defense dynamics. We propose a data-driven deep reinforcement learning (DRL) framework to learn proactive, context-aware defense countermeasures that dynamically adapt to evolving adversarial behaviors while minimizing loss of cyber system operations. We formulate a dynamic defense optimization problem with multiple protective postures against different types of adversaries with varying levels of skill and persistence. We developed a custom simulation environment and devised experiments to systematically evaluate the performance of four model-free DRL algorithms against realistic, multi-stage attack sequences. Our results suggest the efficacy of DRL algorithms for proactive cyber defense under multi-stage attack profiles and system uncertainties.