We argue that machine-learning agents for automated incident response need to handle changes in network structure. Computer networks are dynamic and can naturally change in structure over time, and retraining agents for every small network change costs time and energy. We address this issue with an existing method for relational agent learning, in which the relations between objects are assumed to remain consistent across problem instances. The state of the computer network is represented as a relational graph and encoded by a message passing neural network; the encoder and an agent policy that consumes the encoding are optimized end-to-end with reinforcement learning. We evaluate the approach on the second instance of the Cyber Autonomy Gym for Experimentation (CAGE~2), a cyber incident simulator that models attacks on an enterprise network. We create variants of the original network with different numbers of hosts and test agents on them without additional training. Our results show that agents using relational information find solutions despite the changes to the network, and can perform optimally in some instances. Agents using the default vector state representation perform better, but must be trained separately on each network variant, demonstrating a trade-off between specialization and generalization.
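The key property of the approach is that a message passing encoder shares its weights across nodes, so the same trained parameters apply to networks with different numbers of hosts. A minimal sketch of this idea, in plain numpy (all weights, dimensions, and function names here are illustrative, not the paper's implementation):

```python
import numpy as np

def message_passing(node_feats, adj, w_msg, w_upd):
    """One round of message passing: each node sums transformed
    neighbor features, then updates its own embedding."""
    msgs = adj @ (node_feats @ w_msg)          # aggregate neighbor messages
    return np.tanh(node_feats @ w_upd + msgs)  # updated node embeddings

def policy_logits(node_feats, adj, w_msg, w_upd, w_out, rounds=2):
    """Encode the network graph, then score per-host actions.
    Output size tracks the number of hosts automatically."""
    h = node_feats
    for _ in range(rounds):
        h = message_passing(h, adj, w_msg, w_upd)
    return h @ w_out

rng = np.random.default_rng(0)
# toy network: 4 hosts in a chain, 3 features per host
adj4 = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
x4 = rng.normal(size=(4, 3))
w_msg, w_upd = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
w_out = rng.normal(size=(3, 2))  # 2 hypothetical actions per host
print(policy_logits(x4, adj4, w_msg, w_upd, w_out).shape)  # (4, 2)

# the same weights apply unchanged to a 6-host variant of the network
adj6 = np.diag(np.ones(5), 1) + np.diag(np.ones(5), -1)
x6 = rng.normal(size=(6, 3))
print(policy_logits(x6, adj6, w_msg, w_upd, w_out).shape)  # (6, 2)
```

Because the weight matrices act per node and messages are aggregated over whatever edges exist, no retraining is needed to produce outputs for a resized network; whether those outputs remain *good* is exactly what the evaluation on the CAGE~2 variants measures.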