Intelligent autonomous agents hold much potential for the domain of cyber-security. However, because many state-of-the-art approaches rely on uninterpretable black-box models, there is growing demand for methods that offer stakeholders clear and actionable insights into these agents' latent beliefs and motivations. To address this, we evaluate Theory of Mind (ToM) approaches for Autonomous Cyber Operations. Once it has learned a robust prior, a ToM model can predict an agent's goals, behaviours, and contextual beliefs given only a handful of past behaviour observations. In this paper, we introduce a novel Graph Neural Network (GNN)-based ToM architecture tailored for cyber-defence, Graph-In, Graph-Out (GIGO)-ToM, which can accurately predict both the targets and attack trajectories of adversarial cyber agents over arbitrary computer network topologies. To evaluate the trajectory predictions, we propose a novel extension of the Wasserstein distance for measuring the similarity of graph-based probability distributions. Whereas the standard Wasserstein distance lacks a fixed reference scale, we introduce a graph-theoretic normalization factor that enables a standardized comparison between networks of different sizes. We furnish this metric, which we term the Network Transport Distance (NTD), with a weighting function that emphasizes predictions according to custom node features, allowing network operators to explore arbitrary strategic considerations. Benchmarked against a Graph-In, Dense-Out (GIDO)-ToM architecture in an abstract cyber-defence environment, our empirical evaluations show that GIGO-ToM can accurately predict the goals and behaviours of various unseen cyber-attacking agents across a range of network topologies, as well as learn embeddings that can effectively characterize their policies.
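The abstract does not state the NTD formula, so the following is only a rough illustration of what a normalized, graph-based Wasserstein distance could look like. It assumes a hop-count shortest-path ground cost and uses the graph diameter as the normalization factor. Both choices are assumptions made for this sketch, as are all function names. The feature-based weighting function is left out because the abstract does not specify it.

```python
from collections import deque
from math import inf


def hop_distances(adj):
    """All-pairs shortest-path lengths (in hops) via BFS; assumes a connected graph."""
    dist = {}
    for src in adj:
        seen = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dist[src] = seen
    return dist


def wasserstein_1(p, q, dist, eps=1e-12):
    """W1 between node distributions p and q under ground cost dist[u][v].

    Solved as a small transportation problem with successive shortest
    augmenting paths (Bellman-Ford on the residual graph).
    """
    # Mass that p and q share at a node never has to move under a metric cost.
    diff = {u: p.get(u, 0.0) - q.get(u, 0.0) for u in dist}
    rem_s = {u: d for u, d in diff.items() if d > eps}    # surplus still to ship
    rem_d = {v: -d for v, d in diff.items() if d < -eps}  # deficit still to fill
    flow = {(u, v): 0.0 for u in rem_s for v in rem_d}
    total = 0.0
    while any(m > eps for m in rem_s.values()) and any(m > eps for m in rem_d.values()):
        # Bellman-Ford started from every node that still holds surplus.
        best = {("s", u): 0.0 for u, m in rem_s.items() if m > eps}
        prev = {}
        for _ in range(len(rem_s) + len(rem_d)):
            updated = False
            for (u, v), f in flow.items():
                a, b, c = ("s", u), ("d", v), dist[u][v]
                if best.get(a, inf) + c < best.get(b, inf) - eps:  # forward arc
                    best[b], prev[b], updated = best[a] + c, a, True
                if f > eps and best.get(b, inf) - c < best.get(a, inf) - eps:  # undo arc
                    best[a], prev[a], updated = best[b] - c, b, True
            if not updated:
                break
        # Cheapest reachable node that still has unfilled deficit.
        end = min((("d", v) for v, m in rem_d.items() if m > eps), key=best.get)
        # Walk the path back to its starting surplus node.
        path, node = [], end
        while node in prev:
            path.append((prev[node], node))
            node = prev[node]
        # Bottleneck: remaining surplus, remaining deficit, and any flow being undone.
        push = min(rem_s[node[1]], rem_d[end[1]])
        push = min([push] + [flow[(b[1], a[1])] for a, b in path if a[0] == "d"])
        for a, b in path:
            if a[0] == "s":
                flow[(a[1], b[1])] += push
            else:
                flow[(b[1], a[1])] -= push
        rem_s[node[1]] -= push
        rem_d[end[1]] -= push
        total += push * best[end]
    return total


def normalized_transport_distance(adj, p, q):
    """W1 divided by the graph diameter (an assumed normalizer), giving a score in [0, 1]."""
    dist = hop_distances(adj)
    diameter = max(max(row.values()) for row in dist.values())
    return wasserstein_1(p, q, dist) / diameter if diameter else 0.0


# Hypothetical 4-node path network: predicted vs. actual attacker location.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
score = normalized_transport_distance(path_graph, {0: 1.0}, {3: 1.0})
```

Dividing by the diameter bounds the score, because no unit of probability mass can travel further than the longest shortest path. Under this assumption, a prediction that is maximally wrong scores the same on a small network as on a large one, which is the kind of size-independent comparison the abstract describes.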