Advanced Persistent Threats (APTs) are sophisticated, long-term cyberattacks that are difficult to detect because they operate stealthily and often blend into normal system behavior. This paper presents a neuro-symbolic anomaly detection framework that combines a Graph Autoencoder (GAE) with rare pattern mining to identify APT-like activities in system-level provenance data. Our approach first constructs a process behavioral graph using k-Nearest Neighbors based on feature similarity, then learns normal relational structure using a Graph Autoencoder. Anomaly candidates are identified through deviations between observed and reconstructed graph structure. To further improve detection, we integrate a rare pattern mining module that discovers infrequent behavioral co-occurrences and uses them to boost anomaly scores for processes exhibiting rare signatures. We evaluate the proposed method on the DARPA Transparent Computing datasets and show that rare-pattern boosting yields substantial gains in anomaly ranking quality over the baseline GAE. Compared with existing unsupervised approaches on the same benchmark, our single unified model consistently outperforms individual context-based detectors and achieves performance competitive with ensemble aggregation methods that require multiple separate detectors. These results highlight the value of coupling graph-based representation learning with classical pattern mining to improve both effectiveness and interpretability in provenance-based security anomaly detection.
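To make the two symbolic stages of the pipeline concrete, the following is a minimal sketch in plain Python, not the implementation evaluated in this paper. It assumes each process is described by a set of behavioral attributes, uses Jaccard similarity for the k-Nearest Neighbors graph, treats a "rare pattern" as a pair of attributes that co-occurs in at most `max_count` processes, and applies a multiplicative boost. All of these choices, along with the names `knn_graph`, `rare_pairs`, `boost_scores`, `max_count`, and `weight`, are illustrative assumptions. The Graph Autoencoder itself is not implemented here: `base_scores` is a hypothetical stand-in for its per-process reconstruction error.

```python
from collections import Counter
from itertools import combinations


def knn_graph(features, k):
    """Map each process index to the set of its k most similar processes.

    Similarity is Jaccard overlap between attribute sets (an assumption;
    the abstract only says "feature similarity").
    """
    def jaccard(a, b):
        union = a | b
        return len(a & b) / len(union) if union else 1.0

    graph = {}
    for i, attrs in enumerate(features):
        others = [j for j in range(len(features)) if j != i]
        # Most similar first; ties keep index order because sorted() is stable.
        ranked = sorted(others, key=lambda j: jaccard(attrs, features[j]),
                        reverse=True)
        graph[i] = set(ranked[:k])
    return graph


def rare_pairs(features, max_count):
    """Return attribute pairs that co-occur in at most max_count processes."""
    counts = Counter()
    for attrs in features:
        counts.update(combinations(sorted(attrs), 2))
    return {pair for pair, count in counts.items() if count <= max_count}


def boost_scores(base_scores, features, rare, weight):
    """Scale each base score by (1 + weight * number of rare pairs matched)."""
    boosted = []
    for score, attrs in zip(base_scores, features):
        hits = sum(1 for pair in combinations(sorted(attrs), 2) if pair in rare)
        boosted.append(score * (1.0 + weight * hits))
    return boosted


# Toy data: four routine processes and one with an unusual attribute mix.
features = [
    {"read_file", "write_file"},
    {"read_file", "write_file"},
    {"read_file", "write_file", "fork"},
    {"read_file", "write_file", "fork"},
    {"read_file", "connect", "exec_shell"},
]
# Hypothetical stand-in for GAE reconstruction error per process.
base_scores = [0.10, 0.12, 0.20, 0.18, 0.30]

graph = knn_graph(features, k=2)
rare = rare_pairs(features, max_count=1)
boosted = boost_scores(base_scores, features, rare, weight=0.5)
ranking = sorted(range(len(boosted)), key=lambda i: boosted[i], reverse=True)
```

In this toy example only the last process matches any rare pair, so its score is the only one that changes and it moves further ahead in the ranking. The matched pairs also act as a short, human-readable explanation of why the process was flagged, which is the interpretability benefit the abstract refers to.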