We present ProvG-Searcher, a novel approach for detecting known advanced persistent threat (APT) behaviors within system security logs. Our approach leverages provenance graphs, which represent event logs as graphs whose nodes are system entities and whose edges are the interactions between them, to capture data provenance relations. We formulate the task of searching provenance graphs as a subgraph matching problem and address it with graph representation learning. The core of our search method is embedding subgraphs in a vector space where subgraph relationships can be evaluated directly. We achieve this with order embeddings, which reduce subgraph matching to straightforward comparisons between a query and precomputed subgraph representations. To address the challenges posed by the size and complexity of provenance graphs, we propose a graph partitioning scheme and a behavior-preserving graph reduction method. As a result, most of the search computation can be performed offline, leaving only a lightweight comparison step at query time. Experimental results on standard datasets show that ProvG-Searcher detects query behaviors with an accuracy exceeding 99% and a false positive rate of approximately 0.02%, outperforming other approaches.
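To make the query-time comparison concrete, the following is a minimal sketch of how an order-embedding check is commonly formulated: a query is treated as a subgraph of a target when every coordinate of the query embedding is no larger than the corresponding coordinate of the target embedding. The abstract does not state the exact criterion ProvG-Searcher uses, so the violation measure, the threshold, the function names `order_violation` and `search`, and the toy embedding values below are all illustrative assumptions rather than the authors' implementation.

```python
from typing import Dict, List, Sequence


def order_violation(query_emb: Sequence[float], target_emb: Sequence[float]) -> float:
    """Sum of squared amounts by which the query exceeds the target, per coordinate.

    A value of 0 means the order constraint query <= target holds in every
    dimension, which is read here as "query may be a subgraph of target".
    """
    if len(query_emb) != len(target_emb):
        raise ValueError("embeddings must have the same dimension")
    return sum(max(0.0, q - t) ** 2 for q, t in zip(query_emb, target_emb))


def search(
    query_emb: Sequence[float],
    index: Dict[str, Sequence[float]],
    threshold: float = 1e-6,  # assumed tolerance, not a value from the paper
) -> List[str]:
    """Return ids of precomputed embeddings whose violation is within the threshold."""
    return [
        graph_id
        for graph_id, emb in index.items()
        if order_violation(query_emb, emb) <= threshold
    ]


# Hypothetical embeddings, standing in for representations computed offline.
index = {
    "partition_a": [0.9, 0.8, 0.7],
    "partition_b": [0.2, 0.9, 0.1],
}
query = [0.5, 0.6, 0.3]

matches = search(query, index)
print(matches)
```

In this toy setup only `partition_a` dominates the query in every coordinate, so it is the only candidate returned. The expensive part (computing the embeddings in `index`) would happen offline, and the query-time work is a coordinate-wise comparison per stored embedding.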