Advanced Persistent Threats (APTs) evolve through multiple stages, each exhibiting distinct temporal and structural behaviors. Accurate stage estimation is critical for enabling adaptive cyber defense. This paper presents StageFinder, a temporal graph learning framework for multi-stage attack progression inference from fused host and network provenance data. Provenance graphs are encoded using a graph neural network to capture structural dependencies among processes, files, and connections, while a long short-term memory (LSTM) model learns temporal dynamics to estimate stage probabilities aligned with the MITRE ATT&CK framework. The model is pretrained on the DARPA OpTC dataset and fine-tuned on labeled DARPA Transparent Computing data. Experimental results demonstrate that StageFinder achieves a macro F1-score of 0.96 and reduces prediction volatility by 31 percent compared to state-of-the-art baselines (Cyberian, NetGuardian). These results highlight the effectiveness of fused provenance and temporal learning for accurate and stable APT stage inference.
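The two reported quantities can be illustrated with a minimal sketch. Macro F1 is the standard unweighted mean of per-class F1 scores. The abstract does not define prediction volatility, so the `volatility` function below assumes one plausible reading: the fraction of consecutive time steps at which the predicted stage changes. The stage names and the toy label sequences are illustrative only and are not taken from the paper's data or results.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores (standard definition)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def volatility(stage_sequence):
    """ASSUMED definition (not from the paper): share of consecutive
    time steps where the predicted stage differs from the previous one."""
    if len(stage_sequence) < 2:
        return 0.0
    flips = sum(1 for a, b in zip(stage_sequence, stage_sequence[1:]) if a != b)
    return flips / (len(stage_sequence) - 1)


def relative_reduction(baseline, candidate):
    """Relative decrease of `candidate` with respect to `baseline`."""
    return (baseline - candidate) / baseline


# Toy data: three ATT&CK-style stage names chosen for illustration.
labels = ["reconnaissance", "initial-access", "lateral-movement"]
y_true = ["reconnaissance", "reconnaissance", "initial-access",
          "initial-access", "lateral-movement", "lateral-movement"]
smooth = ["reconnaissance", "initial-access", "initial-access",
          "initial-access", "lateral-movement", "lateral-movement"]
jittery = ["reconnaissance", "initial-access", "reconnaissance",
           "initial-access", "lateral-movement", "lateral-movement"]

f1_smooth = macro_f1(y_true, smooth, labels)
vol_smooth = volatility(smooth)
vol_jittery = volatility(jittery)
reduction = relative_reduction(vol_jittery, vol_smooth)
```

Under this assumed definition, a statement such as "reduces prediction volatility by 31 percent" would correspond to `relative_reduction(baseline_volatility, model_volatility)` being 0.31, with both volatilities measured on the same evaluation sequences.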