Threat actor attribution is a crucial defense strategy for combating advanced persistent threats (APTs). Cyber threat intelligence (CTI), which involves analyzing multisource heterogeneous data from APTs, plays an important role in APT actor attribution. Current attribution methods extract features from different CTI perspectives and employ machine learning models to classify CTI reports according to their threat actors. However, these methods usually extract only one kind of feature and ignore heterogeneous information, especially the attributes and relations of indicators of compromise (IOCs), which form the core of CTI. To address these problems, we propose an APT actor attribution method based on multimodal and multilevel feature fusion (APT-MMF). First, we use a heterogeneous attributed graph to characterize APT reports and their IOC information. Then, we extract and fuse multimodal features, including attribute type features, natural language text features, and topological relationship features, to construct comprehensive node representations. Furthermore, we design multilevel heterogeneous graph attention networks to learn the deep hidden features of APT report nodes; these networks integrate IOC type-level, metapath-based neighbor node-level, and metapath semantic-level attention. Using multisource threat intelligence, we construct a heterogeneous attributed graph dataset to verify the method. The experimental results show that our method not only outperforms existing methods but also offers good interpretability for attribution analysis tasks.
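To make the notion of metapath-based neighbors more concrete, the following is a minimal sketch, not the APT-MMF implementation. It assumes a toy heterogeneous graph in which report nodes link to typed IOC nodes, and it treats two reports as neighbors along a Report-IOC-Report metapath when they share an IOC of the chosen type. The report names, IOC types, IOC values, and the helper functions `build_ioc_index` and `metapath_neighbors` are all hypothetical; the abstract does not specify which metapaths the method actually uses.

```python
from collections import defaultdict

# Hypothetical toy data: each report maps to the IOCs it mentions.
# An IOC is a (type, value) pair; types and values are illustrative only.
REPORT_IOCS = {
    "report_a": [("domain", "evil.example"), ("ip", "203.0.113.7")],
    "report_b": [("ip", "203.0.113.7"), ("hash", "abc123")],
    "report_c": [("domain", "evil.example")],
    "report_d": [("hash", "ffff00")],
}


def build_ioc_index(report_iocs):
    """Map each IOC node to the set of reports that mention it."""
    ioc_to_reports = defaultdict(set)
    for report, iocs in report_iocs.items():
        for ioc in iocs:
            ioc_to_reports[ioc].add(report)
    return ioc_to_reports


def metapath_neighbors(report_iocs, ioc_type):
    """Neighbors of each report along Report -> IOC(ioc_type) -> Report."""
    ioc_to_reports = build_ioc_index(report_iocs)
    neighbors = {report: set() for report in report_iocs}
    for report, iocs in report_iocs.items():
        for ioc in iocs:
            if ioc[0] == ioc_type:
                # Every other report sharing this IOC is a metapath neighbor.
                neighbors[report] |= ioc_to_reports[ioc] - {report}
    return neighbors


ip_neighbors = metapath_neighbors(REPORT_IOCS, "ip")
domain_neighbors = metapath_neighbors(REPORT_IOCS, "domain")
print(ip_neighbors["report_a"])      # {'report_b'}
print(domain_neighbors["report_a"])  # {'report_c'}
```

In this sketch each IOC type yields a separate neighbor set per report. A node-level attention step would weight the neighbors within one such set, and a semantic-level attention step would weight the sets against each other; neither is implemented here.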