TREC: APT Tactic / Technique Recognition via Few-Shot Provenance Subgraph Learning

Advanced Persistent Threats (APTs), characterized by persistence, stealth, and diversity, are among the greatest threats to cyber-infrastructure. As a countermeasure, existing studies leverage provenance graphs to capture the complex relations between system entities in a host for effective APT detection. Most existing work detects single attack events, but for security operations it is more important to understand the tactics / techniques (e.g., Kill-Chain, ATT&CK) used to organize and accomplish an APT attack campaign. Existing studies manually design sets of rules that map low-level system events to high-level APT tactics / techniques. However, rule-based methods are coarse-grained and generalize poorly, so they can recognize only APT tactics and cannot identify fine-grained APT techniques or mutated APT attacks. In this paper, we propose TREC, the first attempt to recognize APT tactics / techniques from provenance graphs by exploiting deep learning techniques. To address the "needle in a haystack" problem, TREC uses a malicious node detection model and a subgraph sampling algorithm to segment small, compact subgraphs, each covering an individual APT technique instance, from a large provenance graph. To address the "training sample scarcity" problem, TREC trains the APT tactic / technique recognition model in a few-shot learning manner by adopting a Siamese neural network. We evaluate TREC on a custom dataset that our team collected and made public. The experimental results show that TREC significantly outperforms state-of-the-art systems in APT tactic recognition and can also effectively identify APT techniques.

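The few-shot idea mentioned in the abstract can be illustrated with a minimal sketch. It is not TREC's implementation: it assumes each sampled subgraph has already been mapped to a fixed-length embedding by some encoder (not shown), and the function names, the margin value, the labels `technique_A` / `technique_B`, and the toy vectors are all hypothetical. It shows a standard contrastive loss of the kind commonly used to train Siamese networks, plus classification of a query embedding by its mean distance to a handful of labelled support examples.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a: Vector, b: Vector, same_label: bool,
                     margin: float = 1.0) -> float:
    """Standard contrastive loss for one pair of embeddings.

    Pairs with the same label are pulled together (loss = d^2).
    Pairs with different labels are pushed apart until they are at
    least `margin` away (loss = max(0, margin - d)^2).
    The margin of 1.0 is an illustrative assumption.
    """
    d = euclidean(a, b)
    if same_label:
        return d ** 2
    return max(0.0, margin - d) ** 2


def classify_few_shot(query: Vector,
                      support: Dict[str, List[Vector]]) -> str:
    """Return the label whose support embeddings are closest on average."""
    def mean_distance(examples: List[Vector]) -> float:
        return sum(euclidean(query, e) for e in examples) / len(examples)

    return min(support, key=lambda label: mean_distance(support[label]))


# Toy support set: two labelled examples per (hypothetical) technique.
support = {
    "technique_A": [[0.9, 0.1], [0.8, 0.2]],
    "technique_B": [[0.1, 0.9], [0.2, 0.8]],
}
query = [0.85, 0.15]  # hypothetical embedding of an unlabelled subgraph
predicted = classify_few_shot(query, support)
print(predicted)
```

Comparing against a few labelled examples per class, rather than training a classifier with a fixed output layer, is what lets this style of model operate when only a handful of samples per technique are available.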