Constructing Multi-label Hierarchical Classification Models for MITRE ATT&CK Text Tagging

MITRE ATT&CK is a cybersecurity knowledge base that organizes threat actor and cyber-attack information into a set of tactics, which describe the reasons and goals threat actors have for carrying out attacks; each tactic has a set of techniques that describe the potential methods used in these attacks. One major application of ATT&CK is the use of its tactic and technique hierarchy by security specialists as a framework for annotating cyber-threat intelligence reports, vulnerability descriptions, threat scenarios, and similar texts to facilitate downstream analyses. To date, this tagging is still largely done manually. In this technical note, we provide a stratified "task space" characterization of the MITRE ATT&CK text tagging task that organizes previous efforts toward automation using AI/ML methods and clarifies pathways for constructing new methods. To illustrate one of these pathways, we use the task space strata to construct, stage by stage, our own multi-label hierarchical classification models for the text tagging task, experimenting over general cyber-threat intelligence text. We use shareable computational tools and publicly release the models to the security community (via https://github.com/jpmorganchase/MITRE_models). Our multi-label hierarchical approach yields accuracy of roughly 94% at the tactic level and roughly 82% at the technique level. The models meet or surpass state-of-the-art performance while relying only on classical machine learning methods, with no dependence on LLMs, RAG, agents, or more complex hierarchical approaches. Moreover, we show that GPT-4o performance at the tactic level is significantly lower (roughly 60% accuracy) than that of our approach. We also extend our baseline model to a corpus of threat scenarios for financial applications produced by subject matter experts.
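
To make the two-level structure of the tagging task concrete, the following is a minimal sketch of one way a multi-label hierarchical tagger could combine tactic-level and technique-level predictions. It is not the method of this note or the code in the released repository. The function name `tag_text`, the 0.5 thresholds, the score values, and the choice to prune techniques whose parent tactic was not predicted are all assumptions made for illustration. The tactic and technique IDs are real ATT&CK identifiers, but the hierarchy shown is only a small fragment.

```python
from typing import Dict, List, Tuple

# Small illustrative fragment of the ATT&CK hierarchy (tactic -> techniques).
# The subset is chosen for illustration; the real matrix is far larger.
HIERARCHY: Dict[str, List[str]] = {
    "TA0001": ["T1566", "T1190"],  # Initial Access: Phishing, Exploit Public-Facing Application
    "TA0002": ["T1059", "T1204"],  # Execution: Command and Scripting Interpreter, User Execution
}


def tag_text(
    tactic_scores: Dict[str, float],
    technique_scores: Dict[str, float],
    hierarchy: Dict[str, List[str]],
    tactic_threshold: float = 0.5,      # assumed cut-off, not from the source
    technique_threshold: float = 0.5,   # assumed cut-off, not from the source
) -> Tuple[List[str], List[str]]:
    """Return (tactics, techniques) for one text.

    Multi-label: every tactic whose score clears the threshold is kept.
    Hierarchical: a technique is kept only if it clears its own threshold
    and at least one of its parent tactics was kept.
    """
    tactics = {t for t, s in tactic_scores.items() if s >= tactic_threshold}

    techniques = set()
    for tactic in tactics:
        for technique in hierarchy.get(tactic, []):
            if technique_scores.get(technique, 0.0) >= technique_threshold:
                techniques.add(technique)

    # Sort so the output is deterministic.
    return sorted(tactics), sorted(techniques)


# Hypothetical per-label scores, as might come from any pair of
# tactic-level and technique-level classifiers.
example_tactic_scores = {"TA0001": 0.91, "TA0002": 0.20}
example_technique_scores = {"T1566": 0.85, "T1190": 0.10, "T1059": 0.70}

result = tag_text(example_tactic_scores, example_technique_scores, HIERARCHY)
print(result)
```

In this example `T1059` scores above the technique threshold but is dropped, because its parent tactic `TA0002` was not predicted. Collecting techniques in a set also handles techniques that appear under more than one tactic.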
