TTPXHunter: Actionable Threat Intelligence Extraction as TTPs from Finished Cyber Threat Reports

Understanding the modus operandi of adversaries helps organizations employ efficient defensive strategies and share intelligence within the community. This knowledge is often present as unstructured natural language text in threat analysis reports. A translation tool is needed to interpret the modus operandi described in the sentences of a threat report and convert it into a structured format. This research introduces a methodology named TTPXHunter for the automated extraction of threat intelligence, in terms of Tactics, Techniques, and Procedures (TTPs), from finished cyber threat reports. It leverages a state-of-the-art natural language processing (NLP) model specific to the cyber domain to augment sentences for minority-class TTPs and to pinpoint the TTPs in threat analysis reports more precisely. Knowledge of threat intelligence in terms of TTPs is essential for comprehensively understanding cyber threats and enhancing detection and mitigation strategies. We create two datasets: an augmented sentence-TTP dataset of 39,296 samples and a report-to-TTP dataset of 149 real-world cyber threat intelligence reports. We then evaluate TTPXHunter on both the augmented sentence dataset and the cyber threat reports. TTPXHunter achieves the highest performance, a 92.42% F1-score, on the augmented dataset, and it outperforms existing state-of-the-art solutions in TTP extraction by achieving an F1-score of 97.09% on the report dataset. TTPXHunter improves cybersecurity threat intelligence by offering quick, actionable insights into attacker behaviors. This advancement automates threat intelligence analysis, providing a crucial tool for cybersecurity professionals fighting cyber threats.
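To make the report-level evaluation concrete, the following minimal sketch shows one way sentence-level TTP predictions could be collapsed into a report-level TTP set and scored with an F1-score. It is an illustration under stated assumptions, not the TTPXHunter implementation: the function names `report_ttps` and `precision_recall_f1`, the confidence threshold of 0.5, the technique identifiers (MITRE ATT&CK-style IDs used only as example labels), and the sample scores are all hypothetical, and the abstract does not specify how scores are averaged across reports.

```python
from typing import Iterable, Set, Tuple


def report_ttps(sentence_predictions: Iterable[Tuple[str, float]],
                threshold: float = 0.5) -> Set[str]:
    """Collapse sentence-level (technique_id, confidence) pairs into a
    report-level set, keeping predictions at or above the threshold.
    The threshold is an assumption made for this sketch."""
    return {ttp for ttp, score in sentence_predictions if score >= threshold}


def precision_recall_f1(predicted: Set[str],
                        actual: Set[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall, and F1 for a single report."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical sentence-level output for one report (illustrative values only).
sentence_predictions = [
    ("T1566", 0.97),
    ("T1059", 0.91),
    ("T1027", 0.42),  # falls below the assumed threshold and is dropped
    ("T1105", 0.88),
]
# Hypothetical analyst-labelled TTPs for the same report.
ground_truth = {"T1566", "T1059", "T1071"}

predicted = report_ttps(sentence_predictions)
precision, recall, f1 = precision_recall_f1(predicted, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy report, two of the three predicted techniques match the labels and one labelled technique is missed, so precision, recall, and F1 all come out at two thirds. A dataset-level figure would additionally require a choice of averaging across reports, which is left open here.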
