Identifying Adversary Tactics and Techniques in Malware Binaries with an LLM Agent

Understanding TTPs (Tactics, Techniques, and Procedures) in malware binaries is essential for security analysis and threat intelligence, yet remains challenging in practice. Real-world malware binaries are typically stripped of symbols, contain large numbers of functions, and distribute malicious behavior across multiple code regions, making TTP attribution difficult. Recent large language models (LLMs) offer strong code understanding capabilities, but applying them directly to this task raises three challenges: identifying analysis entry points, reasoning under partial observability, and aligning with TTP-specific decision logic. We present TTPDetect, the first LLM agent for recognizing TTPs in stripped malware binaries. TTPDetect combines dense retrieval with LLM-based neural retrieval to narrow the space of analysis entry points. TTPDetect further employs a function-level analysis agent consisting of a Context Explorer that performs on-demand, incremental context retrieval and a TTP-Specific Reasoning Guideline that provides inference-time alignment. We build a new dataset that labels decompiled functions with TTPs across diverse malware families and platforms. TTPDetect achieves 93.25% precision and 93.81% recall on function-level TTP recognition, outperforming baselines by 10.38% and 18.78%, respectively. When evaluated on real-world malware samples, TTPDetect recognizes TTPs with a precision of 87.37%. For malware with expert-written reports, TTPDetect recovers 85.7% of the documented TTPs and further discovers, on average, 10.5 previously unreported TTPs per sample.
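
The two-stage narrowing of analysis entry points described above can be illustrated with a minimal Python sketch. It is not TTPDetect's implementation: the bag-of-tokens `embed` function stands in for a real dense encoder, `keyword_judge` is a hypothetical placeholder for an LLM relevance call, and the function names and decompiled snippets in `FUNCTIONS` are invented for illustration.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List


def embed(text: str) -> Counter:
    """Toy stand-in for a dense encoder: a bag-of-tokens count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def shortlist(query: str, functions: Dict[str, str], k: int) -> List[str]:
    """Stage 1: cheap similarity search over every function in the binary."""
    q = embed(query)
    ranked = sorted(
        functions,
        key=lambda name: cosine(q, embed(functions[name])),
        reverse=True,
    )
    return ranked[:k]


def rerank(
    query: str,
    candidates: List[str],
    functions: Dict[str, str],
    judge: Callable[[str, str], float],
    keep: int,
) -> List[str]:
    """Stage 2: a costlier judge re-scores only the shortlisted candidates."""
    scored = sorted(
        candidates,
        key=lambda name: judge(query, functions[name]),
        reverse=True,
    )
    return scored[:keep]


def keyword_judge(query: str, code: str) -> float:
    """Hypothetical placeholder for an LLM relevance call (assumption)."""
    return 1.0 if "RegSetValueExA" in code else 0.0


# Invented decompiled snippets, keyed by invented function names.
FUNCTIONS = {
    "sub_401000": "RegOpenKeyExA(HKEY_CURRENT_USER, run_key); "
                  "RegSetValueExA(key, name, path);",
    "sub_402000": "socket(); connect(); send(buf);",
    "sub_403000": "memcpy(dst, src, n); return n;",
    "sub_404000": "// reads registry run key\n"
                  "RegQueryValueExA(key, name, buf);",
}

QUERY = "registry run key set value"
candidates = shortlist(QUERY, FUNCTIONS, k=2)
entry_points = rerank(QUERY, candidates, FUNCTIONS, keyword_judge, keep=1)
```

The design point is cost: the inexpensive similarity pass runs over all functions, while the expensive judge only sees the few candidates that survive it.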
