Tactics, Techniques, and Procedures (TTPs) outline the methods attackers use to exploit vulnerabilities. Interpreting TTPs in the MITRE ATT&CK framework can be challenging for cybersecurity practitioners because of presumed expertise, complex dependencies, and inherent ambiguity. Meanwhile, advances in Large Language Models (LLMs) have led to a recent surge in studies exploring their uses in cybersecurity operations. This leads us to ask how well encoder-only (e.g., RoBERTa) and decoder-only (e.g., GPT-3.5) LLMs can comprehend and summarize TTPs to inform analysts of the intended purposes (i.e., tactics) of a cyberattack procedure. State-of-the-art LLMs have been shown to be prone to hallucination, providing inaccurate information, which is problematic in critical domains like cybersecurity. Therefore, we propose the use of Retrieval Augmented Generation (RAG) techniques to extract relevant contexts for each cyberattack procedure for decoder-only LLMs (without fine-tuning). We further contrast this approach against supervised fine-tuning (SFT) of encoder-only LLMs. Our results reveal that both the direct use of decoder-only LLMs (i.e., relying on their pre-trained knowledge) and the SFT of encoder-only LLMs offer inaccurate interpretations of cyberattack procedures. Significant improvements are shown when RAG is used for decoder-only LLMs, particularly when directly relevant context is found. This study further sheds light on the limitations and capabilities of using RAG for LLMs in interpreting TTPs.