Analysts in Security Operations Centers (SOCs) are often occupied with time-consuming investigations of alerts from Network Intrusion Detection Systems (NIDS). Many NIDS rules lack clear explanations and associations with attack techniques, complicating alert triage and the generation of attack hypotheses. Large Language Models (LLMs) may be a promising technology for reducing this alert explainability gap by associating rules with attack techniques. In this paper, we investigate the ability of three prominent LLMs (ChatGPT, Claude, and Gemini) to reason about NIDS rules while labeling them with MITRE ATT&CK tactics and techniques. We discuss prompt design and present experiments performed on 973 Snort rules. Our results indicate that while LLMs provide explainable, scalable, and efficient initial mappings, traditional Machine Learning (ML) models consistently outperform them in accuracy, achieving higher precision, recall, and F1-scores. These results highlight the potential of hybrid LLM-ML approaches to enhance SOC operations and better address the evolving threat landscape.