Holmes: An Evidence-Grounded LLM Agent for Auditable DDoS Investigation in Cloud Networks

Cloud environments face frequent DDoS threats because they centralize resources and expose broad attack surfaces. Modern cloud-native DDoS attacks evolve rapidly and often blend multiple vectors, which creates an operational dilemma: defenders need wire-speed monitoring, yet they also need explainable, auditable attribution to guide response. Existing rule-based and supervised-learning approaches typically output black-box scores or labels, provide limited evidence chains, and generalize poorly to unseen attack variants; moreover, high-quality labeled data is often difficult to obtain in cloud settings. We present Holmes (DDoS Detective), an LLM-based DDoS detection agent that treats the model as a virtual SRE investigator rather than an end-to-end classifier. Holmes couples a funnel-like hierarchical workflow (counters and sFlow for continuous sensing and triage, with PCAP evidence collection triggered only on anomaly windows) with an Evidence Pack abstraction that converts binary packets into compact, reproducible, high-signal structured evidence. On top of this evidence interface, Holmes enforces a structure-first investigation protocol and strict JSON and quotation constraints, so that each report is machine-consumable and carries auditable evidence anchors. We evaluate Holmes on CICDDoS2019 reflection/amplification attacks and on script-triggered flooding scenarios. The results show that Holmes grounds its attribution decisions in salient evidence anchors across diverse attack families and that, when errors occur, its audit logs make the failure source easy to localize. This demonstrates the practicality of an LLM agent for cost-controlled and traceable DDoS investigation in cloud operations.
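To make the quotation constraint concrete, the following minimal sketch shows one way a report could be audited against an Evidence Pack. It is an illustration under stated assumptions, not the Holmes implementation: the function `audit_report`, the report fields `verdict`, `findings`, `section`, and `quote`, and the Evidence Pack layout (a mapping from section name to text) are all hypothetical, and the sample values are invented for the example.

```python
import json


def audit_report(report_json: str, evidence_pack: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every anchor checks out.

    Assumed (hypothetical) schema: the report is a JSON object with a
    "findings" list, and each finding names an Evidence Pack "section"
    and a verbatim "quote" taken from that section.
    """
    try:
        report = json.loads(report_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(report, dict):
        return ["report must be a JSON object"]

    problems = []
    for i, finding in enumerate(report.get("findings", [])):
        section = finding.get("section")
        quote = finding.get("quote", "")
        if section not in evidence_pack:
            problems.append(f"finding {i}: unknown section {section!r}")
        elif not quote or quote not in evidence_pack[section]:
            # The quote must appear verbatim in the cited section.
            problems.append(f"finding {i}: quote not found in {section!r}")
    return problems


# Invented sample data, for illustration only.
evidence_pack = {
    "top_src_ports": "udp/53 share=0.93 pkts=18422",
    "pkt_size": "median=1412B p95=1472B",
}
good_report = json.dumps({
    "verdict": "dns_amplification",
    "findings": [{"section": "top_src_ports", "quote": "udp/53 share=0.93"}],
})
bad_report = json.dumps({
    "verdict": "ntp_amplification",
    "findings": [{"section": "top_src_ports", "quote": "udp/123 share=0.80"}],
})

good_problems = audit_report(good_report, evidence_pack)
bad_problems = audit_report(bad_report, evidence_pack)
```

In this sketch `good_problems` is empty, while `bad_problems` flags the finding whose quote does not occur in the cited section. A check of this kind is one plausible reason an anchored report would be easier to audit than a bare label: an unsupported claim fails mechanically instead of requiring a human to re-read the packets.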
