Investigating cybersecurity incidents requires collecting and analyzing evidence from multiple log sources, including intrusion detection alerts, network traffic records, and authentication events. This process is labor-intensive: analysts must sift through large volumes of data to identify relevant indicators and piece together what happened. We present a system based on retrieval-augmented generation (RAG) that performs security incident analysis by combining targeted query-based filtering with the semantic reasoning of large language models (LLMs). The system uses a library of queries, each associated with MITRE ATT&CK techniques, to extract indicators from raw logs; it then retrieves relevant context to answer forensic questions and reconstruct attack sequences. We evaluate the system with five LLM providers on malware traffic incidents and multi-stage Active Directory attacks. Models differ in both accuracy and cost: Claude Sonnet 4 and DeepSeek V3 each achieve 100% recall across all four malware scenarios, but DeepSeek V3 costs 15 times less ($0.008 vs. $0.12 per analysis). Attack step detection on the Active Directory scenarios reaches 100% precision and 82% recall. Ablation studies confirm that the RAG architecture is essential: LLM baselines without retrieved context correctly identify victim hosts but miss all attack infrastructure, including malicious domains and command-and-control servers. These results demonstrate that combining targeted query-based filtering with RAG enables accurate, cost-effective security analysis within LLM context limits.
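The two-stage pipeline described above (query-based filtering tagged with MITRE ATT&CK techniques, followed by retrieval of context for a forensic question) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the system's actual implementation: the `Query` structure, the two regex queries, the log line formats, and the token-overlap ranking, which stands in for whatever retrieval method the system really uses.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """One hypothetical entry in the query library."""
    name: str
    technique: str  # MITRE ATT&CK technique ID associated with the query
    pattern: str    # regex; the first capture group is the extracted indicator


# Hypothetical query library; the log formats these patterns match are invented.
QUERY_LIBRARY = [
    Query("dns_lookup", "T1071.004", r"query:\s*(\S+)"),
    Query("failed_logon", "T1110", r"EventID=4625 .*?user=(\S+)"),
]


def _tokens(text):
    """Lowercased word tokens used by the toy retrieval step."""
    return set(re.findall(r"\w+", text.lower()))


def extract_indicators(log_lines, library=QUERY_LIBRARY):
    """Stage 1: keep only log lines matched by a library query."""
    hits = []
    for line in log_lines:
        for q in library:
            m = re.search(q.pattern, line)
            if m:
                hits.append({
                    "query": q.name,
                    "technique": q.technique,
                    "indicator": m.group(1),
                    "line": line,
                })
    return hits


def retrieve(hits, question, k=3):
    """Stage 2: rank hits by token overlap with the question.

    Token overlap is a simple stand-in for a real retriever.
    """
    q_tokens = _tokens(question)
    ranked = sorted(
        hits,
        key=lambda h: len(q_tokens & _tokens(h["line"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, context):
    """Assemble the filtered, retrieved evidence into an LLM prompt string."""
    evidence = "\n".join(
        f"- [{h['technique']}] {h['indicator']}: {h['line']}" for h in context
    )
    return f"Evidence:\n{evidence}\n\nQuestion: {question}"
```

Filtering before retrieval is what keeps the prompt within the model's context limit: only lines matched by a library query are candidates for retrieval, and only the top `k` of those reach the prompt.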