Modern Network Intrusion Detection Systems (NIDS) generate vast volumes of low-level alerts, yet these outputs remain semantically fragmented and must be correlated with high-level adversarial behaviors by hand, which is labor-intensive. Existing approaches to automating this mapping, namely rule-based systems and machine learning classifiers, suffer from critical limitations: rule-based approaches fail to adapt to novel attack variations, while machine learning methods lack contextual awareness and treat tactic-technique mapping as a syntactic matching problem rather than a reasoning task. Although Large Language Models (LLMs) have shown promise in cybersecurity tasks, preliminary experiments reveal that existing LLM-based methods frequently hallucinate technique names or produce decontextualized mappings because they classify in a single step. To address these challenges, we introduce RHINO, a novel framework that decomposes LLM-based attack analysis into three interpretable phases mirroring human reasoning: (1) behavioral abstraction, where raw logs are translated into contextualized narratives; (2) multi-role collaborative inference, where candidate techniques are generated by evaluating behavioral evidence against MITRE ATT&CK knowledge; and (3) validation, where predictions are cross-referenced with official MITRE definitions to rectify hallucinations. RHINO bridges the semantic gap between low-level observations and adversarial intent while improving output reliability through structured reasoning. We evaluate RHINO on three benchmarks across four backbone models. RHINO achieves high accuracy, ranging from 86.38% to 88.45% depending on the backbone model, which corresponds to relative gains of 24.25% to 76.50% across models. Our results demonstrate that RHINO significantly enhances the interpretability and scalability of threat analysis, offering a blueprint for deploying LLMs in operational security settings.
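The three phases described above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not RHINO's implementation: the function names, prompts, role names, the sample alert string, and the stubbed `fake_llm` are all hypothetical, and validation is reduced here to a membership check against a three-entry excerpt of MITRE ATT&CK technique IDs rather than a comparison against full technique definitions.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for an LLM call: prompt text in, reply text out.
LLM = Callable[[str], str]

# Small excerpt of MITRE ATT&CK technique IDs and names (illustrative subset).
ATTACK_CATALOG: Dict[str, str] = {
    "T1046": "Network Service Discovery",
    "T1110": "Brute Force",
    "T1190": "Exploit Public-Facing Application",
}


def abstract_behavior(alert: str, llm: LLM) -> str:
    """Phase 1: turn a raw alert into a short behavioral narrative."""
    return llm(f"Describe the behavior behind this alert:\n{alert}")


def infer_candidates(narrative: str, roles: List[str], llm: LLM) -> List[str]:
    """Phase 2: each role proposes technique IDs; merge them in first-seen order."""
    candidates: List[str] = []
    for role in roles:
        reply = llm(f"Role: {role}. List ATT&CK technique IDs for:\n{narrative}")
        for tid in reply.replace(",", " ").split():
            if tid not in candidates:
                candidates.append(tid)
    return candidates


def validate(candidates: List[str], catalog: Dict[str, str]) -> List[str]:
    """Phase 3 (simplified): drop IDs that do not exist in the catalog."""
    return [tid for tid in candidates if tid in catalog]


def analyze(alert: str, roles: List[str], llm: LLM,
            catalog: Dict[str, str] = ATTACK_CATALOG) -> List[str]:
    """Run the three phases in sequence and return the validated technique IDs."""
    narrative = abstract_behavior(alert, llm)
    candidates = infer_candidates(narrative, roles, llm)
    return validate(candidates, catalog)


def fake_llm(prompt: str) -> str:
    """Deterministic stub so the sketch runs without a model; replies are made up."""
    if prompt.startswith("Describe"):
        return "One host probes many ports on a server in quick succession."
    if "Role: defender" in prompt:
        return "T1046, T9999"  # T9999 plays the part of a hallucinated ID
    return "T1046 T1110"


if __name__ == "__main__":
    # Made-up alert text, for illustration only.
    alert = "possible port scan from 10.0.0.5 to 10.0.0.9"
    print(analyze(alert, ["attacker", "defender"], fake_llm))
```

In this toy run the stub's second role proposes a nonexistent ID, which the validation step discards. Separating the phases this way keeps each intermediate result (narrative, candidate list, validated list) inspectable on its own.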