KRONE: Hierarchical and Modular Log Anomaly Detection

Log anomaly detection is crucial for uncovering system failures and security risks. Although logs originate from nested component executions with clear boundaries, this structure is lost when stored as flat sequences. As a result, state-of-the-art methods often miss true dependencies within executions while learning spurious correlations across unrelated events. We propose KRONE, the first hierarchical anomaly detection framework that automatically derives execution hierarchies from flat logs to enable modular, multi-level anomaly detection. At its core, the KRONE Log Abstraction Model extracts application-specific semantic hierarchies, which are used to recursively decompose log sequences into coherent execution units, referred to as KRONE Seqs. This transforms sequence-level detection into a set of modular KRONE Seq-level detection tasks. For each test KRONE Seq, KRONE adopts a hybrid modular detection strategy that routes between an efficient level-independent Local-Context detector for rapid filtering and a Nested-Aware detector that captures cross-level semantic dependencies, augmented with LLM-based anomaly detection and explanation. KRONE further optimizes detection through cached result reuse and early-exit strategies along the hierarchy. Experiments on three public benchmarks and one industrial dataset from ByteDance Cloud demonstrate that KRONE achieves substantial improvements in accuracy (42.49% to 87.98%), F1 score, data efficiency (117.3x reduction), resource efficiency (43.7x reduction), and interpretability. KRONE improves F1-score by 10.07% (82.76% to 92.83%) over prior methods while reducing LLM usage to only 1.1% to 3.3% of the test data. Code: https://github.com/LeiMa0324/KRONE Demo: https://leima0324.github.io/KRONE_Demo_official/

翻译：日志异常检测对于发现系统故障和安全风险至关重要。尽管日志源自具有清晰边界的嵌套组件执行过程，但在存储为平面序列时会丢失这种结构。因此，现有最先进的方法常常遗漏执行过程中的真实依赖关系，却从无关事件中学习到虚假关联。我们提出KRONE——首个能够自动从平面日志中推导执行层次结构、支持模块化多层级异常检测的层次化检测框架。其核心组件KRONE日志抽象模型可提取应用特定的语义层次结构，并据此将日志序列递归分解为连贯的执行单元（称为KRONE序列）。这将序列级检测转化为一组模块化的KRONE序列级检测任务。针对每个待检测的KRONE序列，KRONE采用混合模块化检测策略：在高效率的层级无关局部上下文检测器（用于快速过滤）与嵌套感知检测器（可捕获跨层级语义依赖并集成基于大语言模型的异常检测与解释）之间动态路由。KRONE还通过沿层级结构复用缓存结果和提前退出策略进一步优化检测效率。在三个公开基准数据集及字节跳动云工业数据集上的实验表明，KRONE在准确率（提升42.49%至87.98%）、F1分数、数据效率（降低117.3倍）、资源效率（降低43.7倍）和可解释性方面均实现显著提升。相较现有方法，KRONE在F1分数上提升10.07%（从82.76%提升至92.83%），同时将大语言模型的使用量降至测试数据的1.1%至3.3%。代码：https://github.com/LeiMa0324/KRONE 演示：https://leima0324.github.io/KRONE_Demo_official/