Revisiting OmniAnomaly for Anomaly Detection: performance metrics and comparison with PCA-based models

Deep learning models have become the dominant approach for multivariate time series anomaly detection (MTSAD), often reporting substantial performance improvements over classical statistical methods. However, these gains are frequently measured under heterogeneous thresholding strategies and evaluation protocols, making fair comparisons difficult. This work revisits OmniAnomaly, a widely used stochastic recurrent model for MTSAD, and systematically compares it with a simple linear baseline based on Principal Component Analysis (PCA) on the Server Machine Dataset (SMD). Both methods are evaluated under identical thresholding and evaluation procedures, with experiments repeated across 100 runs for each of the 28 machines in the dataset. Performance is measured using Precision, Recall, and F1-score at the point level, with and without point-adjustment, and under different strategies for aggregating across machines and runs; the corresponding standard deviations are also reported. The results reveal large variability across machines and show that PCA can achieve performance comparable to OmniAnomaly, and can even outperform it when point-adjustment is not applied. These findings question the added value of more complex architectures under current benchmarking practices and highlight the critical role of evaluation methodology in MTSAD research.
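To make the difference between plain point-level scoring and point-adjusted scoring concrete, here is a minimal sketch in Python. It assumes the commonly used definition of point-adjustment, in which every point of a ground-truth anomaly segment counts as detected once any single point in that segment is flagged. The function names `point_adjust` and `precision_recall_f1` are hypothetical and are not taken from the paper.

```python
import numpy as np

def point_adjust(pred, label):
    """Return a copy of pred where each true anomaly segment is fully
    flagged if at least one of its points was flagged (assumed definition)."""
    pred = np.asarray(pred, dtype=bool).copy()
    label = np.asarray(label, dtype=bool)
    # Locate contiguous runs of anomalous labels: [start, end) index pairs.
    padded = np.concatenate(([False], label, [False])).astype(int)
    diff = np.diff(padded)
    starts = np.flatnonzero(diff == 1)
    ends = np.flatnonzero(diff == -1)
    for s, e in zip(starts, ends):
        if pred[s:e].any():
            pred[s:e] = True
    return pred

def precision_recall_f1(pred, label):
    """Point-level Precision, Recall and F1 for binary predictions."""
    pred = np.asarray(pred, dtype=bool)
    label = np.asarray(label, dtype=bool)
    tp = int(np.sum(pred & label))
    fp = int(np.sum(pred & ~label))
    fn = int(np.sum(~pred & label))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Toy example: two anomaly segments, only the first is partially detected.
label = [0, 1, 1, 1, 0, 0, 1, 1, 0, 0]
pred = [0, 0, 1, 0, 0, 1, 0, 0, 0, 0]

raw = precision_recall_f1(pred, label)
adjusted = precision_recall_f1(point_adjust(pred, label), label)
print("raw     :", raw)
print("adjusted:", adjusted)
```

In this toy case a single hit inside the first segment raises Recall from 0.2 to 0.6, which illustrates why scores with and without point-adjustment are not directly comparable.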

Related content

PCA

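In statistics, principal component analysis (PCA) is a method for projecting data from a higher-dimensional space into a lower-dimensional space by maximizing the variance along each projected dimension. Given a set of points in two, three, or more dimensions, a "best-fitting" line can be defined as the line that minimizes the average squared distance from the points to the line. The next best-fitting line can be chosen in the same way from the directions perpendicular to the first. Repeating this process yields an orthogonal basis in which the individual dimensions of the data are uncorrelated. These basis vectors are called the principal components.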
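One common way to turn PCA into an anomaly detector is to score each sample by its reconstruction error after projecting onto the leading principal components. The sketch below illustrates that idea with NumPy on synthetic data. It is not necessarily the exact baseline used in the paper, and the helper names, the number of components, and the 99% quantile threshold are all assumptions chosen for illustration.

```python
import numpy as np

def fit_pca(train, n_components):
    """Estimate the mean and the top principal directions via SVD."""
    mean = train.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_components]

def anomaly_scores(x, mean, components):
    """Squared reconstruction error of each row after PCA projection."""
    centered = x - mean
    reconstructed = centered @ components.T @ components
    return np.sum((centered - reconstructed) ** 2, axis=1)

# Synthetic data (assumption): 5 features driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
mixing = np.array([[1.0, 0.0, 1.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0, 1.0, 0.0]])
train = rng.normal(size=(500, 2)) @ mixing + 0.01 * rng.normal(size=(500, 5))
test = rng.normal(size=(50, 2)) @ mixing + 0.01 * rng.normal(size=(50, 5))
test[10, 0] += 5.0  # inject one point that leaves the normal subspace

mean, components = fit_pca(train, n_components=2)
scores = anomaly_scores(test, mean, components)

# Hypothetical thresholding rule: 99% quantile of the training scores.
threshold = np.quantile(anomaly_scores(train, mean, components), 0.99)
flagged = np.flatnonzero(scores > threshold)
print("most anomalous index:", int(np.argmax(scores)))
print("flagged indices:", flagged.tolist())
```

Because the score depends only on the residual outside the retained subspace, the choice of `n_components` and of the threshold largely determines what gets flagged, which is why identical thresholding matters when comparing detectors.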
