Time series anomaly detection is important in modern large-scale systems and is applied across diverse domains to analyze and monitor system operation. Unsupervised approaches have received widespread interest because they do not require anomaly labels during training, avoiding potentially high labeling costs and broadening applicability. Among these, autoencoders have received extensive attention. They use reconstruction errors from compressed representations to define anomaly scores. However, the representations learned by autoencoders are sensitive to anomalies in the training time series, reducing detection accuracy. We propose a novel encode-then-decompose paradigm that decomposes the encoded representation into a stable and an auxiliary representation, thereby enhancing robustness when training on contaminated time series. In addition, we propose a novel mutual information based metric that replaces reconstruction errors for identifying anomalies. Our proposal demonstrates competitive or state-of-the-art performance on eight commonly used multivariate and univariate time series benchmarks and exhibits robustness to time series with different contamination ratios.
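The two ingredients named above, decomposing a representation into a stable part plus an auxiliary residual, and scoring anomalies by mutual information rather than reconstruction error, can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not specify the decomposition or the MI estimator, so the sketch substitutes a moving-average split for the decomposition and a simple 2-D histogram estimate for mutual information; all function names (`mutual_information`, `decompose`, `anomaly_score`) are hypothetical.

```python
import numpy as np

def mutual_information(a, b, bins=8):
    """Histogram estimate of mutual information between two 1-D series (in nats).
    A crude stand-in for whatever estimator the paper actually uses."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)
    nz = p > 0  # avoid log(0) on empty histogram cells
    return float(np.sum(p[nz] * np.log(p[nz] / np.outer(px, py)[nz])))

def decompose(z, k=5):
    """Split a 1-D representation into a stable (smoothed) part and the
    auxiliary residual; stable + auxiliary reconstructs z exactly.
    (Illustrative moving-average split, not the paper's decomposition.)"""
    kernel = np.ones(k) / k
    stable = np.convolve(z, kernel, mode="same")
    return stable, z - stable

def anomaly_score(x, x_hat):
    """Negated MI between a window and its reconstruction:
    weaker statistical dependence yields a higher anomaly score."""
    return -mutual_information(x, x_hat)
```

A window whose reconstruction shares little information with the input (e.g. an anomalous segment the stable representation cannot explain) then receives a high score, whereas under a pure reconstruction-error score a single large spike in the training data can dominate the learned representation.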