In real-world systems, mechanical defects distort observed values and cause anomalies in multivariate time series such as sensor readings or network data. Detecting anomalies in such data requires understanding the temporal context and the interrelations between variables at the same time. Anomaly detection in time series, especially with unlabeled data, remains a challenging problem; we address it by applying a suitable data degradation scheme to self-supervised model training. We define four types of synthetic outliers and propose a degradation scheme in which a portion of the input data is replaced with one of these outliers. Inspired by the self-attention mechanism, we design a Transformer-based architecture that recognizes the temporal context and detects unnatural sequences efficiently. Our model converts multivariate data points into temporal representations with relative position bias and yields anomaly scores from these representations. Our method, AnomalyBERT, is highly capable of detecting anomalies in complex time series and surpasses previous state-of-the-art methods on five real-world benchmarks. Our code is available at https://github.com/Jhryu30/AnomalyBERT.
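The degradation scheme can be illustrated with a minimal sketch. It is not the implementation in the linked repository. The abstract states that there are four synthetic outlier types but does not name them, so the four kinds below (`constant`, `noise`, `spike`, `scale`), the function name `degrade_window`, and the `max_frac` parameter are all hypothetical choices made for illustration. The sketch replaces a random interval in one variable of a window and returns per-timestep labels marking the replaced interval.

```python
import numpy as np

# Hypothetical outlier kinds for illustration only; the abstract says there
# are four types but does not name them.
KINDS = ("constant", "noise", "spike", "scale")


def degrade_window(window, rng, kind=None, max_frac=0.2):
    """Replace a random interval of one variable with a synthetic outlier.

    window:   array of shape (length, n_features)
    rng:      numpy.random.Generator
    kind:     one of KINDS, or None to pick one at random
    max_frac: assumed upper bound on the degraded fraction of the window

    Returns (degraded, labels): a degraded float copy of the window and an
    integer array of shape (length,) that is 1 on the replaced interval.
    """
    length, n_features = window.shape
    degraded = window.astype(float)  # astype returns a copy by default
    labels = np.zeros(length, dtype=np.int64)

    if kind is None:
        kind = KINDS[int(rng.integers(len(KINDS)))]

    # Choose which interval and which variable to degrade.
    max_span = max(1, int(length * max_frac))
    span = 1 if kind == "spike" else int(rng.integers(1, max_span + 1))
    start = int(rng.integers(0, length - span + 1))
    stop = start + span
    col = int(rng.integers(n_features))

    lo, hi = window[:, col].min(), window[:, col].max()
    spread = (hi - lo) or 1.0  # avoid a zero spread for constant signals

    if kind == "constant":
        # Flatten the interval to a single value within the observed range.
        degraded[start:stop, col] = rng.uniform(lo, hi)
    elif kind == "noise":
        # Add Gaussian noise on the scale of the signal's range.
        degraded[start:stop, col] += rng.normal(0.0, spread, size=span)
    elif kind == "spike":
        # Put a single point well above the observed maximum.
        degraded[start, col] = hi + spread
    elif kind == "scale":
        # Amplify the interval by a fixed factor.
        degraded[start:stop, col] *= 3.0
    else:
        raise ValueError(f"unknown outlier kind: {kind!r}")

    labels[start:stop] = 1
    return degraded, labels
```

In a self-supervised setup of this kind, the returned labels would plausibly serve as per-timestep targets for the model's anomaly scores, so no manually labeled anomalies are needed during training.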