Anomaly detection is crucial in various domains, such as finance, healthcare, and cybersecurity. In this paper, we propose a novel deep anomaly detection method for tabular data that leverages Non-Parametric Transformers (NPTs), a model initially proposed for supervised tasks, to capture both feature-feature and sample-sample dependencies. In a reconstruction-based framework, we train the NPT model to reconstruct masked features of normal samples. At inference, we use the model's ability to reconstruct the masked features to generate an anomaly score. To the best of our knowledge, our proposed method is the first to combine both feature-feature and sample-sample dependencies for anomaly detection on tabular datasets. We evaluate our method on an extensive benchmark of tabular datasets and demonstrate that our approach outperforms existing state-of-the-art methods in terms of both F1-score and AUROC. Moreover, our work opens up new research directions for exploring the potential of NPTs for other tasks on tabular data.
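As a rough illustration of the masked-reconstruction scoring idea, the following minimal sketch masks each feature of a sample in turn, reconstructs it from normal training data, and sums the squared reconstruction errors into an anomaly score. It does not implement an NPT: the reconstruction step is a simple k-nearest-neighbour average, used here only as a stand-in that draws on both the remaining features of the sample and other samples in the training set. The function names, the synthetic data, and the choice of `k` are assumptions made for this example.

```python
import random


def masked_reconstruction_score(sample, normal_train, k=5):
    """Sum of squared errors when each feature is masked in turn and
    reconstructed from the k nearest normal training samples.

    Stand-in for a learned reconstruction model (assumption, not an NPT).
    """
    n_features = len(sample)
    score = 0.0
    for j in range(n_features):
        visible = [i for i in range(n_features) if i != j]
        # Distance to every normal sample, using only the unmasked features,
        # paired with that sample's value for the masked feature.
        candidates = sorted(
            (sum((sample[i] - row[i]) ** 2 for i in visible), row[j])
            for row in normal_train
        )
        neighbours = candidates[:k]
        reconstruction = sum(value for _, value in neighbours) / len(neighbours)
        score += (sample[j] - reconstruction) ** 2
    return score


def make_normal(rng):
    """Synthetic 'normal' sample whose three features are strongly related."""
    x = rng.uniform(-1.0, 1.0)
    return [x, 2.0 * x + rng.gauss(0.0, 0.05), -x + rng.gauss(0.0, 0.05)]


rng = random.Random(0)
normal_train = [make_normal(rng) for _ in range(200)]
normal_test = make_normal(rng)
anomaly = [0.5, -1.0, 0.5]  # violates the relationships between features

normal_score = masked_reconstruction_score(normal_test, normal_train)
anomaly_score = masked_reconstruction_score(anomaly, normal_train)
print(f"normal: {normal_score:.4f}  anomaly: {anomaly_score:.4f}")
```

In this toy setting the anomalous sample is hard to reconstruct from normal data and therefore receives the higher score. In the method described above, the reconstruction would instead come from the trained NPT.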