Calibrating Tabular Anomaly Detection via Optimal Transport

Tabular anomaly detection (TAD) remains challenging due to the heterogeneity of tabular data: features lack natural relationships, vary widely in distribution and scale, and exhibit diverse types. Consequently, each TAD method makes implicit assumptions about anomaly patterns that hold on some datasets but fail on others, and no single method consistently outperforms the others across diverse scenarios. We present CTAD (Calibrating Tabular Anomaly Detection), a model-agnostic post-processing framework that enhances any existing TAD detector through sample-specific calibration. Our approach characterizes normal data via two complementary distributions, namely an empirical distribution obtained by random sampling and a structural distribution built from K-means centroids, and uses the Optimal Transport (OT) distance to measure how much adding a test sample disrupts their compatibility. Normal samples cause low disruption while anomalies cause high disruption, which provides a calibration signal that amplifies detection. We prove that the OT distance has a lower bound proportional to the test sample's distance from the centroids, and establish that anomalies systematically receive higher calibration scores than normal samples in expectation, explaining why the method generalizes across datasets. Extensive experiments on 34 diverse tabular datasets with 7 representative detectors spanning all major TAD categories (density estimation, classification, reconstruction, and isolation-based methods) show that CTAD yields consistent, statistically significant performance improvements. Notably, CTAD enhances even state-of-the-art deep learning methods and remains robust across diverse hyperparameter settings, requiring no additional tuning for practical deployment.
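To make the calibration signal concrete, the following is a minimal, dependency-free sketch of the idea described above, not the authors' implementation. Several choices are assumptions made only for illustration: the test sample is appended to the empirical (randomly sampled) set with uniform weights, the K-means centroids are fit on the full normal set and carry uniform weights, the ground cost is Euclidean distance, and the OT distance is approximated with log-domain Sinkhorn iterations (an entropic-regularized stand-in for exact OT). The names `calibration_score`, `sinkhorn_cost`, `kmeans`, and the parameters `eps` and `n_iter` are hypothetical. How the resulting score is combined with a base detector's output is not specified in the abstract and is left out.

```python
import math
import random


def dist(p, q):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(points, k, rng, n_iter=25):
    """Plain Lloyd iterations; returns a list of k centroids."""
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist(p, centroids[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids


def _logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def sinkhorn_cost(a, b, cost, eps=0.2, n_iter=300):
    """Transport cost <P, C> of the entropic-OT plan (log-domain Sinkhorn).

    This approximates the exact OT distance; eps and n_iter are
    illustrative defaults, not values taken from the paper.
    """
    n, m = len(a), len(b)
    log_a = [math.log(w) for w in a]
    log_b = [math.log(w) for w in b]
    f, g = [0.0] * n, [0.0] * m  # dual potentials
    for _ in range(n_iter):
        f = [-eps * _logsumexp([(g[j] - cost[i][j]) / eps + log_b[j]
                                for j in range(m)]) for i in range(n)]
        g = [-eps * _logsumexp([(f[i] - cost[i][j]) / eps + log_a[i]
                                for i in range(n)]) for j in range(m)]
    total = 0.0
    for i in range(n):
        for j in range(m):
            log_mass = (f[i] + g[j] - cost[i][j]) / eps + log_a[i] + log_b[j]
            total += math.exp(log_mass) * cost[i][j]
    return total


def calibration_score(x, reference, centroids):
    """Approximate OT cost between (reference + [x]) and the centroids."""
    points = reference + [x]
    a = [1.0 / len(points)] * len(points)        # uniform empirical weights
    b = [1.0 / len(centroids)] * len(centroids)  # uniform weights (assumed)
    cost = [[dist(p, c) for c in centroids] for p in points]
    return sinkhorn_cost(a, b, cost)


# Toy data: two Gaussian blobs stand in for "normal" tabular rows.
rng = random.Random(0)
normal = [[rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)]
          for cx, cy in [(0.0, 0.0), (4.0, 4.0)] for _ in range(100)]
centroids = kmeans(normal, k=2, rng=rng)  # structural distribution
reference = rng.sample(normal, 40)        # empirical distribution

score_normal = calibration_score([0.2, -0.1], reference, centroids)
score_anomaly = calibration_score([30.0, -30.0], reference, centroids)
print(f"normal-like point: {score_normal:.3f}")
print(f"far-away point:    {score_anomaly:.3f}")
```

In this setup the added point carries mass 1/(n+1), and all of that mass must travel at least as far as the nearest centroid, so the exact OT cost is at least min_k d(x, c_k)/(n+1). This matches the flavor of the lower bound stated above: a point far from every centroid cannot receive a small score.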
