Towards One-for-All Anomaly Detection for Tabular Data

Tabular anomaly detection (TAD) aims to identify samples that deviate from the majority in tabular data and is critical in many real-world applications. However, existing methods follow a "one model for one dataset" (OFO) paradigm, which relies on dataset-specific training and therefore incurs high computational cost and generalizes poorly to unseen domains. To address these limitations, we propose OFA-TAD, a generalist one-for-all (OFA) TAD framework that requires only one-time training on multiple source datasets and can then generalize on the fly to unseen datasets from diverse domains. To realize one-for-all tabular anomaly detection, OFA-TAD extracts neighbor-distance patterns as transferable cues and introduces multi-view neighbor-distance representations, computed in multiple transformation-induced metric spaces, to mitigate the sensitivity of distance profiles to the choice of transformation. To adaptively combine the multi-view distance evidence, a Mixture-of-Experts (MoE) scoring network performs view-specific anomaly scoring and entropy-regularized gated fusion, while a multi-strategy anomaly synthesis mechanism supports training under the one-class constraint. Extensive experiments on 34 datasets from 14 domains demonstrate that OFA-TAD achieves superior anomaly detection performance and strong cross-domain generalizability under the strict OFA setting. The source code is available at https://github.com/Shiy-Li/OFA-TAD.
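To make the two central ideas of the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes three illustrative views (raw, z-score, and min-max scaling), Euclidean distance, and a plain softmax gate; the function names `knn_distance_profile`, `multi_view_profiles`, and `gated_fusion` are hypothetical, and the abstract does not specify the actual transformations, expert networks, or the sign and weight of the entropy term.

```python
import numpy as np


def knn_distance_profile(X_query, X_ref, k=5):
    """Ascending distances from each query row to its k nearest reference rows."""
    # Pairwise Euclidean distances, shape (n_query, n_ref).
    diff = X_query[:, None, :] - X_ref[None, :, :]
    dists = np.sqrt((diff ** 2).sum(axis=-1))
    # Sort each row and keep the k smallest distances.
    return np.sort(dists, axis=1)[:, :k]


def multi_view_profiles(X_query, X_ref, k=5):
    """Stack k-NN distance profiles from several transformed feature spaces.

    The three views below are assumptions chosen for illustration only.
    Returns an array of shape (n_query, n_views, k).
    """
    mean, std = X_ref.mean(axis=0), X_ref.std(axis=0) + 1e-8
    lo, hi = X_ref.min(axis=0), X_ref.max(axis=0)
    views = [
        lambda X: X,                          # raw features
        lambda X: (X - mean) / std,           # z-score scaling
        lambda X: (X - lo) / (hi - lo + 1e-8),  # min-max scaling
    ]
    profiles = [knn_distance_profile(f(X_query), f(X_ref), k) for f in views]
    return np.stack(profiles, axis=1)


def gated_fusion(view_scores, gate_logits, eps=1e-12):
    """Fuse per-view scores with softmax gate weights.

    view_scores, gate_logits: arrays of shape (n_samples, n_views).
    Returns the fused score and the gate entropy per sample; how the
    entropy enters the training loss is left open here.
    """
    shifted = gate_logits - gate_logits.max(axis=1, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=1, keepdims=True)
    fused = (weights * view_scores).sum(axis=1)
    entropy = -(weights * np.log(weights + eps)).sum(axis=1)
    return fused, entropy
```

In this sketch the query rows are kept separate from the reference rows, so no sample is counted as its own neighbor. A point far from the reference data yields a profile of large distances in every view, which is the kind of dataset-agnostic cue the abstract describes as transferable.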

