OCSVM-Guided Representation Learning for Unsupervised Anomaly Detection

Unsupervised anomaly detection (UAD) aims to detect anomalies without labeled data, a necessity in many machine learning applications where anomalous samples are rare or unavailable. Most state-of-the-art methods fall into two categories: reconstruction-based approaches, which often reconstruct anomalies too well, and decoupled representation learning with density estimators, which can suffer from suboptimal feature spaces. While some recent methods attempt to couple feature learning and anomaly detection, they often rely on surrogate objectives, restrict kernel choices, or introduce approximations that limit their expressiveness and robustness. To address this challenge, we propose a novel method that couples representation learning with an analytically solvable One-Class SVM (OCSVM) through a custom loss formulation that directly aligns latent features with the OCSVM decision boundary. The model is evaluated on two tasks: a benchmark based on MNIST-C and a challenging brain MRI lesion detection task. Unlike most methods, which focus on large, hyperintense lesions at the image level, our approach succeeds in targeting small, non-hyperintense lesions, and we evaluate it with voxel-wise metrics, addressing a more clinically relevant scenario. Both experiments evaluate a form of robustness to domain shifts, including corruption types in MNIST-C and texture or population age variations in MRI. Results demonstrate the performance and robustness of our proposed model, highlighting its potential for general UAD and real-world medical imaging applications. The source code is available at https://github.com/Nicolas-Pinon/uad_ocsvm_guided_repr_learning.
