AMAE: Adaptation of Pre-Trained Masked Autoencoder for Dual-Distribution Anomaly Detection in Chest X-Rays

Unsupervised anomaly detection in medical images such as chest radiographs is attracting growing attention because it reduces the reliance on labor-intensive and costly expert annotation of anomalous data. However, nearly all existing methods are formulated as one-class classification, trained only on representations of the normal class, and discard a potentially significant portion of the unlabeled data. This paper focuses on a more practical setting, dual-distribution anomaly detection for chest X-rays, which uses the entire training data, including both normal and unlabeled images. Inspired by a modern self-supervised vision transformer model trained on partial image inputs to reconstruct missing image regions, we propose AMAE, a two-stage algorithm for adapting the pre-trained masked autoencoder (MAE). Starting from MAE initialization, AMAE first creates synthetic anomalies from only normal training images and trains a lightweight classifier on frozen transformer features. Subsequently, we propose an adaptation strategy to leverage unlabeled images containing anomalies. The adaptation scheme assigns pseudo-labels to unlabeled images and uses two separate MAE-based modules to model the normative and anomalous distributions of the pseudo-labeled images. The effectiveness of the proposed adaptation strategy is evaluated with different anomaly ratios in the unlabeled training set. AMAE yields consistent performance gains over competing self-supervised and dual-distribution anomaly detection methods, setting a new state of the art on three public chest X-ray benchmarks: RSNA, NIH-CXR, and VinDr-CXR.
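
The pseudo-labeling and dual-module idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how pseudo-labels are assigned or how the two modules are combined at test time. The confidence thresholds `low` and `high`, the function names, and the use of a reconstruction-error difference as the anomaly score are all assumptions made purely for illustration, and the two reconstruction callables stand in for the normative and anomalous MAE-based modules.

```python
import numpy as np


def assign_pseudo_labels(anomaly_probs, low=0.3, high=0.7):
    """Assign pseudo-labels from classifier outputs (hypothetical rule).

    Returns 0 (pseudo-normal), 1 (pseudo-anomalous), or -1 (left unassigned).
    The thresholds `low` and `high` are illustrative assumptions.
    """
    probs = np.asarray(anomaly_probs, dtype=float)
    labels = np.full(probs.shape, -1, dtype=int)
    labels[probs <= low] = 0   # confidently normal
    labels[probs >= high] = 1  # confidently anomalous
    return labels


def dual_distribution_score(image, reconstruct_normal, reconstruct_anomalous):
    """Score an image with two reconstruction modules (assumed scoring rule).

    `reconstruct_normal` and `reconstruct_anomalous` are placeholder callables
    standing in for the two MAE-based modules. A higher score means the
    normative module reconstructs the image worse than the anomalous one.
    """
    image = np.asarray(image, dtype=float)
    err_normal = np.mean((image - reconstruct_normal(image)) ** 2)
    err_anomalous = np.mean((image - reconstruct_anomalous(image)) ** 2)
    return float(err_normal - err_anomalous)
```

Leaving mid-confidence images unassigned is one plausible way to limit the effect of noisy pseudo-labels; other assignment rules would fit the same description.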
