Unsupervised anomaly localization in high-resolution breast scans using deep pluralistic image completion

Automated tumor detection in Digital Breast Tomosynthesis (DBT) is difficult because tumors are naturally rare, breast tissue is highly variable, and the scans are high-resolution. Given the scarcity of abnormal images and the abundance of normal images for this problem, an anomaly detection/localization approach could be well suited. However, most anomaly localization research in machine learning focuses on non-medical datasets, and we find that these methods fall short when adapted to medical imaging datasets. The problem is alleviated when we approach the task from the image completion perspective, in which an anomaly is indicated by a discrepancy between the original appearance of a region and its auto-completion conditioned on the surroundings. However, there are often many valid normal completions for the same surroundings, especially in the DBT dataset, which makes this criterion less precise. To address this issue, we consider pluralistic image completion, exploring the distribution of possible completions instead of generating a single fixed prediction. We achieve this through a novel application of spatial dropout to the completion network at inference time only, which incurs no additional training cost and is effective at generating diverse completions. Building on these stochastic completions, we further propose minimum completion distance (MCD), a new metric for detecting anomalies. We provide theoretical as well as empirical support for the superiority of the proposed method over existing methods for anomaly localization. On the DBT dataset, our model outperforms other state-of-the-art methods by at least 10% AUROC for pixel-level detection.
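
As a rough illustration of the idea described above, the following NumPy sketch draws several stochastic completions by applying spatial (channel-wise) dropout at inference time and then scores each pixel by its smallest distance to any completion. It is a minimal sketch under stated assumptions, not the implementation used in this work: `toy_completion_net` is a hypothetical stand-in for a trained completion network, and the per-pixel absolute difference is an assumed distance chosen only for simplicity.

```python
import numpy as np


def spatial_dropout(features, p, rng):
    """Zero whole feature channels with probability p and rescale the rest.

    features has shape (C, H, W); p must be in [0, 1).
    """
    keep = rng.random(features.shape[0]) >= p
    return features * keep[:, None, None] / (1.0 - p)


def toy_completion_net(context_features, weights, p, rng):
    """Hypothetical stand-in for a completion network (an assumption).

    Applies spatial dropout to the context features, then mixes the
    channels linearly into a single-channel (H, W) completion.
    """
    dropped = spatial_dropout(context_features, p, rng)
    return np.tensordot(weights, dropped, axes=([0], [0]))


def minimum_completion_distance(original, completions):
    """Per-pixel minimum distance between the original and N completions.

    original has shape (H, W); completions has shape (N, H, W).
    The absolute difference used here is an assumed distance.
    """
    return np.abs(completions - original[None]).min(axis=0)


# Draw several completions from the same context; dropout stays active.
rng = np.random.default_rng(0)
context = rng.normal(size=(8, 4, 4))
weights = rng.normal(size=8)
completions = np.stack(
    [toy_completion_net(context, weights, p=0.5, rng=rng) for _ in range(16)]
)

# Take one completion as a "normal" original and perturb a single pixel.
original = completions[3].copy()
original[1, 2] += 5.0
anomaly_map = minimum_completion_distance(original, completions)
```

Taking the minimum rather than the mean means a region is flagged only when none of the sampled completions resembles it, which is the intuition behind tolerating many valid normal completions for the same surroundings.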
