Generative models are popular for medical imaging tasks such as anomaly detection, feature extraction, data visualization, and image generation. Since they are parameterized by deep neural networks, they are often sensitive to distribution shifts and unreliable when applied to out-of-distribution data, creating a risk of, e.g., underrepresentation bias. This behavior can be flagged using uncertainty quantification (UQ) methods for generative models, but their availability remains limited. We propose SLUG: a new UQ method for VAEs that combines recent advances in Laplace approximations with stochastic trace estimators to scale gracefully with image dimensionality. We show that our UQ score -- unlike the VAE's encoder variances -- correlates strongly with reconstruction error and with racial underrepresentation bias for dermatological images. We also show how pixel-wise uncertainty can detect out-of-distribution image content such as ink, rulers, and patches, which are known to induce learning shortcuts in predictive models.
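The abstract's claim of graceful scaling rests on stochastic trace estimation: the trace of a high-dimensional matrix (e.g. a Laplace posterior covariance) can be estimated from matrix-vector products alone, without ever materializing the matrix. Below is a minimal sketch of the standard Hutchinson estimator in NumPy; it is an illustration of the general technique, not SLUG's actual implementation, and all names are our own.

```python
import numpy as np

def hutchinson_trace(matvec, dim, num_samples=1000, seed=None):
    """Estimate tr(A) given only a matrix-vector product v -> A @ v.

    Hutchinson's identity: E[v^T A v] = tr(A) when the entries of v
    are i.i.d. Rademacher (+/-1). This avoids forming A explicitly,
    which is what makes curvature/covariance traces tractable when
    dim is the number of pixels in an image.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_samples):
        v = rng.choice([-1.0, 1.0], size=dim)  # Rademacher probe vector
        total += v @ matvec(v)                 # one quadratic form v^T A v
    return total / num_samples

# Demo: recover the trace of a random SPD matrix from matvecs only.
rng = np.random.default_rng(0)
B = rng.standard_normal((50, 50))
A = B @ B.T  # symmetric positive semi-definite test matrix
est = hutchinson_trace(lambda v: A @ v, dim=50, num_samples=2000, seed=1)
rel_err = abs(est - np.trace(A)) / np.trace(A)
print(rel_err)  # small relative error; shrinks as num_samples grows
```

The estimator's cost is `num_samples` matvecs, independent of `dim**2` storage, which is the scaling property the abstract alludes to.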