Skin lesion analysis models are biased by artifacts placed during image acquisition, which influence model predictions despite carrying no clinical information. Solutions that address this problem by regularizing models to prevent learning those spurious features achieve only partial success, and existing test-time debiasing techniques are inappropriate for skin lesion analysis because they either make unrealistic assumptions about the distribution of test data or require laborious annotation from medical practitioners. We propose TTS (Test-Time Selection), a human-in-the-loop method that leverages positive (e.g., lesion area) and negative (e.g., artifacts) keypoints in test samples. TTS effectively steers models away from exploiting spurious artifact-related correlations without retraining and with fewer annotation requirements. Our solution is robust to varying availability of annotations and to different levels of bias. On the ISIC2019 dataset, for which we release a subset of annotated images, we showcase how our model could be deployed in the real world to mitigate bias.
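The abstract does not spell out how the keypoints drive the selection, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that selection operates on the channels of a convolutional feature map, that keypoints are given as (row, column) positions in feature-map coordinates, and that a channel is scored by its mean activation at positive keypoints minus its mean activation at negative keypoints. The function names `keypoint_channel_mask` and `masked_pooled_features` and the `keep_ratio` parameter are hypothetical.

```python
import numpy as np


def keypoint_channel_mask(feature_map, pos_kps, neg_kps, keep_ratio=0.5):
    """Return a boolean mask over channels (hypothetical helper).

    feature_map: array of shape (C, H, W) from some backbone (assumed).
    pos_kps / neg_kps: lists of (row, col) positions in feature-map
    coordinates, e.g. lesion points and artifact points.
    keep_ratio: assumed fraction of channels to keep.
    """
    feature_map = np.asarray(feature_map, dtype=float)
    n_channels = feature_map.shape[0]
    score = np.zeros(n_channels)

    # Reward channels that respond at positive keypoints.
    if len(pos_kps) > 0:
        rows, cols = zip(*pos_kps)
        score += feature_map[:, list(rows), list(cols)].mean(axis=1)

    # Penalize channels that respond at negative keypoints.
    if len(neg_kps) > 0:
        rows, cols = zip(*neg_kps)
        score -= feature_map[:, list(rows), list(cols)].mean(axis=1)

    # Keep the highest-scoring channels; always keep at least one.
    n_keep = max(1, int(round(keep_ratio * n_channels)))
    keep = np.argsort(-score, kind="stable")[:n_keep]
    mask = np.zeros(n_channels, dtype=bool)
    mask[keep] = True
    return mask


def masked_pooled_features(feature_map, mask):
    """Global-average-pool the feature map, then zero out dropped channels."""
    pooled = np.asarray(feature_map, dtype=float).mean(axis=(1, 2))
    return pooled * mask
```

Under these assumptions, the masked feature vector would be passed to the unchanged classification head, so no retraining is involved. When either keypoint list is empty, the sketch simply scores channels with whatever annotation is available.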