State-of-the-art machine learning models often learn spurious correlations embedded in the training data. This poses risks when deploying these models for high-stakes decision-making, such as in medical applications like skin cancer detection. To tackle this problem, we propose Reveal to Revise (R2R), a framework covering the entire eXplainable Artificial Intelligence (XAI) life cycle, enabling practitioners to iteratively identify, mitigate, and (re-)evaluate spurious model behavior with a minimal amount of human interaction. In the first step (1), R2R reveals model weaknesses by finding outliers in attributions or through inspection of latent concepts learned by the model. In the second step (2), the responsible artifacts are detected and spatially localized in the input data, and this localization is then leveraged to (3) revise the model behavior. Concretely, we apply the methods of RRR, CDEP and ClArC for model correction, and (4) (re-)evaluate the model's performance and remaining sensitivity towards the artifact. Using two medical benchmark datasets for melanoma detection and bone age estimation, we apply our R2R framework to VGG, ResNet and EfficientNet architectures and thereby reveal and correct real dataset-intrinsic artifacts, as well as synthetic variants in a controlled setting. Completing the XAI life cycle, we demonstrate multiple R2R iterations to mitigate different biases. Code is available at https://github.com/maxdreyer/Reveal2Revise.
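To make step (3) more concrete, the following is a minimal sketch of an RRR-style objective, in which the task loss is augmented by a penalty on input gradients that fall inside the localized artifact region. It is an illustration under stated assumptions, not the implementation in the linked repository: the function names `masked_gradient_penalty` and `rrr_style_loss`, the weight `lam`, and the toy values are all hypothetical, and the input gradient is assumed to be supplied by an autodiff framework.

```python
import math
from typing import Sequence

Grid = Sequence[Sequence[float]]


def masked_gradient_penalty(input_grad: Grid, artifact_mask: Grid) -> float:
    """Sum of squared input gradients inside the artifact mask.

    input_grad    : gradient of the model output w.r.t. the input pixels
                    (assumed to be computed elsewhere, e.g. by autodiff).
    artifact_mask : 1.0 where the artifact was localized, 0.0 elsewhere.
    """
    if len(input_grad) != len(artifact_mask):
        raise ValueError("gradient and mask must have the same number of rows")
    total = 0.0
    for grad_row, mask_row in zip(input_grad, artifact_mask):
        if len(grad_row) != len(mask_row):
            raise ValueError("gradient and mask rows must have the same length")
        for g, m in zip(grad_row, mask_row):
            total += (m * g) ** 2
    return total


def rrr_style_loss(task_loss: float, input_grad: Grid,
                   artifact_mask: Grid, lam: float = 1.0) -> float:
    """Task loss plus a weighted penalty for relying on the masked region."""
    return task_loss + lam * masked_gradient_penalty(input_grad, artifact_mask)


# Toy 2x2 "image": only the top-right pixel is flagged as an artifact.
grad = [[0.5, -2.0],
        [0.0, 1.0]]
mask = [[0.0, 1.0],
        [0.0, 0.0]]

penalty = masked_gradient_penalty(grad, mask)
loss = rrr_style_loss(0.3, grad, mask, lam=0.1)
print(penalty, loss)
```

Only the gradient at the flagged pixel contributes to the penalty, so minimizing the combined loss discourages the model from being sensitive to the artifact while leaving the rest of the input unconstrained. The mask here plays the role of the spatial localization produced in step (2).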