We describe a method for verifying the output of a deep neural network for medical image segmentation that is robust to several classes of random perturbations as well as worst-case perturbations, i.e., adversarial attacks. The method builds on a general approach recently developed by the authors, called "Trust, but Verify", in which an auxiliary verification network takes the segmentation as an input and predicts certain masked features of the input image. A well-designed auxiliary network produces high-quality predictions when the input segmentation is accurate and low-quality predictions when it is incorrect. Checking the predictions of such a network against the original image therefore allows us to detect bad segmentations. However, to ensure that the verification method is truly robust, the check on prediction quality must not itself rely on a black-box neural network. Indeed, we show that previous methods for segmentation evaluation that do use deep neural regression networks are vulnerable to false negatives, i.e., they can inaccurately label bad segmentations as good. We describe the design of a verification network that avoids this vulnerability and present results that demonstrate its robustness compared to previous methods.
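The checking step described above can be illustrated with a minimal sketch. It assumes the auxiliary network has already produced a prediction for the masked region, and it compares that prediction with the original image using a plain mean squared error and a fixed threshold rather than a learned scorer. The function names, the choice of mean squared error, and the threshold value are all illustrative assumptions, not the authors' actual design.

```python
import numpy as np


def masked_prediction_error(image, prediction, mask):
    """Mean squared error between prediction and image over masked pixels.

    Hypothetical helper: the abstract does not specify the error measure.
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("mask selects no pixels")
    diff = np.asarray(prediction, dtype=float)[mask] - np.asarray(image, dtype=float)[mask]
    return float(np.mean(diff ** 2))


def accept_segmentation(image, prediction, mask, threshold=0.01):
    """Accept the segmentation if the masked prediction error is small.

    The threshold is an assumed placeholder; in practice it would need to be
    calibrated on data. No neural network is involved in this decision.
    """
    return masked_prediction_error(image, prediction, mask) <= threshold
```

Because the decision depends only on a fixed, closed-form comparison, the check itself offers no learned component for an adversary to exploit. Whether such a simple measure is adequate in practice depends on how the verification network is designed.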