Deep learning has achieved strong results in medical image segmentation. However, prevailing models generalize poorly because of (i) intra-class variation, where the same class appears differently across samples, and (ii) inter-class independence, which makes it difficult to capture intricate relationships between distinct objects and leads to more false negatives. This paper presents a novel approach that synergizes spatial and spectral representations to enhance domain-generalized medical image segmentation. We introduce the Spectral Correlation Coefficient objective to improve the model's capacity to capture middle-order features and contextual long-range dependencies. This objective complements traditional spatial objectives by incorporating spectral information. Extensive experiments show that optimizing this objective with existing architectures such as UNet and TransUNet significantly improves generalization, interpretability, and noise robustness, and produces more confident predictions. For instance, in cardiac segmentation, we observe improvements in Dice similarity coefficient (DSC) of 0.81 and 1.63 percentage points (pp) over UNet and TransUNet, respectively. Our interpretability study demonstrates that, in most tasks, UNet optimized with our objective outperforms even TransUNet by introducing global contextual information alongside local details. These findings underscore the versatility and effectiveness of the proposed method across diverse imaging modalities and medical domains.
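The abstract names a Spectral Correlation Coefficient objective that complements a spatial objective but does not give its formula. The following is only an illustrative sketch of how such a pairing might look, assuming the spectral term is a Pearson correlation between the 2D Fourier amplitude spectra of the prediction and the ground truth, and assuming a Dice loss as the spatial term. The function names `spectral_correlation`, `dice`, and `combined_loss`, and the `weight` parameter, are hypothetical and are not taken from the paper.

```python
import numpy as np


def spectral_correlation(pred, target, eps=1e-8):
    """Pearson correlation between 2D Fourier amplitude spectra.

    Assumed formulation for illustration; the paper's definition may differ.
    """
    amp_pred = np.abs(np.fft.fft2(pred))
    amp_target = np.abs(np.fft.fft2(target))
    # Center both spectra so the result is a correlation, not a raw dot product.
    amp_pred = amp_pred - amp_pred.mean()
    amp_target = amp_target - amp_target.mean()
    denom = np.sqrt((amp_pred ** 2).sum() * (amp_target ** 2).sum()) + eps
    return float((amp_pred * amp_target).sum() / denom)


def dice(pred, target, eps=1e-8):
    """Soft Dice similarity coefficient (DSC) between two maps in [0, 1]."""
    intersection = (pred * target).sum()
    return float(2.0 * intersection / (pred.sum() + target.sum() + eps))


def combined_loss(pred, target, weight=0.5):
    """Spatial Dice loss plus a weighted spectral term (weight is hypothetical)."""
    spatial = 1.0 - dice(pred, target)
    spectral = 1.0 - spectral_correlation(pred, target)
    return spatial + weight * spectral


# Toy example: a square ground-truth mask and a non-overlapping prediction.
mask = np.zeros((16, 16))
mask[4:10, 5:12] = 1.0
other = np.zeros((16, 16))
other[0:2, 0:2] = 1.0

loss_perfect = combined_loss(mask, mask)
loss_wrong = combined_loss(other, mask)
```

In this sketch a perfect prediction drives both terms toward zero, while a non-overlapping prediction is penalized by the spatial term regardless of its spectrum. In a training setting the same computation would need a differentiable framework rather than NumPy.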