Deep-learning-based image enhancement models have substantially improved the readability of fundus images, reducing the uncertainty of clinical observations and the risk of misdiagnosis. However, because paired real fundus images of different quality are difficult to acquire, most existing methods rely on synthetic image pairs as training data. The domain shift between synthetic and real images inevitably hinders the generalization of such models to clinical data. In this work, we propose an end-to-end optimized teacher-student framework that performs image enhancement and domain adaptation simultaneously. The student network is trained on synthetic pairs for supervised enhancement, and the enhancement model is regularized to reduce domain shift by enforcing teacher-student prediction consistency on real fundus images, without relying on enhanced ground truth. We also propose a novel multi-stage multi-attention guided enhancement network (MAGE-Net) as the backbone of both the teacher and student networks. MAGE-Net uses a multi-stage enhancement module and a retinal structure preservation module to progressively integrate multi-scale features while preserving retinal structures, yielding better fundus image quality enhancement. Comprehensive experiments on both real and synthetic datasets demonstrate that our framework outperforms the baseline approaches. Moreover, our method also benefits downstream clinical tasks.
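As a rough illustration of the training objective described in the abstract, the sketch below combines a supervised term on a synthetic pair with a teacher-student consistency term on a real image that has no ground truth. The function names, the choice of a mean absolute (L1) distance, the weight `lam`, and the representation of images as flat lists of pixel values are all assumptions made for illustration; the abstract does not specify the actual losses, weighting, or how the teacher is updated.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized flat pixel lists."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def total_loss(student, teacher, synth_low, synth_high, real_low, lam=1.0):
    """Hypothetical combined objective (assumed form, not the paper's exact loss).

    student, teacher: callables mapping a low-quality image to an enhanced one.
    synth_low, synth_high: a synthetic degraded image and its clean target.
    real_low: a real low-quality image with no enhanced ground truth.
    lam: assumed weight balancing the two terms.
    """
    # Supervised term: student output vs. the clean target of the synthetic pair.
    supervised = l1_distance(student(synth_low), synth_high)
    # Consistency term: student vs. teacher prediction on the real image.
    consistency = l1_distance(student(real_low), teacher(real_low))
    return supervised + lam * consistency
```

In this sketch only the consistency term touches real images, which is what allows real clinical data to influence training without paired targets. In practice the two callables would be the teacher and student networks, and the teacher's output would typically be treated as a fixed target when computing gradients; that detail is likewise an assumption here.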