JOINEDTrans: Prior Guided Multi-task Transformer for Joint Optic Disc/Cup Segmentation and Fovea Detection

Deep learning-based image segmentation and detection models have greatly improved the efficiency of analyzing retinal landmarks such as the optic disc (OD), optic cup (OC), and fovea. However, factors such as lesions related to ophthalmic diseases and low image quality can severely complicate automatic OD/OC segmentation and fovea detection. Most existing works treat the identification of each landmark as a separate task and do not take prior information into account. To address these issues, we propose JOINEDTrans, a prior guided multi-task transformer framework for joint OD/OC segmentation and fovea detection. JOINEDTrans combines various spatial features of fundus images, mitigating the structural distortions induced by lesions and other imaging issues. It contains a segmentation branch and a detection branch. Notably, we employ an encoder pretrained on a vessel segmentation task to exploit the positional relationships among the vessels, OD/OC, and fovea, thereby incorporating spatial priors into the framework. JOINEDTrans operates in a coarse stage and a fine stage. In the coarse stage, a joint segmentation and detection module produces a coarse OD/OC segmentation and a fovea heatmap localization. In the fine stage, we crop regions of interest for subsequent refinement and use the coarse-stage predictions as additional information for better performance and faster convergence. Experimental results demonstrate that JOINEDTrans outperforms existing state-of-the-art methods on the publicly available GAMMA, REFUGE, and PALM fundus image datasets. Our code is available at https://github.com/HuaqingHe/JOINEDTrans
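
The hand-off from the coarse stage to the fine stage can be illustrated with a minimal sketch. It is not the authors' implementation (that is in the repository linked above). It only shows one plausible way to turn a coarse fovea heatmap or a coarse OD/OC mask into a center point and crop a region of interest around it. The function names (`heatmap_peak`, `mask_centroid`, `crop_box`, `crop`), the square crop shape, and the crop size are all assumptions made for illustration, and plain Python lists stand in for image tensors.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[int, int]          # (row, col)
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def heatmap_peak(heatmap: Sequence[Sequence[float]]) -> Point:
    """Return the (row, col) of the largest value in a 2-D heatmap."""
    best, best_val = (0, 0), heatmap[0][0]
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if value > best_val:
                best, best_val = (r, c), value
    return best


def mask_centroid(mask: Sequence[Sequence[int]]) -> Optional[Point]:
    """Return the integer centroid of the nonzero pixels, or None if the mask is empty."""
    row_sum = col_sum = count = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        return None
    return row_sum // count, col_sum // count


def crop_box(center: Point, size: int, height: int, width: int) -> Box:
    """Square box of side `size` around `center`, shifted to stay inside the image."""
    half = size // 2
    top = min(max(center[0] - half, 0), max(height - size, 0))
    left = min(max(center[1] - half, 0), max(width - size, 0))
    return top, left, min(top + size, height), min(left + size, width)


def crop(image: Sequence[Sequence[float]], box: Box) -> List[List[float]]:
    """Cut the region described by `box` out of a 2-D image."""
    top, left, bottom, right = box
    return [list(row[left:right]) for row in image[top:bottom]]
```

In this sketch, the fovea region of interest would come from `crop_box(heatmap_peak(coarse_heatmap), size, H, W)` and the OD/OC region from `crop_box(mask_centroid(coarse_mask), size, H, W)`. The same box can be used to crop both the image and the coarse prediction, so that the coarse output accompanies the image patch into the refinement step.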
