Volumetric medical segmentation is a critical component of 3D medical image analysis that delineates different semantic regions. Deep neural networks have significantly improved volumetric medical segmentation, but they generally require large-scale annotated data to achieve better performance, which can be expensive and prohibitive to obtain. To address this limitation, existing works typically perform transfer learning or design dedicated pretraining-finetuning stages to learn representative features. However, the mismatch between the source and target domain can make it challenging to learn optimal representations for volumetric data, while the multi-stage training demands higher compute as well as careful selection of stage-specific design choices. In contrast, we propose a universal training framework called MedContext that is architecture-agnostic and can be incorporated into any existing training framework for 3D medical segmentation. Our approach effectively learns self-supervised contextual cues jointly with the supervised voxel segmentation task without requiring large-scale annotated volumetric medical data or dedicated pretraining-finetuning stages. The proposed approach induces contextual knowledge in the network by learning to reconstruct the missing organ or parts of an organ in the output segmentation space. The effectiveness of MedContext is validated across multiple 3D medical datasets and four state-of-the-art model architectures. Our approach demonstrates consistent gains in segmentation performance across datasets and architectures, even in few-shot data scenarios. Our code and pretrained models are available at https://github.com/hananshafi/MedContext
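The joint objective described above (supervised voxel segmentation plus self-supervised reconstruction of a masked region in the output segmentation space) can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact formulation: the toy segmenter, the zero-masking scheme, the unweighted sum of the two losses, and all shapes and names are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 3D volume of shape (D, H, W) with C classes; sizes are illustrative.
D = H = W = 8
C = 2
volume = rng.standard_normal((D, H, W))
labels = rng.integers(0, C, size=(D, H, W))

def toy_segmenter(vol):
    """Stand-in for a 3D segmentation network: per-voxel class probabilities."""
    logits = np.stack([vol, -vol], axis=0)  # (C, D, H, W)
    e = np.exp(logits - logits.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

# 1) Mask out a sub-block of the input, simulating a missing organ part.
mask = np.zeros((D, H, W), dtype=bool)
mask[2:5, 2:5, 2:5] = True
masked_volume = np.where(mask, 0.0, volume)

probs_full = toy_segmenter(volume)           # view with full context
probs_masked = toy_segmenter(masked_volume)  # view with masked context

# 2) Supervised voxel segmentation loss: cross-entropy on the masked view.
picked = np.take_along_axis(probs_masked, labels[None], axis=0)  # (1, D, H, W)
ce_loss = -np.log(picked + 1e-8).mean()

# 3) Self-supervised contextual loss: match the masked region's predictions
#    to the full-context predictions, in the output segmentation space.
recon_loss = ((probs_masked - probs_full) ** 2)[:, mask].mean()

# Single-stage joint objective (equal weighting is an assumption here).
total_loss = ce_loss + recon_loss
```

In a real training loop both terms would be backpropagated through the same network each step, which is what lets the contextual cues be learned jointly with segmentation rather than in a separate pretraining stage.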