Medical images are often characterized by their structured anatomical representations and spatially inhomogeneous contrasts. Leveraging anatomical priors in neural networks can greatly enhance their utility in resource-constrained clinical settings. Prior research has harnessed such information for image segmentation, yet progress in deformable image registration has been modest. Our work introduces textSCF, a novel method that integrates spatially covariant filters and textual anatomical prompts encoded by visual-language models, to fill this gap. This approach optimizes an implicit function that correlates text embeddings of anatomical regions to filter weights, relaxing the typical translation-invariance constraint of convolutional operations. TextSCF not only boosts computational efficiency but can also retain or improve registration accuracy. By capturing the contextual interplay between anatomical regions, it offers impressive inter-regional transferability and the ability to preserve structural discontinuities during registration. TextSCF's performance has been rigorously tested on inter-subject brain MRI and abdominal CT registration tasks, outperforming existing state-of-the-art models in the MICCAI Learn2Reg 2021 challenge and leading the leaderboard. In abdominal registrations, textSCF's larger model variant improved the Dice score by 11.3% over the second-best model, while its smaller variant maintained similar accuracy but with an 89.13% reduction in network parameters and a 98.34% decrease in computational operations.
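
The idea of an implicit function that maps text embeddings of anatomical regions to filter weights can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a hard per-voxel region label map, a small two-layer MLP as the implicit function, 1x1x1 filters, and random vectors standing in for embeddings from a visual-language model. All names (`make_implicit_function`, `spatially_covariant_filter`, `phi`) and sizes are hypothetical.

```python
import numpy as np


def make_implicit_function(embed_dim, hidden_dim, c_in, c_out, seed=0):
    """Build a tiny two-layer MLP (hypothetical stand-in for the implicit
    function) that maps a text embedding to a (c_out, c_in) filter."""
    rng = np.random.default_rng(seed)
    w1 = rng.standard_normal((embed_dim, hidden_dim)) / np.sqrt(embed_dim)
    w2 = rng.standard_normal((hidden_dim, c_out * c_in)) / np.sqrt(hidden_dim)

    def phi(text_embeddings):
        # text_embeddings: (num_regions, embed_dim)
        hidden = np.maximum(text_embeddings @ w1, 0.0)  # ReLU
        return (hidden @ w2).reshape(-1, c_out, c_in)

    return phi


def spatially_covariant_filter(features, label_map, region_filters):
    """Apply a 1x1x1 filter at every voxel, chosen by that voxel's region.

    features:       (c_in, D, H, W) feature volume
    label_map:      (D, H, W) integer region index per voxel (assumed given)
    region_filters: (num_regions, c_out, c_in) one filter per region
    returns:        (c_out, D, H, W)
    """
    # Integer-array indexing looks up one filter per voxel, so the weights
    # vary with location instead of being shared across the whole volume.
    voxel_filters = region_filters[label_map]  # (D, H, W, c_out, c_in)
    return np.einsum("dhwoc,cdhw->odhw", voxel_filters, features)


rng = np.random.default_rng(1)
num_regions, embed_dim, c_in, c_out = 4, 16, 8, 3

# Assumption: random vectors replace real text-encoder embeddings.
text_embeddings = rng.standard_normal((num_regions, embed_dim))
phi = make_implicit_function(embed_dim, 32, c_in, c_out)
region_filters = phi(text_embeddings)

features = rng.standard_normal((c_in, 5, 6, 7))
label_map = rng.integers(0, num_regions, size=(5, 6, 7))

# Three output channels, read here as a 3D displacement per voxel.
displacement = spatially_covariant_filter(features, label_map, region_filters)
print(displacement.shape)  # (3, 5, 6, 7)
```

Because every voxel in a region shares the filter generated from that region's embedding, the number of distinct filters scales with the number of anatomical regions rather than with the volume size, and neighbouring voxels in different regions can receive different weights, which is one way to picture how discontinuities at region boundaries might be preserved.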