Spatially Covariant Image Registration with Text Prompts

Medical images are often characterized by their structured anatomical representations and spatially inhomogeneous contrasts. Leveraging anatomical priors in neural networks can greatly enhance their utility in resource-constrained clinical settings. Prior research has harnessed such information for image segmentation, yet progress in deformable image registration has been modest. Our work introduces textSCF, a novel method that integrates spatially covariant filters and textual anatomical prompts encoded by visual-language models, to fill this gap. This approach optimizes an implicit function that correlates text embeddings of anatomical regions to filter weights, relaxing the typical translation-invariance constraint of convolutional operations. TextSCF not only boosts computational efficiency but can also retain or improve registration accuracy. By capturing the contextual interplay between anatomical regions, it offers impressive inter-regional transferability and the ability to preserve structural discontinuities during registration. TextSCF's performance has been rigorously tested on inter-subject brain MRI and abdominal CT registration tasks, outperforming existing state-of-the-art models in the MICCAI Learn2Reg 2021 challenge and leading the leaderboard. In abdominal registrations, textSCF's larger model variant improved the Dice score by 11.3% over the second-best model, while its smaller variant maintained similar accuracy but with an 89.13% reduction in network parameters and a 98.34% decrease in computational operations.
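
The idea of an implicit function that maps region text embeddings to filter weights can be illustrated with a small NumPy sketch. This is only one plausible reading of the abstract, not the authors' implementation: the function name `spatially_covariant_filter`, the two-layer MLP weights `w1` and `w2`, the 1x1 filter size, and the use of random vectors in place of real text-encoder outputs are all assumptions made for illustration.

```python
import numpy as np


def spatially_covariant_filter(features, labels, text_emb, w1, w2):
    """Apply a 1x1 filter whose weights depend on each pixel's anatomical region.

    features: (C_in, H, W) feature map.
    labels:   (H, W) integer region index per pixel (hypothetical segmentation).
    text_emb: (R, D) one embedding per region; a stand-in for text-encoder output.
    w1, w2:   weights of a tiny two-layer MLP, the assumed "implicit function".
    Returns an array of shape (C_out, H, W).
    """
    c_in = features.shape[0]
    # Implicit function: region embedding -> flattened filter weights.
    hidden = np.maximum(text_emb @ w1, 0.0)            # (R, hidden_dim), ReLU
    region_filters = hidden @ w2                       # (R, C_out * C_in)
    c_out = region_filters.shape[1] // c_in
    region_filters = region_filters.reshape(-1, c_out, c_in)
    # Look up the filter for every pixel from its region label.
    per_pixel = region_filters[labels]                 # (H, W, C_out, C_in)
    # Per-pixel matrix-vector product: the filter varies with location,
    # unlike a standard convolution that shares one kernel everywhere.
    return np.einsum("hwoc,chw->ohw", per_pixel, features)
```

Because the filter is looked up per pixel, two locations with identical input features but different region labels can produce different outputs, which is the sense in which translation invariance is relaxed in this sketch.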

Image registration

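Image registration is a classic problem and technical challenge in image processing research. Its goal is to compare or fuse images of the same object acquired under different conditions, for example from different acquisition devices, at different times, or from different viewpoints; sometimes registration between images of different objects is also required. Specifically, given two images in a dataset, registration seeks a spatial transformation that maps one image onto the other, so that points corresponding to the same spatial location in the two images are brought into one-to-one correspondence, thereby enabling information fusion. The technique is widely used in computer vision, medical image processing, and materials mechanics. Depending on the application, some work emphasizes fusing the two images via the transformation result, while other work studies the transformation itself to derive mechanical properties of the object.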
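
As a toy illustration of mapping one image onto another with a spatial transformation, the sketch below warps a 2D "moving" image with a per-pixel displacement field and measures overlap with a "fixed" image using the Dice score. The function names `warp_nearest` and `dice`, the nearest-neighbor sampling, and the border clamping are assumptions chosen for brevity; practical registration pipelines typically use interpolated resampling and estimate the displacement field rather than being handed it.

```python
import numpy as np


def warp_nearest(moving, displacement):
    """Resample `moving` so that output[y, x] = moving[y + dy, x + dx].

    moving:       (H, W) image.
    displacement: (2, H, W) array holding (dy, dx) for every output pixel.
    Uses nearest-neighbor sampling and clamps coordinates at the border.
    """
    h, w = moving.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.rint(ys + displacement[0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs + displacement[1]).astype(int), 0, w - 1)
    return moving[src_y, src_x]


def dice(a, b):
    """Dice overlap 2|A and B| / (|A| + |B|) between two binary masks."""
    a = a.astype(bool)
    b = b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0


# Fixed image: a 3x3 square. Moving image: the same square shifted by (1, 2).
fixed = np.zeros((8, 8), dtype=int)
fixed[2:5, 2:5] = 1
moving = np.zeros((8, 8), dtype=int)
moving[3:6, 4:7] = 1

# A constant displacement of (dy, dx) = (1, 2) undoes the shift.
field = np.zeros((2, 8, 8))
field[0] = 1.0
field[1] = 2.0
warped = warp_nearest(moving, field)
```

In this example the Dice score between `fixed` and `moving` is low before warping, while `warped` coincides with `fixed`, so the score after warping is 1.0.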
