RFR-WWANet: Weighted Window Attention-Based Recovery Feature Resolution Network for Unsupervised Image Registration

The Swin Transformer has recently attracted attention in medical image analysis due to its computational efficiency and long-range modeling capability. These properties make it well suited to establishing long-range relationships between corresponding voxels at different positions in complex abdominal image registration tasks. However, transformer-based registration models merge multiple voxels into a single semantic token, which restricts the transformer to modeling and generating only coarse-grained spatial information. To address this issue, we propose the Recovery Feature Resolution Network (RFRNet), which allows the transformer to contribute fine-grained spatial information and rich semantic correspondences to higher resolution levels. Furthermore, shifted window partitioning operations are inflexible: they cannot perceive semantic information over uncertain distances or automatically bridge global connections between windows. We therefore present Weighted Window Attention (WWA), which automatically builds global interactions between windows. It is applied after the regular and cyclic shift window partitioning operations within the Swin Transformer block. The proposed unsupervised deformable image registration model, named RFR-WWANet, detects long-range correlations and facilitates meaningful semantic relevance of anatomical structures. Qualitative and quantitative results show that RFR-WWANet achieves significant improvements over current state-of-the-art methods. Ablation experiments demonstrate the effectiveness of the RFRNet and WWA designs. Our code is available at https://github.com/MingR-Ma/RFR-WWANet.
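To make the window operations mentioned in the abstract concrete, the following is a minimal sketch in plain Python. It works on a small 2D grid rather than a 3D volume, and the helper names (`cyclic_shift`, `window_partition`, `weight_windows`) are hypothetical. The first two functions follow the regular and cyclic shift partitioning generally used in Swin-style blocks. The third is only an assumed stand-in for cross-window weighting (a softmax over per-window means); it is not the WWA formulation from the paper, which the abstract does not specify.

```python
from math import exp


def cyclic_shift(grid, shift):
    """Roll a 2D grid (list of rows) up and left by `shift`, wrapping around."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid, win):
    """Split a 2D grid into non-overlapping win x win windows.

    Windows are returned in row-major order, each flattened to a list.
    """
    h, w = len(grid), len(grid[0])
    assert h % win == 0 and w % win == 0, "grid must be divisible by window size"
    windows = []
    for r0 in range(0, h, win):
        for c0 in range(0, w, win):
            windows.append([grid[r][c]
                            for r in range(r0, r0 + win)
                            for c in range(c0, c0 + win)])
    return windows


def weight_windows(windows):
    """ASSUMPTION: illustrative cross-window weighting, not the paper's WWA.

    Computes one scalar weight per window (softmax over window means) and
    rescales each window by its weight, so every window's output depends
    on statistics gathered from all windows.
    """
    means = [sum(w) / len(w) for w in windows]
    peak = max(means)  # subtract the max for numerical stability
    exps = [exp(m - peak) for m in means]
    total = sum(exps)
    weights = [e / total for e in exps]
    scaled = [[wt * v for v in win] for wt, win in zip(weights, windows)]
    return weights, scaled


if __name__ == "__main__":
    grid = [[r * 4 + c for c in range(4)] for r in range(4)]
    regular = window_partition(grid, 2)
    shifted = window_partition(cyclic_shift(grid, 1), 2)  # shift = win // 2
    weights, _ = weight_windows(regular)
    print(regular[0], shifted[0], weights)
```

In this toy setting, the shifted partition groups elements that sat in different windows under the regular partition, while the weighting step is one simple way for information from all windows to influence each window.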

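Image registration

Image registration is a classic problem and technical challenge in image processing research. Its goal is to compare or fuse images of the same object acquired under different conditions, for example from different acquisition devices, at different times, or from different viewpoints; registration between images of different objects is sometimes needed as well. Specifically, for two images in a dataset, registration seeks a spatial transformation that maps one image onto the other so that points corresponding to the same spatial location in the two images are brought into one-to-one correspondence, thereby enabling information fusion. The technique is widely used in computer vision, medical image processing, and materials mechanics. Depending on the application, some work focuses on using the transformation result to fuse the two images, while other work studies the transformation itself to derive mechanical properties of the object.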
