In this paper, we propose SRIF, a novel Semantic shape Registration framework based on diffusion-based Image morphing and Flow estimation. More concretely, given a pair of extrinsically aligned shapes, we first render them from multiple views and then use a diffusion-based image interpolation framework to generate sequences of intermediate images between them. The images are then fed into a dynamic 3D Gaussian splatting framework, with which we reconstruct and post-process intermediate point clouds that respect the image morphing process. Finally, tailored to this pipeline, we propose a novel registration module that estimates a continuous normalizing flow, which deforms the source shape consistently towards the target, with the intermediate point clouds serving as weak guidance. Our key insight is to leverage large vision models (LVMs) to associate shapes and thereby obtain much richer semantic information about the relationship between shapes than ad-hoc feature extraction and alignment provide. As a consequence, SRIF not only achieves high-quality dense correspondences on challenging shape pairs, but also delivers smooth, semantically meaningful interpolation in between. Empirical evidence demonstrates the effectiveness and superiority of our method, as well as of its specific design choices. The code is released at https://github.com/rqhuang88/SRIF.
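To make the flow-based registration step concrete, the following is a minimal sketch, not the SRIF implementation. It assumes a velocity field integrated with forward Euler steps and a Chamfer-distance loss in which the intermediate point clouds enter with a small weight. The names `chamfer_distance`, `integrate_flow`, `registration_loss`, `guide_weight`, and the constant toy velocity field are all hypothetical and chosen only for illustration; the actual module learns the velocity field and may use a different loss and integrator.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)  # (N, M) squared distances
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

def integrate_flow(points, velocity, n_steps=10):
    """Forward-Euler integration of dx/dt = velocity(x, t) over t in [0, 1].

    Returns the list of point sets at every step, including the start.
    """
    dt = 1.0 / n_steps
    x = points.copy()
    trajectory = [x.copy()]
    for k in range(n_steps):
        x = x + dt * velocity(x, k * dt)
        trajectory.append(x.copy())
    return trajectory

def registration_loss(trajectory, target, guides, guide_weight=0.1):
    """Endpoint term plus weakly weighted terms on intermediate point clouds.

    guides maps a step index to the intermediate cloud expected at that step
    (assumed weighting scheme, for illustration only).
    """
    loss = chamfer_distance(trajectory[-1], target)
    for step, cloud in guides.items():
        loss += guide_weight * chamfer_distance(trajectory[step], cloud)
    return loss

# Toy data: the target is the source translated along x, with one midpoint guide.
rng = np.random.default_rng(0)
source = rng.normal(size=(50, 3))
offset = np.array([1.0, 0.0, 0.0])
target = source + offset
guides = {5: source + 0.5 * offset}

def toy_velocity(x, t):
    # Constant field standing in for a learned, time-dependent network.
    return np.broadcast_to(offset, x.shape)

trajectory = integrate_flow(source, toy_velocity, n_steps=10)
loss = registration_loss(trajectory, target, guides)
```

In this toy setting the constant field carries the source onto the target, so both the endpoint term and the guidance term vanish up to floating-point error. Because every source point is transported along its own trajectory, dense correspondences and intermediate shapes come from the same integration.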