3D reconstruction of dynamic scenes is a long-standing problem in computer graphics and becomes increasingly difficult as less information is available. Shape-from-Template (SfT) methods aim to reconstruct a template-based geometry from RGB images or video sequences, often leveraging just a single monocular camera without depth information, such as regular smartphone recordings. Unfortunately, existing reconstruction methods either produce unphysical, noisy results or are slow to optimize. To solve this problem, we propose a novel SfT reconstruction algorithm for cloth using a pre-trained neural surrogate model that is fast to evaluate, stable, and produces smooth reconstructions thanks to a regularizing physics simulation. Differentiable rendering of the simulated mesh enables pixel-wise comparisons between the reconstruction and a target video sequence, which can be used in a gradient-based optimization procedure to extract not only shape information but also physical parameters such as the stretching, shearing, or bending stiffness of the cloth. This allows us to retain a precise, stable, and smooth reconstructed geometry while reducing the runtime by a factor of 400-500 compared to $\phi$-SfT, a state-of-the-art physics-based SfT approach.
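To make the described pipeline concrete, the following is a minimal sketch (in PyTorch) of how physical parameters could be recovered by differentiating a pixel-wise image loss through a differentiable renderer and a frozen neural surrogate. The `SurrogateClothStep` class, the `render_silhouette` function, and all shapes and hyperparameters are simplified placeholder assumptions for illustration, not the actual implementation described in the paper.

```python
import torch


class SurrogateClothStep(torch.nn.Module):
    """Toy stand-in for a pre-trained neural cloth surrogate: predicts the next
    vertex positions from the current state and the material parameters."""

    def __init__(self, n_verts: int):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(n_verts * 3 + 3, 128), torch.nn.ReLU(),
            torch.nn.Linear(128, n_verts * 3),
        )

    def forward(self, verts: torch.Tensor, params: torch.Tensor) -> torch.Tensor:
        x = torch.cat([verts.reshape(-1), params])
        return verts + self.net(x).reshape(-1, 3)  # residual vertex update


def render_silhouette(verts: torch.Tensor, res: int = 32) -> torch.Tensor:
    """Toy differentiable 'renderer': splats vertices into a soft 2D occupancy image.
    A real pipeline would rasterize the textured mesh with a differentiable renderer."""
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, res), torch.linspace(-1, 1, res), indexing="ij"
    )
    grid = torch.stack([xs, ys], dim=-1).reshape(-1, 2)           # (res*res, 2)
    d2 = ((grid[:, None, :] - verts[None, :, :2]) ** 2).sum(-1)   # squared distances
    return torch.exp(-d2 / 0.01).max(dim=1).values.reshape(res, res)


n_verts, n_frames = 25, 10
surrogate = SurrogateClothStep(n_verts)          # assumed pre-trained; kept frozen
for p in surrogate.parameters():
    p.requires_grad_(False)

template = torch.rand(n_verts, 3)                                  # template geometry
target_frames = [torch.rand(32, 32) for _ in range(n_frames)]      # placeholder video frames

# Physical parameters (stretching, shearing, bending stiffness) are the optimization variables;
# optimizing in log space keeps them positive.
log_params = torch.zeros(3, requires_grad=True)
optimizer = torch.optim.Adam([log_params], lr=1e-2)

for it in range(100):
    optimizer.zero_grad()
    verts, loss = template, torch.zeros(())
    for frame in target_frames:
        verts = surrogate(verts, torch.exp(log_params))            # roll out the simulation
        loss = loss + ((render_silhouette(verts) - frame) ** 2).mean()  # pixel-wise loss
    loss.backward()        # gradients flow through renderer and surrogate to the parameters
    optimizer.step()

print("recovered stiffness parameters:", torch.exp(log_params).detach())
```

In this sketch the surrogate replaces an expensive physics solver inside the optimization loop, which is the mechanism behind the reported speed-up: each rollout is a cheap forward pass rather than a full simulation.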