We introduce a novel, data-driven approach for reconstructing temporally coherent 3D motion from unstructured and potentially partial observations of non-rigidly deforming shapes. Our goal is to achieve high-fidelity motion reconstructions for shapes that undergo near-isometric deformations, such as humans wearing loose clothing. The key novelty of our work is that it combines implicit shape representations with explicit mesh-based deformation models, enabling detailed and temporally coherent motion reconstructions without relying on parametric shape models or decoupling shape and motion. Each frame is represented as a neural field decoded from a feature space in which observations are fused over time, thereby preserving geometric details present in the input data. Temporal coherence is enforced by a near-isometric deformation constraint between adjacent frames, applied to the surface underlying the neural field. Our method outperforms state-of-the-art approaches, as demonstrated by its application to human and animal motion sequences reconstructed from monocular depth videos.
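The abstract does not give equations for the near-isometric deformation constraint, but its core idea can be illustrated with a minimal sketch: an isometry preserves intrinsic distances, so on a discretized surface a near-isometry penalty can be expressed as an edge-length-preservation loss between corresponding vertices of adjacent frames. The function below is a simplified illustration of that idea, not the paper's actual formulation; the vertex array and edge list are hypothetical inputs standing in for a surface extracted from the neural field.

```python
import numpy as np

def near_isometry_loss(verts_t, verts_t1, edges):
    """Mean squared change in edge length between two frames of a surface.

    verts_t, verts_t1 : (V, 3) vertex positions at frames t and t+1
    edges             : (E, 2) vertex-index pairs defining connectivity
    A near-isometric deformation keeps this value close to zero.
    """
    i, j = edges[:, 0], edges[:, 1]
    len_t = np.linalg.norm(verts_t[i] - verts_t[j], axis=1)
    len_t1 = np.linalg.norm(verts_t1[i] - verts_t1[j], axis=1)
    return np.mean((len_t1 - len_t) ** 2)

# Toy example: a unit square in the z=0 plane.
square = np.array([[0., 0., 0.], [1., 0., 0.], [1., 1., 0.], [0., 1., 0.]])
edges = np.array([[0, 1], [1, 2], [2, 3], [3, 0]])

# A rigid motion (rotation + translation) is an exact isometry,
# so the penalty is (numerically) zero.
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta), 0.],
              [np.sin(theta),  np.cos(theta), 0.],
              [0., 0., 1.]])
moved = square @ R.T + np.array([0.5, -0.2, 0.0])
print(near_isometry_loss(square, moved, edges))  # ≈ 0: lengths preserved

# A uniform stretch changes edge lengths and is penalized.
print(near_isometry_loss(square, 2.0 * square, edges))  # > 0
```

In practice such a term would be evaluated on the surface recovered from each frame's neural field and minimized jointly with the data-fitting objective, but those details are beyond what the abstract specifies.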