We address 4D reconstruction from partial point cloud sequences, where depth-sensor observations are incomplete, unordered, and lack explicit temporal correspondences. This geometry-only setting is challenging because observations are missing and the underlying dynamics are ambiguous. Recent progress has largely relied on image-based methods, while existing point-based approaches typically focus on single objects, assume relatively complete inputs, or require explicit correspondences. To overcome these limitations, we propose DynaTok, a point-based framework for correspondence-free 4D reconstruction that takes only partial point cloud sequences as input and requires no images. DynaTok encodes frames into compact latent tokens, aggregates incomplete observations over time with a Transformer-based spatiotemporal encoder, and decouples geometry and motion through residual tokens in a unified model. A flow-matching decoder then reconstructs complete, temporally consistent 4D point cloud sequences conditioned on the latent tokens. Experiments on object- and scene-level benchmarks demonstrate improved reconstruction quality and temporal coherence from partial point cloud observations. Project page: https://wrchen530.github.io/dynatok/
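The abstract names a flow-matching decoder without specifying its formulation. As a rough illustration of the general technique only, the NumPy sketch below assumes the common linear-interpolation variant of flow matching: noise points `x0` are moved toward target points `x1` along straight paths, and a network would be trained to regress the constant velocity `x1 - x0`. All function names and shapes here are hypothetical and are not taken from DynaTok; in particular, the conditioning on latent tokens is only indicated in a comment.

```python
import numpy as np


def flow_matching_pair(x0, x1, t):
    """Build one training pair under the assumed linear path.

    x0: (B, N, 3) noise points, x1: (B, N, 3) target points,
    t:  (B,) times in [0, 1].
    Returns the interpolated points x_t and the regression target v.
    """
    t = np.asarray(t, dtype=float).reshape(-1, 1, 1)  # broadcast over points
    x_t = (1.0 - t) * x0 + t * x1   # point on the straight path at time t
    v_target = x1 - x0              # velocity of that path (constant in t)
    return x_t, v_target


def flow_matching_loss(v_pred, v_target):
    """Mean squared error between predicted and target velocities."""
    return float(np.mean((v_pred - v_target) ** 2))


def euler_sample(velocity_fn, x0, num_steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps.

    In a conditional decoder, velocity_fn would also receive the latent
    tokens; that interface is an assumption and is omitted here.
    """
    x = x0.copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        x = x + dt * velocity_fn(x, k * dt)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.normal(size=(2, 5, 3))   # noise
    x1 = rng.normal(size=(2, 5, 3))   # stand-in for complete point clouds
    x_t, v = flow_matching_pair(x0, x1, t=[0.25, 0.75])
    print(x_t.shape, v.shape)
    # An oracle velocity field (the true x1 - x0) transports x0 onto x1.
    out = euler_sample(lambda x, t: x1 - x0, x0, num_steps=8)
    print(np.allclose(out, x1))
```

With a learned velocity network in place of the oracle, the same Euler loop would serve as the sampling procedure; the step count trades accuracy against decoding cost.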