Reconstructing dynamic 3D scenes from sparse multi-view videos is highly ill-posed and often leads to geometric collapse, trajectory drift, and floating artifacts. Recent methods introduce generative priors to hallucinate missing content, yet naive integration frequently causes structural drift and temporal inconsistency because stochastic 2D generation is mismatched with deterministic 3D geometry. In this paper, we propose GeoRect4D, a unified framework for sparse-view dynamic reconstruction that couples explicit 3D consistency with generative refinement through closed-loop optimization. Specifically, GeoRect4D introduces a degradation-aware feedback mechanism that pairs a robust anchor-based dynamic 3D Gaussian Splatting (3DGS) substrate with a single-step diffusion rectifier to hallucinate high-fidelity details. The rectifier uses a structural locking mechanism and spatiotemporal coordinated attention to preserve physical plausibility while restoring missing content. Furthermore, we present a progressive optimization strategy that employs stochastic geometric purification to eliminate floaters and generative distillation to infuse texture details into the explicit representation. Extensive experiments on multiple datasets demonstrate that GeoRect4D achieves state-of-the-art reconstruction fidelity, perceptual quality, and spatiotemporal consistency.