RelayGS: Reconstructing Dynamic Scenes with Large-Scale and Complex Motions via Relay Gaussians

Reconstructing dynamic scenes with large-scale and complex motions remains a significant challenge. Recent techniques such as Neural Radiance Fields and 3D Gaussian Splatting (3DGS) have shown promise but still struggle with scenes involving substantial movement. This paper proposes RelayGS, a novel method based on 3DGS that is specifically designed to represent and reconstruct highly dynamic scenes. RelayGS learns a complete 4D representation, consisting of canonical 3D Gaussians and a compact motion field, in three stages. First, we learn a fundamental 3DGS from all frames, ignoring temporal scene variations, and use a learnable mask to separate the highly dynamic foreground from the minimally moving background. Second, we replicate the decoupled foreground Gaussians from the first stage into multiple copies, each corresponding to a temporal segment, and optimize them using pseudo-views constructed from multiple frames within each segment. These Gaussians, termed Relay Gaussians, act as explicit relay nodes that break large-scale motion trajectories down into smaller, more manageable segments. Finally, we jointly learn the scene's temporal motion and refine the canonical Gaussians learned in the first two stages. We conduct thorough experiments on two dynamic scene datasets featuring large and complex motions, where RelayGS outperforms state-of-the-art methods by more than 1 dB in PSNR. It also reconstructs real-world basketball game scenes much more completely and coherently, whereas previous methods usually struggle to capture the complex motion of players. Code will be publicly available at https://github.com/gqk/RelayGS
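The bookkeeping behind the first two stages can be illustrated with a minimal sketch. It is not the authors' implementation: the `Gaussian` class, the `0.5` mask threshold, and the near-equal contiguous segmentation are all assumptions made for illustration, and the actual optimization (rendering, pseudo-views, motion field) is omitted entirely.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Gaussian:
    """Toy stand-in for a 3D Gaussian (hypothetical; real ones carry more attributes)."""
    center: Tuple[float, float, float]
    mask: float  # assumed learned score: near 1.0 = dynamic foreground, near 0.0 = background


def split_foreground(gaussians: Sequence[Gaussian], threshold: float = 0.5):
    """Stage 1 bookkeeping: separate foreground from background by the mask score.

    The threshold value is an assumption, not taken from the paper.
    """
    foreground = [g for g in gaussians if g.mask > threshold]
    background = [g for g in gaussians if g.mask <= threshold]
    return foreground, background


def temporal_segments(num_frames: int, num_segments: int) -> List[range]:
    """Partition frame indices [0, num_frames) into contiguous, near-equal segments."""
    if not 1 <= num_segments <= num_frames:
        raise ValueError("num_segments must be between 1 and num_frames")
    base, extra = divmod(num_frames, num_segments)
    segments, start = [], 0
    for i in range(num_segments):
        length = base + (1 if i < extra else 0)  # spread the remainder over early segments
        segments.append(range(start, start + length))
        start += length
    return segments


def make_relay_gaussians(foreground: Sequence[Gaussian], segments: Sequence[range]):
    """Stage 2 bookkeeping: one independent copy of the foreground per temporal segment."""
    return [[replace(g) for g in foreground] for _ in segments]
```

Each inner list returned by `make_relay_gaussians` would then be optimized only against the frames of its own segment, which is how the relay copies shorten the motion each set of Gaussians has to explain.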
