3D Gaussian Splatting has demonstrated notable success in large-scale scene reconstruction, but challenges persist due to high training memory consumption and storage overhead. Hybrid representations that integrate implicit and explicit features offer a way to mitigate these limitations. However, when applied in parallelized block-wise training, two critical issues arise: reconstruction accuracy deteriorates due to reduced data diversity when each block is trained independently, and parallel training restricts the number of divided blocks to the number of available GPUs. To address these issues, we propose Momentum-GS, a novel approach that leverages momentum-based self-distillation to promote consistency and accuracy across blocks while decoupling the number of blocks from the physical GPU count. Our method maintains a teacher Gaussian decoder updated with momentum, ensuring a stable reference during training. This teacher provides each block with global guidance in a self-distillation manner, promoting spatial consistency in reconstruction. To further ensure consistency across blocks, we incorporate block weighting, dynamically adjusting each block's weight according to its reconstruction accuracy. Extensive experiments on large-scale scenes show that our method consistently outperforms existing techniques, achieving a 12.8% improvement in LPIPS over CityGaussian with far fewer divided blocks and establishing a new state of the art. Project page: https://jixuan-fan.github.io/Momentum-GS_Page/
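The momentum-updated teacher described above is, in essence, an exponential moving average (EMA) of the student decoder's parameters. A minimal sketch of such an update follows; the function name, the momentum coefficient `m=0.9`, and the toy parameter vectors are illustrative assumptions, not details from the paper.

```python
def momentum_update(teacher_params, student_params, m=0.9):
    """EMA update: teacher <- m * teacher + (1 - m) * student.

    A value of m close to 1 keeps the teacher evolving slowly,
    giving the per-block students a stable distillation reference.
    (m=0.9 here is an illustrative choice, not the paper's setting.)
    """
    return [m * t + (1.0 - m) * s
            for t, s in zip(teacher_params, student_params)]

# Toy example: the teacher drifts slowly toward a fixed student.
teacher = [0.0, 0.0]
student = [1.0, 1.0]
for _ in range(10):
    teacher = momentum_update(teacher, student, m=0.9)
# After n steps with a fixed student, each coordinate equals 1 - m**n.
```

The same update is applied once per training step, so the teacher lags behind any single block's student and instead reflects a smoothed consensus across blocks.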