Gaussian splatting methods are emerging as a popular approach for converting multi-view image data into scene representations that allow view synthesis. In particular, there is interest in enabling view synthesis for dynamic scenes using only monocular input data -- an ill-posed and challenging problem. The fast pace of work in this area has produced multiple simultaneous papers that claim to work best, which cannot all be true. In this work, we organize, benchmark, and analyze many Gaussian-splatting-based methods, providing apples-to-apples comparisons that prior works have lacked. We use multiple existing datasets and a new instructive synthetic dataset designed to isolate factors that affect reconstruction quality. We systematically categorize Gaussian splatting methods into specific motion representation types and quantify how their differences impact performance. Empirically, we find that their rank order is well-defined in synthetic data, but the complexity of real-world data currently overwhelms the differences. Furthermore, the fast rendering speed of all Gaussian-based methods comes at the cost of brittleness in optimization. We summarize our experiments into a list of findings that can help to further progress in this lively problem setting. Project Webpage: https://lynl7130.github.io/MonoDyGauBench.github.io/