Modeling and re-rendering dynamic 3D scenes is a challenging task in 3D vision. Prior approaches build on NeRF and rely on implicit representations. These are slow because they require many MLP evaluations, which limits real-world applications. We show that dynamic 3D scenes can be explicitly represented by six planes of learned features, leading to an elegant solution we call HexPlane. A HexPlane computes features for points in spacetime by fusing vectors extracted from each plane, which is highly efficient. Pairing a HexPlane with a tiny MLP to regress output colors and training via volume rendering gives impressive results for novel view synthesis on dynamic scenes, matching the image quality of prior work while reducing training time by more than $100\times$. Extensive ablations validate our HexPlane design and show that it is robust to different feature fusion mechanisms, coordinate systems, and decoding mechanisms. HexPlane is a simple and effective solution for representing 4D volumes, and we hope it can broadly contribute to modeling spacetime for dynamic 3D scenes.
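To make the six-plane lookup concrete, the following is a minimal NumPy sketch of how a feature for one spacetime point $(x, y, z, t)$ might be assembled. The abstract only says that vectors extracted from each plane are fused, so several details here are assumptions made for illustration: the grouping of planes into complementary pairs (XY with ZT, XZ with YT, YZ with XT), the fusion rule (elementwise product within a pair, concatenation across pairs), bilinear interpolation, and all sizes and names (`RES`, `T_RES`, `FEAT`, `make_planes`, `bilinear`, `hexplane_feature`). The planes are filled with random values rather than learned ones, and the tiny MLP and volume rendering steps are omitted.

```python
import numpy as np

# Hypothetical sizes chosen for illustration only.
RES, T_RES, FEAT = 16, 8, 4
AXES = {"x": 0, "y": 1, "z": 2, "t": 3}
# Assumed pairing: each pair of planes covers all four axes exactly once.
PAIRS = [("xy", "zt"), ("xz", "yt"), ("yz", "xt")]


def make_planes(rng):
    """Create six feature planes, one per pair of spacetime axes."""
    planes = {}
    for a, b in PAIRS:
        for name in (a, b):
            shape = tuple(T_RES if ax == "t" else RES for ax in name) + (FEAT,)
            planes[name] = rng.normal(size=shape)  # stand-in for learned features
    return planes


def bilinear(plane, u, v):
    """Bilinearly interpolate an (H, W, F) plane at normalized coords in [0, 1]."""
    h, w, _ = plane.shape
    fu, fv = u * (h - 1), v * (w - 1)
    u0, v0 = int(np.floor(fu)), int(np.floor(fv))
    u1, v1 = min(u0 + 1, h - 1), min(v0 + 1, w - 1)
    du, dv = fu - u0, fv - v0
    top = (1 - dv) * plane[u0, v0] + dv * plane[u0, v1]
    bot = (1 - dv) * plane[u1, v0] + dv * plane[u1, v1]
    return (1 - du) * top + du * bot


def hexplane_feature(planes, point):
    """Fuse one vector from each of the six planes into a single feature."""
    fused = []
    for a, b in PAIRS:
        fa = bilinear(planes[a], point[AXES[a[0]]], point[AXES[a[1]]])
        fb = bilinear(planes[b], point[AXES[b[0]]], point[AXES[b[1]]])
        fused.append(fa * fb)  # assumed fusion: elementwise product per pair
    return np.concatenate(fused)  # assumed fusion: concatenate across pairs


rng = np.random.default_rng(0)
planes = make_planes(rng)
# A point (x, y, z, t) with every coordinate normalized to [0, 1].
feature = hexplane_feature(planes, np.array([0.25, 0.5, 0.75, 0.1]))
```

In this sketch a query costs six bilinear lookups and a handful of vector operations, with no network evaluation until the final decoding step, which is the kind of cost profile the abstract contrasts with many MLP evaluations per sample. The resulting `feature` vector has length `3 * FEAT` and would be the input to the small color-regressing MLP.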