Numerous recent approaches to modeling and re-rendering dynamic scenes leverage plane-based explicit representations, addressing the slow training times associated with models like neural radiance fields (NeRF) and Gaussian splatting (GS). However, merely decomposing 4D dynamic scenes into multiple 2D plane-based representations is insufficient for high-fidelity re-rendering of scenes with complex motions. In response, we present DaRePlane, a novel direction-aware representation approach that captures scene dynamics from six different directions. This learned representation undergoes an inverse dual-tree complex wavelet transform (DTCWT) to recover plane-based information. Within NeRF pipelines, DaRePlane computes features for each space-time point by fusing vectors from these recovered planes, which are then passed to a tiny MLP for color regression. When applied to Gaussian splatting, DaRePlane computes the features of Gaussian points, which are followed by a tiny multi-head MLP for spatio-temporal deformation prediction. Notably, to address the redundancy introduced by the six real and six imaginary direction-aware wavelet coefficients, we introduce a trainable masking approach that mitigates storage issues without significant performance decline. To demonstrate the generality and efficiency of DaRePlane, we test it on both regular and surgical dynamic scenes, for both NeRF and GS systems. Extensive experiments show that DaRePlane yields state-of-the-art performance in novel view synthesis for a variety of complex dynamic scenes.