Recent advances in structured 3D Gaussians for view-adaptive rendering, particularly through methods like Scaffold-GS, have demonstrated promising results in neural scene representation. However, existing approaches still face challenges in maintaining perceptual consistency and capturing precise view-dependent effects. We present PEP-GS, a novel framework that enhances structured 3D Gaussians through three key innovations: (1) a Local-Enhanced Multi-head Self-Attention (LEMSA) mechanism that replaces spherical harmonics for more accurate view-dependent color decoding; (2) Kolmogorov-Arnold Networks (KAN) that optimize the Gaussian opacity and covariance functions for enhanced interpretability and splatting precision; and (3) a Neural Laplacian Pyramid Decomposition (NLPD) that improves perceptual similarity across views. Our comprehensive evaluation across multiple datasets indicates that, compared with current state-of-the-art methods, these improvements are particularly evident in challenging scenarios such as view-dependent effects, specular reflections, fine-scale details, and the suppression of false geometry.
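To make the first innovation concrete, the sketch below illustrates one plausible way an attention-based decoder could replace spherical harmonics for view-dependent color: the anchor feature and view direction are projected into a short token sequence, mixed by a depthwise convolution (the "local enhancement") and multi-head self-attention, then decoded to RGB. All module names, token counts, and dimensions here are illustrative assumptions, not the paper's actual LEMSA implementation.

```python
# Minimal PyTorch sketch of an attention-based view-dependent color decoder
# in the spirit of LEMSA. Hypothetical design: local (depthwise conv) branch
# plus global (self-attention) branch over feature tokens.
import torch
import torch.nn as nn

class AttnColorDecoder(nn.Module):
    def __init__(self, feat_dim=32, n_tokens=8, n_heads=4):
        super().__init__()
        self.n_tokens, self.token_dim = n_tokens, feat_dim
        # Project anchor feature + view direction into a short token sequence.
        self.to_tokens = nn.Linear(feat_dim + 3, n_tokens * feat_dim)
        # "Local enhancement": depthwise conv mixing neighboring tokens.
        self.local = nn.Conv1d(feat_dim, feat_dim, kernel_size=3,
                               padding=1, groups=feat_dim)
        self.attn = nn.MultiheadAttention(feat_dim, n_heads, batch_first=True)
        self.to_rgb = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                                    nn.Linear(feat_dim, 3), nn.Sigmoid())

    def forward(self, anchor_feat, view_dir):
        # anchor_feat: (B, feat_dim); view_dir: (B, 3), unit-normalized.
        x = self.to_tokens(torch.cat([anchor_feat, view_dir], dim=-1))
        x = x.view(-1, self.n_tokens, self.token_dim)           # (B, T, C)
        x = x + self.local(x.transpose(1, 2)).transpose(1, 2)   # local branch
        x, _ = self.attn(x, x, x)                               # global branch
        return self.to_rgb(x.mean(dim=1))                       # (B, 3) color

decoder = AttnColorDecoder()
rgb = decoder(torch.randn(4, 32),
              torch.nn.functional.normalize(torch.randn(4, 3), dim=-1))
print(rgb.shape)  # torch.Size([4, 3])
```

Unlike a fixed spherical-harmonics basis, a learned decoder of this kind can, in principle, represent sharper angular variation (e.g., specular highlights) at the cost of extra per-anchor compute.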