Recently, with the development of Neural Radiance Fields and Gaussian Splatting, 3D reconstruction techniques have achieved remarkably high fidelity. However, the latent representations learned by these methods are highly entangled and lack interpretability. In this paper, we propose a novel part-aware compositional reconstruction method, called GaussianBlock, that enables semantically coherent and disentangled representations, allowing for precise, physically plausible editing akin to building blocks, while simultaneously maintaining high fidelity. GaussianBlock introduces a hybrid representation that leverages the advantages of both primitives, known for their flexible manipulability and editability, and 3D Gaussians, which excel in reconstruction quality. Specifically, we achieve semantically coherent primitives through a novel attention-guided centering loss derived from 2D semantic priors, complemented by a dynamic splitting and fusion strategy. Furthermore, we hybridize the primitives with 3D Gaussians to refine structural details and enhance fidelity, and employ a binding inheritance strategy to strengthen and maintain the connection between the two. Across diverse benchmarks, our reconstructed scenes are shown to be disentangled, compositional, and compact, enabling seamless, direct, and precise editing while preserving high quality.