The modeling and manipulation of 3D scenes captured from the real world are pivotal in various applications and are attracting growing research interest. While previous editing works have achieved promising results by manipulating 3D meshes, they often require accurately reconstructed meshes to perform editing, which limits their applicability in 3D content generation. To address this gap, we introduce a novel single-image-driven 3D scene editing approach based on 3D Gaussian Splatting, enabling intuitive manipulation by directly editing content on a 2D image plane. Our method learns to optimize the 3D Gaussians to align with an edited version of the image rendered from a user-specified viewpoint of the original scene. To capture long-range object deformation, we introduce a positional loss into the optimization process of 3D Gaussian Splatting and enable gradient propagation through reparameterization. To handle 3D Gaussians occluded when rendering from the specified viewpoint, we build an anchor-based structure and employ a coarse-to-fine optimization strategy capable of handling long-range deformation while maintaining structural stability. Furthermore, we design a novel masking strategy to adaptively identify non-rigid deformation regions for fine-scale modeling. Extensive experiments show the effectiveness of our method in handling geometric details, long-range deformation, and non-rigid deformation, demonstrating superior editing flexibility and quality compared to previous approaches.
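To make the positional-loss idea concrete, the following is a minimal toy sketch (not the paper's implementation): Gaussian centers are pulled toward target positions, standing in for the edited configuration, by descending the gradient of a squared-distance loss. All function and variable names here are hypothetical, and the analytic gradient stands in for the reparameterized gradient propagation described above.

```python
import numpy as np

# Hypothetical illustration only: a positional loss that pulls 3D Gaussian
# centers toward target positions derived from an edited view. This is a
# simplified stand-in for the paper's method, not its actual implementation.

def positional_loss(centers, targets):
    """Mean squared Euclidean distance between Gaussian centers and targets."""
    diff = centers - targets
    return np.mean(np.sum(diff ** 2, axis=1))

def positional_loss_grad(centers, targets):
    """Analytic gradient of the loss with respect to the centers."""
    n = centers.shape[0]
    return 2.0 * (centers - targets) / n

# Toy data: 4 Gaussian centers in 3D and their edited target positions.
centers = np.zeros((4, 3))
targets = np.ones((4, 3))

# Plain gradient descent on the centers; each step shrinks the residual.
lr = 0.5
for _ in range(100):
    centers -= lr * positional_loss_grad(centers, targets)

print(round(positional_loss(centers, targets), 6))  # → 0.0
```

In the actual method, such a loss would be only one term in the optimization, combined with the photometric objective against the edited rendering and constrained by the anchor-based structure.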