Text-to-image diffusion models are gradually being introduced into computer graphics, and have recently enabled the development of open-domain Text-to-3D pipelines. However, for interactive editing, local manipulation of content through a simplistic textual interface can be arduous. Incorporating user-guided sketches into Text-to-image pipelines offers users more intuitive control. Still, because state-of-the-art Text-to-3D pipelines rely on optimizing Neural Radiance Fields (NeRFs) through gradients from arbitrary rendering views, conditioning on sketches is not straightforward. In this paper, we present SKED, a technique for editing 3D shapes represented by NeRFs. Our technique uses as few as two guiding sketches from different views to alter an existing neural field. The edited region respects the prompt semantics through a pre-trained diffusion model. To ensure the generated output adheres to the provided sketches, we propose novel loss functions that generate the desired edits while preserving the density and radiance of the base instance. We demonstrate the effectiveness of our proposed method through several qualitative and quantitative experiments.