Enabling AI systems to carry out tasks by following human instructions can significantly boost productivity. In this paper, we present InstructP2P, an end-to-end framework for 3D shape editing on point clouds, guided by high-level textual instructions. InstructP2P extends existing methods by combining the strengths of a text-conditioned point cloud diffusion model, Point-E, with those of powerful language models, enabling color and geometry editing from language instructions. To train InstructP2P, we introduce a new shape editing dataset, constructed by integrating a shape segmentation dataset, off-the-shelf shape programs, and diverse edit instructions generated by a large language model, ChatGPT. Our method edits both the color and the geometry of specific regions in a single forward pass while leaving other regions unaffected. In our experiments, InstructP2P generalizes to novel shape categories and instructions despite being trained on a limited amount of data.