Generating 3D models has traditionally been a complex task requiring specialized expertise. While recent advances in generative AI have sought to automate this process, existing methods produce non-editable representations, such as meshes or point clouds, which limits their adaptability for iterative design. In this paper, we introduce Proc3D, a system designed to generate editable 3D models while enabling real-time modifications. At its core, Proc3D introduces the procedural compact graph (PCG), a graph representation of a 3D model that encodes the algorithmic rules and structures necessary for generating it. This representation exposes key parameters, allowing intuitive manual adjustments via sliders and checkboxes, as well as real-time, automated modifications through natural language prompts using large language models (LLMs). We demonstrate Proc3D's capabilities using two generative approaches: GPT-4o with in-context learning (ICL) and a fine-tuned LLAMA-3 model. Experimental results show that Proc3D outperforms existing methods in editing efficiency, achieving a speedup of more than 400x over conventional approaches that require full regeneration for each modification. Additionally, Proc3D improves the ULIP score, a metric that evaluates the alignment between generated 3D models and text prompts, by 28%. By enabling text-aligned 3D model generation along with precise, real-time parametric edits, Proc3D facilitates highly accurate text-based image editing applications.
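To make the idea of a parameter-exposing graph concrete, the following is a minimal sketch in Python. It is not the PCG format used by Proc3D, which the abstract does not specify. The `Node` class, the three operation names (`box`, `scale`, `array`), and the helpers `evaluate`, `exposed_parameters`, and `build_table_legs` are all hypothetical, chosen only to illustrate how editing one parameter and re-evaluating a small graph can stand in for regenerating a whole model.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One operation in a hypothetical procedural graph (illustrative only)."""
    op: str                                      # "box", "scale", or "array"
    params: dict = field(default_factory=dict)   # editable parameters
    inputs: list = field(default_factory=list)   # names of upstream nodes


def evaluate(graph, name):
    """Return the list of (width, height, depth) boxes produced by node `name`."""
    node = graph[name]
    if node.op == "box":
        p = node.params
        return [(p["width"], p["height"], p["depth"])]
    # Gather the boxes produced by every upstream node, in order.
    upstream = [box for src in node.inputs for box in evaluate(graph, src)]
    if node.op == "scale":
        f = node.params["factor"]
        return [(w * f, h * f, d * f) for (w, h, d) in upstream]
    if node.op == "array":
        # Repeat the upstream geometry `count` times (placement is omitted).
        return upstream * node.params["count"]
    raise ValueError(f"unknown op: {node.op}")


def exposed_parameters(graph):
    """List (node name, parameter name, value) triples a slider UI could show."""
    return [(name, key, value)
            for name, node in graph.items()
            for key, value in node.params.items()]


def build_table_legs():
    """Assumed example graph: one leg box, scaled, then repeated four times."""
    return {
        "leg": Node("box", {"width": 2, "height": 10, "depth": 2}),
        "scaled": Node("scale", {"factor": 3}, ["leg"]),
        "legs": Node("array", {"count": 4}, ["scaled"]),
    }


graph = build_table_legs()
boxes_before = evaluate(graph, "legs")

# A parametric edit touches one value and re-evaluates the graph;
# nothing is regenerated from a text prompt.
graph["legs"].params["count"] = 6
boxes_after = evaluate(graph, "legs")
```

In this toy setting, a slider corresponds to one entry returned by `exposed_parameters`, and a natural-language edit would reduce to choosing which entry to change and to what value. How Proc3D maps prompts to such edits is beyond what the abstract states.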