Text-to-3D generation has advanced rapidly, with methods such as DreamFusion using large-scale text-to-image diffusion models to supervise 3D generation. These methods, including the variational score distillation proposed by ProlificDreamer, can synthesize detailed, photorealistic textured meshes. However, the appearance of the 3D objects they generate is often random and cannot be controlled, which makes appearance-controllable 3D generation difficult. To address this challenge, we introduce IPDreamer, a novel approach that uses image prompts to supply specific and comprehensive appearance information for 3D object generation. Our results show that IPDreamer generates high-quality 3D objects that are consistent with both the provided text and image prompts, indicating its promise for appearance-controllable 3D object generation.