Recent advances in 3D generation have been remarkable, with methods such as DreamFusion leveraging large-scale text-to-image diffusion models to supervise 3D object generation. These methods enable the synthesis of detailed, photorealistic textured objects. However, the appearance of 3D objects produced by these text-to-3D models is unpredictable, and single-image-to-3D methods struggle with complex images, posing a challenge for generating appearance-controllable 3D objects. To achieve controllable synthesis of complex 3D objects, we propose IPDreamer, a novel approach that incorporates image prompt adaptation to extract detailed, comprehensive appearance features from complex images, which are then used to guide 3D object generation. Our experiments show that IPDreamer effectively generates high-quality 3D objects consistent with both the provided text and the appearance of complex image prompts, demonstrating its promising capability in appearance-controllable 3D object generation. Our code is available at https://github.com/zengbohan0217/IPDreamer.