Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors

Reconstructing 3D objects from a single image under the guidance of pretrained diffusion models has shown promising results. However, because existing methods apply a rigid, case-agnostic strategy, they generalize poorly to arbitrary cases and their reconstructions lack 3D consistency. In this work, we propose Consistent123, a case-aware two-stage method that reconstructs highly consistent 3D assets from one image using both 2D and 3D diffusion priors. In the first stage, Consistent123 uses only 3D structural priors to exploit geometry sufficiently, with a CLIP-based case-aware adaptive detection mechanism embedded in this process. In the second stage, 2D texture priors are introduced and progressively take on the dominant guiding role, delicately sculpting the details of the 3D model. Consistent123 thereby tracks the evolving guidance requirements more closely, adaptively providing adequate 3D geometric initialization and suitable 2D texture refinement for different objects. Consistent123 produces highly 3D-consistent reconstructions and generalizes well across various objects. Qualitative and quantitative experiments show that our method significantly outperforms state-of-the-art image-to-3D methods. See https://Consistent123.github.io for a more comprehensive exploration of our generated 3D assets.
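
The two-stage guidance described above can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the detection rule or the weighting schedule, so the plateau test on a CLIP similarity history, the linear ramp, and the names `should_switch` and `guidance_weights` are all assumptions made purely for illustration.

```python
from typing import List, Tuple


def should_switch(similarities: List[float], window: int = 3, tol: float = 0.01) -> bool:
    """Hypothetical case-aware check: leave stage 1 once a similarity
    score (e.g. CLIP similarity between renders and the input image)
    has stopped changing over the last `window` evaluations.
    The plateau criterion is an assumption, not taken from the paper."""
    if len(similarities) <= window:
        return False  # not enough history to judge a plateau
    return abs(similarities[-1] - similarities[-1 - window]) < tol


def guidance_weights(step: int, stage1_steps: int, total_steps: int) -> Tuple[float, float]:
    """Return (w_3d, w_2d) for one optimization step.
    Stage 1 uses only the 3D prior; in stage 2 the 2D prior ramps up
    linearly until it dominates. The linear ramp is an assumption."""
    if step < stage1_steps:
        return 1.0, 0.0
    span = max(1, total_steps - stage1_steps)
    progress = min(1.0, max(0.0, (step - stage1_steps) / span))
    return 1.0 - progress, progress


if __name__ == "__main__":
    # Made-up similarity history that flattens out after a few evaluations.
    history = [0.5, 0.8, 0.8, 0.8, 0.8]
    print(should_switch(history))
    for s in (0, 10, 15, 20):
        print(s, guidance_weights(s, stage1_steps=10, total_steps=20))
```

In this sketch the stage boundary is set per object by `should_switch` rather than fixed in advance, which is one simple way to read "case-aware"; the actual mechanism in Consistent123 may differ.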

