BoostDream: Efficient Refining for High-Quality Text-to-3D Generation from Multi-View Diffusion

Driven by the evolution of text-to-image diffusion models, text-to-3D generation has made significant strides. Two paradigms currently dominate the field: feed-forward generation, which produces 3D assets quickly but often yields coarse results, and Score Distillation Sampling (SDS) based optimization, which generates high-fidelity 3D assets but at a slower pace. Integrating the two holds substantial promise for advancing 3D generation. In this paper, we present BoostDream, a highly efficient plug-and-play 3D refining method designed to transform coarse 3D assets into high-quality ones. The BoostDream framework comprises three processes: (1) We introduce 3D model distillation, which fits differentiable representations to the 3D assets obtained through feed-forward generation. (2) We design a novel multi-view SDS loss, which uses a multi-view aware 2D diffusion model to refine the 3D assets. (3) We propose using the prompt and multi-view consistent normal maps as guidance during refinement. We conduct extensive experiments on different differentiable 3D representations, which show that BoostDream generates high-quality 3D assets rapidly and overcomes the Janus problem that affects conventional SDS-based methods. This marks a substantial advance in both the efficiency and the quality of 3D generation.
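
For readers unfamiliar with SDS, the block below sketches the standard single-view SDS gradient as commonly written in the text-to-3D literature, followed by one plausible multi-view form. The abstract does not give BoostDream's exact loss, so the second equation is an illustrative assumption, not the paper's formulation. All symbols are introduced here for illustration: $\theta$ are the parameters of the differentiable 3D representation, $g$ is a differentiable renderer, $c$ is a camera pose, $y$ is the text prompt, $\hat{\epsilon}_\phi$ is the diffusion model's noise prediction, and $w(t)$ is a timestep weighting.

```latex
% Single-view SDS: render one view, add noise, and use the
% denoiser's residual as a gradient signal on the 3D parameters.
\begin{align}
  x &= g(\theta, c), \qquad
  x_t = \alpha_t x + \sigma_t \epsilon, \qquad
  \epsilon \sim \mathcal{N}(0, I), \\
  \nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta)
    &= \mathbb{E}_{t,\epsilon}\!\left[
         w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
         \frac{\partial x}{\partial \theta}
       \right].
\end{align}

% Assumed multi-view variant (not taken from the paper): render N views
% x^{(i)} = g(\theta, c_i), noise each one, and let a multi-view aware
% denoiser predict the noise for all views jointly.
\begin{align}
  \nabla_\theta \mathcal{L}_{\mathrm{MV}}(\theta)
    &= \mathbb{E}_{t,\epsilon}\!\left[
         w(t) \sum_{i=1}^{N}
         \bigl(\hat{\epsilon}^{(i)}_\phi(x^{(1:N)}_t;\, y,\, c_{1:N},\, t)
               - \epsilon^{(i)}\bigr)\,
         \frac{\partial x^{(i)}}{\partial \theta}
       \right].
\end{align}
```

Under this reading, conditioning the noise prediction on all views at once is what would discourage each view from independently converging to a front-facing appearance, which is the usual source of the Janus problem.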

