We present DreamBeast, a novel method based on score distillation sampling (SDS) for generating fantastical 3D animal assets composed of distinct parts. Existing SDS methods often struggle with this generation task due to a limited understanding of part-level semantics in text-to-image diffusion models. While recent diffusion models, such as Stable Diffusion 3, demonstrate a better part-level understanding, they are prohibitively slow and exhibit other common problems associated with single-view diffusion models. DreamBeast overcomes these limitations through a novel part-aware knowledge transfer mechanism. For each generated asset, we efficiently extract part-level knowledge from the Stable Diffusion 3 model into a 3D Part-Affinity implicit representation. This enables us to instantly generate Part-Affinity maps from arbitrary camera views, which we then use to modulate the guidance of a multi-view diffusion model during SDS to create 3D assets of fantastical animals. DreamBeast significantly enhances the quality of generated 3D creatures with user-specified part compositions while reducing computational overhead, as demonstrated by extensive quantitative and qualitative evaluations.
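As a rough illustration of the modulation idea described above, the sketch below weights the standard SDS gradient direction (the difference between predicted and injected noise) by a per-pixel Part-Affinity map rendered from the current camera view. All names and the exact weighting scheme here are assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def sds_grad_with_affinity(noise_pred, noise, affinity_map, w_t=1.0):
    """Hypothetical sketch: modulate the SDS gradient with a Part-Affinity map.

    noise_pred, noise: (H, W, C) arrays from the diffusion model and the
    injected noise; affinity_map: (H, W) per-pixel part weights in [0, 1].
    """
    # Broadcast the (H, W) affinity map over channels and scale the
    # usual SDS direction w_t * (noise_pred - noise).
    return w_t * affinity_map[..., None] * (noise_pred - noise)

rng = np.random.default_rng(0)
H, W, C = 8, 8, 3
noise_pred = rng.normal(size=(H, W, C))
noise = rng.normal(size=(H, W, C))
# Stand-in for a Part-Affinity map rendered from the implicit representation.
affinity = rng.uniform(size=(H, W))

grad = sds_grad_with_affinity(noise_pred, noise, affinity)
print(grad.shape)  # (8, 8, 3)
```

In the actual method, the affinity maps come from the learned 3D Part-Affinity implicit representation rather than random values, so the modulation is consistent across camera views.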