In this paper, we present Surf-D, a novel method for generating high-quality 3D shapes as Surfaces with arbitrary topologies using Diffusion models. Specifically, we adopt the Unsigned Distance Field (UDF) as the surface representation, as it excels at handling arbitrary topologies and thus enables the generation of complex shapes. While prior methods have explored shape generation with different representations, they suffer from limited topologies and geometric detail. Moreover, directly extending prior diffusion models to UDF is non-trivial: their discrete volume structure lacks spatial continuity, whereas UDF requires accurate gradients for mesh extraction and learning. To tackle these issues, we first leverage a point-based auto-encoder to learn a compact latent space, which supports gradient querying for any input point through differentiation and thereby captures intricate geometry at high resolution. Because learning difficulty varies across shapes, we employ a curriculum learning strategy to embed diverse surfaces efficiently, improving the overall embedding process. With the pretrained shape latent space, we employ a latent diffusion model to learn the distribution of shapes. We conduct extensive experiments on unconditional generation, category-conditional generation, 3D reconstruction from images, and text-to-shape tasks, in which our approach demonstrates superior shape generation performance across multiple modalities.
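To make the role of UDF gradients concrete, the following minimal sketch evaluates an analytic UDF for an open surface and queries its gradient at an arbitrary point. It is an illustration under stated assumptions, not the Surf-D implementation: the square patch, the function names (`udf_open_square`, `udf_gradient`, `project_to_surface`), and the use of central finite differences in place of differentiation through a learned decoder are all hypothetical choices made for this example.

```python
import math


def udf_open_square(p):
    """Unsigned distance from p = (x, y, z) to the open square patch
    [-1, 1] x [-1, 1] lying in the plane z = 0 (a hypothetical test surface)."""
    x, y, z = p
    dx = max(abs(x) - 1.0, 0.0)  # in-plane overshoot beyond the patch along x
    dy = max(abs(y) - 1.0, 0.0)  # in-plane overshoot beyond the patch along y
    return math.sqrt(dx * dx + dy * dy + z * z)


def udf_gradient(udf, p, eps=1e-5):
    """Central finite-difference gradient of udf at p.
    Assumption: this stands in for differentiating a learned decoder."""
    grad = []
    for i in range(3):
        hi = list(p)
        lo = list(p)
        hi[i] += eps
        lo[i] -= eps
        grad.append((udf(hi) - udf(lo)) / (2.0 * eps))
    return grad


def project_to_surface(udf, p):
    """Move p onto the surface by stepping the distance d against the
    normalized gradient: p - d * grad / |grad|."""
    d = udf(p)
    g = udf_gradient(udf, p)
    norm = math.sqrt(sum(c * c for c in g))
    if norm < 1e-8:  # gradient is undefined on the surface itself
        return list(p)
    return [pi - d * gi / norm for pi, gi in zip(p, g)]


query = (0.3, -0.2, 0.5)
distance = udf_open_square(query)
gradient = udf_gradient(udf_open_square, query)
projected = project_to_surface(udf_open_square, query)
```

The distance is the same on both sides of the patch, which is why a UDF can describe open surfaces that a signed field cannot. The gradient points away from the nearest surface point, so stepping back along it by the queried distance lands on the surface. This also shows why inaccurate gradients would directly degrade mesh extraction.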