We present CraftsMan, a novel generative 3D modeling system that can generate high-fidelity 3D geometries with highly varied shapes, regular mesh topologies, and detailed surfaces, and that, notably, allows the geometry to be refined interactively. Despite significant advances in 3D generation, existing methods still struggle with lengthy optimization processes, irregular mesh topologies, noisy surfaces, and difficulty accommodating user edits, which impedes their adoption in 3D modeling software. Our work is inspired by the craftsman, who usually roughs out the overall form of a work first and then elaborates its surface details. Specifically, we employ a 3D native diffusion model, which operates on a latent space learned from latent set-based 3D representations, to generate coarse geometries with regular mesh topology in seconds. This process takes a text prompt or a reference image as input and leverages a powerful multi-view (MV) diffusion model to generate multiple views of the coarse geometry. These views are fed into our MV-conditioned 3D diffusion model to generate the 3D geometry, which significantly improves robustness and generalizability. A normal-based geometry refiner then significantly enhances the surface details. This refinement can be performed automatically, or interactively with user-supplied edits. Extensive experiments demonstrate that our method produces higher-quality 3D assets than existing methods. Project page: https://craftsman3d.github.io/. Code: https://github.com/wyysf-98/CraftsMan
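The coarse-to-fine flow described in the abstract (condition, then multi-view images, then coarse geometry, then optional normal-based refinement) can be sketched as a composition of three stages. This is only an illustrative outline: `Mesh`, `generate`, and the `stub_*` functions are hypothetical placeholders invented for this example and are not the CraftsMan code or API. The stubs return toy data in place of real diffusion models.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional, Sequence, Tuple

# NOTE: every name in this sketch is a hypothetical placeholder,
# not part of the CraftsMan repository.

Image = str  # stand-in for a real image type


@dataclass(frozen=True)
class Mesh:
    vertices: Tuple[Tuple[float, float, float], ...]
    faces: Tuple[Tuple[int, int, int], ...]
    refined: bool = False          # set by the refinement stage
    edit: Optional[str] = None     # user-supplied edit applied, if any


def stub_multiview(condition: str, n_views: int = 4) -> List[Image]:
    """Stand-in for the MV diffusion model: text or image -> several views."""
    return [f"{condition}@view{i}" for i in range(n_views)]


def stub_shape_model(views: Sequence[Image]) -> Mesh:
    """Stand-in for the MV-conditioned 3D diffusion model: views -> coarse mesh."""
    if not views:
        raise ValueError("at least one view is required")
    # A single triangle plays the role of the coarse geometry.
    return Mesh(
        vertices=((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
        faces=((0, 1, 2),),
    )


def stub_refiner(mesh: Mesh, user_edit: Optional[str] = None) -> Mesh:
    """Stand-in for the normal-based refiner; optionally records a user edit."""
    return replace(mesh, refined=True, edit=user_edit)


def generate(
    condition: str,
    multiview: Callable[[str], List[Image]],
    shape_model: Callable[[Sequence[Image]], Mesh],
    refiner: Callable[[Mesh, Optional[str]], Mesh],
    user_edit: Optional[str] = None,
    refine: bool = True,
) -> Mesh:
    """Run the stages in order: condition -> views -> coarse mesh -> refined mesh."""
    views = multiview(condition)
    coarse = shape_model(views)
    if not refine:
        return coarse
    return refiner(coarse, user_edit)


if __name__ == "__main__":
    auto = generate("a wooden chair", stub_multiview, stub_shape_model, stub_refiner)
    edited = generate(
        "a wooden chair", stub_multiview, stub_shape_model, stub_refiner,
        user_edit="add carving to the backrest",
    )
    print(auto.refined, edited.edit)
```

Passing the stages in as callables keeps the coarse generation and the refinement independent, which mirrors the abstract's point that refinement can run automatically or be driven by a user edit.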