We present Meta 3D AssetGen (AssetGen), a significant advancement in text-to-3D generation which produces faithful, high-quality meshes with texture and material control. Unlike works that bake shading into the 3D object's appearance, AssetGen outputs physically-based rendering (PBR) materials, supporting realistic relighting. AssetGen first generates several views of the object with factored shaded and albedo appearance channels, and then reconstructs colours, metalness and roughness in 3D, using a deferred shading loss for efficient supervision. It also uses a signed distance function to represent 3D shape more reliably and introduces a corresponding loss for direct shape supervision. This is implemented using fused kernels for high memory efficiency. After mesh extraction, a texture refinement transformer operating in UV space significantly improves sharpness and details. AssetGen achieves a 17% improvement in Chamfer Distance and a 40% improvement in LPIPS over the best concurrent work for few-view reconstruction, and a human preference of 72% over the best industry competitors of comparable speed, including those that support PBR. Project page with generated assets: https://assetgen.github.io
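For readers unfamiliar with the Chamfer Distance metric cited above, the following is a minimal sketch of one common symmetric variant (mean squared nearest-neighbour distance in both directions) computed on point sets sampled from two surfaces. The function name `chamfer_distance`, the squared-distance convention, and the brute-force NumPy implementation are assumptions made for illustration; the abstract does not specify the exact variant or sampling protocol used in the evaluation.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer Distance between point sets a (N, 3) and b (M, 3).

    Illustrative variant only: sums the mean squared nearest-neighbour
    distance from a to b and from b to a.
    """
    # Pairwise squared Euclidean distances, shape (N, M).
    diff = a[:, None, :] - b[None, :, :]
    d2 = np.sum(diff * diff, axis=-1)
    # Nearest neighbour in b for each point of a, and vice versa.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())
```

Lower values indicate closer agreement between the reconstructed and reference shapes. The brute-force pairwise matrix costs O(NM) memory, so a spatial index such as a k-d tree would be preferable for large point sets.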