MaterialSeg3D: Segmenting Dense Materials from 2D Priors for 3D Assets

Driven by powerful image diffusion models, recent research has achieved the automatic creation of 3D objects from textual or visual guidance. By performing score distillation sampling (SDS) iteratively across different views, these methods succeed in lifting the 2D generative prior into 3D space. However, such a 2D generative image prior bakes the effects of illumination and shadow into the texture. As a result, material maps optimized by SDS inevitably contain spuriously correlated components. The absence of a precise material definition makes it infeasible to relight the generated assets reasonably in novel scenes, which limits their application in downstream scenarios. In contrast, humans can effortlessly circumvent this ambiguity by deducing the material of an object from its appearance and semantics. Motivated by this insight, we propose MaterialSeg3D, a 3D asset material generation framework that infers the underlying material from a 2D semantic prior. Based on such a prior model, we devise a mechanism to parse materials in 3D space. We maintain a UV stack, each map of which is unprojected from a specific viewpoint. After traversing all viewpoints, we fuse the stack through a weighted voting scheme and then employ region unification to ensure the coherence of the object parts. To fuel the learning of the semantic prior, we collect a material dataset, named Materialized Individual Objects (MIO), which features abundant images, diverse categories, and accurate annotations. Extensive quantitative and qualitative experiments demonstrate the effectiveness of our method.
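
The UV-stack fusion step described in the abstract can be illustrated with a minimal sketch. It assumes that each viewpoint yields a UV map of integer material ids together with a per-texel vote weight; the function name `fuse_uv_stack`, the `IGNORE` marker, and the choice of weights are hypothetical, since the abstract does not specify them. Region unification is not covered here.

```python
from collections import defaultdict

IGNORE = -1  # hypothetical marker for texels not visible from a given view


def fuse_uv_stack(uv_labels, uv_weights):
    """Fuse a stack of per-view UV label maps by weighted voting.

    uv_labels[v][y][x]  -- material id predicted for a texel from view v,
                           or IGNORE if the texel is not seen from that view
    uv_weights[v][y][x] -- vote weight for that prediction (assumed input,
                           e.g. some per-view confidence)
    Returns a single H x W map holding the winning material id per texel.
    """
    height = len(uv_labels[0])
    width = len(uv_labels[0][0])
    fused = [[IGNORE] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Accumulate the weight each material id receives across views.
            votes = defaultdict(float)
            for labels, weights in zip(uv_labels, uv_weights):
                label = labels[y][x]
                if label != IGNORE:
                    votes[label] += weights[y][x]
            # Texels never observed keep the IGNORE marker.
            if votes:
                fused[y][x] = max(votes, key=votes.get)
    return fused
```

With three views over a 1 x 3 UV map, a texel labeled 0 with weight 0.9 in one view and 1 with weight 0.3 in two others resolves to material 0, because the summed weight for material 1 is only 0.6. A texel that no view observes stays `IGNORE`.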

