Prior material creation methods had limitations in producing diverse results, mainly because reconstruction-based methods relied on real-world measurements and generation-based methods were trained on relatively small material datasets. To address these challenges, we propose DreamPBR, a novel diffusion-based generative framework designed to create spatially-varying appearance properties guided by text and multi-modal controls, providing high controllability and diversity in material generation. The key to achieving diverse and high-quality PBR material generation lies in integrating the capabilities of recent large-scale vision-language models trained on billions of text-image pairs with material priors derived from hundreds of PBR material samples. We use a novel material Latent Diffusion Model (LDM) to establish the mapping between albedo maps and the corresponding latent space. The latent representation is then decoded into full SVBRDF parameter maps using a rendering-aware PBR decoder. Our method supports tileable generation through convolution with circular padding. Furthermore, we introduce a multi-modal guidance module, comprising pixel-aligned guidance, style image guidance, and 3D shape guidance, to enhance the control capabilities of the material LDM. We demonstrate the effectiveness of DreamPBR in material creation, showcasing its versatility and user-friendliness across a wide range of controllable generation and editing applications.
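To make the tileability claim concrete, below is a minimal NumPy sketch of convolution with circular (wrap-around) padding; it is an illustration of the general technique, not the authors' implementation. The key property is shift-equivariance under circular shifts: because padding wraps around, the operator treats the texture as if it lived on a torus, so its output has no seams when tiled.

```python
import numpy as np

def conv2d_circular(x, k):
    """2D correlation with circular (wrap-around) padding, 'same' output size.

    Circular padding makes the operation equivariant to circular shifts,
    which is what keeps generated texture maps tileable.
    """
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    # np.pad with mode="wrap" implements the circular padding.
    xp = np.pad(x, ((ph, ph), (pw, pw)), mode="wrap")
    H, W = x.shape
    out = np.empty((H, W), dtype=float)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))   # toy "texture" (hypothetical data)
k = rng.standard_normal((3, 3))   # toy convolution kernel

y = conv2d_circular(x, k)
# Shift-equivariance check: circularly shifting the input and then convolving
# equals convolving and then shifting -- the hallmark of a seamless operator.
y_shift = conv2d_circular(np.roll(x, (3, 5), axis=(0, 1)), k)
assert np.allclose(y_shift, np.roll(y, (3, 5), axis=(0, 1)))
```

In deep-learning frameworks the same effect is typically obtained by configuring convolution layers with circular padding (e.g., `padding_mode="circular"` in PyTorch's `nn.Conv2d`) throughout the network, so every intermediate feature map, and hence the final output, stays tileable.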