Recent advances in diffusion models have significantly improved the synthesis of materials, textures, and 3D shapes. By conditioning these models via text or images, users can guide the generation, reducing the time required to create digital assets. In this paper, we address the synthesis of structured, stationary patterns, where diffusion models are generally less reliable and, more importantly, less controllable. Our approach leverages the generative capabilities of diffusion models specifically adapted for the pattern domain. It enables users to exercise direct control over the synthesis by expanding a partially hand-drawn pattern into a larger design while preserving the structure and details of the input. To enhance pattern quality, we fine-tune an image-pretrained diffusion model on structured patterns using Low-Rank Adaptation (LoRA), apply a noise rolling technique to ensure tileability, and utilize a patch-based approach to facilitate the generation of large-scale assets. We demonstrate the effectiveness of our method through a comprehensive set of experiments, showing that it outperforms existing models in generating diverse, consistent patterns that respond directly to user input.