The entertainment industry relies on 3D visual content to create immersive experiences, but traditional methods for creating textured 3D models can be time-consuming and subjective. Generative networks such as StyleGAN have advanced image synthesis, yet generating 3D objects with high-fidelity textures remains underexplored, and existing methods have limitations. We propose the Semantic-guided Conditional Texture Generator (CTGAN), which produces high-quality textures for 3D shapes that are consistent with the viewing angle while respecting shape semantics. CTGAN exploits the disentangled nature of StyleGAN to finely manipulate the input latent codes, enabling explicit control over both the style and structure of the generated textures. A coarse-to-fine encoder architecture is introduced to enhance control over the structure of the resulting textures via input segmentation. Experimental results show that CTGAN outperforms existing methods on multiple quality metrics and achieves state-of-the-art performance on texture generation in both conditional and unconditional settings.
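As a rough illustration of what separate control over structure and style through latent codes can look like, the following minimal sketch builds a per-layer latent stack in the spirit of StyleGAN style mixing. It is not the CTGAN implementation: the layer count, the coarse/fine split, the latent dimension, and the names `random_code` and `mix_latents` are all assumptions made for this example, and the random vectors merely stand in for an encoded segmentation and a sampled style.

```python
import random

NUM_LAYERS = 14      # hypothetical number of generator layers
LATENT_DIM = 8       # kept tiny for readability; real latent codes are much larger
COARSE_LAYERS = 6    # hypothetical split between structure and style layers


def random_code(rng, dim=LATENT_DIM):
    """Draw one latent vector from a standard normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def mix_latents(structure_code, style_code,
                num_layers=NUM_LAYERS, coarse_layers=COARSE_LAYERS):
    """Build a per-layer latent stack.

    Layers below `coarse_layers` receive the structure code; the
    remaining layers receive the style code.
    """
    return [
        list(structure_code) if layer < coarse_layers else list(style_code)
        for layer in range(num_layers)
    ]


rng = random.Random(0)
structure_code = random_code(rng)  # stand-in for an encoding of the input segmentation
style_code = random_code(rng)      # stand-in for a sampled style code
w_plus = mix_latents(structure_code, style_code)
```

In this toy setup, swapping `style_code` while keeping `structure_code` fixed changes only the later layers of the stack, which is the kind of separation the abstract describes at a high level.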