Conditioning Diffusion Models via Attributes and Semantic Masks for Face Generation

Deep generative models have shown impressive results in generating realistic images of faces. GANs managed to generate high-quality, high-fidelity images when conditioned on semantic masks, but they still lack the ability to diversify their output. Diffusion models partially solve this problem and are able to generate diverse samples given the same condition. In this paper, we propose a multi-conditioning approach for diffusion models via cross-attention exploiting both attributes and semantic masks to generate high-quality and controllable face images. We also studied the impact of applying perceptual-focused loss weighting into the latent space instead of the pixel space. Our method extends the previous approaches by introducing conditioning on more than one set of features, guaranteeing a more fine-grained control over the generated face images. We evaluate our approach on the CelebA-HQ dataset, and we show that it can generate realistic and diverse samples while allowing for fine-grained control over multiple attributes and semantic regions. Additionally, we perform an ablation study to evaluate the impact of different conditioning strategies on the quality and diversity of the generated images.

翻译：深度生成模型在生成逼真人脸图像方面取得了显著成果。生成对抗网络（GAN）在语义掩码条件下能生成高质量、高保真度的图像，但依然缺乏输出多样性。扩散模型部分解决了这一问题，能够在相同条件下生成多样化的样本。本文提出一种基于交叉注意力的多条件扩散模型方法，通过同时利用属性和语义掩码生成高质量且可控的人脸图像。我们还研究了将感知聚焦损失权重应用于潜在空间而非像素空间的影响。该方法通过引入对多组特征的条件约束，扩展了现有技术，从而实现对生成人脸图像更精细的控制。我们在CelebA-HQ数据集上评估了该方法，结果表明其能生成逼真且多样化的样本，同时支持对多属性和语义区域的精细控制。此外，我们进行了消融研究，以评估不同条件策略对生成图像质量和多样性的影响。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

百篇论文纵览大型语言模型最新研究进展

专知会员服务

70+阅读 · 2023年3月31日

【Hugging Face】使用自定义数据集微调语义分割模型，Fine-Tune a Semantic Segmentation Model with a Custom Dataset

专知会员服务

21+阅读 · 2022年3月18日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日