Existing attribute editing methods treat semantic attributes as binary, so they produce only a single edit per attribute. However, attributes such as eyeglasses, smiles, or hairstyles vary widely in appearance. In this work, we formulate the task of \textit{diverse attribute editing} by modeling the multidimensional nature of attribute edits, which lets users generate multiple plausible edits per attribute. We capitalize on the disentangled latent spaces of pretrained GANs and train a Denoising Diffusion Probabilistic Model (DDPM) to learn the latent distribution of diverse edits. Specifically, we train the DDPM on a dataset of edit latent directions obtained by embedding image pairs that differ in a single attribute. This yields latent subspaces that enable diverse attribute editing. Applying diffusion in the highly compressed latent space allows us to model rich distributions of edits with limited computational resources. Through extensive qualitative and quantitative experiments across a range of datasets, we demonstrate the effectiveness of our approach for diverse attribute editing. We also show results of our method applied to 3D editing of various face attributes.
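To make the pipeline described above concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes that an edit direction is the difference between the latent codes of the edited and source images, and it uses the standard DDPM noise-prediction objective with a linear noise schedule. The latent dimensionality, the schedule values, and the names `edit_directions`, `q_sample`, `ddpm_loss`, `apply_edit`, and `predict_eps` are all illustrative assumptions; `predict_eps` stands in for a trained denoising network.

```python
import numpy as np


def edit_directions(w_src, w_edit):
    """Assumed definition: latent of the edited image minus latent of the source."""
    return w_edit - w_src


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule; the endpoint values are illustrative defaults."""
    return np.linspace(beta_start, beta_end, num_steps)


def q_sample(d0, t, alpha_bar, rng):
    """Forward diffusion in closed form: noise clean directions d0 to timesteps t."""
    eps = rng.standard_normal(d0.shape)
    a = alpha_bar[t][:, None]  # one cumulative alpha per sample, broadcast over dims
    d_t = np.sqrt(a) * d0 + np.sqrt(1.0 - a) * eps
    return d_t, eps


def ddpm_loss(predict_eps, d0, alpha_bar, rng):
    """Noise-prediction objective: MSE between the true and predicted noise."""
    t = rng.integers(0, len(alpha_bar), size=d0.shape[0])
    d_t, eps = q_sample(d0, t, alpha_bar, rng)
    return float(np.mean((predict_eps(d_t, t) - eps) ** 2))


def apply_edit(w, direction, strength=1.0):
    """Move a latent code along an edit direction (sampled or observed)."""
    return w + strength * direction


# Toy data standing in for embedded image pairs (hypothetical 512-d latents).
rng = np.random.default_rng(0)
w_src = rng.standard_normal((8, 512))
w_edit = w_src + 0.1 * rng.standard_normal((8, 512))

directions = edit_directions(w_src, w_edit)
alpha_bar = np.cumprod(1.0 - linear_beta_schedule(1000))

# Placeholder denoiser that always predicts zero noise; a real model is a network.
loss = ddpm_loss(lambda d_t, t: np.zeros_like(d_t), directions, alpha_bar, rng)
```

Because the diffusion model operates on latent vectors rather than images, each training sample is a single vector, which is what keeps the computational cost low. At inference time, a direction sampled from the trained model would be passed to `apply_edit` and the result decoded by the pretrained GAN generator.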