Few-shot image generation aims to train generative models using only a small number of training images. When very few images are available for training (e.g., 10 images), Learning From Scratch (LFS) methods often generate images that closely resemble the training data, while Transfer Learning (TL) methods try to improve performance by leveraging prior knowledge from GANs pre-trained on large-scale datasets. However, current TL methods may not allow sufficient control over how much knowledge is preserved from the source model, making them unsuitable for settings where the source and target domains are not closely related. To address this, we propose a novel pipeline called Peer is your Pillar (PIP), which combines a target few-shot dataset with a peer dataset to form a data-unbalanced conditional generation task. Our approach includes a class embedding method that separates the class space from the latent space, and it uses a direction loss based on pre-trained CLIP to improve image diversity. Experiments on various few-shot datasets demonstrate the effectiveness of the proposed PIP and, in particular, show that it reduces the training requirements of few-shot image generation.
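The abstract does not give the form of the direction loss, so the following is only a minimal sketch of one common formulation: one minus the cosine similarity between two direction vectors in an embedding space. It assumes the embeddings (for example, CLIP image features) have already been computed and are passed in as plain lists of floats. The function names `cosine_similarity` and `direction_loss`, and the pairing of a "reference" direction with a "generated" direction, are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("direction vectors must be non-zero")
    return dot / (norm_u * norm_v)


def direction_loss(ref_a, ref_b, gen_a, gen_b):
    """Assumed form: 1 - cos(ref_b - ref_a, gen_b - gen_a).

    ref_a, ref_b: embeddings of two reference images (hypothetical pairing).
    gen_a, gen_b: embeddings of the two corresponding generated images.
    The loss is 0 when both directions are parallel and 2 when opposite.
    """
    ref_dir = [b - a for a, b in zip(ref_a, ref_b)]
    gen_dir = [b - a for a, b in zip(gen_a, gen_b)]
    return 1.0 - cosine_similarity(ref_dir, gen_dir)
```

Under this assumed form, the loss penalizes generated pairs whose difference in embedding space points in a different direction from the reference pair, which is one way a model can be discouraged from collapsing distinct inputs onto near-identical outputs.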