Because they lack fine-grained structure and texture information, existing fusion-based few-shot image generation methods suffer from unsatisfactory generation quality and diversity. To address this problem, we propose a novel feature Equalization fusion Generative Adversarial Network (EqGAN) for few-shot image generation. Unlike existing fusion strategies that rely on either deep features or local representations, we design two separate branches that fuse structures and textures by disentangling the encoded features into shallow and deep contents. To refine image contents at all feature levels, we equalize the fused structure and texture semantics at different scales and supply the decoder with richer information through skip connections. Since the fused structures and textures may be inconsistent with each other, we devise a consistent equalization loss between the equalized features and the intermediate output of the decoder to further align the semantics. Comprehensive experiments on three public datasets demonstrate that EqGAN not only significantly improves generation performance in terms of FID (by up to 32.7%) and LPIPS (by up to 4.19%), but also outperforms state-of-the-art methods in accuracy (by up to 1.97%) on downstream classification tasks.
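The abstract does not specify the fusion operator, the equalization operator, or the exact form of the consistent equalization loss. As a rough illustration of the data flow only, the toy sketch below assumes that fusion is a random convex combination of features from k conditioning images, that equalization is a plain average of the two fused branches, and that the consistency term is a mean absolute difference. All function names, vector sizes, and the placeholder decoder feature are hypothetical and are not taken from EqGAN.

```python
import random


def random_convex_weights(k, rng):
    """Sample k positive weights that sum to 1 (assumed fusion coefficients)."""
    raw = [rng.random() + 1e-8 for _ in range(k)]
    total = sum(raw)
    return [r / total for r in raw]


def fuse(features, weights):
    """Convex combination of k feature vectors, one per conditioning image."""
    assert len(features) == len(weights)
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


def l1_consistency(equalized, decoder_feature):
    """Mean absolute difference between two equally sized feature vectors."""
    assert len(equalized) == len(decoder_feature)
    diffs = [abs(a - b) for a, b in zip(equalized, decoder_feature)]
    return sum(diffs) / len(diffs)


rng = random.Random(0)
k, dim = 3, 4  # toy setting: 3 conditioning images, 4-dimensional features

# Stand-ins for encoder outputs: shallow features (structure) and deep
# features (texture). Real feature maps would differ in shape per level.
shallow = [[rng.random() for _ in range(dim)] for _ in range(k)]
deep = [[rng.random() for _ in range(dim)] for _ in range(k)]

# Two separate branches, each with its own fusion weights.
fused_structure = fuse(shallow, random_convex_weights(k, rng))
fused_texture = fuse(deep, random_convex_weights(k, rng))

# Assumed stand-in for equalization: average the two fused branches.
equalized = [(s + t) / 2 for s, t in zip(fused_structure, fused_texture)]

# Placeholder for an intermediate decoder output at the same scale.
decoder_feature = [rng.random() for _ in range(dim)]

loss = l1_consistency(equalized, decoder_feature)
print(f"toy consistency loss: {loss:.4f}")
```

In a real model these vectors would be multi-scale feature maps, and the equalized features would also be passed to the decoder through skip connections; the sketch only shows where the consistency term sits relative to the two fusion branches.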