Generative Adversarial Networks (GANs) face the significant challenge of striking an optimal balance between high-quality image generation and training stability. Recent architectures such as DCGAN, BigGAN, and StyleGAN improve visual fidelity, but they often struggle with mode collapse and unstable gradients at large network depths. This paper proposes a novel GAN architecture, termed the Inception Generative Adversarial Network (IGAN), which incorporates deeper inception-inspired convolutions and dilated convolutions. By reducing mode collapse and preventing vanishing and exploding gradients, the IGAN model generates high-quality synthetic images while maintaining training stability. The proposed model achieves Fréchet Inception Distance (FID) scores of 13.12 and 15.08 on the CUB-200 and ImageNet datasets, respectively, a 28-33% improvement over state-of-the-art GANs. It also attains Inception Scores (IS) of 9.27 and 68.25 on the two datasets, reflecting improved image diversity and generation quality. In addition, dropout and spectral normalization are applied in both the generator and the discriminator to further mitigate gradient explosion and overfitting. These findings indicate that the IGAN model effectively balances training stability with image generation quality, constituting a scalable and computationally efficient framework for high-fidelity image synthesis.
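The abstract mentions spectral normalization as one of the stabilization techniques applied to the generator and discriminator. As a minimal illustrative sketch (not the authors' implementation), spectral normalization rescales each weight matrix by an estimate of its largest singular value, which is typically obtained via power iteration; the function name and iteration count below are assumptions for illustration:

```python
import numpy as np

def spectral_normalize(W, n_iters=30):
    """Divide weight matrix W by an estimate of its largest singular
    value (spectral norm), computed with power iteration.

    Illustrative sketch only; real GAN frameworks maintain the power-
    iteration vectors across training steps instead of restarting."""
    rng = np.random.default_rng(0)
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_iters):
        # Alternate W^T u and W v, renormalizing each time, to converge
        # toward the top left/right singular vectors of W.
        v = W.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ W @ v  # estimated spectral norm of W
    return W / sigma

# Usage: after normalization the spectral norm of the weights is ~1,
# which bounds the layer's Lipschitz constant and damps exploding gradients.
rng = np.random.default_rng(1)
W = rng.standard_normal((4, 3))
W_sn = spectral_normalize(W)
print(np.linalg.norm(W_sn, 2))  # close to 1.0
```

Constraining every layer's spectral norm to roughly 1 keeps the discriminator approximately 1-Lipschitz, which is the standard rationale for why spectral normalization stabilizes GAN training.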