Image generation through generative modelling is an active area of research. Its applications include up-scaling existing images, synthesising objects that do not exist, such as interior design scenes, products, or even human faces, and supporting transfer learning. In this context, Generative Adversarial Networks (GANs), a widely studied class of machine learning frameworks introduced in "Generative adversarial nets" by Goodfellow et al., address these goals. In this work, we reproduce and evaluate a novel variation of the original GAN, the GANformer, proposed in "Generative Adversarial Transformers" by Hudson and Zitnick. The project aimed to re-implement the methods presented in that paper, reproduce the original results, and comment on the authors' claims. Due to resource and time limitations, we had to constrain the training time, the types of datasets, and their sizes. We successfully recreated both variations of the proposed GANformer model and found differences between the authors' results and ours. Moreover, discrepancies between the methodology described in the publication and the one implemented in the released code allowed us to study two undisclosed variations of the presented procedures.