Several recent studies advocate the use of spectral discriminators, which evaluate the Fourier spectra of images, for generative modeling. However, why spectral discriminators are effective is not yet well understood. We address this question by examining spectral discriminators in the context of perceptual image super-resolution (i.e., GAN-based SR), since SR image quality is sensitive to spectral changes. Our analyses reveal that the spectral discriminator indeed outperforms the ordinary (a.k.a. spatial) discriminator at identifying differences in the high-frequency range, whereas the spatial discriminator holds an advantage in the low-frequency range. We therefore suggest that the spectral and spatial discriminators should be used together. Moreover, we improve the spectral discriminator by first computing patch-wise Fourier spectra and then aggregating them with a Transformer. We verify the effectiveness of the proposed method in two ways. First, thanks to the additional spectral discriminator, the spectra of our SR images are better aligned with those of real images, which leads to a better perception-distortion (PD) tradeoff. Second, our ensembled discriminator predicts perceptual quality more accurately, as evidenced on the no-reference image quality assessment task.
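The patch-wise spectrum step mentioned above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `patchwise_log_spectrum`, the patch size of 8, the use of non-overlapping patches, the grayscale input, and the log-amplitude scaling are all assumptions made for illustration, and the Transformer that would aggregate the resulting tokens is omitted.

```python
import numpy as np


def patchwise_log_spectrum(image, patch=8):
    """Return one log-amplitude Fourier spectrum per non-overlapping patch.

    `image` is a 2-D grayscale array; `patch` is a hypothetical patch size.
    The output has shape (num_patches, patch * patch), i.e. one flattened
    spectrum ("token") per patch.
    """
    h, w = image.shape
    if h % patch or w % patch:
        raise ValueError("image sides must be multiples of the patch size")

    # (h, w) -> (h // patch, w // patch, patch, patch)
    patches = image.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)

    # 2-D FFT of every patch, with the zero frequency shifted to the centre.
    spectra = np.fft.fft2(patches, axes=(-2, -1))
    spectra = np.fft.fftshift(spectra, axes=(-2, -1))

    # log(1 + |F|) compresses the large dynamic range of the amplitudes.
    log_amp = np.log1p(np.abs(spectra))

    # Flatten each patch spectrum into one token.
    return log_amp.reshape(-1, patch * patch)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens = patchwise_log_spectrum(rng.random((32, 32)), patch=8)
    print(tokens.shape)
```

Under these assumptions, each row of the result is a token that a sequence model such as a Transformer encoder could consume; how the tokens are embedded and aggregated is not specified in the abstract.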