Recent decades have produced massive and diverse image data of increasingly high resolution and quality. However, some of the images obtained may be corrupted, which impairs both perception and downstream applications. A generic method for generating a high-quality image from a degraded one is therefore in demand. In this paper, we present a novel GAN inversion framework that exploits the powerful generative ability of StyleGAN-XL for this problem. To ease the difficulty of inverting StyleGAN-XL, we propose Clustering & Regularize Inversion (CRI). Specifically, the latent space is first divided into finer-grained sub-spaces by clustering. Instead of initializing the inversion with the average latent vector, we approximate a centroid latent vector from the clusters, which generates an image close to the input image. Then, an offset with a regularization term is introduced to keep the inverted latent vector within a certain range. We validate our CRI scheme on multiple restoration tasks (i.e., inpainting, colorization, and super-resolution) of complex natural images, and show favorable quantitative and qualitative results. We further demonstrate that our technique is robust with respect to data and across different GAN models. To the best of our knowledge, we are the first to adopt StyleGAN-XL for generating high-quality natural images from diverse degraded inputs. Code is available at https://github.com/Booooooooooo/CRI.
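The two steps described above, cluster-based initialization followed by a regularized offset, can be illustrated with a small toy sketch. This is not the authors' implementation: a random linear map `A` stands in for StyleGAN-XL, the degradation is a binary inpainting mask, and the names `kmeans`, `invert`, `objective`, and `lam` are hypothetical. Two further choices are assumptions, since the abstract does not specify them: the centroid is selected by reconstruction error on the observed pixels, and the offset is penalized with a squared L2 norm.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: splits sampled latent vectors into k clusters."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids

def objective(delta, c, y, mask, A, lam):
    """Masked reconstruction error plus a squared-L2 penalty on the offset."""
    residual = mask * (A @ (c + delta)) - y
    return float(residual @ residual + lam * (delta @ delta))

def invert(y, mask, A, centroids, lam=0.1, steps=200):
    """Start from the best-matching centroid, then optimize a regularized offset."""
    # Assumed selection rule: the centroid whose degraded output is closest to y.
    errors = [np.sum((mask * (A @ c) - y) ** 2) for c in centroids]
    c = centroids[int(np.argmin(errors))]
    delta = np.zeros_like(c)
    # Step size 1/L, where L bounds the curvature of the quadratic objective.
    lr = 1.0 / (2.0 * (np.linalg.norm(A, 2) ** 2 + lam))
    for _ in range(steps):
        residual = mask * (A @ (c + delta)) - y
        grad = 2.0 * A.T @ (mask * residual) + 2.0 * lam * delta
        delta -= lr * grad
    return c, delta

rng = np.random.default_rng(0)
A = rng.normal(size=(64, 8))                  # toy linear "generator" (assumption)
latents = rng.normal(size=(500, 8))           # sampled latent vectors
centroids = kmeans(latents, k=10)
mask = (rng.random(64) > 0.3).astype(float)   # 1 = observed pixel, 0 = missing
y = mask * (A @ latents[0])                   # degraded observation
c, delta = invert(y, mask, A, centroids)
restored = A @ (c + delta)                    # full output, including masked pixels
```

In this sketch the penalty weight `lam` controls how far the inverted latent `c + delta` may drift from the chosen centroid: a larger value keeps it closer to the cluster, at the cost of a looser fit to the observed pixels.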