Recent advances in face manipulation using StyleGAN have produced impressive results. However, StyleGAN is inherently limited to cropped, aligned faces at the fixed image resolution on which it was pre-trained. In this paper, we propose a simple and effective solution to this limitation: we use dilated convolutions to rescale the receptive fields of the shallow layers in StyleGAN, without altering any model parameters. This allows the small, fixed-size features at shallow layers to be extended into larger ones that can accommodate variable resolutions, making them more robust in characterizing unaligned faces. To enable real face inversion and manipulation, we introduce a corresponding encoder that provides the first-layer feature of the extended StyleGAN in addition to the latent style code. We validate the effectiveness of our method on unaligned face inputs of various resolutions across a diverse set of face manipulation tasks, including facial attribute editing, super-resolution, sketch/mask-to-face translation, and face toonification.
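To illustrate the core idea that dilation enlarges a layer's receptive field while reusing exactly the same kernel weights, here is a minimal pure-Python sketch. It is not the paper's implementation: the function names `dilated_conv2d` and `receptive_field` are hypothetical, the kernel is assumed to be square and odd-sized, and zero padding is assumed so that the output keeps the input's spatial size.

```python
def receptive_field(kernel_size, dilation):
    """Spatial extent covered by one k x k kernel at a given dilation."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv2d(image, kernel, dilation=1):
    """Zero-padded 2D cross-correlation with a square, odd-sized kernel.

    The kernel weights are used unchanged; only the spacing between the
    sampled input positions grows with `dilation`.
    """
    k = len(kernel)
    pad = dilation * (k // 2)  # keeps the output the same size as the input
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy = y + dilation * i - pad
                    xx = x + dilation * j - pad
                    if 0 <= yy < h and 0 <= xx < w:  # zero padding outside
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


# Toy input (an assumption for illustration): a single impulse in a 9x9 map.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 1.0
kernel = [[1.0] * 3 for _ in range(3)]  # the same 9 weights in both calls

standard = dilated_conv2d(image, kernel, dilation=1)  # 3x3 footprint
dilated = dilated_conv2d(image, kernel, dilation=2)   # 5x5 footprint
```

In this toy setting, the impulse spreads over a 3x3 neighbourhood with `dilation=1` and over a 5x5 neighbourhood (sampled every second pixel) with `dilation=2`, although the kernel still has nine weights. This is the sense in which a pre-trained layer can cover a larger feature map without its parameters being altered.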