Unpaired image-to-image translation (UNIT) aims to map images between two visual domains without paired training data. However, given a UNIT model trained on certain domains, current methods struggle to incorporate new domains because they often need to retrain the full model on both the existing and the new domains. To address this problem, we propose a new domain-scalable UNIT method, termed latent space anchoring, which can be efficiently extended to new visual domains without fine-tuning the encoders and decoders of existing domains. Our method anchors images of different domains to the same latent space of frozen GANs by learning lightweight encoder and regressor models to reconstruct single-domain images. In the inference phase, the learned encoders and decoders of different domains can be arbitrarily combined to translate images between any two domains without fine-tuning. Experiments on various datasets show that the proposed method achieves superior performance on both standard and domain-scalable UNIT tasks compared with state-of-the-art methods.
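The modular structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `LatentAnchoredTranslator`, the helper `make_linear`, the dimensions, and the use of untrained random linear maps in place of trained networks are all hypothetical. It also assumes that each domain's regressor reads the output of the frozen generator, which the abstract does not spell out. The sketch only shows how per-domain encoders and regressors could be combined around a shared frozen generator, and that registering a new domain leaves existing modules untouched.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List

Vector = List[float]
LATENT_DIM = 4    # hypothetical size of the shared latent space
FEATURE_DIM = 6   # hypothetical size of the frozen generator's output


def make_linear(in_dim: int, out_dim: int, seed: int) -> Callable[[Vector], Vector]:
    """Return a fixed random linear map; a stand-in for a trained network."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

    def apply(x: Vector) -> Vector:
        if len(x) != in_dim:
            raise ValueError(f"expected {in_dim} values, got {len(x)}")
        return [sum(w * v for w, v in zip(row, x)) for row in weights]

    return apply


# Stand-in for the frozen, pretrained GAN generator shared by all domains.
frozen_generator = make_linear(LATENT_DIM, FEATURE_DIM, seed=0)


@dataclass(frozen=True)
class DomainModules:
    encoder: Callable[[Vector], Vector]    # domain image -> shared latent code
    regressor: Callable[[Vector], Vector]  # generator output -> domain image


class LatentAnchoredTranslator:
    """Hypothetical container that pairs per-domain modules with one generator."""

    def __init__(self, generator: Callable[[Vector], Vector]) -> None:
        self.generator = generator
        self.domains: Dict[str, DomainModules] = {}

    def add_domain(self, name: str, image_dim: int, seed: int) -> None:
        # In the described method these modules would be trained on
        # single-domain reconstruction; here they are random placeholders.
        # Existing entries in self.domains are never modified.
        self.domains[name] = DomainModules(
            encoder=make_linear(image_dim, LATENT_DIM, seed),
            regressor=make_linear(FEATURE_DIM, image_dim, seed + 1),
        )

    def translate(self, image: Vector, source: str, target: str) -> Vector:
        latent = self.domains[source].encoder(image)     # anchor to shared latent
        features = self.generator(latent)                # frozen generator pass
        return self.domains[target].regressor(features)  # decode into target domain


model = LatentAnchoredTranslator(frozen_generator)
model.add_domain("photo", image_dim=8, seed=1)
model.add_domain("sketch", image_dim=5, seed=3)
sketch = model.translate([0.1] * 8, source="photo", target="sketch")
print(len(sketch))

# A later domain is added without touching the "photo" or "sketch" modules.
model.add_domain("mask", image_dim=3, seed=5)
print(len(model.translate([0.2] * 3, source="mask", target="photo")))
```

Because every domain maps into and out of the same latent space, any source encoder can be paired with any target regressor, which is what allows translation between two domains that were never trained together.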