We present Unified Contrastive Arbitrary Style Transfer (UCAST), a novel style representation learning and transfer framework that can be integrated into most existing arbitrary image style transfer models, e.g., CNN-based, ViT-based, and flow-based methods. A suitable style representation is the key component of image style transfer and is essential for achieving satisfactory results. Existing approaches based on deep neural networks typically use second-order statistics to generate the output. However, these hand-crafted features, computed from a single image, cannot sufficiently leverage style information, which leads to artifacts such as local distortions and style inconsistency. To address these issues, we propose to learn style representations directly from a large number of images via contrastive learning, taking into account the relationships between specific styles and the holistic style distribution. Specifically, we present an adaptive contrastive learning scheme for style transfer that introduces an input-dependent temperature. Our framework consists of three key components: a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of the style distribution, and a generative network for style transfer. Qualitative and quantitative evaluations show that our approach produces results superior to those obtained by state-of-the-art methods.
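To make the idea of a contrastive loss with an input-dependent temperature concrete, the following is a minimal sketch and not the UCAST implementation. It assumes an InfoNCE-style objective over cosine similarities between style codes. The temperature head (a linear map of the anchor code squashed into a fixed range), the function names, and the parameters `weights`, `bias`, `tau_min`, and `tau_max` are all hypothetical choices made for illustration; the abstract does not specify how the temperature is computed.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def input_dependent_temperature(anchor, weights, bias, tau_min=0.05, tau_max=1.0):
    """Hypothetical temperature head (an assumption, not taken from the paper).

    A linear map of the anchor style code is passed through a sigmoid and
    rescaled, so the temperature always lies in (tau_min, tau_max).
    """
    logit = sum(w * a for w, a in zip(weights, anchor)) + bias
    gate = 1.0 / (1.0 + math.exp(-logit))
    return tau_min + (tau_max - tau_min) * gate


def adaptive_info_nce(anchor, positive, negatives, weights, bias):
    """InfoNCE-style loss whose temperature depends on the anchor input."""
    tau = input_dependent_temperature(anchor, weights, bias)
    pos_logit = cosine_similarity(anchor, positive) / tau
    logits = [pos_logit] + [cosine_similarity(anchor, n) / tau for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_denominator - pos_logit


if __name__ == "__main__":
    anchor = [1.0, 0.0]        # style code of a stylized output (toy values)
    positive = [0.9, 0.1]      # code of the style image it should match
    negatives = [[0.0, 1.0], [-1.0, 0.0]]  # codes of other styles
    loss = adaptive_info_nce(anchor, positive, negatives, [0.0, 0.0], 0.0)
    print(f"loss = {loss:.4f}")
```

In this sketch a smaller temperature sharpens the softmax over candidates, so anchors assigned a low temperature are penalized more strongly for being close to negatives. In a trained model the temperature head would be learned jointly with the style encoder rather than fixed as it is here.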