We present Unified Contrastive Arbitrary Style Transfer (UCAST), a novel style representation learning and transfer framework that fits into most existing arbitrary image style transfer models, e.g., CNN-based, ViT-based, and flow-based methods. A suitable style representation is a key component of image style transfer and is essential for achieving satisfactory results. Existing approaches based on deep neural networks typically use second-order statistics to generate the output. However, these hand-crafted features, computed from a single image, cannot sufficiently leverage style information, which leads to artifacts such as local distortions and style inconsistency. To address these issues, we propose to learn style representations directly from a large number of images via contrastive learning, taking into account the relationships between specific styles and the holistic style distribution. Specifically, we present an adaptive contrastive learning scheme for style transfer that introduces an input-dependent temperature. Our framework consists of three key components: a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of the style distribution, and a generative network for style transfer. Qualitative and quantitative evaluations show that our approach produces results superior to those of state-of-the-art methods.
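To make the idea of an input-dependent temperature concrete, the following is a minimal sketch of an InfoNCE-style contrastive loss in which each anchor carries its own temperature. It is an illustration under stated assumptions, not the UCAST implementation: the abstract does not say how the temperature is computed or how positives and negatives are chosen, so the temperatures are simply passed in by the caller, and the function names `info_nce` and `adaptive_info_nce` are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(anchor, positive, negatives, tau):
    """InfoNCE loss for a single anchor with its own temperature tau > 0.

    Assumption: similarity is cosine similarity between style codes.
    """
    a = l2_normalize(anchor)
    # Index 0 holds the positive pair; the remaining entries are negatives.
    logits = [dot(a, l2_normalize(positive)) / tau]
    logits += [dot(a, l2_normalize(n)) / tau for n in negatives]
    # Numerically stable log-sum-exp over all logits.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    # Negative log-probability assigned to the positive pair.
    return log_sum - logits[0]


def adaptive_info_nce(anchors, positives, negatives, taus):
    """Mean InfoNCE loss where taus[i] is the temperature for anchors[i].

    Assumption: in a real model taus would be predicted from the input;
    here they are plain numbers supplied by the caller.
    """
    losses = [
        info_nce(a, p, negatives, t)
        for a, p, t in zip(anchors, positives, taus)
    ]
    return sum(losses) / len(losses)
```

With a fixed temperature every anchor is penalized on the same scale. Letting the temperature vary per input changes how sharply each anchor separates its positive from the negatives, which is the general mechanism an adaptive scheme relies on.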