Neural style transfer (NST) has evolved significantly in recent years. Yet, despite this rapid progress, existing NST methods either struggle to transfer the aesthetic information of a style image effectively, or suffer from high computational cost and inefficient feature disentanglement because they rely on pre-trained models. This work proposes a lightweight but effective model, AesFA (Aesthetic Feature-Aware NST). The primary idea is to decompose the image by its frequencies to better disentangle aesthetic styles from the reference image, while training the entire model end-to-end so that pre-trained models are excluded completely at inference. To improve the network's ability to extract more distinct representations and further enhance stylization quality, this work introduces a new aesthetic feature contrastive loss. Extensive experiments and ablations show that the approach not only outperforms recent NST methods in stylization quality but also achieves faster inference. Code is available at https://github.com/Sooyyoungg/AesFA.
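As a rough illustration of what decomposing an image by its frequencies can mean, the sketch below splits an image into a smooth low-frequency component and a residual high-frequency component. It is only a minimal NumPy sketch under stated assumptions: the function name `split_frequencies`, the `factor` parameter, and the block-averaging low-pass filter are hypothetical choices made for this example. The abstract does not specify the layers AesFA uses for this step.

```python
import numpy as np


def split_frequencies(image, factor=2):
    """Split an (H, W, C) image into low- and high-frequency parts.

    Assumption: the low-pass filter is block averaging followed by
    nearest-neighbour upsampling. This is an illustrative choice and
    not the decomposition used inside AesFA.
    """
    h, w, c = image.shape
    if h % factor or w % factor:
        raise ValueError("height and width must be divisible by factor")
    # Average each factor x factor block: a crude low-pass filter.
    blocks = image.reshape(h // factor, factor, w // factor, factor, c)
    coarse = blocks.mean(axis=(1, 3))
    # Upsample back to full resolution by repeating each coarse pixel.
    low = coarse.repeat(factor, axis=0).repeat(factor, axis=1)
    # The residual keeps fine detail such as edges and texture.
    high = image - low
    return low, high


# Usage on a random stand-in for a style image.
rng = np.random.default_rng(0)
img = rng.random((8, 8, 3))
low, high = split_frequencies(img)
```

By construction, the two components sum back to the input, so a model that processes them separately discards no information at this step. The low band mostly carries color and coarse layout, while the high band carries edges and texture.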