The field of Neural Style Transfer (NST) has witnessed remarkable progress in the past few years, with approaches now able to synthesize artistic and photorealistic images and videos of exceptional quality. To evaluate such results, a diverse landscape of evaluation methods and metrics is used, including authors' opinions based on side-by-side comparisons, human evaluation studies that quantify the subjective judgements of participants, and a multitude of quantitative computational metrics that objectively assess different aspects of an algorithm's performance. However, there is no consensus on which evaluation procedure is most suitable and effective for ensuring reliable results. In this review, we provide an in-depth analysis of existing evaluation techniques, identify the inconsistencies and limitations of current evaluation methods, and give recommendations for standardized evaluation practices. We believe that the development of a robust evaluation framework will not only enable more meaningful and fairer comparisons among NST methods but will also improve the understanding and interpretation of research findings in the field.