With the rapid development of e-commerce and digital fashion, image-based virtual try-on (VTON) has attracted increasing attention. However, existing VTON models often suffer from artifacts such as garment distortion and body inconsistency, highlighting the need for reliable quality evaluation of VTON-generated images. To this end, we construct VTONQA, the first multi-dimensional quality assessment dataset specifically designed for VTON, which contains 8,132 images generated by 11 representative VTON models, along with 24,396 mean opinion scores (MOSs) across three evaluation dimensions (i.e., clothing fit, body compatibility, and overall quality). Based on VTONQA, we benchmark both VTON models and a diverse set of image quality assessment (IQA) metrics, revealing the limitations of existing methods and highlighting the value of the proposed dataset. We believe that the VTONQA dataset and corresponding benchmarks will provide a solid foundation for perceptually aligned evaluation, benefiting both the development of quality assessment methods and the advancement of VTON models.
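As an illustration of how an IQA metric might be benchmarked against per-dimension MOSs, the following minimal sketch computes a Spearman rank correlation for each of the three evaluation dimensions. The abstract does not state which correlation criteria are used, so the choice of Spearman correlation is an assumption, and the dimension keys, score values, and helper names below are hypothetical toy placeholders rather than VTONQA data or the authors' protocol. Only the sanity check on dataset size (8,132 images times three dimensions gives 24,396 MOSs) comes from the text above.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Size check taken from the abstract: one MOS per image per dimension.
NUM_IMAGES = 8132
DIMENSIONS = ("clothing_fit", "body_compatibility", "overall_quality")
assert NUM_IMAGES * len(DIMENSIONS) == 24396

# Hypothetical toy scores for five images (NOT real VTONQA data).
metric_scores = [0.2, 0.5, 0.4, 0.9, 0.7]
mos = {
    "clothing_fit": [1.5, 3.0, 2.8, 4.6, 4.0],
    "body_compatibility": [2.0, 2.5, 3.5, 4.0, 3.0],
    "overall_quality": [4.5, 3.0, 3.2, 1.0, 2.0],
}

# One correlation per evaluation dimension.
srcc_by_dimension = {dim: spearman(metric_scores, scores) for dim, scores in mos.items()}

for dim, value in srcc_by_dimension.items():
    print(f"{dim}: SRCC = {value:.3f}")
```

Reporting one correlation per dimension, rather than a single pooled number, keeps a metric that tracks overall quality well from hiding weak agreement on clothing fit or body compatibility.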