Image-based virtual try-on aims to synthesize a natural image of a person wearing a given garment from a clothing image. It is revolutionizing online shopping and has inspired related topics in image generation, showing both research significance and commercial potential. However, a gap remains between current research progress and commercial applications, and the field lacks a comprehensive overview that could accelerate its development. In this survey, we provide a comprehensive analysis of state-of-the-art techniques and methodologies in terms of pipeline architecture, person representation, and key modules such as try-on indication, clothing warping, and the try-on stage. We propose a new semantic criterion based on CLIP and evaluate representative methods with uniformly implemented evaluation metrics on the same dataset. In addition to quantitative and qualitative evaluation of current open-source methods, we highlight unresolved issues and outline future research directions to identify key trends and inspire further exploration. The uniformly implemented evaluation metrics, dataset, and collected methods will be made publicly available at https://github.com/little-misfit/Survey-Of-Virtual-Try-On.