Image-based virtual try-on aims to synthesize a naturally dressed person image given a clothing image. It is reshaping online shopping and has inspired related topics within image generation, giving it both research significance and commercial potential. However, there is a large gap between current research progress and commercial applications, and the field lacks a comprehensive overview that could accelerate its development. In this survey, we provide a comprehensive analysis of state-of-the-art techniques and methodologies in terms of pipeline architecture, person representation, and key modules such as try-on indication, clothing warping, and the try-on stage. We propose a new semantic criterion based on CLIP, and evaluate representative methods with uniformly implemented evaluation metrics on the same dataset. In addition to quantitative and qualitative evaluation of current open-source methods, we also use ControlNet to fine-tune a recent large image generation model (PBE) to show the future potential of large-scale models on the image-based virtual try-on task. Finally, we identify unresolved issues and outline future research directions to highlight key trends and inspire further exploration. The uniformly implemented evaluation metrics, dataset, and collected methods will be made publicly available at https://github.com/little-misfit/Survey-Of-Virtual-Try-On.