Image-based virtual try-on, widely used in online shopping, aims to generate images of a naturally dressed person conditioned on given garments, and holds significant research and commercial potential. A key challenge is to generate realistic images of the model wearing the garments while preserving garment details. Previous methods treat try-on as an inpainting task: they mask parts of the original standing image of the model and then inpaint the masked areas to show the model wearing the reference garments. However, such approaches require the user to provide a complete, high-quality standing image, which is user-unfriendly in practical applications. In this paper, we propose Try-On-Adapter (TOA), an outpainting paradigm that differs from the existing inpainting paradigm. TOA preserves the given face and garment, naturally imagines the remaining parts of the image, and offers flexible control through various conditions, e.g., garment properties and human pose. In qualitative comparisons, TOA shows excellent performance on the virtual try-on task even when given relatively low-quality face and garment images. In quantitative comparisons, TOA achieves state-of-the-art FID scores of 5.56 and 7.23 in the paired and unpaired settings, respectively, on the VITON-HD dataset.