Virtual try-on, a rapidly evolving field in computer vision, is transforming e-commerce by improving customer experiences through precise garment warping and seamless integration onto the human body. Existing methods based on thin-plate splines (TPS) and flow address garment warping but overlook finer contextual details. In this paper, we introduce a novel graph-based warping technique that emphasizes the value of context in garment flow. Our graph-based warping module generates a warped garment as well as a coarse person image, which a simple refinement network uses to produce a coarse virtual try-on image. The proposed work uses a latent diffusion model to generate the final try-on, treating garment transfer as an inpainting task. The diffusion model is conditioned through decoupled cross-attention-based inversion of visual and textual information. We also introduce an occlusion-aware warping constraint that yields a dense warped garment free of holes and occlusions. Our method, validated on the VITON-HD and Dresscode datasets, achieves state-of-the-art qualitative and quantitative results, with considerable improvements in garment warping, texture preservation, and overall realism.
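The abstract does not spell out how the decoupled cross-attention conditioning is computed. As a rough illustration only, the sketch below follows one common additive formulation: the latent queries attend to text tokens and to visual tokens through separate key/value projections, and the two results are summed. All names here (`init_weights`, `decoupled_cross_attention`, `image_scale`, the weight keys) and the single-head NumPy setup are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    # Single-head scaled dot-product attention.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def init_weights(rng, d_model, d_ctx, d_head):
    # Hypothetical random projections: one query projection shared by both
    # branches, and separate key/value projections per modality.
    def proj(rows):
        return rng.normal(scale=rows ** -0.5, size=(rows, d_head))

    return {
        "q": proj(d_model),
        "k_text": proj(d_ctx),
        "v_text": proj(d_ctx),
        "k_image": proj(d_ctx),
        "v_image": proj(d_ctx),
    }


def decoupled_cross_attention(latent, text_tokens, image_tokens, w, image_scale=1.0):
    # latent: (n_latent, d_model); text_tokens, image_tokens: (n_tokens, d_ctx).
    q = latent @ w["q"]
    text_out = attention(q, text_tokens @ w["k_text"], text_tokens @ w["v_text"])
    image_out = attention(q, image_tokens @ w["k_image"], image_tokens @ w["v_image"])
    # Assumed additive fusion; image_scale weights the visual branch.
    return text_out + image_scale * image_out
```

Setting `image_scale` to 0 reduces the layer to ordinary text-only cross-attention, which is one reason this decoupled form is convenient: the visual (garment) branch can be added or re-weighted without disturbing the text pathway.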