We address the task of in-the-wild human figure synthesis, where the goal is to synthesize a full body given any region in any image. This task has long been challenging and under-explored: current methods struggle with extreme poses, occluding objects, and complex backgrounds. Our main contribution is TriA-GAN, a keypoint-guided GAN that can synthesize Anyone, Anywhere, in Any given pose. Key to our method is the combination of projected GANs with a well-crafted training strategy, which allows a simple generator architecture to handle the challenges of in-the-wild full-body synthesis. We show that TriA-GAN significantly improves over previous in-the-wild full-body synthesis methods while requiring less conditional information for synthesis (keypoints vs. DensePose). Finally, we show that the latent space of TriA-GAN is compatible with standard unconditional editing techniques, enabling text-guided editing of generated human figures.