In recent years, diffusion models have revolutionized visual generation, outperforming traditional frameworks like Generative Adversarial Networks (GANs). However, generating images of humans with realistic semantic parts, such as hands and faces, remains a significant challenge due to their intricate structural complexity. To address this issue, we propose a novel post-processing solution named RealisHuman. The RealisHuman framework operates in two stages. First, it generates realistic human parts, such as hands or faces, using the original malformed parts as references, ensuring consistent details with the original image. Second, it seamlessly integrates the rectified human parts back into their corresponding positions by repainting the surrounding areas to ensure smooth and realistic blending. The RealisHuman framework significantly enhances the realism of human generation, as demonstrated by notable improvements in both qualitative and quantitative metrics. Code is available at https://github.com/Wangbenzhi/RealisHuman.