Recovering a 3D human mesh from a single RGB image is a challenging task due to depth ambiguity and self-occlusion, resulting in a high degree of uncertainty. Meanwhile, diffusion models have recently seen much success in generating high-quality outputs by progressively denoising noisy inputs. Inspired by their capability, we explore a diffusion-based approach for human mesh recovery, and propose a Human Mesh Diffusion (HMDiff) framework which frames mesh recovery as a reverse diffusion process. We also propose a Distribution Alignment Technique (DAT) that injects input-specific distribution information into the diffusion process, and provides useful prior knowledge to simplify the mesh recovery task. Our method achieves state-of-the-art performance on three widely used datasets. Project page: https://gongjia0208.github.io/HMDiff/.
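As a rough illustration of what framing mesh recovery as a reverse diffusion process can look like, the following is a minimal sketch of standard DDPM-style ancestral sampling over mesh vertex coordinates. It is not the HMDiff implementation. The linear noise schedule, the `predict_noise` callable, the four-vertex toy mesh, and the oracle denoiser are all assumptions made for illustration, and DAT is not modeled here. In a real system `predict_noise` would be a learned network conditioned on image features.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Hypothetical linear noise schedule (an assumption, not taken from the paper)."""
    return np.linspace(beta_start, beta_end, num_steps)


def reverse_diffusion(predict_noise, shape, betas, rng):
    """Generic DDPM ancestral sampling: start from Gaussian noise, denoise step by step.

    predict_noise(x, t) stands in for a learned denoiser and returns an
    estimate of the noise contained in x at step t.
    """
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)  # fully noisy "mesh" at the last step
    for t in range(len(betas) - 1, -1, -1):
        eps_hat = predict_noise(x, t)
        # Posterior mean of the previous, less noisy sample given the noise estimate.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # Fresh noise is added at every step except the final one.
        noise = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        x = mean + np.sqrt(betas[t]) * noise
    return x


# Toy usage: four hypothetical vertices with (x, y, z) coordinates.
toy_mesh = np.arange(12, dtype=float).reshape(4, 3) / 10.0
betas = linear_beta_schedule(50)
alpha_bars = np.cumprod(1.0 - betas)


def oracle_noise(x, t):
    """Oracle denoiser that already knows the target mesh (illustration only)."""
    return (x - np.sqrt(alpha_bars[t]) * toy_mesh) / np.sqrt(1.0 - alpha_bars[t])


recovered = reverse_diffusion(oracle_noise, toy_mesh.shape, betas, np.random.default_rng(0))
```

Because the oracle has access to the true mesh, the final denoising step lands on `toy_mesh` up to floating-point error. A learned denoiser only approximates this, which is where input-specific prior information of the kind DAT is described as providing would help.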