Deep-learning-based approaches have recently shown promising results in 3D hand reconstruction from a single RGB image. These approaches can be roughly divided into model-based approaches, which depend heavily on the model's parameter space, and model-free approaches, which require large amounts of 3D ground truth to reduce depth ambiguity and struggle in weakly-supervised scenarios. To overcome these issues, we propose a novel probabilistic model that combines the robustness of model-based approaches with the reduced dependence on the model's parameter space of model-free approaches. The proposed probabilistic model incorporates a model-based network as a prior-net to estimate the prior probability distribution of joints and vertices. An Attention-based Mesh Vertices Uncertainty Regression (AMVUR) model is proposed to capture dependencies among vertices and the correlation between joints and mesh vertices, improving their feature representation. We further propose a learning-based, occlusion-aware Hand Texture Regression model to achieve high-fidelity texture reconstruction. We demonstrate that the proposed probabilistic model can be trained flexibly in both supervised and weakly-supervised scenarios. Experimental results show that our probabilistic model achieves state-of-the-art accuracy in 3D hand and texture reconstruction from a single image under both training schemes, including in the presence of severe occlusions.