The rapid development of multi-view 3D human pose estimation (HPE) is attributed to the maturation of monocular 2D HPE and the geometry of 3D reconstruction. However, two challenges remain: 2D detection outliers in occluded views, caused by neglecting view consistency, and implausible 3D poses, caused by a lack of pose coherence. To address them, we introduce a Multi-View Fusion module that refines the 2D results by establishing correlations between views. We then propose Holistic Triangulation, which infers the whole pose as a single entity and injects an anatomy prior to maintain pose coherence and improve plausibility. The anatomy prior is extracted by PCA from skeletal structure features, which factors out global context and joint-by-joint relationships from abstract to concrete. Thanks to the closed-form solution, the whole framework is trained end-to-end. Our method outperforms the state of the art in both precision and plausibility, which is assessed by a new metric.
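For context, the per-joint baseline that a holistic formulation departs from can be sketched as follows. This is a minimal sketch of standard linear (DLT) triangulation, not the Holistic Triangulation proposed here: it reconstructs each joint independently and uses no anatomy prior. The function names `triangulate_dlt` and `project`, the optional per-view `weights`, and the toy camera matrices are all illustrative assumptions rather than anything taken from the paper.

```python
import numpy as np


def project(P, X):
    """Project a 3D point X (3,) with a 3x4 camera matrix P to pixel coords (2,)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


def triangulate_dlt(projections, points_2d, weights=None):
    """Triangulate ONE 3D point from N >= 2 views with the homogeneous DLT system.

    projections: (N, 3, 4) camera projection matrices
    points_2d:   (N, 2) detected 2D locations of the same joint in each view
    weights:     optional (N,) per-view confidences (hypothetical; defaults to 1)
    """
    projections = np.asarray(projections, dtype=float)
    points_2d = np.asarray(points_2d, dtype=float)
    n_views = projections.shape[0]
    if weights is None:
        weights = np.ones(n_views)
    weights = np.asarray(weights, dtype=float)

    # Each view contributes two linear constraints on the homogeneous point:
    #   u * P[2] - P[0] = 0   and   v * P[2] - P[1] = 0
    rows_u = points_2d[:, 0:1] * projections[:, 2] - projections[:, 0]
    rows_v = points_2d[:, 1:2] * projections[:, 2] - projections[:, 1]
    A = np.concatenate([rows_u, rows_v], axis=0)
    A = A * np.tile(weights, 2)[:, None]  # down-weight unreliable views

    # Least-squares solution: right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X_h = vt[-1]
    return X_h[:3] / X_h[3]


# Toy setup (assumed): two identity-intrinsics cameras one unit apart along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])

observations = np.stack([project(P1, X_true), project(P2, X_true)])
X_est = triangulate_dlt(np.stack([P1, P2]), observations)
```

Because each joint is solved in isolation, an outlier detection in one view can move that joint without any regard for bone lengths or the rest of the skeleton, which is the kind of incoherence a whole-pose formulation with an anatomy prior is meant to counter.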