In the rapidly evolving field of computer vision, accurately estimating the poses of multiple people from multiple viewpoints remains a formidable challenge, especially when the estimates must also be reliable. This work presents an extensive evaluation of how well multi-view multi-person pose estimators generalize to unseen datasets and introduces a new algorithm that performs strongly on this task. It also studies the improvements gained by additionally using depth information. Because the new approach generalizes well not only to unseen datasets but also to different keypoints, the first multi-view multi-person whole-body estimator is also presented. To support further research on these topics, all of the work is publicly accessible.