Estimating a person's head pose is a crucial problem for numerous applications, yet it is mainly addressed as a subtask of frontal pose prediction. We present a novel method for unconstrained end-to-end head pose estimation that tackles the challenging task of predicting head pose over the full range of orientations. We address the issue of ambiguous rotation labels by adopting the rotation matrix formalism for our ground truth data, and we propose a continuous 6D rotation matrix representation for efficient and robust direct regression. This allows the model to learn the full rotation appearance efficiently and to overcome the limitations of the current state of the art. Together with newly accumulated training data that covers full head pose rotations and a geodesic loss for stable learning, we design an advanced model that is able to predict an extended range of head orientations. An extensive evaluation on public datasets demonstrates that our method significantly outperforms other state-of-the-art methods in an efficient and robust manner, while its extended prediction range broadens the area of application. We open-source our training and testing code along with our trained models: https://github.com/thohemp/6DRepNet360.
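To make the two central ingredients concrete, the following is a minimal NumPy sketch of how a 6D vector can be mapped to a rotation matrix by Gram-Schmidt orthogonalization, and how a geodesic distance between two rotation matrices can be computed. It is an illustration under stated assumptions, not the code from the repository: the function names `rotation_from_6d` and `geodesic_distance` are hypothetical, and treating the six values as the first two (unnormalized) columns of the matrix is an assumed convention.

```python
import numpy as np


def rotation_from_6d(x):
    """Map a 6D vector to a 3x3 rotation matrix via Gram-Schmidt.

    Assumption: the six values are two unnormalized 3D vectors that
    become the first two columns of the rotation matrix.
    """
    x = np.asarray(x, dtype=float)
    a1, a2 = x[:3], x[3:]
    # First column: normalize the first vector.
    b1 = a1 / np.linalg.norm(a1)
    # Second column: remove the component along b1, then normalize.
    a2_orth = a2 - np.dot(b1, a2) * b1
    b2 = a2_orth / np.linalg.norm(a2_orth)
    # Third column: the cross product completes a right-handed basis.
    b3 = np.cross(b1, b2)
    return np.stack([b1, b2, b3], axis=1)


def geodesic_distance(r_pred, r_true):
    """Angle in radians of the relative rotation between two matrices."""
    cos_theta = (np.trace(r_pred @ r_true.T) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))


if __name__ == "__main__":
    r_identity = rotation_from_6d([1, 0, 0, 0, 1, 0])
    r_quarter_turn = rotation_from_6d([0, 1, 0, -1, 0, 0])  # 90 degrees about z
    print(geodesic_distance(r_identity, r_quarter_turn))
```

Because any pair of non-parallel 3D vectors yields a valid rotation matrix under this mapping, a network can regress the six values directly without constraints, and the geodesic distance measures the error as a single rotation angle regardless of how the orientation would be expressed in Euler angles.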