Deep Learning-Based Object Pose Estimation: A Comprehensive Survey

Object pose estimation is a fundamental computer vision problem with broad applications in augmented reality and robotics. Over the past decade, deep learning models, owing to their superior accuracy and robustness, have increasingly supplanted conventional algorithms that rely on engineered point pair features. Nevertheless, several challenges persist in contemporary methods, including dependence on labeled training data, model compactness, robustness under challenging conditions, and generalization to novel, unseen objects. A recent survey that discusses progress on the different aspects of this area, the outstanding challenges, and promising future directions is currently missing. To fill this gap, we review recent advances in deep learning-based object pose estimation, covering all three formulations of the problem: instance-level, category-level, and unseen object pose estimation. Our survey also covers multiple input data modalities, degrees of freedom of the output poses, object properties, and downstream tasks, giving readers a holistic understanding of the field. Additionally, it discusses training paradigms across different domains, inference modes, application areas, evaluation metrics, and benchmark datasets, and it reports the performance of current state-of-the-art methods on these benchmarks, helping readers select the most suitable method for their application. Finally, the survey identifies key challenges, reviews prevailing trends along with their pros and cons, and outlines promising directions for future research. We also maintain an up-to-date list of the latest works at https://github.com/CNJianLiu/Awesome-Object-Pose-Estimation.
