Tracking an object's 6-DoF pose is crucial for many downstream robot tasks and real-world applications. In this paper, we investigate the real-world task of aerial vision guidance for aerial robotic manipulation, using category-level 6-DoF pose tracking. Aerial conditions inevitably introduce special challenges, such as rapid viewpoint changes in pitch and roll, as well as inter-frame differences. To address these challenges, we first introduce a robust category-level 6-DoF pose tracker (Robust6DoF). This tracker leverages shape and temporal prior knowledge to explore optimal inter-frame keypoint pairs, which are generated under a priori structural adaptive supervision in a coarse-to-fine manner. Notably, Robust6DoF employs a Spatial-Temporal Augmentation module to handle inter-frame differences and intra-class shape variations through both temporal dynamic filtering and shape-similarity filtering. We further present a Pose-Aware Discrete Servo strategy (PAD-Servo), which serves as a decoupling approach for carrying out the final aerial vision guidance task. It contains two servo action policies to better accommodate the structural properties of aerial robotic manipulation. Extensive experiments on four well-known public benchmarks demonstrate the superiority of Robust6DoF. Real-world tests verify that Robust6DoF, together with PAD-Servo, can be readily used in real-world aerial robotic applications.
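The abstract does not say how a pose is recovered once inter-frame keypoint pairs are available. As a generic illustration only, and not the Robust6DoF method, the sketch below uses the standard Kabsch least-squares alignment to estimate the relative rotation and translation between two frames from matched 3D keypoints. The function name `relative_pose_from_keypoints` and the assumption that keypoints are already matched, outlier-free 3D points are hypothetical choices made for this example.

```python
import numpy as np


def relative_pose_from_keypoints(prev_pts, curr_pts):
    """Estimate (R, t) such that curr_pts ~= prev_pts @ R.T + t.

    Hypothetical helper for illustration: a plain Kabsch alignment over
    matched 3D keypoints of shape (N, 3), with N >= 3 non-collinear points.
    """
    prev_pts = np.asarray(prev_pts, dtype=float)
    curr_pts = np.asarray(curr_pts, dtype=float)

    # Centre both point sets so that only the rotation remains to be solved.
    prev_centroid = prev_pts.mean(axis=0)
    curr_centroid = curr_pts.mean(axis=0)
    prev_centered = prev_pts - prev_centroid
    curr_centered = curr_pts - curr_centroid

    # 3x3 cross-covariance between the two centred point sets.
    H = prev_centered.T @ curr_centered
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so the result is a proper rotation
    # (determinant +1) rather than a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # The translation maps the rotated previous centroid onto the current one.
    t = curr_centroid - R @ prev_centroid
    return R, t
```

Calling `relative_pose_from_keypoints(prev_pts, curr_pts)` on two `(N, 3)` arrays returns a 3x3 rotation matrix and a length-3 translation vector. A real tracker would typically add outlier rejection and temporal filtering on top of such a step, which this sketch leaves out.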