In this paper, we consider the problem of a drone that must collect semantic information to classify multiple moving targets. In particular, we address the challenge of computing control inputs that move the drone to informative viewpoints (position and orientation) when the information is extracted by a "black-box" classifier, e.g., a deep neural network. Such classifiers typically lack an analytical relationship between viewpoints and their associated outputs, which prevents their use in information-gathering schemes. To fill this gap, we propose a novel attention-based architecture, trained via Reinforcement Learning (RL), that outputs the next viewpoint for the drone, favoring the acquisition of evidence from as many unclassified targets as possible while reasoning about their movement, orientation, and occlusions. We then use a low-level model predictive controller (MPC) to move the drone to the desired viewpoint, taking into account its actual dynamics. We show that our approach not only outperforms a variety of baselines but also generalizes to scenarios unseen during training. Additionally, we show that the network scales to large numbers of targets and generalizes well to different target motion dynamics.