Tracking a target in cluttered and dynamic environments is challenging but forms a core component of applications such as aerial cinematography. Obstacles in the environment not only pose a collision risk but can also occlude the target from the robot's field of view. Moreover, the target's future trajectory may be unknown, and only its current state can be estimated. In this paper, we propose a learned probabilistic neural policy for safe, occlusion-free target tracking. The core novelty of our work stems from the structure of our policy network, which combines generative modeling based on a Conditional Variational Autoencoder (CVAE) with differentiable optimization layers. The role of the CVAE is to provide a base trajectory distribution, which is then projected onto a learned feasible set through the optimization layer. Furthermore, both the weights of the CVAE network and the parameters of the differentiable optimization can be learned end-to-end from demonstration trajectories. We improve the state of the art (SOTA) in the following respects. First, we show that our learned policy outperforms existing SOTA methods in occlusion/collision avoidance and computation time. Second, we present an extensive ablation showing how the different components of our learning pipeline contribute to the overall tracking task. Third, we demonstrate the real-time performance of our approach on resource-constrained hardware such as the NVIDIA Jetson TX2. Finally, our learned policy can also be viewed as a reactive planner for navigation in highly cluttered environments.
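The sample-then-project structure described above can be illustrated with a minimal, heavily simplified sketch. Everything here is hypothetical: the decoder is replaced by a fixed linear map, and the learned differentiable optimization layer is replaced by a simple box projection; the actual method learns both components end-to-end from demonstrations.

```python
import numpy as np

T = 10  # planning horizon (number of 2-D waypoints)

def cvae_decode(z, context, W):
    """Stand-in for the CVAE decoder: maps a latent sample z and an
    observation context (robot/target/obstacle features) to a coarse
    trajectory of T 2-D waypoints drawn from the base distribution."""
    x = np.concatenate([z, context])
    return (W @ x).reshape(T, 2)

def project_feasible(traj, lo=-1.0, hi=1.0):
    """Stand-in for the differentiable optimization layer: here a plain
    box projection onto [lo, hi]^2; the paper instead projects onto a
    feasible set whose parameters are learned end-to-end."""
    return np.clip(traj, lo, hi)

rng = np.random.default_rng(0)
z = rng.standard_normal(4)        # latent sample from the CVAE prior
context = rng.standard_normal(6)  # hypothetical observation features
W = 0.5 * rng.standard_normal((2 * T, z.size + context.size))

base = cvae_decode(z, context, W)  # sample from the base distribution
safe = project_feasible(base)      # projected onto the feasible set
```

Drawing several latent samples `z` and projecting each one yields a batch of candidate trajectories, from which the tracker can execute the best-scoring one in a receding-horizon loop.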