This paper contributes a novel, modular learning-based method for aerial robots navigating cluttered environments that contain hard-to-perceive thin obstacles, without assuming access to a map or to a full pose estimate of the robot. The proposed solution builds upon a semantically-enhanced Variational Autoencoder that is trained on both real-world and simulated depth images to compress the input data while preserving semantically-labeled thin obstacles and handling invalid pixels in the depth sensor's output. This compressed representation, together with the robot's partial state (its linear and angular velocities and its attitude), is then used to train an uncertainty-aware 3D Collision Prediction Network in simulation to predict collision scores for candidate action sequences in a predefined motion primitives library. A set of simulation and experimental studies in cluttered environments with obstacles of various sizes and types, including multiple hard-to-perceive thin objects, was conducted to evaluate the performance of the proposed method and to compare it against an end-to-end trained baseline. The results demonstrate the benefits of the proposed semantically-enhanced deep collision prediction for learning-based autonomous navigation.
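To make the last stage of the pipeline concrete, the following is a minimal sketch of how predicted collision scores for a motion primitives library could be turned into an action choice. It is not the paper's implementation. The `PrimitiveScore` container, the `select_primitive` function, the `goal_costs` input, and the `risk_weight` and `max_risk` parameters are all hypothetical names introduced here for illustration, and the rule of adding a multiple of the standard deviation to the mean score is only one assumed way of using an uncertainty estimate.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PrimitiveScore:
    """Hypothetical container for one motion primitive's prediction."""
    index: int   # position of the primitive in the library
    mean: float  # predicted collision score, assumed to lie in [0, 1]
    std: float   # assumed uncertainty estimate for that score


def select_primitive(
    scores: Sequence[PrimitiveScore],
    goal_costs: Sequence[float],
    risk_weight: float = 1.0,
    max_risk: float = 0.5,
) -> int:
    """Pick a primitive index from risk-adjusted collision scores.

    Assumed rule (not taken from the paper): a primitive counts as safe
    when mean + risk_weight * std is at most max_risk. Among safe
    primitives, the one with the lowest goal cost wins. If none is safe,
    fall back to the primitive with the lowest risk-adjusted score.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if len(scores) != len(goal_costs):
        raise ValueError("scores and goal_costs must have the same length")

    # Inflate each score by its uncertainty and clip to the valid range.
    risk = [min(1.0, s.mean + risk_weight * s.std) for s in scores]

    safe = [i for i, r in enumerate(risk) if r <= max_risk]
    if safe:
        best = min(safe, key=lambda i: goal_costs[i])
    else:
        best = min(range(len(scores)), key=lambda i: risk[i])
    return scores[best].index


if __name__ == "__main__":
    library_scores = [
        PrimitiveScore(index=0, mean=0.1, std=0.05),
        PrimitiveScore(index=1, mean=0.2, std=0.4),
        PrimitiveScore(index=2, mean=0.3, std=0.1),
    ]
    print(select_primitive(library_scores, goal_costs=[3.0, 1.0, 2.0]))
```

In this sketch a primitive with a low mean score but a high uncertainty can still be rejected, which is one plausible reason to make the collision predictor uncertainty-aware in the first place.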