Mobile robots require knowledge of their environment, especially of the humans located in their vicinity. While the most common approaches to detecting humans rely on computer vision, an often overlooked hardware feature of robots for people detection is their 2D range finder. These sensors were originally intended for obstacle avoidance and mapping/SLAM tasks. In most robots, they are conveniently mounted at a height between the ankle and the knee, so they can also be used to detect people, with a larger field of view and better depth resolution than cameras. In this paper, we present FROG, a new dataset for people detection using knee-high 2D range finders. Compared to existing datasets such as DROW, FROG offers higher laser resolution, a higher scanning frequency, and more complete annotation data. In particular, the FROG dataset contains annotations for 100% of its laser scans (unlike DROW, which annotates only 5%), 17x more annotated scans, 100x more people annotations, and over twice the distance traveled by the robot. We propose a benchmark based on the FROG dataset and analyze a collection of state-of-the-art people detectors that operate on 2D range finder data. We also propose and evaluate a new end-to-end deep learning approach for people detection. Our solution works directly on the raw sensor data, without hand-crafted input features, thus avoiding CPU preprocessing and relieving the developer of the need to understand domain-specific heuristics. Experimental results show that the proposed people detector attains results comparable to the state of the art, and that an optimized implementation for ROS can operate at more than 500 Hz.