Accurate and Real-time 3D Pedestrian Detection Using an Efficient Attentive Pillar Network

Efficiently and accurately detecting people from 3D point cloud data is of great importance in many robotic and autonomous driving applications. This fundamental perception task remains very challenging because of (i) significant deformations of human body pose and gesture over time and (ii) the sparsity and scarcity of point cloud data for pedestrian-class objects. Recent efficient 3D object detection approaches rely on pillar features to detect objects from point cloud data. However, these pillar features are not expressive enough to handle the aforementioned challenges in detecting people. To address this shortcoming, we first introduce a stackable Pillar Aware Attention (PAA) module that enhances pillar feature extraction while suppressing noise in the point clouds. By integrating multi-point-channel-pooling, point-wise, channel-wise, and task-aware attention into a simple module, the representation capabilities are boosted while requiring little additional computing resources. We also present Mini-BiFPN, a small yet effective feature network that creates bidirectional information flow and multi-level cross-scale feature fusion to better integrate multi-resolution features. Our proposed framework, namely PiFeNet, has been evaluated on three popular large-scale datasets for 3D pedestrian detection, i.e., KITTI, JRDB, and nuScenes. It achieves state-of-the-art (SOTA) performance on KITTI bird's-eye-view (BEV) and JRDB, and very competitive performance on nuScenes. Our approach has an inference speed of 26 frames per second (FPS), making it a real-time detector. The code for our PiFeNet is available at https://github.com/ldtho/PiFeNet.
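
To make the idea of point-wise and channel-wise attention over a pillar more concrete, the following is a minimal sketch in plain Python. It is not the PAA module from the PiFeNet repository: the function name `attend_pillar`, the fixed weights `w_point` and `w_channel`, and the use of max and mean pooling followed by a sigmoid are all assumptions made for illustration. Task-aware attention and Mini-BiFPN are not covered.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def attend_pillar(pillar, w_point=(0.5, 0.5), w_channel=(0.5, 0.5)):
    """Re-weight one pillar with point-wise and channel-wise attention.

    pillar: list of N points, each a list of C feature values.
    w_point, w_channel: hypothetical fixed weights for the (max, mean)
    pooled descriptors; a trained network would learn these instead.
    """
    n_channels = len(pillar[0])

    # Point-wise attention: pool each point over its channels (max and
    # mean), then squash the weighted sum into a per-point weight.
    point_att = [
        sigmoid(w_point[0] * max(p) + w_point[1] * sum(p) / len(p))
        for p in pillar
    ]

    # Channel-wise attention: pool each channel over the points of the
    # pillar, then squash into a per-channel weight.
    columns = [[p[c] for p in pillar] for c in range(n_channels)]
    channel_att = [
        sigmoid(w_channel[0] * max(col) + w_channel[1] * sum(col) / len(col))
        for col in columns
    ]

    # Scale every feature by the weight of its point and of its channel.
    weighted = [
        [p[c] * point_att[i] * channel_att[c] for c in range(n_channels)]
        for i, p in enumerate(pillar)
    ]
    return weighted, point_att, channel_att


# Toy pillar with three point slots and three feature channels.
pillar = [
    [0.2, 1.0, -0.3],
    [0.0, 0.0, 0.0],  # zero-padded slot, as in a sparsely filled pillar
    [1.5, 0.4, 0.9],
]
weighted, point_att, channel_att = attend_pillar(pillar)
```

In this sketch the zero-padded slot stays all zeros after re-weighting, while points and channels with stronger responses receive larger weights. That is the intuition behind emphasizing informative features and suppressing noise, under the assumptions stated above.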
