Multi-task networks can potentially improve performance and computational efficiency compared to single-task networks, facilitating online deployment. However, current multi-task architectures in point cloud perception combine multiple task-specific point cloud representations, each requiring a separate feature encoder, which makes the network structures bulky and slow. We propose PAttFormer, an efficient multi-task architecture for joint semantic segmentation and object detection in point clouds that relies solely on a point-based representation. The network builds on transformer-based feature encoders with neighborhood attention and grid pooling, and on a query-based detection decoder with a novel 3D deformable-attention detection head. Unlike other LiDAR-based multi-task architectures, our proposed PAttFormer does not require separate feature encoders for multiple task-specific point cloud representations, resulting in a network that is 3x smaller and 1.4x faster while achieving competitive performance on the nuScenes and KITTI benchmarks for autonomous driving perception. Our extensive evaluations show substantial gains from multi-task learning, improving LiDAR semantic segmentation by +1.7% in mIoU and 3D object detection by +1.7% in mAP on the nuScenes benchmark compared to the single-task models.
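The neighborhood attention mentioned in the abstract restricts each point's attention to its k nearest neighbors instead of all points. A minimal sketch of this idea in plain NumPy follows; it uses a single head, no learned query/key/value projections, and a brute-force k-NN search, all of which are simplifications not taken from the paper:

```python
import numpy as np

def knn(points, k):
    # Brute-force pairwise distances; return indices of the k nearest
    # neighbors of each point (the point itself is included).
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    return np.argsort(d, axis=-1)[:, :k]

def neighborhood_attention(points, feats, k=4):
    # Each point attends only over its k nearest neighbors (no learned
    # projections here; features serve directly as queries/keys/values).
    n, c = feats.shape
    idx = knn(points, k)                      # (n, k) neighbor indices
    q = feats                                 # one query per point
    kv = feats[idx]                           # (n, k, c) neighbor keys/values
    attn = np.einsum('nc,nkc->nk', q, kv) / np.sqrt(c)
    attn = np.exp(attn - attn.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)  # softmax over each neighborhood
    return np.einsum('nk,nkc->nc', attn, kv)  # weighted sum of neighbor values

# Toy example: 6 points in 3D with 8-dimensional features.
rng = np.random.default_rng(0)
pts = rng.normal(size=(6, 3))
feats = rng.normal(size=(6, 8))
out = neighborhood_attention(pts, feats, k=3)
print(out.shape)  # (6, 8)
```

In the full architecture this operation would be stacked with learned projections and interleaved with grid pooling to coarsen the point set between stages; the sketch only illustrates the locality constraint that keeps attention cost linear in the number of points.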