In this work, we present an efficient, quantization-aware panoptic driving perception model (Q-YOLOP) for object detection, drivable area segmentation, and lane line segmentation in the context of autonomous driving. Our model uses the Efficient Layer Aggregation Network (ELAN) as its backbone, with a dedicated head for each task. We employ a four-stage training process that includes pretraining on the BDD100K dataset, finetuning on both the BDD100K and iVS datasets, and quantization-aware training (QAT) on BDD100K. During training, we apply strong data augmentation techniques, such as random perspective and mosaic, and train the model on a combination of the BDD100K and iVS datasets. Both strategies enhance the model's generalization capabilities. The proposed model achieves state-of-the-art performance with an mAP@0.5 of 0.622 for object detection and an mIoU of 0.612 for segmentation, while maintaining low computational and memory requirements.
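To make the idea behind quantization-aware training concrete, the following is a minimal sketch of the "fake quantization" step that QAT typically places in the forward pass: values are rounded to a low-bit integer grid and immediately mapped back to floating point, so the network learns under the rounding error it will see after deployment. The abstract does not specify the quantization scheme used for Q-YOLOP; the symmetric, per-tensor, 8-bit scheme below and the name `fake_quantize` are assumptions chosen for illustration only.

```python
from typing import List


def fake_quantize(values: List[float], num_bits: int = 8) -> List[float]:
    """Quantize to signed integers, then dequantize (symmetric, per-tensor).

    Illustrative assumption: this is a generic scheme, not necessarily
    the one used by the Q-YOLOP authors.
    """
    qmax = 2 ** (num_bits - 1) - 1              # 127 for 8 bits
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return [0.0 for _ in values]            # all zeros: nothing to scale
    scale = max_abs / qmax                      # real-valued step between levels
    out = []
    for v in values:
        q = round(v / scale)                    # nearest integer level
        q = max(-qmax, min(qmax, q))            # clamp to the representable range
        out.append(q * scale)                   # map back to floating point
    return out


weights = [0.5, -1.27, 0.003, 1.0]
quantized = fake_quantize(weights)
```

Each output differs from its input by at most half a quantization step, and values much smaller than the step (such as `0.003` here) collapse to zero. In an actual training framework this operation is usually paired with a straight-through estimator so that gradients pass through the rounding unchanged.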