Lane detection is one of the fundamental modules in self-driving. In this paper, we employ a transformer-only method for lane detection, which lets it benefit from the rapid development of fully vision transformers; by fine-tuning weights fully pre-trained on large datasets, it achieves state-of-the-art (SOTA) performance on both the CULane and TuSimple benchmarks. More importantly, this paper proposes a novel and general framework called PriorLane, which enhances the segmentation performance of the fully vision transformer by introducing low-cost local prior knowledge. Specifically, PriorLane uses an encoder-only transformer to fuse the features extracted by a pre-trained segmentation model with prior knowledge embeddings. A Knowledge Embedding Alignment (KEA) module is adopted to improve the fusion by aligning the knowledge embeddings. Extensive experiments on our Zjlab dataset show that PriorLane outperforms SOTA lane detection methods by 2.82% mIoU when prior knowledge is employed.
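For readers unfamiliar with the mIoU metric quoted above, the following is a minimal sketch of mean intersection-over-union for segmentation masks. It is illustrative only: the function name `mean_iou`, the nested-list mask format, and the choice to skip classes absent from both masks are assumptions, and the abstract does not state how the paper averages over classes or images.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the target.

    pred and target are 2D lists of integer class labels with equal shape.
    Skipping classes absent from both masks is an assumption of this sketch.
    """
    ious = []
    for c in range(num_classes):
        inter = 0  # pixels labelled c in both masks
        union = 0  # pixels labelled c in at least one mask
        for p_row, t_row in zip(pred, target):
            for p, t in zip(p_row, t_row):
                in_p, in_t = (p == c), (t == c)
                inter += int(in_p and in_t)
                union += int(in_p or in_t)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x2 masks: class 0 is background, class 1 is lane.
pred = [[0, 1], [1, 1]]
target = [[0, 1], [0, 1]]
score = mean_iou(pred, target, num_classes=2)
```

In this toy case the background IoU is 1/2 and the lane IoU is 2/3, so the mean is 7/12. A reported gain such as 2.82% mIoU is a difference between two such scores, computed under whatever averaging protocol the benchmark defines.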