Although lane detection methods have shown impressive performance in real-world scenarios, most methods require post-processing, which is not robust enough. Therefore, end-to-end detectors such as the DEtection TRansformer (DETR) have been introduced to lane detection. However, one-to-one label assignment in DETR can degrade training efficiency due to label semantic conflicts. In addition, the positional query in DETR cannot provide an explicit positional prior, which makes it difficult to optimize. In this paper, we present the One-to-Several Transformer (O2SFormer). We first propose one-to-several label assignment, which combines one-to-one and one-to-many label assignment to improve training efficiency while keeping detection end-to-end. To overcome the difficulty of optimizing one-to-one assignment, we further propose the layer-wise soft label, which adjusts the positive weight of positive lane anchors across different decoder layers. Finally, we design the dynamic anchor-based positional query, which explores the positional prior by incorporating lane anchors into the positional query. Experimental results show that O2SFormer significantly speeds up the convergence of DETR and outperforms Transformer-based and CNN-based detectors on the CULane dataset. Code will be available at https://github.com/zkyseu/O2SFormer.
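The idea of combining one-to-one and one-to-many assignment can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the cost matrix, the value of `k`, and in particular the rule that only the last decoder layer uses one-to-one assignment are assumptions made for illustration. The one-to-one step uses a brute-force search as a stand-in for a Hungarian solver, so it only suits tiny inputs.

```python
from itertools import permutations


def one_to_one(cost):
    """Assign exactly one query to each ground-truth lane at minimal total cost.

    cost[q][g] is the matching cost between query q and ground-truth lane g.
    Brute force over all assignments; a stand-in for a Hungarian solver.
    """
    n_q, n_g = len(cost), len(cost[0])
    best = min(
        permutations(range(n_q), n_g),
        key=lambda qs: sum(cost[q][g] for g, q in enumerate(qs)),
    )
    return {g: [q] for g, q in enumerate(best)}


def one_to_many(cost, k):
    """Give each ground-truth lane its k cheapest queries.

    A query claimed by several lanes stays with the cheapest one.
    """
    n_q, n_g = len(cost), len(cost[0])
    owner = {}  # query index -> ground-truth index
    for g in range(n_g):
        cheapest = sorted(range(n_q), key=lambda q: cost[q][g])[:k]
        for q in cheapest:
            if q not in owner or cost[q][g] < cost[q][owner[q]]:
                owner[q] = g
    assigned = {g: [] for g in range(n_g)}
    for q, g in sorted(owner.items()):
        assigned[g].append(q)
    return assigned


def one_to_several(cost, layer, num_layers, k=2):
    """Pick the assignment rule per decoder layer.

    ASSUMPTION (not stated in the abstract): earlier layers use one-to-many
    for denser supervision, and the last layer uses one-to-one so that the
    final predictions need no post-processing.
    """
    if layer == num_layers - 1:
        return one_to_one(cost)
    return one_to_many(cost, k)


if __name__ == "__main__":
    # Hypothetical costs for 4 queries (rows) and 2 ground-truth lanes (columns).
    cost = [[0.1, 0.9], [0.2, 0.8], [0.7, 0.3], [0.9, 0.2]]
    for layer in range(2):
        print(layer, one_to_several(cost, layer, num_layers=2))
```

In this toy setup the earlier layer supervises two queries per lane, while the last layer keeps a single query per lane, which is what lets the final output stay free of duplicate removal.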