Understanding road structure is crucial for autonomous driving. Intricate road structures are often represented as lane graphs, which consist of centerline curves and their connections and form a Directed Acyclic Graph (DAG). Accurate lane graph extraction depends on precisely estimating both the vertices and the edges of this DAG. Recent research shows that Transformer-based language models have strong sequence prediction abilities, which makes them effective for learning graph representations once graph data are encoded as sequences. However, existing studies mainly model vertices explicitly and leave edge information simply embedded in the network, so they fall short on lane graph extraction. To address this, we introduce LaneGraph2Seq, a novel approach to lane graph extraction that uses a language model with vertex-edge encoding and connectivity enhancement. Our serialization strategy consists of a vertex-centric depth-first traversal and a concise edge-based partition sequence. In addition, we combine classifier-free guidance with nucleus sampling to improve lane connectivity. We validate our method on two prominent datasets, nuScenes and Argoverse 2, where LaneGraph2Seq delivers consistent results and outperforms state-of-the-art lane graph extraction techniques.
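The decoding step that combines classifier-free guidance with nucleus sampling can be illustrated with a minimal sketch. It assumes the common logit-space form of guidance, in which conditional logits are extrapolated away from unconditional ones, followed by standard top-p filtering. The function names, the `scale` and `top_p` values, and the toy logits are all hypothetical and are not taken from the LaneGraph2Seq implementation, which may differ in its details.

```python
import math
import random


def guided_logits(cond, uncond, scale):
    """Classifier-free guidance in logit space (assumed formulation):
    move from the unconditional logits toward the conditional ones by `scale`."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus_filter(probs, top_p):
    """Keep the smallest set of most-probable tokens whose cumulative
    probability reaches top_p, zero out the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


def sample_token(cond, uncond, scale=2.0, top_p=0.9, rng=random):
    """Sample one token id from the guided, nucleus-filtered distribution.
    The default scale and top_p are illustrative values only."""
    probs = nucleus_filter(softmax(guided_logits(cond, uncond, scale)), top_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy logits over a 4-token vocabulary (hypothetical values).
cond_logits = [2.0, 1.0, 0.1, -1.0]
uncond_logits = [1.0, 1.0, 1.0, 1.0]
token = sample_token(cond_logits, uncond_logits, rng=random.Random(0))
```

With these toy values, guidance sharpens the distribution toward the tokens the conditional branch prefers, and the top-p filter then discards the low-probability tail, so only the two most likely tokens can be drawn. In a sequence model this step would be applied once per generated token.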