As an emerging task that integrates perception and reasoning, topology reasoning in autonomous driving scenes has recently garnered widespread attention. However, existing methods often emphasize "perception over reasoning": they typically boost reasoning performance by enhancing lane perception, and directly adopt an MLP to learn lane topology from lane queries. This paradigm overlooks the geometric features intrinsic to the lanes themselves and is prone to being influenced by the endpoint shifts inherent in lane detection. To tackle this issue, we propose an interpretable method for lane topology reasoning based on lane geometric distance and lane query similarity, named TopoLogic. This method mitigates the impact of endpoint shifts in geometric space and introduces explicit similarity calculation in semantic space as a complement. By integrating results from both spaces, our method provides more comprehensive information for lane topology. Ultimately, our approach significantly outperforms the existing state-of-the-art methods on the mainstream benchmark OpenLane-V2 (23.9 vs. 10.9 in TOP$_{ll}$ and 44.1 vs. 39.8 in OLS on subset_A). Additionally, our proposed geometric distance topology reasoning method can be incorporated into well-trained models without re-training, significantly boosting the performance of lane topology reasoning. The code is released at https://github.com/Franpin/TopoLogic.
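To make the idea of combining geometric distance with query similarity concrete, the following is a minimal, illustrative sketch and not the released TopoLogic implementation. The Gaussian distance-to-score mapping, the cosine similarity, the fixed fusion weight, and all function and parameter names (`geometric_topology`, `similarity_topology`, `fuse`, `sigma`, `weight`) are assumptions made for this example; the actual method may use different, possibly learnable, functions.

```python
import math


def geometric_topology(lanes, sigma=2.0):
    """Score lane i -> lane j by how close the end of i is to the start of j.

    lanes: list of polylines, each a list of (x, y) points ordered along
    the driving direction. The Gaussian mapping and sigma are assumptions;
    a soft score tolerates small endpoint shifts instead of requiring an
    exact match.
    """
    n = len(lanes)
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        end_point = lanes[i][-1]
        for j in range(n):
            if i == j:
                continue  # a lane is not treated as its own successor
            start_point = lanes[j][0]
            d = math.dist(end_point, start_point)
            scores[i][j] = math.exp(-(d * d) / (2.0 * sigma * sigma))
    return scores


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_topology(queries):
    """Score lane pairs by query similarity, rescaled from [-1, 1] to [0, 1].

    queries: one feature vector per lane (hypothetical stand-ins for
    lane queries from a detector).
    """
    n = len(queries)
    return [
        [0.0 if i == j else (cosine_similarity(queries[i], queries[j]) + 1.0) / 2.0
         for j in range(n)]
        for i in range(n)
    ]


def fuse(geo, sim, weight=0.5):
    """Weighted sum of the two score matrices (the weight is an assumption)."""
    n = len(geo)
    return [
        [weight * geo[i][j] + (1.0 - weight) * sim[i][j] for j in range(n)]
        for i in range(n)
    ]


# Toy example: lane 1 starts 0.5 m past the end of lane 0 (a small endpoint
# shift), while lane 2 runs parallel to lane 0 and is not its successor.
lanes = [
    [(0.0, 0.0), (10.0, 0.0)],
    [(10.5, 0.0), (20.0, 0.0)],
    [(0.0, 5.0), (10.0, 5.0)],
]
queries = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

geo = geometric_topology(lanes)
sim = similarity_topology(queries)
fused = fuse(geo, sim)
```

In this toy setup the geometric score for lane 0 to lane 1 stays high despite the 0.5 m gap, while the score for lane 0 to lane 2 is close to zero. Because the geometric term depends only on predicted lane endpoints, a score of this kind can in principle be computed on the outputs of an already trained detector, which is consistent with the abstract's claim that the geometric component can be added without re-training.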