In autonomous driving, two important properties of a driving system are the explainability of its decision logic and the accuracy of its environmental perception. This paper introduces DME-Driver, a new autonomous driving system that improves both the performance and the reliability of autonomous driving. DME-Driver uses a powerful vision language model as the decision-maker and a planning-oriented perception model as the control signal generator. To ensure explainable and reliable driving decisions, the decision-maker is built on a large vision language model that follows the logic of experienced human drivers and makes decisions in a similar manner. Generating accurate control signals, in turn, depends on precise and detailed environmental perception, which is where 3D scene perception models excel. A planning-oriented perception model is therefore employed as the signal generator, translating the logical decisions made by the decision-maker into accurate control signals for the self-driving car. To train the proposed model effectively, we created a new autonomous driving dataset that covers a diverse range of human driver behaviors and their underlying motivations. Using this dataset, our model achieves high planning accuracy through a logical reasoning process.