Dynamic maps (DM) serve as the fundamental information infrastructure for vehicle-road-cloud (VRC) cooperative autonomous driving in China and Japan. By providing comprehensive traffic scene representations, DM overcome limitations of standalone autonomous driving systems (ADS), such as physical occlusions. Although DM-enhanced ADS have been deployed in real-world applications in Japan, existing DM systems still lack a natural-language-supported (NLS) human interface, which could substantially improve human-DM interaction. To address this gap, this paper introduces VRCsim, a VRC cooperative perception (CP) simulation framework designed to generate streaming VRC-CP data. Based on VRCsim, we construct a question-answering dataset, VRC-QA, focused on spatial querying and reasoning in mixed-traffic scenes. Building on VRCsim and VRC-QA, we further propose Talk2DM, a plug-and-play module that extends VRC-DM systems with NLS querying and commonsense reasoning capabilities. Talk2DM is built on a novel chain-of-prompt (CoP) mechanism that progressively integrates human-defined rules with the commonsense knowledge of large language models (LLMs). Experiments on VRC-QA show that Talk2DM can switch across different LLMs while maintaining high NLS query accuracy, demonstrating strong generalization capability. Larger models tend to achieve higher accuracy, but at a significant cost in efficiency. Our results show that Talk2DM, powered by Qwen3:8B, Gemma3:27B, and GPT-oss models, achieves over 93% NLS query accuracy with an average response time of only 2 to 5 seconds, indicating strong practical potential.