Mixed-autonomy driving becomes unsafe and inefficient when autonomous vehicles (AVs) and human-driven vehicles (HVs) misread each other's intentions. We study this problem as implicit mutual communication in lane changes. The proposed framework models how the ego vehicle both expresses its intent and probes the other driver's preference under epistemic uncertainty. It combines a level-k Bayesian persuasion game with virtual features for proactive signaling, information-theoretic rewards for mutual communication, and adaptive weights on communication affordances. We further introduce the Pride-Inquiry (P-I) and Pride-Prejudice (P-P) planes to analyze communication intensity and tendency. The model is calibrated with a Communication-Based Multi-Agent Inverse Reinforcement Learning (C-MIRL) algorithm on the naturalistic NGSIM dataset. Compared with the non-communicative baseline, the proposed model reduces the prediction error for mandatory lane changes by up to 20% while maintaining strong generalization. Driver-in-the-loop questionnaire scores are positively correlated with the calibrated communication variables, supporting the subjective validity of the model. The learned rewards further show that inquiry and listening affordances contribute more than pride and expression alone, and that inquiry preference varies more strongly across drivers. These results support explicit modeling of mutual communication and epistemic uncertainty in interactive driving.
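To make the idea of an information-theoretic reward for probing another driver's preference concrete, the following is a minimal sketch and not the paper's formulation. It assumes a discrete belief over two hypothetical driver preferences (`yield`, `not_yield`), a hand-picked likelihood table for two hypothetical observed reactions (`decelerate`, `maintain`), and it scores an observation by the reduction in Shannon entropy of the belief. All names and numbers are illustrative assumptions.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {key: value / total for key, value in weights.items()}


def entropy(belief):
    """Shannon entropy (in bits) of a discrete belief."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)


def bayes_update(prior, likelihood, observation):
    """Posterior over hypotheses after seeing one observation."""
    unnormalized = {
        hypothesis: prior[hypothesis] * likelihood[hypothesis][observation]
        for hypothesis in prior
    }
    return normalize(unnormalized)


def information_gain(prior, posterior):
    """Entropy reduction, used here as an illustrative inquiry reward."""
    return entropy(prior) - entropy(posterior)


# Hypothetical numbers: the ego vehicle is initially unsure whether
# the other driver is willing to yield.
prior = {"yield": 0.5, "not_yield": 0.5}

# Assumed likelihood of each observed reaction under each preference.
likelihood = {
    "yield": {"decelerate": 0.8, "maintain": 0.2},
    "not_yield": {"decelerate": 0.2, "maintain": 0.8},
}

posterior = bayes_update(prior, likelihood, "decelerate")
gain = information_gain(prior, posterior)

print(posterior)
print(round(gain, 3))
```

In this toy setting, observing a deceleration shifts the belief toward `yield` and produces a positive entropy reduction. In a planner, a term of this kind could be added to the ego vehicle's reward to favor maneuvers that reveal the other driver's preference. How the framework described above actually defines and weights such terms is not specified here.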