Large language models (LLMs) have recently attracted considerable interest owing to their outstanding reasoning and comprehension capabilities. This work explores applying LLMs to vehicular networks, aiming to jointly optimize vehicle-to-infrastructure (V2I) communications and autonomous driving (AD) policies. We deploy LLMs for AD decision-making to maximize traffic flow and avoid collisions for road safety, while a double deep Q-network (DDQN) algorithm handles V2I optimization to maximize the received data rate and reduce frequent handovers. In particular, for LLM-enabled AD, we use the Euclidean distance to identify previously explored AD experiences, so that the LLM can learn from past good and bad decisions for further improvement. The LLM-based AD decisions then become part of the state in the V2I problem, and the DDQN optimizes the V2I decisions accordingly. The AD and V2I decisions are subsequently optimized in an iterative manner until convergence. Such an iterative approach better explores the interactions between LLMs and conventional reinforcement learning techniques, revealing the potential of using LLMs for network optimization and management. Finally, simulations demonstrate that the proposed hybrid LLM-DDQN approach outperforms the conventional DDQN algorithm, with faster convergence and higher average rewards.
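The Euclidean-distance experience retrieval mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation; the memory layout, field names (`state`, `action`, `reward`), and the choice of `k` are all assumptions made for illustration. The idea is to keep a memory of past (state, AD decision, reward) tuples and, for the current driving state, retrieve the nearest stored experiences so they can be placed in the LLM prompt as few-shot examples of past good and bad decisions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two state vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve_similar(memory, state, k=2):
    """Return the k stored experiences whose state vectors are closest
    (in Euclidean distance) to the current state. These can then be
    formatted into the LLM prompt as examples of past decisions."""
    return sorted(memory, key=lambda exp: euclidean(exp["state"], state))[:k]

# Hypothetical experience memory: each entry stores a normalized state
# vector, the AD action taken, and the reward observed afterwards.
memory = [
    {"state": [0.10, 0.90], "action": "keep_lane", "reward": 1.0},
    {"state": [0.80, 0.20], "action": "slow_down", "reward": -0.5},
    {"state": [0.15, 0.85], "action": "speed_up",  "reward": 0.7},
]

# Retrieve the two experiences most similar to the current state.
nearest = retrieve_similar(memory, [0.12, 0.88], k=2)
```

In this sketch, retrieved experiences with high rewards would be presented to the LLM as good decisions and those with low rewards as bad ones, letting the model condition its next AD decision on the most relevant past outcomes.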