To accommodate high network dynamics in real-time cooperative perception (CP), reinforcement learning (RL) based adaptive CP schemes have been proposed that allow connected and autonomous vehicles to switch adaptively between CP and stand-alone perception modes. However, the traditional offline-training, online-execution RL framework suffers performance degradation under nonstationary network conditions. To achieve fast and efficient model adaptation, we formulate a set of Markov decision processes for adaptive CP decisions, one for each stationary local vehicular network (LVN). We propose a meta RL solution that trains a meta RL model capturing the features shared among LVNs, thereby enabling fast model adaptation for each LVN with the meta RL model as an initial point. Simulation results demonstrate the superiority of meta RL in terms of convergence speed without reward degradation. We also evaluate the impact of the customization level of meta models on model adaptation performance.
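The core idea above, i.e. training a shared initialization that each LVN can quickly fine-tune, can be illustrated with a minimal first-order meta-learning loop in the style of Reptile. This is only a sketch on a toy family of quadratic per-task losses; the function names and loss are hypothetical and do not reflect the paper's actual MDP formulation or RL algorithm.

```python
import random

def grad(theta, c):
    # Gradient of the toy per-task loss (theta - c)^2,
    # standing in for one LVN's learning objective.
    return 2.0 * (theta - c)

def adapt(theta, c, steps, lr=0.1):
    # Inner loop: per-LVN adaptation by plain gradient descent,
    # starting from the (meta) initialization theta.
    for _ in range(steps):
        theta -= lr * grad(theta, c)
    return theta

def meta_train(task_params, meta_iters=500, inner_steps=5, meta_lr=0.1):
    # Outer loop (Reptile-style): repeatedly adapt to a sampled task,
    # then move the meta-parameters toward the adapted parameters.
    # The result captures what is common across tasks (here, the
    # central tendency of the task parameters c).
    theta = 0.0
    for _ in range(meta_iters):
        c = random.choice(task_params)
        adapted = adapt(theta, c, inner_steps)
        theta += meta_lr * (adapted - theta)
    return theta
```

Starting adaptation from the meta-trained initialization reaches a low per-task loss in far fewer inner steps than starting from an arbitrary point, which mirrors the convergence-speed advantage reported in the abstract.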