Timely and reliable environment perception is fundamental to safe and efficient automated driving. However, the perception of standalone intelligence inevitably suffers from occlusions. A new paradigm, Cooperative Perception (CP), comes to the rescue by sharing sensor data from another perspective, i.e., from a cooperative vehicle (CoV). Due to the limited communication bandwidth, it is essential to schedule the most beneficial CoV, considering both the viewpoints and communication quality. Existing methods rely on the exchange of meta-information, such as visibility maps, to predict the perception gains from nearby vehicles, which induces extra communication and processing overhead. In this paper, we propose a new approach, learning while scheduling, for distributed scheduling of CP. The solution enables CoVs to predict the perception gains using past observations, leveraging the temporal continuity of perception gains. Specifically, we design a mobility-aware sensor scheduling (MASS) algorithm based on the restless multi-armed bandit (RMAB) theory to maximize the expected average perception gain. An upper bound on the expected average learning regret is proved, which matches the lower bound of any online algorithm up to a logarithmic factor. Extensive simulations are carried out on realistic traffic traces. The results show that the proposed MASS algorithm achieves the best average perception gain and improves recall by up to 4.2 percentage points compared to other learning-based algorithms. Finally, a case study on a trace of LiDAR frames qualitatively demonstrates the superiority of adaptive exploration, the key element of the MASS algorithm.
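To make the idea of learning while scheduling concrete, the following is a minimal sketch of how a vehicle might pick one CoV per time slot from past observed gains alone, without exchanging meta-information. It is not the MASS algorithm, whose details are not given in this abstract. It substitutes a generic sliding-window upper-confidence-bound (UCB) rule, which forgets old observations so that the estimates can follow perception gains that drift as vehicles move. All names (`SlidingWindowUCB`, `run_demo`), the window length, the exploration weight, and the synthetic gain values are assumptions chosen for illustration only.

```python
import math
from collections import deque


class SlidingWindowUCB:
    """Generic sliding-window UCB scheduler (illustrative, not MASS)."""

    def __init__(self, num_arms, window=30, explore=0.5):
        self.num_arms = num_arms          # number of candidate CoVs
        self.explore = explore            # hypothetical exploration weight
        # Only the most recent `window` (arm, gain) pairs are remembered.
        self.history = deque(maxlen=window)

    def select(self):
        """Return the index of the CoV to schedule in this slot."""
        counts = [0] * self.num_arms
        sums = [0.0] * self.num_arms
        for arm, gain in self.history:
            counts[arm] += 1
            sums[arm] += gain
        # A CoV with no observation inside the window is explored first.
        for arm in range(self.num_arms):
            if counts[arm] == 0:
                return arm
        horizon = len(self.history)

        def index(arm):
            mean = sums[arm] / counts[arm]
            bonus = math.sqrt(self.explore * math.log(horizon) / counts[arm])
            return mean + bonus

        return max(range(self.num_arms), key=index)

    def update(self, arm, gain):
        """Record the perception gain observed from the scheduled CoV."""
        self.history.append((arm, gain))


def run_demo(num_slots=200):
    """Schedule among three synthetic CoVs whose gains switch halfway."""
    scheduler = SlidingWindowUCB(num_arms=3)
    choices = []
    for t in range(num_slots):
        # Synthetic, deterministic gains: CoV 0 is best in the first
        # half, CoV 1 is best in the second half (assumed values).
        gains = [0.9, 0.2, 0.1] if t < num_slots // 2 else [0.1, 0.9, 0.2]
        arm = scheduler.select()
        scheduler.update(arm, gains[arm])
        choices.append(arm)
    return choices


if __name__ == "__main__":
    choices = run_demo()
    print("CoV 0 share, slots 50-99:", choices[50:100].count(0) / 50)
    print("CoV 1 share, slots 150-199:", choices[150:].count(1) / 50)
```

In this toy setting only the gain of the scheduled CoV is observed in each slot, so the scheduler has to keep re-exploring the others to notice that the best CoV has changed. The fixed window is a crude stand-in for the adaptive exploration that the abstract identifies as the key element of MASS.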