Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors, and asynchronous capture times of sensor data all introduce difficulties into the data fusion of different agents. To some extent, previous works have attempted to reduce the shared data size and to mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them has considered asynchronous sensor ticking times, which can lead to dynamic objects being misplaced by more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X to account for asynchronous LiDAR sensor ticking times, and we build an efficient, fully sparse framework that models the temporal information of individual objects with query-based techniques. The experimental results confirm the superior efficiency of our fully sparse framework compared to state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of dynamic objects are crucial for accurately modeling the object temporal context and the predictability of their time-related locations.
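To make the role of point-wise timestamps concrete, below is a minimal, hypothetical sketch (not the paper's actual code) of a query-based temporal head: each LiDAR point carries (x, y, z, t), and learned object queries attend over the time-stamped point features to regress object locations at a shared reference time. All module names, shapes, and hyperparameters here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TimeAwareQueryHead(nn.Module):
    """Illustrative query-based head conditioned on per-point timestamps."""
    def __init__(self, dim: int = 64, num_queries: int = 16):
        super().__init__()
        self.point_embed = nn.Linear(4, dim)           # embed (x, y, z, t) per point
        self.queries = nn.Parameter(torch.randn(num_queries, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.loc_head = nn.Linear(dim + 1, 2)          # predict (x, y) at a given time

    def forward(self, points: torch.Tensor, t_ref: float) -> torch.Tensor:
        # points: (B, N, 4), with per-point capture timestamps in the last channel
        feats = self.point_embed(points)               # (B, N, dim)
        q = self.queries.unsqueeze(0).expand(points.size(0), -1, -1)
        obj, _ = self.attn(q, feats, feats)            # queries aggregate temporal context
        t = torch.full((*obj.shape[:2], 1), t_ref)     # condition on the reference time
        return self.loc_head(torch.cat([obj, t], dim=-1))  # (B, num_queries, 2)

# Usage: points captured asynchronously by two agents are simply concatenated;
# their per-point timestamps disambiguate the different capture times.
pts_a = torch.randn(1, 100, 4); pts_a[..., 3] = 0.00   # agent A ticks at t=0.00 s
pts_b = torch.randn(1, 100, 4); pts_b[..., 3] = 0.05   # agent B ticks at t=0.05 s
head = TimeAwareQueryHead()
locs = head(torch.cat([pts_a, pts_b], dim=1), t_ref=0.1)
```

The design choice the sketch highlights is that time becomes an explicit input feature rather than an implicit assumption of synchronized frames, which is what allows fused detections to be aligned to a common reference time despite asynchronous sensor ticking.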