Moving Object Detection (MOD) is a critical vision task for safe autonomous driving. Although deep learning methods have produced promising results, most existing approaches are purely frame-based and may perform poorly when dealing with dynamic traffic participants. Recent advances in sensor technology, especially the event camera, naturally complement the conventional camera and allow moving objects to be modeled better. However, event-based works often adopt a pre-defined time window for event representation and simply integrate the events within it to estimate image intensities, neglecting much of the rich temporal information in the asynchronous event stream. We therefore propose RENet, a novel RGB-Event fusion Network that jointly exploits the two complementary modalities to achieve more robust MOD in challenging autonomous driving scenarios. Specifically, we first design a temporal multi-scale aggregation module that leverages event frames from both the RGB exposure time and larger intervals. We then introduce a bi-directional fusion module that attentively calibrates and fuses the multi-modal features. To evaluate our network, we select and annotate a MOD subset of the commonly used DSEC dataset. Extensive experiments demonstrate that the proposed method performs significantly better than state-of-the-art RGB-Event fusion alternatives. The source code and dataset are publicly available at: https://github.com/ZZY-Zhou/RENet.
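To make the idea of event frames taken over several time intervals concrete, the following is a minimal NumPy sketch, not the RENet implementation. It assumes events arrive as arrays of timestamps `t`, pixel coordinates `x`, `y`, and polarities `p`, and that each event frame is a simple two-channel per-pixel count of negative and positive events. The function names, the count-based representation, and the choice of windows that all end at a reference time `t_ref` are illustrative assumptions; the abstract does not specify these details.

```python
import numpy as np


def events_to_frame(t, x, y, p, t_start, t_end, height, width):
    """Count events with t_start <= t < t_end per pixel and polarity.

    Returns an array of shape (2, height, width):
    channel 0 counts negative events, channel 1 counts positive events.
    This count-based representation is an assumption made for illustration.
    """
    mask = (t >= t_start) & (t < t_end)
    frame = np.zeros((2, height, width), dtype=np.float32)
    channel = (p[mask] > 0).astype(np.intp)
    # np.add.at accumulates correctly when several events hit the same pixel.
    np.add.at(frame, (channel, y[mask], x[mask]), 1.0)
    return frame


def multi_scale_event_frames(t, x, y, p, t_ref, windows, height, width):
    """Build one event frame per window length, each window ending at t_ref."""
    return np.stack([
        events_to_frame(t, x, y, p, t_ref - w, t_ref, height, width)
        for w in windows
    ])


# Toy event stream: four events on a 2 x 3 sensor (hypothetical values).
t = np.array([0.0, 5.0, 9.0, 9.5])
x = np.array([0, 1, 1, 2])
y = np.array([0, 0, 1, 1])
p = np.array([1, -1, 1, 1])

# A short window (standing in for the exposure time) and a longer interval.
frames = multi_scale_event_frames(t, x, y, p, t_ref=10.0,
                                  windows=[1.0, 10.0], height=2, width=3)
print(frames.shape)
```

In this toy setup the short window sees only the two most recent events, while the longer window also captures the earlier ones. A learned aggregation module would then combine such frames rather than relying on a single fixed window.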