Low-light video deblurring poses significant challenges in applications such as nighttime surveillance and autonomous driving, owing to dim illumination and long exposure times. While event cameras offer a promising solution thanks to their superior low-light sensitivity and high temporal resolution, existing fusion methods typically adopt staged strategies, limiting their effectiveness against the combined degradations of low light and motion blur. To overcome this, we propose CompEvent, a complex-valued neural network framework that enables holistic, full-process fusion of event data and RGB frames for enhanced joint restoration. CompEvent features two core components: 1) a Complex Temporal Alignment GRU, which uses complex-valued convolutions and iteratively processes video and event streams via a GRU to achieve temporal alignment and continuous fusion; and 2) a Complex Space-Frequency Learning module, which performs unified complex-valued signal processing in both the spatial and frequency domains, facilitating deep fusion through both spatial structures and system-level characteristics. By leveraging the holistic representation capability of complex-valued neural networks, CompEvent achieves full-process spatiotemporal fusion, maximizes complementary learning between the two modalities, and substantially strengthens low-light video deblurring. Extensive experiments demonstrate that CompEvent outperforms state-of-the-art methods on this challenging task.
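To make the notion of a complex-valued convolution concrete, the sketch below shows the standard construction from four real convolutions: for a complex kernel W = Wr + iWi and complex input x = xr + ixi, the output is (Wr*xr − Wi*xi) + i(Wr*xi + Wi*xr). This is a minimal illustration of the general operation, not CompEvent's actual layers; the function names and the tiny loop-based real convolution are ours for clarity.

```python
import numpy as np

def conv2d(x, k):
    """Valid-mode 2D cross-correlation of real arrays x and k (naive loops)."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def complex_conv2d(x, w):
    """Complex-valued convolution built from four real convolutions:
    (Wr + iWi) * (xr + ixi) = (Wr*xr - Wi*xi) + i(Wr*xi + Wi*xr).
    Real and imaginary channels mix, which is what lets such layers couple
    two signal streams (here, hypothetically, frame and event features)."""
    real = conv2d(x.real, w.real) - conv2d(x.imag, w.imag)
    imag = conv2d(x.real, w.imag) + conv2d(x.imag, w.real)
    return real + 1j * imag

# Example: a 4x4 complex feature map convolved with a 2x2 complex kernel.
xr = np.arange(16.0).reshape(4, 4)
x = xr + 1j * xr[::-1]                      # toy complex input
w = np.array([[1 + 1j, 0], [0, 1 - 1j]])    # toy complex kernel
y = complex_conv2d(x, w)                    # shape (3, 3), complex dtype
```

In a learned setting, Wr and Wi would be trainable, so the network itself decides how the two components (e.g. one modality per channel) are mixed at every layer rather than fusing them in a single late stage.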