Owing to their extremely low latency, events have recently been exploited to supplement the information lost in motion-blurred images. Existing approaches largely rely on perfect pixel-wise alignment between intensity images and events, which does not always hold in the real world. To tackle this problem, we propose a novel coarse-to-fine framework, named NETwork of Event-based motion Deblurring with STereo event and intensity cameras (St-EDNet), to recover high-quality images directly from misaligned inputs consisting of a single blurry image and the concurrent event streams. Specifically, a cross-modal stereo matching module first performs coarse spatial alignment of the blurry image and the event streams without requiring ground-truth depths. Then, a dual-feature embedding architecture gradually builds a fine bidirectional association between the coarsely aligned data and reconstructs the sequence of latent sharp images. Furthermore, we build a new dataset with STereo Event and Intensity Cameras (StEIC), containing real-world events, intensity images, and dense disparity maps. Experiments on real-world datasets demonstrate the superiority of the proposed network over state-of-the-art methods.
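To make the idea of coarse spatial alignment concrete, the following is a minimal sketch of disparity-based event warping. It is not the St-EDNet stereo matching module, which is learned and needs no ground-truth depth; it only illustrates what aligning events to the intensity view means once a disparity map is available. The assumptions are: a rectified stereo pair, a disparity map defined in the event camera's view, events stored as rows of `(x, y, t, polarity)`, and a sign convention in which the image-view column is `x - d`. The function names `warp_events_to_image_view` and `accumulate_events` are hypothetical.

```python
import numpy as np


def warp_events_to_image_view(events, disparity, width):
    """Shift each event horizontally by the disparity at its pixel.

    events:    (N, 4) float array with columns x, y, t, polarity (assumed layout).
    disparity: (H, W) array in pixels, defined in the event view (assumption).
    width:     width of the intensity image in pixels.
    """
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    d = disparity[y, x]
    # Sign convention (x - d) is an assumption; it depends on the camera layout.
    x_new = np.rint(x - d).astype(int)
    keep = (x_new >= 0) & (x_new < width)  # drop events warped outside the image
    warped = events[keep].copy()
    warped[:, 0] = x_new[keep]
    return warped


def accumulate_events(events, height, width):
    """Sum event polarities per pixel to form a simple event frame."""
    frame = np.zeros((height, width), dtype=np.float64)
    rows = events[:, 1].astype(int)
    cols = events[:, 0].astype(int)
    # np.add.at handles repeated pixel indices correctly.
    np.add.at(frame, (rows, cols), events[:, 3])
    return frame


# Toy example: three events and a constant disparity of 2 pixels.
events = np.array([
    [5.0, 2.0, 0.0, 1.0],
    [1.0, 3.0, 0.1, -1.0],  # warps to x = -1 and is discarded
    [7.0, 0.0, 0.2, 1.0],
])
disparity = np.full((4, 8), 2.0)
warped = warp_events_to_image_view(events, disparity, width=8)
frame = accumulate_events(warped, height=4, width=8)
```

A warp of this kind is accurate only up to the disparity error, which is why the abstract describes a subsequent fine association stage on the coarsely aligned data.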