Event cameras are bio-inspired sensors that asynchronously capture per-pixel intensity changes and produce event streams encoding the time, pixel position, and polarity (sign) of each change. They offer several advantages over conventional frame-based cameras, including high temporal resolution, high dynamic range, and low latency. Because they can capture information under challenging visual conditions, event cameras have the potential to overcome the limitations of frame-based cameras in computer vision and robotics. In recent years, deep learning (DL) has been brought to this emerging field and has inspired active research into its potential. However, a taxonomy of DL techniques for event-based vision is still lacking. We first examine the typical event representations, together with quality enhancement methods, as they play a pivotal role as inputs to DL models. We then provide a comprehensive survey of existing DL-based methods, structurally grouping them into two major categories: 1) image/video reconstruction and restoration; 2) event-based scene understanding and 3D vision. We conduct benchmark experiments for existing methods in some representative research directions, namely image reconstruction, deblurring, and object recognition, to identify critical insights and problems. Finally, we discuss the remaining challenges and offer new perspectives to inspire further research.
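As a rough illustration of the event stream described above (time, pixel position, polarity) and of how events can be turned into a dense input for a DL model, the sketch below accumulates signed polarities per pixel over a time window. The `Event` tuple, the `events_to_frame` helper, the use of seconds for timestamps, and the +1/-1 polarity encoding are all assumptions made for this example; signed accumulation is only one simple representation and is not presented here as the survey's method.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    """One event; field names and units are hypothetical, for illustration only."""
    t: float  # timestamp, assumed to be in seconds
    x: int    # pixel column
    y: int    # pixel row
    p: int    # polarity: +1 for an intensity increase, -1 for a decrease


def events_to_frame(events: List[Event], height: int, width: int,
                    t_start: float, t_end: float) -> List[List[int]]:
    """Sum the polarities of events falling in [t_start, t_end) at each pixel."""
    frame = [[0] * width for _ in range(height)]
    for ev in events:
        in_window = t_start <= ev.t < t_end
        in_bounds = 0 <= ev.x < width and 0 <= ev.y < height
        if in_window and in_bounds:
            frame[ev.y][ev.x] += ev.p
    return frame


# A tiny synthetic stream on a 2x2 sensor (made-up values).
events = [
    Event(0.001, 0, 0, +1),
    Event(0.002, 0, 0, +1),
    Event(0.003, 1, 0, -1),
    Event(0.020, 1, 1, +1),  # outside the 10 ms window used below
]
frame = events_to_frame(events, height=2, width=2, t_start=0.0, t_end=0.010)
```

Collapsing a window into one frame discards the fine timing within that window, which is one reason the choice of event representation matters for downstream models.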