This study introduces a novel approach to enhancing the spatiotemporal resolution of event streams produced by event cameras, which report per-pixel luminance changes asynchronously. These cameras present unique challenges due to their low spatial resolution and the sparse, asynchronous nature of the data they collect. Current event super-resolution algorithms are not fully optimized for this distinct data structure, and consequently fail to capture the full dynamism and detail of visual scenes while incurring increased computational complexity. To bridge this gap, our research proposes a method that integrates binary spikes with Sigma Delta Neural Networks (SDNNs), leveraging a spatiotemporal constraint learning mechanism designed to learn the spatial and temporal distributions of the event stream simultaneously. The proposed network is evaluated on widely recognized benchmark datasets, including N-MNIST, CIFAR10-DVS, ASL-DVS, and Event-NFS. A comprehensive evaluation framework assesses both accuracy, via root mean square error (RMSE), and computational efficiency. The findings demonstrate significant improvements over existing state-of-the-art methods: the proposed approach achieves a 17.04-fold improvement in event sparsity and a 32.28-fold increase in synaptic operation efficiency over traditional artificial neural networks, along with a two-fold improvement over spiking neural networks.
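To make the efficiency argument concrete, the sketch below illustrates the core idea behind a sigma-delta neuron: each unit transmits only the quantized change in its activation, so temporally sparse, event-like inputs translate into very few synaptic operations. This is a minimal illustration of the general mechanism, not the paper's implementation; the class name, the ReLU activation, and the unit quantization step are assumptions.

```python
import numpy as np

class SigmaDeltaNeuron:
    """Toy sigma-delta neuron: transmits the quantized *change* in its
    activation instead of the activation itself, so slowly varying or
    static inputs generate few or no messages (sparse graded spikes)."""

    def __init__(self, shape, step=1.0):
        self.step = step                      # delta-quantization step size (assumed 1.0)
        self.sent = np.zeros(shape)           # level already communicated downstream

    def __call__(self, x):
        act = np.maximum(x, 0.0)              # ReLU activation at this timestep
        delta = act - self.sent               # change since the last transmission
        spikes = np.round(delta / self.step)  # integer (graded) spike counts
        self.sent += spikes * self.step       # downstream sigma stage accumulates these
        return spikes

# For a constant input, only the first timestep produces spikes:
neuron = SigmaDeltaNeuron(shape=(4,))
x = np.array([0.3, 1.7, 2.2, -0.5])
print(neuron(x))   # [0. 2. 2. 0.]
print(neuron(x))   # [0. 0. 0. 0.] -- no change, no events
```

Because the sub-threshold residual is retained in the gap between `act` and `sent`, no information is discarded; it is simply deferred until the accumulated change crosses the quantization step, which is what makes the event sparsity gains reported above possible.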
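The abstract does not specify the form of the spatiotemporal constraint, so the following is only one plausible reading under stated assumptions: a training objective combining a spatial term (per-frame MSE, whose square root gives the reported RMSE) with a temporal term that matches the distribution of event counts over time. The function name, the KL-based temporal term, and the weighting factor `lam` are all hypothetical.

```python
import torch
import torch.nn.functional as F

def spatiotemporal_loss(pred, target, lam=0.5, eps=1e-8):
    """Hypothetical combined objective for event super-resolution.
    pred, target: (T, H, W) tensors of non-negative per-bin event counts
    (e.g., after a ReLU or spike-generation layer).
    Spatial term: mean squared error over frames (sqrt gives RMSE).
    Temporal term: KL divergence between normalized per-bin event-count
    profiles, constraining *when* events occur, not just where."""
    spatial = F.mse_loss(pred, target)

    # Normalize total event counts per time bin into distributions.
    p = pred.sum(dim=(1, 2)) + eps
    p = p / p.sum()
    q = target.sum(dim=(1, 2)) + eps
    q = q / q.sum()
    temporal = torch.sum(q * (q.log() - p.log()))  # KL(target || pred)

    return spatial + lam * temporal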