Efficiently selecting an appropriate length of spike stream data from which to extract precise information is key to spike vision tasks. To address this issue, we propose a dynamic timing representation for spike streams. Built on a multi-layer architecture, it applies dilated convolutions along the temporal dimension to extract features at multiple temporal scales with few parameters, and we design a layer attention mechanism to fuse these features dynamically. Moreover, we propose an unsupervised learning method for spike-based optical flow estimation that removes the dependence on labeled data. In addition, to verify robustness, we build a spike-based synthetic validation dataset for extreme scenarios in autonomous driving, denoted the SSES dataset, which consists of various corner cases. Experiments show that our method can predict optical flow from spike streams in different high-speed scenes, including real scenes. For instance, under the same settings as previous works, our method achieves $15\%$ and $19\%$ error reductions relative to the best spike-based work, SCFlow, at $\Delta t=10$ and $\Delta t=20$, respectively.
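The idea of stacked temporal dilated convolutions fused by layer attention can be illustrated with a minimal sketch on a single pixel's spike train. This is not the architecture used in the paper: the kernel values, the dilation rates (1, 2, 4), and the use of each layer's mean activation as an attention score are hypothetical stand-ins for learned parameters, and the function names are introduced only for this illustration.

```python
import math


def dilated_conv1d(signal, kernel, dilation):
    """Causal 1-D dilated convolution; samples before t=0 are treated as zero."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation  # look back k * dilation time steps
            if idx >= 0:
                acc += w * signal[idx]
        out.append(acc)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_timing_representation(spikes, kernel=(0.5, 0.5), dilations=(1, 2, 4)):
    """Stack dilated convolutions, then fuse the per-layer features.

    Assumption: the attention score of a layer is its mean activation.
    A trained model would compute these scores with learned weights.
    """
    features = []
    x = list(spikes)
    for d in dilations:
        x = dilated_conv1d(x, kernel, d)  # each layer sees a longer time window
        features.append(x)
    scores = [sum(f) / len(f) for f in features]
    weights = softmax(scores)
    fused = [
        sum(w * f[t] for w, f in zip(weights, features))
        for t in range(len(spikes))
    ]
    return fused, weights


# Binary spike train of one pixel over 8 time steps (made-up input).
spikes = [1, 0, 0, 1, 0, 1, 1, 0]
fused, weights = dynamic_timing_representation(spikes)
print(weights)
print(fused)
```

With a kernel of size 2 and dilations 1, 2, and 4, the third layer has a receptive field of 1 + 1 + 2 + 4 = 8 time steps while the stack uses only six kernel weights, which is how dilation covers several temporal scales with few parameters.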