Event cameras have recently gained significant traction because they open up new avenues for low-latency and low-power solutions to complex computer vision problems. Unlocking these solutions requires algorithms that can leverage the unique nature of event data. However, the current state of the art is still highly influenced by the frame-based literature and usually fails to deliver on these promises. In this work, we address this gap and propose a novel self-supervised learning pipeline for the sequential estimation of event-based optical flow that allows the models to scale to high inference frequencies. At its core is a continuously running stateful neural model, trained using a novel formulation of contrast maximization that makes it robust to nonlinearities and varying statistics in the input events. Results across multiple datasets confirm the effectiveness of our method, which establishes a new state of the art in accuracy among approaches trained or optimized without ground truth.
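For readers unfamiliar with contrast maximization, the general idea can be illustrated with a minimal sketch. This is the classic variance-based objective, not the novel formulation proposed in this work: events are transported along a candidate flow field to a common reference time, accumulated into an image of warped events, and the flow is scored by the variance (contrast) of that image. The function names, the `(H, W, 2)` flow layout, and the nearest-pixel voting are all assumptions made for illustration; a trainable version would likely need a differentiable accumulation such as bilinear voting.

```python
import numpy as np


def warp_events(xs, ys, ts, flow, t_ref):
    """Transport each event along its per-pixel flow vector to time t_ref.

    xs, ys: integer pixel coordinates of the events.
    ts: event timestamps.
    flow: array of shape (H, W, 2), in pixels per unit time (assumed layout).
    """
    u = flow[ys, xs, 0]  # horizontal flow at each event's pixel
    v = flow[ys, xs, 1]  # vertical flow at each event's pixel
    dt = t_ref - ts
    return xs + u * dt, ys + v * dt


def image_of_warped_events(xw, yw, height, width):
    """Accumulate warped events into an image with nearest-pixel voting."""
    xi = np.rint(xw).astype(int)
    yi = np.rint(yw).astype(int)
    inside = (xi >= 0) & (xi < width) & (yi >= 0) & (yi < height)
    iwe = np.zeros((height, width))
    # np.add.at handles repeated indices, so coinciding events add up
    np.add.at(iwe, (yi[inside], xi[inside]), 1.0)
    return iwe


def contrast(xs, ys, ts, flow, t_ref, height, width):
    """Variance of the image of warped events; higher means sharper."""
    xw, yw = warp_events(xs, ys, ts, flow, t_ref)
    return image_of_warped_events(xw, yw, height, width).var()
```

Under this sketch, a flow field that matches the true motion collapses the events of a moving edge onto the same pixels, which yields a higher contrast than an incorrect flow that leaves them smeared across the image.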