Unbiased Gradient Estimation for Event Binning via Functional Backpropagation

Event-based vision encodes dynamic scenes as asynchronous spatio-temporal spikes called events. To leverage conventional image processing pipelines, events are typically binned into frames. However, binning functions are discontinuous, which truncates gradients at the frame level and forces most event-based algorithms to rely solely on frame-based features. Attempts to directly learn from raw events avoid this restriction but instead suffer from biased gradient estimation due to the discontinuities of the binning operation, ultimately limiting their learning efficiency. To address this challenge, we propose a novel framework for unbiased gradient estimation of arbitrary binning functions by synthesizing weak derivatives during backpropagation while keeping the forward output unchanged. The key idea is to exploit integration by parts: lifting the target functions to functionals yields an integral form of the derivative of the binning function during backpropagation, where the cotangent function naturally arises. By reconstructing this cotangent function from the sampled cotangent vector, we compute weak derivatives that provably match long-range finite differences of both smooth and non-smooth targets. Experimentally, our method improves simple optimization-based egomotion estimation with 3.2\% lower RMS error and 1.57$\times$ faster convergence. On complex downstream tasks, we achieve 9.4\% lower EPE in self-supervised optical flow, and 5.1\% lower RMS error in SLAM, demonstrating broad benefits for event-based visual perception. Source code can be found at https://github.com/chjz1024/EventFBP.
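The integration-by-parts idea can be illustrated with a 1-D toy example. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a box (nearest-bin) kernel $b$ of unit width and a cotangent function $g(s)$ reconstructed from the cotangent vector by linear interpolation between bin centers. Under those assumptions the lifted loss for one event at position $x$ is $\int g(s)\,b(s-x)\,ds = \int_{x-1/2}^{x+1/2} g(s)\,ds$, so its derivative is $g(x+\tfrac12) - g(x-\tfrac12)$, a finite difference of the cotangent function across the bin width. All names here (`NUM_BINS`, `bin_events`, `cotangent_fn`, `bin_events_vjp`) are hypothetical.

```python
import math

NUM_BINS = 8  # hypothetical 1-D frame width, for illustration only


def bin_events(xs):
    """Forward pass: hard nearest-bin (box-kernel) event counts, left unchanged."""
    frame = [0.0] * NUM_BINS
    for x in xs:
        k = min(max(math.floor(x + 0.5), 0), NUM_BINS - 1)
        frame[k] += 1.0
    return frame


def cotangent_fn(g, s):
    """Assumed reconstruction: piecewise-linear interpolation of the cotangent
    vector g, sampled at bin centers 0..NUM_BINS-1 and clamped outside."""
    s = min(max(s, 0.0), NUM_BINS - 1.0)
    k = min(int(math.floor(s)), NUM_BINS - 2)
    t = s - k
    return (1.0 - t) * g[k] + t * g[k + 1]


def bin_events_vjp(xs, g):
    """Backward pass: weak derivative of the box kernel via integration by parts,
    d/dx of the integral of g over [x - 1/2, x + 1/2] = g(x + 1/2) - g(x - 1/2)."""
    return [cotangent_fn(g, x + 0.5) - cotangent_fn(g, x - 0.5) for x in xs]


# Toy loss: sum_k k * frame[k], so the cotangent vector is g[k] = k.
xs = [2.3, 5.1]
frame = bin_events(xs)
g = [float(k) for k in range(NUM_BINS)]
grads = bin_events_vjp(xs, g)
```

With these linear weights the loss behaves like the sum of event positions, and the sketch returns a gradient of approximately 1 per event. The pointwise derivative of the hard binning function is zero almost everywhere, so it would carry no such signal.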
