Images are the standard input for most computer vision algorithms. Their processing, however, often reduces to parallelizable operations applied locally and independently to individual pixels. Moreover, many of these low-level raw pixel readings provide only redundant or noisy information for a given high-level task, which wastes energy when they are transmitted off-sensor and computational resources when they are subsequently processed. As novel sensors featuring advanced in-pixel processing capabilities emerge, we envision a paradigm shift toward performing increasingly complex visual processing directly in-pixel, reducing computational overhead downstream. We advocate for synthesizing high-level cues at the pixel level, so that their off-sensor transmission supports downstream tasks more effectively than raw pixel readings. This paper conceptualizes a novel photometric rotation estimation algorithm to be distributed at pixel level, where each pixel estimates the global motion of the camera by exchanging information with other pixels to achieve global consensus. We employ a probabilistic formulation and leverage Gaussian Belief Propagation (GBP) for decentralized inference using message passing. The proposed technique is evaluated on real-world public datasets, and we offer an in-depth analysis of the practicality of applying GBP to distributed rotation estimation at pixel level.
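To illustrate the idea of reaching global consensus through message passing, the following is a minimal toy sketch of scalar GBP, not the photometric formulation of the paper. It assumes a 1D chain of nodes (standing in for pixels), where each node holds a noisy scalar measurement of a shared quantity (standing in for a rotation angle) and is tied to its neighbours by a Gaussian "agreement" factor. The function name `gbp_chain_consensus` and the parameters `lam_meas` and `lam_pair` are hypothetical and chosen only for this example.

```python
import numpy as np

def gbp_chain_consensus(z, lam_meas=1.0, lam_pair=10.0, iters=None):
    """Toy synchronous scalar GBP on a chain (illustrative assumption, not the paper's model).

    z        : per-node noisy measurements of a shared scalar
    lam_meas : precision of each unary measurement factor
    lam_pair : precision of each pairwise factor penalising (x_i - x_j)^2
    Returns the per-node posterior means and precisions.
    """
    z = np.asarray(z, dtype=float)
    n = len(z)
    iters = n if iters is None else iters  # n - 1 sweeps suffice on a chain

    # Incoming messages in information form (eta = precision * mean, lam = precision).
    # Column 0: message arriving from the left neighbour; column 1: from the right.
    eta_in = np.zeros((n, 2))
    lam_in = np.zeros((n, 2))
    eta_unary = lam_meas * z

    for _ in range(iters):
        new_eta = np.zeros_like(eta_in)
        new_lam = np.zeros_like(lam_in)

        # Rightward messages i -> i+1: sender combines its unary factor with
        # the message from its left, excluding the one from the recipient.
        E = eta_unary[:-1] + eta_in[:-1, 0]
        L = lam_meas + lam_in[:-1, 0]
        new_eta[1:, 0] = lam_pair * E / (lam_pair + L)
        new_lam[1:, 0] = lam_pair * L / (lam_pair + L)

        # Leftward messages i -> i-1: same rule, mirrored.
        E = eta_unary[1:] + eta_in[1:, 1]
        L = lam_meas + lam_in[1:, 1]
        new_eta[:-1, 1] = lam_pair * E / (lam_pair + L)
        new_lam[:-1, 1] = lam_pair * L / (lam_pair + L)

        eta_in, lam_in = new_eta, new_lam

    # Belief at each node: unary factor plus all incoming messages.
    lam_belief = lam_meas + lam_in.sum(axis=1)
    eta_belief = eta_unary + eta_in.sum(axis=1)
    return eta_belief / lam_belief, lam_belief


if __name__ == "__main__":
    measurements = [0.10, 0.12, 0.08, 0.11, 0.09]
    means, precisions = gbp_chain_consensus(measurements)
    print(means)
```

Because a chain has no loops, the beliefs in this sketch coincide with the exact Gaussian marginals once messages have travelled end to end. A pixel grid contains loops, so GBP there is iterative and approximate in its variances, which is one reason a practicality analysis such as the one the abstract describes is needed.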