Event cameras capture asynchronous per-pixel brightness changes, called "events", offering advantages over traditional frame-based cameras for computer vision applications. Given the significant volume of events, efficient coding of event data is critical for transmission and storage. This paper proposes a novel double deep learning-based architecture for both event data coding and classification, using a point cloud-based representation for events. In this context, the conversions from events to point clouds and back to events are key steps in the proposed solution, and their impact is therefore evaluated in terms of both compression and classification performance. Experimental results show that the classification performance of compressed events can remain similar to that of the original events, even after applying a lossy point cloud codec, notably the recent learning-based JPEG Pleno Point Cloud Coding standard, with a clear rate reduction. Experimental results also demonstrate that events coded using JPEG PCC achieve better classification performance than those coded using the conventional lossy MPEG Geometry-based Point Cloud Coding standard. Furthermore, the adoption of learning-based coding offers high potential for performing computer vision tasks in the compressed domain, allowing the decoding stage to be skipped while mitigating the impact of coding artifacts.
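To make the event-to-point-cloud conversion mentioned above concrete, the sketch below maps events, each an (x, y, timestamp, polarity) tuple, onto an integer 3D point cloud by quantizing the time axis, and then maps the points back to events. The function names, the `t_step` parameter, and the bin-center reconstruction are illustrative assumptions; the paper's exact conversion may differ.

```python
import numpy as np

def events_to_point_cloud(events, t_step=1000):
    """Map events (x, y, t_us, polarity) to an integer 3D point cloud.

    x and y stay as pixel coordinates; the timestamp axis is quantized in
    bins of t_step microseconds so the points fit a voxel grid, and the
    polarity is kept as a per-point attribute. This mapping is an
    illustrative assumption, not the paper's specified method.
    """
    events = np.asarray(events)
    pts = np.stack([events[:, 0],
                    events[:, 1],
                    events[:, 2] // t_step], axis=1).astype(np.int64)
    polarity = events[:, 3].astype(np.int8)
    return pts, polarity

def point_cloud_to_events(pts, polarity, t_step=1000):
    """Inverse mapping: each reconstructed timestamp is placed at the
    center of its quantization bin, so sub-bin timing is lost. This
    illustrates why the round trip itself can be lossy, independently
    of the point cloud codec applied in between."""
    t = pts[:, 2] * t_step + t_step // 2
    return np.stack([pts[:, 0], pts[:, 1], t, polarity], axis=1)
```

Coarser `t_step` values yield smaller point clouds (fewer distinct voxels) at the cost of temporal precision, which is the kind of rate/fidelity trade-off the compression and classification evaluation examines.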