Learned Focused Plenoptic Image Compression with Microimage Preprocessing and Global Attention

Focused plenoptic cameras record the spatial and angular information of the light field (LF) simultaneously, with higher spatial resolution than traditional plenoptic cameras, which facilitates various applications in computer vision. However, existing plenoptic image compression methods are ineffective on the captured images because of the complex micro-textures generated by microlens relay imaging and the long-distance correlations among microimages. In this paper, a lossy end-to-end learning architecture is proposed to compress focused plenoptic images efficiently. First, a data preprocessing scheme is designed according to the imaging principle to remove the pixels in the recorded light field that are ineffective for sub-aperture images and to align the microimages to a rectangular grid. Then, a global attention module with a large receptive field is proposed to capture the global correlation among the feature maps using pixel-wise vector attention computed in the resampling process. In addition, a new dataset of 1910 focused plenoptic images with diverse content and depth is built for training and testing. Extensive experimental evaluations demonstrate the effectiveness of the proposed approach. On the 20 preprocessed focused plenoptic images, it outperforms HEVC and VVC intra coding by an average bitrate reduction of 62.57% and 51.67%, respectively. It also achieves an 18.73% bitrate saving over state-of-the-art end-to-end image compression methods while producing perceptually pleasant reconstructions, which greatly benefits applications of focused plenoptic cameras. The dataset and code are publicly available at https://github.com/VincentChandelier/GACN.
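
The preprocessing step described above (discarding ineffective pixels and aligning microimages to a rectangular grid) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `align_microimages`, the `centers` array of known integer microimage centers, and the square crop size `patch` are all hypothetical, and the sketch assumes every crop lies fully inside the raw image.

```python
import numpy as np

def align_microimages(raw, centers, patch):
    """Crop a square patch around each microimage center and tile the
    patches onto a rectangular grid.

    raw     : 2-D (H, W) or 3-D (H, W, C) raw plenoptic image
    centers : integer array of shape (rows, cols, 2) holding the (y, x)
              center of each microimage (assumed known from calibration)
    patch   : side length in pixels of the square region kept per microimage
              (assumption: pixels outside it are treated as ineffective)
    """
    rows, cols, _ = centers.shape
    half = patch // 2
    out = np.zeros((rows * patch, cols * patch) + raw.shape[2:], dtype=raw.dtype)
    for r in range(rows):
        for c in range(cols):
            cy, cx = centers[r, c]
            # Keep only the central patch of this microimage.
            tile = raw[cy - half:cy - half + patch, cx - half:cx - half + patch]
            # Place it at the (r, c) cell of the rectangular output grid.
            out[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch] = tile
    return out

# Toy example: a 2x2 lenslet layout whose second row is shifted by half a
# pitch, as in a hexagonal arrangement (values chosen purely for illustration).
raw = np.arange(20 * 24).reshape(20, 24)
centers = np.array([[[4, 4], [4, 12]],
                    [[12, 8], [12, 16]]])
aligned = align_microimages(raw, centers, patch=4)
```

In this toy layout the half-pitch offset of the second row disappears in `aligned`, because each patch is written to a regular grid cell regardless of where its center sat in the raw image.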
