Multi-view image compression plays a critical role in 3D-related applications. Existing methods adopt a predictive coding architecture, which requires joint encoding to compress the disparity and residual information between views. Joint encoding demands collaboration among cameras and enforces the epipolar geometric constraint between views, which makes these methods difficult to deploy in distributed camera systems with randomly overlapping fields of view. Meanwhile, distributed source coding theory indicates that correlated sources can be compressed efficiently through independent encoding and joint decoding, which motivates us to design a learning-based distributed multi-view image coding (LDMIC) framework. LDMIC keeps the encoders independent and introduces, at the decoder, a simple yet effective joint context transfer module based on the cross-attention mechanism. The module captures global inter-view correlations and is insensitive to the geometric relationships between images. Experimental results show that LDMIC significantly outperforms both traditional and learning-based multi-view image coding (MIC) methods while enjoying fast encoding speed. Code will be released at https://github.com/Xinjie-Q/LDMIC.
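To make the decoder-side idea concrete, the following is a minimal sketch of cross-attention between the features of two views. It is not the LDMIC joint context transfer module itself: the function names (`cross_attention`, `transfer_context`), the identity query/key/value projections, and the residual fusion are all assumptions chosen for illustration, and a real module would use learned projections over decoded latent features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Scaled dot-product attention: every query token attends to all context tokens.

    queries: list of d-dimensional token vectors from the view being decoded.
    context: list of d-dimensional token vectors from another view.
    Assumption: identity query/key/value projections (a real module learns them).
    """
    d = len(queries[0])
    out = []
    for q in queries:
        # Similarity of this query token to every context token, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in context]
        weights = softmax(scores)
        # Weighted sum of context tokens, computed per feature dimension.
        attended = [sum(w * v[j] for w, v in zip(weights, context)) for j in range(d)]
        out.append(attended)
    return out


def transfer_context(view_a, view_b):
    """Fuse each view's tokens with context gathered from the other view.

    Assumption: fusion is a plain residual addition, used here only for illustration.
    """
    a_ctx = cross_attention(view_a, view_b)
    b_ctx = cross_attention(view_b, view_a)
    fused_a = [[x + c for x, c in zip(t, ct)] for t, ct in zip(view_a, a_ctx)]
    fused_b = [[x + c for x, c in zip(t, ct)] for t, ct in zip(view_b, b_ctx)]
    return fused_a, fused_b


# Toy features: two tokens for view A and three for view B, each 2-dimensional.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused_a, fused_b = transfer_context(view_a, view_b)
```

In this sketch each token attends to every token of the other view, and the two views need not have the same number of tokens. No disparity or epipolar alignment is computed at any point, which mirrors the abstract's claim that the attention-based module does not depend on the geometric relationship between images.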