Multi-view image compression plays a critical role in 3D-related applications. Existing methods adopt a predictive coding architecture, which requires joint encoding to compress the disparity and residual information between views. This demands collaboration among cameras and enforces the epipolar geometric constraint between different views, making these methods difficult to deploy in distributed camera systems with randomly overlapping fields of view. Meanwhile, distributed source coding theory indicates that correlated sources can be compressed efficiently by independent encoding and joint decoding, which motivates us to design a learning-based distributed multi-view image coding (LDMIC) framework. LDMIC keeps the encoders independent and introduces, at the decoder, a simple yet effective joint context transfer module based on the cross-attention mechanism. The module captures global inter-view correlations and is insensitive to the geometric relationships between images. Experimental results show that LDMIC significantly outperforms both traditional and learning-based multi-view image coding (MIC) methods while enjoying fast encoding speed. Code will be released at https://github.com/Xinjie-Q/LDMIC.
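To make the decoder-side idea concrete, the following is a minimal NumPy sketch of cross-attention between the feature maps of two independently encoded views. It is an illustration only, not the LDMIC implementation: the function name `cross_attention`, the residual connection, the single attention head, the shared projection matrices, and all shapes are assumptions introduced here.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(target, reference, w_q, w_k, w_v):
    """Single-head cross-attention (hypothetical helper, not from the paper).

    target:    (N_t, C) flattened features of the view being decoded
    reference: (N_r, C) flattened features of another view
    Returns the target features enriched with reference context and the
    (N_t, N_r) attention weights.
    """
    q = target @ w_q                      # queries come from the target view
    k = reference @ w_k                   # keys come from the reference view
    v = reference @ w_v                   # values come from the reference view
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)    # each target position attends to all reference positions
    # Residual connection is an assumption of this sketch.
    return target + weights @ v, weights


rng = np.random.default_rng(0)
C = 8                                      # channel dimension (assumed)
feat_a = rng.standard_normal((16, C))      # e.g. a 4x4 latent map of view A, flattened
feat_b = rng.standard_normal((16, C))      # same for view B
w_q, w_k, w_v = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))

# Context flows in both directions; no disparity or epipolar alignment is used.
fused_a, attn_ab = cross_attention(feat_a, feat_b, w_q, w_k, w_v)
fused_b, attn_ba = cross_attention(feat_b, feat_a, w_q, w_k, w_v)
```

Because every target position attends to every reference position, the attention weights do not depend on a fixed geometric relationship between the views, which is the property the abstract attributes to the joint context transfer module.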