Multi-view compression technology, especially Stereo Image Compression (SIC), plays a crucial role in car-mounted cameras and 3D-related applications. Distributed Source Coding (DSC) theory suggests that correlated sources can be compressed efficiently through independent encoding and joint decoding, which has motivated the rapid development of deep distributed SIC methods in recent years. However, these approaches neglect the unique characteristics of stereo-imaging tasks and incur high decoding latency. To address these limitations, we propose a Feature-based Fast Cascade Alignment network (FFCA-Net) that fully leverages the side information at the decoder. FFCA adopts a coarse-to-fine cascaded alignment approach. In the first stage, FFCA uses a feature-domain patch-matching module based on stereo priors, which removes redundancy from the search space of naive matching methods and reduces the noise they introduce. In the second stage, an hourglass-based sparse stereo refinement network further aligns inter-image features at reduced computational cost. We also design a lightweight yet high-performance feature fusion network, the Fast Feature Fusion network (FFF), to decode the aligned features. Experimental results on the InStereo2K, KITTI, and Cityscapes datasets show that our approach significantly outperforms traditional and learning-based SIC methods. In particular, it decodes 3 to 10 times faster than other methods.
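As a rough illustration of how a stereo prior can shrink the patch-matching search space, the sketch below restricts the search to horizontal shifts along the same row. The abstract does not say which prior FFCA uses; the same-row assumption (rectified stereo pairs), the sum-of-squared-differences cost, and the names `align_side_info`, `patch`, and `max_disp` are all hypothetical choices made for this example, not the authors' implementation.

```python
import numpy as np

def align_side_info(feat_x, feat_y, patch=4, max_disp=8):
    """Coarse alignment of side-information features to primary features.

    Assumption (not from the paper): the stereo pair is rectified, so the
    best match for a patch lies on the same rows, shifted horizontally.

    feat_x: (C, H, W) coarsely decoded features of the primary image.
    feat_y: (C, H, W) features of the side-information image.
    Returns the aligned side features and the per-patch horizontal shift.
    """
    c, h, w = feat_x.shape
    assert feat_y.shape == feat_x.shape
    assert h % patch == 0 and w % patch == 0

    aligned = np.zeros_like(feat_y)
    shifts = np.zeros((h // patch, w // patch), dtype=np.int64)

    for i in range(0, h, patch):
        for j in range(0, w, patch):
            ref = feat_x[:, i:i + patch, j:j + patch]
            best_cost, best_d = np.inf, 0
            # Search only horizontal shifts on the same rows.
            for d in range(-max_disp, max_disp + 1):
                k = j + d
                if k < 0 or k + patch > w:
                    continue  # candidate patch would fall outside the map
                cand = feat_y[:, i:i + patch, k:k + patch]
                cost = np.sum((ref - cand) ** 2)  # SSD matching cost
                if cost < best_cost:
                    best_cost, best_d = cost, d
            k = j + best_d
            aligned[:, i:i + patch, j:j + patch] = feat_y[:, i:i + patch, k:k + patch]
            shifts[i // patch, j // patch] = best_d
    return aligned, shifts
```

With this restriction each patch is compared against at most `2 * max_disp + 1` candidates, whereas an unconstrained square window of the same radius would require `(2 * max_disp + 1) ** 2` comparisons. A finer refinement stage, such as the one the abstract describes, would then operate on the coarsely aligned output.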