Recently, the multi-reference entropy model was proposed; it captures channel-wise, local spatial, and global spatial correlations. Previous works adopt attention to capture global correlations, but its quadratic complexity limits the potential of high-resolution image coding. In this paper, we propose a linear-complexity method for capturing global correlations, obtained by decomposing the softmax operation. Based on it, we propose MLIC$^{++}$, a learned image compression model with linear complexity for multi-reference entropy modeling. MLIC$^{++}$ is more efficient and reduces BD-rate by 12.44% on the Kodak dataset compared to VTM-17.0 when measured in PSNR. Code will be available at https://github.com/JiangWeibeta/MLIC.
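The abstract does not spell out how the softmax is decomposed. As a minimal sketch, assume the decomposition follows a common pattern for linear-complexity attention: softmax is applied separately to the queries (over the feature dimension) and to the keys (over the token dimension), so the small key-value summary can be computed first and the n x n attention map is never formed. This is a stand-in for standard softmax attention, not an exact equivalent of it. All function names, shapes, and values below are hypothetical and are not taken from the MLIC$^{++}$ code.

```python
import math


def softmax(xs):
    """Numerically stable softmax of a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def normalize(q, k):
    """Apply the two separate softmaxes (assumed decomposition)."""
    # Queries: softmax over the feature dimension, one row per token.
    q_norm = [softmax(row) for row in q]
    # Keys: softmax over the token dimension, one column per feature.
    k_norm = transpose([softmax(col) for col in transpose(k)])
    return q_norm, k_norm


def decomposed_attention(q, k, v):
    """Linear in the number of tokens n: cost is O(n * d * d_v).

    q, k: n x d lists of lists; v: n x d_v. Returns n x d_v.
    """
    q_norm, k_norm = normalize(q, k)
    context = matmul(transpose(k_norm), v)  # d x d_v, independent of n
    return matmul(q_norm, context)          # n x d_v


def quadratic_reference(q, k, v):
    """Same result, but forms the n x n weight matrix explicitly."""
    q_norm, k_norm = normalize(q, k)
    weights = matmul(q_norm, transpose(k_norm))  # n x n
    return matmul(weights, v)


# Hypothetical toy input: n = 4 tokens, d = 2 features, d_v = 3 values.
demo_q = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5], [-0.2, 0.4]]
demo_k = [[0.2, 0.1], [-0.3, 0.4], [0.5, 0.0], [0.1, -0.2]]
demo_v = [[1.0, 0.0, 2.0], [0.5, 1.5, -1.0], [0.0, 2.0, 0.5], [1.0, 1.0, 1.0]]
demo_out = decomposed_attention(demo_q, demo_k, demo_v)
```

Both functions compute the same product and differ only in the order of multiplication, which is what removes the quadratic term. Because each normalized query row and each normalized key column sums to one, every row of the implicit attention map still sums to one, as in ordinary attention.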