We propose the structure and color based learned image codec (SLIC), in which the task of compression is split between luminance and chrominance. The deep learning model is built with a novel multi-scale architecture for the Y and UV channels in the encoder, where features from various stages are combined to obtain the latent representation. An autoregressive context model is employed for backward adaptation, and a hyperprior block for forward adaptation. Various experiments are carried out to study and analyze the performance of the proposed model and to compare it with other image codecs. We also illustrate the advantages of our method through visualizations of channel impulse responses and latent channels, as well as various ablation studies. The model achieves Bj{\o}ntegaard delta bitrate gains of 7.5% and 4.66% in terms of the MS-SSIM and CIEDE2000 metrics, respectively, with respect to state-of-the-art reference codecs.