This paper proposes a transformer-based learned image compression system that achieves variable-rate compression with a single model while supporting region-of-interest (ROI) coding. Inspired by prompt tuning, we introduce prompt generation networks to condition the transformer-based autoencoder used for compression. These networks generate content-adaptive tokens from the input image, an ROI mask, and a rate parameter. Because the ROI mask and the rate parameter are separate inputs, variable-rate and ROI coding can be controlled simultaneously and intuitively. Extensive experiments validate the effectiveness of the proposed method and confirm its superiority over competing methods.
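As a rough illustration of the conditioning idea, the toy sketch below builds one prompt token per image patch from three separate inputs: the image, an ROI mask, and a scalar rate parameter. It is not the architecture of the paper. The function names (`patchify_mean`, `generate_prompts`), the patch averaging, and the fixed random projection standing in for learned weights are all assumptions made for illustration only.

```python
import random


def patchify_mean(plane, patch):
    """Average a 2-D list over non-overlapping patch x patch blocks.

    Assumes the height and width of `plane` are multiples of `patch`.
    """
    height, width = len(plane), len(plane[0])
    means = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            block = [plane[y][x]
                     for y in range(top, top + patch)
                     for x in range(left, left + patch)]
            means.append(sum(block) / len(block))
    return means


def generate_prompts(image, roi_mask, rate, patch=2, dim=4, seed=0):
    """Return one `dim`-dimensional prompt token per patch (hypothetical).

    Each token is a linear projection of (patch mean, ROI coverage, rate).
    """
    rng = random.Random(seed)
    # Stand-in for learned weights: a fixed random (dim x 3) projection.
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(dim)]
    pixel_means = patchify_mean(image, patch)
    roi_coverage = patchify_mean(roi_mask, patch)
    tokens = []
    for pixel, coverage in zip(pixel_means, roi_coverage):
        condition = (pixel, coverage, rate)  # the three separate inputs
        tokens.append([sum(w * c for w, c in zip(row, condition))
                       for row in weights])
    return tokens


if __name__ == "__main__":
    image = [[0.1 * (x + y) for x in range(4)] for y in range(4)]
    mask = [[0, 0, 0, 0] for _ in range(4)]
    mask[0][0] = 1  # mark one pixel of the top-left patch as ROI
    for token in generate_prompts(image, mask, rate=0.5):
        print([round(value, 3) for value in token])
```

In this toy version, editing the mask inside one patch changes only that patch's token, while changing `rate` shifts every token. This mirrors, in a much simplified form, why keeping the ROI mask and the rate parameter as separate inputs lets the two be adjusted independently.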