This paper proposes a transformer-based learned image compression system that achieves variable-rate compression with a single model while supporting region-of-interest (ROI) coding. Inspired by prompt tuning, we introduce prompt generation networks to condition the transformer-based autoencoder used for compression. These networks generate content-adaptive tokens from the input image, an ROI mask, and a rate parameter. Keeping the ROI mask and the rate parameter as separate inputs provides an intuitive way to achieve variable-rate and ROI coding simultaneously. Extensive experiments validate the effectiveness of the proposed method and show that it outperforms competing methods.
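To make the conditioning idea concrete, the following is a minimal sketch of how prompt tokens might be derived from the three inputs named in the abstract: the image, an ROI mask, and a rate parameter. Everything here is an illustrative assumption rather than the paper's architecture: the function names (`patch_means`, `make_projection`, `generate_prompts`, `prepend_prompts`), the use of per-patch averages as features, and the fixed random linear map standing in for the learned prompt generation networks are all hypothetical.

```python
import random


def patch_means(grid, patch):
    """Average each non-overlapping patch x patch block of a 2-D list."""
    height, width = len(grid), len(grid[0])
    return [
        [
            sum(grid[y][x]
                for y in range(py, py + patch)
                for x in range(px, px + patch)) / patch ** 2
            for px in range(0, width, patch)
        ]
        for py in range(0, height, patch)
    ]


def make_projection(in_dim, out_dim, seed=0):
    """Hypothetical stand-in for a learned layer: a fixed random linear map."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]


def generate_prompts(image, roi_mask, rate, weights, patch=2):
    """Return one prompt token (a list of floats) per patch, in row-major order.

    Each token depends on the local image content, the local ROI value,
    and the global rate parameter. The ROI mask and the rate stay separate
    inputs, so either can be changed without touching the other.
    """
    image_features = patch_means(image, patch)
    roi_features = patch_means(roi_mask, patch)
    prompts = []
    for image_row, roi_row in zip(image_features, roi_features):
        for pixel_mean, roi_mean in zip(image_row, roi_row):
            condition = [pixel_mean, roi_mean, rate]
            token = [sum(w * c for w, c in zip(row, condition)) for row in weights]
            prompts.append(token)
    return prompts


def prepend_prompts(prompts, patch_tokens):
    """Put each patch's prompt token in front of that patch's own tokens."""
    return [[prompt] + tokens for prompt, tokens in zip(prompts, patch_tokens)]


# Toy 4x4 grayscale image and an ROI mask marking the top-left patch.
image = [
    [0.1, 0.2, 0.8, 0.9],
    [0.1, 0.3, 0.7, 0.9],
    [0.5, 0.5, 0.4, 0.4],
    [0.5, 0.5, 0.4, 0.4],
]
roi_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
weights = make_projection(in_dim=3, out_dim=4)

# Same image and ROI mask, two different rate settings.
low_rate_prompts = generate_prompts(image, roi_mask, rate=0.1, weights=weights)
high_rate_prompts = generate_prompts(image, roi_mask, rate=0.9, weights=weights)
```

In this sketch, changing `rate` shifts every prompt token, while changing `roi_mask` only alters the tokens of the patches whose mask values change. That is one simple way to picture why separate rate and ROI inputs can be adjusted independently at inference time with a single model.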