JPEG remains one of the most widespread lossy image coding methods. However, its non-differentiable nature restricts its application in deep learning pipelines. Several differentiable approximations of JPEG have recently been proposed to address this issue. This paper presents a comprehensive review of existing differentiable JPEG approaches and identifies critical details that previous methods have missed. Building on this review, we propose a novel differentiable JPEG approach that overcomes these limitations. Our approach is differentiable w.r.t. the input image, the JPEG quality, the quantization tables, and the color conversion parameters. We evaluate the forward and backward performance of our differentiable JPEG approach against existing methods. Additionally, we perform extensive ablations to evaluate crucial design choices. Our proposed differentiable JPEG most closely resembles the (non-differentiable) reference implementation, surpassing the best recent differentiable approach by $3.47$ dB (PSNR) on average. For strong compression rates, the PSNR improvement reaches $9.51$ dB. Our differentiable JPEG also yields strong adversarial attack results, demonstrating its effective gradient approximation. Our code is available at https://github.com/necla-ml/Diff-JPEG.
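To put the reported PSNR gains in perspective, the following is a minimal sketch of the standard PSNR definition, assuming 8-bit images with a peak value of 255. The function names `psnr_from_mse` and `mse_ratio_from_gain` are hypothetical helpers introduced only for this illustration and are not part of the released code. Because PSNR is logarithmic in the mean squared error (MSE), a fixed gain in dB corresponds to a fixed factor by which the MSE shrinks.

```python
import math


def psnr_from_mse(mse: float, max_value: float = 255.0) -> float:
    """Return PSNR in dB for a given MSE (assumption: peak value of 255)."""
    if mse <= 0:
        raise ValueError("mse must be positive; identical images have infinite PSNR")
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_from_gain(gain_db: float) -> float:
    """Return the factor by which the MSE shrinks for a PSNR gain of gain_db."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    # The two gains quoted in the abstract.
    for gain in (3.47, 9.51):
        ratio = mse_ratio_from_gain(gain)
        print(f"+{gain} dB PSNR -> MSE reduced by about {ratio:.2f}x")
```

Under these assumptions, the average gain of $3.47$ dB corresponds to roughly halving the MSE with respect to the reference implementation, and the $9.51$ dB gain to a reduction of almost an order of magnitude.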