Neural compression has brought tremendous progress in designing lossy compressors with good rate-distortion (RD) performance at low complexity. Thus far, neural compressor design has involved transforming the source to a latent vector, which is then rounded to integers and entropy coded. While this approach has been shown to be optimal in a one-shot sense on certain sources, we show that it is highly sub-optimal on i.i.d. sequences, and in fact always recovers scalar quantization of the original source sequence. We demonstrate that the sub-optimality is due to the choice of quantization scheme in the latent space, and not the transform design. By employing lattice quantization instead of scalar quantization in the latent space, we demonstrate that Lattice Transform Coding (LTC) is able to recover optimal vector quantization at various dimensions and approach the asymptotically achievable rate-distortion function at reasonable complexity. On general vector sources, LTC improves upon standard neural compressors in one-shot coding performance. LTC also enables neural compressors that perform block coding on i.i.d. vector sources, which yields coding gain over optimal one-shot coding.
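To make the distinction between integer rounding and lattice quantization concrete, the sketch below contrasts coordinate-wise rounding (quantization onto the integer lattice Z^n) with nearest-point quantization onto the checkerboard lattice D_n, using the classical rounding rule of Conway and Sloane. The choice of D_n and the function names `quantize_integer` and `quantize_dn` are assumptions made purely for illustration; the abstract does not state which lattices LTC uses, and the learned transform and entropy coder are omitted here.

```python
import math


def quantize_integer(x):
    """Scalar quantization: round each coordinate independently (lattice Z^n).

    Uses floor(v + 0.5) so that ties are broken deterministically.
    """
    return [math.floor(v + 0.5) for v in x]


def quantize_dn(x):
    """Nearest point of D_n, the integer vectors whose coordinates sum to an even number.

    Illustrative assumption: D_n stands in for whatever lattice a real system would use.
    """
    f = quantize_integer(x)
    if sum(f) % 2 == 0:
        return f  # plain rounding already landed on a D_n point
    # Odd sum: take the coordinate with the largest rounding error and
    # round it the other way, which restores an even coordinate sum.
    k = max(range(len(x)), key=lambda i: abs(x[i] - f[i]))
    g = list(f)
    g[k] += 1 if x[k] > f[k] else -1
    return g


if __name__ == "__main__":
    latent = [0.6, 0.2]  # hypothetical latent vector
    print(quantize_integer(latent))  # [1, 0]
    print(quantize_dn(latent))       # [0, 0]
```

In this example the two quantizers disagree: integer rounding treats each coordinate in isolation, whereas the lattice quantizer decides on the vector as a whole, which is the property that lets lattice quantization in the latent space go beyond scalar quantization.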