Many common types of data can be represented as functions that map coordinates to signal values, such as pixel locations to RGB values in the case of an image. Based on this view, data can be compressed by overfitting a compact neural network to its functional representation and then encoding the network weights. However, most current approaches are inefficient, because quantizing the weights to low-bit precision substantially degrades reconstruction quality. To address this issue, we propose overfitting variational Bayesian neural networks to the data and compressing an approximate posterior weight sample using relative entropy coding, instead of quantizing the weights and entropy coding them. This strategy enables direct optimization of the rate-distortion performance by minimizing the $\beta$-ELBO, and makes it possible to target different rate-distortion trade-offs for a given network architecture by adjusting $\beta$. Moreover, we introduce an iterative algorithm for learning prior weight distributions and employ a progressive refinement process for the variational posterior that significantly enhances performance. Experiments show that our method achieves strong performance on image and audio compression while retaining simplicity.
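To make the role of $\beta$ concrete, the following is a minimal sketch of a $\beta$-ELBO-style objective, assuming the common form "expected distortion plus $\beta$ times $D_{\mathrm{KL}}(q \,\|\, p)$" with factorized Gaussian prior and posterior over the weights. The function names (`gaussian_kl_nats`, `beta_elbo`), the mean-squared-error distortion, and the toy linear "network" are illustrative assumptions and not the architecture or exact objective used in the paper; relative entropy coding itself is not implemented here, and the KL term only stands in for the rate.

```python
import numpy as np


def gaussian_kl_nats(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) between factorized Gaussians, summed over all weights (nats)."""
    var_ratio = (sigma_q / sigma_p) ** 2
    mean_term = ((mu_q - mu_p) / sigma_p) ** 2
    return 0.5 * np.sum(var_ratio + mean_term - 1.0 - np.log(var_ratio))


def beta_elbo(coords, signal, mu_q, sigma_q, mu_p, sigma_p, beta, rng):
    """Single-sample estimate of distortion + beta * rate.

    Assumption: the "network" is a toy linear map from coordinates to
    signal values, used only to keep the sketch short.
    """
    # Reparameterized sample of the weights from the variational posterior q.
    weights = mu_q + sigma_q * rng.standard_normal(mu_q.shape)
    prediction = coords @ weights
    # Distortion: mean squared error between reconstruction and signal.
    distortion = np.mean((prediction - signal) ** 2)
    # Rate proxy: KL divergence from the prior to the posterior.
    rate = gaussian_kl_nats(mu_q, sigma_q, mu_p, sigma_p)
    return distortion + beta * rate, distortion, rate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords = rng.standard_normal((16, 3))           # hypothetical coordinates
    signal = coords @ np.array([1.0, -2.0, 0.5])    # hypothetical signal values
    mu_p, sigma_p = np.zeros(3), np.ones(3)         # prior
    mu_q, sigma_q = np.array([1.0, -2.0, 0.5]), np.full(3, 0.1)  # posterior
    for beta in (1e-3, 1e-1):
        loss, distortion, rate = beta_elbo(
            coords, signal, mu_q, sigma_q, mu_p, sigma_p, beta, rng
        )
        print(f"beta={beta:g} loss={loss:.4f} distortion={distortion:.4f} rate={rate:.4f} nats")
```

In this sketch a larger `beta` penalizes the KL term more heavily, so minimizing the objective would push the posterior toward the prior (lower rate) at the cost of higher distortion, which is the trade-off the abstract describes adjusting through $\beta$.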