Many common types of data can be represented as functions that map coordinates to signal values, such as pixel locations to RGB values in the case of an image. Based on this view, data can be compressed by overfitting a compact neural network to its functional representation and then encoding the network weights. However, most current approaches of this kind are inefficient, because quantizing the weights to low-bit precision substantially degrades reconstruction quality. To address this issue, we propose overfitting variational Bayesian neural networks to the data and compressing an approximate posterior weight sample using relative entropy coding instead of quantizing and entropy coding it. This strategy enables direct optimization of rate-distortion performance by minimizing the $\beta$-ELBO, and makes it possible to target different rate-distortion trade-offs for a given network architecture by adjusting $\beta$. Moreover, we introduce an iterative algorithm for learning prior weight distributions and employ a progressive refinement process for the variational posterior that significantly enhances performance. Experiments show that our method achieves strong performance on image and audio compression while retaining simplicity.
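
To make the role of $\beta$ concrete, the following is a minimal sketch of a $\beta$-ELBO-style objective, not the implementation used in this work. It assumes a factorized Gaussian posterior and prior over the weights and a loss of the form distortion plus $\beta$ times the KL divergence. Neither assumption is stated in the abstract. The function names `gaussian_kl` and `beta_elbo` and all numeric values are hypothetical. The KL term serves as a proxy for the coding cost, since relative entropy coding is generally understood to encode a posterior sample at a cost close to the KL divergence plus a small overhead.

```python
import numpy as np

def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) in nats between factorized Gaussians, summed over all weights.

    Assumption: both posterior q and prior p are diagonal Gaussians.
    """
    var_ratio = (sigma_q / sigma_p) ** 2
    mean_term = ((mu_q - mu_p) / sigma_p) ** 2
    return 0.5 * np.sum(var_ratio + mean_term - 1.0 - np.log(var_ratio))

def beta_elbo(distortion, kl_nats, beta):
    """Assumed objective: distortion plus beta-weighted rate proxy (the KL term)."""
    return distortion + beta * kl_nats

# Hypothetical toy posterior over three weights; the prior is a standard normal.
mu_q = np.array([0.5, -0.2, 0.0])
sigma_q = np.array([0.1, 0.2, 1.0])
mu_p = np.zeros(3)
sigma_p = np.ones(3)

kl = gaussian_kl(mu_q, sigma_q, mu_p, sigma_p)
rate_bits = kl / np.log(2)  # approximate coding cost in bits, ignoring overhead

# A larger beta penalizes rate more strongly for the same distortion value.
loss_low = beta_elbo(distortion=0.01, kl_nats=kl, beta=1e-4)
loss_high = beta_elbo(distortion=0.01, kl_nats=kl, beta=1e-2)
print(f"KL = {kl:.3f} nats (~{rate_bits:.3f} bits)")
print(f"loss at beta=1e-4: {loss_low:.5f}, at beta=1e-2: {loss_high:.5f}")
```

In this sketch the third weight has a posterior equal to its prior, so it contributes nothing to the KL term. This illustrates how weights that the data does not constrain add no coding cost under this kind of objective.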