Many common types of data can be represented as functions that map coordinates to signal values, such as pixel locations to RGB values in the case of an image. Based on this view, data can be compressed by overfitting a compact neural network to its functional representation and then encoding the network weights. However, most current solutions of this kind are inefficient, because quantizing the weights to low-bit precision substantially degrades reconstruction quality. To address this issue, we propose overfitting variational Bayesian neural networks to the data and compressing an approximate posterior weight sample using relative entropy coding, instead of quantizing and entropy coding it. This strategy enables direct optimization of rate-distortion performance by minimizing the $\beta$-ELBO, and makes it possible to target different rate-distortion trade-offs for a given network architecture by adjusting $\beta$. Moreover, we introduce an iterative algorithm for learning prior weight distributions and employ a progressive refinement process for the variational posterior that significantly enhances performance. Experiments show that our method achieves strong performance on image and audio compression while retaining simplicity.
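To make the role of $\beta$ concrete, the following is a minimal sketch of a $\beta$-ELBO-style objective, not the implementation used in this work. It assumes a factorized Gaussian variational posterior and prior over the weights and takes the objective to be the distortion plus $\beta$ times the KL divergence between posterior and prior; the abstract does not state the exact form, so both are assumptions. The names `gaussian_kl_nats`, `beta_elbo_loss`, and `nats_to_bits` are hypothetical helpers introduced for illustration.

```python
import numpy as np


def gaussian_kl_nats(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) in nats between factorized Gaussians, summed over all weights.

    Assumption: both posterior q and prior p are diagonal Gaussians.
    """
    mu_q, sigma_q, mu_p, sigma_p = map(np.asarray, (mu_q, sigma_q, mu_p, sigma_p))
    var_ratio = (sigma_q / sigma_p) ** 2
    mean_term = ((mu_q - mu_p) / sigma_p) ** 2
    return 0.5 * float(np.sum(var_ratio + mean_term - 1.0 - np.log(var_ratio)))


def beta_elbo_loss(distortion, kl_nats, beta):
    """Assumed objective: distortion plus the beta-weighted KL (rate) term."""
    return distortion + beta * kl_nats


def nats_to_bits(nats):
    """Convert a KL value from nats to bits."""
    return nats / np.log(2.0)
```

Under these assumptions, the KL term acts as a proxy for the rate, since relative entropy coding spends roughly the KL divergence (in bits, plus a small overhead) to transmit a posterior sample. A larger $\beta$ therefore pushes the posterior toward the prior, lowering the rate at the cost of higher distortion, and a smaller $\beta$ does the opposite.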