We consider lossy compression of an information source when the decoder has lossless access to a correlated source as side information. This setup, known as the Wyner-Ziv problem, is a special case of distributed source coding. To this day, practical approaches to the Wyner-Ziv problem have been neither fully developed nor heavily investigated. We propose a data-driven method based on machine learning that leverages the universal function approximation capability of artificial neural networks. For exemplary sources, we find that our neural compression scheme, which is based on variational vector quantization, recovers some principles of the optimal theoretical solution of the Wyner-Ziv setup: binning in the source space and optimal combination of the quantization index with the side information. These behaviors emerge even though no structure exploiting knowledge of the source distributions was imposed. Binning is a widely used tool in information-theoretic proofs and methods, and to our knowledge, this is the first time it has been explicitly observed to emerge from data-driven learning.
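To make the notion of binning concrete, the following hand-crafted scalar toy example may help. It is not the learned scheme proposed here, only a minimal sketch under stated assumptions: a uniform quantizer with cell width `STEP`, and `NUM_BINS` bin indices, both hypothetical values chosen for illustration. The encoder sends only the quantizer cell index modulo `NUM_BINS`, so distant cells share one index, and the decoder uses the side information to pick the right cell within the bin.

```python
# Toy scalar binning for the Wyner-Ziv setup (illustrative assumptions only).
STEP = 0.5      # assumed width of one quantizer cell
NUM_BINS = 4    # assumed number of distinct indices the encoder may send


def encode(x: float) -> int:
    """Quantize x uniformly, then keep only the cell index modulo NUM_BINS."""
    cell = round(x / STEP)
    # Python's % returns a non-negative result for a positive modulus,
    # so negative cells also map into range(NUM_BINS).
    return cell % NUM_BINS


def decode(bin_index: int, y: float) -> float:
    """Return the reconstruction point of the bin that lies closest to y."""
    # The cells in this bin are bin_index + k * NUM_BINS for integer k.
    k = round((y / STEP - bin_index) / NUM_BINS)
    cell = bin_index + k * NUM_BINS
    return cell * STEP


if __name__ == "__main__":
    x, y = 3.3, 3.9            # source sample and correlated side information
    index = encode(x)          # only this index is transmitted
    print(index, decode(index, y))
```

In this sketch the decoder recovers the correct cell whenever `abs(x - y)` is below `0.75` (that is, `STEP * (NUM_BINS - 1) / 2`), in which case the reconstruction error is at most `STEP / 2`. The rate is fixed by `NUM_BINS` rather than by the range of the source, which is the saving that binning provides.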