In this paper, we study a remote source coding scenario in which binary phase-shift keying (BPSK) modulated sources are corrupted by additive white Gaussian noise (AWGN). An intermediate node, such as a relay, receives these noisy observations and compresses them further to balance complexity against relevance. This problem can be formulated as an information bottleneck (IB) problem with Bernoulli sources and Gaussian mixture observations, for which no closed-form solution exists. To address this challenge, we propose a unified achievable scheme that employs three different compression/quantization strategies at the intermediate node: two-level quantization, multi-level deterministic quantization, and soft quantization with the hyperbolic tangent ($\tanh$) function. In addition, we extend our analysis to vector Gaussian mixture observations and explore its application in machine learning to binary classification with information leakage. Numerical evaluations show that the proposed scheme achieves near-optimal performance over a range of signal-to-noise ratios (SNRs) when compared with the Blahut-Arimoto (BA) algorithm, and that it outperforms existing numerical methods such as the information dropout approach. Experiments on the MNIST dataset further show that our method achieves higher classification accuracy than the information dropout approach.
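As a rough illustration of the two-level quantization strategy mentioned above, the following sketch computes the relevance $I(X;T)$ for a sign quantizer. It is a minimal example under assumptions that are not taken from the paper: the source $X$ is uniform on $\{-1,+1\}$, the observation is $Y = X + N$ with $N \sim \mathcal{N}(0,\sigma^2)$, the SNR is defined as $1/\sigma^2$, and the quantizer threshold is fixed at zero so that $T = \mathrm{sign}(Y)$. Under these assumptions the pair $(X,T)$ behaves like a binary symmetric channel with crossover probability $Q(\sqrt{\mathrm{SNR}})$, and the compression cost is $I(Y;T) = H(T) = 1$ bit. The function names are hypothetical.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = P(Z > x) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def binary_entropy(p: float) -> float:
    """Binary entropy h2(p) in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def two_level_relevance(snr_db: float) -> float:
    """Relevance I(X;T) in bits for T = sign(Y).

    Assumptions (illustrative, not from the paper): X is uniform on
    {-1, +1}, Y = X + N with N ~ N(0, sigma^2), and SNR = 1 / sigma^2.
    The sign quantizer flips X with probability Q(sqrt(SNR)), so
    I(X;T) = 1 - h2(Q(sqrt(SNR))).
    """
    snr_linear = 10.0 ** (snr_db / 10.0)
    crossover = q_function(math.sqrt(snr_linear))
    return 1.0 - binary_entropy(crossover)


if __name__ == "__main__":
    # Print the relevance achieved with a 1-bit representation at a few SNRs.
    for snr_db in (-10.0, 0.0, 5.0, 10.0):
        print(f"SNR = {snr_db:6.1f} dB  ->  I(X;T) = {two_level_relevance(snr_db):.4f} bits")
```

Sweeping the SNR in this way gives one point on the relevance-compression plane per SNR value (at a compression cost of 1 bit), which is the kind of quantity one would compare against a numerically computed IB curve.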