We study channel simulation and distributed matching, two fundamental problems with several applications to machine learning, using a recently introduced generalization of the standard rejection sampling (RS) algorithm known as Ensemble Rejection Sampling (ERS). For channel simulation, we propose a new coding scheme based on ERS that achieves a near-optimal coding rate. In the process, we show that standard RS can also achieve a near-optimal coding rate, and we generalize the result of Braverman and Garg (2014) to the continuous-alphabet setting. Next, as our main contribution, we present a distributed matching lemma for ERS, which serves as the rejection sampling counterpart to the Poisson Matching Lemma (PML) introduced by Li and Anantharam (2021). Our result also generalizes recent work on the importance matching lemma (Phan et al., 2024) and, to our knowledge, is the first distributed matching result in the family of rejection sampling schemes whose matching probability is close to that of the PML. We demonstrate the practical significance of our approach over prior work by applying it to distributed compression. The effectiveness of the proposed scheme is validated through experiments on synthetic Gaussian sources and on distributed image compression with the MNIST dataset.
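For readers unfamiliar with the baseline, the following is a minimal sketch of standard rejection sampling on a finite alphabet. It is not the paper's coding scheme and does not implement ERS. The function name `rejection_sample`, the toy distributions `p` and `q`, and the choice to return the index of the accepted proposal are all assumptions made for illustration. The index is returned because, in a channel simulation setting with a shared proposal sequence, it is the natural quantity for an encoder to communicate.

```python
import random

def rejection_sample(p, q, rng):
    """Draw one sample from target p using proposals drawn from q.

    p, q: dicts mapping symbols to probabilities; q[x] must be positive
    for every symbol (hypothetical toy interface, not from the paper).
    Returns (sample, index), where index is the 1-based position of the
    accepted proposal in the proposal sequence.
    """
    symbols = list(q)
    weights = [q[x] for x in symbols]
    # Smallest constant M such that p(x) <= M * q(x) for every symbol.
    bound = max(p.get(x, 0.0) / q[x] for x in symbols)
    index = 0
    while True:
        index += 1
        x = rng.choices(symbols, weights=weights)[0]
        # Accept x with probability p(x) / (M * q(x)).
        if rng.random() * bound * q[x] < p.get(x, 0.0):
            return x, index

rng = random.Random(0)
p = {"a": 0.7, "b": 0.2, "c": 0.1}      # target distribution
q = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}  # proposal distribution
draws = [rejection_sample(p, q, rng) for _ in range(20000)]
freq_a = sum(1 for x, _ in draws if x == "a") / len(draws)
mean_index = sum(k for _, k in draws) / len(draws)
```

In this toy setup the bound is M = 0.7 / (1/3) = 2.1, so the accepted index is geometrically distributed with mean M, and the empirical frequency of `"a"` should be close to 0.7.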