We study a relaxation of the problem of coupling probability distributions: a list of samples is generated from one distribution, and an accept is declared if any one of these samples is identical to the sample generated from the other distribution. We propose a novel method for generating samples, which extends the Gumbel-max sampling suggested in Daliri et al. (arXiv:2408.07978) for coupling probability distributions. We also establish a corresponding lower bound on the acceptance probability, which we call the list matching lemma. We next discuss two applications of our setup. First, we develop a new mechanism for multi-draft speculative sampling that is simple to implement and achieves performance competitive with baselines such as SpecTr and SpecInfer across a range of language tasks. Our method also guarantees a certain degree of drafter invariance with respect to the output tokens, which existing schemes do not support. We also provide a theoretical lower bound on the token-level acceptance probability. As our second application, we consider distributed lossy compression with side information in a setting where a source sample is compressed and made available to multiple decoders, each with independent side information. We propose a compression technique that is based on our generalization of Gumbel-max sampling and show that it provides significant gains in experiments involving synthetic Gaussian sources and the MNIST image dataset.
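To make the list-matching setup concrete, the following is a minimal sketch of one way a Gumbel-max style coupling could be extended to a list of K drafts. It is an illustration under stated assumptions, not necessarily the exact construction of the paper: shared randomness is modeled as an array of independent Exp(1) variables (equivalent to Gumbel noise after a log transform), and the names `gumbel_max_list_sample` and `acceptance_rate` are hypothetical helpers introduced here. Each draft is drawn from `p` using its own row of the shared randomness, while the single target sample is drawn from `q` using the per-symbol minimum over all rows, so both marginals are preserved.

```python
import numpy as np


def gumbel_max_list_sample(p, q, K, rng):
    """Draw K draft samples from p and one target sample from q
    using shared exponential randomness (illustrative sketch)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Shared randomness: one Exp(1) variable per (list slot, symbol).
    S = rng.exponential(size=(K, p.size))
    with np.errstate(divide="ignore"):
        # Draft k picks argmin_i S[k, i] / p[i], which is distributed as p.
        drafts = np.argmin(S / p, axis=1)
        # min_k S[k, i] is Exp(K), so this argmin is distributed as q.
        target = int(np.argmin(S.min(axis=0) / q))
    return drafts, target


def acceptance_rate(p, q, K, trials=20000, seed=0):
    """Monte Carlo estimate of P(target appears in the draft list)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        drafts, target = gumbel_max_list_sample(p, q, K, rng)
        hits += bool(np.any(drafts == target))
    return hits / trials


if __name__ == "__main__":
    p = [0.7, 0.2, 0.1]  # hypothetical draft distribution
    q = [0.2, 0.3, 0.5]  # hypothetical target distribution
    for K in (1, 2, 4, 8):
        print(K, acceptance_rate(p, q, K))
```

With K = 1 this reduces to the ordinary Gumbel-max coupling of two distributions. In this sketch, the empirical acceptance rate can be compared across list sizes to see how much a longer list helps for a given pair of distributions.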