Sampling from a dynamic discrete distribution means drawing an index with probability proportional to its weight, where the set of weights is mutable. Classical constant-time techniques such as the Alias Method are well suited to static distributions, but become expensive in dynamic settings because every update requires rebuilding auxiliary tables. Existing dynamic approaches, including Forest of Trees and BUcket Sampling (BUS), achieve reasonable practical performance, but their correctness depends on infinite-precision real arithmetic, and they produce meaningfully incorrect results when implemented on real hardware. We present EBUS (Exact BUcket Sampling), a dynamic sampler for finite-precision weights that is exact by construction: every returned index has probability exactly proportional to its represented weight. Our guarantees are proved in a word RAM model with bounded exponent range. In that model, our method achieves $O(1)$ worst-case expected sampling time, $O(1)$ amortized time to update a single weight, $O(n)$ space, and $O(n)$ construction time. We also provide an implementation for IEEE 64-bit floating-point weights and show experimentally that it is competitive with, and often faster than, several implementations of previous inexact methods while avoiding their numerical failure modes.
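To make the problem statement concrete, the following minimal Python sketch shows one way finite-precision arithmetic can corrupt a dynamic sampler. It is not EBUS, BUS, or Forest of Trees, and it is not necessarily the failure mode analyzed in the paper. `NaiveFloatSampler` and `exact_probabilities` are hypothetical names introduced only for this illustration. The baseline maintains its total weight incrementally in IEEE 64-bit floating point, and the exact target distribution is computed with rational arithmetic for comparison.

```python
import random
from fractions import Fraction


class NaiveFloatSampler:
    """Hypothetical baseline (not EBUS): linear scan with a running float total."""

    def __init__(self, weights):
        self.weights = list(weights)
        self.total = 0.0
        for w in self.weights:
            self.total += w  # each addition rounds to the nearest double

    def update(self, i, w):
        # Incremental maintenance: earlier rounding errors are never corrected.
        self.total += w - self.weights[i]
        self.weights[i] = w

    def sample(self, rng):
        u = rng.random() * self.total
        acc = 0.0
        for i, w in enumerate(self.weights):
            acc += w
            if u < acc:
                return i
        return len(self.weights) - 1  # fallback if rounding pushes u past the end


def exact_probabilities(weights):
    """Target distribution, computed exactly from the represented weights."""
    fracs = [Fraction(w) for w in weights]  # every finite double is a rational
    total = sum(fracs)
    return [f / total for f in fracs]


# The two unit weights are absorbed when added to 2**53 in double precision.
s = NaiveFloatSampler([2.0**53, 1.0, 1.0])
s.update(0, 0.0)  # the running total becomes 0.0, although the true sum is 2

rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(1000):
    counts[s.sample(rng)] += 1

print(counts)                          # [0, 1000, 0]
print(exact_probabilities(s.weights))  # [Fraction(0, 1), Fraction(1, 2), Fraction(1, 2)]
```

After the update, the represented weights are `[0.0, 1.0, 1.0]`, so indices 1 and 2 should each be returned with probability exactly 1/2. Because the stale running total is `0.0`, the naive sampler always returns index 1. An exact sampler in the sense described above must define its output distribution by the represented weights themselves rather than by rounded intermediate quantities.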