While gradient-based discrete samplers are effective at sampling from complex distributions, they are prone to becoming trapped in local minima, particularly in high-dimensional, multimodal discrete distributions, owing to the discontinuities inherent in these landscapes. To circumvent this issue, we combine parallel tempering, also known as replica exchange, with the discrete Langevin proposal and develop the Parallel Tempering enhanced Discrete Langevin Proposal (PTDLP), which simulates multiple chains at a series of temperatures. Significant energy differences prompt sample swaps, governed by a Metropolis criterion specifically designed for discrete sampling so that detailed balance is maintained. Additionally, we introduce an automatic scheme to determine the optimal temperature schedule and the number of chains, ensuring adaptability across diverse tasks with minimal tuning. Theoretically, we establish that our algorithm converges non-asymptotically to the target energy and mixes faster than a single chain. Empirical results further demonstrate the superiority of our method in sampling from complex, multimodal discrete distributions, including synthetic problems, restricted Boltzmann machines, and deep energy-based models.
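The replica-exchange swap move described above can be sketched as follows. This is a minimal illustration of the standard parallel tempering swap between adjacent temperature chains, not the paper's PTDLP implementation; the function name, the list-based state layout, and the use of a generic energy array are assumptions for illustration.

```python
import math
import random

def swap_step(states, energies, betas, rng=random):
    """One replica-exchange sweep over adjacent chain pairs.

    states[k] and energies[k] belong to the chain at inverse
    temperature betas[k] (betas sorted from cold to hot). A swap
    between chains i and i+1 is accepted with probability
    min(1, exp((beta_i - beta_{i+1}) * (E_i - E_{i+1}))),
    which preserves detailed balance on the joint product
    distribution of all chains.
    """
    for i in range(len(states) - 1):
        log_alpha = (betas[i] - betas[i + 1]) * (energies[i] - energies[i + 1])
        if rng.random() < math.exp(min(0.0, log_alpha)):
            # Exchange configurations (and their cached energies)
            # between the two temperature levels.
            states[i], states[i + 1] = states[i + 1], states[i]
            energies[i], energies[i + 1] = energies[i + 1], energies[i]
    return states, energies
```

In a full sampler, each chain would first advance with its own within-chain proposal (here, the discrete Langevin proposal at its temperature) before the swap sweep; a large energy gap between a cold chain and a hot chain makes `log_alpha` large and the swap nearly certain, which is how stuck cold chains escape local minima.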