Despite their appeal as physics-inspired, energy-based, and generative models, general Boltzmann Machines (BMs) are considered intractable to train. This belief led to simplified models of BMs with restricted intralayer connections or to layer-by-layer training of deep BMs. Recent developments in domain-specific hardware, specifically probabilistic computers (p-computers) with probabilistic bits (p-bits), may change established wisdom on the tractability of deep BMs. In this paper, we show that deep and unrestricted BMs can be trained using p-computers generating hundreds of billions of Markov Chain Monte Carlo (MCMC) samples per second, on sparse networks developed originally for use in D-Wave's annealers. To maximize the efficiency of learning with the p-computer, we introduce two families of Mean-Field Theory assisted learning algorithms, or xMFTs (x = Naive and Hierarchical). The xMFTs are used to estimate the averages and correlations during the positive phase of the contrastive divergence (CD) algorithm, and our custom-designed p-computer is used to estimate the averages and correlations in the negative phase. A custom Field-Programmable Gate Array (FPGA) emulation of the p-computer architecture performs up to 45 billion flips per second, allowing the implementation of CD-$n$ where $n$ can be of the order of millions, unlike restricted BMs (RBMs), where $n$ is typically 1 or 2. Experiments on the full MNIST dataset with the combined algorithm show that the positive phase can be efficiently computed by xMFTs without much degradation when the negative phase is computed by the p-computer. Our algorithm can be used in other scalable Ising machines, and its variants can be used to train BMs previously thought to be intractable.
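The split between the two phases can be illustrated with a minimal sketch, which is not the implementation used in the paper. It assumes bipolar ($\pm 1$) units, a symmetric weight matrix with zero diagonal, the standard BM learning rule (data statistics minus model statistics), and a factorized naive mean-field estimate $\langle s_i s_j \rangle \approx m_i m_j$ for the positive phase. A software Gibbs sampler stands in for the p-computer in the negative phase, and the Hierarchical variant is not shown. All function and parameter names (`naive_mean_field`, `gibbs_sweep`, `train_step`, `edges`, `burn_in`) are hypothetical.

```python
import math
import random


def naive_mean_field(W, h, clamped, n_iter=50):
    """Iterate m_i = tanh(h_i + sum_j W_ij m_j), keeping clamped units fixed."""
    n = len(h)
    m = [float(clamped.get(i, 0.0)) for i in range(n)]
    for _ in range(n_iter):
        for i in range(n):
            if i not in clamped:
                m[i] = math.tanh(h[i] + sum(W[i][j] * m[j] for j in range(n)))
    return m


def gibbs_sweep(W, h, s, rng):
    """One sequential sweep of p-bit style updates over +/-1 states, in place."""
    n = len(h)
    for i in range(n):
        field = h[i] + sum(W[i][j] * s[j] for j in range(n))
        # With r uniform in [-1, 1], P(s_i = +1) = (1 + tanh(field)) / 2.
        s[i] = 1 if rng.uniform(-1.0, 1.0) < math.tanh(field) else -1


def train_step(W, h, edges, visible, data, rng, lr=0.05, n_sweeps=200, burn_in=50):
    """One update of biases h and weights W, restricted to the sparse edge list."""
    n = len(h)

    # Positive phase: clamp each data vector on the visible units and
    # estimate averages and correlations with naive mean-field (assumption).
    pos_m = [0.0] * n
    pos_c = {e: 0.0 for e in edges}
    for v in data:
        m = naive_mean_field(W, h, dict(zip(visible, v)))
        for i in range(n):
            pos_m[i] += m[i] / len(data)
        for (i, j) in edges:
            pos_c[(i, j)] += m[i] * m[j] / len(data)

    # Negative phase: free-running MCMC (software stand-in for the p-computer).
    s = [rng.choice((-1, 1)) for _ in range(n)]
    for _ in range(burn_in):
        gibbs_sweep(W, h, s, rng)
    neg_m = [0.0] * n
    neg_c = {e: 0.0 for e in edges}
    for _ in range(n_sweeps):
        gibbs_sweep(W, h, s, rng)
        for i in range(n):
            neg_m[i] += s[i] / n_sweeps
        for (i, j) in edges:
            neg_c[(i, j)] += s[i] * s[j] / n_sweeps

    # Learning rule: move parameters toward data statistics, away from model statistics.
    for i in range(n):
        h[i] += lr * (pos_m[i] - neg_m[i])
    for (i, j) in edges:
        W[i][j] += lr * (pos_c[(i, j)] - neg_c[(i, j)])
        W[j][i] = W[i][j]  # keep the coupling matrix symmetric
```

In this sketch `n_sweeps` plays the role of $n$ in CD-$n$: a software sampler makes large values expensive, which is the cost that a fast hardware sampler is meant to remove. The `edges` list is a list of index pairs, so only couplings present in the sparse graph are ever updated.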