Today, cheap numerical hardware offers huge amounts of parallel computing power, much of which is used for the task of fitting neural networks to data. Adoption of this hardware to accelerate statistical Markov chain Monte Carlo (MCMC) applications has been much slower. In this chapter, we suggest some patterns for speeding up MCMC workloads using the hardware (e.g., GPUs, TPUs) and software (e.g., PyTorch, JAX) that have driven progress in deep learning over the last fifteen years or so. We offer some intuitions for why these new systems are so well suited to MCMC, and show some examples (with code) where we use them to achieve dramatic speedups over a CPU-based workflow. Finally, we discuss some potential pitfalls to watch out for.
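To give a flavor of why this hardware and software stack suits MCMC so well, the sketch below runs thousands of independent random-walk Metropolis chains in parallel with JAX's `vmap` and `jit` transforms. The target density, step size, and chain counts here are illustrative assumptions, not taken from the examples later in the chapter.

```python
import jax
import jax.numpy as jnp

# Illustrative target: a standard normal log-density (an assumption for this sketch).
def log_prob(x):
    return -0.5 * x ** 2

def rwm_step(key, x, step_size=0.5):
    """One random-walk Metropolis step for a scalar chain state."""
    key_prop, key_accept = jax.random.split(key)
    proposal = x + step_size * jax.random.normal(key_prop)
    log_accept_ratio = log_prob(proposal) - log_prob(x)
    accept = jnp.log(jax.random.uniform(key_accept)) < log_accept_ratio
    return jnp.where(accept, proposal, x)

def run_chain(key, x0, num_steps=1000):
    """Run one chain for num_steps steps; lax.scan keeps the loop on-device."""
    keys = jax.random.split(key, num_steps)
    def body(x, k):
        x_new = rwm_step(k, x)
        return x_new, x_new
    _, samples = jax.lax.scan(body, x0, keys)
    return samples

# vmap vectorizes the whole chain over thousands of independent seeds,
# and jit compiles the result into fused kernels for the accelerator.
num_chains = 4096
keys = jax.random.split(jax.random.PRNGKey(0), num_chains)
x0 = jnp.zeros(num_chains)
samples = jax.jit(jax.vmap(run_chain))(keys, x0)  # shape (num_chains, num_steps)
```

On a GPU or TPU all 4096 chains advance in lockstep, so the cost per step is close to that of a single chain on a CPU; this "many chains for the price of one" pattern is one of the simplest ways accelerators pay off for MCMC.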