Recent advances in Markov chain Monte Carlo (MCMC) use normalizing flows (NFs) to precondition target distributions and to enable jumps to distant regions. However, there is currently no systematic comparison of NF architectures for MCMC. As a result, many works adopt simple, readily available flow architectures without considering alternatives. Guidelines for choosing an appropriate architecture would reduce analysis time for practitioners and encourage researchers to build on the recommended models. We provide the first such guideline by extensively evaluating many NF architectures across various flow-based MCMC methods and target distributions. When the gradient of the target density is available, we show that flow-based MCMC outperforms classic MCMC for suitable NF architecture choices with minor hyperparameter tuning. When the gradient is unavailable, flow-based MCMC outperforms classic MCMC with off-the-shelf architectures. We find contractive residual flows to be the best general-purpose models, with relatively low sensitivity to hyperparameter choice. We also provide insights into how NFs behave within MCMC as their hyperparameters, the properties of the target distribution, and the overall computational budget vary.
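To make the idea of a flow enabling jumps to distant regions concrete, the following is a minimal sketch of independence Metropolis-Hastings with a flow used as the global proposal. It is not the paper's implementation: the bimodal target `log_target`, the single affine layer `AffineFlow` (the simplest possible stand-in for a trained normalizing flow), and the function `flow_imh` are all hypothetical names chosen for illustration, and the flow parameters are fixed by hand rather than learned.

```python
import numpy as np


def log_target(x):
    # Assumed toy target: unnormalized equal mixture of N(-3, 1) and N(3, 1).
    return np.logaddexp(-0.5 * (x + 3.0) ** 2, -0.5 * (x - 3.0) ** 2)


class AffineFlow:
    """Hypothetical one-layer flow: x = shift + scale * z with z ~ N(0, 1)."""

    def __init__(self, shift, scale):
        self.shift = shift
        self.scale = scale

    def sample(self, rng):
        # Push a base sample through the forward map.
        return self.shift + self.scale * rng.standard_normal()

    def log_prob(self, x):
        # Change of variables: base log-density minus log|det Jacobian|.
        z = (x - self.shift) / self.scale
        return -0.5 * z**2 - 0.5 * np.log(2.0 * np.pi) - np.log(self.scale)


def flow_imh(log_target, flow, n_steps, x0, seed=0):
    """Independence Metropolis-Hastings with the flow as proposal."""
    rng = np.random.default_rng(seed)
    x = x0
    log_w = log_target(x) - flow.log_prob(x)  # log importance weight of current state
    samples = np.empty(n_steps)
    n_accept = 0
    for i in range(n_steps):
        y = flow.sample(rng)  # proposal does not depend on x
        log_w_y = log_target(y) - flow.log_prob(y)
        # Accept with probability min(1, w(y) / w(x)).
        if np.log(rng.uniform()) < log_w_y - log_w:
            x, log_w = y, log_w_y
            n_accept += 1
        samples[i] = x
    return samples, n_accept / n_steps


# Start in the left mode; the wide proposal lets the chain reach the right mode.
samples, accept_rate = flow_imh(log_target, AffineFlow(0.0, 4.0), 5000, x0=-3.0)
```

Because each proposal is drawn independently of the current state, the chain can move between the two modes in a single step, which a local random-walk proposal would struggle to do. In practice the hand-set affine map would be replaced by a trained flow, and the quality of that flow determines the acceptance rate.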