There is growing interest in sequential neural posterior estimation (SNPE) techniques because of their advantages for simulation-based models with intractable likelihoods. These methods aim to learn the posterior from adaptively proposed simulations using neural network-based conditional density estimators. As an SNPE technique, the automatic posterior transformation (APT) method proposed by Greenberg et al. (2019) performs well and scales to high-dimensional data. However, the APT method requires computing the expectation of the logarithm of an intractable normalizing constant, i.e., a nested expectation. Although atomic proposals make the normalizing constant analytically tractable, analyzing the convergence of learning remains challenging. In this paper, we reformulate APT as a nested estimation problem. Building on this, we construct several multilevel Monte Carlo (MLMC) estimators for the loss function and its gradients to accommodate different scenarios, including two unbiased estimators and a biased estimator that trades a small bias for reduced variance and controlled runtime and memory usage. We also provide convergence results for stochastic gradient descent that quantify the interplay between the bias and variance of the gradient estimator. Numerical experiments on approximating complex multimodal posteriors in moderate dimensions are provided to examine the effectiveness of the proposed methods.
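To make the nested-estimation viewpoint concrete, the following is a minimal sketch of the standard MLMC telescoping identity for a nested expectation; the notation ($Z(x)$, $r$, $P_\ell$, $N_\ell$, $p_\ell$) is assumed for illustration and is not taken from the paper itself. The intractable quantity is
\[
\mathbb{E}\!\left[\log Z(x)\right], \qquad Z(x) = \mathbb{E}_{\theta}\!\left[r(\theta, x)\right],
\]
which can be approximated through biased inner estimates with geometrically growing sample sizes,
\[
P_\ell = \log\!\Big(\frac{1}{N_\ell}\sum_{n=1}^{N_\ell} r(\theta_n, x)\Big), \qquad N_\ell = 2^{\ell} N_0,
\]
whose limit is telescoped across levels,
\[
\mathbb{E}[P_L] = \mathbb{E}[P_0] + \sum_{\ell=1}^{L} \mathbb{E}\!\left[P_\ell - P_{\ell-1}\right],
\]
with each coupled correction $P_\ell - P_{\ell-1}$ built on shared samples $\theta_n$ so that its variance decays as $\ell$ grows. Truncating at a fixed level $L$ gives a biased but variance- and cost-controlled estimator; a randomized (Rhee–Glynn-type) version instead draws a level $\ell$ with probability $p_\ell$ and returns $(P_\ell - P_{\ell-1})/p_\ell$, removing the truncation bias at the cost of a random amount of work.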