An essential problem in statistics and machine learning is the estimation of expectations involving probability density functions (PDFs) with intractable normalizing constants. The self-normalized importance sampling (SNIS) estimator, which normalizes the importance weights, has become the standard approach due to its simplicity. However, SNIS has been shown to exhibit high variance in challenging estimation problems, e.g., those involving rare events or posterior predictive distributions in Bayesian statistics. Further, most state-of-the-art adaptive importance sampling (AIS) methods adapt the proposal as if the weights had not been normalized. In this paper, we propose a framework that treats the original task as the estimation of a ratio of two integrals. In our new formulation, we obtain samples from a joint proposal distribution in an extended space, with two of its marginals playing the role of the proposals used to estimate each integral. Importantly, the framework allows us to induce and control a dependency between the two estimators. We propose a construction of the joint proposal that decomposes into two (multivariate) marginals and a coupling. This leads to a two-stage framework that can be integrated with existing or new AIS and/or variational inference (VI) algorithms. The marginals are adapted in the first stage, while the coupling can be chosen and adapted in the second stage. We demonstrate the benefits of the proposed methodology in several examples, including an application to Bayesian prediction with misspecified models.
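To make the ratio-of-integrals view concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a toy one-dimensional target (an unnormalized standard normal), the test function f(x) = x^2, Gaussian marginal proposals, and a simple Gaussian correlation `rho` as the coupling. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def log_target_unnorm(x):
    # Toy assumption: standard normal log-density without its normalizing constant.
    return -0.5 * x**2

def f(x):
    # Test function; under the standard normal target, E[f(X)] = 1.
    return x**2

def log_normal_pdf(x, mu, sigma):
    # Log-density of N(mu, sigma^2), used as the proposal density.
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)

def snis(rng, n, mu=0.0, sigma=2.0):
    # Standard SNIS: one proposal, the same weights in numerator and denominator.
    x = rng.normal(mu, sigma, size=n)
    logw = log_target_unnorm(x) - log_normal_pdf(x, mu, sigma)
    w = np.exp(logw - logw.max())  # shift for numerical stability; cancels in the ratio
    return np.sum(w * f(x)) / np.sum(w)

def coupled_ratio(rng, n, mu1=0.0, s1=2.0, mu2=0.0, s2=1.5, rho=0.9):
    # Ratio-of-integrals view: x1 ~ N(mu1, s1^2) targets the numerator integral,
    # x2 ~ N(mu2, s2^2) targets the denominator integral. The pair (x1, x2) is
    # drawn jointly, with correlation rho between the underlying standard normals.
    z1 = rng.standard_normal(n)
    z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    x1 = mu1 + s1 * z1
    x2 = mu2 + s2 * z2
    num = np.mean(f(x1) * np.exp(log_target_unnorm(x1) - log_normal_pdf(x1, mu1, s1)))
    den = np.mean(np.exp(log_target_unnorm(x2) - log_normal_pdf(x2, mu2, s2)))
    return num / den
```

Calling `snis(np.random.default_rng(0), 200_000)` and `coupled_ratio(np.random.default_rng(0), 200_000)` should both return values close to 1 for this toy target. In this sketch the two marginals and `rho` are fixed by hand; in the two-stage framework described above, the marginals would be adapted first and the coupling chosen or adapted afterwards.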