We study two adaptive importance sampling schemes for estimating the probability of a rare event in the high-dimensional regime $d \to \infty$ with $d$ the dimension. The first scheme, motivated by recent results, seeks to use as auxiliary distribution a projection of the optimal auxiliary distribution (optimal among Gaussian distributions, and in the sense of the Kullback--Leibler divergence); the second scheme is the prominent cross-entropy method. In these schemes, two samples are used: the first one to learn the auxiliary distribution and the second one, drawn according to the learnt distribution, to perform the final probability estimation. Contrary to the common belief that the sample size needs to grow exponentially in the dimension to make the estimator consistent and avoid the weight degeneracy phenomenon, we find that a polynomial sample size in the first learning step is enough. We prove this result assuming that the sought probability is bounded away from $0$. For the first scheme, we show that the sample size only needs to grow like $rd$ with $r$ the effective dimension of the projection, while for cross-entropy, the polynomial growth rate remains implicit although insight on its value is provided. In addition to proving consistency, we also prove that in the regimes studied, the importance sampling weights do not degenerate.
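To make the two-sample structure described in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a standard Gaussian input $X \sim N(0, I_d)$ and a hypothetical event $A = \{x : x_1 \ge 0.5\}$, whose probability (about $0.3085$) is bounded away from $0$. The first sample estimates the mean of $X$ given $A$ and the variance along a single direction, so that $r = 1$. Taking that direction to be the normalised estimated mean is an illustrative choice made here and is not necessarily the projection analysed in the paper. The second sample is drawn from the learnt Gaussian and gives the importance sampling estimate, together with an effective sample size used as a rough indicator of weight degeneracy. The function name `two_stage_is` and all parameter names are hypothetical.

```python
import numpy as np


def two_stage_is(d, n_learn, n_est, threshold=0.5, seed=0):
    """Two-stage adaptive importance sampling sketch (illustrative only).

    Target: p = P(X_1 >= threshold) with X ~ N(0, I_d) (hypothetical event).
    Returns the estimate of p and the effective sample size of the weights.
    """
    rng = np.random.default_rng(seed)

    def in_event(x):
        # Indicator of the assumed event A = {x : x_1 >= threshold}.
        return x[:, 0] >= threshold

    # Stage 1: learn the auxiliary Gaussian from a first sample drawn from f.
    x = rng.standard_normal((n_learn, d))
    hits = x[in_event(x)]
    m = hits.mean(axis=0)          # estimated mean of X given A
    u = m / np.linalg.norm(m)      # assumed projection direction (r = 1)
    v = np.var(hits @ u)           # estimated variance of X given A along u

    # Stage 2: sample from g = N(m, I + (v - 1) u u^T).
    # (I + (sqrt(v) - 1) u u^T) is a square root of that covariance.
    z = rng.standard_normal((n_est, d))
    y = m + z + (np.sqrt(v) - 1.0) * np.outer(z @ u, u)

    # Log-weights log f(y) - log g(y); the (2 pi)^(d/2) constants cancel.
    # The covariance inverse is I + (1/v - 1) u u^T and its determinant is v.
    c = y - m
    quad = np.sum(c**2, axis=1) + (1.0 / v - 1.0) * (c @ u) ** 2
    log_w = -0.5 * np.sum(y**2, axis=1) + 0.5 * quad + 0.5 * np.log(v)
    w = np.exp(log_w) * in_event(y)

    p_hat = w.mean()
    ess = w.sum() ** 2 / np.sum(w**2)  # small values signal weight degeneracy
    return p_hat, ess


if __name__ == "__main__":
    p_hat, ess = two_stage_is(d=50, n_learn=5000, n_est=20000)
    print(f"estimate: {p_hat:.4f}, effective sample size: {ess:.0f}")
```

In this sketch the estimated mean carries noise in all $d$ coordinates, so shrinking `n_learn` relative to `d` makes the weights more variable. This is only meant as an informal illustration of why the size of the first sample has to grow with the dimension.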