We consider the estimation of rare-event probabilities using sample proportions obtained from naive Monte Carlo or from collected data. Unlike estimators based on variance reduction techniques, this naive estimator carries no a priori guarantee of relative efficiency. On the other hand, with the recent surge of sophisticated rare-event problems arising in safety evaluations of intelligent systems, efficiency-guaranteed variance reduction may face implementation challenges; coupled with the availability of computation or data collection power, this motivates the use of such a naive estimator. In this paper we study uncertainty quantification for rare-event probabilities using only sample proportions, namely the construction, coverage validity, and tightness of confidence intervals. In addition to the known normality, Wilson, and exact intervals, we investigate two new intervals derived from Chernoff's inequality and the Berry-Esseen theorem, and compare all of them. Moreover, we generalize our results to the natural situation where sampling stops upon reaching a target number of rare-event hits.
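
As a rough illustration of the three known intervals named above, the following Python sketch computes the normality (Wald), Wilson, and exact (Clopper-Pearson) intervals from a hit count `k` out of `n` samples. It does not attempt the Chernoff-based or Berry-Esseen-based intervals, whose precise form the abstract does not give. The function names, the default level `alpha=0.05`, and the bisection-based solver for the exact interval are assumptions made for this example only.

```python
import math
from statistics import NormalDist


def normal_interval(k, n, alpha=0.05):
    """Normality (Wald) interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = k / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half), min(1.0, p_hat + half)


def wilson_interval(k, n, alpha=0.05):
    """Wilson score interval, obtained by inverting the score test."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = k / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def _binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with terms evaluated in log space."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(1.0, total)


def exact_interval(k, n, alpha=0.05, iters=100):
    """Exact (Clopper-Pearson) interval, solved by bisection on the binomial CDF."""

    def solve(cdf_at, target):
        # cdf_at(p) is decreasing in p; find p with cdf_at(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(iters):
            mid = (lo + hi) / 2
            if cdf_at(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Upper limit solves P(X <= k; p) = alpha / 2.
    upper = 1.0 if k == n else solve(lambda p: _binom_cdf(k, n, p), alpha / 2)
    # Lower limit solves P(X >= k; p) = alpha / 2, i.e. P(X <= k - 1; p) = 1 - alpha / 2.
    lower = 0.0 if k == 0 else solve(lambda p: _binom_cdf(k - 1, n, p), 1 - alpha / 2)
    return lower, upper


if __name__ == "__main__":
    k, n = 10, 100_000  # hypothetical data: 10 rare-event hits in 100,000 samples
    for name, fn in [("normal", normal_interval),
                     ("Wilson", wilson_interval),
                     ("exact", exact_interval)]:
        lo, hi = fn(k, n)
        print(f"{name:>6}: [{lo:.3e}, {hi:.3e}]")
```

The sketch also shows why the rare-event regime needs care: with zero observed hits the normality interval collapses to the single point 0, whereas the exact interval still returns a nonzero upper limit.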