The likelihood ratio is a crucial quantity for statistical inference in science, enabling hypothesis testing, construction of confidence intervals, reweighting of distributions, and more. Many modern scientific applications, however, rely on data- or simulation-driven models for which computing the likelihood ratio can be very difficult or even impossible. By applying the so-called ``likelihood ratio trick,'' approximations of the likelihood ratio may be computed using clever parametrizations of neural network-based classifiers. Many different neural network setups satisfy this procedure, but their performance in approximating the likelihood ratio varies when they are trained on finite data. We present a series of empirical studies detailing how well several common loss functionals and parametrizations of the classifier output approximate the likelihood ratio between pairs of univariate and multivariate Gaussian distributions, as well as between simulated high-energy particle physics datasets.
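For concreteness, the trick can be sketched in its standard binary cross-entropy form. This is only one common instance and not necessarily the setup favored in the studies; the symbols $p_A$, $p_B$, $f$, and $\mathcal{L}$ are notation assumed for this illustration, and equal numbers of training samples from the two classes are assumed. A classifier $f(x) \in (0,1)$ trained to separate samples of $p_A$ from samples of $p_B$ minimizes
\[
  L_{\mathrm{BCE}}[f] = -\,\mathbb{E}_{x \sim p_A}\!\left[\log f(x)\right] - \mathbb{E}_{x \sim p_B}\!\left[\log\bigl(1 - f(x)\bigr)\right].
\]
Minimizing the integrand pointwise in $f(x)$ gives the optimal classifier and, from it, the likelihood ratio:
\[
  f^{*}(x) = \frac{p_A(x)}{p_A(x) + p_B(x)}
  \quad\Longrightarrow\quad
  \mathcal{L}(x) \equiv \frac{p_A(x)}{p_B(x)} = \frac{f^{*}(x)}{1 - f^{*}(x)}.
\]
A network trained on finite data only approximates $f^{*}$, so the quality of the resulting estimate of $\mathcal{L}(x)$ depends on the chosen loss functional and output parametrization.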