The likelihood ratio is a central quantity for statistical inference in science: it enables hypothesis testing, construction of confidence intervals, reweighting of distributions, and more. Many modern scientific applications, however, rely on data- or simulation-driven models for which computing the likelihood ratio is very difficult or even impossible. The so-called ``likelihood ratio trick'' addresses this problem: a neural network-based classifier is trained to distinguish samples from the two distributions, and a suitable parametrization of its output yields an approximation of the likelihood ratio. Many different neural network setups satisfy this procedure, and with finite training data they differ in how well they approximate the likelihood ratio. We present a series of empirical studies detailing the performance of several common loss functionals and parametrizations of the classifier output in approximating the likelihood ratio of two Gaussian distributions, in both univariate and multivariate settings, as well as of simulated high-energy particle physics datasets.
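As a rough illustration of the likelihood ratio trick, the sketch below trains a classifier with the binary cross-entropy loss on balanced samples from two univariate Gaussians and converts its output $f(x)$ into the ratio estimate $f(x)/(1-f(x))$. Everything specific here is an assumption made for the example, not a setup taken from the studies: the means `mu0` and `mu1`, the sample size, the learning rate, and the use of a plain logistic regression in place of a neural network (which suffices only because the exact log-ratio of two equal-variance Gaussians is linear in $x$).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy setup (assumed, not from the studies): two unit-variance Gaussians.
mu0, mu1, n = -0.5, 0.5, 20000
x = np.concatenate([rng.normal(mu0, 1.0, n), rng.normal(mu1, 1.0, n)])
y = np.concatenate([np.zeros(n), np.ones(n)])  # label 0: sample from p0, label 1: sample from p1


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Classifier f(x) = sigmoid(w * x + b), fitted by full-batch gradient descent
# on the binary cross-entropy loss.
w, b, lr = 0.0, 0.0, 0.5
for _ in range(2000):
    f = sigmoid(w * x + b)
    grad = f - y  # derivative of the cross-entropy with respect to the logit
    w -= lr * np.mean(grad * x)
    b -= lr * np.mean(grad)


def estimated_ratio(x_eval):
    """Likelihood ratio trick: with balanced classes, f / (1 - f) approximates p1 / p0."""
    f = sigmoid(w * x_eval + b)
    return f / (1.0 - f)


def exact_ratio(x_eval):
    """Exact p1(x) / p0(x) for the two unit-variance Gaussians."""
    return np.exp(-0.5 * (x_eval - mu1) ** 2 + 0.5 * (x_eval - mu0) ** 2)


grid = np.linspace(-2.0, 2.0, 9)
max_rel_err = np.max(np.abs(estimated_ratio(grid) / exact_ratio(grid) - 1.0))
print(f"w = {w:.3f}, b = {b:.3f}, max relative error on grid = {max_rel_err:.3f}")
```

With these means the exact log-ratio equals $x$, so the fitted `w` and `b` should land near 1 and 0. The residual error comes from the finite training sample, which is the effect the studies quantify across loss functionals and output parametrizations.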