HDSense: An efficient method for ranking observable sensitivity

arXiv comments: 26+11 pages, 9 figures; code available at https://gitlab.com/pythia8-contrib/packages/hdsense. Updated version with minor revisions recommended by SciPost Physics.

Identifying which observables most effectively constrain model parameters can be computationally prohibitive when it requires the full likelihood of many correlated observables. This cost matters especially for hadronization models, where high precision is required to interpret the results of collider experiments. We introduce the High-Dimensional Sensitivity (HDSense) score, a computationally efficient metric for ranking sets of observables using only one-dimensional histograms. Derived by profiling over unknown correlations in the Fisher information framework, the score balances total information content against redundancy between observables. We apply HDSense to rank a set of observables by their constraining power with respect to five parameters of the Lund string model of hadronization implemented in Pythia, using simulated leptonic collider events at the $Z$ pole. Validation against machine-learning-based approximations of the full likelihood demonstrates that HDSense successfully identifies near-optimal observable subsets. The framework naturally handles data from multiple experiments with different acceptances and incorporates detector effects. While demonstrated on hadronization models, the methodology applies broadly to generic parameter estimation problems where correlations are unknown or difficult to model.
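The abstract does not state the HDSense formula, so the sketch below is not the score itself and does not use the API of the hdsense package. It only illustrates the standard ingredient the abstract refers to: the Fisher information carried by a single one-dimensional histogram, assuming independent Poisson-distributed bin counts, $I_{ij} = \sum_b \frac{1}{\mu_b} \frac{\partial \mu_b}{\partial \theta_i} \frac{\partial \mu_b}{\partial \theta_j}$. The function names, the central finite-difference gradient, and the toy bin contents are all assumptions made for illustration.

```python
def finite_difference_gradient(hist_plus, hist_minus, step):
    """Central-difference estimate of d(mu_b)/d(theta) for every bin.

    hist_plus / hist_minus are expected bin counts obtained with the
    parameter shifted by +step / -step (hypothetical simulation outputs).
    """
    return [(p - m) / (2.0 * step) for p, m in zip(hist_plus, hist_minus)]


def histogram_fisher_information(expected, gradients):
    """Fisher information matrix of one histogram with independent Poisson bins.

    expected:  expected counts mu_b, one entry per bin
    gradients: gradients[i][b] = d(mu_b)/d(theta_i)
    """
    n_params = len(gradients)
    info = [[0.0] * n_params for _ in range(n_params)]
    for b, mu in enumerate(expected):
        if mu <= 0.0:
            continue  # assumption: bins with no expected events are skipped
        for i in range(n_params):
            for j in range(n_params):
                info[i][j] += gradients[i][b] * gradients[j][b] / mu
    return info


# Toy example with one parameter and three bins (made-up numbers).
nominal = [100.0, 200.0, 100.0]
shifted_up = [110.0, 200.0, 90.0]
shifted_down = [90.0, 200.0, 110.0]

grad = finite_difference_gradient(shifted_up, shifted_down, step=0.5)
info = histogram_fisher_information(nominal, [grad])
print(info)
```

Per-histogram matrices of this kind could serve as the one-dimensional inputs to a ranking; how HDSense combines them and penalizes redundancy between observables is described in the paper, not here.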
