Identifying which observables most effectively constrain model parameters can be computationally prohibitive when it requires the full likelihood of many correlated observables. Efficient observable selection is especially important for hadronization models, for example, where high precision is required to interpret the results of collider experiments. We introduce the High-Dimensional Sensitivity (HDSense) score, a computationally efficient metric for ranking sets of observables using only one-dimensional histograms. Derived by profiling over unknown correlations in the Fisher information framework, the score balances total information content against redundancy between observables. We apply HDSense to rank a set of observables by their constraining power with respect to five parameters of the Lund string model of hadronization implemented in Pythia, using simulated lepton-collider events at the $Z$ pole. Validation against machine-learning--based approximations of the full likelihood demonstrates that HDSense successfully identifies near-optimal observable subsets. The framework naturally handles data from multiple experiments with different acceptances and incorporates detector effects. While demonstrated on hadronization models, the methodology applies broadly to generic parameter estimation problems where correlations are unknown or difficult to model.