Identifying which observables most effectively constrain model parameters can be computationally prohibitive when it requires the full likelihood of many correlated observables. This is especially important for, e.g., hadronization models, where high precision is required to interpret the results of collider experiments. We introduce the High-Dimensional Sensitivity (HDSense) score, a computationally efficient metric for ranking observable sets using only one-dimensional histograms. Derived by profiling over unknown correlations in the Fisher information framework, the score balances total information content against redundancy between observables. We apply HDSense to rank a set of observables in terms of their constraining power with respect to five parameters of the Lund string model of hadronization implemented in Pythia using simulated leptonic collider events at the $Z$ pole. Validation against machine-learning--based full-likelihood approximations demonstrates that HDSense successfully identifies near-optimal observable subsets. The framework naturally handles data from multiple experiments with different acceptances and incorporates detector effects. While demonstrated on hadronization models, the methodology applies broadly to generic parameter estimation problems where correlations are unknown or difficult to model.
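To make the "one-dimensional histograms" ingredient concrete, the sketch below computes the standard Fisher information matrix of a single Poisson-binned histogram, $I_{ab} = \sum_i (\partial_a \mu_i)(\partial_b \mu_i)/\mu_i$, using central finite differences of the expected bin counts $\mu_i$. It is only an illustration of the per-observable information that a score of this kind could build on; it does not reproduce the HDSense score itself, whose definition is not given in this text. The function name `binned_fisher_matrix`, its arguments, and the toy two-bin example are assumptions introduced here for illustration.

```python
import numpy as np


def binned_fisher_matrix(mu_plus, mu_minus, mu_nominal, steps):
    """Fisher matrix of one Poisson-binned histogram (hypothetical helper).

    mu_plus, mu_minus : arrays of shape (n_params, n_bins); expected bin
        counts with parameter a shifted by +steps[a] and -steps[a].
    mu_nominal : array of shape (n_bins,); expected counts at the nominal point.
    steps : array of shape (n_params,); finite-difference step per parameter.
    """
    mu_plus = np.asarray(mu_plus, dtype=float)
    mu_minus = np.asarray(mu_minus, dtype=float)
    mu_nominal = np.asarray(mu_nominal, dtype=float)
    steps = np.asarray(steps, dtype=float)

    # Central finite-difference estimate of d(mu_i)/d(theta_a).
    grad = (mu_plus - mu_minus) / (2.0 * steps[:, None])

    # Ignore empty bins to avoid dividing by zero.
    mask = mu_nominal > 0
    weighted = grad[:, mask] / mu_nominal[mask]

    # I_ab = sum_i grad[a, i] * grad[b, i] / mu_i
    return weighted @ grad[:, mask].T


# Toy example (assumed): two bins with counts N*theta and N*(1 - theta).
N, theta, step = 100.0, 0.5, 0.01
nominal = np.array([N * theta, N * (1 - theta)])
plus = np.array([[N * (theta + step), N * (1 - theta - step)]])
minus = np.array([[N * (theta - step), N * (1 - theta + step)]])
fisher = binned_fisher_matrix(plus, minus, nominal, [step])
print(fisher)
```

For this toy model the analytic result is $N/[\theta(1-\theta)] = 400$, which the finite-difference estimate matches because the bin counts are linear in $\theta$. In practice the shifted histograms would come from simulation runs at varied parameter values, and statistical noise in those runs would need to be controlled before the derivatives are trusted.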