Rethinking player evaluation in sports: Goals above expectation and beyond

A popular quantitative approach to evaluating player performance in sports compares an observed outcome to the outcome expected when player involvement is ignored, where the expected outcome is estimated using statistical or machine learning methods. In soccer, for instance, a player's goals above expectation (GAX) measure how often the player's shots led to a goal compared to the model-derived expected outcome of those shots. Typically, sports data analysts rely on flexible machine learning models, which can handle complex nonlinear effects and feature interactions but fail to provide valid statistical inference due to finite-sample bias and slow convergence rates. In this paper, we close this gap by presenting a framework for player evaluation whose metrics are derived from differences between actual and expected outcomes, estimated with flexible machine learning algorithms, and which nonetheless allows for valid frequentist inference. We first show that the commonly used metrics are directly related to Rao's score test in parametric regression models for the expected outcome. Motivated by this finding and by recent developments in double machine learning, we then propose the use of residualized versions of the original metrics. For GAX, the residualization step corresponds to an additional regression predicting whether a given player would take the shot under the circumstances described by the features. We further relate metrics in the proposed framework to player-specific effect estimates in interpretable semiparametric regression models, allowing us to infer directional effects, e.g., to identify players who have a positive impact on the outcome. Our primary use case is GAX in soccer. We further apply our framework to evaluate the goal-stopping ability of goalkeepers, shooting skill in basketball, quarterback passing skill in American football, and injury-proneness of soccer players.
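To make the two metrics concrete, the following is a minimal sketch and not the paper's estimator. It assumes that GAX is the sum of (goal minus expected goal probability) over a player's shots, and that the residualized version replaces the player indicator by the indicator minus a predicted probability that this player takes the shot, in the style of a double machine learning score. The function names `gax` and `residualized_gax`, the argument names, and the toy numbers are all hypothetical; in practice both sets of predicted probabilities would come from fitted models, ideally evaluated out of sample.

```python
import numpy as np


def gax(goal, xg, is_player):
    """Sum of (observed goal - expected goal probability) over the player's shots.

    goal      : 0/1 outcome of every shot in the data set
    xg        : model-predicted goal probability of every shot (assumed given)
    is_player : 0/1 indicator that the shot was taken by the player of interest
    """
    goal = np.asarray(goal, dtype=float)
    xg = np.asarray(xg, dtype=float)
    is_player = np.asarray(is_player, dtype=float)
    return float(np.sum((goal - xg) * is_player))


def residualized_gax(goal, xg, is_player, p_player):
    """Assumed residualized variant: the player indicator is replaced by its
    residual with respect to p_player, the predicted probability that this
    player takes the shot given the shot features.
    """
    goal = np.asarray(goal, dtype=float)
    xg = np.asarray(xg, dtype=float)
    is_player = np.asarray(is_player, dtype=float)
    p_player = np.asarray(p_player, dtype=float)
    return float(np.sum((goal - xg) * (is_player - p_player)))


# Toy data (made up for illustration): four shots, two by the player of interest.
goal = [1, 0, 1, 0]
xg = [0.2, 0.1, 0.5, 0.3]
is_player = [1, 0, 1, 0]
p_player = [0.5, 0.5, 0.5, 0.5]

raw_value = gax(goal, xg, is_player)
residualized_value = residualized_gax(goal, xg, is_player, p_player)
print(raw_value, residualized_value)
```

In this sketch the raw metric only uses the player's own shots, whereas the residualized metric also draws on shots by other players through the term involving `p_player`. That term is the extra regression the abstract describes.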
