Expected points (EP) and win probability (WP) are value functions fundamental to strategic in-game decision making in American football, particularly on fourth down. The EP and WP functions in wide use today are statistical models fit to historical data. These models, however, suffer from serious statistical flaws: selection bias, overfitting, ignored autocorrelation, and a lack of uncertainty quantification. We develop a machine learning framework that accounts for these issues and distills our analysis into decision-making inference. Along the way, we introduce a novel methodological approach to mitigate overfitting in machine learning models. Specifically, we extend the catalytic prior, initially developed in the context of linear models, to smooth our tree-based machine learning models. Our final product is a major advance in fourth-down strategic decision making: far fewer fourth-down decisions are as obvious as analysts claim.