This study uses Bayesian methods to examine whether player-level or positional factors affect the probability that a shot results in a goal, as measured by the expected goals (xG) metric. Using publicly available data from StatsBomb, Bayesian hierarchical logistic regressions are fitted to approximately 10,000 shots from the English Premier League to determine whether positional or player-level effects influence xG. In a basic model that includes only distance to goal and shot angle as predictors, positional effects are present: strikers and attacking midfielders are more likely to score. These effects diminish once more informative predictors are introduced. Player-level effects, however, persist even with the additional predictors, indicating that certain players carry notable positive or negative xG adjustments that change their likelihood of scoring a given chance. The analysis is extended to data from Spain's La Liga and Germany's Bundesliga, with comparable results. Finally, the paper assesses how the choice of prior distributions affects the outcomes, concluding that the priors used give sound results but could be refined to improve sampling efficiency, which would make larger and more complex models feasible to fit.
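To make the structure of the basic model concrete, the sketch below shows how a varying-intercept logistic regression turns distance to goal, shot angle, and a positional offset into a scoring probability. Every coefficient, offset, and function name here is invented for illustration and is not an estimate or identifier from the study; in the Bayesian setting these quantities would be given priors and inferred from the shot data.

```python
import math

# Hypothetical values chosen only for illustration; not estimates from the study.
INTERCEPT = -1.0
BETA_DISTANCE = -0.10   # log-odds change per metre from goal
BETA_ANGLE = 1.2        # log-odds change per radian of shot angle
POSITION_OFFSET = {     # varying intercepts, one per positional group
    "striker": 0.15,
    "attacking_midfielder": 0.10,
    "defender": -0.20,
}


def sigmoid(z: float) -> float:
    """Inverse logit: map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def xg(distance_m: float, angle_rad: float, position: str) -> float:
    """Scoring probability for one shot under the sketched model."""
    # Unknown positions fall back to a zero offset (the population mean).
    offset = POSITION_OFFSET.get(position, 0.0)
    log_odds = (
        INTERCEPT
        + BETA_DISTANCE * distance_m
        + BETA_ANGLE * angle_rad
        + offset
    )
    return sigmoid(log_odds)


if __name__ == "__main__":
    # Same chance, different positional group.
    for pos in ("striker", "defender"):
        print(pos, round(xg(12.0, 0.5, pos), 3))
```

In a hierarchical model the offsets would share a common prior, so groups or players with few shots are shrunk toward zero rather than estimated independently. A player-level variant would follow the same pattern, with one offset per player instead of per position.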