The impact of player age on performance has received attention across sports. Most research has focused on the performance of players at each age, ignoring the reality that age also influences which players receive opportunities to perform. Our manuscript makes two contributions. First, we highlight how selection bias is linked to both (i) which players receive the opportunity to perform in sport, and (ii) at which ages we observe these players perform. This approach is used to generate underlying distributions of how players move in and out of sport organizations. Second, motivated by methods for missing data, we propose novel methods for estimating age curves that use both observed and unobserved (imputed) data. We use simulations to compare several approaches for estimating aging curves. Imputation-based methods, as well as models that account for individual player skill, tend to generate lower root mean squared error (RMSE) and age curve shapes that better match the truth. We implement our approach using data from the National Hockey League.
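The selection effect described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' simulation design or their estimators: the quadratic age effect peaking at 27, the skill and noise distributions, the performance cutoff `CUTOFF`, and the function names are all hypothetical choices made for illustration. A player-season is observed only when performance clears the cutoff, so older and younger ages are represented mostly by stronger players. The sketch compares a naive average of observed performance at each age with a simple additive player-plus-age model fit by alternating averages, which stands in for "models that account for individual player skill". It does not implement the imputation-based methods.

```python
import random
from math import sqrt

random.seed(0)

AGES = list(range(18, 39))   # assumed age range
N_PLAYERS = 1000             # assumed number of simulated players
CUTOFF = -1.0                # assumed bar a season must clear to be observed


def true_effect(age):
    """Hypothetical true age effect: a quadratic that peaks at age 27."""
    return -0.02 * (age - 27) ** 2


# seasons[i] holds (age_index, performance) pairs for player i's observed seasons.
seasons = []
for _ in range(N_PLAYERS):
    skill = random.gauss(0.0, 1.0)
    career = []
    for j, age in enumerate(AGES):
        perf = skill + true_effect(age) + random.gauss(0.0, 0.3)
        if perf > CUTOFF:    # selection: weak seasons are never observed
            career.append((j, perf))
    seasons.append(career)


def centered(curve):
    """Subtract the mean so that only the shape of the curve is compared."""
    mean = sum(curve) / len(curve)
    return [value - mean for value in curve]


def rmse(estimate, truth):
    return sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))


def age_means(seasons, player_eff):
    """Average of (performance - player effect) over observed seasons at each age.

    Assumes every age has at least one observed season.
    """
    total = [0.0] * len(AGES)
    count = [0] * len(AGES)
    for i, career in enumerate(seasons):
        for j, perf in career:
            total[j] += perf - player_eff[i]
            count[j] += 1
    return [t / c for t, c in zip(total, count)]


def naive_curve(seasons):
    """Mean observed performance at each age, ignoring who was observed."""
    return age_means(seasons, [0.0] * len(seasons))


def player_effect_curve(seasons, n_iter=100):
    """Fit performance = player effect + age effect by alternating averages."""
    player_eff = [0.0] * len(seasons)
    age_eff = [0.0] * len(AGES)
    for _ in range(n_iter):
        age_eff = age_means(seasons, player_eff)
        for i, career in enumerate(seasons):
            if career:       # players with no observed seasons carry no information
                player_eff[i] = sum(p - age_eff[j] for j, p in career) / len(career)
    return age_eff


truth = centered([true_effect(age) for age in AGES])
rmse_naive = rmse(centered(naive_curve(seasons)), truth)
rmse_player = rmse(centered(player_effect_curve(seasons)), truth)
print(f"naive RMSE: {rmse_naive:.3f}")
print(f"player-effect RMSE: {rmse_player:.3f}")
```

In this setup the naive curve comes out flatter than the true curve, because the players still observed at the extremes of the age range are disproportionately the most skilled ones. Separating a per-player effect from the age effect removes most of that distortion, though some bias remains because selection also depends on season-level noise.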