Many enthusiasts and experts publish mock drafts: forecasts of the order in which players are drafted into professional sports leagues. Using a novel dataset of mock drafts for the National Basketball Association (NBA), we analyze authors' mock draft accuracy over time and ask how information from multiple authors can reasonably be combined. To measure accuracy, we treat both mock drafts and the actual draft as ranked lists, and we propose the rank-biased distance (RBD) of Webber et al. (2010) as the appropriate error metric. RBD allows a mock draft to differ in length from the actual draft, accounts for players who do not appear in both lists, and weights errors early in the draft more heavily than errors later on. We confirm that, as expected, mock drafts become more accurate over the course of a season, and that the accuracy of mock drafts produced just before their respective drafts is fairly stable across seasons. To combine information from multiple mock drafts into a single consensus mock draft, we also propose a ranked-list combination method based on the ideas of ranked-choice voting. We show that our method yields better forecasts than the standard Borda count combination method used in most similar analyses in sports, and that either combination method provides a more accurate forecast over time than any single author.