A generative approach to frame-level multi-competitor races

Multi-competitor races often feature complicated within-race strategies that are difficult to capture when training models on race-outcome-level data. Further, models that do not account for such strategic effects may suffer from confounded inferences and predictions. In this work we develop a general generative model for multi-competitor races that allows analysts to explicitly model certain strategic effects, such as changing lanes or drafting, and to separate these impacts from competitor ability. The generative model allows one to simulate full races from any real or created starting position, which opens new avenues for attributing value to within-race actions and for performing counterfactual analyses. This methodology is sufficiently general to apply to any track-based multi-competitor race for which tracking data are available and competitor movement is well described by simultaneous forward and lateral movements. We apply this methodology to one-mile horse races using data provided by the New York Racing Association (NYRA) and the New York Thoroughbred Horsemen's Association (NYTHA) for the Big Data Derby 2022 Kaggle Competition. This dataset provides granular frame-level tracking data for all horses, recorded at approximately 4 Hz. We demonstrate how this model can yield new inferences, such as estimates of horse-specific speed profiles that vary over phases of the race, and we present examples of posterior predictive counterfactual simulations that answer questions of interest, such as the impact of starting lane on race outcomes.
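To make the idea of frame-level forward simulation concrete, the following is a minimal toy sketch in Python. It is not the model developed in this work: the Gaussian noise on forward speed, the random-walk lateral movement, the constant per-competitor mean speed, and all names and parameter values (`simulate_race`, `speed_sd`, `lateral_sd`, and so on) are assumptions made purely for illustration. Only the frame rate of roughly 4 Hz and the one-mile race length come from the abstract.

```python
import random

FRAME_RATE_HZ = 4.0        # approximate tracking rate stated in the abstract
RACE_LENGTH_M = 1609.344   # one mile, in metres


def simulate_race(start_lanes, mean_speeds, seed=0,
                  speed_sd=0.3, lateral_sd=0.05, max_frames=10_000):
    """Toy frame-by-frame race simulation (illustrative assumptions only).

    start_lanes: initial lateral offset of each competitor from the rail (m)
    mean_speeds: assumed constant mean forward speed per competitor (m/s)
    Returns (finish_order, finish_frames), where finish_frames[i] is the
    frame index at which competitor i crossed the line.
    """
    rng = random.Random(seed)
    dt = 1.0 / FRAME_RATE_HZ
    n = len(start_lanes)
    forward = [0.0] * n            # distance travelled along the track
    lateral = list(start_lanes)    # distance from the inside rail
    finish_frames = [None] * n

    for frame in range(1, max_frames + 1):
        for i in range(n):
            if finish_frames[i] is not None:
                continue  # this competitor has already finished
            # Assumed forward step: noisy speed, never negative.
            speed = max(0.0, rng.gauss(mean_speeds[i], speed_sd))
            forward[i] += speed * dt
            # Assumed lateral step: small random walk, bounded by the rail.
            lateral[i] = max(0.0, lateral[i] + rng.gauss(0.0, lateral_sd))
            if forward[i] >= RACE_LENGTH_M:
                finish_frames[i] = frame
        if all(f is not None for f in finish_frames):
            break

    # Finishers first, ordered by finish frame; ties broken by index.
    order = sorted(range(n),
                   key=lambda i: (finish_frames[i] is None,
                                  finish_frames[i] or 0))
    return order, finish_frames


if __name__ == "__main__":
    order, frames = simulate_race([1.0, 2.0, 3.0], [17.0, 16.5, 16.8], seed=42)
    print("finish order:", order)
    print("finish times (s):", [f / FRAME_RATE_HZ for f in frames])
```

In this toy version the lateral position does not feed back into forward progress, so it cannot by itself answer a starting-lane question. A model of the kind described above would additionally let forward and lateral movement depend on competitor ability, race phase, and the positions of nearby competitors, and would draw its parameters from a fitted posterior rather than fixed values.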

