Professional Basketball Player Behavior Synthesis via Planning with Diffusion

Dynamic planning in multi-agent systems has been explored to improve decision-making in various domains. Professional basketball serves as a compelling example of a dynamic spatio-temporal game, encompassing both concealed strategic policies and decision-making. However, processing the diverse on-court signals and navigating the vast space of potential actions and outcomes make it difficult for existing approaches to swiftly identify optimal strategies in response to evolving circumstances. In this study, we first formulate the sequential decision-making process as a conditional trajectory generation process. We further introduce PLAYBEST (PLAYer BEhavior SynThesis), a method for enhancing player decision-making. We extend a state-of-the-art generative model, the diffusion probabilistic model, to learn challenging multi-agent environmental dynamics from historical National Basketball Association (NBA) player motion tracking data. To incorporate data-driven strategies, an auxiliary value function is trained using play-by-play data, with the corresponding rewards acting as the plan guidance. To accomplish reward-guided trajectory generation, conditional sampling is introduced to condition the diffusion model on the value function and conduct classifier-guided sampling. We validate the effectiveness of PLAYBEST via comprehensive simulation studies based on real-world data, contrasting the generated trajectories and play strategies with those employed by professional basketball teams. Our results reveal that the model excels at generating high-quality basketball trajectories that yield efficient plays, surpassing conventional planning techniques in terms of adaptability, flexibility, and overall performance. Moreover, the synthesized play strategies exhibit a remarkable alignment with professional tactics, highlighting the model's capacity to capture the intricate dynamics of basketball games.
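To make the idea of reward-guided sampling concrete, the following is a minimal sketch of the common form of classifier guidance, in which the mean of each reverse diffusion step is shifted along the gradient of a value function. It is not the PLAYBEST implementation: the denoiser `toy_denoiser_mean`, the quadratic value function, the linear noise schedule, and all parameter names are assumptions chosen only for illustration.

```python
import numpy as np

def value_gradient(traj, target):
    """Gradient of the toy value J(traj) = -0.5 * ||traj[-1] - target||^2."""
    grad = np.zeros_like(traj)
    grad[-1] = target - traj[-1]  # only the final position affects J
    return grad

def guided_mean(mu, sigma, target, scale):
    """Shift the reverse-step mean: mu + scale * sigma^2 * grad J(mu)."""
    return mu + scale * sigma**2 * value_gradient(mu, target)

def toy_denoiser_mean(traj):
    """Hypothetical stand-in for a trained diffusion model's predicted mean."""
    return 0.9 * traj

def sample(horizon, dim, steps, target, scale, seed=0):
    """Run guided reverse steps starting from Gaussian noise."""
    rng = np.random.default_rng(seed)
    traj = rng.standard_normal((horizon, dim))  # start from pure noise
    for sigma in np.linspace(1.0, 0.0, steps):  # assumed noise schedule
        mu = guided_mean(toy_denoiser_mean(traj), sigma, target, scale)
        traj = mu + sigma * rng.standard_normal(traj.shape)
    return traj

# Toy usage: an 8-step, 2-D trajectory guided toward the point (5, 5).
plan = sample(horizon=8, dim=2, steps=50, target=np.array([5.0, 5.0]), scale=1.0)
print(plan.shape)
```

In this sketch, `scale` controls how strongly the value gradient pulls the sample. Setting it to zero recovers unguided sampling from the stand-in denoiser.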

