ShuttleSet22: Benchmarking Stroke Forecasting with Stroke-Level Badminton Dataset

Source: arXiv. Presented at IT4PSS @ IJCAI-23 and CoachAI Badminton Challenge Track 2 @ IJCAI-23. Challenge website: https://sites.google.com/view/coachai-challenge-2023/

In recent years, badminton analytics has drawn attention thanks to advances in artificial intelligence and more efficient data collection. Although a line of effective applications exists for investigating and improving player performance, only a few public badminton datasets are usable by researchers outside the badminton domain. Existing badminton singles datasets focus on specific matchups, so they cannot support comprehensive studies across different players and various matchups. In this paper, we provide a badminton singles dataset, ShuttleSet22, collected from high-ranking matches in 2022. ShuttleSet22 consists of 30,172 strokes in 2,888 rallies in the training set, 1,400 strokes in 450 rallies in the validation set, and 2,040 strokes in 654 rallies in the testing set, with detailed stroke-level metadata within each rally. To benchmark existing work on ShuttleSet22, we test the state-of-the-art stroke forecasting approach, ShuttleNet, on the corresponding stroke forecasting task, i.e., predicting the future strokes of each rally from its given strokes. We also hold a challenge, Track 2: Forecasting Future Turn-Based Strokes in Badminton Rallies, at the CoachAI Badminton Challenge 2023 to encourage researchers to tackle this problem. The baseline code and the dataset will be made available at https://github.com/wywyWang/CoachAI-Projects/tree/main/CoachAI-Challenge-IJCAI2023.
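To make the split sizes and the task framing concrete, here is a minimal Python sketch. The stroke and rally counts are taken from the abstract. Everything else is an assumption made for illustration: the helper names `summarize` and `split_rally`, the toy stroke labels, and the choice of how many strokes are treated as "given". None of it reflects the dataset's actual file format or the ShuttleNet code.

```python
# Split sizes as stated in the abstract: (strokes, rallies).
SPLITS = {
    "train": (30172, 2888),
    "validation": (1400, 450),
    "test": (2040, 654),
}


def summarize(splits):
    """Return total strokes, total rallies, and mean strokes per rally per split."""
    total_strokes = sum(strokes for strokes, _ in splits.values())
    total_rallies = sum(rallies for _, rallies in splits.values())
    per_split = {
        name: round(strokes / rallies, 2)
        for name, (strokes, rallies) in splits.items()
    }
    return total_strokes, total_rallies, per_split


def split_rally(strokes, n_given):
    """Hypothetical helper: split one rally into the observed prefix
    (the "given strokes") and the future strokes a model must forecast."""
    if not 0 < n_given < len(strokes):
        raise ValueError("n_given must leave at least one stroke on each side")
    return strokes[:n_given], strokes[n_given:]


if __name__ == "__main__":
    print(summarize(SPLITS))
    # Toy rally with made-up stroke labels, not real ShuttleSet22 records.
    rally = ["serve", "clear", "drop", "net", "lob", "smash"]
    given, future = split_rally(rally, n_given=4)
    print(given, future)
```

A forecasting model in this framing would consume `given` and be scored on how well its output matches `future`. The value `n_given=4` is only a placeholder; the abstract does not state how many strokes are provided per rally.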

