This paper introduces an approach to predicting the next event in a soccer match, a challenge that closely resembles the problem faced by Large Language Models (LLMs). Unlike other methods, which severely restrict event dynamics in soccer, often by abstracting away many variables or relying on a mix of sequential models, our research proposes a novel technique inspired by the methodologies used in LLMs. The proposed models predict the complete chain of variables that compose an event, which significantly simplifies the construction of Large Event Models (LEMs) for soccer. Using deep learning on the publicly available WyScout dataset, the proposed approach surpasses previous LEM proposals in critical areas, such as the prediction accuracy of the next event type. This paper highlights the utility of LEMs in various applications, including match prediction and analytics. Moreover, we show that LEMs provide a simulation backbone on which users can build many analytics pipelines, in contrast to the current practice of specialized single-purpose models. LEMs represent a significant advance in soccer analytics, establishing a foundational framework for multifaceted analytics pipelines built on a single machine-learning model.
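To make the idea of predicting an event as a chain of variables more concrete, the following is a minimal sketch and not the paper's implementation. It assumes an event consists of only two variables (a type and an x coordinate), and it replaces the learned deep-learning models with hand-written toy samplers. All names (`EVENT_TYPES`, `predict_type`, `predict_x`, `simulate`) and all weights are hypothetical and chosen for illustration only.

```python
import random

# Hypothetical event vocabulary (an assumption; the actual variable set
# used by the paper is not stated in the abstract).
EVENT_TYPES = ["pass", "shot", "duel"]


def predict_type(prev_event, rng):
    """Toy stand-in for a learned model: sample the next event type."""
    # Made-up weights: after a shot, play mostly resumes with a pass.
    if prev_event["type"] == "shot":
        weights = [0.8, 0.05, 0.15]
    else:
        weights = [0.6, 0.1, 0.3]
    return rng.choices(EVENT_TYPES, weights=weights, k=1)[0]


def predict_x(prev_event, event_type, rng):
    """Toy stand-in: sample the x coordinate (0-100) given the chosen type."""
    if event_type == "shot":
        # Shots are placed near the opposing goal in this toy model.
        return rng.randint(80, 100)
    # Other events drift from the previous position, clamped to the pitch.
    return min(100, max(0, prev_event["x"] + rng.randint(-15, 20)))


def predict_next_event(prev_event, rng):
    """Predict the variables of one event in order, each conditioned
    on the previous event and on the variables already predicted."""
    event_type = predict_type(prev_event, rng)
    x = predict_x(prev_event, event_type, rng)
    return {"type": event_type, "x": x}


def simulate(start_event, n_events, seed=0):
    """Roll the chain forward to simulate a sequence of events."""
    rng = random.Random(seed)
    events = [start_event]
    for _ in range(n_events):
        events.append(predict_next_event(events[-1], rng))
    return events


if __name__ == "__main__":
    for event in simulate({"type": "pass", "x": 50}, n_events=5):
        print(event)
```

Because each simulated event feeds the next prediction, the same loop could in principle serve as a simulation backbone for downstream analytics, for example by running many seeded rollouts and counting outcomes of interest.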