The rapid evolution of digital sports media calls for information retrieval systems that can efficiently parse large multimodal datasets. This paper introduces SoccerRAG, a framework that combines Retrieval-Augmented Generation (RAG) with Large Language Models (LLMs) to extract soccer-related information through natural language queries. Built on a multimodal dataset, SoccerRAG supports dynamic querying and automatic data validation, improving user interaction with sports archives and making them more accessible. Our evaluations indicate that SoccerRAG handles complex queries effectively, offering significant improvements over traditional retrieval systems in accuracy and user engagement. The results underscore the potential of RAG and LLMs in sports analytics and point toward future advances in the accessibility and real-time processing of sports data.