We introduce DataTales, a novel benchmark designed to assess the proficiency of language models in data narration, the task of transforming complex tabular data into accessible narratives. Existing benchmarks often fail to capture the analytical complexity that practical applications require. DataTales addresses this gap with 4.9k financial reports paired with corresponding market data, a setting that requires models to write clear narratives, analyze large datasets, and understand specialized domain terminology. Our findings highlight the significant challenge that language models face in achieving the precision and analytical depth needed for proficient data narration, and suggest promising avenues for future model development and evaluation methodologies.