Extracting a small number of relevant insights from vast amounts of data is a crucial component of data-driven decision-making. However, this task requires considerable technical skill, domain expertise, and human labor. This study explores the potential of Large Language Models (LLMs) to automate the discovery of insights in data, leveraging recent advances in reasoning and code generation. We propose a new evaluation methodology based on a "capture the flag" principle, which measures the ability of such models to recognize meaningful and pertinent information (flags) in a dataset. We further propose two proof-of-concept agents with different inner workings and compare their ability to capture such flags in a real-world sales dataset. While the work reported here is preliminary, our results are sufficiently interesting to warrant further exploration by the community.