Large Language Models (LLMs) have been shown to improve on many tasks through Chain-of-Thought (CoT) prompting or In-Context Learning (ICL), both of which demonstrate the steps needed to solve a task using a few examples. However, while datasets of input-output pairs are relatively easy to produce, providing demonstrations that include intermediate steps requires cumbersome manual work. These steps may be executable programs, as in agentic flows, or step-by-step reasoning, as in CoT. In this work, we propose Automatic Data Labeling and Refinement (ADLR), a method that automatically generates and filters demonstrations containing such intermediate steps, starting from a small seed of manually crafted examples. We demonstrate the advantage of ADLR on code-based table question answering and mathematical reasoning, achieving gains of up to 5.5%. The code implementing our method is provided in the Supplementary Material and will be made available.
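The generate-and-filter loop described above can be sketched roughly as follows. This is a minimal illustration of the high-level idea, not the paper's exact algorithm; the `generate_candidate` model call, the demonstration format, and the `eval`-based verification are assumptions for illustration.

```python
# Minimal sketch of a generate-and-filter bootstrapping loop in the spirit
# of ADLR (assumption: this mirrors the abstract's description, not the
# paper's actual implementation).

def make_demonstration(question, program):
    return {"question": question, "program": program}

def is_valid(candidate, expected_answer):
    # Filter step: keep a candidate only if executing its intermediate
    # program reproduces the known gold answer (as in code-based table QA).
    try:
        result = eval(candidate["program"], {})
    except Exception:
        return False
    return result == expected_answer

def adlr_round(seed_demos, unlabeled, generate_candidate):
    """One round: prompt with the current demo pool, keep verified candidates."""
    accepted = list(seed_demos)
    for question, gold in unlabeled:
        program = generate_candidate(accepted, question)  # LLM call (mocked below)
        cand = make_demonstration(question, program)
        if is_valid(cand, gold):
            accepted.append(cand)
    return accepted

# Toy usage with a mocked "LLM" that emits arithmetic programs.
seed = [make_demonstration("2+2?", "2 + 2")]
unlabeled = [("3*4?", 12), ("not a program", 99)]
mock_llm = lambda demos, q: q.rstrip("?")
demos = adlr_round(seed, unlabeled, mock_llm)
# The invalid candidate fails verification and is filtered out.
```

In practice the pool of accepted demonstrations would be fed back as in-context examples for subsequent rounds, growing the labeled set from the small manual seed.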