Abstract reasoning, the ability to reason from the abstract essence of a problem, is key to generalization in human reasoning. However, eliciting abstract reasoning from language models remains underexplored. This paper seeks to bridge this gap by introducing a novel structured reasoning format called Abstraction-of-Thought (AoT). The uniqueness of AoT lies in its explicit requirement for varying levels of abstraction within the reasoning process. This format encourages language models to first reason at the abstract level before incorporating concrete details, a step overlooked by the prevailing step-by-step Chain-of-Thought (CoT) method. To align models with the AoT format, we present AoT Collection, a generic finetuning dataset of 348k high-quality samples with AoT reasoning processes, collected via an automated and scalable pipeline. We finetune a wide range of language models on AoT Collection and conduct extensive evaluations on 23 unseen tasks from the challenging benchmark Big-Bench Hard. Experimental results indicate that models aligned with the AoT reasoning format substantially outperform those aligned with CoT on many reasoning tasks.