Trained on vast corpora of human language, language models demonstrate emergent, human-like reasoning abilities. Yet they remain far from true intelligence, which opens intriguing opportunities to explore the parallels between human and model behaviors. In this work, we study the ability to skip steps in reasoning, a hallmark of human expertise developed through practice. Unlike humans, who may skip steps to enhance efficiency or to reduce cognitive load, models do not inherently possess such motivations to minimize reasoning steps. To address this, we introduce a controlled framework that stimulates step-skipping behavior by iteratively refining models to generate shorter yet accurate reasoning paths. Empirical results indicate that models can develop the step-skipping ability under our guidance. Moreover, after fine-tuning on expanded datasets that include both complete and skipped reasoning sequences, the models not only solve tasks more efficiently without sacrificing accuracy, but also exhibit comparable or even enhanced generalization in out-of-domain scenarios. Our work presents the first exploration of human-like step-skipping ability and offers fresh perspectives on how such cognitive behaviors can benefit AI models.