Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents

Large language models (LLMs) have substantially advanced language intelligence, as evidenced by their strong empirical performance across a range of complex reasoning tasks. Theoretical proofs have further shed light on their emergent reasoning capabilities in linguistic contexts. Central to this performance is chain-of-thought (CoT) reasoning, which requires the model to produce intermediate steps on the way to an answer. CoT reasoning not only improves reasoning performance but also enhances interpretability, controllability, and flexibility. In light of these merits, recent work has extended CoT reasoning to the development of autonomous language agents, which follow language instructions and execute actions in varied environments. This survey covers three research dimensions: (i) the foundational mechanics of CoT techniques, with a focus on when and why they are effective; (ii) the paradigm shift in CoT; and (iii) the emergence of language agents strengthened by CoT approaches. Prospective research directions include generalization, efficiency, customization, scaling, and safety. The paper is intended for a wide audience, from beginners seeking a comprehensive introduction to CoT reasoning and language agents to experienced researchers interested in the foundational mechanics and in current discussions of these topics. A repository of related papers is available at https://github.com/Zoeyyao27/CoT-Igniting-Agent.
