Pre-trained large language models (LLMs) have shown an extraordinary capacity to solve reasoning tasks, even tasks that require a complex process involving multiple sub-steps. However, given the vast space of possible generations across all tasks, how a pre-trained model acquires its reasoning ability remains an open question. We first propose that an intrinsic structural constraint on the generated sequences of language-based reasoning, which we call the template-content structure (T-C structure), is the key to explaining why LLMs can solve a large number of complex reasoning problems with limited training data: we show that this structure reduces the possible space from exponential to linear. Furthermore, by generalizing this structure to the hierarchical case, we demonstrate that models can achieve task composition, which further reduces the space that must be learned from linear to logarithmic and thereby enables effective learning of complex reasoning involving multiple steps. We provide both examples and a formal theory of the T-C structure. We also experimentally validate the existence of the T-C structure in some current LLMs and its effectiveness for reasoning.