Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks, such as chain-of-thought (CoT) reasoning. However, most existing approaches to enhancing this ability rely heavily on data-driven methods, while neglecting the structural aspects of the model's reasoning capacity. To encourage more structured generation of CoT steps, we propose a hierarchical generation scheme: we let the LM generate a planning token at the start of each reasoning step, intuitively serving as a high-level plan of the current step, and add the planning tokens' embeddings to the model parameters. Our approach requires a negligible increase in trainable parameters (0.001%) and can be applied through either full fine-tuning or a more parameter-efficient scheme. We demonstrate our method's effectiveness by applying it to three different LLMs, showing notable accuracy improvements over standard fine-tuning baselines across three math word problem datasets and one multi-hop QA dataset.
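The core mechanism can be illustrated with a small sketch: new planning-token ids are routed to a separate, trainable embedding table appended on top of a frozen base vocabulary, so only the added rows contribute trainable parameters. All names and sizes below (`vocab_size`, `n_plan_tokens`, `d_model`, the routing helper) are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

# Illustrative sizes; a real LM would have far more parameters,
# making the added fraction correspondingly smaller.
vocab_size, n_plan_tokens, d_model = 32000, 8, 1024

# Stand-in for the pretrained LM's (frozen) embedding table.
base_embed = nn.Embedding(vocab_size, d_model)
base_embed.weight.requires_grad_(False)

# New rows for the planning tokens: in the parameter-efficient
# variant, only these embeddings are trained.
plan_embed = nn.Embedding(n_plan_tokens, d_model)

def embed(token_ids: torch.Tensor) -> torch.Tensor:
    """Look up embeddings, routing ids >= vocab_size to the planning table."""
    is_plan = token_ids >= vocab_size
    out = base_embed(token_ids.clamp(max=vocab_size - 1))
    if is_plan.any():
        out[is_plan] = plan_embed(token_ids[is_plan] - vocab_size)
    return out

# A reasoning step whose first token is planning token id vocab_size + 2.
step = torch.tensor([vocab_size + 2, 17, 42, 99])
print(embed(step).shape)        # torch.Size([4, 1024])

# Count of newly added trainable parameters.
added = sum(p.numel() for p in plan_embed.parameters())
print(added)                    # 8192
```

In practice one would instead extend the tokenizer and resize the model's embedding matrix, but this routing view makes the parameter accounting explicit: the trainable increment is just `n_plan_tokens × d_model` rows, negligible next to the full model.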