Current language models have demonstrated the capability to develop basic reasoning, but they struggle with more complex reasoning tasks that require a combination of atomic skills, such as math word problems requiring skills like arithmetic and unit conversion. Previous methods either do not improve the inherent atomic skills of models or do not attempt to generalize the atomic skills to complex reasoning tasks. In this paper, we first propose a probing framework to investigate whether atomic skills can spontaneously generalize to complex reasoning tasks. Then, we introduce a hierarchical curriculum learning training strategy to achieve better skill generalization. In our experiments, we find that atomic skills cannot spontaneously generalize to compositional tasks. By leveraging hierarchical curriculum learning, we successfully induce generalization, significantly improving the performance of open-source LMs on complex reasoning tasks. Promisingly, the skill generalization proves effective in cross-dataset and cross-domain scenarios, and complex reasoning can in turn help enhance atomic skills. Our findings offer valuable guidance for designing better training strategies for complex reasoning tasks.