Logical reasoning over text is an important ability that requires understanding the information present in the text and its interconnections, and then reasoning through them to infer new conclusions. Prior work on improving the logical reasoning ability of language models requires complex processing of training data (e.g., aligning symbolic knowledge to text), yielding task-specific data augmentation solutions that restrict the learning of general logical reasoning skills. In this work, we propose APOLLO, an adaptively pretrained language model with improved logical reasoning abilities. We select a subset of Wikipedia, based on a set of logical inference keywords, for continued pretraining of a language model. We use two self-supervised loss functions: a modified masked language modeling loss in which only words of specific parts of speech, which likely require more reasoning than basic language understanding, are masked, and a sentence-level classification loss that teaches the model to distinguish between entailment and contradiction types of sentences. The proposed training paradigm is both simple and independent of task format. We demonstrate the effectiveness of APOLLO by comparing it with prior baselines on two logical reasoning datasets. APOLLO performs comparably on ReClor and outperforms baselines on LogiQA. The code base is publicly available.
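The keyword-based data selection and the part-of-speech-restricted masking described in the abstract can be illustrated with a minimal sketch. This is not the released APOLLO code: the keyword list `LOGIC_KEYWORDS`, the tag set `MASKABLE_POS`, the masking probability, and the function names are all hypothetical choices made for illustration, and the input is assumed to be already tagged by some part-of-speech tagger. The sentence-level entailment/contradiction loss is not sketched here.

```python
import random

# Hypothetical keyword list; the abstract does not enumerate the actual keywords.
LOGIC_KEYWORDS = {"therefore", "because", "hence", "however", "thus", "unless"}

# Hypothetical set of maskable tags (Universal POS names); an assumption,
# not the set used by the authors.
MASKABLE_POS = {"ADJ", "ADV", "CCONJ", "SCONJ", "VERB"}


def has_logic_keyword(sentence: str) -> bool:
    """Return True if any keyword appears as a whole word in the sentence."""
    words = {w.strip(".,;:!?\"'()").lower() for w in sentence.split()}
    return bool(words & LOGIC_KEYWORDS)


def selective_mask(tagged_tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Mask only tokens whose POS tag is in MASKABLE_POS.

    tagged_tokens: list of (token, pos_tag) pairs from any POS tagger.
    Returns (masked_tokens, labels), where labels holds the original token
    at masked positions and None elsewhere.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for token, pos in tagged_tokens:
        if pos in MASKABLE_POS and rng.random() < mask_prob:
            masked.append(mask_token)
            labels.append(token)  # prediction target for the MLM loss
        else:
            masked.append(token)
            labels.append(None)   # position is ignored by the loss
    return masked, labels
```

In this sketch, `has_logic_keyword` would be used to filter a corpus down to sentences likely to carry logical inference, and `selective_mask` would then produce the inputs and targets for the modified masked language modeling objective.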