Exploiting large language models (LLMs) to tackle reasoning tasks has garnered growing attention. However, it remains highly challenging to achieve satisfactory results on complex logical problems, which are characterized by numerous premises in the prompt and require multi-hop reasoning. In particular, the reasoning capabilities of LLMs are brittle to disordered and irrelevant content. In this work, we first examine the underlying mechanism from the perspective of information flow and reveal that LLMs exhibit failure patterns akin to human cognitive biases when dealing with disordered and irrelevant content in reasoning tasks. In contrast to LLMs, however, humans are not significantly impaired by such content, as they have a propensity to distill the most relevant information and systematically organize their thoughts, which aids them in answering questions. Motivated by this observation, we propose a novel reasoning approach named Concise and Organized Perception (COP). COP carefully analyzes the given statements to identify the most pertinent information while efficiently eliminating redundancy. It then prompts the LLM in a more organized form adapted to the model's inference process. By perceiving a concise and organized context, the reasoning abilities of LLMs can be better elicited. Extensive experiments on several popular logical benchmarks (ProofWriter, PrOntoQA, PrOntoQA-OOD, and FOLIO) and a mathematical benchmark (DI-GSM) show that COP significantly outperforms previous state-of-the-art methods.
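Read as a pipeline, the approach described above has two stages: select the premises pertinent to the question and discard the rest, then present the retained premises in an organized form before prompting the LLM. The following is a minimal illustrative sketch of that idea, not the authors' implementation; the function name `cop_prompt` and the word-overlap relevance heuristic are assumptions introduced here for demonstration only.

```python
# Hypothetical sketch of concise-and-organized prompting:
# keep only premises reachable from the question via shared
# content words, then number them in selection order.

def cop_prompt(premises, question):
    """Build a pruned, organized prompt from raw premises (illustrative)."""
    frontier = set(question.lower().split())  # words seen so far
    selected, remaining = [], list(premises)
    changed = True
    while changed:  # transitively pull in premises linked to the question
        changed = False
        for p in list(remaining):
            words = set(p.lower().split())
            if words & frontier:  # crude stand-in for pertinence analysis
                selected.append(p)
                remaining.remove(p)
                frontier |= words
                changed = True
    # Organize: number retained premises so the model can cite them
    # explicitly during multi-hop reasoning; distractors are dropped.
    body = "\n".join(f"{i + 1}. {p}" for i, p in enumerate(selected))
    return f"Premises:\n{body}\nQuestion: {question}"

prompt = cop_prompt(
    ["cats are animals", "animals are mortal", "the sky is blue"],
    "are cats mortal",
)
```

In this toy example the irrelevant premise "the sky is blue" shares no content words with the question or the selected chain, so it is pruned before the prompt ever reaches the model.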