Large language models (LLMs) have recently shown remarkable performance in language tasks and beyond. However, owing to their limited inherent causal reasoning ability, LLMs still face challenges in tasks that demand robust causal reasoning, such as healthcare and economic analysis. As a result, a growing body of research has focused on enhancing the causal reasoning ability of LLMs. Despite this booming research, no survey has yet thoroughly reviewed the challenges, progress, and future directions in this area. To bridge this significant gap, we systematically review the literature on strengthening LLMs' causal reasoning ability. We first introduce the background and motivation of this topic, and then summarise the key challenges in the area. Thereafter, we propose a novel taxonomy to systematically categorise existing methods, together with detailed comparisons within and between classes of methods. Furthermore, we summarise existing benchmarks and evaluation metrics for assessing LLMs' causal reasoning ability. Finally, we outline future research directions for this emerging field, offering insights and inspiration to researchers and practitioners in the area.