Event Causality Identification (ECI) has become a crucial task in Natural Language Processing (NLP), aimed at automatically extracting causal relations between events from textual data. In this survey, we systematically examine the foundational principles, technical frameworks, and challenges of ECI, offering a comprehensive taxonomy that categorizes and clarifies current research methodologies, as well as a quantitative assessment of existing models. We first establish a conceptual framework for ECI, outlining key definitions, problem formulations, and evaluation standards. Our taxonomy classifies ECI methods according to the two primary tasks of sentence-level (SECI) and document-level (DECI) event causality identification. For SECI, we examine feature pattern-based matching, deep semantic encoding, causal knowledge pre-training and prompt-based fine-tuning, and external knowledge enhancement methods. For DECI, we highlight approaches based on event graph reasoning and prompt-based techniques that address the complexity of cross-sentence causal inference. Additionally, we analyze the strengths, limitations, and open challenges of each approach. We further conduct an extensive quantitative evaluation of various ECI methods on two benchmark datasets. Finally, we explore future research directions, highlighting promising pathways to overcome current limitations and broaden ECI applications.