Relation extraction (RE) involves identifying the relations between entities in unstructured text. RE serves as the foundation for many natural language processing (NLP) applications, such as knowledge graph completion, question answering, and information retrieval. In recent years, deep neural networks have dominated the field of RE and made notable progress. More recently, large pre-trained language models (PLMs) have pushed the state of the art in RE to a new level. This survey provides a comprehensive review of existing deep learning techniques for RE. First, we introduce RE resources, including RE datasets and evaluation metrics. Second, we propose a new taxonomy that categorizes existing works from three perspectives: text representation, context encoding, and triplet prediction. Third, we discuss several important challenges faced by RE and summarize potential techniques for tackling them. Finally, we outline some promising future directions and prospects for the field. We expect this survey to facilitate researchers' collaborative efforts to tackle the challenges of real-life RE systems.