Keyphrase prediction aims to generate phrases (keyphrases) that concisely summarize a given document. Recently, researchers have studied this task in depth from various perspectives. In this paper, we comprehensively summarize representative studies in terms of dominant models, datasets, and evaluation metrics. Our work analyzes as many as 167 previous works, achieving greater coverage of this task than previous surveys. In particular, we focus on deep learning-based keyphrase prediction, which has attracted increasing attention in recent years. We then conduct several groups of experiments to carefully compare representative models. To the best of our knowledge, our work is the first attempt to compare these models using the same commonly used datasets and evaluation metric, facilitating in-depth analyses of their advantages and disadvantages. Finally, we discuss possible future research directions for this task.