Transformer-based language models, including ChatGPT, have demonstrated exceptional performance on a variety of natural language generation tasks. However, little research has evaluated ChatGPT's keyphrase generation ability, that is, its ability to identify informative phrases that accurately reflect a document's content. This study addresses that gap by comparing ChatGPT's keyphrase generation performance with that of state-of-the-art models, and by testing its potential as a solution to two significant challenges in the field: domain adaptation and keyphrase generation from long documents. We conducted experiments on six publicly available datasets from the scientific-article and news domains, analyzing performance on both short and long documents. Our results show that ChatGPT outperforms current state-of-the-art models on all tested datasets and in all tested environments, generating high-quality keyphrases that adapt well to diverse domains and document lengths.