This paper proposes Attention-Seeker, an unsupervised keyphrase extraction method that leverages the self-attention maps of a Large Language Model to estimate the importance of candidate phrases. Our approach identifies the specific components (layers, heads, and attention vectors) in which the model pays significant attention to the key topics of the text. The attention weights provided by these components are then used to score the candidate phrases. Unlike previous models that require manual parameter tuning (e.g., selection of heads, prompts, hyperparameters), Attention-Seeker adapts dynamically to the input text without any manual adjustment, which enhances its practical applicability. We evaluate Attention-Seeker on four publicly available datasets: Inspec, SemEval2010, SemEval2017, and Krapivin. Our results demonstrate that, even without parameter tuning, Attention-Seeker outperforms most baseline models, achieving state-of-the-art performance on three of the four datasets and excelling in particular at extracting keyphrases from long documents.
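The core scoring idea described above can be illustrated with a minimal sketch. The function, variable names, and toy attention map below are assumptions for illustration only, not the paper's actual implementation: a candidate phrase is scored by the average attention its tokens receive in a (selected) self-attention map.

```python
# Hedged sketch: scoring candidate phrases with self-attention weights.
# All names and the toy attention map are illustrative assumptions; the
# paper additionally selects which layers/heads/vectors to use.

def score_candidates(attention_map, candidates):
    """Score each candidate span by the average attention its tokens receive.

    attention_map: square matrix (list of lists); attention_map[i][j] is
        the attention that token i pays to token j.
    candidates: dict mapping phrase -> (start, end) token indices,
        with end exclusive.
    """
    n = len(attention_map)
    # Attention received by each token, averaged over all attending tokens.
    received = [sum(row[j] for row in attention_map) / n for j in range(n)]
    # A phrase's score is the mean attention received by its tokens.
    return {
        phrase: sum(received[start:end]) / (end - start)
        for phrase, (start, end) in candidates.items()
    }

# Toy example: a 4-token "document" where token 1 attracts most attention.
attn = [
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
]
cands = {"phrase_a": (1, 2), "phrase_b": (2, 4)}
scores = score_candidates(attn, cands)
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In practice the attention maps would come from a transformer's attention outputs, and the paper's contribution is precisely the unsupervised selection of which components supply these maps.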