With the rapid growth of scientific publications, researchers must spend more time and effort searching for papers that align with their research interests. To address this challenge, paper recommendation systems have been developed to help researchers identify relevant papers effectively. One of the leading approaches to paper recommendation is content-based filtering. Traditional content-based filtering methods recommend papers to users based on the overall similarity between papers. However, these approaches do not take into account the information-seeking behaviors that users commonly employ when searching for literature. Such behaviors include not only evaluating the overall similarity among papers, but also focusing on specific sections, such as the method section, to ensure that the approach aligns with the user's interests. In this paper, we propose a content-based filtering recommendation method that takes this information-seeking behavior into account. Specifically, in addition to considering the overall content of a paper, our approach also considers three specific sections (background, method, and results) and assigns weights to them to better reflect user preferences. We conduct offline evaluations on the publicly available DBLP dataset, and the results demonstrate that the proposed method outperforms six baseline methods in terms of precision, recall, F1-score, MRR, and MAP.
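As an illustration of the section-weighted idea, the following is a minimal sketch, not the authors' implementation. It assumes each paper is already split into four text fields (`overall`, `background`, `method`, `results`) and uses a plain bag-of-words cosine similarity as a stand-in for whatever text representation the method actually uses. The function names, the field names, and the weight values are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical field names: the whole paper plus the three sections
# named in the abstract.
SECTIONS = ("overall", "background", "method", "results")


def bow(text):
    """Bag-of-words term counts (a stand-in for a real text representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def weighted_similarity(query, candidate, weights):
    """Weighted average of per-section similarities between two papers."""
    total = sum(weights[s] for s in SECTIONS)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(
        weights[s] * cosine(bow(query[s]), bow(candidate[s])) for s in SECTIONS
    )
    return score / total


def recommend(query, candidates, weights, k=5):
    """Return the ids of the k candidates most similar to the query paper."""
    scored = [
        (weighted_similarity(query, paper, weights), pid)
        for pid, paper in candidates.items()
    ]
    # Sort by descending score, breaking ties by id for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pid for _, pid in scored[:k]]
```

Under these assumptions, shifting weight from `overall` to `method` changes the ranking in favor of papers whose method sections resemble the query's, which is the behavior the abstract describes for users who care mainly about the approach.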