Citation recommendation has attracted considerable academic interest, resulting in numerous studies and implementations. These systems suggest relevant references for the text an author has written, helping to generate appropriate citations automatically. However, the methods used vary widely across studies and implementations: some approaches model the overall content of papers, while others rely on the local context of the citing sentence. The datasets used likewise cover different aspects of papers, such as metadata, citation context, or the full text in various formats and structures. This diversity of models, datasets, and evaluation metrics makes it difficult to assess and compare citation recommendation methods. Addressing this requires a standardized dataset and a common set of evaluation metrics against which models can be evaluated consistently. We therefore propose a benchmark designed specifically to analyze and compare citation recommendation models. The benchmark will evaluate model performance on different features of the citation context and provide a comprehensive evaluation across all of these tasks, reporting results in a standardized way. With a benchmark and standardized evaluation metrics, researchers and practitioners in citation recommendation will have a common platform for assessing and comparing models, enabling meaningful comparisons and helping to identify promising approaches for further research and development in the field.
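As a point of reference, the sketch below illustrates two ranking metrics commonly reported in citation recommendation evaluation, Recall@k and Mean Reciprocal Rank (MRR). It is a minimal illustration only; the identifiers and data are hypothetical, and the metric suite actually adopted by the proposed benchmark is not specified here.

```python
# Minimal sketch of two standard ranking metrics for citation recommendation.
# All context and paper identifiers below are hypothetical.
from typing import Dict, List, Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the ground-truth cited papers found in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for paper_id in ranked[:k] if paper_id in relevant)
    return hits / len(relevant)


def mean_reciprocal_rank(runs: Dict[str, List[str]], gold: Dict[str, Set[str]]) -> float:
    """Average over citation contexts of 1 / rank of the first correctly recommended paper."""
    scores = []
    for context_id, ranked in runs.items():
        relevant = gold.get(context_id, set())
        rr = 0.0
        for rank, paper_id in enumerate(ranked, start=1):
            if paper_id in relevant:
                rr = 1.0 / rank
                break
        scores.append(rr)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical model output (ranked recommendations) and ground-truth citations.
runs = {"ctx1": ["p3", "p7", "p1"], "ctx2": ["p9", "p2", "p5"]}
gold = {"ctx1": {"p1"}, "ctx2": {"p2", "p8"}}

print(recall_at_k(runs["ctx1"], gold["ctx1"], k=3))  # 1.0
print(mean_reciprocal_rank(runs, gold))              # (1/3 + 1/2) / 2 ≈ 0.417
```

Reporting such metrics in a uniform way across models and datasets is the kind of standardized output the proposed benchmark aims to provide.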