Identifying the significant references of a paper is challenging because a citation knowledge graph links papers through citations, authorship, keywords, and other relational attributes. The Paper Source Tracing (PST) task seeks to automate the identification of pivotal references for a given scholarly article using data mining techniques. For KDD CUP 2024, we designed a recommendation-based framework tailored to the PST task. The framework uses the Neural Collaborative Filtering (NCF) model to generate the final predictions, and SciBERT, a pre-trained language model, to process the papers' textual attributes and extract the model's input features. In our experiments, the method achieved a Mean Average Precision (MAP) score of 0.37814, outperforming the baseline models and ranking 11th among all participating teams. The source code is publicly available at https://github.com/MyLove-XAB/KDDCupFinal.
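For readers unfamiliar with the reported metric, the following is a minimal sketch of how Mean Average Precision can be computed for a task of this shape, assuming each paper comes with binary labels over its references (1 for a pivotal reference, 0 otherwise) and one predicted score per reference. The function names and the toy data are illustrative assumptions; the competition's official evaluation script may differ in details such as tie handling.

```python
from typing import List, Sequence, Tuple


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision for one paper.

    labels[i] is 1 if reference i is a pivotal reference, else 0.
    scores[i] is the predicted importance of reference i.
    """
    # Rank references from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx]:
            hits += 1
            # Precision at this rank, accumulated only at relevant positions.
            precision_sum += hits / rank
    # Papers with no positive label contribute 0 here (an assumption).
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    papers: List[Tuple[Sequence[int], Sequence[float]]]
) -> float:
    """Mean of per-paper average precision over (labels, scores) pairs."""
    if not papers:
        return 0.0
    return sum(average_precision(l, s) for l, s in papers) / len(papers)


# Toy data, not taken from the competition.
toy_papers = [
    ([1, 0, 1], [0.9, 0.8, 0.7]),
    ([0, 1], [0.6, 0.4]),
]
toy_map = mean_average_precision(toy_papers)
```

In the toy data, the first paper has pivotal references at ranks 1 and 3 and the second has one at rank 2, so the per-paper values are averaged into a single score in the same way the reported 0.37814 aggregates over all test papers.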