Assessing a cited paper's impact is typically done by analyzing its citation context in isolation within the citing paper. While this focuses on the most directly relevant text, it prevents relative comparisons across all the works a paper cites. We propose Crystal, which instead jointly ranks all cited papers within a citing paper using large language models (LLMs). To mitigate LLMs' positional bias, we rank each list three times in a randomized order and aggregate the impact labels through majority voting. This joint approach leverages the full citation context, rather than evaluating citations independently, to more reliably distinguish impactful references. Crystal outperforms a prior state-of-the-art impact classifier by +9.5% accuracy and +8.3% F1 on a dataset of human-annotated citations. Crystal further gains efficiency through fewer LLM calls and performs competitively with an open-source model, enabling scalable, cost-effective citation impact analysis. We release our rankings, impact labels, and codebase to support future research.
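The positional-bias mitigation described above (ranking each citation list three times in randomized order, then aggregating impact labels by majority vote) can be sketched as follows. This is a minimal illustration, not Crystal's released implementation: `mock_llm_label` is a hypothetical stand-in for the actual LLM call, and the `high`/`low` label names are assumed for the example.

```python
import random
from collections import Counter

def mock_llm_label(citations):
    """Hypothetical stand-in for a single LLM call that jointly labels a
    list of cited papers; here it simply echoes each citation's label."""
    return {c["id"]: c["label"] for c in citations}

def majority_vote_labels(citations, runs=3, seed=0):
    """Label the full citation list `runs` times, each time in a freshly
    randomized order, and aggregate per-citation labels by majority vote."""
    rng = random.Random(seed)
    votes = {c["id"]: [] for c in citations}
    for _ in range(runs):
        shuffled = citations[:]
        rng.shuffle(shuffled)  # randomize order to counter positional bias
        for cid, label in mock_llm_label(shuffled).items():
            votes[cid].append(label)
    # majority vote over the collected labels for each cited paper
    return {cid: Counter(ls).most_common(1)[0][0] for cid, ls in votes.items()}

citations = [
    {"id": "P1", "label": "high"},
    {"id": "P2", "label": "low"},
    {"id": "P3", "label": "high"},
]
print(majority_vote_labels(citations))
# → {'P1': 'high', 'P2': 'low', 'P3': 'high'}
```

Because the whole list is passed to one (mock) LLM call per run, three runs cost three calls regardless of how many papers are cited, which is the source of the efficiency gain over per-citation classification.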