Citations in scholarly work serve the essential purpose of acknowledging and crediting the original sources of knowledge that have been incorporated or referenced. Depending on their surrounding textual context, citations are made with different motivations and purposes. Large Language Models (LLMs) could help capture this fine-grained citation information from the corresponding textual context, thereby enabling a better understanding of the literature. Furthermore, citations also establish connections among scientific papers, providing high-quality inter-document relationships and human-constructed knowledge. Such information could be incorporated into LLM pre-training to improve the text representations of LLMs. Therefore, in this paper, we offer a preliminary review of the mutually beneficial relationship between LLMs and citation analysis. Specifically, we review the application of LLMs to in-text citation analysis tasks, including citation classification, citation-based summarization, and citation recommendation. We then summarize research on leveraging citation linkage knowledge to improve the text representations of LLMs via citation prediction, network structure information, and inter-document relationships. Finally, we provide an overview of these contemporary methods and propose promising avenues for further investigation in combining LLMs and citation analysis.