Purpose - A quotation error is an inconsistency between cited information and its original source. Such errors misrepresent the original research, undermine the academic community's collective understanding of the issues involved, and weaken the accuracy and fairness of citation-based academic evaluation. Existing studies show that quotation errors are prevalent in the academic community; moreover, verifying them manually is labor-intensive and inefficient. This paper therefore proposes the task of 'automated detection of quotation errors'. Methodology - Using a large language model (LLM)-based approach, this paper builds on existing research to improve detection performance in two ways: first, by fine-tuning LLMs to detect quotation errors; second, by incorporating full-text data from the cited literature into dataset construction and comparing three full-text integration methods to identify the best scheme for building such datasets. The paper then uses the TokenSHAP tool to analyze the interpretability of the model's predictions. Findings - Fine-tuning LLMs improved performance in detecting quotation errors. Among the methods for incorporating full-text information, using the source abstract yielded the best performance. Originality - Fine-tuning of LLMs is applied to the task of automated detection of quotation errors, and an interpretability analysis is conducted on the model's outputs.