Rapid advances in artificial intelligence have spurred the growth of digital humanities research. Against this backdrop, the intelligent processing of ancient texts, a crucial component of digital humanities research, needs new research methods to keep pace with the wave of AI-generated content (AIGC). In this study, we propose SikuGPT, a GPT model based on the Siku Quanshu corpus. On tasks such as intralingual translation and text classification, the model outperforms other GPT-type models aimed at processing ancient texts. SikuGPT's ability to process traditional Chinese ancient texts can help promote the organization of information from ancient texts and related knowledge services, as well as the international dissemination of ancient Chinese culture.