Chinese Spelling Check (CSC) is a meaningful task in Natural Language Processing (NLP) that aims to detect and correct spelling errors in Chinese text. However, CSC models are built on pretrained language models, which are trained on general corpora. Consequently, their performance may drop on downstream tasks that involve domain-specific terms. In this paper, we thoroughly evaluate the domain adaptation ability of several typical CSC models. We build three new datasets rich in domain-specific terms from the financial, medical, and legal domains, and then conduct empirical investigations on the corresponding domain-specific test sets to assess how well these models adapt across domains. We also test the popular large language model ChatGPT. Our experiments show that the performance of the CSC models drops significantly in the new domains.