Many NLP tasks, although largely solved for general English, remain challenging in specialized domains such as fantasy literature. Named Entity Recognition (NER), which detects and categorizes entities in text, is a clear example. We evaluated 10 NER models on 7 Dungeons and Dragons (D&D) adventure books to assess their domain-specific performance. Using open-source Large Language Models, we annotated the named entities in these books and evaluated each model's precision. Our findings indicate that, without modifications, Flair, Trankit, and spaCy outperform the other models at identifying named entities in the D&D context.
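To make the evaluation metric concrete, the following is a minimal sketch of how precision over named entities could be computed. It assumes exact matching on (text, label) pairs, which the abstract does not specify; the function name `entity_precision`, the label set, and the sample entities are all hypothetical and chosen only for illustration.

```python
def entity_precision(predicted, gold):
    """Return the fraction of predicted entities that appear in the gold set.

    Assumption: an entity counts as correct only if both its text and its
    label match a gold annotation exactly. Other matching schemes (partial
    overlap, label-agnostic) would give different numbers.
    """
    predicted_set = set(predicted)
    if not predicted_set:
        # No predictions: define precision as 0.0 to avoid division by zero.
        return 0.0
    true_positives = len(predicted_set & set(gold))
    return true_positives / len(predicted_set)


# Hypothetical annotations and model output, for illustration only.
gold = [("Strahd", "PERSON"), ("Barovia", "LOC"), ("Waterdeep", "LOC")]
predicted = [
    ("Strahd", "PERSON"),    # correct
    ("Barovia", "ORG"),      # right span, wrong label
    ("Waterdeep", "LOC"),    # correct
    ("Sword Coast", "LOC"),  # not in the gold annotations
]

print(entity_precision(predicted, gold))
```

Under exact matching, a prediction with the right text but the wrong label counts as an error, so a model that finds fantasy names but mislabels them is penalized just as if it had missed them.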