We present our submission to the unconstrained subtask of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages, which covers morphological annotation, POS tagging, lemmatization, and character- and word-level gap-filling. We developed a simple, uniform, and computationally lightweight approach based on the adapters framework and parameter-efficient fine-tuning. We applied the same approach across all tasks and 16 languages by fine-tuning stacked language- and task-specific adapters. Our submission placed second overall out of three submissions and first in word-level gap-filling. Our results show that language models pre-trained on modern languages can feasibly be adapted to historical and ancient languages via adapter training.