In this paper we present our submission to the NorSID Shared Task at the 2025 VarDial Workshop (Scherrer et al., 2025), which comprises three tasks: Intent Detection, Slot Filling, and Dialect Identification, evaluated on data from different dialects of Norwegian. For Intent Detection and Slot Filling, we fine-tuned a multitask model in a cross-lingual setting to leverage the xSID dataset, which is available in 17 languages. For Dialect Identification, our final submission is a model fine-tuned on the provided development set, which obtained the highest scores in our experiments. Our final results on the test set show that our models do not drop in performance compared to the development set, likely due to the domain-specificity of the dataset and the similar distribution of the two subsets. Finally, we report an in-depth analysis of the provided datasets and their artifacts, as well as other experiments that we carried out but that did not yield the best results. Additionally, we analyze why some methods were more successful than others, focusing on the impact of the combination of languages and the domain-specificity of the training data on the results.