As voice assistants become commonplace, they still need to serve a diverse linguistic landscape, including colloquial forms of low-resource languages. Our study introduces the first comprehensive dataset for intent detection and slot filling in formal Bangla, colloquial Bangla, and Sylheti, totaling 984 samples across 10 unique intents. Our analysis shows that large language models remain robust on downstream tasks when data is limited. The GPT-3.5 model achieves an F1 score of 0.94 in intent detection and 0.51 in slot filling for colloquial Bangla.