Large Language Models (LLMs) are trained on corpora disproportionately weighted toward Standard American English. As a result, speakers of other dialects experience significantly more failures when interacting with these technologies. In practice, these speakers often adapt their speech to be better understood. Our work shares the belief that language technologies should be designed to accommodate the diversity of English dialects, and not the other way around. However, prior work on dialects struggles to generalize to evolving and emerging dialects in a scalable manner. To fill this gap, our method, HyperLoRA, leverages expert linguistic knowledge to enable resource-efficient adaptation via hypernetworks. By disentangling dialect-specific and cross-dialectal information, HyperLoRA improves generalization to unseen dialects in a task-agnostic fashion. Not only is HyperLoRA more scalable in the number of parameters, but it also achieves the best or most competitive performance across 5 dialects in a zero-shot setting. In this way, our approach facilitates access to language technology for billions of English dialect speakers who are traditionally underrepresented.
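The general idea of generating low-rank adapters from dialect descriptors can be illustrated with a minimal sketch. This is not the HyperLoRA implementation. It assumes that expert linguistic knowledge is encoded as a numeric feature vector per dialect and that the hypernetwork is a single linear map, neither of which is specified in the abstract. The names `LoRAHypernetwork`, `generate`, and `adapted_forward` are hypothetical, and the weights are random rather than trained.

```python
import random


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix, standing in for trained weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def reshape(flat, rows, cols):
    """Reshape a flat list into a rows x cols matrix."""
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


class LoRAHypernetwork:
    """Hypothetical linear hypernetwork.

    Maps a dialect feature vector to low-rank factors
    A (rank x d_in) and B (d_out x rank). Assumption: one linear
    map per factor; the actual architecture may differ.
    """

    def __init__(self, n_features, d_in, d_out, rank, seed=0):
        rng = random.Random(seed)
        self.d_in, self.d_out, self.rank = d_in, d_out, rank
        self.to_A = rand_matrix(rank * d_in, n_features, rng)
        self.to_B = rand_matrix(d_out * rank, n_features, rng)

    def generate(self, features):
        """Produce adapter factors for one dialect, seen or unseen."""
        A = reshape(matvec(self.to_A, features), self.rank, self.d_in)
        B = reshape(matvec(self.to_B, features), self.d_out, self.rank)
        return A, B


def adapted_forward(W, A, B, x):
    """Compute (W + B A) x without materialising the full update B A."""
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    return [b + u for b, u in zip(base, low_rank)]


# Toy dimensions; a frozen base weight W and an illustrative feature vector.
rng = random.Random(1)
d_in, d_out, rank, n_features = 4, 3, 2, 5
W = rand_matrix(d_out, d_in, rng)
hyper = LoRAHypernetwork(n_features, d_in, d_out, rank)

unseen_dialect = [1.0, 0.0, 1.0, 1.0, 0.0]  # made-up features, not real data
A, B = hyper.generate(unseen_dialect)
x = [1.0, 2.0, 3.0, 4.0]
y = adapted_forward(W, A, B, x)
```

In this sketch the only dialect-dependent input is the feature vector, so the same hypernetwork parameters serve every dialect, including one never seen in training. That is one way to read the scalability and zero-shot claims above, under the stated assumptions.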