Artificial intelligence is creating a new global linguistic hierarchy

Giulia Occhini, Kumiko Tanaka-Ishii, Anna Barford, Refael Tikochinski, Songbo Hu, Roi Reichart, Yijie Zhou, Hannah Claus, Ulla Petti, Ivan Vulić, Ramit Debnath, Anna Korhonen

Artificial intelligence (AI) has the potential to transform healthcare, education, governance and socioeconomic equity, but its benefits remain concentrated in a small number of languages (Bender, 2019; Blasi et al., 2022; Joshi et al., 2020; Ranathunga and de Silva, 2022; Young, 2015). Language AI, the set of technologies that underpin widely used conversational systems such as ChatGPT, could provide major benefits if it were available in people's native languages, yet most of the world's more than 7,000 linguistic communities currently lack access and face persistent digital marginalization. Here we present a global longitudinal analysis of social, economic and infrastructural conditions across languages to assess systemic inequalities in language AI. We first analyze the existence of AI resources for 6,003 languages. We find that, despite community efforts to broaden the reach of language technologies (Bapna et al., 2022; Costa-Jussà et al., 2022), the dominance of a handful of languages is exacerbating disparities on an unprecedented scale, with divides widening exponentially rather than narrowing. Further, we contrast the longitudinal diffusion of AI with that of earlier information technologies, revealing a distinctive hype-driven pattern of spread. To translate our findings into practical insights and guide prioritization efforts, we introduce the Language AI Readiness Index (EQUATE), which maps the state of technological, socio-economic and infrastructural prerequisites for AI deployment across languages. The index highlights communities where capacity exists but remains underutilized, and provides a framework for accelerating more equitable diffusion of language AI. Our work contributes to setting the baseline for a transition towards more sustainable and equitable language technologies.

