Test-takers have a say: understanding the implications of the use of AI in language tests

Language tests measure a person's ability to use a language in terms of listening, speaking, reading, or writing. Such tests play an integral role in academic, professional, and immigration domains, where educational institutions, professional accreditation bodies, and governments use them to assess candidates' language proficiency. Recent advances in Artificial Intelligence (AI) and Natural Language Processing have prompted language test providers to explore AI's applicability to language testing, leading to transformative changes in how languages are taught and learned. However, given concerns over AI's trustworthiness, it is imperative to understand the implications of integrating AI into language testing. This understanding will enable stakeholders to make well-informed decisions, thereby safeguarding community well-being and testing integrity. To understand the concerns about and effects of AI usage in language tests, we conducted interviews and surveys with English test-takers. To the best of our knowledge, this is the first empirical study to identify the implications of AI adoption in language tests from the test-taker's perspective. Our study reveals test-takers' perceptions and behavioral patterns. Specifically, we find that AI integration may enhance perceptions of fairness, consistency, and availability. Conversely, it may provoke mistrust regarding reliability and interactivity, which in turn influences the behaviors and well-being of test-takers. These insights provide a better understanding of the potential societal implications and help stakeholders make informed decisions about AI usage in language testing.
