Large Language Models for Web Accessibility: A Systematic Literature Review

Web accessibility aims to ensure that web content and services are usable by people with diverse abilities. In recent years, Large Language Models (LLMs) have been increasingly explored to support accessibility-related tasks on the web, such as content generation, issue detection, and remediation. However, little is known about the characteristics of these approaches, the accessibility issues they target, the standards they follow, and how they are evaluated. In this paper, we present a systematic literature review of 38 peer-reviewed studies that investigate the use of LLMs in web accessibility contexts. We first perform a comprehensive search of scientific publications to identify relevant studies. We then conduct a comparative analysis of the accessibility tasks addressed, the LLMs and prompting strategies employed, the system architectures adopted, the accessibility issues and guidelines considered, and the evaluation methods used across studies. Our findings show that most studies apply LLMs to text-centric and structurally explicit accessibility tasks, with the Web Content Accessibility Guidelines (WCAG) serving as the primary reference framework and only limited consideration of cognitive accessibility guidelines (COGA). The reviewed approaches rely predominantly on general-purpose LLMs and prompt-based interactions, while evaluation practices vary widely and often lack the direct involvement of users with disabilities. We intend this review to serve as a consolidated reference for researchers and practitioners seeking to understand the current landscape of LLM-supported web accessibility, and as a foundation to guide future research and tool development in this area.
