EasyRead pictograms are simple, visually clear images that represent specific concepts and support comprehension for people with intellectual disabilities, low literacy, or language barriers. Large-scale production of EasyRead content has traditionally been constrained by the cost and expertise required to design pictograms manually. Automatic generation of such images could significantly reduce production time and cost, enabling broader accessibility across digital and printed materials. However, modern diffusion-based image generation models tend to produce outputs with excessive visual detail and an unstable style across random seeds, which limits their suitability for clear and consistent pictogram generation. This challenge highlights the need for methods specifically tailored to accessibility-oriented visual content. In this work, we present a unified pipeline for generating EasyRead pictograms by fine-tuning a Stable Diffusion model with LoRA adapters on a curated corpus that combines augmented samples from multiple pictogram datasets. Because EasyRead pictograms lack a unified formal definition, we introduce an EasyRead score to benchmark pictogram quality and consistency. Our results demonstrate that diffusion models can be effectively steered toward producing coherent EasyRead-style images, indicating that generative models can serve as practical tools for scalable and accessible pictogram production.