"Scene description" applications, which describe the visual content of a photo, are useful everyday tools for blind and low vision (BLV) people. Prior research has studied their use, but only for applications that rely on remote sighted assistants; little is known about applications that use AI to generate descriptions. To investigate the use cases for AI-powered scene description, we conducted a two-week diary study in which 16 BLV participants used an AI-powered scene description application that we designed. Through diary entries and follow-up interviews, participants shared their information goals and their assessments of the visual descriptions they received. Our analysis of the entries revealed frequent use cases, such as identifying visual features of known objects, and surprising ones, such as avoiding contact with dangerous objects. Participants also rated the descriptions relatively low on average, 2.76 out of 5 (SD=1.49) for satisfaction and 2.43 out of 4 (SD=1.16) for trust, which indicates that descriptions still need significant improvement to deliver satisfying and trustworthy experiences. We discuss future opportunities for AI as it becomes a more powerful accessibility tool for BLV users.