To ensure that AI-infused systems work for disabled people, we need to bring accessibility datasets sourced from this community into the development lifecycle. However, many ethical and privacy concerns limit greater data inclusion, so such datasets are not readily available. We present a pair of studies in which 13 blind participants engage in data-capturing activities and reflect, with and without probing, on the factors that influence their decision to share their data via an AI dataset. We see how different factors influence blind participants' willingness to share study data as they assess risk-benefit tradeoffs. The majority support sharing their data to improve technology but also express concerns over commercial use, associated metadata, and the lack of transparency about the impact of their data. These insights have implications for the development of responsible practices for stewarding accessibility datasets and can contribute to broader discussions in this area.