This paper introduces a dataset for improving real-time object recognition systems that assist blind and low-vision (BLV) individuals with navigation. The dataset comprises 21 videos of BLV individuals navigating outdoor spaces and a taxonomy of 90 objects crucial for BLV navigation, refined through a focus group study. We also provide object labels for the 90 objects across 31 video segments created from the 21 videos. Our analysis reveals that most contemporary datasets used to train computer vision models contain only a small subset of the objects in our taxonomy. A preliminary evaluation of state-of-the-art computer vision models on our dataset highlights shortcomings in accurately detecting key objects relevant to BLV navigation, emphasizing the need for specialized datasets. We make our dataset publicly available as a resource for developing more inclusive navigation systems for BLV individuals.