Evaluating OCR Performance for Assistive Technology: Effects of Walking Speed, Camera Placement, and Camera Type

Optical character recognition (OCR), which converts printed or handwritten text into machine-readable form, is widely used in assistive technology for people with blindness and low vision. Yet most evaluations rely on static datasets that do not reflect the challenges of mobile use. In this study, we systematically evaluated OCR performance under both static and dynamic conditions. Static tests measured detection range at distances of 1-7 meters and horizontal viewing angles of 0-75 degrees. Dynamic tests examined the impact of motion by varying walking speed from slow (0.8 m/s) to very fast (1.8 m/s) and comparing three camera mounting positions: head-mounted, shoulder-mounted, and hand-held. We evaluated both a smartphone and smart glasses, using the phone's main and ultra-wide cameras. Four OCR engines were benchmarked for accuracy across distances and viewing angles: Google Vision, PaddleOCR 3.0, EasyOCR, and Tesseract. PaddleOCR 3.0 was then used to evaluate accuracy at different walking speeds. Accuracy was computed at the character level using the Levenshtein ratio against manually defined ground truth. Results showed that recognition accuracy declined with increasing walking speed and wider viewing angles. Google Vision achieved the highest overall accuracy, with PaddleOCR close behind as the strongest open-source alternative. Across devices, the phone's main camera achieved the highest accuracy, and shoulder-mounted placement yielded the highest average accuracy among body positions; however, differences among shoulder, head, and hand placements were not statistically significant.
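
To make the character-level metric concrete, the following is a minimal sketch of one common way to turn Levenshtein edit distance into an accuracy score. The abstract does not state which normalization was used, so the formula here (one minus the distance divided by the longer string's length) is an assumption, and the function names `levenshtein_distance` and `char_accuracy` are hypothetical. Some libraries define a "ratio" differently, for example by weighting substitutions as two edits, so their scores may not match this sketch.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to limit row size.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def char_accuracy(predicted: str, truth: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(predicted), len(truth))
    if longest == 0:
        return 1.0  # two empty strings are treated as a perfect match
    return 1.0 - levenshtein_distance(predicted, truth) / longest


if __name__ == "__main__":
    # Illustrative strings only, not data from the study.
    print(char_accuracy("EX1T", "EXIT"))
```

Under this normalization, an OCR output that differs from a four-character ground-truth string by one substituted character scores 0.75, and an exact match scores 1.0.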
