Controlling hand exoskeletons for assisting impaired patients in grasping tasks is challenging because it is difficult to infer user intent. We hypothesize that the majority of daily grasping tasks fall into a small set of categories, or modes, which can be inferred through real-time analysis of environmental geometry from 3D point clouds. This paper presents a low-cost, real-time system for semantic image labeling of household scenes with the objective of informing and assisting activities of daily living. The system consists of a miniature depth camera, an inertial measurement unit, and a microprocessor. It achieves 85% or higher accuracy in classifying predefined modes while processing complex 3D scenes at over 30 frames per second. Within each mode it can detect and localize graspable objects. Grasping points can be correctly estimated, on average, to within 1 cm for simple object geometries. The system has potential applications in robot-assisted rehabilitation as well as manual task assistance.