Our hands are a fundamental means of interacting with the world around us, so understanding hand poses and interaction context is critical for human-computer interaction. We present EchoWrist, a low-power wristband that continuously estimates 3D hand pose and recognizes hand-object interactions using active acoustic sensing. EchoWrist is equipped with two speakers that emit inaudible sound waves toward the hand. These sound waves interact with the hand and its surroundings through reflection and diffraction, carrying rich information about the hand's shape and the objects it interacts with. The signals captured by two microphones are passed to a deep learning inference system that recovers hand poses and identifies various everyday hand activities. Results from two 12-participant user studies show that EchoWrist tracks 3D hand poses and recognizes hand-object interactions effectively and efficiently. Operating at 57.9 mW, EchoWrist continuously reconstructs 20 3D hand joints with a mean joint Euclidean distance error (MJEDE) of 4.81 mm and recognizes 12 naturalistic hand-object interactions with 97.6% accuracy.
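To make the reported pose metric concrete, the following is a minimal sketch of how an MJEDE-style error could be computed. It assumes MJEDE denotes the Euclidean distance between predicted and ground-truth joint positions, averaged over all joints and frames. The function name `mjede` and the nested-list data layout are illustrative assumptions, not the authors' evaluation code.

```python
import math


def mjede(pred, truth):
    """Mean Euclidean distance over all joints and frames.

    pred, truth: sequences of frames; each frame is a sequence of
    (x, y, z) joint positions. The result is in the same unit as the
    input coordinates (e.g. mm). This layout is an assumption made
    for illustration.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of frames")
    total, count = 0.0, 0
    for pred_frame, true_frame in zip(pred, truth):
        if len(pred_frame) != len(true_frame):
            raise ValueError("each frame must have the same number of joints")
        for p, t in zip(pred_frame, true_frame):
            total += math.dist(p, t)  # per-joint Euclidean distance
            count += 1
    if count == 0:
        raise ValueError("no joints to evaluate")
    return total / count


# Toy example: one frame, two joints, with errors of 0 and 5 units.
pred = [[(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]]
truth = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
print(mjede(pred, truth))  # 2.5
```

In this toy case the two joints have errors of 0 and 5 units, so the mean is 2.5. Under the same assumed definition, each frame in the setting described above would contain 20 joints with coordinates in millimeters.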