A Wearable Multimodal Ultrasound+Inertial System for Real-Time Virtual Reality Interaction

A-mode ultrasound (US) is a promising sensing modality for Virtual Reality (VR) interaction, as it enables the mapping of muscular activity into control commands while retaining the benefits of wearable sensing. However, existing approaches remain limited in wearability and interaction complexity, and often rely on external hardware such as cameras. In this work, we propose a fully wearable multimodal interface for real-time VR interaction, based on concurrent US and inertial (accelerometry) sensing from the forearm and upper arm. The system is built on the WULPUS platform and integrates an end-to-end software framework for real-time acquisition, visualization, and communication with a Unity-based VR environment. A multimodal learning pipeline is introduced for concurrent estimation of hand pose and forearm position in 2D space. The interface is evaluated through offline and online experiments with five subjects performing three functional tasks: cylinder grasping (gross motor) and relocation, marble pinching (fine motor) and relocation, and liquid pouring. For the offline experiments, we collect five acquisition sessions across multiple days, achieving an average inter-session accuracy across subjects of 80$\pm$6\% for hand pose estimation and 77$\pm$7\% for forearm position estimation. Online validation with brief fine-tuning (5~min) yields success rates of 92.0$\pm$16.0\%, 88.0$\pm$9.8\%, and 96.0$\pm$8.0\% for the three tasks, respectively. With a power consumption of only 19.9~mW, the system supports more than 2.5 days of continuous use on a small 350~mAh LiPo battery without recharging, enabling truly wearable, multimodal, and functionally meaningful VR interaction.
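
As a rough sanity check on the battery-life figure, the arithmetic can be sketched as follows. The nominal cell voltage $V_{\mathrm{nom}} = 3.7$~V is an assumption (a typical value for a single LiPo cell, not stated above), and the estimate ignores regulator losses and assumes the full rated capacity is usable:

\[
E \approx 350~\mathrm{mAh} \times 3.7~\mathrm{V} \approx 1295~\mathrm{mWh},
\qquad
t \approx \frac{E}{P} = \frac{1295~\mathrm{mWh}}{19.9~\mathrm{mW}} \approx 65~\mathrm{h} \approx 2.7~\mathrm{days}.
\]

Under these assumptions, the result is consistent with the stated figure of more than 2.5 days of continuous use.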
