Human-robot collaboration helps users complete interactive tasks more efficiently. Nevertheless, most collaborative schemes rely on complicated human-machine interfaces, which can be less intuitive than natural limb control. It is also desirable to infer human intent with little training data. To address these challenges, this paper introduces a human-robot collaborative framework that integrates static hand gesture and dynamic movement recognition, voice recognition, and a switchable control adaptation strategy. Together, these modules provide a user-friendly way for the robot to deliver tools as the user needs them, especially when both of the user's hands are occupied. Users can therefore focus on executing their task without additional training in human-machine interfaces, while the robot interprets their intuitive gestures. The proposed multimodal interaction framework is implemented on a UR5e robot platform equipped with a RealSense D435i camera, and its effectiveness is assessed through a circuit board soldering task. Experimental results show high recognition performance: the static hand gesture recognition module achieves an accuracy of 94.3\%, and the dynamic motion recognition module reaches 97.6\%. Compared with a human working alone, the proposed approach delivers tools more efficiently without significantly distracting users from their intended task.