The ongoing shift towards smart manufacturing has increased the demand for human-robot collaboration (HRC) in the manufacturing process. To perform tasks efficiently and effectively in unstructured and dynamic environments, collaborative robots must perceive and understand the behaviour of their human co-workers, which remains challenging. Integrating recent data-driven machine vision capabilities into HRC systems is a logical next step in addressing these challenges. However, off-the-shelf components struggle in these settings because of their limited generalisation. Real-world evaluation is therefore required to fully appreciate the maturity and robustness of these approaches. Furthermore, understanding the limitations of a pure-vision approach is a crucial first step before combining multiple modalities. In this paper, we propose GoferBot, a novel vision-based semantic HRC system for a real-world assembly task. It is composed of a visual servoing module that reaches and grasps assembly parts in an unstructured, multi-instance and dynamic environment; an action recognition module that performs human action prediction for implicit communication; and a visual handover module that uses the perceptual understanding of human behaviour to produce an intuitive and efficient collaborative assembly experience. GoferBot seamlessly integrates all sub-modules by utilising implicit semantic information obtained purely from visual perception.