Anchor-bolt insertion is a peg-in-hole task performed on construction sites, where the holes are in concrete. Efforts have been made to automate this task, but variable lighting and hole surface conditions, together with the requirement for short setup and task execution times, make automation challenging. In this study, we introduce a robot control model for this task, driven by vision and proprioceptive data, that is robust to challenging lighting and hole surface conditions. The model consists of a spatial attention point network (SAP) and a deep reinforcement learning (DRL) policy, which are trained jointly end-to-end to control the robot. The model is trained offline with a sample-efficient framework designed to reduce training time and minimize the reality gap when transferring the model to the physical world. Through evaluations with an industrial robot performing the task in 12 unknown holes, starting from 16 different initial positions, and under three different lighting conditions (two with misleading shadows), we demonstrate that SAP generates relevant attention points in the image even under challenging lighting. We also show that the proposed model achieves a higher success rate and a shorter task completion time than various baselines. Because the proposed model remains effective under severe lighting, initial-position, and hole conditions, and because the offline training framework is sample-efficient and requires little training time, this approach can be readily applied to construction.
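The abstract does not specify how SAP turns image features into attention points or how those points reach the DRL policy. As a rough illustration only, the sketch below assumes a spatial soft-argmax, a common way to extract one 2D point per feature map, and concatenates the resulting points with proprioceptive readings to form a policy input. The function names `spatial_soft_argmax` and `policy_input`, the `temperature` parameter, and the [-1, 1] coordinate convention are all hypothetical and not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def spatial_soft_argmax(
    feature_map: Sequence[Sequence[float]], temperature: float = 1.0
) -> Tuple[float, float]:
    """Return the softmax-weighted mean (x, y) of one feature map.

    Coordinates are normalized to [-1, 1]. This is an assumed stand-in
    for an attention-point extractor, not the paper's SAP.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    # Subtract the maximum activation before exponentiating, for stability.
    peak = max(max(row) for row in feature_map)
    weights = [
        [math.exp((value - peak) / temperature) for value in row]
        for row in feature_map
    ]
    total = sum(sum(row) for row in weights)

    x = 0.0
    y = 0.0
    for r, row in enumerate(weights):
        # Map the row index to [-1, 1]; a single-row map sits at 0.
        y_coord = 2.0 * r / (height - 1) - 1.0 if height > 1 else 0.0
        for c, weight in enumerate(row):
            x_coord = 2.0 * c / (width - 1) - 1.0 if width > 1 else 0.0
            probability = weight / total
            x += probability * x_coord
            y += probability * y_coord
    return x, y


def policy_input(
    feature_maps: Sequence[Sequence[Sequence[float]]],
    proprioception: Sequence[float],
) -> List[float]:
    """Concatenate one attention point per feature map with proprioceptive data."""
    observation: List[float] = []
    for feature_map in feature_maps:
        observation.extend(spatial_soft_argmax(feature_map))
    observation.extend(proprioception)
    return observation
```

With K feature maps and P proprioceptive values, `policy_input` returns a vector of length 2K + P. Such a low-dimensional observation is one plausible reason a policy could be trained with few samples, although the paper's actual design may differ.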