Simulation-based Learning of Electrical Cabinet Assembly Using Robot Skills

This paper presents a simulation-driven approach for automating the force-controlled assembly of electrical terminals on DIN-rails, a task traditionally hindered by high programming effort and product variability. The proposed method integrates deep reinforcement learning (DRL) with parameterizable robot skills in a physics-based simulation environment. To realistically model the snap-fit assembly process, we develop and evaluate two types of joining models: analytical models based on beam theory and rigid-body models implemented in the MuJoCo physics engine. These models enable accurate simulation of interaction forces, essential for training DRL agents. The robot skills are structured using the pitasc framework, allowing modular, reusable control strategies. Training is conducted in simulation using Soft Actor-Critic (SAC) and Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithms. Domain randomization is applied to improve robustness. The trained policies are transferred to a physical UR10e robot system without additional tuning. Experimental results demonstrate high success rates (up to 100%) in both simulation and real-world settings, even under significant positional and rotational deviations. The system generalizes well to new terminal types and positions, significantly reducing manual programming effort. This work highlights the potential of combining simulation-based learning with modular robot skills for flexible, scalable automation in small-batch manufacturing. Future work will explore hybrid learning methods, automated environment parameterization, and further refinement of joining models for design integration.
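As a rough illustration of what an analytical joining model based on beam theory can look like, the sketch below treats the snap-fit latch as a cantilever and uses the textbook Euler-Bernoulli relation F = 3·E·I·δ / L³ for the force needed to deflect its tip by δ. The abstract does not give the paper's actual equations, so the class `SnapFitBeam`, the helper `randomized_beam`, and all numeric values are hypothetical. The randomization step only hints at how domain randomization might perturb such a model between training episodes.

```python
import random
from dataclasses import dataclass


@dataclass
class SnapFitBeam:
    """Hypothetical cantilever approximation of a snap-fit latch (SI units)."""
    youngs_modulus: float  # E in Pa
    width: float           # rectangular cross-section width in m
    thickness: float       # cross-section thickness in bending direction, m
    length: float          # free beam length L in m

    def second_moment(self) -> float:
        # Second moment of area of a rectangular cross-section: I = b * h^3 / 12
        return self.width * self.thickness ** 3 / 12.0

    def deflection_force(self, deflection: float) -> float:
        # Euler-Bernoulli tip load for a cantilever: F = 3 * E * I * delta / L^3
        return (3.0 * self.youngs_modulus * self.second_moment()
                * deflection / self.length ** 3)


def randomized_beam(nominal: SnapFitBeam, rel_spread: float,
                    rng: random.Random) -> SnapFitBeam:
    """Return a copy whose stiffness is perturbed uniformly by +/- rel_spread.

    Assumption: only Young's modulus is randomized here; a real setup could
    also vary geometry, friction, or pose offsets.
    """
    factor = 1.0 + rng.uniform(-rel_spread, rel_spread)
    return SnapFitBeam(nominal.youngs_modulus * factor, nominal.width,
                       nominal.thickness, nominal.length)


# Illustrative values only: a 10 mm long, 5 mm wide, 1 mm thick plastic latch.
nominal = SnapFitBeam(youngs_modulus=2.0e9, width=0.005,
                      thickness=0.001, length=0.01)
force_nominal = nominal.deflection_force(0.0005)  # 0.5 mm tip deflection
sampled = randomized_beam(nominal, rel_spread=0.1, rng=random.Random(0))
force_sampled = sampled.deflection_force(0.0005)
```

With these illustrative numbers the nominal joining force comes out at about 1.25 N, and each randomized sample stays within 10% of it because the force is linear in the modulus.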
