This paper studies the container-lifting phase of a waste-container recycling task in urban environments, performed by a hydraulic loader crane equipped with an underactuated discharge unit, and proposes a residual reinforcement learning (RRL) approach that combines a nominal Cartesian controller with a learned residual policy. All experiments are conducted in simulation. The task is characterized by geometric tolerances between the discharge-unit hooks and the container rings that are tight relative to the overall crane scale, making precise trajectory tracking and swing suppression essential. The nominal controller uses admittance control for trajectory tracking and pendulum-aware swing damping, followed by damped least-squares inverse kinematics with a nullspace posture term to generate joint velocity commands. A residual policy trained with PPO in Isaac Lab compensates for unmodeled dynamics and parameter variations, improving precision and robustness without requiring end-to-end learning from scratch. We further employ randomized episode initialization and domain randomization over payload properties, actuator gains, and passive joint parameters to enhance generalization. Simulation results demonstrate improved tracking accuracy, reduced oscillations, and higher lifting success rates compared to the nominal controller alone.
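To make the control structure described in the abstract concrete, the following is a minimal sketch of damped least-squares inverse kinematics with a nullspace posture term, followed by an additive residual correction. It is not the paper's implementation: the planar three-joint arm stands in for the crane, all function names and parameters (`damping`, `k_null`, `scale`, `limit`) are hypothetical, and the additive, scaled, and clipped residual is only an assumed (though common) way of combining a nominal command with a learned action. The admittance and swing-damping stages are omitted; `v_task` is taken as given.

```python
import math


def planar_jacobian(q, link_lengths):
    """2xN position Jacobian of a planar serial arm (hypothetical stand-in for the crane)."""
    n = len(q)
    # Cumulative joint angles give each link's absolute orientation.
    angles, total = [], 0.0
    for qi in q:
        total += qi
        angles.append(total)
    J = [[0.0] * n, [0.0] * n]
    for j in range(n):
        # Joint j moves every link from index j outward.
        J[0][j] = -sum(link_lengths[k] * math.sin(angles[k]) for k in range(j, n))
        J[1][j] = sum(link_lengths[k] * math.cos(angles[k]) for k in range(j, n))
    return J


def dls_nullspace_velocities(J, v_task, q, q_rest, damping=0.05, k_null=1.0):
    """Joint velocities qdot = J# v + (I - J# J) z, with J# = J^T (J J^T + damping^2 I)^-1.

    z = k_null * (q_rest - q) pulls the arm toward a rest posture. Because J# is
    damped, (I - J# J) is only an approximate nullspace projector, so a small part
    of the posture term leaks into task space when damping is large.
    """
    n = len(q)
    # A = J J^T + damping^2 I is 2x2 and invertible for any damping > 0.
    a = sum(J[0][j] * J[0][j] for j in range(n)) + damping ** 2
    b = sum(J[0][j] * J[1][j] for j in range(n))
    d = sum(J[1][j] * J[1][j] for j in range(n)) + damping ** 2
    det = a * d - b * b
    inv = [[d / det, -b / det], [-b / det, a / det]]
    # Damped pseudoinverse J# (N x 2).
    J_pinv = [[J[0][j] * inv[0][c] + J[1][j] * inv[1][c] for c in range(2)]
              for j in range(n)]
    # Posture term and its image in task space.
    z = [k_null * (q_rest[j] - q[j]) for j in range(n)]
    Jz = [sum(J[r][j] * z[j] for j in range(n)) for r in range(2)]
    # Algebraically identical form: qdot = J# (v - J z) + z.
    e = [v_task[r] - Jz[r] for r in range(2)]
    return [J_pinv[j][0] * e[0] + J_pinv[j][1] * e[1] + z[j] for j in range(n)]


def add_residual(qdot_nominal, residual_action, scale=0.1, limit=1.0):
    """Assumed residual composition: nominal + scale * action, clipped to +/- limit."""
    return [max(-limit, min(limit, qn + scale * act))
            for qn, act in zip(qdot_nominal, residual_action)]


if __name__ == "__main__":
    q = [0.3, 0.5, -0.4]
    J = planar_jacobian(q, [1.0, 1.0, 1.0])
    qdot_nom = dls_nullspace_velocities(J, [0.1, -0.05], q, [0.0, 0.0, 0.0])
    # A trained policy would supply the residual action; zeros leave the nominal command unchanged.
    print(add_residual(qdot_nom, [0.0, 0.0, 0.0]))
```

Keeping the residual small relative to the nominal command (here through `scale`) is one way to let the learned policy correct for unmodeled effects while the model-based controller still determines the bulk of the motion.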