In robotics, many contemporary strategies are learning-based and behave as complex black boxes with limited interpretability, which makes stability and safety difficult to guarantee. To address these issues, we propose integrating a deep reinforcement learning (DRL) planner that generates obstacle-free trajectories with a novel auto-tuning, low-level, joint-level control strategy, both of which take part in the learning phase through interaction with the environment. This approach circumvents computational complexity while also handling nonrepetitive and random obstacle avoidance tasks. First, a model-free DRL agent is employed to plan velocity-bounded, obstacle-free motion in task space for a manipulator with n degrees of freedom (DoF) through joint-level reasoning. This plan is then fed to a robust subsystem-based adaptive controller, which produces the necessary torques, while the Cuckoo Search Optimization (CSO) algorithm tunes the control gains to minimize the rise time, settling time, maximum overshoot, and steady-state tracking error. The approach guarantees that position and velocity errors converge exponentially to zero despite variations in the initial and end points, unknown modeling errors, and external disturbances. Simulation results validate the theoretical claims.
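To make the gain-tuning step concrete, the following is a minimal sketch of a cuckoo search that tunes two control gains against the four criteria named in the abstract. Everything specific here is an assumption for illustration, not the authors' method: the plant is a unit-mass double integrator under PD control standing in for the subsystem-based adaptive controller, the names `step_metrics`, `cost`, `levy_step`, and `cuckoo_search` are hypothetical, the cost weights are arbitrary, and the Lévy-flight update is a simplified variant that scales each step by the search range.

```python
import math
import random


def step_metrics(kp, kd, dt=0.002, t_end=3.0):
    """Unit-step response of a unit-mass double integrator under PD control
    (an assumed stand-in plant). Returns (rise, settle, overshoot, ss_error)."""
    x, v = 0.0, 0.0
    rise, settle, peak = None, 0.0, 0.0
    for k in range(int(t_end / dt)):
        u = kp * (1.0 - x) - kd * v      # PD law on the tracking error
        v += u * dt                      # semi-implicit Euler integration
        x += v * dt
        t = (k + 1) * dt
        if rise is None and x >= 0.9:    # first time 90% of the target is reached
            rise = t
        if abs(1.0 - x) > 0.02:          # last time outside the 2% band
            settle = t
        peak = max(peak, x)
    rise = t_end if rise is None else rise
    return rise, settle, max(0.0, peak - 1.0), abs(1.0 - x)


def cost(gains):
    """Weighted sum of the four criteria; the weights are illustrative."""
    rise, settle, overshoot, ss_error = step_metrics(*gains)
    return rise + settle + 10.0 * overshoot + 10.0 * ss_error


def levy_step(rng, beta=1.5):
    """Heavy-tailed step length drawn with Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0) or 1e-12     # guard against an exact zero
    return u / abs(v) ** (1 / beta)


def cuckoo_search(objective, bounds, n_nests=12, n_iter=30, pa=0.25,
                  alpha=0.05, seed=0):
    """Simplified cuckoo search; returns (best_point, best_cost, history)."""
    rng = random.Random(seed)

    def clip(p):
        return [min(max(x, lo), hi) for x, (lo, hi) in zip(p, bounds)]

    def random_nest():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    nests = [random_nest() for _ in range(n_nests)]
    costs = [objective(p) for p in nests]
    history = []
    for _ in range(n_iter):
        for i in range(n_nests):
            # Levy flight from nest i, scaled by the width of each bound
            trial = clip([x + alpha * (hi - lo) * levy_step(rng)
                          for x, (lo, hi) in zip(nests[i], bounds)])
            trial_cost = objective(trial)
            j = rng.randrange(n_nests)   # compare against a random nest
            if trial_cost < costs[j]:
                nests[j], costs[j] = trial, trial_cost
        # Abandon the worst fraction pa of nests; the best nest is kept
        order = sorted(range(n_nests), key=costs.__getitem__)
        for j in order[n_nests - int(pa * n_nests):]:
            nests[j] = random_nest()
            costs[j] = objective(nests[j])
        history.append(min(costs))
    best = min(range(n_nests), key=costs.__getitem__)
    return nests[best], costs[best], history


if __name__ == "__main__":
    gain_bounds = [(1.0, 400.0), (0.1, 40.0)]   # assumed ranges for kp, kd
    gains, best_cost, _ = cuckoo_search(cost, gain_bounds)
    print("kp, kd:", gains, "cost:", best_cost)
```

Because only the worst nests are abandoned, the best cost never increases from one iteration to the next. In a setting like the one the abstract describes, the objective would instead be evaluated on the closed-loop manipulator response, and the decision vector would hold the adaptive controller's gains rather than `kp` and `kd`.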