Contemporary robotics strategies are largely learning-based. Their black-box nature and lack of interpretability can make stability and safety difficult to guarantee. To address these issues, we propose integrating a deep reinforcement learning (DRL) trajectory planner that generates obstacle-free motion with a novel auto-tuning, low-level (joint-level) control strategy, all of which take part in the learning phase through interaction with the environment. This approach sidesteps much of the computational complexity involved while also handling nonrepetitive and random obstacle-avoidance tasks. First, a model-free DRL agent plans velocity-bounded, obstacle-free motion in task space for a manipulator with n degrees of freedom (DoF) through joint-level reasoning. This plan is then fed into a robust subsystem-based adaptive controller that produces the required torques, while the Cuckoo Search Optimization (CSO) algorithm tunes the control gains to minimize rise time, settling time, maximum overshoot, and steady-state tracking error. The approach guarantees that position and velocity errors converge exponentially to zero in an unknown environment, even when the manipulator model is unknown. Simulation results validate the theoretical claims.
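To make the gain-tuning step concrete, the following is a minimal sketch of cuckoo search applied to controller gains. It is an illustration under stated assumptions, not the controller described above: the plant is a hypothetical unit-mass single joint under plain PD control (standing in for the subsystem-based adaptive controller), the cost is an assumed weighted sum of rise time, settling time, overshoot, and steady-state error, and the names `step_cost`, `levy_step`, and `cuckoo_search`, the gain bounds, and all weights are chosen here for illustration only.

```python
import math
import random


def step_cost(gains, dt=0.01, horizon=5.0):
    """Assumed cost: rise time + settling time + weighted overshoot and
    steady-state error for a unit-mass joint tracking a unit step (toy plant)."""
    kp, kd = gains
    pos, vel = 0.0, 0.0
    rise_time, settle_time, peak = None, 0.0, 0.0
    for k in range(int(horizon / dt)):
        acc = kp * (1.0 - pos) - kd * vel   # PD law on a unit mass
        vel += acc * dt                     # semi-implicit Euler integration
        pos += vel * dt
        t = (k + 1) * dt
        if rise_time is None and pos >= 0.9:
            rise_time = t                   # first time 90% of the target is reached
        if abs(pos - 1.0) > 0.02:
            settle_time = t                 # last time outside the 2% band
        peak = max(peak, pos)
    if rise_time is None:
        rise_time = horizon                 # never reached 90% within the horizon
    overshoot = max(0.0, peak - 1.0)
    steady_error = abs(1.0 - pos)
    return rise_time + settle_time + 10.0 * overshoot + 10.0 * steady_error


def levy_step(rng, beta=1.5):
    """Heavy-tailed step length drawn with Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    return rng.gauss(0, sigma) / abs(rng.gauss(0, 1)) ** (1 / beta)


def cuckoo_search(cost, bounds, n_nests=15, n_iter=60, pa=0.25, seed=0):
    """Basic cuckoo search: Levy-flight candidates plus abandonment of the
    worst fraction pa of nests each generation."""
    rng = random.Random(seed)

    def random_nest():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    nests = [random_nest() for _ in range(n_nests)]
    scores = [cost(n) for n in nests]
    for _ in range(n_iter):
        for i in range(n_nests):
            # Levy flight from nest i, clipped to the search bounds
            cand = [min(max(x + 0.05 * (hi - lo) * levy_step(rng), lo), hi)
                    for x, (lo, hi) in zip(nests[i], bounds)]
            c = cost(cand)
            j = rng.randrange(n_nests)      # compare against a random nest
            if c < scores[j]:
                nests[j], scores[j] = cand, c
        # Abandon the worst nests and rebuild them at random positions
        order = sorted(range(n_nests), key=scores.__getitem__)
        for j in order[n_nests - int(pa * n_nests):]:
            nests[j] = random_nest()
            scores[j] = cost(nests[j])
    best = min(range(n_nests), key=scores.__getitem__)
    return nests[best], scores[best]


# Hypothetical search ranges for (kp, kd)
GAIN_BOUNDS = [(1.0, 400.0), (0.1, 40.0)]
best_gains, best_cost = cuckoo_search(step_cost, GAIN_BOUNDS)
print("tuned gains:", best_gains, "cost:", best_cost)
```

Because a candidate replaces a nest only when it scores better and only the worst nests are abandoned, the best cost in the population never increases from one generation to the next. In the setting described above, `step_cost` would be replaced by a rollout of the actual controller on the planned trajectory.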