Mobile manipulators, an important branch of embodied artificial intelligence, are increasingly deployed in intelligent service applications, but their redundant degrees of freedom make efficient motion planning in cluttered environments difficult. To address this issue, this paper proposes a hybrid learning and optimization framework for reactive whole-body motion planning of mobile manipulators. We develop the Bayesian distributional soft actor-critic (Bayes-DSAC) algorithm to improve the quality of value estimation and the convergence of learning. In addition, we introduce a quadratic programming method constrained by a signed distance field to enhance the safety of obstacle-avoidance motions. We conduct experiments and compare the framework against a standard benchmark. The results show that the proposed framework significantly improves the efficiency of reactive whole-body motion planning, reducing planning time and increasing the planning success rate. The proposed reinforcement learning method also learns rapidly on the whole-body planning task. The framework allows mobile manipulators to adapt to complex environments more safely and efficiently.
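To make the idea of a quadratic program constrained by a signed distance field concrete, the following is a minimal sketch and not the formulation used in this paper. It assumes a 2D point robot, a single circular obstacle, and one linearized distance constraint, in which case the quadratic program has a closed-form solution. The names `circle_sdf`, `safe_velocity`, `d_safe`, and `gamma` are hypothetical and chosen only for illustration.

```python
import math


def circle_sdf(p, center, radius):
    """Signed distance from point p to a circular obstacle, with its gradient.

    Assumes p does not coincide with the obstacle center.
    """
    dx, dy = p[0] - center[0], p[1] - center[1]
    dist = math.hypot(dx, dy)
    # The gradient of the distance is the unit vector pointing away from the center.
    grad = (dx / dist, dy / dist)
    return dist - radius, grad


def safe_velocity(u_nom, p, center, radius, d_safe=0.2, gamma=1.0):
    """Solve  min ||u - u_nom||^2  s.t.  grad_d . u >= -gamma * (d - d_safe).

    With a single linear constraint, the minimizer is either u_nom itself
    or its projection onto the constraint boundary.
    """
    d, a = circle_sdf(p, center, radius)
    b = -gamma * (d - d_safe)
    slack = a[0] * u_nom[0] + a[1] * u_nom[1] - b
    if slack >= 0.0:
        # The nominal velocity already satisfies the distance constraint.
        return tuple(u_nom)
    # Constraint is violated: project u_nom onto the half-space boundary.
    scale = -slack / (a[0] ** 2 + a[1] ** 2)
    return (u_nom[0] + scale * a[0], u_nom[1] + scale * a[1])


if __name__ == "__main__":
    # Nominal command drives the robot toward the obstacle; the filter slows the approach.
    print(safe_velocity((-1.0, 0.5), (1.0, 0.0), (0.0, 0.0), 0.5))
```

In this sketch, the nominal velocity stands in for the command a learned policy might produce, and the constraint limits how quickly the signed distance may shrink as the robot nears the obstacle. A whole-body planner would have many such constraints over joint velocities and would need a general QP solver in place of the closed-form projection.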