Learning-based control has gained significant attention in robotics in recent years because it can address complex tasks in real-world environments. With advances in machine learning algorithms and computing power, the approach is increasingly used to solve challenging control problems by learning unknown or partially known robot dynamics. Active exploration, in which a robot directs itself to the states that yield the highest information gain, is essential for collecting data efficiently and minimizing human supervision. Uncertainty-aware deployment is likewise a growing concern in robotic control, because actions chosen where the learned model is uncertain can lead to unstable motion or failure. However, active exploration and uncertainty-aware deployment have been studied independently, and little literature integrates the two. This paper presents a unified model-based reinforcement learning framework that bridges these two tasks for robotic control. The framework learns the dynamics with a probabilistic ensemble neural network, which allows epistemic uncertainty to be quantified via the Jensen-Rényi divergence. The two opposing objectives of exploration and deployment are optimized with a state-of-the-art sampling-based model predictive control (MPC) method, resulting in efficient collection of training data and successful avoidance of uncertain regions of the state-action space. We conduct experiments on both autonomous vehicles and wheeled robots, showing promising results for both exploration and deployment.
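To make the uncertainty measure concrete, the following is a minimal sketch, not the framework's actual implementation. It assumes each ensemble member predicts a diagonal Gaussian over the next state and uses the quadratic (order-2) Rényi entropy, for which the Jensen-Rényi divergence of a Gaussian mixture has a closed form. The function names, the `mode` switch, and the way the uncertainty term enters the planning cost (a bonus during exploration, a weighted penalty during deployment) are illustrative assumptions rather than details stated in the abstract.

```python
import numpy as np


def jensen_renyi_divergence(means, variances):
    """Quadratic Jensen-Renyi divergence among N diagonal Gaussian predictions.

    means, variances: arrays of shape (N, d), one row per ensemble member.
    Assumption: order-2 Renyi entropy and an equally weighted mixture.
    """
    delta = means[:, None, :] - means[None, :, :]        # (N, N, d) pairwise mean gaps
    s = variances[:, None, :] + variances[None, :, :]    # (N, N, d) summed variances
    # log of the overlap integral between members i and j, i.e. log N(mu_i - mu_j; 0, S_i + S_j)
    log_overlap = -0.5 * np.sum(np.log(2.0 * np.pi * s) + delta**2 / s, axis=-1)
    # Renyi-2 entropy of the mixture: -log(mean of all pairwise overlaps), computed stably
    m = log_overlap.max()
    mixture_entropy = -(m + np.log(np.mean(np.exp(log_overlap - m))))
    # Renyi-2 entropy of each member: 0.5 * sum_d log(4 * pi * sigma_d^2)
    member_entropy = 0.5 * np.sum(np.log(4.0 * np.pi * variances), axis=-1)
    return mixture_entropy - member_entropy.mean()


def planning_cost(task_cost, uncertainty, mode, weight=1.0):
    """Hypothetical per-step cost for a sampling-based MPC planner.

    "explore": seek high disagreement (cost is the negative uncertainty).
    "deploy":  pursue the task while penalizing disagreement.
    """
    if mode == "explore":
        return -uncertainty
    if mode == "deploy":
        return task_cost + weight * uncertainty
    raise ValueError(f"unknown mode: {mode}")
```

When all members agree, the divergence is zero; with equal variances it grows as the predicted means spread apart, which is the signal a planner could use to seek out or avoid a state-action pair.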