Bayesian inference for data-efficient, explainable, and safe robotic motion planning: A review

Bayesian inference offers advantages for robotic motion planning from four perspectives: uncertainty quantification of the policy, safety (risk-aware) and optimality guarantees for robot motions, data efficiency in training reinforcement learning (RL), and reduction of the sim2real gap when the robot is applied to real-world tasks. However, the application of Bayesian inference in robotic motion planning lags behind the comprehensive theory of Bayesian inference. Furthermore, no comprehensive review summarizes the progress of Bayesian inference in a way that gives researchers a systematic understanding of its use in robotic motion planning. This paper first presents the probabilistic theory of Bayesian inference, which serves as the preliminary for applying Bayesian inference in complex cases. Second, Bayesian estimation is introduced to estimate the posterior of policies or of the unknown functions used to compute the policy. Third, the classical model-based and model-free Bayesian RL algorithms for robotic motion planning are summarized, and their use in complex cases is analyzed. Fourth, Bayesian inference in inverse RL is analyzed as a way to infer reward functions in a data-efficient manner. Fifth, we systematically present the hybridization of Bayesian inference and RL, a promising direction for improving the convergence of RL for better motion planning. Sixth, building on Bayesian inference, we present interpretable and safe robotic motion planning, which has recently become an active research topic. Finally, all algorithms reviewed in this paper are summarized analytically in knowledge graphs, and the future of Bayesian inference for robotic motion planning is discussed, to pave the way for data-efficient, explainable, and safe robotic motion planning strategies for practical applications.
