Robots executing iterative tasks in complex, uncertain environments require control strategies that balance robustness, safety, and high performance. This paper introduces a safe information-theoretic learning model predictive control (SIT-LMPC) algorithm for iterative tasks. Specifically, we design an iterative control framework based on an information-theoretic model predictive control algorithm to address a constrained infinite-horizon optimal control problem for discrete-time nonlinear stochastic systems. An adaptive penalty method is developed to ensure safety while balancing optimality. Trajectories from previous iterations are utilized to learn a value function using normalizing flows, which enables richer uncertainty modeling compared to Gaussian priors. SIT-LMPC is designed for highly parallel execution on graphics processing units, allowing efficient real-time optimization. Benchmark simulations and hardware experiments demonstrate that SIT-LMPC iteratively improves system performance while robustly satisfying system constraints.
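To make the control update concrete, the following is a minimal sketch of one sampling-based information-theoretic MPC step (in the style of MPPI) in which constraint violations enter the rollout cost through a penalty term and a terminal value function closes the finite horizon. It is an illustration under stated assumptions, not the SIT-LMPC implementation: the double-integrator `dynamics`, the quadratic `stage_cost`, the velocity-limit `violation`, and the quadratic `terminal_value` are all hypothetical placeholders, the penalty weight is held fixed rather than adapted, and the value function is hand-written rather than learned with normalizing flows. Rollouts are batched over samples with NumPy to mirror the parallel structure that a GPU implementation would exploit.

```python
import numpy as np


def dynamics(x, u, dt=0.1):
    # Hypothetical double integrator, batched: x is (K, 2), u is (K, 1).
    pos = x[:, 0] + dt * x[:, 1]
    vel = x[:, 1] + dt * u[:, 0]
    return np.stack([pos, vel], axis=1)


def stage_cost(x, u):
    # Hypothetical quadratic cost driving the state to the origin.
    return x[:, 0] ** 2 + 0.1 * x[:, 1] ** 2 + 0.01 * u[:, 0] ** 2


def violation(x, v_max=1.0):
    # Hypothetical state constraint |velocity| <= v_max; returns the excess.
    return np.maximum(0.0, np.abs(x[:, 1]) - v_max)


def terminal_value(x):
    # Placeholder for a learned value function (assumed quadratic here).
    return 10.0 * np.sum(x ** 2, axis=1)


def mppi_step(x0, u_nominal, penalty=100.0, n_samples=256,
              sigma=0.5, temperature=1.0, rng=None):
    """One MPPI-style update of the nominal control sequence.

    u_nominal has shape (horizon, u_dim). Returns the updated sequence
    and the normalized importance weights of the sampled rollouts.
    """
    rng = np.random.default_rng() if rng is None else rng
    horizon, u_dim = u_nominal.shape

    # Sample Gaussian control perturbations for every rollout and time step.
    noise = rng.normal(0.0, sigma, size=(n_samples, horizon, u_dim))

    # Roll out all samples in parallel from the same initial state.
    x = np.tile(np.asarray(x0, dtype=float), (n_samples, 1))
    costs = np.zeros(n_samples)
    for t in range(horizon):
        u = u_nominal[t] + noise[:, t]
        # Penalized cost: performance term plus weighted constraint violation.
        costs += stage_cost(x, u) + penalty * violation(x)
        x = dynamics(x, u)
    costs += terminal_value(x)

    # Exponential weighting; subtracting the minimum keeps exp() well scaled.
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()

    # Weighted average of the perturbations updates the nominal controls.
    u_new = u_nominal + np.einsum("k,ktu->tu", weights, noise)
    return u_new, weights
```

Calling `mppi_step(np.array([1.0, 0.0]), np.zeros((10, 1)))` returns an updated 10-step control sequence. In an iterative setting, an adaptive scheme would adjust `penalty` between calls and `terminal_value` would be refit from the trajectories of earlier iterations; both are left fixed in this sketch.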