Multi-robot systems have attracted growing interest in recent years because of their wide range of applications, from surveillance to cooperative payload transportation. Model Predictive Control (MPC) is a promising approach to multi-robot control because of its preview capability and its ability to handle constraints easily. The performance of MPC depends on many parameters, among which the prediction horizon is the major contributor. Increasing the prediction horizon beyond a certain limit drastically increases the computational cost. Tuning the prediction horizon can be very time-consuming, and the tuning process must be repeated for every task. Moreover, instead of using a fixed horizon for an entire task, a better balance between performance and computational cost can be achieved if a different prediction horizon can be employed for each robot at each time step. For such variable-horizon MPC for multiple robots, on-demand collision avoidance is a key requirement. We propose the Versatile On-demand Collision Avoidance (VODCA) strategy, which is compatible with variable-horizon MPC. We also present a framework for learning the prediction horizon of the multi-robot system as a function of the robots' states using the Soft Actor-Critic (SAC) reinforcement learning algorithm. The results are illustrated and validated numerically for different multi-robot tasks.