Microrobotics is quickly emerging as a promising technology for many medical treatments, particularly targeted drug delivery. Microrobots are most effective when working in swarms, but controlling each robot individually is mostly infeasible owing to their minute size. Controlling many robots with a single controller is therefore important, and artificial intelligence can help accomplish this task. In this work, we use the reinforcement learning (RL) algorithms Proximal Policy Optimization (PPO) and Robust Policy Optimization (RPO) to navigate swarms of 4, 9, and 16 microswimmers, subject to hydrodynamic effects and controlled through their orientation, towards a circular absorbing target. We examine the performance of both PPO and RPO in scenarios with limited state information and test their robustness to random target location and size. We use curriculum learning to improve performance, and demonstrate this by learning to navigate a swarm of 25 swimmers and by steering the swarm to exemplify the manoeuvring capabilities of the RL model.