Multi-Agent Reinforcement Learning is becoming increasingly important in the era of autonomous driving and other smart industrial applications. At the same time, a promising new approach to Reinforcement Learning is emerging that exploits the inherent properties of quantum mechanics to significantly reduce the number of trainable parameters of a model. However, gradient-based Multi-Agent Quantum Reinforcement Learning methods often struggle with barren plateaus, which hold them back from matching the performance of classical approaches. We build upon an existing approach for gradient-free Quantum Reinforcement Learning and propose three genetic variations with Variational Quantum Circuits for Multi-Agent Reinforcement Learning using evolutionary optimization. We evaluate our genetic variations in the Coin Game environment and compare them to classical approaches. We show that our Variational Quantum Circuit approaches perform significantly better than a neural network with a similar number of trainable parameters. Compared to the larger neural network, our approaches achieve similar results using $97.88\%$ fewer parameters.
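To make the idea of gradient-free evolutionary optimization concrete, the following is a minimal sketch of a mutation-only genetic loop with elitism over a vector of rotation angles. It is not the paper's implementation: the function names (`toy_fitness`, `evolve`), the hyperparameters, and the selection scheme are assumptions chosen for illustration, and the toy fitness function is a hypothetical stand-in for the episode return an agent would collect in an environment. No quantum circuit is simulated here; the parameter vector merely plays the role of the circuit's trainable angles.

```python
import math
import random


def toy_fitness(params):
    # Hypothetical stand-in for an agent's episode return.
    # It is at most 0 and reaches 0 when every angle is a multiple of 2*pi.
    return -sum(1.0 - math.cos(p) for p in params)


def evolve(fitness, n_params=6, pop_size=20, n_elite=5,
           sigma=0.2, generations=50, seed=0):
    """Mutation-only evolutionary search; no gradients are computed."""
    rng = random.Random(seed)
    # Random initial population of angle vectors in [-pi, pi].
    population = [
        [rng.uniform(-math.pi, math.pi) for _ in range(n_params)]
        for _ in range(pop_size)
    ]
    best_per_generation = []
    for _ in range(generations):
        # Rank individuals by fitness, best first.
        ranked = sorted(population, key=fitness, reverse=True)
        best_per_generation.append(fitness(ranked[0]))
        # Elitism: the top individuals survive unchanged.
        elite = ranked[:n_elite]
        # Refill the population with Gaussian-mutated copies of elite parents.
        children = []
        while len(elite) + len(children) < pop_size:
            parent = rng.choice(elite)
            children.append([p + rng.gauss(0.0, sigma) for p in parent])
        population = elite + children
    best = max(population, key=fitness)
    return best, best_per_generation


best, history = evolve(toy_fitness)
print("first-generation best fitness:", history[0])
print("final best fitness:", toy_fitness(best))
```

Because the elite individuals are carried over unchanged, the best fitness per generation never decreases in this sketch. In a multi-agent setting, the fitness call would instead be replaced by whatever score each agent obtains from playing the environment, which is where the specifics of the paper's three variations would come in.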