In recent years, Artificial Intelligence (AI) systems have surpassed human intelligence in a variety of computational tasks. However, AI systems, like humans, make mistakes, have blind spots, hallucinate, and struggle to generalize to new situations. This work explores whether AI can benefit from creative decision-making mechanisms when pushed to the limits of its computational rationality. In particular, we investigate whether a team of diverse AI systems can outperform a single AI in challenging tasks by generating more ideas as a group and then selecting the best ones. We study this question in the game of chess, the so-called drosophila of AI. We build on AlphaZero (AZ) and extend it to represent a league of agents via a latent-conditioned architecture, which we call AZ_db. We train AZ_db to generate a wider range of ideas using behavioral diversity techniques and to select the most promising ones with sub-additive planning. Our experiments suggest that AZ_db plays chess in diverse ways, solves more puzzles as a group, and outperforms a more homogeneous team. Notably, AZ_db solves twice as many challenging puzzles as AZ, including the Penrose positions. When playing chess from different openings, we observe that players in AZ_db specialize in different openings, and that selecting a player for each opening using sub-additive planning results in a 50 Elo improvement over AZ. Our findings suggest that diversity bonuses emerge in teams of AI agents, just as they do in teams of humans, and that diversity is a valuable asset in solving computationally hard problems.