Multi-agent decision problems in AI, especially those in which collaborative or competitive agents act concurrently in a partially observable, stochastic environment, remain a formidable challenge. Interactive Dynamic Influence Diagrams (I-DIDs) offer a promising decision framework for such problems, but they fall short when the subject agent encounters behaviors of other agents that are not explicitly modeled within the I-DID, which can lead to sub-optimal responses from the subject agent. In this paper, we propose a novel data-driven approach that uses an encoder-decoder architecture, specifically a variational autoencoder, to enhance I-DID solutions. By integrating a perplexity-based tree loss function into the optimization algorithm of the variational autoencoder, together with Zig-Zag One-Hot encoding and decoding, we generate potential behaviors of other agents within the I-DID that are more likely to contain their true behaviors, even from limited interactions. This approach enables the subject agent to respond more appropriately to unknown behaviors, thus improving its decision quality. We empirically demonstrate the effectiveness of the proposed approach in two well-established problem domains, highlighting its potential for handling multi-agent decision problems with unknown behaviors. This work is the first to use neural-network-based approaches to address the I-DID challenge in agent planning and learning problems.