This work introduces a novel value decomposition algorithm, termed \textit{Dynamic Deep Factor Graphs} (DDFG). Unlike traditional coordination graphs, DDFG leverages factor graphs to articulate the decomposition of value functions, offering enhanced flexibility and adaptability to complex value function structures. Central to DDFG is a graph structure generation policy that generates factor graph structures on the fly, effectively addressing the dynamic collaboration requirements among agents. DDFG strikes a balance between the computational overhead of aggregating value functions and the performance degradation inherent in their complete decomposition. Through the max-sum algorithm, DDFG efficiently identifies optimal joint policies. We empirically validate DDFG's efficacy in complex scenarios, including higher-order predator-prey tasks and the StarCraft II Multi-Agent Challenge (SMAC), underscoring its ability to overcome the limitations of existing value decomposition algorithms. DDFG thus offers a robust solution for multi-agent reinforcement learning (MARL) challenges that demand nuanced understanding and facilitation of dynamic agent collaboration. The source code of DDFG is publicly available at \url{https://github.com/SICC-Group/DDFG}.
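To illustrate the max-sum step the abstract refers to, the sketch below runs max-sum message passing over a tiny factor graph to recover the joint action that maximizes a sum of local payoffs. This is only an illustrative toy, not DDFG itself: the pairwise payoff tables are random stand-ins for learned factor values, the agent/action counts are made up, and the chain-shaped graph is a tree, where max-sum is exact.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n_actions = 2

# Hypothetical local payoff tables Q^f(a_i, a_j) standing in for learned
# factor values; the chain 0-1-2 is a tree, so max-sum is exact here.
pair_factors = {(0, 1): rng.normal(size=(n_actions, n_actions)),
                (1, 2): rng.normal(size=(n_actions, n_actions))}

def max_sum(pair_factors, n_agents, n_actions, iters=10):
    # Messages from each factor to each of its agents, initialised to zero.
    msg_f2a = {(f, a): np.zeros(n_actions) for f in pair_factors for a in f}
    for _ in range(iters):
        # Agent -> factor: sum of messages from the agent's *other* factors.
        msg_a2f = {}
        for f in pair_factors:
            for a in f:
                m = sum(msg_f2a[(g, a)] for g in pair_factors
                        if a in g and g != f)
                m = m if isinstance(m, np.ndarray) else np.zeros(n_actions)
                msg_a2f[(f, a)] = m - m.mean()  # normalise for stability
        # Factor -> agent: maximise local payoff over the other agent.
        for (i, j), table in pair_factors.items():
            msg_f2a[((i, j), i)] = (table + msg_a2f[((i, j), j)][None, :]).max(axis=1)
            msg_f2a[((i, j), j)] = (table + msg_a2f[((i, j), i)][:, None]).max(axis=0)
    # Each agent picks the action maximising its summed incoming messages.
    return tuple(int(np.argmax(sum(msg_f2a[(f, a)] for f in pair_factors
                                   if a in f)))
                 for a in range(n_agents))

joint = max_sum(pair_factors, n_agents=3, n_actions=n_actions)

# Brute-force check of the joint argmax over all 2^3 action combinations.
best = max(itertools.product(range(n_actions), repeat=3),
           key=lambda a: sum(t[a[i], a[j]]
                             for (i, j), t in pair_factors.items()))
print(joint == best)
```

On tree-structured graphs with a unique maximizer, decoding each agent's action from its max-marginal recovers the exact joint argmax; on loopy graphs (as dynamic factor-graph structures can be), max-sum becomes an approximation.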