Multi-agent reinforcement learning systems deployed in real-world robotics applications face severe communication constraints that significantly impact coordination effectiveness. We present a framework that combines information bottleneck theory with vector quantization to enable selective, bandwidth-efficient communication in multi-agent environments. Our approach learns to compress and discretize communication messages while preserving task-critical information through principled information-theoretic optimization. We introduce a gated communication mechanism that dynamically determines when communication is necessary based on environmental context and agent states. Experimental evaluation on challenging coordination tasks demonstrates that our method achieves a 181.8% performance improvement over no-communication baselines while reducing bandwidth usage by 41.4%. Comprehensive Pareto frontier analysis shows dominance across the entire success-bandwidth spectrum, with an area under the curve of 0.198 versus 0.142 for the next-best method. Our approach significantly outperforms existing communication strategies and establishes a theoretically grounded framework for deploying multi-agent systems in bandwidth-constrained environments such as robotic swarms, autonomous vehicle fleets, and distributed sensor networks.
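The two mechanisms named above — vector-quantized messages and a learned transmission gate — can be illustrated with a minimal sketch. This is not the paper's implementation: the codebook here is random rather than learned, the gate is a placeholder sigmoid score rather than a policy conditioned on environmental context, and all sizes (`MSG_DIM`, `CODEBOOK_SIZE`) and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions; the abstract does not specify these.
MSG_DIM, CODEBOOK_SIZE = 8, 16

# In the actual method the codebook would be learned jointly with the policy;
# here it is random, purely for illustration.
codebook = rng.normal(size=(CODEBOOK_SIZE, MSG_DIM))

def quantize(message):
    """Vector quantization: map a continuous message to its nearest
    codebook entry and return (index, reconstructed vector).

    Only the index needs to be transmitted -- log2(CODEBOOK_SIZE) bits
    instead of MSG_DIM floats, which is the source of the bandwidth saving."""
    dists = np.linalg.norm(codebook - message, axis=1)
    idx = int(np.argmin(dists))
    return idx, codebook[idx]

def gate(agent_state, threshold=0.5):
    """Toy communication gate: transmit only when a relevance score
    exceeds a threshold. In the paper this decision is learned from
    environmental context and agent state; the sigmoid of the state
    mean below is a stand-in scorer, not the learned gate."""
    score = 1.0 / (1.0 + np.exp(-agent_state.mean()))
    return bool(score > threshold)

# One communication step for a single agent.
state = rng.normal(size=MSG_DIM)
if gate(state):
    idx, recon = quantize(state)  # send idx (4 bits for a 16-entry codebook)
```

The bandwidth reduction reported in the abstract comes from both pieces acting together: the gate suppresses messages entirely when they are unlikely to help, and quantization shrinks the messages that do get sent to a discrete code index.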