The convergence of large language models (LLMs) with 6G networks is fostering a paradigm of autonomous multi-agent cooperation, which in turn is expected to substantially increase east-west traffic. Although latent-space interaction mechanisms can enable more efficient collaboration than symbolic natural-language (NL) exchanges, prior work often abstracts away the associated communication overhead under practical wireless constraints. In embodied multi-agent settings, heterogeneous interaction media incur disparate inference and transmission costs, thereby inducing an inherent end-to-end (E2E) latency trade-off. To address this, we propose a joint design that integrates communication-media selection with wireless resource allocation. Through analytical characterization and simulation-based evaluation, we show that neither token-based transmission nor key-value (KV) cache-based transmission is uniformly optimal across operating regimes, as performance depends critically on system parameters such as available computational resources and channel conditions. Accordingly, we formulate a joint optimization problem aimed at minimizing the E2E latency of multi-agent collaboration and develop a low-complexity joint media selection and resource allocation (JMSRA) algorithm. Numerical results further confirm that, by adaptively coordinating the interaction media and bandwidth allocation over heterogeneous links, the proposed scheme achieves markedly reduced E2E latency relative to conventional NL-only and KV-cache-only baselines, enabling efficient and robust multi-agent collaboration in future wireless networks.
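The latency trade-off between token-based and KV-cache-based transmission can be illustrated with a toy model. The sketch below is not the JMSRA algorithm and does not reproduce the paper's system model. It assumes a Shannon-rate link, assumes that token transmission sends a small payload but forces the receiver to recompute the prefill, and assumes that KV-cache transmission sends a much larger payload but skips that recomputation. All names (`Link`, `e2e_latency`, `select_medium`) and all numeric constants are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical description of one agent-to-agent wireless link."""
    bandwidth_hz: float    # allocated bandwidth in Hz
    snr_linear: float      # signal-to-noise ratio (linear scale, not dB)
    receiver_flops: float  # receiver compute capability in FLOP/s


def shannon_rate(bandwidth_hz: float, snr_linear: float) -> float:
    """Achievable rate in bit/s under the Shannon capacity formula."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def e2e_latency(
    medium: str,
    link: Link,
    n_tokens: int,
    bits_per_token: float = 16.0,           # assumed size of one token ID
    kv_bits_per_token: float = 2e6,         # assumed KV-cache size per token
    prefill_flops_per_token: float = 1e10,  # assumed receiver prefill cost
) -> float:
    """Transmission plus receiver-side inference latency, in seconds."""
    rate = shannon_rate(link.bandwidth_hz, link.snr_linear)
    if medium == "token":
        # Small payload, but the receiver must rebuild the context itself.
        transmit = n_tokens * bits_per_token / rate
        compute = n_tokens * prefill_flops_per_token / link.receiver_flops
    elif medium == "kv":
        # Large payload, but the receiver reuses the sender's KV cache.
        transmit = n_tokens * kv_bits_per_token / rate
        compute = 0.0
    else:
        raise ValueError(f"unknown medium: {medium!r}")
    return transmit + compute


def select_medium(link: Link, n_tokens: int) -> str:
    """Pick the medium with the lower modeled E2E latency for this link."""
    return min(("token", "kv"), key=lambda m: e2e_latency(m, link, n_tokens))


# A fast link with a weak receiver versus a slow link with a strong receiver.
fast_link_weak_cpu = Link(bandwidth_hz=100e6, snr_linear=1023.0, receiver_flops=1e12)
slow_link_strong_cpu = Link(bandwidth_hz=1e6, snr_linear=1.0, receiver_flops=1e14)

choice_fast = select_medium(fast_link_weak_cpu, n_tokens=1000)
choice_slow = select_medium(slow_link_strong_cpu, n_tokens=1000)
```

Under these assumed constants, the fast link with limited receiver compute favors the KV cache, while the slow link with ample receiver compute favors tokens, which is consistent with the claim that neither medium is uniformly optimal. A joint scheme would additionally treat `bandwidth_hz` as a decision variable shared across links rather than as a fixed input.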