Deploying large language models (LLMs) on edge devices enables personalized LLM agents for different users. The growing availability of diverse personalized agents presents a unique opportunity for peer-to-peer (P2P) collaboration, in which each user can delegate tasks beyond the local agent's expertise to remote agents better suited to the specific query. This paper introduces PPAI, the first personalized LLM agent interoperability system, which enables users to collaborate with each other based on agent specialization. Compared with existing P2P systems, however, the ever-changing pool of agents and their interchangeable capacity introduce new challenges in matching queries to agents and in balancing load. We therefore propose a scalable, prototype-based scoring mechanism for query-agent pairs that identifies suitable agents within a P2P network with churn. Moreover, we propose a multi-agent interoperability Bayesian game that balances local demand and global efficiency when remote agent load changes too quickly to be observed. Finally, we implement a prototype of PPAI and demonstrate that it substantially broadens the range of tasks that can be carried out while maintaining load balance. On average, it achieves an accuracy improvement of up to 7.96% across multiple tasks while reducing latency by 16.34% compared to the baseline.
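To make the idea of prototype-based query-agent scoring more concrete, the following is a minimal sketch and not PPAI's actual mechanism, which the abstract does not specify. It assumes that each agent advertises a small set of prototype vectors summarizing the queries it handles well, that queries are embedded in the same vector space, and that a query-agent pair is scored by the best cosine similarity between the query and the agent's prototypes. The names `cosine`, `score_agent`, `rank_agents`, and the toy three-dimensional embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def score_agent(query_vec, prototypes):
    """Assumed scoring rule: similarity to the closest prototype of the agent."""
    return max(cosine(query_vec, p) for p in prototypes)


def rank_agents(query_vec, agent_prototypes):
    """Return (agent, score) pairs sorted from best to worst match.

    Agents with no prototypes are skipped. Under churn, an agent joining or
    leaving only adds or removes one entry in `agent_prototypes`.
    """
    scores = {
        agent: score_agent(query_vec, protos)
        for agent, protos in agent_prototypes.items()
        if protos
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: three agents, each with hand-picked prototypes.
agent_prototypes = {
    "agent_math": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "agent_code": [[0.0, 1.0, 0.0]],
    "agent_law": [[0.0, 0.0, 1.0]],
}
query = [0.8, 0.2, 0.0]
ranking = rank_agents(query, agent_prototypes)
```

Under these assumptions, the cost of scoring a query grows with the number of prototypes rather than with the number of past queries an agent has served, which is one plausible reading of "scalable" here. The sketch ignores agent load entirely; the abstract treats load balancing separately through the Bayesian game.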