Building AI agents that adapt their behavior in human-AI cooperation is a pivotal focus of AGI research. Current methods for developing cooperative agents are predominantly learning-based, and the generalization of the resulting policies hinges heavily on past interactions with specific teammates. This dependence constrains an agent's capacity to recalibrate its strategy when confronted with novel teammates. We propose \textbf{ProAgent}, a novel framework that harnesses large language models (LLMs) to build a \textit{pro}active \textit{agent} able to anticipate teammates' forthcoming decisions and formulate better plans for itself. ProAgent excels at cooperative reasoning and dynamically adapts its behavior to enhance collaboration with teammates. Moreover, the framework is highly modular and interpretable, which facilitates its integration into a wide array of coordination scenarios. Experimental evaluations in the \textit{Overcooked-AI} environment show that ProAgent outperforms five methods based on self-play and population-based training when cooperating with AI agents. Further, when cooperating with human proxy models, it achieves an average improvement exceeding 10\% over the current state-of-the-art method, COLE. The improvement is observed consistently across diverse scenarios involving interactions with both AI agents of varying characteristics and human counterparts. These findings can inspire future research on human-robot collaboration. For a hands-on demonstration, please visit \url{https://pku-proagent.github.io}.