Using Large Language Models (LLMs) to build fully autonomous agents has attracted significant research interest. Yet LLM-based agents often adapt poorly to dynamic environments and fail to fully grasp human needs. In this work, we introduce the problem of LLM-based human-agent collaboration for complex task-solving and explore the synergy between humans and agents. We propose ReHAC, a Reinforcement Learning-based Human-Agent Collaboration method. It includes a policy model that determines the most opportune stages for human intervention within the task-solving process. We construct a human-agent collaboration dataset and use it to train this policy model in an offline reinforcement learning setting. Our validation tests confirm the model's effectiveness. The results show that collaboration between humans and LLM-based agents significantly improves performance on complex tasks, primarily through well-planned, limited human intervention. Datasets and code are available at: https://github.com/XueyangFeng/ReHAC.
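To make the idea of a policy that decides when a human should step in more concrete, the following is a minimal sketch and not the ReHAC implementation. The abstract does not specify the reward or the learning algorithm, so everything here is an assumption: the return is taken to be the task reward minus a hypothetical per-intervention cost (`human_cost`), the policy is a small logistic model over made-up state features, and training uses reward-weighted behaviour cloning on logged trajectories as a simple stand-in for offline reinforcement learning. All names (`Step`, `Trajectory`, `collaboration_return`, `train_offline`) are illustrative.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class Step:
    features: list[float]  # hypothetical state features, e.g. [bias, difficulty]
    human_acted: bool      # True if the human (not the agent) handled this step


@dataclass
class Trajectory:
    steps: list[Step]
    task_reward: float     # task outcome logged for the whole trajectory


def collaboration_return(traj: Trajectory, human_cost: float = 0.1) -> float:
    """Assumed objective: task reward minus a cost for each human intervention."""
    n_human = sum(step.human_acted for step in traj.steps)
    return traj.task_reward - human_cost * n_human


def prob_human(weights: list[float], features: list[float]) -> float:
    """Logistic policy: probability of handing the current step to the human."""
    logit = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-logit))


def train_offline(
    dataset: list[Trajectory],
    n_features: int,
    human_cost: float = 0.1,
    beta: float = 0.2,
    lr: float = 0.1,
    epochs: int = 2000,
) -> list[float]:
    """Reward-weighted behaviour cloning on a fixed, logged dataset.

    Each logged decision is imitated with weight exp((return - best) / beta),
    so decisions from high-return trajectories dominate the learned policy.
    """
    returns = [collaboration_return(t, human_cost) for t in dataset]
    best = max(returns)
    weights = [0.0] * n_features
    for _ in range(epochs):
        for traj, ret in zip(dataset, returns):
            sample_weight = math.exp((ret - best) / beta)
            for step in traj.steps:
                # Gradient of the log-likelihood of the logged decision
                # with respect to the logit is (label - probability).
                error = float(step.human_acted) - prob_human(weights, step.features)
                for i, x in enumerate(step.features):
                    weights[i] += lr * sample_weight * error * x
    return weights
```

In this sketch, the trained `prob_human` would be queried at each stage of task-solving, and the step handed to the human only when the probability exceeds a chosen threshold. The cost term is what pushes the policy toward limited intervention: asking for help pays off only when it raises the task reward by more than `human_cost`.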