OpenAGI: When LLM Meets Domain Experts

Human intelligence has the remarkable ability to assemble basic skills into complex ones in order to solve complex tasks. This ability is equally important for Artificial Intelligence (AI). We therefore assert that, in the pursuit of Artificial General Intelligence (AGI), it is as crucial to equip large, comprehensive intelligent models with the capability to harness domain-specific expert models for complex task-solving as it is to develop those models in the first place. Recent Large Language Models (LLMs) have demonstrated remarkable learning and reasoning abilities, making them promising controllers that select, synthesize, and execute external models to solve complex tasks. In this project, we develop OpenAGI, an open-source AGI research platform designed to offer complex, multi-step tasks accompanied by task-specific datasets, evaluation metrics, and a diverse range of extensible models. OpenAGI formulates each complex task as a natural language query that serves as input to the LLM. The LLM then selects, synthesizes, and executes models provided by OpenAGI to address the task. Furthermore, we propose a Reinforcement Learning from Task Feedback (RLTF) mechanism, which uses the task-solving result as feedback to improve the LLM's task-solving ability. The LLM is thus responsible for synthesizing external models to solve complex tasks, while RLTF supplies the feedback that improves this ability, closing a feedback loop for self-improving AI. We believe that the paradigm of LLMs operating various expert models for complex task-solving is a promising approach towards AGI. To facilitate the community's long-term improvement and evaluation of AGI's ability, we open-source the code, benchmark, and evaluation methods of the OpenAGI project at https://github.com/agiresearch/OpenAGI.
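
To make the select-execute-feedback loop described above concrete, the following is a minimal toy sketch in Python. It is not the OpenAGI implementation. Every name in it (EXPERTS, PLANS, ToyController, execute_plan, task_feedback, train) is hypothetical, the "expert models" are plain string functions, and the LLM is replaced by a softmax policy over three fixed candidate plans. The update is a simple REINFORCE-style step chosen here for illustration; the abstract does not specify which reinforcement learning algorithm RLTF uses.

```python
import math
import random

# Hypothetical stand-ins for domain expert models: plain string functions.
EXPERTS = {
    "denoise": lambda s: s.replace("#", ""),
    "upper": lambda s: s.upper(),
    "reverse": lambda s: s[::-1],
}

# Hypothetical candidate plans (ordered lists of expert names).
PLANS = [
    ["upper", "reverse"],
    ["denoise", "upper"],
    ["denoise", "upper", "reverse"],
]


def execute_plan(plan, task_input):
    """Run the selected experts in order, feeding each output to the next."""
    out = task_input
    for name in plan:
        out = EXPERTS[name](out)
    return out


def task_feedback(output, target):
    """Toy evaluation metric: 1.0 for an exact match, otherwise 0.0."""
    return 1.0 if output == target else 0.0


class ToyController:
    """Softmax policy over candidate plans; an assumed stand-in for the LLM."""

    def __init__(self, plans, lr=0.5, seed=0):
        self.plans = plans
        self.logits = [0.0] * len(plans)
        self.lr = lr
        self.rng = random.Random(seed)

    def probs(self):
        # Numerically stable softmax over the plan logits.
        m = max(self.logits)
        exps = [math.exp(l - m) for l in self.logits]
        total = sum(exps)
        return [e / total for e in exps]

    def sample(self):
        # Draw one plan index according to the current policy.
        return self.rng.choices(range(len(self.plans)), weights=self.probs())[0]

    def update(self, idx, reward):
        # REINFORCE step: d log pi(idx) / d logit_j = 1[j == idx] - p_j.
        p = self.probs()
        for j in range(len(self.logits)):
            grad = (1.0 if j == idx else 0.0) - p[j]
            self.logits[j] += self.lr * reward * grad


def train(controller, task_input, target, steps=200):
    """Sample a plan, execute it, and feed the task score back as reward."""
    for _ in range(steps):
        idx = controller.sample()
        output = execute_plan(controller.plans[idx], task_input)
        controller.update(idx, task_feedback(output, target))
    return controller


if __name__ == "__main__":
    ctrl = train(ToyController(PLANS), "a#b#c", "CBA")
    for plan, prob in zip(PLANS, ctrl.probs()):
        print(plan, round(prob, 3))
```

In this toy setting only the third plan reproduces the target, so it is the only plan that ever receives a nonzero reward, and the policy shifts probability mass toward it. A real controller would generate plans from the natural language query rather than choose among a fixed list.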
