The notable success of large language models (LLMs) has sparked a surge of interest in building language agents to complete various complex tasks. We present AMOR, an agent framework based on open-source LLMs, which reasons with external knowledge bases and adapts to specific domains through human supervision of the reasoning process. AMOR builds its reasoning logic over a finite state machine (FSM) that solves problems through autonomous executions and transitions over disentangled modules. This allows humans to provide direct feedback on the individual modules, and thus naturally forms process supervision. Based on this reasoning and feedback framework, we develop AMOR through two-stage fine-tuning: warm-up and adaptation. The former fine-tunes the LLM with examples automatically constructed from various public datasets and enables AMOR to generalize across different knowledge environments, while the latter tailors AMOR to specific domains using process feedback. Extensive experiments across multiple domains demonstrate the advantage of AMOR over strong baselines, thanks to its FSM-based reasoning and process feedback mechanism.
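To make the idea of reasoning over an FSM of disentangled modules more concrete, the following is a minimal sketch and not AMOR's actual implementation. The state names (`DECOMPOSE`, `RETRIEVE`, `ANSWER`, `DONE`), the `Memory` and `Step` types, the toy knowledge base `KB`, and the rule-based module functions are all hypothetical stand-ins; in a real system each module would presumably be an LLM call or a retrieval tool. The point is only that each module's output is recorded as a separate step, so feedback can be attached to individual steps rather than to the final answer alone.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical state names; the abstract does not enumerate AMOR's states.
DECOMPOSE, RETRIEVE, ANSWER, DONE = "DECOMPOSE", "RETRIEVE", "ANSWER", "DONE"


@dataclass
class Step:
    """One module execution: the unit a human could give feedback on."""
    state: str
    output: str


@dataclass
class Memory:
    """Shared working memory passed between modules (assumed design)."""
    question: str
    query: str = ""
    evidence: str = ""
    answer: str = ""
    trace: List[Step] = field(default_factory=list)


# Toy knowledge base standing in for an external one.
KB = {"capital of france": "Paris is the capital of France."}


# Each module returns (its output, the next state). These are rule-based
# placeholders for what would be LLM or tool calls.
def decompose(m: Memory) -> Tuple[str, str]:
    m.query = m.question.lower().rstrip("?").replace("what is the ", "")
    return m.query, RETRIEVE


def retrieve(m: Memory) -> Tuple[str, str]:
    m.evidence = KB.get(m.query, "")
    return m.evidence, ANSWER


def answer(m: Memory) -> Tuple[str, str]:
    m.answer = m.evidence.split(" ")[0] if m.evidence else "unknown"
    return m.answer, DONE


MODULES: Dict[str, Callable[[Memory], Tuple[str, str]]] = {
    DECOMPOSE: decompose,
    RETRIEVE: retrieve,
    ANSWER: answer,
}


def run(question: str, max_steps: int = 10) -> Memory:
    """Execute modules and follow transitions until DONE or the step cap."""
    m = Memory(question)
    state = DECOMPOSE
    for _ in range(max_steps):
        if state == DONE:
            break
        output, next_state = MODULES[state](m)
        m.trace.append(Step(state, output))  # recorded per module
        state = next_state
    return m


memory = run("What is the capital of France?")
for step in memory.trace:
    print(step.state, "->", repr(step.output))
```

Because the trace stores one `Step` per module execution, a reviewer could mark, say, the `RETRIEVE` output as wrong without judging the whole run; under this sketch's assumptions, that per-step labeling is what process feedback would operate on.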