Foundation models (FMs) such as large language models have revolutionized the field of AI by showing remarkable performance in various tasks. However, they exhibit numerous limitations that prevent their broader adoption in many real-world systems, which often require a higher bar for trustworthiness and usability. Since FMs are trained using loss functions aimed at reconstructing the training corpus in a self-supervised manner, there is no guarantee that the model's output aligns with users' preferences for the task at hand. In this survey paper, we propose a conceptual framework that encapsulates different modes by which agents can interact with FMs and guide them suitably for a set of tasks, particularly through knowledge augmentation and reasoning. Our framework elucidates agent role categories such as updating the underlying FM, assisting with prompting the FM, and evaluating the FM output. We also categorize several state-of-the-art approaches into agent interaction protocols, highlighting the nature and extent of involvement of the various agent roles. The proposed framework provides guidance for future directions to further realize the power of FMs in practical AI systems.