Creating an LLM-based AI-agent: A high-level methodology towards enhancing LLMs with APIs

Source: arXiv. This Master's thesis was supervised by Prof. Nikolaos Papaspyrou (National Technical University of Athens) and Dr. Aifen Sui (Huawei Munich Research Center). http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19180

Large Language Models (LLMs) have revolutionized various aspects of engineering and science. Their utility is often bottlenecked by the lack of interaction with the external digital environment. To overcome this limitation and achieve integration of LLMs and Artificial Intelligence (AI) into real-world applications, customized AI agents are being constructed. Based on the technological trends and techniques, we extract a high-level approach for constructing these AI agents, focusing on their underlying architecture. This thesis serves as a comprehensive guide that elucidates a multi-faceted approach for empowering LLMs with the capability to leverage Application Programming Interfaces (APIs). We present a 7-step methodology that begins with the selection of suitable LLMs and the task decomposition that is necessary for complex problem-solving. This methodology includes techniques for generating training data for API interactions and heuristics for selecting the appropriate API among a plethora of options. These steps eventually lead to the generation of API calls that are both syntactically and semantically aligned with the LLM's understanding of a given task. Moreover, we review existing frameworks and tools that facilitate these processes and highlight the gaps in current attempts. In this direction, we propose an on-device architecture that aims to exploit the functionality of carry-on devices by using small models from the Hugging Face community. We examine the effectiveness of these approaches on real-world applications of various domains, including the generation of a piano sheet. Through an extensive analysis of the literature and available technologies, this thesis aims to set a compass for researchers and practitioners to harness the full potential of LLMs augmented with external tool capabilities, thus paving the way for more autonomous, robust, and context-aware AI agents.
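
To make two of the steps named in the abstract more concrete (selecting an API among several options, then generating a well-formed call), here is a minimal Python sketch. It is not the thesis's method: the word-overlap heuristic, the `ApiSpec` type, the `select_api` and `build_call` helpers, and the two catalog entries are all hypothetical and chosen only for illustration. A real agent would likely use the LLM itself or an embedding-based retriever for selection.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiSpec:
    """Hypothetical description of one callable API."""
    name: str
    description: str
    required: tuple  # names of required parameters


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(text.lower().replace(",", " ").replace(".", " ").split())


def select_api(subtask, catalog):
    """Assumed heuristic: pick the API whose description shares
    the most words with the subtask text."""
    words = tokenize(subtask)
    return max(catalog, key=lambda api: len(words & tokenize(api.description)))


def build_call(api, arguments):
    """Serialize a call as JSON, rejecting it if a required parameter is missing."""
    missing = [p for p in api.required if p not in arguments]
    if missing:
        raise ValueError(f"missing parameters for {api.name}: {missing}")
    return json.dumps({"api": api.name, "arguments": arguments}, sort_keys=True)


# Illustrative catalog; these APIs are invented for the example.
CATALOG = [
    ApiSpec("get_weather", "return the current weather forecast for a city", ("city",)),
    ApiSpec("send_email", "send an email message to a recipient", ("to", "body")),
]

api = select_api("look up the weather forecast in Athens", CATALOG)
call = build_call(api, {"city": "Athens"})
print(call)
```

The check in `build_call` covers only the syntactic side (required parameters are present and the output is valid JSON). Whether the arguments match the intent of the task, the semantic alignment the abstract mentions, is not something this sketch attempts to verify.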
