The applications of large language models (LLMs) have expanded well beyond text processing, signaling a new era in which LLMs are envisioned as generalist language agents capable of operating within complex real-world environments. These environments are often so expansive that the LLM cannot process them within its short-term memory. Motivated by recent research on extending the capabilities of LLMs with tools, this paper investigates the potential of tools to augment LLMs in handling such complexity. To this end, we design customized tools that aid proactive exploration within these massive environments. Such tools can serve as a middleware layer that shields the LLM from environmental complexity. In two representative complex environments, knowledge bases (KBs) and databases, we demonstrate the significant potential of augmenting language agents with tools. Notably, equipped with these tools, GPT-4 achieves 2.8X the performance of the best baseline in tasks requiring access to database content and 2.2X in KB tasks. Our findings illuminate the path for advancing language agents in complex real-world applications.