Recent language models have shown remarkable results on various complex reasoning benchmarks. These reasoning capabilities enable LLMs to execute external function calls to overcome inherent limitations such as knowledge cutoffs, poor arithmetic skills, or lack of access to private data. As a result, LLMs can select and coordinate multiple functions based on context to tackle more complex problems. However, current methods for multiple function calling often require sequential reasoning and acting for each function, which can result in high latency, high cost, and sometimes inaccurate behavior. To address this, we introduce LLMCompiler, which executes functions in parallel to efficiently orchestrate multiple function calls. Drawing on principles from classical compilers, LLMCompiler streamlines parallel function calling with three components: (i) an LLM Planner, which formulates execution plans; (ii) a Task Fetching Unit, which dispatches function calling tasks; and (iii) an Executor, which executes these tasks in parallel. LLMCompiler automatically generates an optimized orchestration for the function calls and can be used with both open-source and closed-source models. We benchmark LLMCompiler on a range of tasks with different function calling patterns. Compared to ReAct, we observe a consistent latency speedup of up to 3.7x, cost savings of up to 6.7x, and an accuracy improvement of up to ~9%.
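The three-component design described above can be illustrated with a small sketch. This is not the LLMCompiler implementation: the names `Task`, `plan`, `run_plan`, `lookup_a`, `lookup_b`, and `add` are hypothetical, the planner is a hard-coded stand-in for an LLM call, and the "functions" are simulated with `asyncio.sleep`. It only shows the general idea of dispatching tasks as soon as their dependencies are resolved and running independent tasks concurrently.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, Dict, List


@dataclass
class Task:
    """One function call in the plan (hypothetical structure)."""
    name: str
    func: Callable[..., Awaitable[Any]]
    # Names of tasks whose results are passed to func as positional arguments.
    deps: List[str] = field(default_factory=list)


async def lookup_a() -> int:
    await asyncio.sleep(0.1)  # simulated external call
    return 3


async def lookup_b() -> int:
    await asyncio.sleep(0.1)  # simulated external call
    return 4


async def add(a: int, b: int) -> int:
    return a + b


def plan() -> List[Task]:
    """Stand-in for the LLM Planner: returns a fixed plan instead of calling a model."""
    return [
        Task("a", lookup_a),
        Task("b", lookup_b),
        Task("sum", add, deps=["a", "b"]),
    ]


async def run_plan(tasks: List[Task]) -> Dict[str, Any]:
    """Dispatch tasks whose dependencies are resolved and run them concurrently."""
    results: Dict[str, Any] = {}
    pending = {t.name: t for t in tasks}
    running: Dict[asyncio.Task, str] = {}

    while pending or running:
        # Task fetching step: pick every task whose dependencies are all done.
        ready = [t for t in pending.values() if all(d in results for d in t.deps)]
        for t in ready:
            args = [results[d] for d in t.deps]
            # Executor step: start the call without waiting for it to finish.
            running[asyncio.create_task(t.func(*args))] = t.name
            del pending[t.name]

        if not running:
            # Nothing is running and nothing is ready: the plan cannot progress.
            raise ValueError("plan has unsatisfiable dependencies")

        done, _ = await asyncio.wait(running.keys(), return_when=asyncio.FIRST_COMPLETED)
        for finished in done:
            results[running.pop(finished)] = finished.result()

    return results


if __name__ == "__main__":
    print(asyncio.run(run_plan(plan())))
```

In this sketch the two lookups have no dependencies, so they are started in the same pass and overlap in time, while `add` is dispatched only after both results are available. A sequential reason-and-act loop would instead wait for each call before deciding on the next one.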