Tool-augmented large language models (LLMs) are attracting widespread attention for their ability to access up-to-date knowledge and alleviate hallucination. Advanced closed-source LLMs (e.g., ChatGPT) have demonstrated surprising tool-usage capabilities through prompting and in-context learning. To strengthen the ability of open-source LLMs (e.g., LLaMA) to manipulate tools, current efforts focus on either template-driven or token-triggered tool usage. However, the former constrains tool interactions and thus hampers LLMs' flexibility in addressing diverse user queries, while the latter limits generalizability to new tools, since tool-usage learning relies on task- and tool-specific datasets. To alleviate these concerns, we propose a decision-aware and generalizable tool-usage framework (DEER). Specifically, we first construct tool-usage samples with multiple decision branches via an automatic generation pipeline, thereby fostering the decision-making awareness of LLMs under diverse scenarios. In addition, we propose a novel tool sampling strategy to enhance the generalizability of LLMs to unseen tools. Extensive experiments demonstrate that DEER is effective and significantly outperforms baselines across various datasets.
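To make the two ideas above concrete, the following is a minimal sketch of how sampled tool sets and decision branches could interact when building training samples. The abstract does not specify the sampling strategy or the set of decision branches, so everything here is an assumption made for illustration: the `Tool` type, the `sample_candidate_tools` function, the `k` and `drop_prob` parameters, and the three decision labels are all hypothetical and should not be read as the DEER implementation.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Tool:
    # Hypothetical tool record; the field names are illustrative only.
    name: str
    description: str


def sample_candidate_tools(
    pool: List[Tool],
    required: Optional[Tool],
    k: int,
    drop_prob: float,
    rng: random.Random,
) -> Tuple[List[Tool], str]:
    """Return (candidate tools, decision label) for one training sample.

    Assumed decision branches (not taken from the paper):
      - "answer_directly":  the query needs no tool at all
      - "no_suitable_tool": the query needs a tool that is not offered
      - "call:<name>":      the needed tool is among the candidates
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    # Distractors are all pool tools other than the one the query needs.
    distractors = [t for t in pool if t != required]

    if required is None:
        # The query can be answered directly, whatever tools are shown.
        shown = rng.sample(distractors, min(k, len(distractors)))
        return shown, "answer_directly"

    if rng.random() < drop_prob:
        # Withhold the required tool so the model must notice none fits.
        shown = rng.sample(distractors, min(k, len(distractors)))
        return shown, "no_suitable_tool"

    # Keep the required tool and pad with up to k - 1 random distractors.
    shown = rng.sample(distractors, min(k - 1, len(distractors)))
    shown.append(required)
    rng.shuffle(shown)  # avoid a positional cue for the correct tool
    return shown, f"call:{required.name}"
```

Under these assumptions, the same query can appear with different candidate sets across samples, which is one plausible way to discourage a model from memorizing a fixed tool inventory.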