The integration of large language models (LLMs) with external tools has significantly expanded the capabilities of AI agents. However, as the diversity of both LLMs and tools grows, selecting the optimal model-tool combination becomes a high-dimensional optimization problem. Existing approaches often rely on a single model or fixed tool-calling logic and therefore fail to exploit the performance variation across heterogeneous model-tool pairs. In this paper, we present ATLAS (Adaptive Tool-LLM Alignment and Synergistic Invocation), a dual-path framework for dynamic tool usage in cross-domain complex reasoning. Its two paths are (1) \textbf{training-free cluster-based routing}, which exploits empirical priors for domain-specific alignment, and (2) \textbf{RL-based multi-step routing}, which explores autonomous trajectories for out-of-distribution generalization. Extensive experiments across 15 benchmarks show that ATLAS outperforms closed-source models such as GPT-4o and surpasses existing routing methods on both in-distribution (+10.1%) and out-of-distribution (+13.1%) tasks. Furthermore, the framework shows significant gains in visual reasoning by orchestrating specialized multi-modal tools.
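To make the first path more concrete, the following is a minimal sketch of what training-free cluster-based routing could look like. The abstract does not specify how clusters are formed or how the empirical priors are computed, so everything here is an assumption for illustration: query embeddings are taken as given, cluster centroids are assumed to be precomputed, distance is Euclidean, and the prior for each model-tool pair is simply its observed accuracy within a cluster. The names `nearest_cluster`, `build_priors`, `route`, and the model and tool labels are hypothetical and are not taken from ATLAS.

```python
from math import dist


def nearest_cluster(embedding, centroids):
    """Return the index of the centroid closest to the embedding (Euclidean)."""
    return min(range(len(centroids)), key=lambda i: dist(embedding, centroids[i]))


def build_priors(history, centroids):
    """Estimate per-cluster accuracy for each (model, tool) pair.

    history: iterable of (embedding, (model, tool), correct) records,
    an assumed log format, not one described in the source.
    """
    counts = {}
    for embedding, pair, correct in history:
        cluster = nearest_cluster(embedding, centroids)
        hits, total = counts.get((cluster, pair), (0, 0))
        counts[(cluster, pair)] = (hits + int(correct), total + 1)

    priors = {}
    for (cluster, pair), (hits, total) in counts.items():
        priors.setdefault(cluster, {})[pair] = hits / total
    return priors


def route(embedding, centroids, priors, default):
    """Pick the pair with the highest empirical accuracy in the query's cluster."""
    scores = priors.get(nearest_cluster(embedding, centroids))
    if not scores:
        return default  # no evidence for this cluster yet
    return max(scores, key=scores.get)


# Toy data: two clusters and two hypothetical model-tool pairs.
centroids = [(0.0, 0.0), (10.0, 10.0)]
history = [
    ((0.0, 1.0), ("model-a", "calculator"), True),
    ((1.0, 0.0), ("model-b", "search"), False),
    ((9.0, 10.0), ("model-b", "search"), True),
    ((10.0, 9.0), ("model-a", "calculator"), False),
]
priors = build_priors(history, centroids)
choice_near_origin = route((0.5, 0.5), centroids, priors, default=("model-a", "search"))
choice_far = route((9.5, 9.5), centroids, priors, default=("model-a", "search"))
```

No model is trained in this sketch: routing reduces to a nearest-centroid lookup followed by an argmax over logged outcomes, which is one plausible reading of "training-free." The RL-based multi-step path is not covered, since the abstract gives no detail on its state, action, or reward design.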