Autonomous agents powered by large language models (LLMs) have garnered significant research attention. However, fully harnessing the potential of LLMs for agent-based tasks presents inherent challenges due to the heterogeneous nature of diverse data sources featuring multi-turn trajectories. In this paper, we introduce \textbf{AgentOhana} as a comprehensive solution to address these challenges. \textit{AgentOhana} aggregates agent trajectories from distinct environments, spanning a wide array of scenarios. It meticulously standardizes and unifies these trajectories into a consistent format, streamlining the creation of a generic data loader optimized for agent training. Leveraging the data unification, our training pipeline maintains equilibrium across different data sources and preserves independent randomness across devices during dataset partitioning and model training. Additionally, we present \textbf{xLAM-v0.1}, a large action model tailored for AI agents, which demonstrates exceptional performance across various benchmarks.
翻译:由大语言模型驱动的自主智能体已引起广泛研究关注。然而,由于多轮轨迹数据源具有异构特性,充分释放LLM在智能体任务中的潜力仍面临固有挑战。本文提出\textbf{AgentOhana}作为应对这些挑战的综合解决方案。\textit{AgentOhana}聚合来自不同环境的智能体轨迹,覆盖多种场景,并将这些轨迹精细标准化为统一格式,从而简化了专用于智能体训练的通用数据加载器构建流程。依托数据统一化,我们的训练流水线在不同数据源间保持平衡,并在数据划分与模型训练过程中保持设备间的独立随机性。此外,我们提出了针对AI智能体设计的大型动作模型\textbf{xLAM-v0.1},该模型在多项基准测试中展现出卓越性能。