Intelligent agent systems in real-world agricultural scenarios must handle diverse tasks over multimodal inputs, ranging from lightweight information understanding to complex multi-step execution. However, most existing approaches rely on a unified execution paradigm, which struggles to accommodate the large variations in task complexity and the incomplete tool availability commonly observed in agricultural environments. To address this challenge, we propose AgriAgent, a two-level agent framework for real-world agriculture. AgriAgent adopts a hierarchical execution strategy based on task complexity. Simple tasks are handled through direct reasoning by modality-specific agents. Complex tasks trigger a contract-driven planning mechanism that formulates each task as a set of capability requirements and performs capability-aware tool orchestration and dynamic tool generation, enabling multi-step, verifiable execution with failure recovery. Experimental results show that, on complex tasks, AgriAgent achieves higher execution success rates and greater robustness than existing tool-centric agent baselines that rely on unified execution paradigms. All code and data will be released after our work is accepted, to promote reproducible research.
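The two-level strategy described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Task`, `Contract`, `generate_tool`, and `run` are hypothetical, the complexity test (a task is "complex" if it lists any required capabilities) is an assumption, and dynamic tool generation is replaced by a stub so the example stays self-contained.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Tool = Callable[[dict], dict]  # a tool maps the current state to a new state


@dataclass
class Task:
    description: str
    modality: str  # e.g. "text" or "image"
    # Assumption: an empty list marks a simple task, a non-empty list a complex one.
    required_capabilities: List[str] = field(default_factory=list)


@dataclass
class Contract:
    capability: str
    check: Callable[[dict], bool]  # verifies the output of one step


def generate_tool(capability: str) -> Tool:
    """Stand-in for dynamic tool generation (hypothetical stub)."""
    def stub(state: dict) -> dict:
        return {**state, capability: f"generated:{capability}"}
    return stub


def run(task: Task,
        agents: Dict[str, Callable[[str], str]],
        registry: Dict[str, Tool],
        max_retries: int = 1) -> dict:
    # Level 1: simple tasks go straight to the modality-specific agent.
    if not task.required_capabilities:
        return {"path": "direct", "answer": agents[task.modality](task.description)}

    # Level 2: complex tasks are executed capability by capability.
    state: dict = {"input": task.description}
    log: List[str] = []
    for cap in task.required_capabilities:
        # Toy contract: the step must write a value under its capability name.
        contract = Contract(cap, check=lambda out, c=cap: c in out)
        tool = registry.get(cap)
        if tool is None:  # capability-aware orchestration: fill the gap
            tool = generate_tool(cap)
            registry[cap] = tool
            log.append(f"generated {cap}")
        for _ in range(max_retries + 1):
            try:
                out = tool(state)
            except Exception:
                out = None
            if out is not None and contract.check(out):
                state = out
                log.append(f"ok {cap}")
                break
            # Failure recovery: fall back to a freshly generated tool and retry.
            tool = generate_tool(cap)
            log.append(f"retry {cap} with generated tool")
        else:
            raise RuntimeError(f"capability {cap!r} could not be satisfied")
    return {"path": "planned", "state": state, "log": log}
```

In this sketch, a task with no listed capabilities takes the direct path, while a task such as `Task("diagnose and advise", "image", ["detect_disease", "recommend"])` is planned step by step, with each step checked against its contract before the next one runs.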