There has been growing interest in building agents that can interact with digital platforms to execute meaningful enterprise tasks autonomously. Among the approaches explored are tool-augmented agents built on abstractions such as the Model Context Protocol (MCP) and web agents that operate through graphical interfaces. Yet it remains unclear whether such complex agentic systems are necessary, given their cost and operational overhead. We argue that a coding agent equipped only with a terminal and a filesystem can solve many enterprise tasks more effectively by interacting directly with platform APIs. We evaluate this hypothesis across diverse real-world systems and show that these low-level terminal agents match or outperform more complex agent architectures. Our findings suggest that simple programmatic interfaces, combined with strong foundation models, are sufficient for practical enterprise automation.