The emergence of Large Language Models (LLMs) has reshaped agent systems. Unlike traditional rule-based agents with limited task scope, LLM-powered agents offer greater flexibility, cross-domain reasoning, and natural language interaction. Moreover, by integrating multi-modal LLMs, current agent systems can process diverse data modalities, including text, images, audio, and structured tabular data, enabling richer and more adaptive real-world behavior. This paper comprehensively examines the evolution of agent systems from the pre-LLM era to current LLM-powered architectures. We categorize agent systems into software-based, physical, and adaptive hybrid systems, and highlight applications in customer service, software development, manufacturing automation, personalized education, financial trading, and healthcare. We further discuss the primary challenges posed by LLM-powered agents, including high inference latency, output uncertainty, a lack of evaluation metrics, and security vulnerabilities, and propose potential solutions to mitigate them.