The rapid progress of foundation models has fueled the rise of autonomous agents, which leverage the general capabilities of foundation models for reasoning, decision-making, and environmental interaction. However, the efficacy of agents remains limited in intricate, realistic environments. In this work, we introduce the principles of $\mathbf{U}$nified $\mathbf{A}$lignment for $\mathbf{A}$gents ($\mathbf{UA}^2$), which advocate the simultaneous alignment of agents with human intentions, environmental dynamics, and self-constraints such as limited monetary budgets. From the perspective of $\mathbf{UA}^2$, we review current agent research and highlight the factors neglected by existing agent benchmarks and candidate methods. We also conduct proof-of-concept studies by introducing realistic features to WebShop: user profiles to convey intentions, personalized reranking to model complex environmental dynamics, and runtime cost statistics to reflect self-constraints. We then follow the principles of $\mathbf{UA}^2$ to propose an initial agent design and benchmark it against several candidate baselines in the retrofitted WebShop. The experimental results support the importance of the principles of $\mathbf{UA}^2$. Our research sheds light on the next steps for autonomous agent research toward improved general problem-solving abilities.