A crucial requirement for deploying LLM-based agents in real-life applications is robustness against risky or irreversible mistakes. However, existing research pays little attention to preemptively evaluating the reasoning trajectories of LLM agents, leaving a gap in ensuring their safe and reliable operation. To address this gap, this paper introduces InferAct, a novel approach that leverages the Theory-of-Mind capability of LLMs to proactively detect potential errors before critical actions are executed (e.g., "buy-now" in automated online trading or web shopping). InferAct can also incorporate human feedback to prevent irreversible risks and improve the actor agent's decision-making. Experiments on three widely used tasks demonstrate the effectiveness of InferAct. The proposed approach makes concrete contributions toward developing LLM agents that can be safely deployed in environments involving critical decision-making.