AI-powered web agents could automate repetitive tasks such as form filling, information retrieval, and scheduling, but they struggle to execute these tasks reliably without human intervention and require users to provide detailed guidance on every run. We address this limitation by automatically synthesizing reusable workflows from an agent's successful and failed attempts. These workflows incorporate execution guards that help agents detect and fix errors while keeping users informed of progress and issues. Our approach enables agents to complete repetitive tasks of the same type with minimal user intervention, raising the success rate from 24.2% to 70.1% across fifteen tasks. In an evaluation with nine users, we found that our agent helped them complete web tasks with a higher success rate and less guidance than two baseline methods, and that it allowed them to easily monitor agent behavior and understand its failures.