Experience intervention has emerged as a promising technical paradigm for web agents, improving their interaction capabilities by supplying insights distilled from accumulated experience. However, existing methods predominantly inject experience passively as global context before task execution, and therefore struggle to adapt to the dynamically changing observations that arise during agent-environment interaction. We propose ExpSeek, which shifts experience use toward step-level proactive seeking by (1) estimating step-level entropy thresholds to determine intervention timing from the model's intrinsic signals, and (2) designing experience content tailored to each step. Experiments with Qwen3-8B and Qwen3-32B models across four challenging web agent benchmarks show that ExpSeek achieves absolute improvements of 9.3% and 7.5%, respectively. Our experiments validate the feasibility and advantages of entropy as a self-triggering signal, and reveal that even a small 4B experience model can significantly boost the performance of larger agent models.
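To make the idea of an entropy-based trigger concrete, the following is a minimal sketch and not the ExpSeek implementation. It assumes that a step's uncertainty is summarized as the mean Shannon entropy of the agent's next-token distributions, and that intervention fires when this value exceeds a threshold. The function names, the aggregation by mean, and the threshold value of 0.7 are all hypothetical choices made for illustration; the abstract does not specify how thresholds are estimated.

```python
import math
from typing import Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def step_entropy(token_dists: Sequence[Sequence[float]]) -> float:
    """Mean token entropy over the tokens generated in one agent step.

    Averaging is an assumption of this sketch, not a detail from the paper.
    """
    if not token_dists:
        raise ValueError("a step needs at least one token distribution")
    return sum(token_entropy(d) for d in token_dists) / len(token_dists)


def should_seek_experience(
    token_dists: Sequence[Sequence[float]], threshold: float
) -> bool:
    """Hypothetical trigger: seek experience when the step entropy is high."""
    return step_entropy(token_dists) > threshold


# Toy distributions over a 4-token vocabulary (illustrative values only).
confident_step = [[0.97, 0.01, 0.01, 0.01]] * 2
uncertain_step = [[0.25, 0.25, 0.25, 0.25]] * 2

HYPOTHETICAL_THRESHOLD = 0.7  # placeholder, not an estimated value

for name, step in [("confident", confident_step), ("uncertain", uncertain_step)]:
    triggered = should_seek_experience(step, HYPOTHETICAL_THRESHOLD)
    print(f"{name}: entropy={step_entropy(step):.3f}, seek_experience={triggered}")
```

In this toy setting, the near-deterministic step stays below the threshold and proceeds without intervention, while the uniform step exceeds it and would request step-level experience. In practice the threshold would have to be estimated from data rather than fixed by hand.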