Despite significant advances in autonomous web navigation, current methods remain far from human-level performance in complex web environments. We argue that this limitation stems from Topological Blindness: agents are forced to explore by trial and error because they lack access to the global topological structure of the environment. To overcome this limitation, we introduce WebNavigator, which reframes web navigation from probabilistic exploration into deterministic retrieval and pathfinding. Offline, WebNavigator constructs Interaction Graphs through heuristic exploration at zero token cost; online, it implements a Retrieve-Reason-Teleport workflow for global navigation. WebNavigator achieves state-of-the-art performance on WebArena and OnlineMind2Web. On WebArena multi-site tasks, it reaches a 72.9\% success rate, more than doubling the performance of enterprise-level agents. This work reveals that Topological Blindness, rather than model reasoning capabilities alone, is an underestimated bottleneck in autonomous web navigation.
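To make the idea of "retrieval and pathfinding" concrete, the following is a minimal sketch of how a prebuilt interaction graph could turn navigation into a lookup plus a shortest-path search. It is not the paper's implementation: the graph contents, the `PAGE_TEXT` descriptions, the keyword-overlap `retrieve` function, and the use of breadth-first search are all assumptions made for illustration, and the "Teleport" step is read here simply as following the precomputed action path.

```python
from collections import deque

# Hypothetical toy interaction graph: page -> {action label: destination page}.
# All pages and actions are invented for illustration only.
INTERACTION_GRAPH = {
    "home": {"click 'Shop'": "catalog", "click 'Account'": "account"},
    "catalog": {"click 'Laptops'": "laptops"},
    "account": {"click 'Orders'": "orders"},
    "laptops": {"click 'Add to cart'": "cart"},
    "orders": {},
    "cart": {},
}

# Hypothetical page descriptions used by the naive retriever below.
PAGE_TEXT = {
    "home": "landing page with site navigation",
    "catalog": "product categories list",
    "account": "account settings and profile",
    "laptops": "laptops product listing",
    "orders": "past orders history",
    "cart": "shopping cart checkout",
}


def retrieve(query, page_text):
    """Return the page whose description shares the most words with the query.

    A stand-in for a real retriever; keyword overlap is an assumption.
    """
    words = set(query.lower().split())
    return max(
        page_text,
        key=lambda page: len(words & set(page_text[page].lower().split())),
    )


def shortest_path(graph, start, goal):
    """Breadth-first search over the graph.

    Returns the list of action labels leading from start to goal,
    or None if the goal is unreachable.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        page, actions = queue.popleft()
        if page == goal:
            return actions
        for action, next_page in graph.get(page, {}).items():
            if next_page not in seen:
                seen.add(next_page)
                queue.append((next_page, actions + [action]))
    return None


if __name__ == "__main__":
    target = retrieve("view my past orders", PAGE_TEXT)
    plan = shortest_path(INTERACTION_GRAPH, "home", target)
    print(target, plan)
```

In this sketch the agent never explores at run time: the target page is retrieved from the stored descriptions, and the action sequence is read off the graph, which is the contrast with trial-and-error exploration that the abstract draws.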