To Search or Not to Search: Aligning the Decision Boundary of Deep Search Agents via Causal Intervention

Deep search agents, which autonomously iterate through multi-turn web-based reasoning, represent a promising paradigm for complex information-seeking tasks. However, current agents suffer from a critical inefficiency: they search excessively because they cannot accurately judge when to stop searching and start answering. This stems from outcome-centric training that prioritizes final results over the search process itself. We identify the root cause as a misaligned decision boundary, i.e., the threshold that determines when the accumulated information suffices to answer. Misalignment produces two kinds of error: over-search (redundant searching despite sufficient knowledge) and under-search (premature termination that yields incorrect answers). To address these errors, we propose a framework with two key components. First, we introduce a causal intervention-based diagnosis that identifies boundary errors by comparing factual and counterfactual trajectories at each decision point. Second, we develop Decision Boundary Alignment for Deep Search agents (DAS), which constructs preference datasets from causal feedback and aligns policies via preference optimization. Experiments on public datasets demonstrate that decision boundary errors are pervasive across state-of-the-art agents. Our DAS method effectively calibrates these boundaries, mitigating both over-search and under-search to achieve substantial gains in accuracy and efficiency. Our code and data are publicly available at: https://github.com/Applied-Machine-Learning-Lab/WWW2026_DAS.
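To make the diagnosis step concrete, the following is a minimal sketch of how factual and counterfactual actions might be compared at each decision point and turned into preference pairs. It is an illustration under stated assumptions, not the released DAS implementation: the names `Step`, `PreferencePair`, `diagnose`, and the `rollout_correct` callback are hypothetical, and the toy rollout stands in for actually re-running an agent after forcing the opposite action.

```python
from dataclasses import dataclass
from typing import Callable, List

SEARCH, ANSWER = "search", "answer"


@dataclass
class Step:
    context: str  # everything the agent has seen before this decision
    action: str   # the factual choice: SEARCH or ANSWER


@dataclass
class PreferencePair:
    context: str
    chosen: str      # action preferred according to the counterfactual outcome
    rejected: str    # action the agent actually took
    error_type: str  # "over-search" or "under-search"


def diagnose(
    steps: List[Step],
    factual_correct: bool,
    rollout_correct: Callable[[str, str], bool],
) -> List[PreferencePair]:
    """Flag boundary errors by intervening on each decision.

    rollout_correct(context, forced_action) is a hypothetical callback: it
    forces `forced_action` at `context`, lets the agent finish, and reports
    whether the final answer is correct.
    """
    pairs: List[PreferencePair] = []
    for step in steps:
        if step.action == SEARCH:
            # Counterfactual: answer now. If that already succeeds,
            # the factual search was redundant.
            if rollout_correct(step.context, ANSWER):
                pairs.append(
                    PreferencePair(step.context, ANSWER, SEARCH, "over-search")
                )
        elif step.action == ANSWER and not factual_correct:
            # Counterfactual: keep searching. If that fixes the answer,
            # the factual stop was premature.
            if rollout_correct(step.context, SEARCH):
                pairs.append(
                    PreferencePair(step.context, SEARCH, ANSWER, "under-search")
                )
    return pairs


# Toy example (assumption): the question becomes answerable once "doc1"
# has been retrieved, so the second search is redundant.
def toy_rollout(context: str, forced_action: str) -> bool:
    if forced_action == ANSWER:
        return "doc1" in context
    return True


trajectory = [
    Step("q", SEARCH),
    Step("q+doc1", SEARCH),
    Step("q+doc1+doc2", ANSWER),
]
pairs = diagnose(trajectory, factual_correct=True, rollout_correct=toy_rollout)
for p in pairs:
    print(p.error_type, "at", p.context, "-> prefer", p.chosen)
```

In this toy trajectory only the second search is flagged, because answering was already sufficient once `doc1` had been retrieved. The resulting (context, chosen, rejected) triples have the shape that preference-optimization methods typically consume; which optimizer and rollout procedure the authors use is not specified in the abstract.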
