Temporal Information Retrieval (TIR) is a critical yet unresolved task for modern search systems: it requires retrieving documents that not only satisfy a query's information need but also adhere to its temporal constraints. The task is shaped by two challenges: Relevance, which requires alignment with the query's explicit temporal requirements, and Recency, which requires selecting the freshest document among multiple versions. Existing methods often address the two challenges in isolation and rely on brittle heuristics that fail when temporal requirements and staleness resistance are intertwined. To address this gap, we introduce Re2Bench, a benchmark specifically designed to disentangle and evaluate Relevance, Recency, and their hybrid combination. Building on this foundation, we propose Re3, a unified and lightweight framework that dynamically balances semantic and temporal information through a query-aware gating mechanism. On Re2Bench, Re3 achieves state-of-the-art results, leading in R@1 across all three subsets. Ablation studies, together with backbone sensitivity tests, confirm its robustness and show strong generalization across diverse encoders and real-world settings. This work provides both a generalizable solution and a principled evaluation suite, advancing the development of temporally aware retrieval systems. Re3 and Re2Bench are available online: https://anonymous.4open.science/r/Re3-0C5A
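To make the idea of query-aware gating and the R@1 metric concrete, the following is a minimal illustrative sketch and not the Re3 implementation. It assumes that each candidate document already has a semantic score and a temporal score, and that the gate is a single sigmoid over hand-chosen query features. The names `gate`, `fused_score`, `rank`, and `recall_at_1`, as well as the feature and weight values, are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gate(query_features, weights, bias=0.0):
    """Hypothetical query-aware gate: a scalar in (0, 1) computed from
    query features only (assumption: a single linear layer + sigmoid)."""
    return sigmoid(sum(w * f for w, f in zip(weights, query_features)) + bias)


def fused_score(semantic: float, temporal: float, g: float) -> float:
    """Convex combination: g near 1 favors the temporal score,
    g near 0 favors the semantic score."""
    return (1.0 - g) * semantic + g * temporal


def rank(query_features, candidates, weights, bias=0.0):
    """Sort candidates by fused score, best first."""
    g = gate(query_features, weights, bias)
    return sorted(
        candidates,
        key=lambda c: fused_score(c["semantic"], c["temporal"], g),
        reverse=True,
    )


def recall_at_1(rankings, gold_ids):
    """Fraction of queries whose top-ranked document is the gold document."""
    hits = sum(1 for r, gold in zip(rankings, gold_ids) if r and r[0]["id"] == gold)
    return hits / len(gold_ids)


# Toy data (made up for illustration): two versions of the same document.
candidates = [
    {"id": "old", "semantic": 0.9, "temporal": 0.1},
    {"id": "new", "semantic": 0.8, "temporal": 0.9},
]
weights = [4.0]  # assumed weight on a single "temporal cue" feature

time_sensitive = rank([1.0], candidates, weights)   # gate opens toward temporal
time_agnostic = rank([-1.0], candidates, weights)   # gate stays near semantic
```

In this toy setup the same pair of candidates is ordered differently depending on the query features, which is the behavior a query-aware gate is meant to provide. A learned gate would replace the hand-set weights.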