In commercial web search, aligning content freshness with user intent remains challenging because the lifespan of information varies widely. Traditional industrial approaches rely on static time-window filtering, which produces "one-size-fits-all" rankings in which content may be chronologically recent but semantically expired. To address this limitation, we present a novel Large Language Model (LLM)-based Query-Aware Dynamic Content Expiration Prediction Framework, deployed in Baidu search, that reformulates timeliness as a dynamic validity inference task. Our framework extracts fine-grained temporal contexts from documents and leverages LLMs to deduce a query-specific "validity horizon", a semantic boundary defining when information becomes obsolete with respect to user intent. The framework integrates hallucination mitigation strategies to ensure reliability, and we evaluate it through offline experiments and online A/B testing on live production traffic. Results demonstrate significant improvements in search freshness and user experience metrics, validating the effectiveness of LLM-driven reasoning for solving semantic expiration at an industrial scale.
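To make the contrast between static time-window filtering and a query-specific validity horizon concrete, the following is a minimal sketch and not the deployed system. All names here (`Document`, `static_window_filter`, `is_valid_for_query`, `toy_horizon`) are hypothetical, the 30-day window and 3-day event lifespan are arbitrary, and the LLM inference step is replaced by a hard-coded stand-in rule purely for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Optional


@dataclass
class Document:
    text: str
    published: date


def static_window_filter(doc: Document, today: date, window_days: int = 30) -> bool:
    """Baseline: treat a document as fresh if it was published within a fixed window."""
    return (today - doc.published) <= timedelta(days=window_days)


# Hypothetical interface: maps (query, document) to the last date on which the
# document is still valid for that query, or None if it does not expire.
HorizonFn = Callable[[str, Document], Optional[date]]


def is_valid_for_query(query: str, doc: Document, today: date,
                       infer_horizon: HorizonFn) -> bool:
    """Query-aware check: valid if no horizon exists or today has not passed it."""
    horizon = infer_horizon(query, doc)
    return horizon is None or today <= horizon


def toy_horizon(query: str, doc: Document) -> Optional[date]:
    """Stand-in for the LLM step (assumption): event-like queries expire quickly,
    everything else is treated as evergreen."""
    q = query.lower()
    if "ticket" in q or "event" in q:
        return doc.published + timedelta(days=3)
    return None


today = date(2024, 5, 10)
event_doc = Document("Concert on May 3, tickets on sale now", date(2024, 5, 1))
howto_doc = Document("How to boil an egg", date(2020, 1, 1))

# Recent but semantically expired: the static window keeps it, the horizon check drops it.
print(static_window_filter(event_doc, today),
      is_valid_for_query("concert tickets", event_doc, today, toy_horizon))

# Old but still valid: the static window drops it, the horizon check keeps it.
print(static_window_filter(howto_doc, today),
      is_valid_for_query("how to boil an egg", howto_doc, today, toy_horizon))
```

In this sketch the horizon function is passed in as a parameter, so the rule-based stand-in could be swapped for a model-backed implementation without changing the validity check itself.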