Trending news detection in low-traffic search environments faces a fundamental cold-start problem: a lack of query volume prevents systems from identifying emerging or long-tail trends. Existing methods that rely on keyword frequency or query spikes are inherently slow and ineffective in these sparse settings, lagging behind real-world shifts in attention. We introduce RTTP, a novel Real-Time Trending Prediction framework that generates search queries directly from news content instead of waiting for users to issue them. RTTP leverages a continual learning LLM (CL-LLM) that converts posts into search-style queries and scores them using engagement strength and creator authority, enabling early trend surfacing before search volume forms. To ensure adaptation without degrading reasoning, we propose Mix-Policy DPO, a new preference-based continual learning approach that combines on-policy stability with off-policy novelty to mitigate catastrophic forgetting during model upgrades. Deployed at production scale on Facebook and Meta AI products, RTTP delivers a +91.4% improvement in tail-trend detection precision@500 and a +19% gain in query generation accuracy over industry baselines, while sustaining stable performance after multi-week online training. This work demonstrates that LLM-generated synthetic search signals, when aligned and continually updated, unlock timely trend understanding in low-traffic search environments.
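To make the Mix-Policy DPO idea concrete, the sketch below shows one plausible way to weight a standard DPO objective over on-policy preference pairs (sampled from the current model, for stability) against off-policy pairs (fresh preference data, for novelty). The function names, the mixing weight `lam`, and the batch layout are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of a mixed-policy DPO objective in PyTorch.
# All names (dpo_loss, mix_policy_dpo_loss, lam) are hypothetical.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO loss over a batch of (chosen, rejected) preference pairs."""
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

def mix_policy_dpo_loss(on_policy_batch, off_policy_batch, lam=0.5, beta=0.1):
    """Weighted mix of DPO losses: on-policy pairs anchor the model's current
    behaviour (stability), off-policy pairs introduce new preferences (novelty).
    lam trades stability against novelty."""
    loss_on = dpo_loss(*on_policy_batch, beta=beta)
    loss_off = dpo_loss(*off_policy_batch, beta=beta)
    return lam * loss_on + (1.0 - lam) * loss_off

# Toy usage with random log-probabilities standing in for real model outputs.
if __name__ == "__main__":
    fake_batch = lambda: tuple(torch.randn(8) for _ in range(4))
    print(mix_policy_dpo_loss(fake_batch(), fake_batch(), lam=0.7).item())
```

Under this reading, tuning `lam` toward 1 favors retaining existing behaviour during a model upgrade, while lower values let newly collected preferences dominate.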