This paper focuses on automatically generating informative ad descriptions in sponsored search. Unlike ad titles, which are usually optimized to attract user clicks, ad descriptions span more text and can incorporate world knowledge to address user search intent while presenting an ad's fine-grained selling points. We propose Interactor, a multi-turn iterative creation framework for ad description generation, optimized with agentic reinforcement learning (RL). The generation model acts as a policy that interacts with a customized environment consisting of multiple generative reward models (GenRMs). Given the policy's initial generations, the GenRMs evaluate multiple quality dimensions, including knowledge capacity and landing page consistency, and return both binary signals and reasoning feedback. The policy then iteratively refines the descriptions based on this feedback, improving them over successive turns. Experiments on industrial datasets show that Interactor significantly outperforms state-of-the-art approaches in generating knowledge-rich and faithful ad descriptions. Since May 2026, it has been deployed online in a leading search ads system, contributing to both ad revenue and user experience.
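The generate, evaluate, refine loop described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the names `Verdict`, `iterative_create`, `toy_policy`, `knowledge_rm`, and `consistency_rm` are hypothetical, the two reward models are toy string heuristics standing in for LLM-based GenRMs, and the RL training of the policy is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Verdict:
    dimension: str   # quality dimension being judged (hypothetical label)
    passed: bool     # binary signal
    feedback: str    # reasoning feedback passed back to the policy


# Hypothetical interfaces: a reward model judges one description against the
# landing page; a policy writes or revises a description given failed verdicts.
RewardModel = Callable[[str, str], Verdict]
Policy = Callable[[str, str, Optional[str], List[Verdict]], str]


def iterative_create(
    policy: Policy,
    reward_models: List[RewardModel],
    query: str,
    landing_page: str,
    max_turns: int = 3,
) -> List[Tuple[str, List[Verdict]]]:
    """Run up to max_turns generate/evaluate rounds and return the full history."""
    history: List[Tuple[str, List[Verdict]]] = []
    previous: Optional[str] = None
    failed: List[Verdict] = []
    for _ in range(max_turns):
        description = policy(query, landing_page, previous, failed)
        verdicts = [rm(description, landing_page) for rm in reward_models]
        history.append((description, verdicts))
        failed = [v for v in verdicts if not v.passed]
        if not failed:  # every dimension passed, stop refining
            break
        previous = description
    return history


def knowledge_rm(description: str, landing_page: str) -> Verdict:
    # Toy proxy for "knowledge capacity": require at least 8 words.
    ok = len(description.split()) >= 8
    return Verdict("knowledge", ok, "ok" if ok else "Add concrete product knowledge.")


def consistency_rm(description: str, landing_page: str) -> Verdict:
    # Toy proxy for landing page consistency: flag one unsupported claim.
    claim = "free shipping"
    ok = claim not in description.lower() or claim in landing_page.lower()
    return Verdict("consistency", ok, "ok" if ok else f"'{claim}' is not on the landing page.")


def toy_policy(query: str, landing_page: str,
               previous: Optional[str], failed: List[Verdict]) -> str:
    # Stand-in for the generation model: a fixed first draft, then rule-based edits.
    if previous is None:
        return "Great shoes with free shipping."
    text = previous
    for verdict in failed:
        if verdict.dimension == "consistency":
            text = text.replace(" with free shipping", "")
        elif verdict.dimension == "knowledge":
            text = text.rstrip(".") + ", built with cushioned midsoles that reduce impact on long runs."
    return text


LANDING_PAGE = "Trail running shoes with cushioned midsoles. 30-day returns."
history = iterative_create(toy_policy, [knowledge_rm, consistency_rm],
                           "running shoes", LANDING_PAGE)
final_description, final_verdicts = history[-1]
print(final_description)
```

In this sketch only the failed verdicts are passed back to the policy, so each refinement turn targets the dimensions that still need work. Whether Interactor feeds back all verdicts or only the failed ones is not stated in the abstract.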