Optimizing reranking in advertising feeds is a constrained combinatorial problem, requiring simultaneous maximization of platform revenue and preservation of user experience. Recent generative ranking methods enable listwise optimization via autoregressive decoding, but their deployment is hindered by high inference latency and limited constraint handling. We propose a constraint-aware generative reranking framework that transforms constrained optimization into bounded neural decoding. Unlike prior approaches that separate generator and evaluator models, our framework unifies sequence generation and reward estimation into a single network. We further introduce constraint-aware reward pruning, integrating constraint satisfaction directly into decoding to efficiently generate optimal sequences. Experiments on large-scale industrial feeds and online A/B tests show that our method improves revenue and user engagement while meeting strict latency requirements, providing an efficient neural solution for constrained listwise optimization.
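To make the idea of folding constraint checks into decoding concrete, the following is a minimal sketch, not the authors' implementation. It uses a plain beam search in which infeasible extensions are discarded before scoring and low-reward prefixes are pruned at each step. Everything here is an assumption for illustration: the `Item` type, the per-item `value` standing in for a learned reward estimate, the `1 / (1 + pos)` position discount, and the two constraints (`max_ads`, `min_gap`). In the framework described above, generation and reward estimation come from a single network rather than a fixed score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    is_ad: bool
    value: float  # hypothetical stand-in for a learned reward estimate


def position_weight(pos: int) -> float:
    # Toy position discount (assumption): earlier slots count more.
    return 1.0 / (1 + pos)


def violates(prefix: tuple, item: Item, max_ads: int, min_gap: int) -> bool:
    """Return True if appending `item` to `prefix` breaks a constraint."""
    if not item.is_ad:
        return False
    if sum(1 for it in prefix if it.is_ad) >= max_ads:
        return True  # ad load cap reached
    if min_gap > 0 and any(it.is_ad for it in prefix[-min_gap:]):
        return True  # an ad already sits within the last `min_gap` slots
    return False


def constrained_beam_rerank(items, list_len, beam_width=3, max_ads=1, min_gap=2):
    """Beam search that only ever extends feasible prefixes.

    Returns (sequence, score) for the best feasible list found,
    or None if no feasible list of length `list_len` exists.
    """
    beams = [((), 0.0)]
    for pos in range(list_len):
        candidates = []
        for prefix, score in beams:
            used = {it.item_id for it in prefix}
            for item in items:
                if item.item_id in used:
                    continue
                if violates(prefix, item, max_ads, min_gap):
                    continue  # constraint pruning: never scored, never kept
                gain = position_weight(pos) * item.value
                candidates.append((prefix + (item,), score + gain))
        if not candidates:
            return None
        # Reward pruning: keep only the top `beam_width` partial lists.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


pool = [
    Item("A1", True, 5.0),
    Item("A2", True, 4.0),
    Item("O1", False, 3.0),
    Item("O2", False, 2.0),
    Item("O3", False, 1.0),
]
best = constrained_beam_rerank(pool, list_len=4, beam_width=3, max_ads=2, min_gap=2)
if best is not None:
    sequence, score = best
    print([it.item_id for it in sequence], round(score, 4))
```

Because infeasible extensions are dropped before they enter the beam, every returned list satisfies the constraints by construction, and the work per step is bounded by `beam_width` times the candidate pool size. A beam search of this kind is a heuristic and does not guarantee the globally optimal list.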