Guiding the Recommender: Information-Aware Auto-Bidding for Content Promotion

Modern content platforms offer paid promotion to mitigate cold start, allocating exposure through auctions. Our empirical analysis reveals a counterintuitive flaw in this paradigm: while promotion rescues low-to-medium quality content, it can harm high-quality content by forcing exposure to suboptimal audiences, which pollutes engagement signals and degrades future recommendations. We recast content promotion as a dual-objective optimization that balances short-term value acquisition with long-term model improvement. To make this tractable at bid time, we introduce a decomposable surrogate objective, gradient coverage, and establish its formal connection to Fisher information and optimal experimental design. We design a two-stage auto-bidding algorithm based on Lagrangian duality that paces the budget dynamically through a shadow price and optimizes impression-level bids using per-impression marginal utilities. To address missing labels at bid time, we propose a confidence-gated gradient heuristic, paired with a zeroth-order variant for black-box models, that reliably estimates learning signals in real time. We provide theoretical guarantees, proving monotone submodularity of the composite objective, sublinear regret in online auctions, and budget feasibility. Extensive offline experiments on synthetic and real-world datasets validate the framework: it outperforms baselines, achieves superior final AUC/LogLoss, adheres closely to budget targets, and remains effective when gradients are approximated with zeroth-order estimates. These results show that strategic, information-aware promotion can improve long-term model performance and organic outcomes beyond naive impression-maximization strategies.

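The abstract describes the bidding loop only at a high level, so the following Python sketch is one possible reading of it, not the authors' algorithm. Everything specific in it is an assumption made for illustration: the log-determinant form of the information gain (a D-optimal-design style surrogate standing in for gradient coverage), the logistic model and gating threshold `tau` in `gated_gradient`, the trade-off weight `gamma`, the bid rule `utility / (1 + lam)`, and the dual-subgradient step size `eta`. All function names are hypothetical.

```python
import numpy as np


def info_gain(A_inv, g):
    """Marginal log-det gain of adding gradient g: log(1 + g^T A^{-1} g).

    Assumption: this log-det surrogate stands in for the paper's
    "gradient coverage"; the actual definition may differ.
    """
    return float(np.log1p(g @ A_inv @ g))


def sherman_morrison_update(A_inv, g):
    """Return the inverse of (A + g g^T) given A^{-1} (rank-one update)."""
    Ag = A_inv @ g
    return A_inv - np.outer(Ag, Ag) / (1.0 + g @ Ag)


def gated_gradient(w, x, tau=0.3):
    """Label-free gradient proxy for a logistic model (hypothetical gating rule).

    The log-loss gradient is (p - y) * x, but y is unknown at bid time.
    If the model is confident, use the residual under the predicted label;
    otherwise use the expected |p - y| for y ~ Bernoulli(p).
    """
    p = 1.0 / (1.0 + np.exp(-(w @ x)))
    if abs(p - 0.5) >= tau:
        resid = min(p, 1.0 - p)        # residual under the pseudo-label
    else:
        resid = 2.0 * p * (1.0 - p)    # expected absolute residual
    return resid * x


def run_campaign(X, values, prices, w, budget, gamma=1.0, eta=0.05):
    """Pace a budget with a shadow price while bidding on value plus information.

    X: (T, d) impression features; values: short-term value per impression;
    prices: highest competing bid per impression (second-price style payment).
    """
    T, d = X.shape
    A_inv = np.eye(d)          # inverse of I + sum of g g^T over won impressions
    lam = 0.0                  # shadow price (dual variable) of the budget
    spend = 0.0
    total_value = 0.0
    target = budget / T        # per-impression spend target
    for t in range(T):
        g = gated_gradient(w, X[t])
        utility = values[t] + gamma * info_gain(A_inv, g)
        bid = utility / (1.0 + lam)
        cost = 0.0
        # Win only if the bid clears the competing price and budget remains.
        if bid >= prices[t] and spend + prices[t] <= budget:
            cost = prices[t]
            spend += cost
            total_value += values[t]
            A_inv = sherman_morrison_update(A_inv, g)
        # Dual subgradient step: raise the price when overspending the target.
        lam = max(0.0, lam + eta * (cost - target))
    return spend, total_value, lam
```

Calling `run_campaign` on randomly generated features, values, and prices returns the total spend, the acquired short-term value, and the final shadow price. Because each won impression shrinks the log-det gain of similar gradients, the bid for redundant impressions falls over time, which is the diminishing-returns behaviour that submodularity implies.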