Decision Transformer (DT) shows promise for generative auto-bidding by capturing temporal dependencies, but it suffers from two critical limitations: insufficient modeling of the cross-correlations among state, action, and return-to-go (RTG) sequences, and indiscriminate learning from both optimal and suboptimal behaviors. To address these issues, we propose C2, a novel framework that enhances DT with two core innovations: (1) a Cross Learning Block (CLB) that uses cross-attention to strengthen inter-sequence correlation modeling; and (2) a Constraint-aware Loss (CL) that incorporates budget and Cost-Per-Acquisition (CPA) constraints to learn selectively from optimal trajectories. Extensive offline evaluations on the AuctionNet dataset demonstrate consistent performance gains (up to 3.23\% over the state-of-the-art GAVE) across diverse budget settings, and ablation studies verify the complementary synergy of CLB and CL, confirming C2's superiority in auto-bidding. Code for reproducing our results is available at: https://github.com/Dingjinren/C2.