Scaling Transformer-based click-through rate (CTR) models by stacking more parameters brings growing computational and storage overhead, widening the gap between scaling ambitions and stringent industrial deployment constraints. We propose LoopCTR, which introduces a loop scaling paradigm: training-time computation is increased by recursively reusing shared model layers, decoupling computation from parameter growth. LoopCTR adopts a sandwich architecture enhanced with Hyper-Connected Residuals and Mixture-of-Experts, and employs process supervision at every loop depth to encode multi-loop benefits into the shared parameters. This enables a train-multi-loop, infer-zero-loop strategy in which a single forward pass without any loop already outperforms all baselines. Experiments on three public benchmarks and one industrial dataset demonstrate state-of-the-art performance. Oracle analysis further reveals 0.02 to 0.04 AUC of untapped headroom, and models trained with fewer loops exhibit higher oracle ceilings, pointing to a promising frontier for adaptive inference.
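The loop scaling idea and the per-depth process supervision described above can be illustrated with a minimal sketch. This is not the LoopCTR implementation: the class `LoopModelSketch`, the function `process_supervised_loss`, the dense tanh layers, the plain residual update, and the unweighted mean over depths are all hypothetical simplifications. Hyper-Connected Residuals and Mixture-of-Experts are omitted entirely, and supervising the depth-0 output is an assumption made so that the zero-loop path is trained.

```python
import math
import random


def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def init_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix; the initialisation scheme is an arbitrary choice."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LoopModelSketch:
    """Hypothetical toy model: entry layer -> shared block applied k times -> exit head."""

    def __init__(self, in_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.w_entry = init_matrix(hidden_dim, in_dim, rng)
        # One set of loop weights, reused at every loop depth.
        self.w_loop = init_matrix(hidden_dim, hidden_dim, rng)
        self.w_exit = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]

    def _head(self, h):
        """Map a hidden state to a click probability with a sigmoid."""
        logit = sum(w * v for w, v in zip(self.w_exit, h))
        return 1.0 / (1.0 + math.exp(-logit))

    def forward(self, x, num_loops):
        """Return click probabilities at depths 0..num_loops (inclusive)."""
        h = [math.tanh(v) for v in matvec(self.w_entry, x)]
        probs = [self._head(h)]  # depth 0: the zero-loop path
        for _ in range(num_loops):
            update = [math.tanh(v) for v in matvec(self.w_loop, h)]
            h = [a + b for a, b in zip(h, update)]  # plain residual (assumption)
            probs.append(self._head(h))
        return probs

    def num_parameters(self):
        """Parameter count; it does not depend on the number of loops."""
        return (
            len(self.w_entry) * len(self.w_entry[0])
            + len(self.w_loop) * len(self.w_loop[0])
            + len(self.w_exit)
        )


def process_supervised_loss(probs, label, eps=1e-7):
    """Mean binary cross-entropy over the outputs of every loop depth."""
    total = 0.0
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)  # clamp for numerical safety
        total += -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    return total / len(probs)


model = LoopModelSketch(in_dim=4, hidden_dim=8)
x = [0.5, -1.0, 0.25, 2.0]

# Training: run several loops and supervise every depth.
train_probs = model.forward(x, num_loops=3)
loss = process_supervised_loss(train_probs, label=1)

# Inference: a single forward pass with no loop.
infer_prob = model.forward(x, num_loops=0)[0]
```

In this sketch, raising `num_loops` adds computation during training while `num_parameters()` stays fixed, and the zero-loop prediction is simply the depth-0 output that the loss already supervised. A real training loop would also need gradient computation and an optimizer, which are left out here.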