Scaling laws for recommender systems are increasingly well validated: MetaFormer-based architectures consistently benefit from greater model depth, larger hidden dimensionality, and longer user behavior sequences. However, whether representation capacity scales proportionally with parameter count remains unexplored. Prior studies on RankMixer reveal that the effective rank of token representations follows a damped oscillatory trajectory across layers: it fails to increase consistently with depth and even degrades in deeper layers. Motivated by this observation, we propose RankUp, an architecture designed to mitigate representation collapse and enhance expressive capacity through four components: randomized permutation splitting over sparse features, a multi-embedding paradigm, global token integration, and crossed pretrained embedding tokens. RankUp has been fully deployed in large-scale production across Weixin Video Accounts, Official Accounts, and Moments, yielding GMV improvements of 3.41%, 4.81%, and 2.12%, respectively.
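To make the notion of effective rank concrete, the sketch below computes one common entropy-based definition: the exponential of the Shannon entropy of the normalized singular values of a token matrix. The abstract does not state which definition RankMixer or RankUp uses, so this choice is an assumption. The function name `effective_rank`, the matrix shapes, and the synthetic "healthy" and "collapsed" inputs are likewise hypothetical and serve only as illustration.

```python
import numpy as np


def effective_rank(tokens: np.ndarray, eps: float = 1e-12) -> float:
    """Entropy-based effective rank of a (num_tokens, hidden_dim) matrix.

    Assumption: effective rank = exp(H(p)), where p is the distribution
    obtained by normalizing the singular values to sum to 1. The paper's
    exact definition may differ.
    """
    # Singular values of the raw token matrix (no centering is applied).
    s = np.linalg.svd(tokens, compute_uv=False)
    total = s.sum()
    if total <= eps:
        return 0.0  # all-zero matrix: treat as rank 0
    p = s / total
    p = p[p > eps]  # drop numerically zero entries so 0 * log(0) counts as 0
    entropy = -(p * np.log(p)).sum()
    return float(np.exp(entropy))


rng = np.random.default_rng(0)
num_tokens, hidden_dim = 16, 64  # hypothetical sizes

# Hypothetical "healthy" layer output: tokens point in diverse directions.
full = rng.standard_normal((num_tokens, hidden_dim))

# Hypothetical "collapsed" layer output: every token is a scaled copy of
# one shared vector, so the matrix has rank 1.
collapsed = np.outer(
    rng.standard_normal(num_tokens), rng.standard_normal(hidden_dim)
)

print("effective rank (diverse tokens):  ", effective_rank(full))
print("effective rank (collapsed tokens):", effective_rank(collapsed))
```

Under this definition the value ranges from 1 (all tokens aligned) up to the number of tokens or the hidden dimension, whichever is smaller. Tracking it layer by layer is one way to observe the kind of depth-wise trajectory the abstract describes.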