Click-through rate (CTR) prediction, which models behavior sequences and non-sequential features (e.g., user/item profiles or cross features) to infer user interest, underpins industrial recommender systems. However, most methods face three forms of heterogeneity that degrade predictive performance: (i) Feature Heterogeneity arises when the limited side features attached to behavior sequences yield a coarser interest representation than the extensive non-sequential features, impairing sequence modeling; (ii) Context Heterogeneity arises because a user's interest in an item is influenced by the other items presented alongside it, yet point-wise prediction ignores this cross-item interaction context from the entire item set; (iii) Architecture Heterogeneity stems from the fragmented integration of specialized network modules, which compromises the model's effectiveness, efficiency, and scalability in industrial deployments. To tackle these limitations, we propose HoMer, a Homogeneous-Oriented TransforMer for modeling sequential and set-wise contexts. First, we align sequence side features with non-sequential features to enable accurate sequence modeling and fine-grained interest representation. Second, we shift the prediction paradigm from point-wise to set-wise, facilitating cross-item interaction in a highly parallel manner. Third, HoMer's unified encoder-decoder architecture achieves dual optimization through structural simplification and shared computation, ensuring computational efficiency while maintaining scalability with model size. Without arduous modifications to the prediction pipeline, HoMer scales up successfully, outperforms our industrial baseline by 0.0099 in AUC, and improves online business metrics such as CTR/RPM by 1.99%/2.46%. Additionally, HoMer saves 27% of GPU resources through preliminary engineering optimization, further validating its superiority and practicality.
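To make the set-wise, encoder-decoder idea concrete, the following is a minimal sketch, not the authors' implementation: the abstract gives no architectural details, so the class name `SetWiseScorer`, all dimensions, and the use of PyTorch's standard Transformer modules are illustrative assumptions. The encoder reads the user's behavior sequence (with side features assumed to be projected into the same space as non-sequential item features), and the decoder cross-attends an entire candidate set at once, so candidates interact with each other and are scored jointly in one parallel pass.

```python
# Hypothetical sketch of a set-wise encoder-decoder scorer; not the HoMer code.
import torch
import torch.nn as nn


class SetWiseScorer(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        dec_layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, n_layers)  # behavior sequence
        self.decoder = nn.TransformerDecoder(dec_layer, n_layers)  # candidate set
        self.head = nn.Linear(d_model, 1)                          # per-item CTR logit

    def forward(self, seq_emb, cand_emb):
        # seq_emb:  (B, L, d) behavior sequence with aligned side features
        # cand_emb: (B, K, d) candidate item set scored jointly (set-wise)
        memory = self.encoder(seq_emb)
        # Decoder self-attention lets candidates interact (cross-item context);
        # cross-attention reads the encoded user sequence.
        hidden = self.decoder(cand_emb, memory)
        return self.head(hidden).squeeze(-1)  # (B, K) logits for the whole set


# Usage: score a set of 50 candidates for a batch of 8 users in one pass.
scores = SetWiseScorer()(torch.randn(8, 200, 128), torch.randn(8, 50, 128))
print(scores.shape)  # torch.Size([8, 50])
```

Scoring all K candidates in one forward pass is what distinguishes this from point-wise prediction, where each candidate would be scored independently and cross-item context would be lost.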