A Comprehensive Summarization and Evaluation of Feature Refinement Modules for CTR Prediction

Click-through rate (CTR) prediction is widely used in academia and industry. Most CTR models follow a feature embedding \& feature interaction paradigm, in which prediction accuracy is improved mainly by designing effective feature interaction structures. However, recent studies argue that fixed feature embeddings, learned only through the embedding layer, limit the performance of existing CTR models. Some works therefore apply extra modules on top of the embedding layer to dynamically refine feature representations for each instance, an approach that is effective and easy to integrate with existing CTR methods. Despite these promising results, this new direction still lacks a systematic review and summarization. To fill this gap, we comprehensively summarize and define a new module, the \textbf{feature refinement} (FR) module, which can be applied between the feature embedding and feature interaction layers. We extract 14 FR modules from previous works, including cases where an FR module was proposed but not clearly defined or explained. We assess the effectiveness and compatibility of existing FR modules through extensive experiments covering over 200 augmented models and over 4,000 runs, for more than 15,000 GPU hours. The results offer insightful guidelines for researchers, and all benchmarking code and experimental results are open-sourced. In addition, we present a new architecture for parallel CTR models that assigns independent FR modules to separate sub-networks, as opposed to the conventional method of inserting a single shared FR module on top of the embedding layer. Comprehensive experiments demonstrate the effectiveness of this architecture.
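
To make the contrast between a shared FR module and independent FR modules concrete, the following is a minimal sketch in plain Python. It is not one of the 14 surveyed modules: the gate used here (a per-field linear score passed through a scaled sigmoid) is an assumption chosen for brevity, and the names `make_gate`, `refine`, `shared_fr`, and `independent_fr` are hypothetical. The two outputs stand for the inputs of the two sub-networks of a parallel CTR model.

```python
import math
import random


def make_gate(num_fields, dim, seed):
    """Hypothetical FR parameters: one weight vector per field.

    Each weight vector scores the whole flattened instance, so the
    resulting field weight depends on the other fields' values.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(num_fields * dim)]
            for _ in range(num_fields)]


def refine(embeddings, gate):
    """Rescale each field embedding by an instance-specific weight in (0, 2)."""
    flat = [x for field in embeddings for x in field]
    refined = []
    for field, w in zip(embeddings, gate):
        score = sum(wi * xi for wi, xi in zip(w, flat))
        weight = 2.0 / (1.0 + math.exp(-score))  # 2 * sigmoid(score)
        refined.append([weight * x for x in field])
    return refined


def shared_fr(embeddings, gate):
    """Conventional placement: one FR module feeds both sub-networks."""
    refined = refine(embeddings, gate)
    return refined, refined


def independent_fr(embeddings, gate_a, gate_b):
    """Alternative placement: each sub-network gets its own FR module."""
    return refine(embeddings, gate_a), refine(embeddings, gate_b)


# Toy instance with 3 fields and embedding dimension 4 (assumed sizes).
rng = random.Random(0)
emb = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(3)]

shared_a, shared_b = shared_fr(emb, make_gate(3, 4, seed=1))
indep_a, indep_b = independent_fr(
    emb, make_gate(3, 4, seed=2), make_gate(3, 4, seed=3)
)
```

With the shared module, both sub-networks receive the same refined embeddings. With independent modules, each sub-network receives its own refinement of the same raw embeddings, so each branch can emphasize different fields.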
