Representation learning is a central topic in machine learning. In click-through rate (CTR) prediction, most features are represented as embedding vectors that are learned jointly with the model's other parameters. As CTR models have developed, feature representation learning has attracted growing attention and has been studied extensively by both industrial and academic researchers in recent years. This survey aims to summarize feature representation learning from a broader perspective and to pave the way for future research. To this end, we first present a taxonomy of current research methods on feature representation learning organized around two main questions: (i) which features to represent and (ii) how to represent them. We then describe each method in detail with respect to these two questions. Finally, we conclude with a discussion of future directions for the field.