Factorization machine (FM) is a prevalent approach to modeling pairwise (second-order) feature interactions in high-dimensional sparse data. However, FM has two limitations. First, it cannot capture higher-order feature interactions, because the number of candidate interactions grows combinatorially with the order. Second, modeling the interaction between every pair of features may introduce noise and degrade prediction accuracy. To address these problems, we propose Graph Factorization Machine (GraphFM), a novel approach that naturally represents features as a graph. In particular, we design a mechanism that selects the beneficial feature interactions and formulates them as edges between features. The proposed model then integrates the interaction function of FM into the feature aggregation strategy of a Graph Neural Network (GNN), so that it can model arbitrary-order feature interactions on the graph-structured features by stacking layers. Experimental results on several real-world datasets demonstrate the rationality and effectiveness of the proposed approach. The code and data are available at \href{https://github.com/CRIPAC-DIG/GraphCTR}{https://github.com/CRIPAC-DIG/GraphCTR}.
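To make the contrast between all-pairs interactions and selected interactions concrete, the following is a minimal sketch of the standard FM second-order term, the sum of <v_i, v_j> x_i x_j over feature pairs, with an optional edge list that restricts which pairs are counted. It is not the GraphFM implementation: the function name `fm_pairwise`, the `edges` parameter, and the toy embeddings are assumptions made for illustration, and the edge list here is fixed by hand rather than produced by a learned selection mechanism.

```python
from itertools import combinations


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def fm_pairwise(x, V, edges=None):
    """Second-order FM term: sum of <V[i], V[j]> * x[i] * x[j] over pairs.

    x     : list of feature values
    V     : list of embedding vectors, one per feature
    edges : optional iterable of (i, j) index pairs (hypothetical parameter);
            None means every pair i < j, as in plain FM
    """
    pairs = combinations(range(len(x)), 2) if edges is None else edges
    return sum(dot(V[i], V[j]) * x[i] * x[j] for i, j in pairs)


# Toy input: three active features with 2-dimensional embeddings.
x = [1.0, 1.0, 1.0]
V = [[1.0, 0.0], [0.5, 0.5], [1.0, 2.0]]

# Plain FM: all three pairs contribute.
all_pairs = fm_pairwise(x, V)

# Hand-picked edges stand in for the selected interactions; pair (0, 2) is dropped.
selected = fm_pairwise(x, V, edges=[(0, 1), (1, 2)])
```

With these toy values, the all-pairs sum and the edge-restricted sum differ by exactly the contribution of the dropped pair (0, 2), which is the effect that treating interactions as graph edges is meant to control.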